Dynamics
Harvard Lomax and Thomas H. Pulliam
NASA Ames Research Center
David W. Zingg
University of Toronto Institute for Aerospace Studies
August 26, 1999
Contents
1 INTRODUCTION 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 Problem Speciﬁcation and Geometry Preparation . . . . . . . 2
1.2.2 Selection of Governing Equations and Boundary Conditions . 3
1.2.3 Selection of Gridding Strategy and Numerical Method . . . . 3
1.2.4 Assessment and Interpretation of Results . . . . . . . . . . . . 4
1.3 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 CONSERVATION LAWS AND THE MODEL EQUATIONS 7
2.1 Conservation Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 The NavierStokes and Euler Equations . . . . . . . . . . . . . . . . . 8
2.3 The Linear Convection Equation . . . . . . . . . . . . . . . . . . . . 12
2.3.1 Diﬀerential Form . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 Solution in Wave Space . . . . . . . . . . . . . . . . . . . . . . 13
2.4 The Diﬀusion Equation . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.1 Diﬀerential Form . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.2 Solution in Wave Space . . . . . . . . . . . . . . . . . . . . . . 15
2.5 Linear Hyperbolic Systems . . . . . . . . . . . . . . . . . . . . . . . . 16
2.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3 FINITEDIFFERENCE APPROXIMATIONS 21
3.1 Meshes and FiniteDiﬀerence Notation . . . . . . . . . . . . . . . . . 21
3.2 Space Derivative Approximations . . . . . . . . . . . . . . . . . . . . 24
3.3 FiniteDiﬀerence Operators . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.1 Point Diﬀerence Operators . . . . . . . . . . . . . . . . . . . . 25
3.3.2 Matrix Diﬀerence Operators . . . . . . . . . . . . . . . . . . . 25
3.3.3 Periodic Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3.4 Circulant Matrices . . . . . . . . . . . . . . . . . . . . . . . . 30
iii
3.4 Constructing Diﬀerencing Schemes of Any Order . . . . . . . . . . . . 31
3.4.1 Taylor Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.4.2 Generalization of Diﬀerence Formulas . . . . . . . . . . . . . . 34
3.4.3 Lagrange and Hermite Interpolation Polynomials . . . . . . . 35
3.4.4 Practical Application of Pad´e Formulas . . . . . . . . . . . . . 37
3.4.5 Other HigherOrder Schemes . . . . . . . . . . . . . . . . . . . 38
3.5 Fourier Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.5.1 Application to a Spatial Operator . . . . . . . . . . . . . . . . 39
3.6 Diﬀerence Operators at Boundaries . . . . . . . . . . . . . . . . . . . 43
3.6.1 The Linear Convection Equation . . . . . . . . . . . . . . . . 44
3.6.2 The Diﬀusion Equation . . . . . . . . . . . . . . . . . . . . . . 46
3.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4 THE SEMIDISCRETE APPROACH 51
4.1 Reduction of PDE’s to ODE’s . . . . . . . . . . . . . . . . . . . . . . 52
4.1.1 The Model ODE’s . . . . . . . . . . . . . . . . . . . . . . . . 52
4.1.2 The Generic Matrix Form . . . . . . . . . . . . . . . . . . . . 53
4.2 Exact Solutions of Linear ODE’s . . . . . . . . . . . . . . . . . . . . 54
4.2.1 Eigensystems of SemiDiscrete Linear Forms . . . . . . . . . . 54
4.2.2 Single ODE’s of First and SecondOrder . . . . . . . . . . . . 55
4.2.3 Coupled FirstOrder ODE’s . . . . . . . . . . . . . . . . . . . 57
4.2.4 General Solution of Coupled ODE’s with Complete Eigensystems 59
4.3 Real Space and Eigenspace . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3.1 Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3.2 Eigenvalue Spectrums for Model ODE’s . . . . . . . . . . . . . 62
4.3.3 Eigenvectors of the Model Equations . . . . . . . . . . . . . . 63
4.3.4 Solutions of the Model ODE’s . . . . . . . . . . . . . . . . . . 65
4.4 The Representative Equation . . . . . . . . . . . . . . . . . . . . . . 67
4.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5 FINITEVOLUME METHODS 71
5.1 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.2 Model Equations in Integral Form . . . . . . . . . . . . . . . . . . . . 73
5.2.1 The Linear Convection Equation . . . . . . . . . . . . . . . . 73
5.2.2 The Diﬀusion Equation . . . . . . . . . . . . . . . . . . . . . . 74
5.3 OneDimensional Examples . . . . . . . . . . . . . . . . . . . . . . . 74
5.3.1 A SecondOrder Approximation to the Convection Equation . 75
5.3.2 A FourthOrder Approximation to the Convection Equation . 77
5.3.3 A SecondOrder Approximation to the Diﬀusion Equation . . 78
5.4 A TwoDimensional Example . . . . . . . . . . . . . . . . . . . . . . 80
5.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6 TIMEMARCHING METHODS FOR ODE’S 85
6.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.2 Converting TimeMarching Methods to O∆E’s . . . . . . . . . . . . . 87
6.3 Solution of Linear O∆E’s With Constant Coeﬃcients . . . . . . . . . 88
6.3.1 First and SecondOrder Diﬀerence Equations . . . . . . . . . 89
6.3.2 Special Cases of Coupled FirstOrder Equations . . . . . . . . 90
6.4 Solution of the Representative O∆E’s . . . . . . . . . . . . . . . . . 91
6.4.1 The Operational Form and its Solution . . . . . . . . . . . . . 91
6.4.2 Examples of Solutions to TimeMarching O∆E’s . . . . . . . . 92
6.5 The λ −σ Relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.5.1 Establishing the Relation . . . . . . . . . . . . . . . . . . . . . 93
6.5.2 The Principal σRoot . . . . . . . . . . . . . . . . . . . . . . . 95
6.5.3 Spurious σRoots . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.5.4 OneRoot TimeMarching Methods . . . . . . . . . . . . . . . 96
6.6 Accuracy Measures of TimeMarching Methods . . . . . . . . . . . . 97
6.6.1 Local and Global Error Measures . . . . . . . . . . . . . . . . 97
6.6.2 Local Accuracy of the Transient Solution (er
λ
, σ , er
ω
) . . . . 98
6.6.3 Local Accuracy of the Particular Solution (er
µ
) . . . . . . . . 99
6.6.4 Time Accuracy For Nonlinear Applications . . . . . . . . . . . 100
6.6.5 Global Accuracy . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.7 Linear Multistep Methods . . . . . . . . . . . . . . . . . . . . . . . . 102
6.7.1 The General Formulation . . . . . . . . . . . . . . . . . . . . . 102
6.7.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.7.3 TwoStep Linear Multistep Methods . . . . . . . . . . . . . . 105
6.8 PredictorCorrector Methods . . . . . . . . . . . . . . . . . . . . . . . 106
6.9 RungeKutta Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.10 Implementation of Implicit Methods . . . . . . . . . . . . . . . . . . . 110
6.10.1 Application to Systems of Equations . . . . . . . . . . . . . . 110
6.10.2 Application to Nonlinear Equations . . . . . . . . . . . . . . . 111
6.10.3 Local Linearization for Scalar Equations . . . . . . . . . . . . 112
6.10.4 Local Linearization for Coupled Sets of Nonlinear Equations . 115
6.11 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7 STABILITY OF LINEAR SYSTEMS 121
7.1 Dependence on the Eigensystem . . . . . . . . . . . . . . . . . . . . . 122
7.2 Inherent Stability of ODE’s . . . . . . . . . . . . . . . . . . . . . . . 123
7.2.1 The Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
7.2.2 Complete Eigensystems . . . . . . . . . . . . . . . . . . . . . . 123
7.2.3 Defective Eigensystems . . . . . . . . . . . . . . . . . . . . . . 123
7.3 Numerical Stability of O∆E ’s . . . . . . . . . . . . . . . . . . . . . . 124
7.3.1 The Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
7.3.2 Complete Eigensystems . . . . . . . . . . . . . . . . . . . . . . 125
7.3.3 Defective Eigensystems . . . . . . . . . . . . . . . . . . . . . . 125
7.4 TimeSpace Stability and Convergence of O∆E’s . . . . . . . . . . . . 125
7.5 Numerical Stability Concepts in the Complex σPlane . . . . . . . . . 128
7.5.1 σRoot Traces Relative to the Unit Circle . . . . . . . . . . . 128
7.5.2 Stability for Small ∆t . . . . . . . . . . . . . . . . . . . . . . . 132
7.6 Numerical Stability Concepts in the Complex λh Plane . . . . . . . . 135
7.6.1 Stability for Large h. . . . . . . . . . . . . . . . . . . . . . . . 135
7.6.2 Unconditional Stability, AStable Methods . . . . . . . . . . . 136
7.6.3 Stability Contours in the Complex λh Plane. . . . . . . . . . . 137
7.7 Fourier Stability Analysis . . . . . . . . . . . . . . . . . . . . . . . . 141
7.7.1 The Basic Procedure . . . . . . . . . . . . . . . . . . . . . . . 141
7.7.2 Some Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 142
7.7.3 Relation to Circulant Matrices . . . . . . . . . . . . . . . . . . 143
7.8 Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
7.9 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
8 CHOICE OF TIMEMARCHING METHODS 149
8.1 Stiﬀness Deﬁnition for ODE’s . . . . . . . . . . . . . . . . . . . . . . 149
8.1.1 Relation to λEigenvalues . . . . . . . . . . . . . . . . . . . . 149
8.1.2 Driving and Parasitic Eigenvalues . . . . . . . . . . . . . . . . 151
8.1.3 Stiﬀness Classiﬁcations . . . . . . . . . . . . . . . . . . . . . . 151
8.2 Relation of Stiﬀness to Space Mesh Size . . . . . . . . . . . . . . . . 152
8.3 Practical Considerations for Comparing Methods . . . . . . . . . . . 153
8.4 Comparing the Eﬃciency of Explicit Methods . . . . . . . . . . . . . 154
8.4.1 Imposed Constraints . . . . . . . . . . . . . . . . . . . . . . . 154
8.4.2 An Example Involving Diﬀusion . . . . . . . . . . . . . . . . . 154
8.4.3 An Example Involving Periodic Convection . . . . . . . . . . . 155
8.5 Coping With Stiﬀness . . . . . . . . . . . . . . . . . . . . . . . . . . 158
8.5.1 Explicit Methods . . . . . . . . . . . . . . . . . . . . . . . . . 158
8.5.2 Implicit Methods . . . . . . . . . . . . . . . . . . . . . . . . . 159
8.5.3 A Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.6 Steady Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
9 RELAXATION METHODS 163
9.1 Formulation of the Model Problem . . . . . . . . . . . . . . . . . . . 164
9.1.1 Preconditioning the Basic Matrix . . . . . . . . . . . . . . . . 164
9.1.2 The Model Equations . . . . . . . . . . . . . . . . . . . . . . . 166
9.2 Classical Relaxation . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
9.2.1 The Delta Form of an Iterative Scheme . . . . . . . . . . . . . 168
9.2.2 The Converged Solution, the Residual, and the Error . . . . . 168
9.2.3 The Classical Methods . . . . . . . . . . . . . . . . . . . . . . 169
9.3 The ODE Approach to Classical Relaxation . . . . . . . . . . . . . . 170
9.3.1 The Ordinary Diﬀerential Equation Formulation . . . . . . . . 170
9.3.2 ODE Form of the Classical Methods . . . . . . . . . . . . . . 172
9.4 Eigensystems of the Classical Methods . . . . . . . . . . . . . . . . . 173
9.4.1 The PointJacobi System . . . . . . . . . . . . . . . . . . . . . 174
9.4.2 The GaussSeidel System . . . . . . . . . . . . . . . . . . . . . 176
9.4.3 The SOR System . . . . . . . . . . . . . . . . . . . . . . . . . 180
9.5 Nonstationary Processes . . . . . . . . . . . . . . . . . . . . . . . . . 182
9.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
10 MULTIGRID 191
10.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
10.1.1 Eigenvector and Eigenvalue Identiﬁcation with Space Frequencies191
10.1.2 Properties of the Iterative Method . . . . . . . . . . . . . . . 192
10.2 The Basic Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
10.3 A TwoGrid Process . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
10.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
11 NUMERICAL DISSIPATION 203
11.1 OneSided FirstDerivative Space Diﬀerencing . . . . . . . . . . . . . 204
11.2 The Modiﬁed Partial Diﬀerential Equation . . . . . . . . . . . . . . . 205
11.3 The LaxWendroﬀ Method . . . . . . . . . . . . . . . . . . . . . . . . 207
11.4 Upwind Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
11.4.1 FluxVector Splitting . . . . . . . . . . . . . . . . . . . . . . . 210
11.4.2 FluxDiﬀerence Splitting . . . . . . . . . . . . . . . . . . . . . 212
11.5 Artiﬁcial Dissipation . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
11.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
12 SPLIT AND FACTORED FORMS 217
12.1 The Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
12.2 Factoring Physical Representations — Time Splitting . . . . . . . . . 218
12.3 Factoring Space Matrix Operators in 2–D . . . . . . . . . . . . . . . . 220
12.3.1 Mesh Indexing Convention . . . . . . . . . . . . . . . . . . . . 220
12.3.2 Data Bases and Space Vectors . . . . . . . . . . . . . . . . . . 221
12.3.3 Data Base Permutations . . . . . . . . . . . . . . . . . . . . . 221
12.3.4 Space Splitting and Factoring . . . . . . . . . . . . . . . . . . 223
12.4 SecondOrder Factored Implicit Methods . . . . . . . . . . . . . . . . 226
12.5 Importance of Factored Forms in 2 and 3 Dimensions . . . . . . . . . 226
12.6 The Delta Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
12.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
13 LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS 233
13.1 The Representative Equation for Circulant Operators . . . . . . . . . 233
13.2 Example Analysis of Circulant Systems . . . . . . . . . . . . . . . . . 234
13.2.1 Stability Comparisons of TimeSplit Methods . . . . . . . . . 234
13.2.2 Analysis of a SecondOrder TimeSplit Method . . . . . . . . 237
13.3 The Representative Equation for SpaceSplit Operators . . . . . . . . 238
13.4 Example Analysis of 2D Model Equations . . . . . . . . . . . . . . . 242
13.4.1 The Unfactored Implicit Euler Method . . . . . . . . . . . . . 242
13.4.2 The Factored Nondelta Form of the Implicit Euler Method . . 243
13.4.3 The Factored Delta Form of the Implicit Euler Method . . . . 243
13.4.4 The Factored Delta Form of the Trapezoidal Method . . . . . 244
13.5 Example Analysis of the 3D Model Equation . . . . . . . . . . . . . 245
13.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
A USEFUL RELATIONS AND DEFINITIONS FROM LINEAR AL
GEBRA 249
A.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
A.2 Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
A.3 Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
A.4 Eigensystems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
A.5 Vector and Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . 254
B SOME PROPERTIES OF TRIDIAGONAL MATRICES 257
B.1 Standard Eigensystem for Simple Tridiagonals . . . . . . . . . . . . . 257
B.2 Generalized Eigensystem for Simple Tridiagonals . . . . . . . . . . . . 258
B.3 The Inverse of a Simple Tridiagonal . . . . . . . . . . . . . . . . . . . 259
B.4 Eigensystems of Circulant Matrices . . . . . . . . . . . . . . . . . . . 260
B.4.1 Standard Tridiagonals . . . . . . . . . . . . . . . . . . . . . . 260
B.4.2 General Circulant Systems . . . . . . . . . . . . . . . . . . . . 261
B.5 Special Cases Found From Symmetries . . . . . . . . . . . . . . . . . 262
B.6 Special Cases Involving Boundary Conditions . . . . . . . . . . . . . 263
C THE HOMOGENEOUS PROPERTYOF THE EULER EQUATIONS265
Chapter 1
INTRODUCTION
1.1 Motivation
The material in this book originated from attempts to understand and systemize nu
merical solution techniques for the partial diﬀerential equations governing the physics
of ﬂuid ﬂow. As time went on and these attempts began to crystallize, underlying
constraints on the nature of the material began to form. The principal such constraint
was the demand for uniﬁcation. Was there one mathematical structure which could
be used to describe the behavior and results of most numerical methods in common
use in the ﬁeld of ﬂuid dynamics? Perhaps the answer is arguable, but the authors
believe the answer is aﬃrmative and present this book as justiﬁcation for that be
lief. The mathematical structure is the theory of linear algebra and the attendant
eigenanalysis of linear systems.
The ultimate goal of the ﬁeld of computational ﬂuid dynamics (CFD) is to under
stand the physical events that occur in the ﬂow of ﬂuids around and within designated
objects. These events are related to the action and interaction of phenomena such
as dissipation, diﬀusion, convection, shock waves, slip surfaces, boundary layers, and
turbulence. In the ﬁeld of aerodynamics, all of these phenomena are governed by
the compressible NavierStokes equations. Many of the most important aspects of
these relations are nonlinear and, as a consequence, often have no analytic solution.
This, of course, motivates the numerical solution of the associated partial diﬀerential
equations. At the same time it would seem to invalidate the use of linear algebra for
the classiﬁcation of the numerical methods. Experience has shown that such is not
the case.
As we shall see in a later chapter, the use of numerical methods to solve partial
diﬀerential equations introduces an approximation that, in eﬀect, can change the
form of the basic partial diﬀerential equations themselves. The new equations, which
1
2 CHAPTER 1. INTRODUCTION
are the ones actually being solved by the numerical process, are often referred to as
the modiﬁed partial diﬀerential equations. Since they are not precisely the same as
the original equations, they can, and probably will, simulate the physical phenomena
listed above in ways that are not exactly the same as an exact solution to the basic
partial diﬀerential equation. Mathematically, these diﬀerences are usually referred to
as truncation errors. However, the theory associated with the numerical analysis of
ﬂuid mechanics was developed predominantly by scientists deeply interested in the
physics of ﬂuid ﬂow and, as a consequence, these errors are often identiﬁed with a
particular physical phenomenon on which they have a strong eﬀect. Thus methods are
said to have a lot of “artiﬁcial viscosity” or said to be highly dispersive. This means
that the errors caused by the numerical approximation result in a modiﬁed partial
diﬀerential equation having additional terms that can be identiﬁed with the physics
of dissipation in the ﬁrst case and dispersion in the second. There is nothing wrong,
of course, with identifying an error with a physical process, nor with deliberately
directing an error to a speciﬁc physical process, as long as the error remains in some
engineering sense “small”. It is safe to say, for example, that most numerical methods
in practical use for solving the nondissipative Euler equations create a modiﬁed partial
diﬀerential equation that produces some form of dissipation. However, if used and
interpreted properly, these methods give very useful information.
Regardless of what the numerical errors are called, if their eﬀects are not thor
oughly understood and controlled, they can lead to serious diﬃculties, producing
answers that represent little, if any, physical reality. This motivates studying the
concepts of stability, convergence, and consistency. On the other hand, even if the
errors are kept small enough that they can be neglected (for engineering purposes),
the resulting simulation can still be of little practical use if ineﬃcient or inappropriate
algorithms are used. This motivates studying the concepts of stiﬀness, factorization,
and algorithm development in general. All of these concepts we hope to clarify in
this book.
1.2 Background
The ﬁeld of computational ﬂuid dynamics has a broad range of applicability. Indepen
dent of the speciﬁc application under study, the following sequence of steps generally
must be followed in order to obtain a satisfactory solution.
1.2.1 Problem Speciﬁcation and Geometry Preparation
The ﬁrst step involves the speciﬁcation of the problem, including the geometry, ﬂow
conditions, and the requirements of the simulation. The geometry may result from
1.2. BACKGROUND 3
measurements of an existing conﬁguration or may be associated with a design study.
Alternatively, in a design context, no geometry need be supplied. Instead, a set
of objectives and constraints must be speciﬁed. Flow conditions might include, for
example, the Reynolds number and Mach number for the ﬂow over an airfoil. The
requirements of the simulation include issues such as the level of accuracy needed, the
turnaround time required, and the solution parameters of interest. The ﬁrst two of
these requirements are often in conﬂict and compromise is necessary. As an example
of solution parameters of interest in computing the ﬂowﬁeld about an airfoil, one may
be interested in i) the lift and pitching moment only, ii) the drag as well as the lift
and pitching moment, or iii) the details of the ﬂow at some speciﬁc location.
1.2.2 Selection of Governing Equations and Boundary Con
ditions
Once the problem has been speciﬁed, an appropriate set of governing equations and
boundary conditions must be selected. It is generally accepted that the phenomena of
importance to the ﬁeld of continuum ﬂuid dynamics are governed by the conservation
of mass, momentum, and energy. The partial diﬀerential equations resulting from
these conservation laws are referred to as the NavierStokes equations. However, in
the interest of eﬃciency, it is always prudent to consider solving simpliﬁed forms
of the NavierStokes equations when the simpliﬁcations retain the physics which are
essential to the goals of the simulation. Possible simpliﬁed governing equations include
the potentialﬂow equations, the Euler equations, and the thinlayer NavierStokes
equations. These may be steady or unsteady and compressible or incompressible.
Boundary types which may be encountered include solid walls, inﬂow and outﬂow
boundaries, periodic boundaries, symmetry boundaries, etc. The boundary conditions
which must be speciﬁed depend upon the governing equations. For example, at a solid
wall, the Euler equations require ﬂow tangency to be enforced, while the NavierStokes
equations require the noslip condition. If necessary, physical models must be chosen
for processes which cannot be simulated within the speciﬁed constraints. Turbulence
is an example of a physical process which is rarely simulated in a practical context (at
the time of writing) and thus is often modelled. The success of a simulation depends
greatly on the engineering insight involved in selecting the governing equations and
physical models based on the problem speciﬁcation.
1.2.3 Selection of Gridding Strategy and Numerical Method
Next a numerical method and a strategy for dividing the ﬂow domain into cells, or
elements, must be selected. We concern ourselves here only with numerical meth
ods requiring such a tessellation of the domain, which is known as a grid, or mesh.
4 CHAPTER 1. INTRODUCTION
Many diﬀerent gridding strategies exist, including structured, unstructured, hybrid,
composite, and overlapping grids. Furthermore, the grid can be altered based on
the solution in an approach known as solutionadaptive gridding. The numerical
methods generally used in CFD can be classiﬁed as ﬁnitediﬀerence, ﬁnitevolume,
ﬁniteelement, or spectral methods. The choices of a numerical method and a grid
ding strategy are strongly interdependent. For example, the use of ﬁnitediﬀerence
methods is typically restricted to structured grids. Here again, the success of a sim
ulation can depend on appropriate choices for the problem or class of problems of
interest.
1.2.4 Assessment and Interpretation of Results
Finally, the results of the simulation must be assessed and interpreted. This step can
require postprocessing of the data, for example calculation of forces and moments,
and can be aided by sophisticated ﬂow visualization tools and error estimation tech
niques. It is critical that the magnitude of both numerical and physicalmodel errors
be well understood.
1.3 Overview
It should be clear that successful simulation of ﬂuid ﬂows can involve a wide range of
issues from grid generation to turbulence modelling to the applicability of various sim
pliﬁed forms of the NavierStokes equations. Many of these issues are not addressed
in this book. Some of them are presented in the books by Anderson, Tannehill, and
Pletcher [1] and Hirsch [2]. Instead we focus on numerical methods, with emphasis
on ﬁnitediﬀerence and ﬁnitevolume methods for the Euler and NavierStokes equa
tions. Rather than presenting the details of the most advanced methods, which are
still evolving, we present a foundation for developing, analyzing, and understanding
such methods.
Fortunately, to develop, analyze, and understand most numerical methods used to
ﬁnd solutions for the complete compressible NavierStokes equations, we can make use
of much simpler expressions, the socalled “model” equations. These model equations
isolate certain aspects of the physics contained in the complete set of equations. Hence
their numerical solution can illustrate the properties of a given numerical method
when applied to a more complicated system of equations which governs similar phys
ical phenomena. Although the model equations are extremely simple and easy to
solve, they have been carefully selected to be representative, when used intelligently,
of diﬃculties and complexities that arise in realistic two and threedimensional ﬂuid
ﬂow simulations. We believe that a thorough understanding of what happens when
1.4. NOTATION 5
numerical approximations are applied to the model equations is a major ﬁrst step in
making conﬁdent and competent use of numerical approximations to the Euler and
NavierStokes equations. As a word of caution, however, it should be noted that,
although we can learn a great deal by studying numerical methods as applied to the
model equations and can use that information in the design and application of nu
merical methods to practical problems, there are many aspects of practical problems
which can only be understood in the context of the complete physical systems.
1.4 Notation
The notation is generally explained as it is introduced. Bold type is reserved for real
physical vectors, such as velocity. The vector symbol is used for the vectors (or
column matrices) which contain the values of the dependent variable at the nodes
of a grid. Otherwise, the use of a vector consisting of a collection of scalars should
be apparent from the context and is not identiﬁed by any special notation. For
example, the variable u can denote a scalar Cartesian velocity component in the Euler
and NavierStokes equations, a scalar quantity in the linear convection and diﬀusion
equations, and a vector consisting of a collection of scalars in our presentation of
hyperbolic systems. Some of the abbreviations used throughout the text are listed
and deﬁned below.
PDE Partial diﬀerential equation
ODE Ordinary diﬀerential equation
O∆E Ordinary diﬀerence equation
RHS Righthand side
P.S. Particular solution of an ODE or system of ODE’s
S.S. Fixed (timeinvariant) steadystate solution
kD kdimensional space
bc
Boundary conditions, usually a vector
O(α) A term of order (i.e., proportional to) α
6 CHAPTER 1. INTRODUCTION
Chapter 2
CONSERVATION LAWS AND
THE MODEL EQUATIONS
We start out by casting our equations in the most general form, the integral conserva
tionlaw form, which is useful in understanding the concepts involved in ﬁnitevolume
schemes. The equations are then recast into divergence form, which is natural for
ﬁnitediﬀerence schemes. The Euler and NavierStokes equations are brieﬂy discussed
in this Chapter. The main focus, though, will be on representative model equations,
in particular, the convection and diﬀusion equations. These equations contain many
of the salient mathematical and physical features of the full NavierStokes equations.
The concepts of convection and diﬀusion are prevalent in our development of nu
merical methods for computational ﬂuid dynamics, and the recurring use of these
model equations allows us to develop a consistent framework of analysis for consis
tency, accuracy, stability, and convergence. The model equations we study have two
properties in common. They are linear partial diﬀerential equations (PDE’s) with
coeﬃcients that are constant in both space and time, and they represent phenomena
of importance to the analysis of certain aspects of ﬂuid dynamic problems.
2.1 Conservation Laws
Conservation laws, such as the Euler and NavierStokes equations and our model
equations, can be written in the following integral form:
V (t
2
)
QdV −
V (t
1
)
QdV +
t
2
t
1
S(t)
n.FdSdt =
t
2
t
1
V (t)
PdV dt (2.1)
In this equation, Q is a vector containing the set of variables which are conserved,
e.g., mass, momentum, and energy, per unit volume. The equation is a statement of
7
8 CHAPTER 2. CONSERVATION LAWS AND THE MODEL EQUATIONS
the conservation of these quantities in a ﬁnite region of space with volume V (t) and
surface area S(t) over a ﬁnite interval of time t
2
−t
1
. In two dimensions, the region
of space, or cell, is an area A(t) bounded by a closed contour C(t). The vector n is
a unit vector normal to the surface pointing outward, F is a set of vectors, or tensor,
containing the ﬂux of Q per unit area per unit time, and P is the rate of production
of Q per unit volume per unit time. If all variables are continuous in time, then Eq.
2.1 can be rewritten as
d
dt
V (t)
QdV +
S(t)
n.FdS =
V (t)
PdV (2.2)
Those methods which make various numerical approximations of the integrals in Eqs.
2.1 and 2.2 and ﬁnd a solution for Q on that basis are referred to as ﬁnitevolume
methods. Many of the advanced codes written for CFD applications are based on the
ﬁnitevolume concept.
On the other hand, a partial derivative form of a conservation law can also be
derived. The divergence form of Eq. 2.2 is obtained by applying Gauss’s theorem to
the ﬂux integral, leading to
∂Q
∂t
+∇.F = P (2.3)
where ∇. is the wellknown divergence operator given, in Cartesian coordinates, by
∇. ≡
i
∂
∂x
+j
∂
∂y
+k
∂
∂z
. (2.4)
and i, j, and k are unit vectors in the x, y, and z coordinate directions, respectively.
Those methods which make various approximations of the derivatives in Eq. 2.3 and
ﬁnd a solution for Q on that basis are referred to as ﬁnitediﬀerence methods.
2.2 The NavierStokes and Euler Equations
The NavierStokes equations form a coupled system of nonlinear PDE’s describing
the conservation of mass, momentum and energy for a ﬂuid. For a Newtonian ﬂuid
in one dimension, they can be written as
∂Q
∂t
+
∂E
∂x
= 0 (2.5)
with
Q =
ρ
ρu
e
¸
¸
¸
¸
¸
¸
¸
, E =
ρu
ρu
2
+p
u(e +p)
¸
¸
¸
¸
¸
¸
¸
−
0
4
3
µ
∂u
∂x
4
3
µu
∂u
∂x
+κ
∂T
∂x
¸
¸
¸
¸
¸
¸
¸
(2.6)
2.2. THE NAVIERSTOKES AND EULER EQUATIONS 9
where ρ is the ﬂuid density, u is the velocity, e is the total energy per unit volume, p is
the pressure, T is the temperature, µ is the coeﬃcient of viscosity, and κ is the thermal
conductivity. The total energy e includes internal energy per unit volume ρ (where
is the internal energy per unit mass) and kinetic energy per unit volume ρu
2
/2.
These equations must be supplemented by relations between µ and κ and the ﬂuid
state as well as an equation of state, such as the ideal gas law. Details can be found
in Anderson, Tannehill, and Pletcher [1] and Hirsch [2]. Note that the convective
ﬂuxes lead to ﬁrst derivatives in space, while the viscous and heat conduction terms
involve second derivatives. This form of the equations is called conservationlaw or
conservative form. Nonconservative forms can be obtained by expanding derivatives
of products using the product rule or by introducing diﬀerent dependent variables,
such as u and p. Although nonconservative forms of the equations are analytically
the same as the above form, they can lead to quite diﬀerent numerical solutions in
terms of shock strength and shock speed, for example. Thus the conservative form is
appropriate for solving ﬂows with features such as shock waves.
Many ﬂows of engineering interest are steady (timeinvariant), or at least may be
treated as such. For such ﬂows, we are often interested in the steadystate solution of
the NavierStokes equations, with no interest in the transient portion of the solution.
The steady solution to the onedimensional NavierStokes equations must satisfy
∂E
∂x
= 0 (2.7)
If we neglect viscosity and heat conduction, the Euler equations are obtained. In
twodimensional Cartesian coordinates, these can be written as
∂Q
∂t
+
∂E
∂x
+
∂F
∂y
= 0 (2.8)
with
Q =
q
1
q
2
q
3
q
4
¸
¸
¸
¸
¸
=
ρ
ρu
ρv
e
¸
¸
¸
¸
¸
, E =
ρu
ρu
2
+ p
ρuv
u(e +p)
¸
¸
¸
¸
¸
, F =
ρv
ρuv
ρv
2
+p
v(e +p)
¸
¸
¸
¸
¸
(2.9)
where u and v are the Cartesian velocity components. Later on we will make use of
the following form of the Euler equations as well:
∂Q
∂t
+A
∂Q
∂x
+B
∂Q
∂y
= 0 (2.10)
The matrices A =
∂E
∂Q
and B =
∂F
∂Q
are known as the ﬂux Jacobians. The ﬂux vectors
given above are written in terms of the primitive variables, ρ, u, v, and p. In order
10 CHAPTER 2. CONSERVATION LAWS AND THE MODEL EQUATIONS
to derive the ﬂux Jacobian matrices, we must ﬁrst write the ﬂux vectors E and F in
terms of the conservative variables, q
1
, q
2
, q
3
, and q
4
, as follows:
E =
E
1
E
2
E
3
E
4
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
=
q
2
(γ −1)q
4
+
3−γ
2
q
2
2
q
1
−
γ−1
2
q
2
3
q
1
q
3
q
2
q
1
γ
q
4
q
2
q
1
−
γ−1
2
q
3
2
q
2
1
+
q
2
3
q
2
q
2
1
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
(2.11)
F =
F
1
F
2
F
3
F
4
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
=
q
3
q
3
q
2
q
1
(γ −1)q
4
+
3−γ
2
q
2
3
q
1
−
γ−1
2
q
2
2
q
1
γ
q
4
q
3
q
1
−
γ−1
2
q
2
2
q
3
q
2
1
+
q
3
3
q
2
1
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
(2.12)
We have assumed that the pressure satisﬁes p = (γ − 1)[e − ρ(u
2
+ v
2
)/2] from the
ideal gas law, where γ is the ratio of speciﬁc heats, c
p
/c
v
. From this it follows that
the ﬂux Jacobian of E can be written in terms of the conservative variables as
A =
∂E
i
∂q
j
=
0 1 0 0
a
21
(3 −γ)
q
2
q
1
(1 −γ)
q
3
q
1
γ −1
−
q
2
q
1
q
3
q
1
q
3
q
1
q
2
q
1
0
a
41
a
42
a
43
γ
q
2
q
1
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
(2.13)
where
a
21
=
γ −1
2
q
3
q
1
2
−
3 −γ
2
q
2
q
1
2
2.2. THE NAVIERSTOKES AND EULER EQUATIONS 11
a
41
= (γ −1)
q
2
q
1
3
+
q
3
q
1
2
q
2
q
1
¸
¸
−γ
q
4
q
1
q
2
q
1
a
42
= γ
q
4
q
1
−
γ −1
2
3
q
2
q
1
2
+
q
3
q
1
2
¸
¸
a
43
= −(γ −1)
q
2
q
1
q
3
q
1
(2.14)
and in terms of the primitive variables as
A =
0 1 0 0
a
21
(3 −γ)u (1 −γ)v (γ −1)
−uv v u 0
a
41
a
42
a
43
γu
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
(2.15)
where
a
21
=
γ −1
2
v
2
−
3 −γ
2
u
2
a
41
= (γ −1)u(u
2
+v
2
) −γ
ue
ρ
a
42
= γ
e
ρ
−
γ −1
2
(3u
2
+v
2
)
a
43
= (1 −γ)uv (2.16)
Derivation of the two forms of B = ∂F/∂Q is similar. The eigenvalues of the ﬂux
Jacobian matrices are purely real. This is the deﬁning feature of hyperbolic systems
of PDE’s, which are further discussed in Section 2.5. The homogeneous property of
the Euler equations is discussed in Appendix C.
The NavierStokes equations include both convective and diﬀusive ﬂuxes. This
motivates the choice of our two scalar model equations associated with the physics
of convection and diﬀusion. Furthermore, aspects of convective phenomena associ
ated with coupled systems of equations such as the Euler equations are important in
developing numerical methods and boundary conditions. Thus we also study linear
hyperbolic systems of PDE’s.
12 CHAPTER 2. CONSERVATION LAWS AND THE MODEL EQUATIONS
2.3 The Linear Convection Equation
2.3.1 Diﬀerential Form
The simplest linear model for convection and wave propagation is the linear convection
equation given by the following PDE:
∂u
∂t
+a
∂u
∂x
= 0 (2.17)
Here u(x, t) is a scalar quantity propagating with speed a, a real constant which may
be positive or negative. The manner in which the boundary conditions are speciﬁed
separates the following two phenomena for which this equation is a model:
(1) In one type, the scalar quantity u is given on one boundary, corresponding
to a wave entering the domain through this “inﬂow” boundary. No bound
ary condition is speciﬁed at the opposite side, the “outﬂow” boundary. This
is consistent in terms of the wellposedness of a 1
st
order PDE. Hence the
wave leaves the domain through the outﬂow boundary without distortion or
reﬂection. This type of phenomenon is referred to, simply, as the convection
problem. It represents most of the “usual” situations encountered in convect
ing systems. Note that the lefthand boundary is the inﬂow boundary when
a is positive, while the righthand boundary is the inﬂow boundary when a is
negative.
(2) In the other type, the ﬂow being simulated is periodic. At any given time,
what enters on one side of the domain must be the same as that which is
leaving on the other. This is referred to as the biconvection problem. It is
the simplest to study and serves to illustrate many of the basic properties of
numerical methods applied to problems involving convection, without special
consideration of boundaries. Hence, we pay a great deal of attention to it in
the initial chapters.
Now let us consider a situation in which the initial condition is given by u(x, 0) =
u
0
(x), and the domain is inﬁnite. It is easy to show by substitution that the exact
solution to the linear convection equation is then
u(x, t) = u
0
(x −at) (2.18)
The initial waveform propagates unaltered with speed a to the right if a is positive
and to the left if a is negative. With periodic boundary conditions, the waveform
travels through one boundary and reappears at the other boundary, eventually re
turning to its initial position. In this case, the process continues forever without any
2.3. THE LINEAR CONVECTION EQUATION 13
change in the shape of the solution. Preserving the shape of the initial condition
u
0
(x) can be a diﬃcult challenge for a numerical method.
2.3.2 Solution in Wave Space
We now examine the biconvection problem in more detail. Let the domain be given
by 0 ≤ x ≤ 2π. We restrict our attention to initial conditions in the form
u(x, 0) = f(0)e
iκx
(2.19)
where f(0) is a complex constant, and κ is the wavenumber. In order to satisfy the
periodic boundary conditions, κ must be an integer. It is a measure of the number of
wavelengths within the domain. With such an initial condition, the solution can be
written as
u(x, t) = f(t)e
iκx
(2.20)
where the time dependence is contained in the complex function f(t). Substituting
this solution into the linear convection equation, Eq. 2.17, we ﬁnd that f(t) satisﬁes
the following ordinary diﬀerential equation (ODE)
df
dt
= −iaκf (2.21)
which has the solution
f(t) = f(0)e
−iaκt
(2.22)
Substituting f(t) into Eq. 2.20 gives the following solution
u(x, t) = f(0)e
iκ(x−at)
= f(0)e
i(κx−ωt)
(2.23)
where the frequency, ω, the wavenumber, κ, and the phase speed, a, are related by
ω = κa (2.24)
The relation between the frequency and the wavenumber is known as the dispersion
relation. The linear relation given by Eq. 2.24 is characteristic of wave propagation
in a nondispersive medium. This means that the phase speed is the same for all
wavenumbers. As we shall see later, most numerical methods introduce some disper
sion; that is, in a simulation, waves with diﬀerent wavenumbers travel at diﬀerent
speeds.
An arbitrary initial waveform can be produced by summing initial conditions of
the form of Eq. 2.19. For M modes, one obtains
u(x, 0) =
M
¸
m=1
f
m
(0)e
iκmx
(2.25)
14 CHAPTER 2. CONSERVATION LAWS AND THE MODEL EQUATIONS
where the wavenumbers are often ordered such that κ
1
≤ κ
2
≤ · · · ≤ κ
M
. Since the
wave equation is linear, the solution is obtained by summing solutions of the form of
Eq. 2.23, giving
u(x, t) =
M
¸
m=1
f
m
(0)e
iκm(x−at)
(2.26)
Dispersion and dissipation resulting from a numerical approximation will cause the
shape of the solution to change from that of the original waveform.
2.4 The Diﬀusion Equation
2.4.1 Diﬀerential Form
Diﬀusive ﬂuxes are associated with molecular motion in a continuum ﬂuid. A simple
linear model equation for a diﬀusive process is
∂u
∂t
= ν
∂
2
u
∂x
2
(2.27)
where ν is a positive real constant. For example, with u representing the tempera
ture, this parabolic PDE governs the diﬀusion of heat in one dimension. Boundary
conditions can be periodic, Dirichlet (speciﬁed u), Neumann (speciﬁed ∂u/∂x), or
mixed Dirichlet/Neumann.
In contrast to the linear convection equation, the diﬀusion equation has a nontrivial
steadystate solution, which is one that satisﬁes the governing PDE with the partial
derivative in time equal to zero. In the case of Eq. 2.27, the steadystate solution
must satisfy
∂
2
u
∂x
2
= 0 (2.28)
Therefore, u must vary linearly with x at steady state such that the boundary con
ditions are satisﬁed. Other steadystate solutions are obtained if a source term g(x)
is added to Eq. 2.27, as follows:
∂u
∂t
= ν
¸
∂
2
u
∂x
2
−g(x)
¸
(2.29)
giving a steady statesolution which satisﬁes
∂
2
u
∂x
2
−g(x) = 0 (2.30)
2.4. THE DIFFUSION EQUATION 15
In two dimensions, the diﬀusion equation becomes
∂u
∂t
= ν
¸
∂
2
u
∂x
2
+
∂
2
u
∂y
2
−g(x, y)
¸
(2.31)
where g(x, y) is again a source term. The corresponding steady equation is
∂
2
u
∂x
2
+
∂
2
u
∂y
2
−g(x, y) = 0 (2.32)
While Eq. 2.31 is parabolic, Eq. 2.32 is elliptic. The latter is known as the Poisson
equation for nonzero g, and as Laplace’s equation for zero g.
2.4.2 Solution in Wave Space
We now consider a series solution to Eq. 2.27. Let the domain be given by 0 ≤ x ≤ π
with boundary conditions u(0) = u
a
, u(π) = u
b
. It is clear that the steadystate
solution is given by a linear function which satisﬁes the boundary conditions, i.e.,
h(x) = u
a
+ (u
b
−u
a
)x/π. Let the initial condition be
u(x, 0) =
M
¸
m=1
f
m
(0) sinκ
m
x +h(x) (2.33)
where κ must be an integer in order to satisfy the boundary conditions. A solution
of the form
u(x, t) =
M
¸
m=1
f
m
(t) sin κ
m
x +h(x) (2.34)
satisﬁes the initial and boundary conditions. Substituting this form into Eq. 2.27
gives the following ODE for f
m
:
df
m
dt
= −κ
2
m
νf
m
(2.35)
and we ﬁnd
f
m
(t) = f
m
(0)e
−κ
2
m
νt
(2.36)
Substituting f
m
(t) into equation 2.34, we obtain
u(x, t) =
M
¸
m=1
f
m
(0)e
−κ
2
m
νt
sinκ
m
x +h(x) (2.37)
The steadystate solution (t →∞) is simply h(x). Eq. 2.37 shows that high wavenum
ber components (large κ
m
) of the solution decay more rapidly than low wavenumber
components, consistent with the physics of diﬀusion.
16 CHAPTER 2. CONSERVATION LAWS AND THE MODEL EQUATIONS
2.5 Linear Hyperbolic Systems
The Euler equations, Eq. 2.8, form a hyperbolic system of partial diﬀerential equa
tions. Other systems of equations governing convection and wave propagation phe
nomena, such as the Maxwell equations describing the propagation of electromagnetic
waves, are also of hyperbolic type. Many aspects of numerical methods for such sys
tems can be understood by studying a onedimensional constantcoeﬃcient linear
system of the form
∂u
∂t
+A
∂u
∂x
= 0 (2.38)
where u = u(x, t) is a vector of length m and A is a real m × m matrix. For
conservation laws, this equation can also be written in the form
∂u
∂t
+
∂f
∂x
= 0 (2.39)
where f is the ﬂux vector and A =
∂f
∂u
is the ﬂux Jacobian matrix. The entries in the
ﬂux Jacobian are
a
ij
=
∂f
i
∂u
j
(2.40)
The ﬂux Jacobian for the Euler equations is derived in Section 2.2.
Such a system is hyperbolic if A is diagonalizable with real eigenvalues.
1
Thus
Λ = X
−1
AX (2.41)
where Λ is a diagonal matrix containing the eigenvalues of A, and X is the matrix
of right eigenvectors. Premultiplying Eq. 2.38 by X
−1
, postmultiplying A by the
product XX
−1
, and noting that X and X
−1
are constants, we obtain
∂X
−1
u
∂t
+
∂
Λ
. .. .
X
−1
AX X
−1
u
∂x
= 0 (2.42)
With w = X
−1
u, this can be rewritten as
∂w
∂t
+ Λ
∂w
∂x
= 0 (2.43)
When written in this manner, the equations have been decoupled into m scalar equa
tions of the form
∂w
i
∂t
+λ
i
∂w
i
∂x
= 0 (2.44)
1
See Appendix A for a brief review of some basic relations and deﬁnitions from linear algebra.
2.6. PROBLEMS 17
The elements of w are known as characteristic variables. Each characteristic variable
satisﬁes the linear convection equation with the speed given by the corresponding
eigenvalue of A.
Based on the above, we see that a hyperbolic system in the form of Eq. 2.38 has a
solution given by the superposition of waves which can travel in either the positive or
negative directions and at varying speeds. While the scalar linear convection equation
is clearly an excellent model equation for hyperbolic systems, we must ensure that
our numerical methods are appropriate for wave speeds of arbitrary sign and possibly
widely varying magnitudes.
The onedimensional Euler equations can also be diagonalized, leading to three
equations in the form of the linear convection equation, although they remain non
linear, of course. The eigenvalues of the ﬂux Jacobian matrix, or wave speeds, are
u, u + c, and u − c, where u is the local ﬂuid velocity, and c =
γp/ρ is the local
speed of sound. The speed u is associated with convection of the ﬂuid, while u + c
and u − c are associated with sound waves. Therefore, in a supersonic ﬂow, where
u > c, all of the wave speeds have the same sign. In a subsonic ﬂow, where u < c,
wave speeds of both positive and negative sign are present, corresponding to the fact
that sound waves can travel upstream in a subsonic ﬂow.
The signs of the eigenvalues of the matrix A are also important in determining
suitable boundary conditions. The characteristic variables each satisfy the linear con
vection equation with the wave speed given by the corresponding eigenvalue. There
fore, the boundary conditions can be speciﬁed accordingly. That is, characteristic
variables associated with positive eigenvalues can be speciﬁed at the left boundary,
which corresponds to inﬂow for these variables. Characteristic variables associated
with negative eigenvalues can be speciﬁed at the right boundary, which is the in
ﬂow boundary for these variables. While other boundary condition treatments are
possible, they must be consistent with this approach.
2.6 Problems
1. Show that the 1D Euler equations can be written in terms of the primitive
variables R = [ρ, u, p]
T
as follows:
∂R
∂t
+M
∂R
∂x
= 0
where
M =
u ρ 0
0 u ρ
−1
0 γp u
¸
¸
¸
18 CHAPTER 2. CONSERVATION LAWS AND THE MODEL EQUATIONS
Assume an ideal gas, p = (γ −1)(e −ρu
2
/2).
2. Find the eigenvalues and eigenvectors of the matrix M derived in question 1.
3. Derive the ﬂux Jacobian matrix A = ∂E/∂Q for the 1D Euler equations result
ing from the conservative variable formulation (Eq. 2.5). Find its eigenvalues
and compare with those obtained in question 2.
4. Show that the two matrices M and A derived in questions 1 and 3, respectively,
are related by a similarity transform. (Hint: make use of the matrix S =
∂Q/∂R.)
5. Write the 2D diﬀusion equation, Eq. 2.31, in the form of Eq. 2.2.
6. Given the initial condition u(x, 0) = sinx deﬁned on 0 ≤ x ≤ 2π, write it in the
form of Eq. 2.25, that is, ﬁnd the necessary values of f
m
(0). (Hint: use M = 2
with κ
1
= 1 and κ
2
= −1.) Next consider the same initial condition deﬁned
only at x = 2πj/4, j = 0, 1, 2, 3. Find the values of f
m
(0) required to reproduce
the initial condition at these discrete points using M = 4 with κ
m
= m−1.
7. Plot the ﬁrst three basis functions used in constructing the exact solution to
the diﬀusion equation in Section 2.4.2. Next consider a solution with boundary
conditions u
a
= u
b
= 0, and initial conditions from Eq. 2.33 with f
m
(0) = 1
for 1 ≤ m ≤ 3, f
m
(0) = 0 for m > 3. Plot the initial condition on the domain
0 ≤ x ≤ π. Plot the solution at t = 1 with ν = 1.
8. Write the classical wave equation ∂
2
u/∂t
2
= c
2
∂
2
u/∂x
2
as a ﬁrstorder system,
i.e., in the form
∂U
∂t
+A
∂U
∂x
= 0
where U = [∂u/∂x, ∂u/∂t]
T
. Find the eigenvalues and eigenvectors of A.
9. The CauchyRiemann equations are formed from the coupling of the steady
compressible continuity (conservation of mass) equation
∂ρu
∂x
+
∂ρv
∂y
= 0
and the vorticity deﬁnition
ω = −
∂v
∂x
+
∂u
∂y
= 0
2.6. PROBLEMS 19
where ω = 0 for irrotational ﬂow. For isentropic and homenthalpic ﬂow, the
system is closed by the relation
ρ =
1 −
γ −1
2
u
2
+v
2
−1
1
γ−1
Note that the variables have been nondimensionalized. Combining the two
PDE’s, we have
∂f(q)
∂x
+
∂g(q)
∂y
= 0
where
q =
u
v
, f =
−ρu
v
, g =
−ρv
−u
One approach to solving these equations is to add a timedependent term and
ﬁnd the steady solution of the following equation:
∂q
∂t
+
∂f
∂x
+
∂g
∂y
= 0
(a) Find the ﬂux Jacobians of f and g with respect to q.
(b) Determine the eigenvalues of the ﬂux Jacobians.
(c) Determine the conditions (in terms of ρ and u) under which the system is
hyperbolic, i.e., has real eigenvalues.
(d) Are the above ﬂuxes homogeneous? (See Appendix C.)
20 CHAPTER 2. CONSERVATION LAWS AND THE MODEL EQUATIONS
Chapter 3
FINITEDIFFERENCE
APPROXIMATIONS
In common with the equations governing unsteady ﬂuid ﬂow, our model equations
contain partial derivatives with respect to both space and time. One can approxi
mate these simultaneously and then solve the resulting diﬀerence equations. Alterna
tively, one can approximate the spatial derivatives ﬁrst, thereby producing a system
of ordinary diﬀerential equations. The time derivatives are approximated next, lead
ing to a timemarching method which produces a set of diﬀerence equations. This
is the approach emphasized here. In this chapter, the concept of ﬁnitediﬀerence
approximations to partial derivatives is presented. These can be applied either to
spatial derivatives or time derivatives. Our emphasis in this chapter is on spatial
derivatives; time derivatives are treated in Chapter 6. Strategies for applying these
ﬁnitediﬀerence approximations will be discussed in Chapter 4.
All of the material below is presented in a Cartesian system. We emphasize the
fact that quite general classes of meshes expressed in general curvilinear coordinates
in physical space can be transformed to a uniform Cartesian mesh with equispaced
intervals in a socalled computational space, as shown in Figure 3.1. The computational
space is uniform; all the geometric variation is absorbed into variable coeﬃcients of the
transformed equations. For this reason, in much of the following accuracy analysis,
we use an equispaced Cartesian system without being unduly restrictive or losing
practical application.
3.1 Meshes and FiniteDiﬀerence Notation
The simplest mesh involving both time and space is shown in Figure 3.2. Inspection
of this ﬁgure permits us to deﬁne the terms and notation needed to describe ﬁnite
21
22 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
x
y
Figure 3.1: Physical and computational spaces.
diﬀerence approximations. In general, the dependent variables, u, for example, are
functions of the independent variables t, and x, y, z. For the ﬁrst several chapters
we consider primarily the 1D case u = u(x, t). When only one variable is denoted,
dependence on the other is assumed. The mesh index for x is always j, and that for
t is always n. Then on an equispaced grid
x = x
j
= j∆x (3.1)
t = t
n
= n∆t = nh (3.2)
where ∆x is the spacing in x and ∆t the spacing in t, as shown in Figure 3.2. Note
that h = ∆t throughout. Later k and l are used for y and z in a similar way. When
n, j, k, l are used for other purposes (which is sometimes necessary), local context
should make the meaning obvious.
The convention for subscript and superscript indexing is as follows:
u(t +kh) = u([n +k]h) = u
n+k
u(x +m∆x) = u([j +m]∆x) = u
j+m
(3.3)
u(x +m∆x, t +kh) = u
(n+k)
j+m
Notice that when used alone, both the time and space indices appear as a subscript,
but when used together, time is always a superscript and is usually enclosed with
parentheses to distinguish it from an exponent.
3.1. MESHES AND FINITEDIFFERENCE NOTATION 23
t
x j2 j1 j j+1 j+2
n
n1
n+1
x
t
Grid or
Node
Points
∆
∆
Figure 3.2: Spacetime grid arrangement.
Derivatives are expressed according to the usual conventions. Thus for partial
derivatives in space or time we use interchangeably
∂
x
u =
∂u
∂x
, ∂
t
u =
∂u
∂t
, ∂
xx
u =
∂
2
u
∂x
2
, etc. (3.4)
For the ordinary time derivative in the study of ODE’s we use
u
=
du
dt
(3.5)
In this text, subscripts on dependent variables are never used to express derivatives.
Thus u
x
will not be used to represent the ﬁrst derivative of u with respect to x.
The notation for diﬀerence approximations follows the same philosophy, but (with
one exception) it is not unique. By this we mean that the symbol δ is used to represent
a diﬀerence approximation to a derivative such that, for example,
δ
x
≈ ∂
x
, δ
xx
≈ ∂
xx
(3.6)
but the precise nature (and order) of the approximation is not carried in the symbol
δ. Other ways are used to determine its precise meaning. The one exception is the
symbol ∆, which is deﬁned such that
∆t
n
= t
n+1
−t
n
, ∆x
j
= x
j+1
−x
j
, ∆u
n
= u
n+1
−u
n
, etc. (3.7)
When there is no subscript on ∆t or ∆x, the spacing is uniform.
24 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
3.2 Space Derivative Approximations
A diﬀerence approximation can be generated or evaluated by means of a simple Taylor
series expansion. For example, consider u(x, t) with t ﬁxed. Then, following the
notation convention given in Eqs. 3.1 to 3.3, x = j∆x and u(x + k∆x) = u(j∆x +
k∆x) = u
j+k
. Expanding the latter term about x gives
1
u
j+k
= u
j
+ (k∆x)
∂u
∂x
j
+
1
2
(k∆x)
2
∂
2
u
∂x
2
j
+· · · +
1
n!
(k∆x)
n
∂
n
u
∂x
n
j
+· · · (3.8)
Local diﬀerence approximations to a given partial derivative can be formed from linear
combinations of u
j
and u
j+k
for k = ±1, ±2, · · ·.
For example, consider the Taylor series expansion for u
j+1
:
u
j+1
= u
j
+ (∆x)
∂u
∂x
j
+
1
2
(∆x)
2
∂
2
u
∂x
2
j
+· · · +
1
n!
(∆x)
n
∂
n
u
∂x
n
j
+· · · (3.9)
Now subtract u
j
and divide by ∆x to obtain
u
j+1
−u
j
∆x
=
∂u
∂x
j
+
1
2
(∆x)
∂
2
u
∂x
2
j
+· · · (3.10)
Thus the expression (u
j+1
−u
j
)/∆x is a reasonable approximation for
∂u
∂x
j
as long as
∆x is small relative to some pertinent length scale. Next consider the space diﬀerence
approximation (u
j+1
−u
j−1
)/(2∆x). Expand the terms in the numerator about j and
regroup the result to form the following equation
u
j+1
−u
j−1
2∆x
−
∂u
∂x
j
=
1
6
∆x
2
∂
3
u
∂x
3
j
+
1
120
∆x
4
∂
5
u
∂x
5
j
. . . (3.11)
When expressed in this manner, it is clear that the discrete terms on the left side of
the equation represent a ﬁrst derivative with a certain amount of error which appears
on the right side of the equal sign. It is also clear that the error depends on the
grid spacing to a certain order. The error term containing the grid spacing to the
lowest power gives the order of the method. From Eq. 3.10, we see that the expression
(u
j+1
−u
j
)/∆x is a ﬁrstorder approximation to
∂u
∂x
j
. Similarly, Eq. 3.11 shows that
(u
j+1
−u
j−1
)/(2∆x) is a secondorder approximation to a ﬁrst derivative. The latter
is referred to as the threepoint centered diﬀerence approximation, and one often sees
the summary result presented in the form
∂u
∂x
j
=
u
j+1
−u
j−1
2∆x
+O(∆x
2
) (3.12)
1
We assume that u(x, t) is continuously diﬀerentiable.
3.3. FINITEDIFFERENCE OPERATORS 25
3.3 FiniteDiﬀerence Operators
3.3.1 Point Diﬀerence Operators
Perhaps the most common examples of ﬁnitediﬀerence formulas are the threepoint
centereddiﬀerence approximations for the ﬁrst and second derivatives:
2
∂u
∂x
j
=
1
2∆x
(u
j+1
−u
j−1
) +O(∆x
2
) (3.13)
∂
2
u
∂x
2
j
=
1
∆x
2
(u
j+1
−2u
j
+u
j−1
) +O(∆x
2
) (3.14)
These are the basis for point diﬀerence operators since they give an approximation to
a derivative at one discrete point in a mesh in terms of surrounding points. However,
neither of these expressions tells us how other points in the mesh are diﬀerenced or
how boundary conditions are enforced. Such additional information requires a more
sophisticated formulation.
3.3.2 Matrix Diﬀerence Operators
Consider the relation
(δ
xx
u)
j
=
1
∆x
2
(u
j+1
−2u
j
+u
j−1
) (3.15)
which is a point diﬀerence approximation to a second derivative. Now let us derive a
matrix operator representation for the same approximation. Consider the four point
mesh with boundary points at a and b shown below. Notice that when we speak of
“the number of points in a mesh”, we mean the number of interior points excluding
the boundaries.
a 1 2 3 4 b
x = 0 − − − − π
j = 1 · · M
Four point mesh. ∆x = π/(M + 1)
Now impose Dirichlet boundary conditions, u(0) = u
a
, u(π) = u
b
and use the
centered diﬀerence approximation given by Eq. 3.15 at every point in the mesh. We
2
We will derive the second derivative operator shortly.
26 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
arrive at the four equations:
(δ
xx
u)
1
=
1
∆x
2
(u
a
−2u
1
+u
2
)
(δ
xx
u)
2
=
1
∆x
2
(u
1
−2u
2
+u
3
)
(δ
xx
u)
3
=
1
∆x
2
(u
2
−2u
3
+u
4
)
(δ
xx
u)
4
=
1
∆x
2
(u
3
−2u
4
+u
b
) (3.16)
Writing these equations in the more suggestive form
(δ
xx
u)
1
= ( u
a
−2u
1
+u
2
)/∆x
2
(δ
xx
u)
2
= ( u
1
−2u
2
+u
3
)/∆x
2
(δ
xx
u)
3
= ( u
2
−2u
3
+u
4
)/∆x
2
(δ
xx
u)
4
= ( u
3
−2u
4
+u
b
)/∆x
2
(3.17)
it is clear that we can express them in a vectormatrix form, and further, that the
resulting matrix has a very special form. Introducing
u =
u
1
u
2
u
3
u
4
¸
¸
¸
¸
¸
,
bc
=
1
∆x
2
u
a
0
0
u
b
¸
¸
¸
¸
¸
(3.18)
and
A =
1
∆x
2
−2 1
1 −2 1
1 −2 1
1 −2
¸
¸
¸
¸
¸
(3.19)
we can rewrite Eq. 3.17 as
δ
xx
u = Au +
bc
(3.20)
This example illustrates a matrix diﬀerence operator. Each line of a matrix diﬀer
ence operator is based on a point diﬀerence operator, but the point operators used
from line to line are not necessarily the same. For example, boundary conditions may
dictate that the lines at or near the bottom or top of the matrix be modiﬁed. In the
extreme case of the matrix diﬀerence operator representing a spectral method, none
3.3. FINITEDIFFERENCE OPERATORS 27
of the lines is the same. The matrix operators representing the threepoint central
diﬀerence approximations for a ﬁrst and second derivative with Dirichlet boundary
conditions on a fourpoint mesh are
δ
x
=
1
2∆x
0 1
−1 0 1
−1 0 1
−1 0
¸
¸
¸
¸
¸
, δ
xx
=
1
∆x
2
−2 1
1 −2 1
1 −2 1
1 −2
¸
¸
¸
¸
¸
(3.21)
As a further example, replace the fourth line in Eq. 3.16 by the following point
operator for a Neumann boundary condition (See Section 3.6.):
(δ
xx
u)
4
=
2
3
1
∆x
∂u
∂x
b
−
2
3
1
∆x
2
(u
4
−u
3
) (3.22)
where the boundary condition is
∂u
∂x
x=π
=
∂u
∂x
b
(3.23)
Then the matrix operator for a threepoint centraldiﬀerencing scheme at interior
points and a secondorder approximation for a Neumann condition on the right is
given by
δ
xx
=
1
∆x
2
−2 1
1 −2 1
1 −2 1
2/3 −2/3
¸
¸
¸
¸
¸
(3.24)
Each of these matrix diﬀerence operators is a square matrix with elements that are
all zeros except for those along bands which are clustered around the central diagonal.
We call such a matrix a banded matrix and introduce the notation
B(M : a, b, c) =
b c
a b c
.
.
.
a b c
a b
¸
¸
¸
¸
¸
¸
¸
¸
¸
1
.
.
.
M
(3.25)
where the matrix dimensions are M × M. Use of M in the argument is optional,
and the illustration is given for a simple tridiagonal matrix although any number of
28 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
bands is a possibility. A tridiagonal matrix without constants along the bands can be
expressed as B(
a,
b,
c). The arguments for a banded matrix are always odd in number
and the central one always refers to the central diagonal.
We can now generalize our previous examples. Deﬁning u as
3
u =
u
1
u
2
u
3
.
.
.
u
M
¸
¸
¸
¸
¸
¸
¸
¸
¸
(3.26)
we can approximate the second derivative of u by
δ
xx
u =
1
∆x
2
B(1, −2, 1)
u +
bc
(3.27)
where
bc
stands for the vector holding the Dirichlet boundary conditions on the
left and right sides:
bc
=
1
∆x
2
[u
a
, 0, · · · , 0, u
b
]
T
(3.28)
If we prescribe Neumann boundary conditions on the right side, as in Eqs. 3.24 and
3.22, we ﬁnd
δ
xx
u =
1
∆x
2
B(
a,
b, 1)
u +
bc
(3.29)
where
a = [1, 1, · · · , 2/3]
T
b = [−2, −2, −2, · · · , −2/3]
T
bc
=
1
∆x
2
¸
u
a
, 0, 0, · · · ,
2∆x
3
∂u
∂x
b
¸
T
Notice that the matrix operators given by Eqs. 3.27 and 3.29 carry more informa
tion than the point operator given by Eq. 3.15. In Eqs. 3.27 and 3.29, the boundary
conditions have been uniquely speciﬁed and it is clear that the same point operator
has been applied at every point in the ﬁeld except at the boundaries. The ability to
specify in the matrix derivative operator the exact nature of the approximation at the
3
Note that u is a function of time only since each element corresponds to one speciﬁc spatial
location.
3.3. FINITEDIFFERENCE OPERATORS 29
various points in the ﬁeld including the boundaries permits the use of quite general
constructions which will be useful later in considerations of stability.
Since we make considerable use of both matrix and point operators, it is important
to establish a relation between them. A point operator is generally written for some
derivative at the reference point j in terms of neighboring values of the function. For
example
(δ
x
u)
j
= a
2
u
j−2
+a
1
u
j−1
+bu
j
+c
1
u
j+1
(3.30)
might be the point operator for a ﬁrst derivative. The corresponding matrix operator
has for its arguments the coeﬃcients giving the weights to the values of the function
at the various locations. A jshift in the point operator corresponds to a diagonal
shift in the matrix operator. Thus the matrix equivalent of Eq. 3.30 is
δ
x
u = B(a
2
, a
1
, b, c
1
, 0)u (3.31)
Note the addition of a zero in the ﬁfth element which makes it clear that b is the
coeﬃcient of u
j
.
3.3.3 Periodic Matrices
The above illustrated cases in which the boundary conditions are ﬁxed. If the bound
ary conditions are periodic, the form of the matrix operator changes. Consider the
eightpoint periodic mesh shown below. This can either be presented on a linear mesh
with repeated entries, or more suggestively on a circular mesh as in Figure 3.3. When
the mesh is laid out on the perimeter of a circle, it doesn’t really matter where the
numbering starts as long as it “ends” at the point just preceding its starting location.
· · · 7 8 1 2 3 4 5 6 7 8 1 2 · · ·
x = − − 0 − − − − − − − 2π −
j = 0 1 · · · · · · M
Eight points on a linear periodic mesh. ∆x = 2π/M
The matrix that represents diﬀerencing schemes for scalar equations on a periodic
mesh is referred to as a periodic matrix. A typical periodic tridiagonal matrix operator
30 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
1
2
3
4
5
6
7
8
Figure 3.3: Eight points on a circular mesh.
with nonuniform entries is given for a 6point mesh by
B
p
(6 :
a,
b,
c) =
b
1
c
2
a
6
a
1
b
2
c
3
a
2
b
3
c
4
a
3
b
4
c
5
a
4
b
5
c
6
c
1
a
5
b
6
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
(3.32)
3.3.4 Circulant Matrices
In general, as shown in the example, the elements along the diagonals of a periodic
matrix are not constant. However, a special subset of a periodic matrix is the circulant
matrix, formed when the elements along the various bands are constant. Circulant
matrices play a vital role in our analysis. We will have much more to say about them
later. The most general circulant matrix of order 4 is
b
0
b
1
b
2
b
3
b
3
b
0
b
1
b
2
b
2
b
3
b
0
b
1
b
1
b
2
b
3
b
0
¸
¸
¸
¸
¸
(3.33)
Notice that each row of a circulant matrix is shifted (see Figure 3.3) one element to
the right of the one above it. The special case of a tridiagonal circulant matrix is
3.4. CONSTRUCTING DIFFERENCING SCHEMES OF ANY ORDER 31
given by
B
p
(M : a, b, c) =
b c a
a b c
.
.
.
a b c
c a b
¸
¸
¸
¸
¸
¸
¸
¸
¸
1
.
.
.
M
(3.34)
When the standard threepoint centraldiﬀerencing approximations for a ﬁrst and
second derivative, see Eq. 3.21, are used with periodic boundary conditions, they take
the form
(δ
x
)
p
=
1
2∆x
0 1 −1
−1 0 1
−1 0 1
1 −1 0
¸
¸
¸
¸
¸
=
1
2∆x
B
p
(−1, 0, 1)
and
(δ
xx
)
p
=
1
∆x
2
−2 1 1
1 −2 1
1 −2 1
1 1 −2
¸
¸
¸
¸
¸
=
1
∆x
2
B
p
(1, −2, 1) (3.35)
Clearly, these special cases of periodic operators are also circulant operators. Later
on we take advantage of this special property. Notice that there are no boundary
condition vectors since this information is all interior to the matrices themselves.
3.4 Constructing Diﬀerencing Schemes of Any Or
der
3.4.1 Taylor Tables
The Taylor series expansion of functions about a ﬁxed point provides a means for con
structing ﬁnitediﬀerence pointoperators of any order. A simple and straightforward
way to carry this out is to construct a “Taylor table,” which makes extensive use of
the expansion given by Eq. 3.8. As an example, consider Table 3.1, which represents
a Taylor table for an approximation of a second derivative using three values of the
function centered about the point at which the derivative is to be evaluated.
32 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
∂
2
u
∂x
2
j
−
1
∆x
2
(a u
j−1
+b u
j
+c u
j+1
) = ?
∆x· ∆x
2
· ∆x
3
· ∆x
4
·
u
j
∂u
∂x
j
∂
2
u
∂x
2
j
∂
3
u
∂x
3
j
∂
4
u
∂x
4
j
∆x
2
·
∂
2
u
∂x
2
j
1
−a · u
j−1
−a −a · (1) ·
1
1
−a · (1)
2
·
1
2
−a · (1)
3
·
1
6
−a · (1)
4
·
1
24
−b · u
j
−b
−c · u
j+1
−c −c · (1) ·
1
1
−c · (1)
2
·
1
2
−c · (1)
3
·
1
6
−c · (1)
4
·
1
24
Table 3.1. Taylor table for centered 3point Lagrangian approximation to a second
derivative.
The table is constructed so that some of the algebra is simpliﬁed. At the top of the
table we see an expression with a question mark. This represents one of the questions
that a study of this table can answer; namely, what is the local error caused by the
use of this approximation? Notice that all of the terms in the equation appear in
a column at the left of the table (although, in this case, ∆x
2
has been multiplied
into each term in order to simplify the terms to be put into the table). Then notice
that at the head of each column there appears the common factor that occurs in the
expansion of each term about the point j, that is,
∆x
k
·
∂
k
u
∂x
k
j
k = 0, 1, 2, · · ·
The columns to the right of the leftmost one, under the headings, make up the Taylor
table. Each entry is the coeﬃcient of the term at the top of the corresponding column
in the Taylor series expansion of the term to the left of the corresponding row. For
example, the last row in the table corresponds to the Taylor series expansion of
−c u
j+1
:
−c u
j+1
= −c u
j
−c · (1) ·
1
1
∆x ·
∂u
∂x
j
−c · (1)
2
·
1
2
∆x
2
·
∂
2
u
∂x
2
j
−c · (1)
3
·
1
6
∆x
3
·
∂
3
u
∂x
3
j
−c · (1)
4
·
1
24
∆x
4
·
∂
4
u
∂x
4
j
−· · · (3.36)
A Taylor table is simply a convenient way of forming linear combinations of Taylor
series on a term by term basis.
3.4. CONSTRUCTING DIFFERENCING SCHEMES OF ANY ORDER 33
Consider the sum of each of these columns. To maximize the order of accuracy
of the method, we proceed from left to right and force, by the proper choice of a, b,
and c, these sums to be zero. One can easily show that the sums of the ﬁrst three
columns are zero if we satisfy the equation
−1 −1 −1
1 0 −1
−1 0 −1
¸
¸
¸
a
b
c
¸
¸
¸ =
0
0
−2
¸
¸
¸
The solution is given by [a, b, c] = [1, −2, 1].
The columns that do not sum to zero constitute the error.
We designate the ﬁrst nonvanishing sum to be er
t
, and refer to
it as the Taylor series error.
In this case er
t
occurs at the ﬁfth column in the table (for this example all even
columns will vanish by symmetry) and one ﬁnds
er
t
=
1
∆x
2
¸
−a
24
+
−c
24
∆x
4
∂
4
u
∂x
4
j
=
−∆x
2
12
∂
4
u
∂x
4
j
(3.37)
Note that ∆x
2
has been divided through to make the error term consistent. We
have just derived the familiar 3point centraldiﬀerencing point operator for a second
derivative
∂
2
u
∂x
2
j
−
1
∆x
2
(u
j−1
−2u
j
+u
j+1
) = O(∆x
2
) (3.38)
The Taylor table for a 3point backwarddiﬀerencing operator representing a ﬁrst
derivative is shown in Table 3.2.
34 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
∂u
∂x
j
−
1
∆x
(a
2
u
j−2
+a
1
u
j−1
+b u
j
) = ?
∆x· ∆x
2
· ∆x
3
· ∆x
4
·
u
j
∂u
∂x
j
∂
2
u
∂x
2
j
∂
3
u
∂x
3
j
∂
4
u
∂x
4
j
∆x ·
∂u
∂x
j
1
−a
2
· u
j−2
−a
2
−a
2
· (2) ·
1
1
−a
2
· (2)
2
·
1
2
−a
2
· (2)
3
·
1
6
−a
2
· (2)
4
·
1
24
−a
1
· u
j−1
−a
1
−a
1
· (1) ·
1
1
−a
1
· (1)
2
·
1
2
−a
1
· (1)
3
·
1
6
−a
1
· (1)
4
·
1
24
−b · u
j
−b
Table 3.2. Taylor table for backward 3point Lagrangian approximation to a ﬁrst
derivative.
This time the ﬁrst three columns sum to zero if
−1 −1 −1
2 1 0
−4 −1 0
¸
¸
¸
a
2
a
1
b
¸
¸
¸ =
0
−1
0
¸
¸
¸
which gives [a
2
, a
1
, b] =
1
2
[1, −4, 3]. In this case the fourth column provides the leading
truncation error term:
er
t
=
1
∆x
¸
8a
2
6
+
a
1
6
∆x
3
∂
3
u
∂x
3
j
=
∆x
2
3
∂
3
u
∂x
3
j
(3.39)
Thus we have derived a secondorder backwarddiﬀerence approximation of a ﬁrst
derivative:
∂u
∂x
j
−
1
2∆x
(u
j−2
−4u
j−1
+ 3u
j
) = O(∆x
2
) (3.40)
3.4.2 Generalization of Diﬀerence Formulas
In general, a diﬀerence approximation to the mth derivative at grid point j can be
cast in terms of q +p + 1 neighboring points as
∂
m
u
∂x
m
j
−
q
¸
i=−p
a
i
u
j+i
= er
t
(3.41)
3.4. CONSTRUCTING DIFFERENCING SCHEMES OF ANY ORDER 35
where the a
i
are coeﬃcients to be determined through the use of Taylor tables to
produce approximations of a given order. Clearly this process can be used to ﬁnd
forward, backward, skewed, or central point operators of any order for any derivative.
It could be computer automated and extended to higher dimensions. More important,
however, is the fact that it can be further generalized. In order to do this, let us
approach the subject in a slightly diﬀerent way, that is from the point of view of
interpolation formulas. These formulas are discussed in many texts on numerical
analysis.
3.4.3 Lagrange and Hermite Interpolation Polynomials
The Lagrangian interpolation polynomial is given by
u(x) =
K
¸
k=0
a
k
(x)u
k
(3.42)
where a
k
(x) are polynomials in x of degree K. The construction of the a
k
(x) can be
taken from the simple Lagrangian formula for quadratic interpolation (or extrapola
tion) with nonequispaced points
u(x) = u
0
(x
1
−x)(x
2
−x)
(x
1
−x
0
)(x
2
−x
0
)
+u
1
(x
0
−x)(x
2
−x)
(x
0
−x
1
)(x
2
−x
1
)
+u
2
(x
0
−x)(x
1
−x)
(x
0
−x
2
)(x
1
−x
2
)
(3.43)
Notice that the coeﬃcient of each u
k
is one when x = x
k
, and zero when x takes
any other discrete value in the set. If we take the ﬁrst or second derivative of u(x),
impose an equispaced mesh, and evaluate these derivatives at the appropriate dis
crete point, we rederive the ﬁnitediﬀerence approximations just presented. Finite
diﬀerence schemes that can be derived from Eq. 3.42 are referred to as Lagrangian
approximations.
A generalization of the Lagrangian approach is brought about by using Hermitian
interpolation. To construct a polynomial for u(x), Hermite formulas use values of the
function and its derivative(s) at given points in space. Our illustration is for the case
in which discrete values of the function and its ﬁrst derivative are used, producing
the expression
u(x) =
¸
a
k
(x)u
k
+
¸
b
k
(x)
∂u
∂x
k
(3.44)
36 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
Obviously higherorder derivatives could be included as the problems dictate. A com
plete discussion of these polynomials can be found in many references on numerical
methods, but here we need only the concept.
The previous examples of a Taylor table constructed explicit point diﬀerence op
erators from Lagrangian interpolation formulas. Consider next the Taylor table for
an implicit space diﬀerencing scheme for a ﬁrst derivative arising from the use of a
Hermite interpolation formula. A generalization of Eq. 3.41 can include derivatives
at neighboring points, i.e.,
s
¸
i=−r
b
i
∂
m
u
∂x
m
j+i
+
q
¸
i=−p
a
i
u
j+i
= er
t
(3.45)
analogous to Eq. 3.44. An example formula is illustrated at the top of Table 3.3. Here
not only is the derivative at point j represented, but also included are derivatives at
points j −1 and j + 1, which also must be expanded using Taylor series about point
j. This requires the following generalization of the Taylor series expansion given in
Eq. 3.8:
∂
m
u
∂x
m
j+k
=
¸
∞
¸
n=0
1
n!
(k∆x)
n
∂
n
∂x
n
¸
∂
m
u
∂x
m
¸
j
(3.46)
The derivative terms now have coeﬃcients (the coeﬃcient on the j point is taken
as one to simplify the algebra) which must be determined using the Taylor table
approach as outlined below.
d
∂u
∂x
j−1
+
∂u
∂x
j
+e
∂u
∂x
j+1
−
1
∆x
(au
j−1
+bu
j
+cu
j+1
) = ?
∆x· ∆x
2
· ∆x
3
· ∆x
4
· ∆x
5
·
u
j
∂u
∂x
j
∂
2
u
∂x
2
j
∂
3
u
∂x
3
j
∂
4
u
∂x
4
j
∂
5
u
∂x
5
j
∆x · d
∂u
∂x
j−1
d d · (1) ·
1
1
d · (1)
2
·
1
2
d · (1)
3
·
1
6
d · (1)
4
·
1
24
∆x ·
∂u
∂x
j
1
∆x · e
∂u
∂x
j+1
e e · (1) ·
1
1
e · (1)
2
·
1
2
e · (1)
3
·
1
6
e · (1)
4
·
1
24
−a · u
j−1
−a −a · (1) ·
1
1
−a · (1)
2
·
1
2
−a · (1)
3
·
1
6
−a · (1)
4
·
1
24
−a · (1)
5
·
1
120
−b · u
j
−b
−c · u
j+1
−c −c · (1) ·
1
1
−c · (1)
2
·
1
2
−c · (1)
3
·
1
6
−c · (1)
4
·
1
24
−c · (1)
5
·
1
120
Table 3.3. Taylor table for central 3point Hermitian approximation to a ﬁrst derivative.
3.4. CONSTRUCTING DIFFERENCING SCHEMES OF ANY ORDER 37
To maximize the order of accuracy, we must satisfy the relation
−1 −1 −1 0 0
1 0 −1 1 1
−1 0 −1 −2 2
1 0 −1 3 3
−1 0 −1 −4 4
¸
¸
¸
¸
¸
¸
¸
a
b
c
d
e
¸
¸
¸
¸
¸
¸
¸
=
0
−1
0
0
0
¸
¸
¸
¸
¸
¸
¸
having the solution [a, b, c, d, e] =
1
4
[−3, 0, 3, 1, 1]. Under these conditions the sixth
column sums to
er
t
=
∆x
4
120
∂
5
u
∂x
5
j
(3.47)
and the method can be expressed as
∂u
∂x
j−1
+ 4
∂u
∂x
j
+
∂u
∂x
j+1
−
3
∆x
(−u
j−1
+u
j+1
) = O(∆x
4
) (3.48)
This is also referred to as a Pad´e formula.
3.4.4 Practical Application of Pad´e Formulas
It is one thing to construct methods using the Hermitian concept and quite another
to implement them in a computer code. In the form of a point operator it is probably
not evident at ﬁrst just how Eq. 3.48 can be applied. However, the situation is quite
easy to comprehend if we express the same method in the form of a matrix operator.
A banded matrix notation for Eq. 3.48 is
1
6
B(1, 4, 1)δ
x
u =
1
2∆x
B(−1, 0, 1)
u +
bc
(3.49)
in which Dirichlet boundary conditions have been imposed.
4
Mathematically this is
equivalent to
δ
x
u = 6[B(1, 4, 1)]
−1
¸
1
2∆x
B(−1, 0, 1)
u +
bc
(3.50)
which can be reexpressed by the “predictorcorrector” sequence
˜ u =
1
2∆x
B(−1, 0, 1)
u +
bc
δ
x
u = 6 [B(1, 4, 1)]
−1
˜ u (3.51)
4
In this case the vector containing the boundary conditions would include values of both u and
∂u/∂x at both boundaries.
38 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
With respect to practical implementation, the meaning of the predictor in this
sequence should be clear. It simply says – take the vector array
u, diﬀerence it,
add the boundary conditions, and store the result in the intermediate array
˜ u. The
meaning of the second row is more subtle, since it is demanding the evaluation of an
inverse operator, but it still can be given a simple interpretation. An inverse matrix
operator implies the solution of a coupled set of linear equations. These operators
are very common in ﬁnite diﬀerence applications. They appear in the form of banded
matrices having a small bandwidth, in this case a tridiagonal. The evaluation of
[B(1, 4, 1)]
−1
is found by means of a tridiagonal “solver”, which is simple to code,
eﬃcient to run, and widely used. In general, Hermitian or Pad´e approximations can
be practical when they can be implemented by the use of eﬃcient banded solvers.
3.4.5 Other HigherOrder Schemes
Hermitian forms of the second derivative can also be easily derived by means of a
Taylor table. For example
δ
xx
u = 12 [B(1, 10, 1)]
−1
¸
1
∆x
2
B(1, −2, 1)
u +
bc
(3.52)
is O(∆x
4
) and makes use of only tridiagonal operations. It should be mentioned that
the spline approximation is one form of a Pad´e matrix diﬀerence operator. It is given
by
δ
xx
u = 6 [B(1, 4, 1)]
−1
¸
1
∆x
2
B(1, −2, 1)
u +
bc
(3.53)
but its order of accuracy is only O(∆x
2
). How much this reduction in accuracy is
oﬀset by the increased global continuity built into a spline ﬁt is not known. We note
that the spline ﬁt of a ﬁrst derivative is identical to any of the expressions in Eqs.
3.48 to 3.51.
A ﬁnal word on Hermitian approximations. Clearly they have an advantage over
3point Lagrangian schemes because of their increased accuracy. However, a more
subtle point is that they get this increase in accuracy using information that is still
local to the point where the derivatives are being evaluated. In application, this can
be advantageous at boundaries and in the vicinity of steep gradients. It is obvious, of
course, that ﬁve point schemes using Lagrangian approximations can be derived that
have the same order of accuracy as the methods given in Eqs. 3.48 and 3.52, but they
will have a wider spread of space indices. In particular, two Lagrangian schemes with
the same order of accuracy are (here we ignore the problem created by the boundary
conditions, although this is one of the principal issues in applying these schemes):
∂
u
∂x
−
1
12∆x
B
p
(1, −8, 0, 8, −1)
u = O
∆x
4
(3.54)
3.5. FOURIER ERROR ANALYSIS 39
∂
2
u
∂x
2
−
1
12∆x
2
B
p
(−1, 16, −30, 16, −1)
u = O
∆x
4
(3.55)
3.5 Fourier Error Analysis
In order to select a ﬁnitediﬀerence scheme for a given application one must be able
to assess the accuracy of the candidate schemes. The accuracy of an operator is often
expressed in terms of the order of the leading error term determined from a Taylor
table. While this is a useful measure, it provides a fairly limited description. Further
information about the error behavior of a ﬁnitediﬀerence scheme can be obtained
using Fourier error analysis.
3.5.1 Application to a Spatial Operator
An arbitrary periodic function can be decomposed into its Fourier components, which
are in the form e
iκx
, where κ is the wavenumber. It is therefore of interest to examine
how well a given ﬁnitediﬀerence operator approximates derivatives of e
iκx
. We will
concentrate here on ﬁrst derivative approximations, although the analysis is equally
applicable to higher derivatives.
The exact ﬁrst derivative of e
iκx
is
∂e
iκx
∂x
= iκe
iκx
(3.56)
If we apply, for example, a secondorder centered diﬀerence operator to u
j
= e
iκx
j
,
where x
j
= j∆x, we get
(δ
x
u)
j
=
u
j+1
−u
j−1
2∆x
=
e
iκ∆x(j+1)
−e
iκ∆x(j−1)
2∆x
=
(e
iκ∆x
−e
−iκ∆x
)e
iκx
j
2∆x
=
1
2∆x
[(cos κ∆x +i sin κ∆x) −(cos κ∆x −i sinκ∆x)]e
iκx
j
= i
sinκ∆x
∆x
e
iκx
j
= iκ
∗
e
iκx
j
(3.57)
40 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
0 0.5 1 1.5 2 2.5 3
0
0.5
1
1.5
2
2.5
3
κ x ∆
κ
*
x ∆
2 Central
nd
4 Central
th
4 Pade
th
Figure 3.4: Modiﬁed wavenumber for various schemes.
where κ
∗
is the modiﬁed wavenumber. The modiﬁed wavenumber is so named because
it appears where the wavenumber, κ, appears in the exact expression. Thus the degree
to which the modiﬁed wavenumber approximates the actual wavenumber is a measure
of the accuracy of the approximation.
For the secondorder centered diﬀerence operator the modiﬁed wavenumber is given
by
κ
∗
=
sinκ∆x
∆x
(3.58)
Note that κ
∗
approximates κ to secondorder accuracy, as is to be expected, since
sinκ∆x
∆x
= κ −
κ
3
∆x
2
6
+. . .
Equation 3.58 is plotted in Figure 3.4, along with similar relations for the stan
dard fourthorder centered diﬀerence scheme and the fourthorder Pad´e scheme. The
expression for the modiﬁed wavenumber provides the accuracy with which a given
wavenumber component of the solution is resolved for the entire wavenumber range
available in a mesh of a given size, 0 ≤ κ∆x ≤ π.
In general, ﬁnitediﬀerence operators can be written in the form
(δ
x
)
j
= (δ
a
x
)
j
+ (δ
s
x
)
j
3.5. FOURIER ERROR ANALYSIS 41
where (δ
a
x
)
j
is an antisymmetric operator and (δ
s
x
)
j
is a symmetric operator.
5
If we
restrict our interest to schemes extending from j −3 to j + 3, then
(δ
a
x
u)
j
=
1
∆x
[a
1
(u
j+1
−u
j−1
) +a
2
(u
j+2
−u
j−2
) +a
3
(u
j+3
−u
j−3
)]
and
(δ
s
x
u)
j
=
1
∆x
[d
0
u
j
+d
1
(u
j+1
+u
j−1
) +d
2
(u
j+2
+u
j−2
) +d
3
(u
j+3
+u
j−3
)]
The corresponding modiﬁed wavenumber is
iκ
∗
=
1
∆x
[d
0
+ 2(d
1
cos κ∆x +d
2
cos 2κ∆x +d
3
cos 3κ∆x)
+ 2i(a
1
sin κ∆x +a
2
sin2κ∆x +a
3
sin 3κ∆x) (3.59)
When the ﬁnitediﬀerence operator is antisymmetric (centered), the modiﬁed wavenum
ber is purely real. When the operator includes a symmetric component, the modiﬁed
wavenumber is complex, with the imaginary component being entirely error. The
fourthorder Pad´e scheme is given by
(δ
x
u)
j−1
+ 4(δ
x
u)
j
+ (δ
x
u)
j+1
=
3
∆x
(u
j+1
−u
j−1
)
The modiﬁed wavenumber for this scheme satisﬁes
6
iκ
∗
e
−iκ∆x
+ 4iκ
∗
+iκ
∗
e
iκ∆x
=
3
∆x
(e
iκ∆x
−e
−iκ∆x
)
which gives
iκ
∗
=
3i sinκ∆x
(2 + cos κ∆x)∆x
The modiﬁed wavenumber provides a useful tool for assessing diﬀerence approx
imations. In the context of the linear convection equation, the errors can be given
a physical interpretation. Consider once again the linear convection equation in the
form
∂u
∂t
+a
∂u
∂x
= 0
5
In terms of a circulant matrix operator A, the antisymmetric part is obtained from (A−A
T
)/2
and the symmetric part from (A + A
T
)/2.
6
Note that terms such as (δ
x
u)
j−1
are handled by letting (δ
x
u)
j
= ik
∗
e
iκj∆x
and evaluating the
shift in j.
42 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
on a domain extending from −∞ to ∞. Recall from Section 2.3.2 that a solution
initiated by a harmonic function with wavenumber κ is
u(x, t) = f(t)e
iκx
(3.60)
where f(t) satisﬁes the ODE
df
dt
= −iaκf
Solving for f(t) and substituting into Eq. 3.60 gives the exact solution as
u(x, t) = f(0)e
iκ(x−at)
If secondorder centered diﬀerences are applied to the spatial term, the following
ODE is obtained for f(t):
df
dt
= −ia
¸
sinκ∆x
∆x
f = −iaκ
∗
f (3.61)
Solving this ODE exactly (since we are considering the error from the spatial approx
imation only) and substituting into Eq. 3.60, we obtain
u
numerical
(x, t) = f(0)e
iκ(x−a
∗
t)
(3.62)
where a
∗
is the numerical (or modiﬁed) phase speed, which is related to the modiﬁed
wavenumber by
a
∗
a
=
κ
∗
κ
For the above example,
a
∗
a
=
sin κ∆x
κ∆x
The numerical phase speed is the speed at which a harmonic function is propagated
numerically. Since a
∗
/a ≤ 1 for this example, the numerical solution propagates
too slowly. Since a
∗
is a function of the wavenumber, the numerical approximation
introduces dispersion, although the original PDE is nondispersive. As a result, a
waveform consisting of many diﬀerent wavenumber components eventually loses its
original form.
Figure 3.5 shows the numerical phase speed for the schemes considered previously.
The number of points per wavelength (PPW) by which a given wave is resolved is
given by 2π/κ∆x. The resolving eﬃciency of a scheme can be expressed in terms
of the PPW required to produce errors below a speciﬁed level. For example, the
secondorder centered diﬀerence scheme requires 80 PPW to produce an error in
phase speed of less than 0.1 percent. The 5point fourthorder centered scheme and
3.6. DIFFERENCE OPERATORS AT BOUNDARIES 43
0 0.5 1 1.5 2 2.5 3
0
0.2
0.4
0.6
0.8
1
1.2
κ x ∆
2 Central
nd
4 Central
th
4 Pade
th
a
a
*
_
Figure 3.5: Numerical phase speed for various schemes.
the fourthorder Pad´e scheme require 15 and 10 PPW respectively to achieve the
same error level.
For our example using secondorder centered diﬀerences, the modiﬁed wavenumber
is purely real, but in the general case it can include an imaginary component as well,
as shown in Eq. 3.59. In that case, the error in the phase speed is determined from
the real part of the modiﬁed wavenumber, while the imaginary part leads to an error
in the amplitude of the solution, as can be seen by inspecting Eq. 3.61. Thus the
antisymmetric portion of the spatial diﬀerence operator determines the error in speed
and the symmetric portion the error in amplitude. This will be further discussed in
Section 11.2.
3.6 Diﬀerence Operators at Boundaries
As discussed in Section 3.3.2, a matrix diﬀerence operator incorporates both the
diﬀerence approximation in the interior of the domain and that at the boundaries.
In this section, we consider the boundary operators needed for our model equations
for convection and diﬀusion. In the case of periodic boundary conditions, no special
boundary operators are required.
44 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
3.6.1 The Linear Convection Equation
Referring to Section 2.3, the boundary conditions for the linear convection equation
can be either periodic or of inﬂowoutﬂow type. In the latter case, a Dirichlet bound
ary condition is given at the inﬂow boundary, while no condition is speciﬁed at the
outﬂow boundary. Here we assume that the wave speed a is positive; thus the left
hand boundary is the inﬂow boundary, and the righthand boundary is the outﬂow
boundary. Thus the vector of unknowns is
u = [u
1
, u
2
, . . . , u
M
]
T
(3.63)
and u
0
is speciﬁed.
Consider ﬁrst the inﬂow boundary. It is clear that as long as the interior diﬀerence
approximation does not extend beyond u
j−1
, then no special treatment is required
for this boundary. For example, with secondorder centered diﬀerences we obtain
δ
x
u = Au +
bc
(3.64)
with
A =
1
2∆x
0 1
−1 0 1
−1 0 1
.
.
.
¸
¸
¸
¸
¸
¸
,
bc
=
1
2∆x
−u
0
0
0
.
.
.
¸
¸
¸
¸
¸
¸
(3.65)
However, if we use the fourthorder interior operator given in Eq. 3.54, then the
approximation at j = 1 requires a value of u
j−2
, which is outside the domain. Hence,
a diﬀerent operator is required at j = 1 which extends only to j − 1, while having
the appropriate order of accuracy. Such an operator, known as a numerical boundary
scheme, can have an order of accuracy which is one order lower than that of the
interior scheme, and the global accuracy will equal that of the interior scheme.
7
For
example, with fourthorder centered diﬀerences, we can use the following thirdorder
operator at j = 1:
(δ
x
u)
1
=
1
6∆x
(−2u
0
−3u
1
+ 6u
2
−u
3
) (3.66)
which is easily derived using a Taylor table. The resulting diﬀerence operator has the
form of Eq. 3.64 with
A =
1
12∆x
−6 12 −2
−8 0 8 −1
1 −8 0 8 −1
.
.
.
¸
¸
¸
¸
¸
¸
,
bc
=
1
12∆x
−4u
0
u
0
0
.
.
.
¸
¸
¸
¸
¸
¸
(3.67)
7
Proof of this theorem is beyond the scope of this book; the interested reader should consult the
literature for further details.
3.6. DIFFERENCE OPERATORS AT BOUNDARIES 45
This approximation is globally fourthorder accurate.
At the outﬂow boundary, no boundary condition is speciﬁed. We must approx
imate ∂u/∂x at node M with no information about u
M+1
. Thus the secondorder
centereddiﬀerence operator, which requires u
j+1
, cannot be used at j = M. A
backwarddiﬀerence formula must be used. With a secondorder interior operator,
the following ﬁrstorder backward formula can be used:
(δ
x
u)
M
=
1
∆x
(u
M
−u
M−1
) (3.68)
This produces a diﬀerence operator with
A =
1
2∆x
0 1
−1 0 1
−1 0 1
.
.
.
−1 0 1
−2 2
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
,
bc
=
1
2∆x
−u
0
0
0
.
.
.
0
¸
¸
¸
¸
¸
¸
¸
¸
¸
(3.69)
In the case of a fourthorder centered interior operator the last two rows of A require
modiﬁcation.
Another approach to the development of boundary schemes is in terms of space
extrapolation. The following formula allows u
M+1
to be extrapolated from the interior
data to arbitrary order on an equispaced grid:
(1 −E
−1
)
p
u
M+1
= 0 (3.70)
where E is the shift operator deﬁned by Eu
j
= u
j+1
and the order of the approxima
tion is p −1. For example, with p = 2 we obtain
(1 −2E
−1
+E
−2
)u
M+1
= u
M+1
−2u
M
+u
M−1
= 0 (3.71)
which gives the following ﬁrstorder approximation to u
M+1
:
u
M+1
= 2u
M
−u
M−1
(3.72)
Substituting this into the secondorder centereddiﬀerence operator applied at node
M gives
(δ
x
u)
M
=
1
2∆x
(u
M+1
−u
M−1
) =
1
2∆x
(2u
M
−u
M−1
−u
M−1
)
=
1
∆x
(u
M
−u
M−1
) (3.73)
which is identical to Eq. 3.68.
46 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
3.6.2 The Diﬀusion Equation
In solving the diﬀusion equation, we must consider Dirichlet and Neumann boundary
conditions. The treatment of a Dirichlet boundary condition proceeds along the same
lines as the inﬂow boundary for the convection equation discussed above. With the
secondorder centered interior operator
(δ
xx
u)
j
=
1
∆x
2
(u
j+1
−2u
j
+u
j−1
) (3.74)
no modiﬁcations are required near boundaries, leading to the diﬀerence operator given
in Eq. 3.27.
For a Neumann boundary condition, we assume that ∂u/∂x is speciﬁed at j =
M + 1, that is
∂u
∂x
M+1
=
∂u
∂x
b
(3.75)
Thus we design an operator at node M which is in the following form:
(δ
xx
u)
M
=
1
∆x
2
(au
M−1
+bu
M
) +
c
∆x
∂u
∂x
M+1
(3.76)
where a, b, and c are constants which can easily be determined using a Taylor table,
as shown in Table 3.4.
∂
2
u
∂x
2
j
−
1
∆x
2
(au
j−1
+bu
j
) +
c
∆x
∂u
∂x
j+1
¸
¸
= ?
∆x· ∆x
2
· ∆x
3
· ∆x
4
·
u
j
∂u
∂x
j
∂
2
u
∂x
2
j
∂
3
u
∂x
3
j
∂
4
u
∂x
4
j
∆x
2
·
∂
2
u
∂x
2
j
1
−a · u
j−1
−a −a · (1) ·
1
1
−a · (1)
2
·
1
2
−a · (1)
3
·
1
6
−a · (1)
4
·
1
24
−b · u
j
−b
−∆x · c ·
∂u
∂x
j+1
−c −c · (1) ·
1
1
−c · (1)
2
·
1
2
−c · (1)
3
·
1
6
Table 3.4. Taylor table for Neumann boundary condition.
Solving for a, b, and c, we obtain the following operator:
(δ
xx
u)
M
=
1
3∆x
2
(2u
M−1
−2u
M
) +
2
3∆x
∂u
∂x
M+1
(3.77)
3.7. PROBLEMS 47
which produces the diﬀerence operator given in Eq. 3.29. Notice that this operator
is secondorder accurate. In the case of a numerical approximation to a Neumann
boundary condition, this is necessary to obtain a globally secondorder accurate for
mulation. This contrasts with the numerical boundary schemes described previously
which can be one order lower than the interior scheme.
We can also obtain the operator in Eq. 3.77 using the space extrapolation idea.
Consider a secondorder backwarddiﬀerence approximation applied at node M + 1:
∂u
∂x
M+1
=
1
2∆x
(u
M−1
−4u
M
+ 3u
M+1
) +O(∆x
2
) (3.78)
Solving for u
M+1
gives
u
M+1
=
1
3
¸
4u
M
−u
M−1
+ 2∆x
∂u
∂x
M+1
¸
+O(∆x
3
) (3.79)
Substituting this into the secondorder centered diﬀerence operator for a second
derivative applied at node M gives
(δ
xx
u)
M
=
1
∆x
2
(u
M+1
−2u
M
+u
M−1
) (3.80)
=
1
3∆x
2
¸
3u
M−1
−6u
M
+ 4u
M
−u
M−1
+ 2∆x
∂u
∂x
M+1
¸
=
1
3∆x
2
(2u
M−1
−2u
M
) +
2
3∆x
∂u
∂x
M+1
(3.81)
which is identical to Eq. 3.77.
3.7 Problems
1. Derive a thirdorder ﬁnitediﬀerence approximation to a ﬁrst derivative in the
form
(δ
x
u)
j
=
1
∆x
(au
j−2
+bu
j−1
+ cu
j
+du
j+1
)
Find the leading error term.
2. Derive a ﬁnitediﬀerence approximation to a ﬁrst derivative in the form
a(δ
x
u)
j−1
+ (δ
x
u)
j
=
1
∆x
(bu
j−1
+cu
j
+du
j+1
)
Find the leading error term.
48 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
3. Using a 4 (interior) point mesh, write out the 4×4 matrices and the boundary
condition vector formed by using the scheme derived in question 2 when both
u and ∂u/∂x are given at j = 0 and u is given at j = 5.
4. Repeat question 2 with d = 0.
5. Derive a ﬁnitediﬀerence approximation to a third derivative in the form
(δ
xxx
u)
j
=
1
∆x
3
(au
j−2
+bu
j−1
+cu
j
+du
j+1
+eu
j+2
)
Find the leading error term.
6. Derive a compact (or Pad´e) ﬁnitediﬀerence approximation to a second deriva
tive in the form
d(δ
xx
u)
j−1
+ (δ
xx
u)
j
+e(δ
xx
u)
j+1
=
1
∆x
2
(au
j−1
+bu
j
+cu
j+1
)
Find the leading error term.
7. Find the modiﬁed wavenumber for the operator derived in question 1. Plot the
real and imaginary parts of κ
∗
∆x vs. κ∆x for 0 ≤ κ∆x ≤ π. Compare the real
part with that obtained from the fourthorder centered operator (Eq. 3.54).
8. Application of the secondderivative operator to the function e
iκx
gives
∂
2
e
iκx
∂x
2
= −κ
2
e
iκx
Application of a diﬀerence operator for the second derivative gives
(δ
xx
e
iκj∆x
)
j
= −κ
∗2
e
iκx
thus deﬁning the modiﬁed wavenumber κ
∗
for a second derivative approxima
tion. Find the modiﬁed wavenumber for the secondorder centered diﬀerence op
erator for a second derivative, the noncompact fourthorder operator (Eq. 3.55),
and the compact fourthorder operator derived in question 6. Plot (κ
∗
∆x)
2
vs.
(κ∆x)
2
for 0 ≤ κ∆x ≤ π.
9. Find the gridpointsperwavelength (PPW) requirement to achieve a phase
speed error less than 0.1 percent for sixthorder noncompact and compact cen
tered approximations to a ﬁrst derivative.
3.7. PROBLEMS 49
10. Consider the following onesided diﬀerencing schemes, which are ﬁrst, second,
and thirdorder, respectively:
(δ
x
u)
j
= (u
j
−u
j−1
) /∆x
(δ
x
u)
j
= (3u
j
−4u
j−1
+u
j−2
) /(2∆x)
(δ
x
u)
j
= (11u
j
−18u
j−1
+ 9u
j−2
−2u
j−3
) /(6∆x)
Find the modiﬁed wavenumber for each of these schemes. Plot the real and
imaginary parts of κ
∗
∆x vs. κ∆x for 0 ≤ κ∆x ≤ π. Derive the two leading
terms in the truncation error for each scheme.
50 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS
Chapter 4
THE SEMIDISCRETE
APPROACH
One strategy for obtaining ﬁnitediﬀerence approximations to a PDE is to start by
diﬀerencing the space derivatives only, without approximating the time derivative.
In the following chapters, we proceed with an analysis making considerable use of
this concept, which we refer to as the semidiscrete approach. Diﬀerencing the space
derivatives converts the basic PDE into a set of coupled ODE’s. In the most general
notation, these ODE’s would be expressed in the form
du
dt
=
F(u, t) (4.1)
which includes all manner of nonlinear and timedependent possibilities. On occasion,
we use this form, but the rest of this chapter is devoted to a more specialized matrix
notation described below.
Another strategy for constructing a ﬁnitediﬀerence approximation to a PDE is
to approximate all the partial derivatives at once. This generally leads to a point
diﬀerence operator (see Section 3.3.1) which, in turn, can be used for the time advance
of the solution at any given point in the mesh. As an example let us consider the
model equation for diﬀusion
∂u
∂t
= ν
∂
2
u
∂x
2
Using threepoint centraldiﬀerencing schemes for both the time and space derivatives,
we ﬁnd
u
(n+1)
j
−u
(n−1)
j
2h
= ν
u
(n)
j+1
−2u
(n)
j
+u
(n)
j−1
∆x
2
¸
¸
or
u
(n+1)
j
= u
(n−1)
j
+
2hν
∆x
2
u
(n)
j+1
−2u
(n)
j
+u
(n)
j−1
(4.2)
51
52 CHAPTER 4. THE SEMIDISCRETE APPROACH
Clearly Eq. 4.2 is a diﬀerence equation which can be used at the space point j to
advance the value of u from the previous time levels n and n − 1 to the level n + 1.
It is a full discretization of the PDE. Note, however, that the spatial and temporal
discretizations are separable. Thus, this method has an intermediate semidiscrete
form and can be analyzed by the methods discussed in the next few chapters.
Another possibility is to replace the value of u
(n)
j
in the right hand side of Eq. 4.2
with the time average of u at that point, namely (u
(n+1)
j
+ u
(n−1)
j
)/2. This results in
the formula
u
(n+1)
j
= u
(n−1)
j
+
2hν
∆x
2
u
(n)
j+1
−2
¸
u
(n+1)
j
+u
(n−1)
j
2
¸
+u
(n)
j−1
¸
¸
(4.3)
which can be solved for u
(n+1)
and time advanced at the point j. In this case, the
spatial and temporal discretizations are not separable, and no semidiscrete form
exists.
Equation 4.2 is sometimes called Richardson’s method of overlapping steps and
Eq. 4.3 is referred to as the DuFortFrankel method. As we shall see later on, there
are subtle points to be made about using these methods to ﬁnd a numerical solution
to the diﬀusion equation. There are a number of issues concerning the accuracy,
stability, and convergence of Eqs. 4.2 and 4.3 which we cannot comment on until we
develop a framework for such investigations. We introduce these methods here only to
distinguish between methods in which the temporal and spatial terms are discretized
separately and those for which no such separation is possible. For the time being, we
shall separate the space diﬀerence approximations from the time diﬀerencing. In this
approach, we reduce the governing PDE’s to ODE’s by discretizing the spatial terms
and use the welldeveloped theory of ODE solutions to aid us in the development of
an analysis of accuracy and stability.
4.1 Reduction of PDE’s to ODE’s
4.1.1 The Model ODE’s
First let us consider the model PDE’s for diﬀusion and biconvection described in
Chapter 2. In these simple cases, we can approximate the space derivatives with
diﬀerence operators and express the resulting ODE’s with a matrix formulation. This
is a simple and natural formulation when the ODE’s are linear.
4.1. REDUCTION OF PDE’S TO ODE’S 53
Model ODE for Diﬀusion
For example, using the 3point centraldiﬀerencing scheme to represent the second
derivative in the scalar PDE governing diﬀusion leads to the following ODE diﬀusion
model
d
u
dt
=
ν
∆x
2
B(1, −2, 1)
u +
(bc) (4.4)
with Dirichlet boundary conditions folded into the
(bc) vector.
Model ODE for Biconvection
The term biconvection was introduced in Section 2.3. It is used for the scalar con
vection model when the boundary conditions are periodic. In this case, the 3point
centraldiﬀerencing approximation produces the ODE model given by
d
u
dt
= −
a
2∆x
B
p
(−1, 0, 1)
u (4.5)
where the boundary condition vector is absent because the ﬂow is periodic.
Eqs. 4.4 and 4.5 are the model ODE’s for diﬀusion and biconvection of a scalar in
one dimension. They are linear with coeﬃcient matrices which are independent of x
and t.
4.1.2 The Generic Matrix Form
The generic matrix form of a semidiscrete approximation is expressed by the equation
d
u
dt
= A
u −
f(t) (4.6)
Note that the elements in the matrix A depend upon both the PDE and the type of
diﬀerencing scheme chosen for the space terms. The vector
f(t) is usually determined
by the boundary conditions and possibly source terms. In general, even the Euler
and NavierStokes equations can be expressed in the form of Eq. 4.6. In such cases
the equations are nonlinear, that is, the elements of A depend on the solution u and
are usually derived by ﬁnding the Jacobian of a ﬂux vector. Although the equations
are nonlinear, the linear analysis presented in this book leads to diagnostics that are
surprisingly accurate when used to evaluate many aspects of numerical methods as
they apply to the Euler and NavierStokes equations.
54 CHAPTER 4. THE SEMIDISCRETE APPROACH
4.2 Exact Solutions of Linear ODE’s
In order to advance Eq. 4.1 in time, the system of ODE’s must be integrated using a
timemarching method. In order to analyze timemarching methods, we will make use
of exact solutions of coupled systems of ODE’s, which exist under certain conditions.
The ODE’s represented by Eq. 4.1 are said to be linear if F is linearly dependent on
u (i.e., if ∂F/∂u = A where A is independent of u). As we have already pointed out,
when the ODE’s are linear they can be expressed in a matrix notation as Eq. 4.6 in
which the coeﬃcient matrix, A, is independent of u. If A does depend explicitly on
t, the general solution cannot be written; whereas, if A does not depend explicitly on
t, the general solution to Eq. 4.6 can be written. This holds regardless of whether or
not the forcing function,
f, depends explicitly on t.
As we shall soon see, the exact solution of Eq. 4.6 can be written in terms of
the eigenvalues and eigenvectors of A. This will lead us to a representative scalar
equation for use in analyzing timemarching methods. These ideas are developed in
the following sections.
4.2.1 Eigensystems of SemiDiscrete Linear Forms
Complete Systems
An M × M matrix is represented by a complete eigensystem if it has a complete set
of linearly independent eigenvectors (see Appendix A) . An eigenvector,
x
m
, and its
corresponding eigenvalue, λ
m
, have the property that
A
x
m
= λ
m
x
m
or
[A−λ
m
I]
x
m
= 0 (4.7)
The eigenvalues are the roots of the equation
det[A−λI] = 0
We form the righthand eigenvector matrix of a complete system by ﬁlling its columns
with the eigenvectors
x
m
:
X =
x
1
,
x
2
. . . ,
x
M
The inverse is the lefthand eigenvector matrix, and together they have the property
that
X
−1
AX = Λ (4.8)
where Λ is a diagonal matrix whose elements are the eigenvalues of A.
4.2. EXACT SOLUTIONS OF LINEAR ODE’S 55
Defective Systems
If an M×M matrix does not have a complete set of linearly independent eigenvectors,
it cannot be transformed to a diagonal matrix of scalars, and it is said to be defective.
It can, however, be transformed to a diagonal set of blocks, some of which may be
scalars (see Appendix A). In general, there exists some S which transforms any
matrix A such that
S
−1
AS = J
where
J =
J
1
J
2
.
.
.
J
m
.
.
.
¸
¸
¸
¸
¸
¸
¸
¸
and
J
(n)
m
=
λ
m
1
λ
m
.
.
.
.
.
. 1
λ
m
¸
¸
¸
¸
¸
¸
1
.
.
.
.
.
.
n
The matrix J is said to be in Jordan canonical form, and an eigenvalue with multi
plicity n within a Jordan block is said to be a defective eigenvalue. Defective systems
play a role in numerical stability analysis.
4.2.2 Single ODE’s of First and SecondOrder
FirstOrder Equations
The simplest nonhomogeneous ODE of interest is given by the single, ﬁrstorder
equation
du
dt
= λu +ae
µt
(4.9)
where λ, a, and µ are scalars, all of which can be complex numbers. The equation
is linear because λ does not depend on u, and has a general solution because λ does
not depend on t. It has a steadystate solution if the righthand side is independent
of t, i.e., if µ = 0, and is homogeneous if the forcing function is zero, i.e., if a = 0.
Although it is quite simple, the numerical analysis of Eq. 4.9 displays many of the
fundamental properties and issues involved in the construction and study of most
popular timemarching methods. This theme will be developed as we proceed.
56 CHAPTER 4. THE SEMIDISCRETE APPROACH
The exact solution of Eq. 4.9 is, for µ = λ,
u(t) = c
1
e
λt
+
ae
µt
µ −λ
where c
1
is a constant determined by the initial conditions. In terms of the initial
value of u, it can be written
u(t) = u(0)e
λt
+a
e
µt
−e
λt
µ −λ
The interesting question can arise: What happens to the solution of Eq. 4.9 when
µ = λ? This is easily found by setting µ = λ + , solving, and then taking the limit
as →0. Using this limiting device, we ﬁnd that the solution to
du
dt
= λu +ae
λt
(4.10)
is given by
u(t) = [u(0) +at]e
λt
As we shall soon see, this solution is required for the analysis of defective systems.
SecondOrder Equations
The homogeneous form of a secondorder equation is given by
d
2
u
dt
2
+a
1
du
dt
+a
0
u = 0 (4.11)
where a
1
and a
0
are complex constants. Now we introduce the diﬀerential operator
D such that
D ≡
d
dt
and factor u(t) out of Eq. 4.11, giving
(D
2
+a
1
D +a
0
) u(t) = 0
The polynomial in D is referred to as a characteristic polynomial and designated
P(D). Characteristic polynomials are fundamental to the analysis of both ODE’s and
O∆E’s, since the roots of these polynomials determine the solutions of the equations.
For ODE’s, we often label these roots in order of increasing modulus as λ
1
, λ
2
, · · ·, λ
m
,
4.2. EXACT SOLUTIONS OF LINEAR ODE’S 57
· · ·, λ
M
. They are found by solving the equation P(λ) = 0. In our simple example,
there would be two roots, λ
1
and λ
2
, determined from
P(λ) = λ
2
+a
1
λ +a
0
= 0 (4.12)
and the solution to Eq. 4.11 is given by
u(t) = c
1
e
λ
1
t
+c
2
e
λ
2
t
(4.13)
where c
1
and c
2
are constants determined from initial conditions. The proof of this
is simple and is found by substituting Eq. 4.13 into Eq. 4.11. One ﬁnds the result
c
1
e
λ
1
t
(λ
2
1
+a
1
λ
1
+a
0
) +c
2
e
λ
2
t
(λ
2
2
+a
1
λ
2
+a
0
)
which is identically zero for all c
1
, c
2
, and t if and only if the λ’s satisfy Eq. 4.12.
4.2.3 Coupled FirstOrder ODE’s
A Complete System
A set of coupled, ﬁrstorder, homogeneous equations is given by
u
1
= a
11
u
1
+ a
12
u
2
u
2
= a
21
u
1
+ a
22
u
2
(4.14)
which can be written
u
= A
u ,
u = [u
1
, u
2
]
T
, A = (a
ij
) =
¸
a
11
a
12
a
21
a
22
¸
Consider the possibility that a solution is represented by
u
1
= c
1
x
11
e
λ
1
t
+ c
2
x
12
e
λ
2
t
u
2
= c
1
x
21
e
λ
1
t
+ c
2
x
22
e
λ
2
t
(4.15)
By substitution, these are indeed solutions to Eq. 4.14 if and only if
¸
a
11
a
12
a
21
a
22
¸ ¸
x
11
x
21
¸
= λ
1
¸
x
11
x
21
¸
,
¸
a
11
a
12
a
21
a
22
¸ ¸
x
12
x
22
¸
= λ
2
¸
x
12
x
22
¸
(4.16)
Notice that a higherorder equation can be reduced to a coupled set of ﬁrstorder
equations by introducing a new set of dependent variables. Thus, by setting
u
1
= u
, u
2
= u ,
we ﬁnd Eq. 4.11 can be written
u
1
= −a
1
u
1
−a
0
u
2
u
2
= u
1
(4.17)
which is a subset of Eq. 4.14.
58 CHAPTER 4. THE SEMIDISCRETE APPROACH
A Derogatory System
Eq. 4.15 is still a solution to Eq. 4.14 if λ
1
= λ
2
= λ, provided two linearly independent
vectors exist to satisfy Eq. 4.16 with A = Λ. In this case
¸
λ 0
0 λ
¸
1
0
= λ
¸
1
0
and
¸
λ 0
0 λ
¸
0
1
= λ
¸
0
1
provide such a solution. This is the case where A has a complete set of eigenvectors
and is not defective.
A Defective System
If A is defective, then it can be represented by the Jordan canonical form
¸
u
1
u
2
¸
=
¸
λ 0
1 λ
¸ ¸
u
1
u
2
¸
(4.18)
whose solution is not obvious. However, in this case, one can solve the top equation
ﬁrst, giving u
1
(t) = u
1
(0)e
λt
. Then, substituting this result into the second equation,
one ﬁnds
du
2
dt
= λu
2
+u
1
(0)e
λt
which is identical in form to Eq. 4.10 and has the solution
u
2
(t) = [u
2
(0) +u
1
(0)t]e
λt
From this the reader should be able to verify that
u
3
(t) =
a +bt +ct
2
e
λt
is a solution to
u
1
u
2
u
3
¸
¸
¸ =
λ
1 λ
1 λ
¸
¸
¸
u
1
u
2
u
3
¸
¸
¸
if
a = u
3
(0) , b = u
2
(0) , c =
1
2
u
1
(0) (4.19)
The general solution to such defective systems is left as an exercise.
4.2. EXACT SOLUTIONS OF LINEAR ODE’S 59
4.2.4 General Solution of Coupled ODE’s with Complete Eigen
systems
Let us consider a set of coupled, nonhomogeneous, linear, ﬁrstorder ODE’s with
constant coeﬃcients which might have been derived by space diﬀerencing a set of
PDE’s. Represent them by the equation
d
u
dt
= A
u −
f(t) (4.20)
Our assumption is that the M × M matrix A has a complete eigensystem
1
and can
be transformed by the left and right eigenvector matrices, X
−1
and X, to a diagonal
matrix Λ having diagonal elements which are the eigenvalues of A, see Section 4.2.1.
Now let us multiply Eq. 4.20 from the left by X
−1
and insert the identity combination
XX
−1
= I between A and
u. There results
X
−1
d
u
dt
= X
−1
AX · X
−1
u −X
−1
f(t) (4.21)
Since A is independent of both u and t, the elements in X
−1
and X are also indepen
dent of both u and t, and Eq. 4.21 can be modiﬁed to
d
dt
X
−1
u = ΛX
−1
u −X
−1
f(t)
Finally, by introducing the new variables
w and
g such that
w = X
−1
u ,
g(t) = X
−1
f(t) (4.22)
we reduce Eq. 4.20 to a new algebraic form
d
w
dt
= Λ
w −
g(t) (4.23)
It is important at this point to review the results of the previous paragraph. Notice
that Eqs. 4.20 and 4.23 are expressing exactly the same equality. The only diﬀerence
between them was brought about by algebraic manipulations which regrouped the
variables. However, this regrouping is crucial for the solution process because Eqs.
1
In the following, we exclude defective systems, not because they cannot be analyzed (the example
at the conclusion of the previous section proves otherwise), but because they are only of limited
interest in the general development of our theory.
60 CHAPTER 4. THE SEMIDISCRETE APPROACH
4.23 are no longer coupled. They can be written line by line as a set of independent,
single, ﬁrstorder equations, thus
w
1
= λ
1
w
1
−g
1
(t)
.
.
.
w
m
= λ
m
w
m
−g
m
(t)
.
.
.
w
M
= λ
M
w
M
−g
M
(t) (4.24)
For any given set of g
m
(t) each of these equations can be solved separately and then
recoupled, using the inverse of the relations given in Eqs. 4.22:
u(t) = X
w(t)
=
M
¸
m=1
w
m
(t)
x
m
(4.25)
where x
m
is the m’th column of X, i.e., the eigenvector corresponding to λ
m
.
We next focus on the very important subset of Eq. 4.20 when neither A nor
f has
any explicit dependence on t. In such a case, the g
m
in Eqs. 4.23 and 4.24 are also
time invariant and the solution to any line in Eq. 4.24 is
w
m
(t) = c
m
e
λmt
+
1
λ
m
g
m
where the c
m
are constants that depend on the initial conditions. Transforming back
to the usystem gives
u(t) = X
w(t)
=
M
¸
m=1
w
m
(t)
x
m
=
M
¸
m=1
c
m
e
λmt
x
m
+
M
¸
m=1
1
λ
m
g
m
x
m
=
M
¸
m=1
c
m
e
λmt
x
m
+XΛ
−1
X
−1
f
=
M
¸
m=1
c
m
e
λmt
x
m
+A
−1
f
. .. .
Transient
. .. .
Steadystate
(4.26)
4.3. REAL SPACE AND EIGENSPACE 61
Note that the steadystate solution is A
−1
f, as might be expected.
The ﬁrst group of terms on the right side of this equation is referred to classically
as the complementary solution or the solution of the homogeneous equations. The
second group is referred to classically as the particular solution or the particular
integral. In our application to ﬂuid dynamics, it is more instructive to refer to these
groups as the transient and steadystate solutions, respectively. An alternative, but
entirely equivalent, form of the solution is
u(t) = c
1
e
λ
1
t
x
1
+· · · +c
m
e
λmt
x
m
+· · · +c
M
e
λ
M
t
x
M
+ A
−1
f (4.27)
4.3 Real Space and Eigenspace
4.3.1 Deﬁnition
Following the semidiscrete approach discussed in Section 4.1, we reduce the partial
diﬀerential equations to a set of ordinary diﬀerential equations represented by the
generic form
d
u
dt
= A
u −
f (4.28)
The dependent variable u represents some physical quantity or quantities which relate
to the problem of interest. For the model problems on which we are focusing most
of our attention, the elements of A are independent of both u and t. This permits
us to say a great deal about such problems and serves as the basis for this section.
In particular, we can develop some very important and fundamental concepts that
underly the global properties of the numerical solutions to the model problems. How
these relate to the numerical solutions of more practical problems depends upon the
problem and, to a much greater extent, on the cleverness of the relator.
We begin by developing the concept of “spaces”. That is, we identify diﬀerent
mathematical reference frames (spaces) and view our solutions from within each.
In this way, we get diﬀerent perspectives of the same solution, and this can add
signiﬁcantly to our understanding.
The most natural reference frame is the physical one. We say
If a solution is expressed in terms of
u, it is said
to be in real space.
There is, however, another very useful frame. We saw in Sections 4.2.1 and 4.2 that
pre and postmultiplication of A by the appropriate similarity matrices transforms A
into a diagonal matrix, composed, in the most general case, of Jordan blocks or, in the
62 CHAPTER 4. THE SEMIDISCRETE APPROACH
simplest nondefective case, of scalars. Following Section 4.2 and, for simplicity, using
only complete systems for our examples, we found that Eq. 4.28 had the alternative
form
d
w
dt
= Λ
w −
g
which is an uncoupled set of ﬁrstorder ODE’s that can be solved independently for
the dependent variable vector
w. We say
If a solution is expressed in terms of
w, it is said to
be in eigenspace (often referred to as wave space).
The relations that transfer from one space to the other are:
w = X
−1
u
g = X
−1
f
u = X
w
f = X
g
The elements of
u relate directly to the local physics of the problem. However, the
elements of
w are linear combinations of all of the elements of
u, and individually
they have no direct local physical interpretation.
When the forcing function
f is independent of t, the solutions in the two spaces
are represented by
u(t) =
¸
c
m
e
λmt
x
m
+XΛ
−1
X
−1
f
and
w
m
(t) = c
m
e
λmt
+
1
λ
m
g
m
; m = 1, 2, · · · , M
for real space and eigenspace, respectively. At this point we make two observations:
1. the transient portion of the solution in real space consists of a linear combination
of contributions from each eigenvector, and
2. the transient portion of the solution in eigenspace provides the weighting of
each eigenvector component of the transient solution in real space.
4.3.2 Eigenvalue Spectrums for Model ODE’s
It is instructive to consider the eigenvalue spectrums of the ODE’s formulated by
central diﬀerencing the model equations for diﬀusion and biconvection. These model
ODE’s are presented in Section 4.1. Equations for the eigenvalues of the simple
4.3. REAL SPACE AND EIGENSPACE 63
tridiagonals, B(M : a, b, c) and B
p
(M : a, b, c), are given in Appendix B. From Section
B.1, we ﬁnd for the model diﬀusion equation with Dirichlet boundary conditions
λ
m
=
ν
∆x
2
¸
−2 + 2 cos
mπ
M + 1
=
−4ν
∆x
2
sin
2
mπ
2(M + 1)
; m = 1, 2, · · · , M (4.29)
and, from Section B.4, for the model biconvection equation
λ
m
=
−ia
∆x
sin
2mπ
M
m = 0, 1, · · · , M −1
= −iκ
∗
m
a m = 0, 1, · · · , M −1 (4.30)
where
κ
∗
m
=
sinκ
m
∆x
∆x
m = 0, 1, · · · , M −1 (4.31)
is the modiﬁed wavenumber from Section 3.5, κ
m
= m, and ∆x = 2π/M. Notice
that the diﬀusion eigenvalues are real and negative while those representing periodic
convection are all pure imaginary. The interpretation of this result plays a very
important role later in our stability analysis.
4.3.3 Eigenvectors of the Model Equations
Next we consider the eigenvectors of the two model equations. These follow as special
cases from the results given in Appendix B.
The Diﬀusion Model
Consider Eq. 4.4, the model ODE’s for diﬀusion. First, to help visualize the matrix
structure, we present results for a simple 4point mesh and then we give the general
case. The righthand eigenvector matrix X is given by
sin (x
1
) sin (2x
1
) sin (3x
1
) sin (4x
1
)
sin (x
2
) sin (2x
2
) sin (3x
2
) sin (4x
2
)
sin (x
3
) sin (2x
3
) sin (3x
3
) sin (4x
3
)
sin (x
4
) sin (2x
4
) sin (3x
4
) sin (4x
4
)
¸
¸
¸
¸
¸
The columns of the matrix are proportional to the eigenvectors. Recall that x
j
=
j∆x = jπ/(M + 1), so in general the relation u = X w can be written as
u
j
=
M
¸
m=1
w
m
sinmx
j
; j = 1, 2, · · · , M (4.32)
64 CHAPTER 4. THE SEMIDISCRETE APPROACH
For the inverse, or lefthand eigenvector matrix X
−1
, we ﬁnd
sin (x
1
) sin (x
2
) sin (x
3
) sin (x
4
)
sin (2x
1
) sin (2x
2
) sin (2x
3
) sin (2x
4
)
sin (3x
1
) sin (3x
2
) sin (3x
3
) sin (3x
4
)
sin (4x
1
) sin (4x
2
) sin (4x
3
) sin (4x
4
)
¸
¸
¸
¸
¸
The rows of the matrix are proportional to the eigenvectors. In general w = X
−1
u
gives
w
m
=
M
¸
j=1
u
j
sinmx
j
; m = 1, 2, · · · , M (4.33)
In the ﬁeld of harmonic analysis, Eq. 4.33 represents a sine transform of the func
tion u(x) for an Mpoint sample between the boundaries x = 0 and x = π with the
condition u(0) = u(π) = 0. Similarly, Eq. 4.32 represents the sine synthesis that
companions the sine transform given by Eq. 4.33. In summary,
For the model diﬀusion equation:
w = X
−1
u is a sine transform from real space to (sine) wave
space.
u = X
w is a sine synthesis from wave space back to real
space.
The Biconvection Model
Next consider the model ODE’s for periodic convection, Eq. 4.5. The coeﬃcient
matrices for these ODE’s are always circulant. For our model ODE, the righthand
eigenvectors are given by
x
m
= e
i j (2πm/M)
,
j = 0, 1, · · · , M −1
m = 0, 1, · · · , M −1
With x
j
= j · ∆x = j · 2π/M, we can write u = X w as
u
j
=
M−1
¸
m=0
w
m
e
imx
j
; j = 0, 1, · · · , M −1 (4.34)
4.3. REAL SPACE AND EIGENSPACE 65
For a 4point periodic mesh, we ﬁnd the following lefthand eigenvector matrix
from Appendix B.4:
w
1
w
2
w
3
w
4
¸
¸
¸
¸
¸
=
1
4
1 1 1 1
1 e
−2iπ/4
e
−4iπ/4
e
−6iπ/4
1 e
−4iπ/4
e
−8iπ/4
e
−12iπ/4
1 e
−6iπ/4
e
−12iπ/4
e
−18iπ/4
¸
¸
¸
¸
¸
u
1
u
2
u
3
u
4
¸
¸
¸
¸
¸
= X
−1
u
In general
w
m
=
1
M
M−1
¸
j=0
u
j
e
−imx
j
; m = 0, 1, · · · , M −1
This equation is identical to a discrete Fourier transform of the periodic dependent
variable
u using an Mpoint sample between and including x = 0 and x = 2π −∆x.
For circulant matrices, it is straightforward to establish the fact that the relation
u = X
w represents the Fourier synthesis of the variable
w back to
u. In summary,
For any circulant system:
w = X
−1
u is a complex Fourier transform from real space to
wave space.
u = X
w is a complex Fourier synthesis from wave space
back to real space.
4.3.4 Solutions of the Model ODE’s
We can now combine the results of the previous sections to write the solutions of our
model ODE’s.
The Diﬀusion Equation
For the diﬀusion equation, Eq. 4.27 becomes
u
j
(t) =
M
¸
m=1
c
m
e
λmt
sin mx
j
+ (A
−1
f)
j
, j = 1, 2, · · · , M (4.35)
where
λ
m
=
−4ν
∆x
2
sin
2
mπ
2(M + 1)
(4.36)
66 CHAPTER 4. THE SEMIDISCRETE APPROACH
With the modiﬁed wavenumber deﬁned as
κ
∗
m
=
2
∆x
sin
κ
m
∆x
2
(4.37)
and using κ
m
= m, we can write the ODE solution as
u
j
(t) =
M
¸
m=1
c
m
e
−νκ
∗
m
2
t
sinκ
m
x
j
+ (A
−1
f)
j
, j = 1, 2, · · · , M (4.38)
This can be compared with the exact solution to the PDE, Eq. 2.37, evaluated at the
nodes of the grid:
u
j
(t) =
M
¸
m=1
c
m
e
−νκm
2
t
sinκ
m
x
j
+h(x
j
), j = 1, 2, · · · , M (4.39)
We see that the solutions are identical except for the steady solution and the
modiﬁed wavenumber in the transient term. The modiﬁed wavenumber is an approx
imation to the actual wavenumber. The diﬀerence between the modiﬁed wavenumber
and the actual wavenumber depends on the diﬀerencing scheme and the grid resolu
tion. This diﬀerence causes the various modes (or eigenvector components) to decay
at rates which diﬀer from the exact solution. With conventional diﬀerencing schemes,
low wavenumber modes are accurately represented, while high wavenumber modes (if
they have signiﬁcant amplitudes) can have large errors.
The Convection Equation
For the biconvection equation, we obtain
u
j
(t) =
M−1
¸
m=0
c
m
e
λmt
e
iκmx
j
, j = 0, 1, · · · , M −1 (4.40)
where
λ
m
= −iκ
∗
m
a (4.41)
with the modiﬁed wavenumber deﬁned in Eq. 4.31. We can write this ODE solution
as
u
j
(t) =
M−1
¸
m=0
c
m
e
−iκ
∗
m
at
e
iκmx
j
, j = 0, 1, · · · , M −1 (4.42)
4.4. THE REPRESENTATIVE EQUATION 67
and compare it to the exact solution of the PDE, Eq. 2.26, evaluated at the nodes of
the grid:
u
j
(t) =
M−1
¸
m=0
f
m
(0)e
−iκmat
e
iκmx
j
, j = 0, 1, · · · , M −1 (4.43)
Once again the diﬀerence appears through the modiﬁed wavenumber contained in
λ
m
. As discussed in Section 3.5, this leads to an error in the speed with which various
modes are convected, since κ
∗
is real. Since the error in the phase speed depends on
the wavenumber, while the actual phase speed is independent of the wavenumber,
the result is erroneous numerical dispersion. In the case of noncentered diﬀerencing,
discussed in Chapter 11, the modiﬁed wavenumber is complex. The form of Eq. 4.42
shows that the imaginary portion of the modiﬁed wavenumber produces nonphysical
decay or growth in the numerical solution.
4.4 The Representative Equation
In Section 4.3, we pointed out that Eqs. 4.20 and 4.23 express identical results but in
terms of diﬀerent groupings of the dependent variables, which are related by algebraic
manipulation. This leads to the following important concept:
The numerical solution to a set of linear ODE’s (in which A is
not a function of t) is entirely equivalent to the solution obtained
if the equations are transformed to eigenspace, solved there in
their uncoupled form, and then returned as a coupled set to real
space.
The importance of this concept resides in its message that we can analyze time
marching methods by applying them to a single, uncoupled equation and our con
clusions will apply in general. This is helpful both in analyzing the accuracy of
timemarching methods and in studying their stability, topics which are covered in
Chapters 6 and 7.
Our next objective is to ﬁnd a “typical” single ODE to analyze. We found the
uncoupled solution to a set of ODE’s in Section 4.2. A typical member of the family
is
dw
m
dt
= λ
m
w
m
−g
m
(t) (4.44)
The goal in our analysis is to study typical behavior of general situations, not partic
ular problems. For such a purpose Eq. 4.44 is not quite satisfactory. The role of λ
m
is
clear; it stands for some representative eigenvalue in the original A matrix. However,
68 CHAPTER 4. THE SEMIDISCRETE APPROACH
the question is: What should we use for g
m
(t) when the time dependence cannot be
ignored? To answer this question, we note that, in principle, one can express any one
of the forcing terms g
m
(t) as a ﬁnite Fourier series. For example
−g(t) =
¸
k
a
k
e
ikt
for which Eq. 4.44 has the exact solution:
w(t) = ce
λt
+
¸
k
a
k
e
ikt
ik −λ
From this we can extract the k’th term and replace ik with µ. This leads to
The Representative ODE
dw
dt
= λw +ae
µt
(4.45)
which can be used to evaluate all manner of timemarching methods. In such evalua
tions the parameters λ and µ must be allowed to take the worst possible combination
of values that might occur in the ODE eigensystem. The exact solution of the repre
sentative ODE is (for µ = λ):
w(t) = ce
λt
+
ae
µt
µ −λ
(4.46)
4.5 Problems
1. Consider the ﬁnitediﬀerence operator derived in question 1 of Chapter 3. Using
this operator to approximate the spatial derivative in the linear convection equa
tion, write the semidiscrete form obtained with periodic boundary conditions
on a 5point grid (M = 5).
2. Consider the application of the operator given in Eq. 3.52 to the 1D diﬀusion
equation with Dirichlet boundary conditions. Write the resulting semidiscrete
ODE form. Find the entries in the boundarycondition vector.
3. Write the semidiscrete form resulting from the application of secondorder cen
tered diﬀerences to the following equation on the domain 0 ≤ x ≤ 1 with
boundary conditions u(0) = 0, u(1) = 1:
∂u
∂t
=
∂
2
u
∂x
2
−6x
4.5. PROBLEMS 69
4. Consider a grid with 10 interior points spanning the domain 0 ≤ x ≤ π. For
initial conditions u(x, 0) = sin(mx) and boundary conditions u(0, t) = u(π, t) =
0, plot the exact solution of the diﬀusion equation with ν = 1 at t = 1 with
m = 1 and m = 3. (Plot the solution at the grid nodes only.) Calculate the
corresponding modiﬁed wavenumbers for the secondorder centered operator
from Eq. 4.37. Calculate and plot the corresponding ODE solutions.
5. Consider the matrix
A = −B
p
(10, −1, 0, 1)/(2∆x)
corresponding to the ODE form of the biconvection equation resulting from the
application of secondorder central diﬀerencing on a 10point grid. Note that
the domain is 0 ≤ x ≤ 2π and ∆x = 2π/10. The grid nodes are given by
x
j
= j∆x, j = 0, 1, . . . 9. The eigenvalues of the above matrix A, as well as the
matrices X and X
−1
, can be found from Appendix B.4. Using these, compute
and plot the ODE solution at t = 2π for the initial condition u(x, 0) = sinx.
Compare with the exact solution of the PDE. Calculate the numerical phase
speed from the modiﬁed wavenumber corresponding to this initial condition
and show that it is consistent with the ODE solution. Repeat for the initial
condition u(x, 0) = sin2x.
70 CHAPTER 4. THE SEMIDISCRETE APPROACH
Chapter 5
FINITEVOLUME METHODS
In Chapter 3, we saw how to derive ﬁnitediﬀerence approximations to arbitrary
derivatives. In Chapter 4, we saw that the application of a ﬁnitediﬀerence approxi
mation to the spatial derivatives in our model PDE’s produces a coupled set of ODE’s.
In this Chapter, we will show how similar semidiscrete forms can be derived using
ﬁnitevolume approximations in space. Finitevolume methods have become popu
lar in CFD as a result, primarily, of two advantages. First, they ensure that the
discretization is conservative, i.e., mass, momentum, and energy are conserved in a
discrete sense. While this property can usually be obtained using a ﬁnitediﬀerence
formulation, it is obtained naturally from a ﬁnitevolume formulation. Second, ﬁnite
volume methods do not require a coordinate transformation in order to be applied on
irregular meshes. As a result, they can be applied on unstructured meshes consisting
of arbitrary polyhedra in three dimensions or arbitrary polygons in two dimensions.
This increased ﬂexibility can be used to great advantage in generating grids about
arbitrary geometries.
Finitevolume methods are applied to the integral form of the governing equations,
either in the form of Eq. 2.1 or Eq. 2.2. Consistent with our emphasis on semidiscrete
methods, we will study the latter form, which is
d
dt
V (t)
QdV +
S(t)
n.FdS =
V (t)
PdV (5.1)
We will begin by presenting the basic concepts which apply to ﬁnitevolume strategies.
Next we will give our model equations in the form of Eq. 5.1. This will be followed
by several examples which hopefully make these concepts clear.
71
72 CHAPTER 5. FINITEVOLUME METHODS
5.1 Basic Concepts
The basic idea of a ﬁnitevolume method is to satisfy the integral form of the con
servation law to some degree of approximation for each of many contiguous control
volumes which cover the domain of interest. Thus the volume V in Eq. 5.1 is that of a
control volume whose shape is dependent on the nature of the grid. In our examples,
we will consider only control volumes which do not vary with time. Examining Eq.
5.1, we see that several approximations must be made. The ﬂux is required at the
boundary of the control volume, which is a closed surface in three dimensions and a
closed contour in two dimensions. This ﬂux must then be integrated to ﬁnd the net
ﬂux through the boundary. Similarly, the source term P must be integrated over the
control volume. Next a timemarching method
1
can be applied to ﬁnd the value of
V
QdV (5.2)
at the next time step.
Let us consider these approximations in more detail. First, we note that the
average value of Q in a cell with volume V is
¯
Q ≡
1
V
V
QdV (5.3)
and Eq. 5.1 can be written as
V
d
dt
¯
Q +
S
n.FdS =
V
PdV (5.4)
for a control volume which does not vary with time. Thus after applying a time
marching method, we have updated values of the cellaveraged quantities
¯
Q. In order
to evaluate the ﬂuxes, which are a function of Q, at the controlvolume boundary, Q
can be represented within the cell by some piecewise approximation which produces
the correct value of
¯
Q. This is a form of interpolation often referred to as recon
struction. As we shall see in our examples, each cell will have a diﬀerent piecewise
approximation to Q. When these are used to calculate F(Q), they will generally
produce diﬀerent approximations to the ﬂux at the boundary between two control
volumes, that is, the ﬂux will be discontinuous. A nondissipative scheme analogous
to centered diﬀerencing is obtained by taking the average of these two ﬂuxes. Another
approach known as ﬂuxdiﬀerence splitting is described in Chapter 11.
The basic elements of a ﬁnitevolume method are thus the following:
1
Timemarching methods will be discussed in the next chapter.
5.2. MODEL EQUATIONS IN INTEGRAL FORM 73
1. Given the value of
¯
Q for each control volume, construct an approximation to
Q(x, y, z) in each control volume. Using this approximation, ﬁnd Q at the
controlvolume boundary. Evaluate F(Q) at the boundary. Since there is a
distinct approximation to Q(x, y, z) in each control volume, two distinct values
of the ﬂux will generally be obtained at any point on the boundary between two
control volumes.
2. Apply some strategy for resolving the discontinuity in the ﬂux at the control
volume boundary to produce a single value of F(Q) at any point on the bound
ary. This issue is discussed in Section 11.4.2.
3. Integrate the ﬂux to ﬁnd the net ﬂux through the controlvolume boundary
using some sort of quadrature.
4. Advance the solution in time to obtain new values of
¯
Q.
The order of accuracy of the method is dependent on each of the approximations.
These ideas should be clariﬁed by the examples in the remainder of this chapter.
In order to include diﬀusive ﬂuxes, the following relation between ∇Q and Q is
sometimes used:
V
∇QdV =
S
nQdS (5.5)
or, in two dimensions,
A
∇QdA =
C
nQdl (5.6)
where the unit vector n points outward from the surface or contour.
5.2 Model Equations in Integral Form
5.2.1 The Linear Convection Equation
A twodimensional form of the linear convection equation can be written as
∂u
∂t
+a cos θ
∂u
∂x
+a sinθ
∂u
∂y
= 0 (5.7)
This PDE governs a simple plane wave convecting the scalar quantity, u(x, y, t) with
speed a along a straight line making an angle θ with respect to the xaxis. The
onedimensional form is recovered with θ = 0.
74 CHAPTER 5. FINITEVOLUME METHODS
For unit speed a, the twodimensional linear convection equation is obtained from
the general divergence form, Eq. 2.3, with
Q = u (5.8)
F = iu cos θ +ju sinθ (5.9)
P = 0 (5.10)
Since Q is a scalar, F is simply a vector. Substituting these expressions into a two
dimensional form of Eq. 2.2 gives the following integral form
d
dt
A
udA+
C
n.(iu cos θ +ju sinθ)ds = 0 (5.11)
where A is the area of the cell which is bounded by the closed contour C.
5.2.2 The Diﬀusion Equation
The integral form of the twodimensional diﬀusion equation with no source term and
unit diﬀusion coeﬃcient ν is obtained from the general divergence form, Eq. 2.3, with
Q = u (5.12)
F = −∇u (5.13)
= −
i
∂u
∂x
+j
∂u
∂y
(5.14)
P = 0 (5.15)
Using these, we ﬁnd
d
dt
A
udA =
C
n.
i
∂u
∂x
+j
∂u
∂y
ds (5.16)
to be the integral form of the twodimensional diﬀusion equation.
5.3 OneDimensional Examples
We restrict our attention to a scalar dependent variable u and a scalar ﬂux f, as in
the model equations. We consider an equispaced grid with spacing ∆x. The nodes of
the grid are located at x
j
= j∆x as usual. Control volume j extends from x
j
−∆x/2
to x
j
+ ∆x/2, as shown in Fig. 5.1. We will use the following notation:
x
j−1/2
= x
j
−∆x/2, x
j+1/2
= x
j
+ ∆x/2 (5.17)
5.3. ONEDIMENSIONAL EXAMPLES 75
j j+1 j1 j+2 j2
j1/2
j+1/2
∆
x
L R L R
Figure 5.1: Control volume in one dimension.
u
j±1/2
= u(x
j±1/2
), f
j±1/2
= f(u
j±1/2
) (5.18)
With these deﬁnitions, the cellaverage value becomes
¯ u
j
(t) ≡
1
∆x
x
j+1/2
x
j−1/2
u(x, t)dx (5.19)
and the integral form becomes
d
dt
(∆x¯ u
j
) +f
j+1/2
−f
j−1/2
=
x
j+1/2
x
j−1/2
Pdx (5.20)
Now with ξ = x −x
j
, we can expand u(x) in Eq. 5.19 in a Taylor series about x
j
(with t ﬁxed) to get
¯ u
j
≡
1
∆x
∆x/2
−∆x/2
u
j
+ξ
∂u
∂x
j
+
ξ
2
2
∂
2
u
∂x
2
j
+
ξ
3
6
∂
3
u
∂x
3
j
+. . .
¸
¸
dξ
= u
j
+
∆x
2
24
∂
2
u
∂x
2
j
+
∆x
4
1920
∂
4
u
∂x
4
j
+O(∆x
6
) (5.21)
or
¯ u
j
= u
j
+O(∆x
2
) (5.22)
where u
j
is the value at the center of the cell. Hence the cellaverage value and the
value at the center of the cell diﬀer by a term of second order.
5.3.1 A SecondOrder Approximation to the Convection Equa
tion
In one dimension, the integral form of the linear convection equation, Eq. 5.11, be
comes
∆x
d¯ u
j
dt
+f
j+1/2
−f
j−1/2
= 0 (5.23)
76 CHAPTER 5. FINITEVOLUME METHODS
with f = u. We choose a piecewise constant approximation to u(x) in each cell such
that
u(x) = ¯ u
j
x
j−1/2
≤ x ≤ x
j+1/2
(5.24)
Evaluating this at j + 1/2 gives
f
L
j+1/2
= f(u
L
j+1/2
) = u
L
j+1/2
= ¯ u
j
(5.25)
where the L indicates that this approximation to f
j+1/2
is obtained from the approx
imation to u(x) in the cell to the left of x
j+1/2
, as shown in Fig. 5.1. The cell to the
right of x
j+1/2
, which is cell j + 1, gives
f
R
j+1/2
= ¯ u
j+1
(5.26)
Similarly, cell j is the cell to the right of x
j−1/2
, giving
f
R
j−1/2
= ¯ u
j
(5.27)
and cell j −1 is the cell to the left of x
j−1/2
, giving
f
L
j−1/2
= ¯ u
j−1
(5.28)
We have now accomplished the ﬁrst step from the list in Section 5.1; we have deﬁned
the ﬂuxes at the cell boundaries in terms of the cellaverage data. In this example,
the discontinuity in the ﬂux at the cell boundary is resolved by taking the average of
the ﬂuxes on either side of the boundary. Thus
ˆ
f
j+1/2
=
1
2
(f
L
j+1/2
+f
R
j+1/2
) =
1
2
(¯ u
j
+ ¯ u
j+1
) (5.29)
and
ˆ
f
j−1/2
=
1
2
(f
L
j−1/2
+f
R
j−1/2
) =
1
2
(¯ u
j−1
+ ¯ u
j
) (5.30)
where
ˆ
f denotes a numerical ﬂux which is an approximation to the exact ﬂux.
Substituting Eqs. 5.29 and 5.30 into the integral form, Eq. 5.23, we obtain
∆x
d¯ u
j
dt
+
1
2
(¯ u
j
+ ¯ u
j+1
) −
1
2
(¯ u
j−1
+ ¯ u
j
) = ∆x
d¯ u
j
dt
+
1
2
(¯ u
j+1
− ¯ u
j−1
) = 0 (5.31)
With periodic boundary conditions, this point operator produces the following semi
discrete form:
d
¯ u
dt
= −
1
2∆x
B
p
(−1, 0, 1)
¯ u (5.32)
5.3. ONEDIMENSIONAL EXAMPLES 77
This is identical to the expression obtained using secondorder centered diﬀerences,
except it is written in terms of the cell average
¯ u, rather than the nodal values, u.
Hence our analysis and understanding of the eigensystem of the matrix B
p
(−1, 0, 1)
is relevant to ﬁnitevolume methods as well as ﬁnitediﬀerence methods. Since the
eigenvalues of B
p
(−1, 0, 1) are pure imaginary, we can conclude that the use of the
average of the ﬂuxes on either side of the cell boundary, as in Eqs. 5.29 and 5.30, can
lead to a nondissipative ﬁnitevolume method.
5.3.2 A FourthOrder Approximation to the Convection Equa
tion
Let us replace the piecewise constant approximation in Section 5.3.1 with a piecewise
quadratic approximation as follows
u(ξ) = aξ
2
+bξ +c (5.33)
where ξ is again equal to x − x
j
. The three parameters a, b, and c are chosen to
satisfy the following constraints:
1
∆x
−∆x/2
−3∆x/2
u(ξ)dξ = ¯ u
j−1
1
∆x
∆x/2
−∆x/2
u(ξ)dξ = ¯ u
j
(5.34)
1
∆x
3∆x/2
∆x/2
u(ξ)dξ = ¯ u
j+1
These constraints lead to
a =
¯ u
j+1
−2¯ u
j
+ ¯ u
j−1
2∆x
2
b =
¯ u
j+1
− ¯ u
j−1
2∆x
(5.35)
c =
−¯ u
j−1
+ 26¯ u
j
− ¯ u
j+1
24
With these values of a, b, and c, the piecewise quadratic approximation produces
the following values at the cell boundaries:
u
L
j+1/2
=
1
6
(2¯ u
j+1
+ 5¯ u
j
− ¯ u
j−1
) (5.36)
78 CHAPTER 5. FINITEVOLUME METHODS
u
R
j−1/2
=
1
6
(−¯ u
j+1
+ 5¯ u
j
+ 2¯ u
j−1
) (5.37)
u
R
j+1/2
=
1
6
(−¯ u
j+2
+ 5¯ u
j+1
+ 2¯ u
j
) (5.38)
u
L
j−1/2
=
1
6
(2¯ u
j
+ 5¯ u
j−1
− ¯ u
j−2
) (5.39)
using the notation deﬁned in Section 5.3.1. Recalling that f = u, we again use the
average of the ﬂuxes on either side of the boundary to obtain
ˆ
f
j+1/2
=
1
2
[f(u
L
j+1/2
) +f(u
R
j+1/2
)]
=
1
12
(−¯ u
j+2
+ 7¯ u
j+1
+ 7¯ u
j
− ¯ u
j−1
) (5.40)
and
ˆ
f
j−1/2
=
1
2
[f(u
L
j−1/2
) +f(u
R
j−1/2
)]
=
1
12
(−¯ u
j+1
+ 7¯ u
j
+ 7¯ u
j−1
− ¯ u
j−2
) (5.41)
Substituting these expressions into the integral form, Eq. 5.23, gives
∆x
d¯ u
j
dt
+
1
12
(−¯ u
j+2
+ 8¯ u
j+1
−8¯ u
j−1
+ ¯ u
j−2
) = 0 (5.42)
This is a fourthorder approximation to the integral form of the equation, as can be
veriﬁed using Taylor series expansions (see question 1 at the end of this chapter).
With periodic boundary conditions, the following semidiscrete form is obtained:
d
¯ u
dt
= −
1
12∆x
B
p
(1, −8, 0, 8, −1)
¯ u (5.43)
This is a system of ODE’s governing the evolution of the cellaverage data.
5.3.3 A SecondOrder Approximation to the Diﬀusion Equa
tion
In this section, we describe two approaches to deriving a ﬁnitevolume approximation
to the diﬀusion equation. The ﬁrst approach is simpler to extend to multidimensions,
while the second approach is more suited to extension to higher order accuracy.
5.3. ONEDIMENSIONAL EXAMPLES 79
In one dimension, the integral form of the diﬀusion equation, Eq. 5.16, becomes
∆x
d¯ u
j
dt
+f
j+1/2
−f
j−1/2
= 0 (5.44)
with f = −∇u = −∂u/∂x. Also, Eq. 5.6 becomes
b
a
∂u
∂x
dx = u(b) −u(a) (5.45)
We can thus write the following expression for the average value of the gradient of u
over the interval x
j
≤ x ≤ x
j+1
:
1
∆x
x
j+1
x
j
∂u
∂x
dx =
1
∆x
(u
j+1
−u
j
) (5.46)
From Eq. 5.22, we know that the value of a continuous function at the center of a given
interval is equal to the average value of the function over the interval to secondorder
accuracy. Hence, to secondorder, we can write
ˆ
f
j+1/2
= −
∂u
∂x
j+1/2
= −
1
∆x
(¯ u
j+1
− ¯ u
j
) (5.47)
Similarly,
ˆ
f
j−1/2
= −
1
∆x
(¯ u
j
− ¯ u
j−1
) (5.48)
Substituting these into the integral form, Eq. 5.44, we obtain
∆x
d¯ u
j
dt
=
1
∆x
(¯ u
j−1
−2¯ u
j
+ ¯ u
j+1
) (5.49)
or, with Dirichlet boundary conditions,
d
¯ u
dt
=
1
∆x
2
B(1, −2, 1)
¯ u +
bc
(5.50)
This provides a semidiscrete ﬁnitevolume approximation to the diﬀusion equation,
and we see that the properties of the matrix B(1, −2, 1) are relevant to the study of
ﬁnitevolume methods as well as ﬁnitediﬀerence methods.
For our second approach, we use a piecewise quadratic approximation as in Section
5.3.2. From Eq. 5.33 we have
∂u
∂x
=
∂u
∂ξ
= 2aξ +b (5.51)
80 CHAPTER 5. FINITEVOLUME METHODS
with a and b given in Eq. 5.35. With f = −∂u/∂x, this gives
f
R
j+1/2
= f
L
j+1/2
= −
1
∆x
(¯ u
j+1
− ¯ u
j
) (5.52)
f
R
j−1/2
= f
L
j−1/2
= −
1
∆x
(¯ u
j
− ¯ u
j−1
) (5.53)
Notice that there is no discontinuity in the ﬂux at the cell boundary. This produces
d¯ u
j
dt
=
1
∆x
2
(¯ u
j−1
−2¯ u
j
+ ¯ u
j+1
) (5.54)
which is identical to Eq. 5.49. The resulting semidiscrete form with periodic bound
ary conditions is
d
¯ u
dt
=
1
∆x
2
B
p
(1, −2, 1)
¯ u (5.55)
which is written entirely in terms of cellaverage data.
5.4 A TwoDimensional Example
The above onedimensional examples of ﬁnitevolume approximations obscure some
of the practical aspects of such methods. Thus our ﬁnal example is a ﬁnitevolume
approximation to the twodimensional linear convection equation on a grid consisting
of regular triangles, as shown in Figure 5.2. As in Section 5.3.1, we use a piecewise
constant approximation in each control volume and the ﬂux at the control volume
boundary is the average of the ﬂuxes obtained on either side of the boundary. The
nodal data are stored at the vertices of the triangles formed by the grid. The control
volumes are regular hexagons with area A, ∆ is the length of the sides of the triangles,
and is the length of the sides of the hexagons. The following relations hold between
, ∆, and A.
=
1
√
3
∆
A =
3
√
3
2
2
A
=
2
3∆
(5.56)
The twodimensional form of the conservation law is
d
dt
A
QdA+
C
n.Fdl = 0 (5.57)
5.4. A TWODIMENSIONAL EXAMPLE 81
0
5
4
3
2
1
a
b c
d
e
f
l
p
∆
Figure 5.2: Triangular grid.
where we have ignored the source term. The contour in the line integral is composed
of the sides of the hexagon. Since these sides are all straight, the unit normals can
be taken outside the integral and the ﬂux balance is given by
d
dt
A
Q dA+
5
¸
ν=0
n
ν
·
ν
Fdl = 0
where ν indexes a side of the hexagon, as shown in Figure 5.2. A list of the normals
for the mesh orientation shown is given in Table 5.1.
Side, ν Outward Normal, n
0 (i −
√
3 j)/2
1 i
2 (i +
√
3 j)/2
3 (−i +
√
3 j)/2
4 −i
5 (−i −
√
3 j)/2
Table 5.1. Outward normals, see Fig. 5.2.
i and j are unit normals along x and y, respectively.
82 CHAPTER 5. FINITEVOLUME METHODS
For Eq. 5.11, the twodimensional linear convection equation, we have for side ν
n
ν
·
ν
Fdl = n
ν
· (i cos θ +j sinθ)
/2
−/2
u
ν
(ξ)dξ (5.58)
where ξ is a length measured from the middle of a side ν. Making the change of
variable z = ξ/, one has the expression
/2
−/2
u(ξ)dξ =
1/2
−1/2
u(z)dz (5.59)
Then, in terms of u and the hexagon area A, we have
d
dt
A
u dA +
5
¸
ν=0
n
ν
· (i cos θ +j sinθ)
¸
1/2
−1/2
u(z)dz
¸
ν
= 0 (5.60)
The values of n
ν
· (i cos θ + j sinθ) are given by the expressions in Table 5.2. There
are no numerical approximations in Eq. 5.60. That is, if the integrals in the equation
are evaluated exactly, the integrated time rate of change of the integral of u over the
area of the hexagon is known exactly.
Side, ν n
ν
· (i cos θ +j sinθ)
0 (cos θ −
√
3 sinθ)/2
1 cos θ
2 (cos θ +
√
3 sin θ)/2
3 (−cos θ +
√
3 sin θ)/2
4 −cos θ
5 (−cos θ −
√
3 sinθ)/2
Table 5.2. Weights of ﬂux integrals, see Eq. 5.60.
Introducing the cell average,
A
u dA = A¯ u
p
(5.61)
and the piecewiseconstant approximation u = ¯ u
p
over the entire hexagon, the ap
proximation to the ﬂux integral becomes trivial. Taking the average of the ﬂux on
either side of each edge of the hexagon gives for edge 1:
1
u(z)dz =
¯ u
p
+ ¯ u
a
2
1/2
−1/2
dz =
¯ u
p
+ ¯ u
a
2
(5.62)
5.5. PROBLEMS 83
Similarly, we have for the other ﬁve edges:
2
u(z)dz =
¯ u
p
+ ¯ u
b
2
(5.63)
3
u(z)dz =
¯ u
p
+ ¯ u
c
2
(5.64)
4
u(z)dz =
¯ u
p
+ ¯ u
d
2
(5.65)
5
u(z)dz =
¯ u
p
+ ¯ u
e
2
(5.66)
0
u(z)dz =
¯ u
p
+ ¯ u
f
2
(5.67)
Substituting these into Eq. 5.60, along with the expressions in Table 5.2, we obtain
A
d¯ u
p
dt
+
2
[(2 cos θ)(¯ u
a
− ¯ u
d
) + (cos θ +
√
3 sinθ)(¯ u
b
− ¯ u
e
)
+(−cos θ +
√
3 sinθ)(¯ u
c
− ¯ u
f
)] = 0 (5.68)
or
d¯ u
p
dt
+
1
3∆
[(2 cos θ)(¯ u
a
− ¯ u
d
) + (cos θ +
√
3 sinθ)(¯ u
b
− ¯ u
e
)
+(−cos θ +
√
3 sin θ)(¯ u
c
− ¯ u
f
)] = 0 (5.69)
The reader can verify, using Taylor series expansions, that this is a secondorder
approximation to the integral form of the twodimensional linear convection equation.
5.5 Problems
1. Use Taylor series to verify that Eq. 5.42 is a fourthorder approximation to Eq.
5.23.
2. Find the semidiscrete ODE form governing the cellaverage data resulting from
the use of a linear approximation in developing a ﬁnitevolume method for the
linear convection equation. Use the following linear approximation:
u(ξ) = aξ +b
84 CHAPTER 5. FINITEVOLUME METHODS
where b = ¯ u
j
and
a =
¯ u
j+1
− ¯ u
j−1
2∆x
and use the average ﬂux at the cell interface.
3. Using the ﬁrst approach given in Section 5.3.3, derive a ﬁnitevolume approx
imation to the spatial terms in the twodimensional diﬀusion equation on a
square grid.
4. Repeat question 3 for a grid consisting of equilateral triangles.
Chapter 6
TIMEMARCHING METHODS
FOR ODE’S
After discretizing the spatial derivatives in the governing PDE’s (such as the Navier
Stokes equations), we obtain a coupled system of nonlinear ODE’s in the form
du
dt
=
F(u, t) (6.1)
These can be integrated in time using a timemarching method to obtain a time
accurate solution to an unsteady ﬂow problem. For a steady ﬂow problem, spatial
discretization leads to a coupled system of nonlinear algebraic equations in the form
F(u) = 0 (6.2)
As a result of the nonlinearity of these equations, some sort of iterative method is
required to obtain a solution. For example, one can consider the use of Newton’s
method, which is widely used for nonlinear algebraic equations (See Section 6.10.3.).
This produces an iterative method in which a coupled system of linear algebraic
equations must be solved at each iteration. These can be solved iteratively using
relaxation methods, which will be discussed in Chapter 9, or directly using Gaussian
elimination or some variation thereof.
Alternatively, one can consider a timedependent path to the steady state and use
a timemarching method to integrate the unsteady form of the equations until the
solution is suﬃciently close to the steady solution. The subject of the present chapter,
timemarching methods for ODE’s, is thus relevant to both steady and unsteady ﬂow
problems. When using a timemarching method to compute steady ﬂows, the goal is
simply to remove the transient portion of the solution as quickly as possible; time
accuracy is not required. This motivates the study of stability and stiﬀness, topics
which are covered in the next two chapters.
85
86 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
Application of a spatial discretization to a PDE produces a coupled system of
ODE’s. Application of a timemarching method to an ODE produces an ordinary
diﬀerence equation (O∆E ). In earlier chapters, we developed exact solutions to our
model PDE’s and ODE’s. In this chapter we will present some basic theory of linear
O∆E’s which closely parallels that for linear ODE’s, and, using this theory, we will
develop exact solutions for the model O∆E’s arising from the application of time
marching methods to the model ODE’s.
6.1 Notation
Using the semidiscrete approach, we reduce our PDE to a set of coupled ODE’s
represented in general by Eq. 4.1. However, for the purpose of this chapter, we need
only consider the scalar case
du
dt
= u
= F(u, t) (6.3)
Although we use u to represent the dependent variable, rather than w, the reader
should recall the arguments made in Chapter 4 to justify the study of a scalar ODE.
Our ﬁrst task is to ﬁnd numerical approximations that can be used to carry out the
time integration of Eq. 6.3 to some given accuracy, where accuracy can be measured
either in a local or a global sense. We then face a further task concerning the numerical
stability of the resulting methods, but we postpone such considerations to the next
chapter.
In Chapter 2, we introduced the convention that the n subscript, or the (n) su
perscript, always points to a discrete time value, and h represents the time interval
∆t. Combining this notation with Eq. 6.3 gives
u
n
= F
n
= F(u
n
, t
n
) ; t
n
= nh
Often we need a more sophisticated notation for intermediate time steps involving
temporary calculations denoted by ˜ u, ¯ u, etc. For these we use the notation
˜ u
n+α
=
˜
F
n+α
= F(˜ u
n+α
, t
n
+αh)
The choice of u
or F to express the derivative in a scheme is arbitrary. They are
both commonly used in the literature on ODE’s.
The methods we study are to be applied to linear or nonlinear ODE’s, but the
methods themselves are formed by linear combinations of the dependent variable and
its derivative at various time intervals. They are represented conceptually by
u
n+1
= f
β
1
u
n+1
, β
0
u
n
, β
−1
u
n−1
, · · · , α
0
u
n
, α
−1
u
n−1
, · · ·
(6.4)
6.2. CONVERTING TIMEMARCHING METHODS TO O∆E’S 87
With an appropriate choice of the α
s and β
s, these methods can be constructed
to give a local Taylor series accuracy of any order. The methods are said to be
explicit if β
1
= 0 and implicit otherwise. An explicit method is one in which the new
predicted solution is only a function of known data, for example, u
n
, u
n−1
, u
n
, and
u
n−1
for a method using two previous time levels, and therefore the time advance is
simple. For an implicit method, the new predicted solution is also a function of the
time derivative at the new time level, that is, u
n+1
. As we shall see, for systems of
ODE’s and nonlinear problems, implicit methods require more complicated strategies
to solve for u
n+1
than explicit methods.
6.2 Converting TimeMarching Methods to O∆E’s
Examples of some very common forms of methods used for timemarching general
ODE’s are:
u
n+1
= u
n
+hu
n
(6.5)
u
n+1
= u
n
+hu
n+1
(6.6)
and
˜ u
n+1
= u
n
+hu
n
u
n+1
=
1
2
[u
n
+ ˜ u
n+1
+h˜ u
n+1
] (6.7)
According to the conditions presented under Eq. 6.4, the ﬁrst and third of these are
examples of explicit methods. We refer to them as the explicit Euler method and the
MacCormack predictorcorrector method,
1
respectively. The second is implicit and
referred to as the implicit (or backward) Euler method.
These methods are simple recipes for the time advance of a function in terms of its
value and the value of its derivative, at given time intervals. The material presented
in Chapter 4 develops a basis for evaluating such methods by introducing the concept
of the representative equation
du
dt
= u
= λu +ae
µt
(6.8)
written here in terms of the dependent variable, u. The value of this equation arises
from the fact that, by applying a timemarching method, we can analytically convert
1
Here we give only MacCormack’s timemarching method. The method commonly referred to as
MacCormack’s method, which is a fullydiscrete method, will be presented in Section 11.3
88 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
such a linear ODE into a linear O∆E . The latter are subject to a whole body of
analysis that is similar in many respects to, and just as powerful as, the theory of
ODE’s. We next consider examples of this conversion process and then go into the
general theory on solving O∆E’s.
Apply the simple explicit Euler scheme, Eq. 6.5, to Eq. 6.8. There results
u
n+1
= u
n
+h(λu
n
+ae
µhn
)
or
u
n+1
−(1 +λh)u
n
= hae
µhn
(6.9)
Eq. 6.9 is a linear O∆E , with constant coeﬃcients, expressed in terms of the depen
dent variable u
n
and the independent variable n. As another example, applying the
implicit Euler method, Eq. 6.6, to Eq. 6.8, we ﬁnd
u
n+1
= u
n
+h
λu
n+1
+ae
µh(n+1)
or
(1 −λh)u
n+1
−u
n
= he
µh
· ae
µhn
(6.10)
As a ﬁnal example, the predictorcorrector sequence, Eq. 6.7, gives
˜ u
n+1
−(1 +λh)u
n
= ahe
µhn
−
1
2
(1 +λh)˜ u
n+1
+u
n+1
−
1
2
u
n
=
1
2
ahe
µh(n+1)
(6.11)
which is a coupled set of linear O∆E’s with constant coeﬃcients. Note that the ﬁrst
line in Eq. 6.11 is identical to Eq. 6.9, since the predictor step in Eq. 6.7 is simply
the explicit Euler method. The second line in Eq. 6.11 is obtained by noting that
˜ u
n+1
= F(˜ u
n+1
, t
n
+h)
= λ˜ u
n+1
+ae
µh(n+1)
(6.12)
Now we need to develop techniques for analyzing these diﬀerence equations so that
we can compare the merits of the timemarching methods that generated them.
6.3 Solution of Linear O∆E’s With Constant Co
eﬃcients
The techniques for solving linear diﬀerence equations with constant coeﬃcients is as
well developed as that for ODE’s and the theory follows a remarkably parallel path.
This is demonstrated by repeating some of the developments in Section 4.2, but for
diﬀerence rather than diﬀerential equations.
6.3. SOLUTION OF LINEAR O∆E’S WITH CONSTANT COEFFICIENTS 89
6.3.1 First and SecondOrder Diﬀerence Equations
FirstOrder Equations
The simplest nonhomogeneous O∆E of interest is given by the single ﬁrstorder equa
tion
u
n+1
= σu
n
+ab
n
(6.13)
where σ, a, and b are, in general, complex parameters. The independent variable is
now n rather than t, and since the equations are linear and have constant coeﬃcients,
σ is not a function of either n or u. The exact solution of Eq. 6.13 is
u
n
= c
1
σ
n
+
ab
n
b −σ
where c
1
is a constant determined by the initial conditions. In terms of the initial
value of u it can be written
u
n
= u
0
σ
n
+a
b
n
−σ
n
b −σ
Just as in the development of Eq. 4.10, one can readily show that the solution of the
defective case, (b = σ),
u
n+1
= σu
n
+aσ
n
is
u
n
=
u
0
+anσ
−1
σ
n
This can all be easily veriﬁed by substitution.
SecondOrder Equations
The homogeneous form of a secondorder diﬀerence equation is given by
u
n+2
+a
1
u
n+1
+a
0
u
n
= 0 (6.14)
Instead of the diﬀerential operator D ≡
d
dt
used for ODE’s, we use for O∆E’s the
diﬀerence operator E (commonly referred to as the displacement or shift operator)
and deﬁned formally by the relations
u
n+1
= Eu
n
, u
n+k
= E
k
u
n
Further notice that the displacement operator also applies to exponents, thus
b
α
· b
n
= b
n+α
= E
α
· b
n
90 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
where α can be any fraction or irrational number.
The roles of D and E are the same insofar as once they have been introduced to
the basic equations the value of u(t) or u
n
can be factored out. Thus Eq. 6.14 can
now be reexpressed in an operational notion as
(E
2
+a
1
E +a
0
)u
n
= 0 (6.15)
which must be zero for all u
n
. Eq. 6.15 is known as the operational form of Eq. 6.14.
The operational form contains a characteristic polynomial P(E) which plays the same
role for diﬀerence equations that P(D) played for diﬀerential equations; that is, its
roots determine the solution to the O∆E. In the analysis of O∆E’s, we label these
roots σ
1
, σ
2
, · · ·, etc, and refer to them as the σroots. They are found by solving the
equation P(σ) = 0. In the simple example given above, there are just two σ roots
and in terms of them the solution can be written
u
n
= c
1
(σ
1
)
n
+c
2
(σ
2
)
n
(6.16)
where c
1
and c
2
depend upon the initial conditions. The fact that Eq. 6.16 is a
solution to Eq. 6.14 for all c
1
, c
2
and n should be veriﬁed by substitution.
6.3.2 Special Cases of Coupled FirstOrder Equations
A Complete System
Coupled, ﬁrstorder, linear homogeneous diﬀerence equations have the form
u
(n+1)
1
= c
11
u
(n)
1
+c
12
u
(n)
2
u
(n+1)
2
= c
21
u
(n)
1
+c
22
u
(n)
2
(6.17)
which can also be written
u
n+1
= C
u
n
,
u
n
=
u
(n)
1
, u
(n)
2
T
, C =
¸
c
11
c
12
c
21
c
22
The operational form of Eq. 6.17 can be written
¸
(c
11
−E) c
12
c
21
(c
22
−E)
¸
u
1
u
2
(n)
= [ C −E I ]
u
n
= 0
which must be zero for all u
1
and u
2
. Again we are led to a characteristic polynomial,
this time having the form P(E) = det [ C −E I ]. The σroots are found from
P(σ) = det
¸
(c
11
−σ) c
12
c
21
(c
22
−σ)
= 0
6.4. SOLUTION OF THE REPRESENTATIVE O∆E’S 91
Obviously the σ
k
are the eigenvalues of C and, following the logic of Section 4.2,
if
x are its eigenvectors, the solution of Eq. 6.17 is
u
n
=
2
¸
k=1
c
k
(σ
k
)
n
x
k
where c
k
are constants determined by the initial conditions.
A Defective System
The solution of O∆E’s with defective eigensystems follows closely the logic in Section
4.2.2 for defective ODE’s. For example, one can show that the solution to
¯ u
n+1
ˆ u
n+1
u
n+1
¸
¸
¸ =
σ
1 σ
1 σ
¸
¸
¸
¯ u
n
ˆ u
n
u
n
¸
¸
¸
is
¯ u
n
= ¯ u
0
σ
n
ˆ u
n
=
ˆ u
0
+ ¯ u
0
nσ
−1
σ
n
u
n
=
¸
u
0
+ ˆ u
0
nσ
−1
+ ¯ u
0
n(n −1)
2
σ
−2
¸
σ
n
(6.18)
6.4 Solution of the Representative O∆E’s
6.4.1 The Operational Form and its Solution
Examples of the nonhomogeneous, linear, ﬁrstorder ordinary diﬀerence equations,
produced by applying a timemarching method to the representative equation, are
given by Eqs. 6.9 to 6.11. Using the displacement operator, E, these equations can
be written
[E −(1 +λh)]u
n
= h · ae
µhn
(6.19)
[(1 −λh)E −1]u
n
= h · E · ae
µhn
(6.20)
¸
E −(1 +λh)
−
1
2
(1 +λh)E E −
1
2
¸ ¸
˜ u
u
¸
n
= h ·
¸
1
1
2
E
¸
· ae
µhn
(6.21)
92 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
All three of these equations are subsets of the operational form of the representative
O∆E
P(E)u
n
= Q(E) · ae
µhn
(6.22)
which is produced by applying timemarching methods to the representative ODE, Eq.
4.45. We can express in terms of Eq. 6.22 all manner of standard timemarching meth
ods having multiple time steps and various types of intermediate predictorcorrector
families. The terms P(E) and Q(E) are polynomials in E referred to as the charac
teristic polynomial and the particular polynomial, respectively.
The general solution of Eq. 6.22 can be expressed as
u
n
=
K
¸
k=1
c
k
(σ
k
)
n
+ae
µhn
·
Q(e
µh
)
P(e
µh
)
(6.23)
where σ
k
are the K roots of the characteristic polynomial, P(σ) = 0. When determi
nants are involved in the construction of P(E) and Q(E), as would be the case for
Eq. 6.21, the ratio Q(E)/P(E) can be found by Kramer’s rule. Keep in mind that
for methods such as in Eq. 6.21 there are multiple (two in this case) solutions, one
for u
n
and ˜ u
n
and we are usually only interested in the ﬁnal solution u
n
. Notice also,
the important subset of this solution which occurs when µ = 0, representing a time
invariant particular solution, or a steady state. In such a case
u
n
=
K
¸
k=1
c
k
(σ
k
)
n
+a ·
Q(1)
P(1)
6.4.2 Examples of Solutions to TimeMarching O∆E’s
As examples of the use of Eqs. 6.22 and 6.23, we derive the solutions of Eqs. 6.19 to
6.21. For the explicit Euler method, Eq. 6.19, we have
P(E) = E −1 −λh
Q(E) = h (6.24)
and the solution of its representative O∆E follows immediately from Eq. 6.23:
u
n
= c
1
(1 +λh)
n
+ae
µhn
·
h
e
µh
−1 −λh
For the implicit Euler method, Eq. 6.20, we have
P(E) = (1 −λh)E −1
Q(E) = hE (6.25)
6.5. THE λ −σ RELATION 93
so
u
n
= c
1
1
1 −λh
n
+ae
µhn
·
he
µh
(1 −λh)e
µh
−1
In the case of the coupled predictorcorrector equations, Eq. 6.21, one solves for the
ﬁnal family u
n
(one can also ﬁnd a solution for the intermediate family ˜ u), and there
results
P(E) = det
¸
E −(1 +λh)
−
1
2
(1 +λh)E E −
1
2
¸
= E
E −1 −λh −
1
2
λ
2
h
2
Q(E) = det
¸
E h
−
1
2
(1 +λh)E
1
2
hE
=
1
2
hE(E + 1 +λh)
The σroot is found from
P(σ) = σ
σ −1 −λh −
1
2
λ
2
h
2
= 0
which has only one nontrivial root (σ = 0 is simply a shift in the reference index).
The complete solution can therefore be written
u
n
= c
1
1 +λh +
1
2
λ
2
h
2
n
+ae
µhn
·
1
2
h
e
µh
+ 1 + λh
e
µh
−1 −λh −
1
2
λ
2
h
2
(6.26)
6.5 The λ −σ Relation
6.5.1 Establishing the Relation
We have now been introduced to two basic kinds of roots, the λroots and the σroots.
The former are the eigenvalues of the A matrix in the ODE’s found by space diﬀer
encing the original PDE, and the latter are the roots of the characteristic polynomial
in a representative O∆E found by applying a timemarching method to the repre
sentative ODE. There is a fundamental relation between the two which can be used
to identify many of the essential properties of a timemarch method. This relation is
ﬁrst demonstrated by developing it for the explicit Euler method.
First we make use of the semidiscrete approach to ﬁnd a system of ODE’s and
then express its solution in the form of Eq. 4.27. Remembering that t = nh, one can
write
u(t) = c
1
e
λ
1
h
n
x
1
+· · · +c
m
e
λmh
n
x
m
+· · · +c
M
e
λ
M
h
n
x
M
+ P.S. (6.27)
94 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
where for the present we are not interested in the form of the particular solution
(P.S.). Now the explicit Euler method produces for each λroot, one σroot, which
is given by σ = 1 + λh. So if we use the Euler method for the time advance of the
ODE’s, the solution
2
of the resulting O∆E is
u
n
= c
1
(σ
1
)
n
x
1
+· · · +c
m
(σ
m
)
n
x
m
+· · · +c
M
(σ
M
)
n
x
M
+ P.S. (6.28)
where the c
m
and the
x
m
in the two equations are identical and σ
m
= (1 +λ
m
h).
Comparing Eq. 6.27 and Eq. 6.28, we see a correspondence between σ
m
and e
λmh
.
Since the value of e
λh
can be expressed in terms of the series
e
λh
= 1 +λh +
1
2
λ
2
h
2
+
1
6
λ
3
h
3
+· · · +
1
n!
λ
n
h
n
+· · ·
the truncated expansion σ = 1 +λh is a reasonable
3
approximation for small enough
λh.
Suppose, instead of the Euler method, we use the leapfrog method for the time
advance, which is deﬁned by
u
n+1
= u
n−1
+ 2hu
n
(6.29)
Applying Eq. 6.8 to Eq. 6.29, we have the characteristic polynomial P(E) = E
2
−
2λhE −1, so that for every λ the σ must satisfy the relation
σ
2
m
−2λ
m
hσ
m
−1 = 0 (6.30)
Now we notice that each λ produces two σroots. For one of these we ﬁnd
σ
m
= λ
m
h +
1 +λ
2
m
h
2
(6.31)
= 1 +λ
m
h +
1
2
λ
2
m
h
2
−
1
8
λ
4
m
h
4
+· · · (6.32)
This is an approximation to e
λmh
with an error O(λ
3
h
3
). The other root, λ
m
h −
1 +λ
2
m
h
2
, will be discussed in Section 6.5.3.
2
Based on Section 4.4.
3
The error is O(λ
2
h
2
).
6.5. THE λ −σ RELATION 95
6.5.2 The Principal σRoot
Based on the above we make the following observation:
Application of the same timemarching method to all of the equations in
a coupled system linear ODE’s in the form of Eq. 4.6, always produces
one σroot for every λroot that satisﬁes the relation
σ = 1 +λh +
1
2
λ
2
h
2
+· · · +
1
k!
λ
k
h
k
+O
h
k+1
where k is the order of the timemarching method.
(6.33)
We refer to the root that has the above property as the principal σroot, and designate
it (σ
m
)
1
. The above property can be stated regardless of the details of the time
marching method, knowing only that its leading error is O
h
k+1
. Thus the principal
root is an approximation to e
λh
up to O
h
k
.
Note that a secondorder approximation to a derivative written in the form
(δ
t
u)
n
=
1
2h
(u
n+1
−u
n−1
) (6.34)
has a leading truncation error which is O(h
2
), while the secondorder timemarching
method which results from this approximation, which is the leapfrog method:
u
n+1
= u
n−1
+ 2hu
n
(6.35)
has a leading truncation error O(h
3
). This arises simply because of our notation
for the timemarching method in which we have multiplied through by h to get an
approximation for the function u
n+1
rather than the derivative as in Eq. 6.34. The
following example makes this clear. Consider a solution obtained at a given time T
using a secondorder timemarching method with a time step h. Now consider the
solution obtained using the same method with a time step h/2. Since the error per
time step is O(h
3
), this is reduced by a factor of eight (considering the leading term
only). However, twice as many time steps are required to reach the time T. Therefore
the error at the end of the simulation is reduced by a factor of four, consistent with
a secondorder approximation.
6.5.3 Spurious σRoots
We saw from Eq. 6.30 that the λ −σ relation for the leapfrog method produces two
σroots for each λ. One of these we identiﬁed as the principal root which always
96 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
has the property given in 6.33. The other is referred to as a spurious σroot and
designated (σ
m
)
2
. In general, the λ−σ relation produced by a timemarching scheme
can result in multiple σroots all of which, except for the principal one, are spurious.
All spurious roots are designated (σ
m
)
k
where k = 2, 3, · · ·. No matter whether a
σroot is principal or spurious, it is always some algebraic function of the product λh.
To express this fact we use the notation σ = σ(λh).
If a timemarching method produces spurious σroots, the solution for the O∆E in
the form shown in Eq. 6.28 must be modiﬁed. Following again the message of Section
4.4, we have
u
n
= c
11
(σ
1
)
n
1
x
1
+· · · +c
m1
(σ
m
)
n
1
x
m
+· · · +c
M1
(σ
M
)
n
1
x
M
+P.S.
+c
12
(σ
1
)
n
2
x
1
+· · · +c
m2
(σ
m
)
n
2
x
m
+· · · +c
M2
(σ
M
)
n
2
x
M
+c
13
(σ
1
)
n
3
x
1
+· · · +c
m3
(σ
m
)
n
3
x
m
+· · · +c
M3
(σ
M
)
n
3
x
M
+etc., if there are more spurious roots (6.36)
Spurious roots arise if a method uses data from time level n − 1 or earlier to
advance the solution from time level n to n + 1. Such roots originate entirely from
the numerical approximation of the timemarching method and have nothing to do
with the ODE being solved. However, generation of spurious roots does not, in itself,
make a method inferior. In fact, many very accurate methods in practical use for
integrating some forms of ODE’s have spurious roots.
It should be mentioned that methods with spurious roots are not self starting.
For example, if there is one spurious root to a method, all of the coeﬃcients (c
m
)
2
in Eq. 6.36 must be initialized by some starting procedure. The initial vector u
0
does not provide enough data to initialize all of the coeﬃcients. This results because
methods which produce spurious roots require data from time level n − 1 or earlier.
For example, the leapfrog method requires u
n−1
and thus cannot be started using
only u
n
.
Presumably (i.e., if one starts the method properly) the spurious coeﬃcients are
all initialized with very small magnitudes, and presumably the magnitudes of the
spurious roots themselves are all less than one (see Chapter 7). Then the presence of
spurious roots does not contaminate the answer. That is, after some ﬁnite time the
amplitude of the error associated with the spurious roots is even smaller then when
it was initialized. Thus while spurious roots must be considered in stability analysis,
they play virtually no role in accuracy analysis.
6.5.4 OneRoot TimeMarching Methods
There are a number of timemarching methods that produce only one σroot for each
λroot. We refer to them as oneroot methods. They are also called onestep methods.
6.6. ACCURACY MEASURES OF TIMEMARCHING METHODS 97
They have the signiﬁcant advantage of being selfstarting which carries with it the
very useful property that the timestep interval can be changed at will throughout
the marching process. Three oneroot methods were analyzed in Section 6.4.2. A
popular method having this property, the socalled θmethod, is given by the formula
u
n+1
= u
n
+h
(1 −θ)u
n
+θu
n+1
The θmethod represents the explicit Euler (θ = 0), the trapezoidal (θ =
1
2
), and the
implicit Euler methods (θ = 1), respectively. Its λ −σ relation is
σ =
1 + (1 −θ)λh
1 −θλh
It is instructive to compare the exact solution to a set of ODE’s (with a complete
eigensystem) having timeinvariant forcing terms with the exact solution to the O∆E’s
for oneroot methods. These are
u(t) = c
1
e
λ
1
h
n
x
1
+· · · +c
m
e
λmh
n
x
m
+· · · +c
M
e
λ
M
h
n
x
M
+A
−1
f
u
n
= c
1
(σ
1
)
n
x
1
+· · · +c
m
(σ
m
)
n
x
m
+· · · +c
M
(σ
M
)
n
x
M
+A
−1
f (6.37)
respectively. Notice that when t and n = 0, these equations are identical, so that all
the constants, vectors, and matrices are identical except the
u and the terms inside
the parentheses on the right hand sides. The only error made by introducing the time
marching is the error that σ makes in approximating e
λh
.
6.6 Accuracy Measures of TimeMarching Meth
ods
6.6.1 Local and Global Error Measures
There are two broad categories of errors that can be used to derive and evaluate time
marching methods. One is the error made in each time step. This is a local error such
as that found from a Taylor table analysis, see Section 3.4. It is usually used as the
basis for establishing the order of a method. The other is the error determined at the
end of a given event which has covered a speciﬁc interval of time composed of many
time steps. This is a global error. It is useful for comparing methods, as we shall see
in Chapter 8.
It is quite common to judge a timemarching method on the basis of results found
from a Taylor table. However, a Taylor series analysis is a very limited tool for ﬁnding
the more subtle properties of a numerical timemarching method. For example, it is
of no use in:
98 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
• ﬁnding spurious roots.
• evaluating numerical stability and separating the errors in phase and amplitude.
• analyzing the particular solution of predictorcorrector combinations.
• ﬁnding the global error.
The latter three of these are of concern to us here, and to study them we make use of
the material developed in the previous sections of this chapter. Our error measures
are based on the diﬀerence between the exact solution to the representative ODE,
given by
u(t) = ce
λt
+
ae
µt
µ −λ
(6.38)
and the solution to the representative O∆E’s, including only the contribution from
the principal root, which can be written as
u
n
= c
1
(σ
1
)
n
+ae
µhn
·
Q(e
µh
)
P(e
µh
)
(6.39)
6.6.2 Local Accuracy of the Transient Solution (er
λ
, σ , er
ω
)
Transient error
The particular choice of an error measure, either local or global, is to some extent
arbitrary. However, a necessary condition for the choice should be that the measure
can be used consistently for all methods. In the discussion of the λσ relation we
saw that all timemarching methods produce a principal σroot for every λroot that
exists in a set of linear ODE’s. Therefore, a very natural local error measure for the
transient solution is the value of the diﬀerence between solutions based on these two
roots. We designate this by er
λ
and make the following deﬁnition
er
λ
≡ e
λh
−σ
1
The leading error term can be found by expanding in a Taylor series and choosing
the ﬁrst nonvanishing term. This is similar to the error found from a Taylor table.
The order of the method is the last power of λh matched exactly.
6.6. ACCURACY MEASURES OF TIMEMARCHING METHODS 99
Amplitude and Phase Error
Suppose a λ eigenvalue is imaginary. Such can indeed be the case when we study the
equations governing periodic convection which produces harmonic motion. For such
cases it is more meaningful to express the error in terms of amplitude and phase.
Let λ = iω where ω is a real number representing a frequency. Then the numerical
method must produce a principal σroot that is complex and expressible in the form
σ
1
= σ
r
+iσ
i
≈ e
iωh
(6.40)
From this it follows that the local error in amplitude is measured by the deviation of
σ
1
 from unity, that is
er
a
= 1 −σ
1
 = 1 −
(σ
1
)
2
r
+ (σ
1
)
2
i
and the local error in phase can be deﬁned as
er
ω
≡ ωh −tan
−1
[(σ
1
)
i
/(σ
1
)
r
)] (6.41)
Amplitude and phase errors are important measures of the suitability of timemarching
methods for convection and wave propagation phenomena.
The approach to error analysis described in Section 3.5 can be extended to the
combination of a spatial discretization and a timemarching method applied to the
linear convection equation. The principal root, σ
1
(λh), is found using λ = −iaκ
∗
,
where κ
∗
is the modiﬁed wavenumber of the spatial discretization. Introducing the
Courant number, C
n
= ah/∆x, we have λh = −iC
n
κ
∗
∆x. Thus one can obtain
values of the principal root over the range 0 ≤ κ∆x ≤ π for a given value of the
Courant number. The above expression for er
ω
can be normalized to give the error
in the phase speed, as follows
er
p
=
er
ω
ωh
= 1 +
tan
−1
[(σ
1
)
i
/(σ
1
)
r
)]
C
n
κ∆x
(6.42)
where ω = −aκ. A positive value of er
p
corresponds to phase lag (the numerical phase
speed is too small), while a negative value corresponds to phase lead (the numerical
phase speed is too large).
6.6.3 Local Accuracy of the Particular Solution (er
µ
)
The numerical error in the particular solution is found by comparing the particular
solution of the ODE with that for the O∆E. We have found these to be given by
P.S.
(ODE)
= ae
µt
·
1
(µ −λ)
100 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
and
P.S.
(O∆E)
= ae
µt
·
Q(e
µh
)
P(e
µh
)
respectively. For a measure of the local error in the particular solution we introduce
the deﬁnition
er
µ
≡ h
P.S.
(O∆E)
P.S.
(ODE)
−1
¸
(6.43)
The multiplication by h converts the error from a global measure to a local one, so
that the order of er
λ
and er
µ
are consistent. In order to determine the leading error
term, Eq. 6.43 can be written in terms of the characteristic and particular polynomials
as
er
µ
=
c
o
µ −λ
·
(µ −λ)Q
e
µh
−P
e
µh
¸
(6.44)
where
c
o
= lim
h→0
h(µ −λ)
P
e
µh
The value of c
o
is a methoddependent constant that is often equal to one. If the
forcing function is independent of time, µ is equal to zero, and for this case, many
numerical methods generate an er
µ
that is also zero.
The algebra involved in ﬁnding the order of er
µ
can be quite tedious. However,
this order is quite important in determining the true order of a timemarching method
by the process that has been outlined. An illustration of this is given in the section
on RungeKutta methods.
6.6.4 Time Accuracy For Nonlinear Applications
In practice, timemarching methods are usually applied to nonlinear ODE’s, and it
is necessary that the advertised order of accuracy be valid for the nonlinear cases as
well as for the linear ones. A necessary condition for this to occur is that the local
accuracies of both the transient and the particular solutions be of the same order.
More precisely, a timemarching method is said to be of order k if
er
λ
= c
1
· (λh)
k
1
+1
(6.45)
er
µ
= c
2
· (λh)
k
2
+1
(6.46)
where k = smallest of(k
1
, k
2
) (6.47)
6.6. ACCURACY MEASURES OF TIMEMARCHING METHODS 101
The reader should be aware that this is not suﬃcient. For example, to derive all of
the necessary conditions for the fourthorder RungeKutta method presented later
in this chapter the derivation must be performed for a nonlinear ODE. However, the
analysis based on a linear nonhomogeneous ODE produces the appropriate conditions
for the majority of timemarching methods used in CFD.
6.6.5 Global Accuracy
In contrast to the local error measures which have just been discussed, we can also
deﬁne global error measures. These are useful when we come to the evaluation of
timemarching methods for speciﬁc purposes. This subject is covered in Chapter 8
after our introduction to stability in Chapter 7.
Suppose we wish to compute some timeaccurate phenomenon over a ﬁxed interval
of time using a constant time step. We refer to such a computation as an “event”.
Let T be the ﬁxed time of the event and h be the chosen step size. Then the required
number of time steps, is N, given by the relation
T = Nh
Global error in the transient
A natural extension of er
λ
to cover the error in an entire event is given by
Er
λ
≡ e
λT
−(σ
1
(λh))
N
(6.48)
Global error in amplitude and phase
If the event is periodic, we are more concerned with the global error in amplitude and
phase. These are given by
Er
a
= 1 −
(σ
1
)
2
r
+ (σ
1
)
2
i
N
(6.49)
and
Er
ω
≡ N
¸
ωh −tan
−1
(σ
1
)
i
(σ
1
)
r
¸
= ωT −N tan
−1
[(σ
1
)
i
/(σ
1
)
r
] (6.50)
102 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
Global error in the particular solution
Finally, the global error in the particular solution follows naturally by comparing the
solutions to the ODE and the O∆E. It can be measured by
Er
µ
≡ (µ −λ)
Q
e
µh
P
e
µh
−1
6.7 Linear Multistep Methods
In the previous sections, we have developed the framework of error analysis for time
advance methods and have randomly introduced a few methods without addressing
motivational, developmental or design issues. In the subsequent sections, we introduce
classes of methods along with their associated error analysis. We shall not spend much
time on development or design of these methods, since most of them have historic
origins from a wide variety of disciplines and applications. The Linear Multistep
Methods (LMM’s) are probably the most natural extension to time marching of the
space diﬀerencing schemes introduced in Chapter 3 and can be analyzed for accuracy
or designed using the Taylor table approach of Section 3.4.
6.7.1 The General Formulation
When applied to the nonlinear ODE
du
dt
= u
= F(u, t)
all linear multistep methods can be expressed in the general form
1
¸
k=1−K
α
k
u
n+k
= h
1
¸
k=1−K
β
k
F
n+k
(6.51)
where the notation for F is deﬁned in Section 6.1. The methods are said to be linear
because the α’s and β’s are independent of u and n, and they are said to be Kstep
because K timelevels of data are required to marching the solution one timestep, h.
They are explicit if β
1
= 0 and implicit otherwise.
When Eq. 6.51 is applied to the representative equation, Eq. 6.8, and the result is
expressed in operational form, one ﬁnds
¸
1
¸
k=1−K
α
k
E
k
¸
u
n
= h
¸
1
¸
k=1−K
β
k
E
k
¸
(λu
n
+ae
µhn
) (6.52)
6.7. LINEAR MULTISTEP METHODS 103
We recall from Section 6.5.2 that a timemarching method when applied to the repre
sentative equation must provide a σroot, labeled σ
1
, that approximates e
λh
through
the order of the method. The condition referred to as consistency simply means that
σ → 1 as h → 0, and it is certainly a necessary condition for the accuracy of any
time marching method. We can also agree that, to be of any value in time accuracy,
a method should at least be ﬁrstorder accurate, that is σ →(1 +λh) as h →0. One
can show that these conditions are met by any method represented by Eq. 6.51 if
¸
k
α
k
= 0 and
¸
k
β
k
=
¸
k
(K +k −1)α
k
Since both sides of Eq. 6.51 can be multiplied by an arbitrary constant, these methods
are often “normalized” by requiring
¸
k
β
k
= 1
Under this condition c
o
= 1 in Eq. 6.44.
6.7.2 Examples
There are many special explicit and implicit forms of linear multistep methods. Two
wellknown families of them, referred to as AdamsBashforth (explicit) and Adams
Moulton (implicit), can be designed using the Taylor table approach of Section 3.4.
The AdamsMoulton family is obtained from Eq. 6.51 with
α
1
= 1, α
0
= −1, α
k
= 0, k = −1, −2, · · · (6.53)
The AdamsBashforth family has the same α’s with the additional constraint that
β
1
= 0. The threestep AdamsMoulton method can be written in the following form
u
n+1
= u
n
+h(β
1
u
n+1
+β
0
u
n
+β
−1
u
n−1
+β
−2
u
n−2
) (6.54)
A Taylor table for Eq. 6.54 can be generated as
u
n
h · u
n
h
2
· u
n
h
3
· u
n
h
4
· u
n
u
n+1
1 1
1
2
1
6
1
24
−u
n
−1
−hβ
1
u
n+1
−β
1
−β
1
−β
1
1
2
−β
1
1
6
−hβ
0
u
n
−β
0
−hβ
−1
u
n−1
−β
−1
β
−1
−β
−1
1
2
β
−1
1
6
−hβ
−2
u
n−2
−(−2)
0
β
−2
−(−2)
1
β
−2
−(−2)
2
β
−2
1
2
−(−2)
3
β
−2
1
6
Table 6.1. Taylor table for the AdamsMoulton threestep linear multistep method.
104 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
This leads to the linear system
1 1 1 1
2 0 −2 −4
3 0 3 12
4 0 −4 −32
¸
¸
¸
¸
¸
β
1
β
0
β
−1
β
−2
¸
¸
¸
¸
¸
=
1
1
1
1
¸
¸
¸
¸
¸
(6.55)
to solve for the β’s, resulting in
β
1
= 9/24, β
0
= 19/24, β
−1
= −5/24, β
−2
= 1/24 (6.56)
which produces a method which is fourthorder accurate.
4
With β
1
= 0 one obtains
1 1 1
0 −2 −4
0 3 12
¸
¸
¸
β
0
β
−1
β
−2
¸
¸
¸ =
1
1
1
¸
¸
¸ (6.57)
giving
β
0
= 23/12, β
−1
= −16/12, β
−2
= 5/12 (6.58)
This is the thirdorder AdamsBashforth method.
A list of simple methods, some of which are very common in CFD applications,
is given below together with identifying names that are sometimes associated with
them. In the following material AB(n) and AM(n) are used as abbreviations for the
(n)th order AdamsBashforth and (n)th order AdamsMoulton methods. One can
verify that the Adams type schemes given below satisfy Eqs. 6.55 and 6.57 up to the
order of the method.
Explicit Methods
u
n+1
= u
n
+hu
n
Euler
u
n+1
= u
n−1
+ 2hu
n
Leapfrog
u
n+1
= u
n
+
1
2
h
3u
n
−u
n−1
AB2
u
n+1
= u
n
+
h
12
23u
n
−16u
n−1
+ 5u
n−2
AB3
Implicit Methods
u
n+1
= u
n
+hu
n+1
Implicit Euler
u
n+1
= u
n
+
1
2
h
u
n
+u
n+1
Trapezoidal (AM2)
u
n+1
=
1
3
4u
n
−u
n−1
+ 2hu
n+1
2ndorder Backward
u
n+1
= u
n
+
h
12
5u
n+1
+ 8u
n
−u
n−1
AM3
4
Recall from Section 6.5.2 that a kthorder timemarching method has a leading truncation error
term which is O(h
k+1
).
6.7. LINEAR MULTISTEP METHODS 105
6.7.3 TwoStep Linear Multistep Methods
High resolution CFD problems usually require very large data sets to store the spatial
information from which the time derivative is calculated. This limits the interest
in multistep methods to about two time levels. The most general twostep linear
multistep method (i.e., K=2 in Eq. 6.51), that is at least ﬁrstorder accurate, can be
written as
(1 +ξ)u
n+1
= [(1 + 2ξ)u
n
−ξu
n−1
] +h
θu
n+1
+ (1 −θ +ϕ)u
n
−ϕu
n−1
(6.59)
Clearly the methods are explicit if θ = 0 and implicit otherwise. A list of methods
contained in Eq. 6.59 is given in Table 6.2. Notice that the Adams methods have
ξ = 0, which corresponds to α
−1
= 0 in Eq. 6.51. Methods with ξ = −1/2, which
corresponds to α
0
= 0 in Eq. 6.51, are known as Milne methods.
θ ξ ϕ Method Order
0 0 0 Euler 1
1 0 0 Implicit Euler 1
1/2 0 0 Trapezoidal or AM2 2
1 1/2 0 2nd Order Backward 2
3/4 0 −1/4 Adams type 2
1/3 −1/2 −1/3 Lees Type 2
1/2 −1/2 −1/2 Two–step trapezoidal 2
5/9 −1/6 −2/9 A–contractive 2
0 −1/2 0 Leapfrog 2
0 0 1/2 AB2 2
0 −5/6 −1/3 Most accurate explicit 3
1/3 −1/6 0 Third–order implicit 3
5/12 0 1/12 AM3 3
1/6 −1/2 −1/6 Milne 4
Table 6.2. Some linear one and twostep methods, see Eq. 6.59.
One can show after a little algebra that both er
µ
and er
λ
are reduced to 0(h
3
)
(i.e., the methods are 2ndorder accurate) if
ϕ = ξ −θ +
1
2
The class of all 3rdorder methods is determined by imposing the additional constraint
ξ = 2θ −
5
6
Finally a unique fourthorder method is found by setting θ = −ϕ = −ξ/3 =
1
6
.
106 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
6.8 PredictorCorrector Methods
There are a wide variety of predictorcorrector schemes created and used for a variety
of purposes. Their use in solving ODE’s is relatively easy to illustrate and understand.
Their use in solving PDE’s can be much more subtle and demands concepts
5
which
have no counterpart in the analysis of ODE’s.
Predictorcorrector methods constructed to timemarch linear or nonlinear ODE’s
are composed of sequences of linear multistep methods, each of which is referred to
as a family in the solution process. There may be many families in the sequence, and
usually the ﬁnal family has a higher Taylorseries order of accuracy than the inter
mediate ones. Their use is motivated by ease of application and increased eﬃciency,
where measures of eﬃciency are discussed in the next two chapters.
A simple onepredictor, onecorrector example is given by
˜ u
n+α
= u
n
+αhu
n
u
n+1
= u
n
+h
β˜ u
n+α
+γu
n
(6.60)
where the parameters α, β and γ are arbitrary parameters to be determined. One
can analyze this sequence by applying it to the representative equation and using
the operational techniques outlined in Section 6.4. It is easy to show, following the
example leading to Eq. 6.26, that
P(E) = E
α
·
E −1 −(γ +β)λh −αβλ
2
h
2
(6.61)
Q(E) = E
α
· h · [βE
α
+γ +αβλh] (6.62)
Considering only local accuracy, one is led, by following the discussion in Section 6.6,
to the following observations. For the method to be secondorder accurate both er
λ
and er
µ
must be O(h
3
). For this to hold for er
λ
, it is obvious from Eq. 6.61 that
γ +β = 1 ; αβ =
1
2
which provides two equations for three unknowns. The situation for er
µ
requires some
algebra, but it is not diﬃcult to show using Eq. 6.44 that the same conditions also
make it O(h
3
). One concludes, therefore, that the predictorcorrector sequence
˜ u
n+α
= u
n
+αhu
n
u
n+1
= u
n
+
1
2
h
¸
1
α
˜ u
n+α
+
2α −1
α
u
n
(6.63)
is a secondorder accurate method for any α.
5
Such as alternating direction, fractionalstep, and hybrid methods.
6.9. RUNGEKUTTA METHODS 107
A classical predictorcorrector sequence is formed by following an AdamsBashforth
predictor of any order with an AdamsMoulton corrector having an order one higher.
The order of the combination is then equal to the order of the corrector. If the order
of the corrector is (k), we refer to these as ABM(k) methods. The AdamsBashforth
Moulton sequence for k = 3 is
˜ u
n+1
= u
n
+
1
2
h
3u
n
−u
n−1
u
n+1
= u
n
+
h
12
5˜ u
n+1
+ 8u
n
−u
n−1
(6.64)
Some simple, speciﬁc, secondorder accurate methods are given below. The Gazdag
method, which we discuss in Chapter 8, is
˜ u
n+1
= u
n
+
1
2
h
3˜ u
n
− ˜ u
n−1
u
n+1
= u
n
+
1
2
h
˜ u
n
+ ˜ u
n+1
(6.65)
The Burstein method, obtained from Eq. 6.63 with α = 1/2 is
˜ u
n+1/2
= u
n
+
1
2
hu
n
u
n+1
= u
n
+h˜ u
n+1/2
(6.66)
and, ﬁnally, MacCormack’s method, presented earlier in this chapter, is
˜ u
n+1
= u
n
+hu
n
u
n+1
=
1
2
[u
n
+ ˜ u
n+1
+h˜ u
n+1
] (6.67)
Note that MacCormack’s method can also be written as
˜ u
n+1
= u
n
+hu
n
u
n+1
= u
n
+
1
2
h[u
n
+ ˜ u
n+1
] (6.68)
from which it is clear that it is obtained from Eq. 6.63 with α = 1.
6.9 RungeKutta Methods
There is a special subset of predictorcorrector methods, referred to as RungeKutta
methods,
6
that produce just one σroot for each λroot such that σ(λh) corresponds
6
Although implicit and multistep RungeKutta methods exist, we will consider only singlestep,
explicit RungeKutta methods here.
108 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
to the Taylor series expansion of e
λh
out through the order of the method and then
truncates. Thus for a RungeKutta method of order k (up to 4th order), the principal
(and only) σroot is given by
σ = 1 +λh +
1
2
λ
2
h
2
+· · · +
1
k!
λ
k
h
k
(6.69)
It is not particularly diﬃcult to build this property into a method, but, as we pointed
out in Section 6.6.4, it is not suﬃcient to guarantee k’th order accuracy for the solution
of u
= F(u, t) or for the representative equation. To ensure k’th order accuracy, the
method must further satisfy the constraint that
er
µ
= O(h
k+1
) (6.70)
and this is much more diﬃcult.
The most widely publicized RungeKutta process is the one that leads to the
fourthorder method. We present it below in some detail. It is usually introduced in
the form
k
1
= hF(u
n
, t
n
)
k
2
= hF(u
n
+βk
1
, t
n
+αh)
k
3
= hF(u
n
+β
1
k
1
+γ
1
k
2
, t
n
+α
1
h)
k
4
= hF(u
n
+β
2
k
1
+γ
2
k
2
+δ
2
k
3
, t
n
+α
2
h)
followed by
u(t
n
+h) −u(t
n
) = µ
1
k
1
+µ
2
k
2
+µ
3
k
3
+µ
4
k
4
(6.71)
However, we prefer to present it using predictorcorrector notation. Thus, a scheme
entirely equivalent to 6.71 is
´ u
n+α
= u
n
+βhu
n
˜ u
n+α
1
= u
n
+β
1
hu
n
+γ
1
h´ u
n+α
u
n+α
2
= u
n
+β
2
hu
n
+γ
2
h´ u
n+α
+δ
2
h˜ u
n+α
1
u
n+1
= u
n
+µ
1
hu
n
+µ
2
h´ u
n+α
+µ
3
h˜ u
n+α
1
+µ
4
hu
n+α
2
(6.72)
Appearing in Eqs. 6.71 and 6.72 are a total of 13 parameters which are to be
determined such that the method is fourthorder according to the requirements in
Eqs. 6.69 and 6.70. First of all, the choices for the time samplings, α, α
1
, and α
2
, are
not arbitrary. They must satisfy the relations
α = β
α
1
= β
1
+γ
1
α
2
= β
2
+γ
2
+δ
2
(6.73)
6.9. RUNGEKUTTA METHODS 109
The algebra involved in ﬁnding algebraic equations for the remaining 10 parameters
is not trivial, but the equations follow directly from ﬁnding P(E) and Q(E) and then
satisfying the conditions in Eqs. 6.69 and 6.70. Using Eq. 6.73 to eliminate the β’s
we ﬁnd from Eq. 6.69 the four conditions
µ
1
+µ
2
+µ
3
+µ
4
= 1 (1)
µ
2
α +µ
3
α
1
+µ
4
α
2
= 1/2 (2)
µ
3
αγ
1
+µ
4
(αγ
2
+α
1
δ
2
) = 1/6 (3)
µ
4
αγ
1
δ
2
= 1/24 (4)
(6.74)
These four relations guarantee that the ﬁve terms in σ exactly match the ﬁrst 5 terms
in the expansion of e
λh
. To satisfy the condition that er
µ
= O(k
5
), we have to fulﬁll
four more conditions
µ
2
α
2
+µ
3
α
2
1
+µ
4
α
2
2
= 1/3 (3)
µ
2
α
3
+µ
3
α
3
1
+µ
4
α
3
2
= 1/4 (4)
µ
3
α
2
γ
1
+µ
4
(α
2
γ
2
+α
2
1
δ
2
) = 1/12 (4)
µ
3
αα
1
γ
1
+µ
4
α
2
(αγ
2
+α
1
δ
2
) = 1/8 (4)
(6.75)
The number in parentheses at the end of each equation indicates the order that
is the basis for the equation. Thus if the ﬁrst 3 equations in 6.74 and the ﬁrst
equation in 6.75 are all satisﬁed, the resulting method would be thirdorder accurate.
As discussed in Section 6.6.4, the fourth condition in Eq. 6.75 cannot be derived
using the methodology presented here, which is based on a linear nonhomogenous
representative ODE. A more general derivation based on a nonlinear ODE can be
found in several books.
7
There are eight equations in 6.74 and 6.75 which must be satisﬁed by the 10
unknowns. Since the equations are overdetermined, two parameters can be set arbi
trarily. Several choices for the parameters have been proposed, but the most popular
one is due to Runge. It results in the “standard” fourthorder RungeKutta method
expressed in predictorcorrector form as
´ u
n+1/2
= u
n
+
1
2
hu
n
˜ u
n+1/2
= u
n
+
1
2
h´ u
n+1/2
u
n+1
= u
n
+h˜ u
n+1/2
u
n+1
= u
n
+
1
6
h
u
n
+ 2
´ u
n+1/2
+ ˜ u
n+1/2
+u
n+1
(6.76)
7
The present approach based on a linear inhomogeneous equation provides all of the necessary
conditions for RungeKutta methods of up to third order.
110 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
Notice that this represents the simple sequence of conventional linear multistep meth
ods referred to, respectively, as
Euler Predictor
Euler Corrector
Leapfrog Predictor
Milne Corrector
≡ RK4
One can easily show that both the Burstein and the MacCormack methods given by
Eqs. 6.66 and 6.67 are secondorder RungeKutta methods, and thirdorder methods
can be derived from Eqs. 6.72 by setting µ
4
= 0 and satisfying only Eqs. 6.74 and the
ﬁrst equation in 6.75. It is clear that for orders one through four, RK methods of order
k require k evaluations of the derivative function to advance the solution one time
step. We shall discuss the consequences of this in Chapter 8. Higherorder Runge
Kutta methods can be developed, but they require more derivative evaluations than
their order. For example, a ﬁfthorder method requires six evaluations to advance
the solution one step. In any event, storage requirements reduce the usefulness of
RungeKutta methods of order higher than four for CFD applications.
6.10 Implementation of Implicit Methods
We have presented a wide variety of timemarching methods and shown how to derive
their λ − σ relations. In the next chapter, we will see that these methods can have
widely diﬀerent properties with respect to stability. This leads to various trade
oﬀs which must be considered in selecting a method for a speciﬁc application. Our
presentation of the timemarching methods in the context of a linear scalar equation
obscures some of the issues involved in implementing an implicit method for systems
of equations and nonlinear equations. These are covered in this Section.
6.10.1 Application to Systems of Equations
Consider ﬁrst the numerical solution of our representative ODE
u
= λu +ae
µt
(6.77)
using the implicit Euler method. Following the steps outlined in Section 6.2, we
obtained
(1 −λh)u
n+1
−u
n
= he
µh
· ae
µhn
(6.78)
6.10. IMPLEMENTATION OF IMPLICIT METHODS 111
Solving for u
n+1
gives
u
n+1
=
1
1 −λh
(u
n
+he
µh
· ae
µhn
) (6.79)
This calculation does not seem particularly onerous in comparison with the applica
tion of an explicit method to this ODE, requiring only an additional division.
Now let us apply the implicit Euler method to our generic system of equations
given by
u
= Au −
f(t) (6.80)
where u and
f are vectors and we still assume that A is not a function of u or t. Now
the equivalent to Eq. 6.78 is
(I −hA)u
n+1
−u
n
= −h
f(t +h) (6.81)
or
u
n+1
= (I −hA)
−1
[u
n
−h
f(t +h)] (6.82)
The inverse is not actually performed, but rather we solve Eq. 6.81 as a linear system
of equations. For our onedimensional examples, the system of equations which must
be solved is tridiagonal (e.g., for biconvection, A = −aB
p
(−1, 0, 1)/2∆x), and hence
its solution is inexpensive, but in multidimensions the bandwidth can be very large. In
general, the cost per time step of an implicit method is larger than that of an explicit
method. The primary area of application of implicit methods is in the solution of
stiﬀ ODE’s, as we shall see in Chapter 8.
6.10.2 Application to Nonlinear Equations
Now consider the general nonlinear scalar ODE given by
du
dt
= F(u, t) (6.83)
Application of the implicit Euler method gives
u
n+1
= u
n
+ hF(u
n+1
, t
n+1
) (6.84)
This is a nonlinear diﬀerence equation. As an example, consider the nonlinear ODE
du
dt
+
1
2
u
2
= 0 (6.85)
112 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
solved using implicit Euler time marching, which gives
u
n+1
+h
1
2
u
2
n+1
= u
n
(6.86)
which requires a nontrivial method to solve for u
n+1
. There are several diﬀerent
approaches one can take to solving this nonlinear diﬀerence equation. An iterative
method, such as Newton’s method (see below), can be used. In practice, the “initial
guess” for this nonlinear problem can be quite close to the solution, since the “initial
guess” is simply the solution at the previous time step, which implies that a lineariza
tion approach may be quite successful. Such an approach is described in the next
Section.
6.10.3 Local Linearization for Scalar Equations
General Development
Let us start the process of local linearization by considering Eq. 6.83. In order to
implement the linearization, we expand F(u, t) about some reference point in time.
Designate the reference value by t
n
and the corresponding value of the dependent
variable by u
n
. A Taylor series expansion about these reference quantities gives
F(u, t) = F(u
n
, t
n
) +
∂F
∂u
n
(u −u
n
) +
∂F
∂t
n
(t −t
n
)
+
1
2
∂
2
F
∂u
2
n
(u −u
n
)
2
+
∂
2
F
∂u∂t
n
(u −u
n
)(t −t
n
)
+
1
2
∂
2
F
∂t
2
n
(t −t
n
)
2
+· · · (6.87)
On the other hand, the expansion of u(t) in terms of the independent variable t is
u(t) = u
n
+ (t −t
n
)
∂u
∂t
n
+
1
2
(t −t
n
)
2
∂
2
u
∂t
2
n
+· · · (6.88)
If t is within h of t
n
, both (t − t
n
)
k
and (u − u
n
)
k
are O(h
k
), and Eq. 6.87 can be
written
F(u, t) = F
n
+
∂F
∂u
n
(u −u
n
) +
∂F
∂t
n
(t −t
n
) +O(h
2
) (6.89)
6.10. IMPLEMENTATION OF IMPLICIT METHODS 113
Notice that this is an expansion of the derivative of the function. Thus, relative to the
order of expansion of the function, it represents a secondorderaccurate, locallylinear
approximation to F(u, t) that is valid in the vicinity of the reference station t
n
and
the corresponding u
n
= u(t
n
). With this we obtain the locally (in the neighborhood
of t
n
) timelinear representation of Eq. 6.83, namely
du
dt
=
∂F
∂u
n
u +
F
n
−
∂F
∂u
n
u
n
+
∂F
∂t
n
(t −t
n
) +O(h
2
) (6.90)
Implementation of the Trapezoidal Method
As an example of how such an expansion can be used, consider the mechanics of
applying the trapezoidal method for the time integration of Eq. 6.83. The trapezoidal
method is given by
u
n+1
= u
n
+
1
2
h[F
n+1
+F
n
] +hO(h
2
) (6.91)
where we write hO(h
2
) to emphasize that the method is second order accurate. Using
Eq. 6.89 to evaluate F
n+1
= F(u
n+1
, t
n+1
), one ﬁnds
u
n+1
= u
n
+
1
2
h
¸
F
n
+
∂F
∂u
n
(u
n+1
−u
n
) +h
∂F
∂t
n
+O(h
2
) +F
n
¸
+hO(h
2
) (6.92)
Note that the O(h
2
) term within the brackets (which is due to the local linearization)
is multiplied by h and therefore is the same order as the hO(h
2
) error from the
Trapezoidal Method. The use of local time linearization updated at the end of each
time step, and the trapezoidal time march, combine to make a secondorderaccurate
numerical integration process. There are, of course, other secondorder implicit time
marching methods that can be used. The important point to be made here is that
local linearization updated at each time step has not reduced the order of accuracy of
a secondorder timemarching process.
A very useful reordering of the terms in Eq. 6.92 results in the expression
¸
1 −
1
2
h
∂F
∂u
n
¸
∆u
n
= hF
n
+
1
2
h
2
∂F
∂t
n
(6.93)
which is now in the delta form which will be formally introduced in Section 12.6. In
many ﬂuid mechanic applications the nonlinear function F is not an explicit function
114 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
of t. In such cases the partial derivative of F(u) with respect to t is zero and Eq. 6.93
simpliﬁes to the secondorder accurate expression
¸
1 −
1
2
h
∂F
∂u
n
¸
∆u
n
= hF
n
(6.94)
Notice that the RHS is extremely simple. It is the product of h and the RHS of
the basic equation evaluated at the previous time step. In this example, the basic
equation was the simple scalar equation 6.83, but for our applications, it is generally
the spacediﬀerenced form of the steadystate equation of some ﬂuid ﬂow problem.
A numerical timemarching procedure using Eq. 6.94 is usually implemented as
follows:
1. Solve for the elements of h
F
n
, store them in an array say
R, and save u
n
.
2. Solve for the elements of the matrix multiplying ∆u
n
and store in some appro
priate manner making use of sparseness or bandedness of the matrix if possible.
Let this storage area be referred to as B.
3. Solve the coupled set of linear equations
B∆u
n
=
R
for ∆u
n
. (Very seldom does one ﬁnd B
−1
in carrying out this step).
4. Find u
n+1
by adding ∆u
n
to u
n
, thus
u
n+1
= ∆u
n
+u
n
The solution for u
n+1
is generally stored such that it overwrites the value of u
n
and the process is repeated.
Implementation of the Implicit Euler Method
We have seen that the ﬁrstorder implicit Euler method can be written
u
n+1
= u
n
+hF
n+1
(6.95)
if we introduce Eq. 6.90 into this method, rearrange terms, and remove the explicit
dependence on time, we arrive at the form
¸
1 −h
∂F
∂u
n
¸
∆u
n
= hF
n
(6.96)
6.10. IMPLEMENTATION OF IMPLICIT METHODS 115
We see that the only diﬀerence between the implementation of the trapezoidal method
and the implicit Euler method is the factor of
1
2
in the brackets of the left side of
Eqs. 6.94 and 6.96. Omission of this factor degrades the method in time accuracy by
one order of h. We shall see later that this method is an excellent choice for steady
problems.
Newton’s Method
Consider the limit h → ∞ of Eq. 6.96 obtained by dividing both sides by h and
setting 1/h = 0. There results
−
∂F
∂u
n
∆u
n
= F
n
(6.97)
or
u
n+1
= u
n
−
¸
∂F
∂u
n
¸
−1
F
n
(6.98)
This is the wellknown Newton method for ﬁnding the roots of a nonlinear equation
F(u) = 0. The fact that it has quadratic convergence is veriﬁed by a glance at Eqs.
6.87 and 6.88 (remember the dependence on t has been eliminated for this case). By
quadratic convergence, we mean that the error after a given iteration is proportional
to the square of the error at the previous iteration, where the error is the diﬀerence
between the current solution and the converged solution. Quadratic convergence is
thus a very powerful property. Use of a ﬁnite value of h in Eq. 6.96 leads to linear
convergence, i.e., the error at a given iteration is some multiple of the error at the
previous iteration. The reader should ponder the meaning of letting h → ∞ for the
trapezoidal method, given by Eq. 6.94.
6.10.4 Local Linearization for Coupled Sets of Nonlinear Equa
tions
In order to present this concept, let us consider an example involving some sim
ple boundarylayer equations. We choose the FalknerSkan equations from classical
boundarylayer theory. Our task is to apply the implicit trapezoidal method to the
equations
d
3
f
dt
3
+f
d
2
f
dt
2
+β
¸
1 −
df
dt
2
¸
= 0 (6.99)
Here f represents a dimensionless stream function, and β is a scaling factor.
116 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
First of all we reduce Eq. 6.99 to a set of ﬁrstorder nonlinear equations by the
transformations
u
1
=
d
2
f
dt
2
, u
2
=
df
dt
, u
3
= f (6.100)
This gives the coupled set of three nonlinear equations
u
1
= F
1
= −u
1
u
3
−β
1 −u
2
2
u
2
= F
2
= u
1
u
3
= F
3
= u
2
(6.101)
and these can be represented in vector notation as
d
u
dt
=
F(
u) (6.102)
Now we seek to make the same local expansion that derived Eq. 6.90, except that
this time we are faced with a nonlinear vector function, rather than a simple nonlinear
scalar function. The required extension requires the evaluation of a matrix, called
the Jacobian matrix.
8
Let us refer to this matrix as A. It is derived from Eq. 6.102
by the following process
A = (a
ij
) = ∂F
i
/∂u
j
(6.103)
For the general case involving a third order matrix this is
A =
∂F
1
∂u
1
∂F
1
∂u
2
∂F
1
∂u
3
∂F
2
∂u
1
∂F
2
∂u
2
∂F
2
∂u
3
∂F
3
∂u
1
∂F
3
∂u
2
∂F
3
∂u
3
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
(6.104)
The expansion of
F(
u) about some reference state
u
n
can be expressed in a way
similar to the scalar expansion given by eq 6.87. Omitting the explicit dependency
on the independent variable t, and deﬁning
F
n
as
F(
u
n
), one has
9
8
Recall that we derived the Jacobian matrices for the twodimensional Euler equations in Section
2.2
9
The Taylor series expansion of a vector contains a vector for the ﬁrst term, a matrix times a
vector for the second term, and tensor products for the terms of higher order.
6.11. PROBLEMS 117
F(
u) =
F
n
+A
n
u −
u
n
+O(h
2
) (6.105)
where t −t
n
and the argument for O(h
2
) is the same as in the derivation of Eq. 6.88.
Using this we can write the local linearization of Eq. 6.102 as
d
u
dt
= A
n
u +
¸
F
n
−A
n
u
n
. .. .
“constant”
+O(h
2
) (6.106)
which is a locallylinear, secondorderaccurate approximation to a set of coupled
nonlinear ordinary diﬀerential equations that is valid for t ≤ t
n
+ h. Any ﬁrst or
secondorder timemarching method, explicit or implicit, could be used to integrate
the equations without loss in accuracy with respect to order. The number of times,
and the manner in which, the terms in the Jacobian matrix are updated as the solution
proceeds depends, of course, on the nature of the problem.
Returning to our simple boundarylayer example, which is given by Eq. 6.101, we
ﬁnd the Jacobian matrix to be
A =
−u
3
2βu
2
−u
1
1 0 0
0 1 0
¸
¸
¸ (6.107)
The student should be able to derive results for this example that are equivalent to
those given for the scalar case in Eq. 6.93. Thus for the FalknerSkan equations the
trapezoidal method results in
1 +
h
2
(u
3
)
n
−βh(u
2
)
n
h
2
(u
1
)
n
−
h
2
1 0
0 −
h
2
1
¸
¸
¸
¸
¸
(∆u
1
)
n
(∆u
2
)
n
(∆u
3
)
n
¸
¸
¸=h
−(u
1
u
3
)
n
−β(1 −u
2
2
)
n
(u
1
)
n
(u
2
)
n
¸
¸
¸
We ﬁnd
u
n+1
from ∆
u
n
+
u
n
, and the solution is now advanced one step. Reevaluate
the elements using
u
n+1
and continue. Without any iterating within a step advance,
the solution will be secondorderaccurate in time.
6.11 Problems
1. Find an expression for the nth term in the Fibonacci series, which is given
by 1, 1, 2, 3, 5, 8, . . . Note that the series can be expressed as the solution to a
diﬀerence equation of the form u
n+1
= u
n
+ u
n−1
. What is u
25
? (Let the ﬁrst
term given above be u
0
.)
118 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
2. The trapezoidal method u
n+1
= u
n
+
1
2
h(u
n+1
+u
n
) is used to solve the repre
sentative ODE.
(a) What is the resulting O∆E?
(b) What is its exact solution?
(c) How does the exact steadystate solution of the O∆E compare with the
exact steadystate solution of the ODE if µ = 0?
3. The 2ndorder backward method is given by
u
n+1
=
1
3
4u
n
−u
n−1
+ 2hu
n+1
(a) Write the O∆E for the representative equation. Identify the polynomials
P(E) and Q(E).
(b) Derive the λσ relation. Solve for the σroots and identify them as principal
or spurious.
(c) Find er
λ
and the ﬁrst two nonvanishing terms in a Taylor series expansion
of the spurious root.
(d) Perform a σroot trace relative to the unit circle for both diﬀusion and
convection.
4. Consider the timemarching scheme given by
u
n+1
= u
n−1
+
2h
3
(u
n+1
+u
n
+u
n−1
)
(a) Write the O∆E for the representative equation. Identify the polynomials
P(E) and Q(E).
(b) Derive the λ −σ relation.
(c) Find er
λ
.
5. Find the diﬀerence equation which results from applying the Gazdag predictor
corrector method (Eq. 6.65) to the representative equation. Find the λσ rela
tion.
6. Consider the following timemarching method:
˜ u
n+1/3
= u
n
+hu
n
/3
¯ u
n+1/2
= u
n
+h˜ u
n+1/3
/2
u
n+1
= u
n
+h¯ u
n+1/2
6.11. PROBLEMS 119
Find the diﬀerence equation which results from applying this method to the
representative equation. Find the λσ relation. Find the solution to the diﬀer
ence equation, including the homogeneous and particular solutions. Find er
λ
and er
µ
. What order is the homogeneous solution? What order is the particular
solution? Find the particular solution if the forcing term is ﬁxed.
7. Write a computer program to solve the onedimensional linear convection equa
tion with periodic boundary conditions and a = 1 on the domain 0 ≤ x ≤ 1.
Use 2ndorder centered diﬀerences in space and a grid of 50 points. For the
initial condition, use
u(x, 0) = e
−0.5[(x−0.5)/σ]
2
with σ = 0.08. Use the explicit Euler, 2ndorder AdamsBashforth (AB2), im
plicit Euler, trapezoidal, and 4thorder RungeKutta methods. For the explicit
Euler and AB2 methods, use a Courant number, ah/∆x, of 0.1; for the other
methods, use a Courant number of unity. Plot the solutions obtained at t = 1
compared to the exact solution (which is identical to the initial condition).
8. Repeat problem 7 using 4thorder (noncompact) diﬀerences in space. Use only
4thorder RungeKutta time marching at a Courant number of unity. Show
solutions at t = 1 and t = 10 compared to the exact solution.
9. Using the computer program written for problem 7, compute the solution at
t = 1 using 2ndorder centered diﬀerences in space coupled with the 4thorder
RungeKutta method for grids of 100, 200, and 400 nodes. On a loglog scale,
plot the error given by
M
¸
j=1
(u
j
−u
exact
j
)
2
M
where M is the number of grid nodes and u
exact
is the exact solution. Find the
global order of accuracy from the plot.
10. Using the computer program written for problem 8, repeat problem 9 using
4thorder (noncompact) diﬀerences in space.
11. Write a computer program to solve the onedimensional linear convection equa
tion with inﬂowoutﬂow boundary conditions and a = 1 on the domain 0 ≤
x ≤ 1. Let u(0, t) = sin ωt with ω = 10π. Run until a periodic steady state
is reached which is independent of the initial condition and plot your solution
120 CHAPTER 6. TIMEMARCHING METHODS FOR ODE’S
compared with the exact solution. Use 2ndorder centered diﬀerences in space
with a 1storder backward diﬀerence at the outﬂow boundary (as in Eq. 3.69)
together with 4thorder RungeKutta time marching. Use grids with 100, 200,
and 400 nodes and plot the error vs. the number of grid nodes, as described in
problem 9. Find the global order of accuracy.
12. Repeat problem 11 using 4thorder (noncompact) centered diﬀerences. Use a
thirdorder forwardbiased operator at the inﬂow boundary (as in Eq. 3.67). At
the last grid node, derive and use a 3rdorder backward operator (using nodes
j −3, j −2, j −1, and j) and at the second last node, use a 3rdorder backward
biased operator (using nodes j −2, j −1, j, and j +1; see problem 1 in Chapter
3).
13. Using the approach described in Section 6.6.2, ﬁnd the phase speed error, er
p
,
and the amplitude error, er
a
, for the combination of secondorder centered dif
ferences and 1st, 2nd, 3rd, and 4thorder RungeKutta timemarching at a
Courant number of unity. Also plot the phase speed error obtained using exact
integration in time, i.e., that obtained using the spatial discretization alone.
Note that the required σroots for the various RungeKutta methods can be
deduced from Eq. 6.69, without actually deriving the methods. Explain your
results.
Chapter 7
STABILITY OF LINEAR
SYSTEMS
A general deﬁnition of stability is neither simple nor universal and depends on the
particular phenomenon being considered. In the case of the nonlinear ODE’s of
interest in ﬂuid dynamics, stability is often discussed in terms of ﬁxed points and
attractors. In these terms a system is said to be stable in a certain domain if, from
within that domain, some norm of its solution is always attracted to the same ﬁxed
point. These are important and interesting concepts, but we do not dwell on them in
this work. Our basic concern is with timedependent ODE’s and O∆E ’s in which the
coeﬃcient matrices are independent of both u and t; see Section 4.2. We will refer
to such matrices as stationary. Chapters 4 and 6 developed the representative forms
of ODE’s generated from the basic PDE’s by the semidiscrete approach, and then
the O∆E’s generated from the representative ODE’s by application of timemarching
methods. These equations are represented by
d
u
dt
= A
u −
f(t) (7.1)
and
u
n+1
= C
u
n
−
g
n
(7.2)
respectively. For a onestep method, the latter form is obtained by applying a time
marching method to the generic ODE form in a fairly straightforward manner. For
example, the explicit Euler method leads to C = I + hA, and g
n
=
f(nh). Methods
involving two or more steps can always be written in the form of Eq. 7.2 by introducing
new dependent variables. Note also that only methods in which the time and space
discretizations are treated separately can be written in an intermediate semidiscrete
form such as Eq. 7.1. The fullydiscrete form, Eq. 7.2 and the associated stability
deﬁnitions and analysis are applicable to all methods.
121
122 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
7.1 Dependence on the Eigensystem
Our deﬁnitions of stability are based entirely on the behavior of the homogeneous
parts of Eqs. 7.1 and 7.2. The stability of Eq. 7.1 depends entirely on the eigensys
tem
1
of A. The stability of Eq. 7.2 can often also be related to the eigensystem of
its matrix. However, in this case the situation is not quite so simple since, in our
applications to partial diﬀerential equations (especially hyperbolic ones), a stability
deﬁnition can depend on both the time and space diﬀerencing. This is discussed in
Section 7.4. Analysis of these eigensystems has the important added advantage that
it gives an estimate of the rate at which a solution approaches a steadystate if a
system is stable. Consideration will be given to matrices that have both complete
and defective eigensystems, see Section 4.2.3, with a reminder that a complete system
can be arbitrarily close to a defective one, in which case practical applications can
make the properties of the latter appear to dominate.
If A and C are stationary, we can, in theory at least, estimate their fundamental
properties. For example, in Section 4.3.2, we found from our model ODE’s for dif
fusion and periodic convection what could be expected for the eigenvalue spectrums
of practical physical problems containing these phenomena. These expectations are
referred to many times in the following analysis of stability properties. They are
important enough to be summarized by the following:
• For diﬀusion dominated ﬂows the λeigenvalues tend to lie along the negative
real axis.
• For periodic convectiondominated ﬂows the λeigenvalues tend to lie along the
imaginary axis.
In many interesting cases, the eigenvalues of the matrices in Eqs. 7.1 and 7.2
are suﬃcient to determine the stability. In previous chapters, we designated these
eigenvalues as λ
m
and σ
m
for Eqs. 7.1 and 7.2, respectively, and we will ﬁnd it
convenient to examine the stability of various methods in both the complex λ and
complex σ planes.
1
This is not the case if the coeﬃcient matrix depends on t even if it is linear.
7.2. INHERENT STABILITY OF ODE’S 123
7.2 Inherent Stability of ODE’s
7.2.1 The Criterion
Here we state the standard stability criterion used for ordinary diﬀerential equa
tions.
For a stationary matrix A, Eq. 7.1 is inherently stable if, when
f is
constant,
u remains bounded as t →∞.
(7.3)
Note that inherent stability depends only on the transient solution of the ODE’s.
7.2.2 Complete Eigensystems
If a matrix has a complete eigensystem, all of its eigenvectors are linearly independent,
and the matrix can be diagonalized by a similarity transformation. In such a case it
follows at once from Eq. 6.27, for example, that the ODE’s are inherently stable if
and only if
(λ
m
) ≤ 0 for all m (7.4)
This states that, for inherent stability, all of the λ eigenvalues must lie on, or to the
left of, the imaginary axis in the complex λ plane. This criterion is satisﬁed for the
model ODE’s representing both diﬀusion and biconvection. It should be emphasized
(as it is an important practical consideration in convectiondominated systems) that
the special case for which λ = ±i is included in the domain of stability. In this case
it is true that
u does not decay as t → ∞, but neither does it grow, so the above
condition is met. Finally we note that for ODE’s with complete eigensystems the
eigenvectors play no role in the inherent stability criterion.
7.2.3 Defective Eigensystems
In order to understand the stability of ODE’s that have defective eigensystems, we
inspect the nature of their solutions in eigenspace. For this we draw on the results in
Sections 4.2.3 and especially on Eqs. 4.18 to 4.19 in that section. In an eigenspace
related to defective systems the form of the representative equation changes from a
single equation to a Jordan block. For example, instead of Eq. 4.45 a typical form of
the homogeneous part might be
u
1
u
2
u
3
¸
¸
¸ =
λ
1 λ
1 λ
¸
¸
¸
u
1
u
2
u
3
¸
¸
¸
124 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
for which one ﬁnds the solution
u
1
(t) = u
1
(0)e
λt
u
2
(t) = [u
2
(0) +u
1
(0)t]e
λt
u
3
(t) =
¸
u
3
(0) +u
2
(0)t +
1
2
u
1
(0)t
2
e
λt
(7.5)
Inspecting this solution, we see that for such cases condition 7.4 must be modiﬁed
to the form
(λ
m
) < 0 for all m (7.6)
since for pure imaginary λ, u
2
and u
3
would grow without bound (linearly or quadrat
ically) if u
2
(0) = 0 or u
1
(0) = 0. Theoretically this condition is suﬃcient for stability
in the sense of Statement 7.3 since t
k
e
−t
→0 as t →∞for all nonzero . However,
in practical applications the criterion may be worthless since there may be a very
large growth of the polynomial before the exponential “takes over” and brings about
the decay. Furthermore, on a computer such a growth might destroy the solution
process before it could be terminated.
Note that the stability condition 7.6 excludes the imaginary axis which tends to be
occupied by the eigenvalues related to biconvection problems. However, condition 7.6
is of little or no practical importance if signiﬁcant amounts of dissipation are present.
7.3 Numerical Stability of O∆E ’s
7.3.1 The Criterion
The O∆E companion to Statement 7.3 is
For a stationary matrix C, Eq. 7.2 is numerically stable if, when
g is
constant,
u
n
remains bounded as n →∞.
(7.7)
We see that numerical stability depends only on the transient solution of the O∆E ’s.
This deﬁnition of stability is sometimes referred to as asymptotic or time stability.
As we stated at the beginning of this chapter, stability deﬁnitions are not unique. A
deﬁnition often used in CFD literature stems from the development of PDE solutions
that do not necessarily follow the semidiscrete route. In such cases it is appropriate
to consider simultaneously the eﬀects of both the time and space approximations. A
timespace domain is ﬁxed and stability is deﬁned in terms of what happens to some
norm of the solution within this domain as the mesh intervals go to zero at some
constant ratio. We discuss this point of view in Section 7.4.
7.4. TIMESPACE STABILITY AND CONVERGENCE OF O∆E’S 125
7.3.2 Complete Eigensystems
Consider a set of O∆E ’s governed by a complete eigensystem. The stability criterion,
according to the condition set in Eq. 7.7, follows at once from a study of Eq. 6.28
and its companion for multiple σroots, Eq. 6.36. Clearly, for such systems a time
marching method is numerically stable if and only if
(σ
m
)
k
 ≤ 1 for all m and k (7.8)
This condition states that, for numerical stability, all of the σ eigenvalues (both
principal and spurious, if there are any) must lie on or inside the unit circle in the
complex σplane.
This deﬁnition of stability for O∆E ’s is consistent with the stability deﬁnition for
ODE’s. Again the sensitive case occurs for the periodicconvection model which places
the “correct” location of the principal σ–root precisely on the unit circle where the
solution is only neutrally stable. Further, for a complete eigensystem, the eigenvectors
play no role in the numerical stability assessment.
7.3.3 Defective Eigensystems
The discussion for these systems parallels the discussion for defective ODE’s. Examine
Eq. 6.18 and note its similarity with Eq. 7.5. We see that for defective O∆E’s the
required modiﬁcation to 7.8 is
(σ
m
)
k
 < 1 for all m and k (7.9)
since defective systems do not guarantee boundedness for σ = 1, for example in Eq.
7.5 if σ = 1 and either u
2
(0) or u
1
(0) = 0 we get linear or quadratic growth.
7.4 TimeSpace Stability and Convergence of O∆E’s
Let us now examine the concept of stability in a diﬀerent way. In the previous
discussion we considered in some detail the following approach:
1. The PDE’s are converted to ODE’s by approximating the space derivatives on
a ﬁnite mesh.
2. Inherent stability of the ODE’s is established by guaranteeing that (λ) ≤ 0.
3. Timemarch methods are developed which guarantee that σ(λh) ≤ 1 and this
is taken to be the condition for numerical stability.
126 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
This does guarantee that a stationary system, generated from a PDE on some ﬁxed
space mesh, will have a numerical solution that is bounded as t = nh → ∞. This
does not guarantee that desirable solutions are generated in the time march process
as both the time and space mesh intervals approach zero.
Now let us deﬁne stability in the timespace sense. First construct a ﬁnite time
space domain lying within 0 ≤ x ≤ L and 0 ≤ t ≤ T. Cover this domain with a grid
that is equispaced in both time and space and ﬁx the mesh ratio by the equation
2
c
n
=
∆t
∆x
Next reduce our O∆E approximation of the PDE to a twolevel (i.e., two timeplanes)
formula in the form of Eq. 7.2. The homogeneous part of this formula is
u
n+1
= Cu
n
(7.10)
Eq. 7.10 is said to be stable if any bounded initial vector, u
0
, produces a bounded
solution vector, u
n
, as the mesh shrinks to zero for a ﬁxed c
n
. This is the classical
deﬁnition of stability. It is often referred to as Lax or LaxRichtmyer stability. Clearly
as the mesh intervals go to zero, the number of time steps, N, must go to inﬁnity in
order to cover the entire ﬁxed domain, so the criterion in 7.7 is a necessary condition
for this stability criterion.
The signiﬁcance of this deﬁnition of stability arises through Lax’s Theorem, which
states that, if a numerical method is stable (in the sense of Lax) and consistent then
it is convergent. A method is consistent if it produces no error (in the Taylor series
sense) in the limit as the mesh spacing and the time step go to zero (with c
n
ﬁxed, in
the hyperbolic case). This is further discussed in Section 7.8. A method is convergent
if it converges to the exact solution as the mesh spacing and time step go to zero in
this manner.
3
Clearly, this is an important property.
Applying simple recursion to Eq. 7.10, we ﬁnd
u
n
= C
n
u
0
and using vector and matrix pnorms (see Appendix A) and their inequality relations,
we have
u
n
 = C
n
u
0
 ≤ C
n
 · u
0
 ≤ C
n
· u
0
 (7.11)
2
This ratio is appropriate for hyperbolic problems; a diﬀerent ratio may be needed for parabolic
problems such as the model diﬀusion equation.
3
In the CFD literature, the word converge is used with two entirely diﬀerent meanings. Here we
refer to a numerical solution converging to the exact solution of the PDE. Later we will refer to the
numerical solution converging to a steady state solution.
7.4. TIMESPACE STABILITY AND CONVERGENCE OF O∆E’S 127
Since the initial data vector is bounded, the solution vector is bounded if
C ≤ 1 (7.12)
where C represents any pnorm of C. This is often used as a suﬃcient condition
for stability.
Now we need to relate the stability deﬁnitions given in Eqs. 7.8 and 7.9 with that
given in Eq. 7.12. In Eqs. 7.8 and 7.9, stability is related to the spectral radius of
C, i.e., its eigenvalue of maximum magnitude. In Eq. 7.12, stability is related to a
pnorm of C. It is clear that the criteria are the same when the spectral radius is a
true pnorm.
Two facts about the relation between spectral radii and matrix norms are well
known:
1. The spectral radius of a matrix is its L
2
norm when the matrix is normal, i.e.,
it commutes with its transpose.
2. The spectral radius is the lower bound of all norms.
Furthermore, when C is normal, the second inequality in Eq. 7.11 becomes an equal
ity. In this case, Eq. 7.12 becomes both necessary and suﬃcient for stability. From
these relations we draw two important conclusions about the numerical stability of
methods used to solve PDE’s.
• The stability criteria in Eqs. 7.8 and 7.12 are identical for stationary systems
when the governing matrix is normal. This includes symmetric, asymmetric,
and circulant matrices. These criteria are both necessary and suﬃcient for
methods that generate such matrices and depend solely upon the eigenvalues of
the matrices.
• If the spectral radius of any governing matrix is greater than one, the method
is unstable by any criterion. Thus for general matrices, the spectral radius
condition is necessary
4
but not suﬃcient for stability.
4
Actually the necessary condition is that the spectral radius of C be less than or equal to 1 +
O(∆t), but this distinction is not critical for our purposes here.
128 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
7.5 Numerical Stability Concepts in the Complex
σPlane
7.5.1 σRoot Traces Relative to the Unit Circle
Whether or not the semidiscrete approach was taken to ﬁnd the diﬀerencing approx
imation of a set of PDE’s, the ﬁnal diﬀerence equations can be represented by
u
n+1
= Cu
n
−g
n
Furthermore if C has a complete
5
eigensystem, the solution to the homogeneous part
can always be expressed as
u
n
= c
1
σ
n
1
x
1
+· · · +c
m
σ
n
m
x
m
+ · · · +c
M
σ
n
M
x
M
where the σ
m
are the eigenvalues of C. If the semidiscrete approach is used, we can
ﬁnd a relation between the σ and the λ eigenvalues. This serves as a very convenient
guide as to where we might expect the σroots to lie relative to the unit circle in the
complex σplane. For this reason we will proceed to trace the locus of the σroots
as a function of the parameter λh for the equations modeling diﬀusion and periodic
convection
6
.
Locus of the exact trace
Figure 7.1 shows the exact trace of the σroot if it is generated by e
λh
representing
either diﬀusion or biconvection. In both cases the • represents the starting value
where h = 0 and σ = 1. For the diﬀusion model, λh is real and negative. As the
magnitude of λh increases, the trace representing the dissipation model heads towards
the origin as λh =→−∞. On the other hand, for the biconvection model, λh = iωh
is always imaginary. As the magnitude of ωh increases, the trace representing the
biconvection model travels around the circumference of the unit circle, which it never
leaves. We must be careful in interpreting σ when it is representing e
iωh
. The fact
that it lies on the unit circle means only that the amplitude of the representation is
correct, it tells us nothing of the phase error (see Eq. 6.41). The phase error relates
to the position on the unit circle.
5
The subject of defective eigensystems has been addressed. From now on we will omit further
discussion of this special case.
6
Or, if you like, the parameter h for ﬁxed values of λ equal to 1 and i for the diﬀusion and
biconvection cases, respectively.
7.5. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX σPLANE 129
Examples of some methods
Now let us compare the exact σroot traces with some that are produced by actual
timemarching methods. Table 7.1 shows the λσ relations for a variety of methods.
Figures 7.2 and 7.3 illustrate the results produced by various methods when they are
applied to the model ODE’s for diﬀusion and periodicconvection, Eqs. 4.4 and 4.5.
It is implied that the behavior shown is typical of what will happen if the methods
are applied to diﬀusion (or dissipation) dominated or periodic convectiondominated
problems as well as what does happen in the model cases. Most of the important
possibilities are covered by the illustrations.
1 σ −1 −λh = 0 Explicit Euler
2 σ
2
−2λhσ −1 = 0 Leapfrog
3 σ
2
−(1 +
3
2
λh)σ +
1
2
λh = 0 AB2
4 σ
3
−(1 +
23
12
λh)σ
2
+
16
12
λhσ −
5
12
λh = 0 AB3
5 σ
(
1 −λh) −1 = 0 Implicit Euler
6 σ(1 −
1
2
λh) −(1 +
1
2
λh) = 0 Trapezoidal
7 σ
2
(1 −
2
3
λh) −
4
3
σ +
1
3
= 0 2nd O Backward
8 σ
2
(1 −
5
12
λh) −(1 +
8
12
λh)σ +
1
12
λh = 0 AM3
9 σ
2
−(1 +
13
12
λh +
15
24
λ
2
h
2
)σ +
1
12
λh(1 +
5
2
λh) = 0 ABM3
10 σ
3
−(1 + 2λh)σ
2
+
3
2
λhσ −
1
2
λh = 0 Gazdag
11 σ −1 −λh −
1
2
λ
2
h
2
= 0 RK2
12 σ −1 −λh −
1
2
λ
2
h
2
−
1
6
λ
3
h
3
−
1
24
λ
4
h
4
= 0 RK4
13 σ
2
(1 −
1
3
λh) −
4
3
λhσ −(1 +
1
3
λh) = 0 Milne 4th
Table 7.1. Some λ −σ Relations
a. Explicit Euler Method
Figure 7.2 shows results for the explicit Euler method. When used for dissipation
dominated cases it is stable for the range 2≤ λh ≤0. (Usually the magnitude of λ has
to be estimated and often it is found by trial and error). When used for biconvection
the σ trace falls outside the unit circle for all ﬁnite h, and the method has no range
of stability in this case.
130 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
h= οο
 oo
,
σ =
e
σ =
e
i h
oo
,
a) Dissipation
b) Convection
λ
ω
h ω
h λ
h λ
Ι(σ)
Ι(σ)
(σ)
(σ) R
R
h=0 λ
h=0 ω
Figure 7.1: Exact traces of σroots for model equations.
b. Leapfrog Method
This is a tworoot method, since there are two σ’s produced by every λ. When
applied to dissipation dominated problems we see from Fig. 7.2 that the principal
root is stable for a range of λh, but the spurious root is not. In fact, the spurious
root starts on the unit circle and falls outside of it for all (λh) < 0. However, for
biconvection cases, when λ is pure imaginary, the method is not only stable, but it
also produces a σ that falls precisely on the unit circle in the range 0 ≤ ωh ≤ 1.
As was pointed out above, this does not mean, that the method is without error.
Although the ﬁgure shows that there is a range of ωh in which the leapfrog method
produces no error in amplitude, it says nothing about the error in phase. More is said
about this in Chapter 8.
c. SecondOrder AdamsBashforth Method
This is also a tworoot method but, unlike the leapfrog scheme, the spurious root
starts at the origin, rather than on the unit circle, see Fig. 7.2. Therefore, there is
a range of real negative λh for which the method will be stable. The ﬁgure shows
that the range ends when λh < −1.0 since at that point the spurious root leaves the
circle and σ
2
 becomes greater than one. The situation is quite diﬀerent when the
7.5. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX σPLANE 131
σ
1
σ
1
λ h=2
σ
1
σ
1
σ
2
σ
2
λ h=1
σ
1
σ
2
σ
2
σ
1
a) Euler Explicit
b) Leapfrog
c) AB2
h = 1 ω
Convection
Diffusion
Figure 7.2: Traces of σroots for various methods.
λroot is pure imaginary. In that case as ωh increases away from zero the spurious
root remains inside the circle and remains stable for a range of ωh. However, the
principal root falls outside the unit circle for all ωh > 0, and for the biconvection
model equation the method is unstable for all h.
d. Trapezoidal Method
The trapezoidal method is a very popular one for reasons that are partially illustrated
in Fig. 7.3. Its σroots fall on or inside the unit circle for both the dissipating and
the periodic convecting case and, in fact, it is stable for all values of λh for which
λ itself is inherently stable. Just like the leapfrog method it has the capability of
132 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
producing only phase error for the periodic convecting case, but there is a major
diﬀerence between the two since the trapezoidal method produces no amplitude error
for any ωh — not just a limited range between 0 ≤ ωh ≤ 1.
e. Gazdag Method
The Gazdag method was designed to produce low phase error. Since its characteristic
polynomial for σ is a cubic (Table 7.1, no. 10), it must have two spurious roots in
addition to the principal one. These are shown in Fig. 7.3. In both the dissipation
and biconvection cases, a spurious root limits the stability. For the dissipating case,
a spurious root leaves the unit circle when λh < −
1
2
, and for the biconvecting case,
when ωh >
2
3
. Note that both spurious roots are located at the origin when λ = 0.
f,g. Second and FourthOrder RungeKutta Methods, RK2 and RK4
Traces of the σroots for the second and fourthorder RungeKutta methods are
shown in Fig. 7.3. The ﬁgures show that both methods are stable for a range of
λh when λh is real and negative, but that the range of stability for RK4 is greater,
going almost all the way to 2.8, whereas RK2 is limited to 2. On the other hand for
biconvection RK2 is unstable for all ωh, whereas RK4 remains inside the unit circle
for 0 ≤ ωh ≤ 2
√
2. One can show that the RK4 stability limit is about λh < 2.8 for
all complex λh for which (λ) ≤ 0.
7.5.2 Stability for Small ∆t
It is interesting to pursue the question of stability when the time step size, h, is small
so that accuracy of all the λroots is of importance. Situations for which this is not
the case are considered in Chapter 8.
Mild instability
All conventional timemarching methods produce a principal root that is very close
to e
λh
for small values of λh. Therefore, on the basis of the principal root, the
stability of a method that is required to resolve a transient solution over a relatively
short time span may be a moot issue. Such cases are typiﬁed by the AB2 and RK2
methods when they are applied to a biconvection problem. Figs. 7.2c and 7.3f show
that for both methods the principal root falls outside the unit circle and is unstable
for all ωh. However, if the transient solution of interest can be resolved in a limited
number of time steps that are small in the sense of the ﬁgure, the error caused by this
instability may be relatively unimportant. If the root had fallen inside the circle the
7.5. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX σPLANE 133
σ
1
σ
1
σ
1
σ
1
σ
2
σ
3
σ
1
σ
1
σ
1
σ
1
σ
2
σ
3
d) Trapezoidal
e) Gazdag
f) RK2
g) RK4
λ h= οο h = 2/3 ω
λ
h=2
λ
h=2.8
h = 2 2 ω
Diffusion
Convection
Figure 7.3: Traces of σroots for various methods (cont’d).
134 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
method would have been declared stable but an error of the same magnitude would
have been committed, just in the opposite direction. For this reason the AB2 and the
RK2 methods have both been used in serious quantitative studies involving periodic
convection. This kind of instability is referred to as mild instability and is not a
serious problem under the circumstances discussed.
Catastrophic instability
There is a much more serious stability problem for small h that can be brought about
by the existence of certain types of spurious roots. One of the best illustrations of this
kind of problem stems from a critical study of the most accurate, explicit, twostep,
linear multistep method (see Table 7.1):
u
n+1
= −4u
n
+ 5u
n−1
+ 2h
2u
n
+u
n−1
(7.13)
One can show, using the methods given in Section 6.6, that this method is third
order accurate both in terms of er
λ
and er
µ
, so from an accuracy point of view it is
attractive. However, let us inspect its stability even for very small values of λh. This
can easily be accomplished by studying its characteristic polynomial when λh → 0.
From Eq. 7.13 it follows that for λh = 0, P(E) = E
2
+ 4E − 5. Factoring P(σ) = 0
we ﬁnd P(σ) = (σ − 1)(σ + 5) = 0. There are two σroots; σ
1
, the principal one,
equal to 1, and σ
2
, a spurious one, equal to 5!!
In order to evaluate the consequences of this result, one must understand how
methods with spurious roots work in practice. We know that they are not self start
ing, and the special procedures chosen to start them initialize the coeﬃcients of the
spurious roots, the c
mk
for k > 1 in Eq. 6.36. If the starting process is well designed
these coeﬃcients are forced to be very small, and if the method is stable, they get
smaller with increasing n. However, if the magnitude of one of the spurious σ is
equal to 5, one can see disaster is imminent because (5)
10
≈ 10
7
. Even a very small
initial value of c
mk
is quickly overwhelmed. Such methods are called catastrophically
unstable and are worthless for most, if not all, computations.
Milne and Adams type methods
If we inspect the σroot traces of the multiple root methods in Figs. 7.2 and 7.3, we
ﬁnd them to be of two types. One type is typiﬁed by the leapfrog method. In this
case a spurious root falls on the unit circle when h →0. The other type is exempliﬁed
by the 2ndorder AdamsBashforth and Gazdag methods. In this case all spurious
roots fall on the origin when h →0.
The former type is referred to as a Milne Method. Since at least one spurious root
for a Milne method always starts on the unit circle, the method is likely to become
7.6. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX λH PLANE135
unstable for some complex λ as h proceeds away from zero. On the basis of a Taylor
series expansion, however, these methods are generally the most accurate insofar as
they minimize the coeﬃcient in the leading term for er
t
.
The latter type is referred to as an Adams Method. Since for these methods
all spurious methods start at the origin for h = 0, they have a guaranteed range of
stability for small enough h. However, on the basis of the magnitude of the coeﬃcient
in the leading Taylor series error term, they suﬀer, relatively speaking, from accuracy.
For a given amount of computational work, the order of accuracy of the two
types is generally equivalent, and stability requirements in CFD applications generally
override the (usually small) increase in accuracy provided by a coeﬃcient with lower
magnitude.
7.6 Numerical Stability Concepts in the Complex
λh Plane
7.6.1 Stability for Large h.
The reason to study stability for small values of h is fairly easy to comprehend.
Presumably we are seeking to resolve some transient and, since the accuracy of the
transient solution for all of our methods depends on the smallness of the timestep,
we seek to make the size of this step as small as possible. On the other hand, the
cost of the computation generally depends on the number of steps taken to compute
a solution, and to minimize this we wish to make the step size as large as possible.
In the compromise, stability can play a part. Aside from ruling out catastrophically
unstable methods, however, the situation in which all of the transient terms are
resolved constitutes a rather minor role in stability considerations.
By far the most important aspect of numerical stability occurs under conditions
when:
• One has inherently stable, coupled systems with λ–eigenvalues having widely
separated magnitudes.
or
• We seek only to ﬁnd a steadystate solution using a path that includes the
unwanted transient.
In both of these cases there exist in the eigensystems relatively large values of
λh associated with eigenvectors that we wish to drive through the solution process
without any regard for their individual accuracy in eigenspace. This situation is
136 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
the major motivation for the study of numerical stability. It leads to the subject of
stiﬀness discussed in the next chapter.
7.6.2 Unconditional Stability, AStable Methods
Inherent stability of a set of ODE’s was deﬁned in Section 7.2 and, for coupled sets
with a complete eigensystem, it amounted to the requirement that the real parts of all
λ eigenvalues must lie on, or to the left of, the imaginary axis in the complex λ plane.
This serves as an excellent reference frame to discuss and deﬁne the general stability
features of timemarching methods. For example, we start with the deﬁnition:
A numerical method is unconditionally stable if it is stable for all ODE’s
that are inherently stable.
A method with this property is said to be Astable. A method is A
o
stable if the
region of stability contains the negative real axis in the complex λh plane, and Istable
if it contains the entire imaginary axis. By applying a fairly simple test for Astability
in terms of positive real functions to the class of twostep LMM’s given in Section
6.7.3, one ﬁnds these methods to be Astable if and only if
θ ≥ ϕ +
1
2
(7.14)
ξ ≥ −
1
2
(7.15)
ξ ≤ θ +ϕ −
1
2
(7.16)
A set of Astable implicit methods is shown in Table 7.2.
θ ξ ϕ Method Order
1 0 0 Implicit Euler 1
1/2 0 0 Trapezoidal 2
1 1/2 0 2nd O Backward 2
3/4 0 −1/4 Adams type 2
1/3 −1/2 −1/3 Lees 2
1/2 −1/2 −1/2 Twostep trapezoidal 2
5/8 −1/6 −2/9 Acontractive 2
Table 7.2. Some unconditionally stable (Astable) implicit methods.
Notice that none of these methods has an accuracy higher than secondorder. It can
be proved that the order of an Astable LMM cannot exceed two, and, furthermore
7.6. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX λH PLANE137
that of all 2ndorder Astable methods, the trapezoidal method has the smallest
truncation error.
Returning to the stability test using positive real functions one can show that a
twostep LMM is A
o
stable if and only if
θ ≥ ϕ +
1
2
(7.17)
ξ ≥ −
1
2
(7.18)
0 ≤ θ −ϕ (7.19)
For ﬁrstorder accuracy, the inequalities 7.17 to 7.19 are less stringent than 7.14 to
7.16. For secondorder accuracy, however, the parameters (θ, ξ, ϕ) are related by the
condition
ϕ = ξ −θ +
1
2
and the two sets of inequalities reduce to the same set which is
ξ ≤ 2θ −1 (7.20)
ξ ≥ −
1
2
(7.21)
Hence, twostep, secondorder accurate LMM’s that are Astable and A
o
stable share
the same (ϕ, ξ, θ) parameter space. Although the order of accuracy of an Astable
method cannot exceed two, A
o
stable LMM methods exist which have an accuracy of
arbitrarily high order.
It has been shown that for a method to be Istable it must also be Astable.
Therefore, no further discussion is necessary for the special case of Istability.
It is not diﬃcult to prove that methods having a characteristic polynomial for
which the coeﬃcient of the highest order term in E is unity
7
can never be uncondi
tionally stable. This includes all explicit methods and predictorcorrector methods
made up of explicit sequences. Such methods are referred to, therefore, as condition
ally stable methods.
7.6.3 Stability Contours in the Complex λh Plane.
A very convenient way to present the stability properties of a timemarching method
is to plot the locus of the complex λh for which σ = 1, such that the resulting
7
Or can be made equal to unity by a trivial normalization (division by a constant independent of
λh). The proof follows from the fact that the coeﬃcients of such a polynomial are sums of various
combinations of products of all its roots.
138 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
λh)
λh)
c) Euler Implicit
1
Stable
Unstable
b) Trapezoid Implicit
λh)
λh)
Stable Unstable
λh)
λh)
1
a) Euler Explicit
Stable
Unstable
θ = 0 θ = 1/2 θ = 1
I
R
I
R R
I(
(
(
(
( (
Figure 7.4: Stability contours for the θmethod.
contour goes through the point λh = 0. Here σ refers to the maximum absolute
value of any σ, principal or spurious, that is a root to the characteristic polynomial for
a given λh. It follows from Section 7.3 that on one side of this contour the numerical
method is stable while on the other, it is unstable. We refer to it, therefore, as a
stability contour.
Typical stability contours for both explicit and implicit methods are illustrated in
Fig. 7.4, which is derived from the oneroot θmethod given in Section 6.5.4.
Contours for explicit methods
Fig. 7.4a shows the stability contour for the explicit Euler method. In the following
two ways it is typical of all stability contours for explicit methods:
1. The contour encloses a ﬁnite portion of the lefthalf complex λhplane.
2. The region of stability is inside the boundary, and therefore, it is conditional.
However, this method includes no part of the imaginary axis (except for the origin)
and so it is unstable for the model biconvection problem. Although several explicit
methods share this deﬁciency (e.g., AB2, RK2), several others do not (e.g., leapfrog,
Gazdag, RK3, RK4), see Figs. 7.5 and 7.6. Notice in particular that the third and
fourthorder RungeKutta methods, Fig. 7.6, include a portion of the imaginary axis
out to ±1.9i and ±2
√
2i, respectively.
Contours for unconditionally stable implicit methods
Fig. 7.4c shows the stability contour for the implicit Euler method. It is typical of
many stability contours for unconditionally stable implicit methods. Notice that the
7.6. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX λH PLANE139
Figure 7.5: Stability contours for some explicit methods.
Stable
Regions
RK1
RK2
RK3
RK4
1.0
2.0
3.0
1.0 2.0 3.0
R( h)
I( h)
λ
λ
Figure 7.6: Stability contours for RungeKutta methods.
140 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
1.0
2.0
Unstable
a) 2nd Order Backward Implicit
Stable Outside
Unstable
b) Adams Type
Stable Outside
I( h)
R( h)
I( h)
R( h)
λ
λ
λ
λ
1.0
1.0
1.0
2.0 2.0
2.0
3.0
4.0
Figure 7.7: Stability contours for the 2 unconditionally stable implicit methods.
method is stable for the entire range of complex λh that fall outside the boundary.
This means that the method is numerically stable even when the ODE’s that it is
being used to integrate are inherently unstable. Some other implicit unconditionally
stable methods with the same property are shown in Fig. 7.7. In all of these cases
the imaginary axis is part of the stable region.
Not all unconditionally stable methods are stable in some regions where the ODE’s
they are integrating are inherently unstable. The classic example of a method that
is stable only when the generating ODE’s are themselves inherently stable is the
trapezoidal method, i.e., the special case of the θmethod for which θ =
1
2
. The
stability boundary for this case is shown in Fig. 7.4b. The boundary is the imaginary
axis and the numerical method is stable for λh lying on or to the left of this axis.
Two other methods that have this property are the twostep trapezoidal method
u
n+1
= u
n−1
+h
u
n+1
+u
n−1
and a method due to Lees
u
n+1
= u
n−1
+
2
3
h
u
n+1
+u
n
+u
n−1
Notice that both of these methods are of the Milne type.
Contours for conditionally stable implicit methods
Just because a method is implicit does not mean that it is unconditionally stable.
Two illustrations of this are shown in Fig. 7.8. One of these is the AdamsMoulton
3rdorder method (no. 8, Table 7.1). Another is the 4thorder Milne method given
by the point operator
u
n+1
= u
n−1
+
1
3
h
u
n+1
+ 4u
n
+u
n−1
and shown in Table 7.1 as no. 13. It is stable only for λ = ±iω when 0 ≤ ω ≤
√
3.
Its stability boundary is very similar to that for the leapfrog method (see Fig. 7.5b).
7.7. FOURIER STABILITY ANALYSIS 141
Stable
I( h)
1.0 2.0 3.0
1.0
2.0
3.0
λ
a) AM3
I( h)
1.0
2.0
3.0
λ
R( h) λ
R( h) λ
Stable
only on
Imag Axis
b) Milne 4th Order
Figure 7.8: Stability contours for the 2 conditionally stable implicit methods.
7.7 Fourier Stability Analysis
By far the most popular form of stability analysis for numerical schemes is the Fourier
or von Neumann approach. This analysis is usually carried out on point operators and
it does not depend on an intermediate stage of ODE’s. Strictly speaking it applies
only to diﬀerence approximations of PDE’s that produce O∆E’s which are linear,
have no space or time varying coeﬃcients, and have periodic boundary conditions.
8
In practical application it is often used as a guide for estimating the worthiness of
a method for more general problems. It serves as a fairly reliable necessary stability
condition, but it is by no means a suﬃcient one.
7.7.1 The Basic Procedure
One takes data from a “typical” point in the ﬂow ﬁeld and uses this as constant
throughout time and space according to the assumptions given above. Then one
imposes a spatial harmonic as an initial value on the mesh and asks the question:
Will its amplitude grow or decay in time? The answer is determined by ﬁnding the
conditions under which
u(x, t) = e
αt
· e
iκx
(7.22)
is a solution to the diﬀerence equation, where κ is real and κ∆x lies in the range
0 ≤ κ∆x ≤ π. Since, for the general term,
u
(n+)
j+m
= e
α(t+∆t)
· e
iκ(x+m∆x)
= e
α∆t
· e
iκm∆x
· u
(n)
j
8
Another way of viewing this is to consider it as an initial value problem on an inﬁnite space
domain.
142 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
the quantity u
(n)
j
is common to every term and can be factored out. In the remaining
expressions, we ﬁnd the term e
α∆t
which we represent by σ, thus:
σ ≡ e
α∆t
Then, since e
αt
=
e
α∆t
n
= σ
n
, it is clear that
For numerical stability σ ≤ 1
(7.23)
and the problem is to solve for the σ’s produced by any given method and, as a
necessary condition for stability, make sure that, in the worst possible combination
of parameters, condition 7.23 is satisﬁed
9
.
7.7.2 Some Examples
The procedure can best be explained by examples. Consider as a ﬁrst example the
ﬁnite diﬀerence approximation to the model diﬀusion equation known as Richardson’s
method of overlapping steps. This was mentioned in Section 4.1 and given as Eq.
4.1:
u
(n+1)
j
= u
(n−1)
j
+ν
2∆t
∆x
2
u
(n)
j+1
−2u
(n)
j
+u
(n)
j−1
(7.24)
Substitution of Eq. 7.22 into Eq. 7.24 gives the relation
σ = σ
−1
+ν
2∆t
∆x
2
e
iκ∆x
−2 +e
−iκ∆x
or
σ
2
+
¸
4ν∆t
∆x
2
(1 −cos κ∆x)
. .. .
2b
σ −1 = 0 (7.25)
Thus Eq. 7.22 is a solution of Eq. 7.24 if σ is a root of Eq. 7.25. The two roots of
7.25 are
σ
1,2
= −b ±
√
b
2
+ 1
from which it is clear that one σ is always > 1. We ﬁnd, therefore, that by the
Fourier stability test, Richardson’s method of overlapping steps is unstable for all ν,
κ and ∆t.
9
If boundedness is required in a ﬁnite time domain, the condition is often presented as σ ≤
1 + O(∆t).
7.8. CONSISTENCY 143
As another example consider the ﬁnitediﬀerence approximation for the model
biconvection equation
u
(n+1)
j
= u
(n)
j
−
a∆t
2∆x
u
(n)
j+1
−u
(n)
j−1
(7.26)
In this case
σ = 1 −
a∆t
∆x
· i · sin κ∆x
from which it is clear that σ > 1 for all nonzero a and κ. Thus we have another
ﬁnitediﬀerence approximation that, by the Fourier stability test, is unstable for any
choice of the free parameters.
7.7.3 Relation to Circulant Matrices
The underlying assumption in a Fourier stability analysis is that the C matrix, deter
mined when the diﬀerencing scheme is put in the form of Eq. 7.2, is circulant. Such
being the case, the e
iκx
in Eq. 7.22 represents an eigenvector of the system, and the
two examples just presented outline a simple procedure for ﬁnding the eigenvalues of
the circulant matrices formed by application of the two methods to the model prob
lems. The choice of σ for the stability parameter in the Fourier analysis, therefore,
is not an accident. It is exactly the same σ we have been using in all of our previous
discussions, but arrived at from a diﬀerent perspective.
If we examine the preceding examples from the viewpoint of circulant matrices
and the semidiscrete approach, the results present rather obvious conclusions. The
space diﬀerencing in Richardson’s method produces the matrix s · B
p
(1, −2, 1) where
s is a positive scalar coeﬃcient. From Appendix B we ﬁnd that the eigenvalues of this
matrix are real negative numbers. Clearly, the timemarching is being carried out by
the leapfrog method and, from Fig. 7.5, this method is unstable for all eigenvalues with
negative real parts. On the other hand, the space matrix in Eq. 7.26 is B
p
(−1, 0, 1),
and according to Appendix B, this matrix has pure imaginary eigenvalues. However,
in this case the explicit Euler method is being used for the timemarch and, according
to Fig. 7.4, this method is always unstable for such conditions.
7.8 Consistency
Consider the model equation for diﬀusion analysis
∂u
∂t
= ν
∂
2
u
∂x
2
(7.27)
144 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
Many years before computers became available (1910, in fact), Lewis F. Richardson
proposed a method for integrating equations of this type. We presented his method
in Eq. 4.2 and analyzed its stability by the Fourier method in Section 7.7.
In Richardson’s time, the concept of numerical instability was not known. How
ever, the concept is quite clear today and we now know immediately that his approach
would be unstable. As a semidiscrete method it can be expressed in matrix notation
as the system of ODE’s:
d
u
dt
=
ν
∆x
2
B(1, −2, 1)
u +
(bc) (7.28)
with the leapfrog method used for the time march. Our analysis in this Chapter
revealed that this is numerically unstable since the λroots of B(1,2,1) are all real
and negative and the spurious σroot in the leapfrog method is unstable for all such
cases, see Fig. 7.5b.
The method was used by Richardson for weather prediction, and this fact can now
be a source of some levity. In all probability, however, the hand calculations (the
only approach available at the time) were not carried far enough to exhibit strange
phenomena. We could, of course, use the 2ndorder RungeKutta method to integrate
Eq. 7.28 since it is stable for real negative λ’s. It is, however, conditionally stable
and for this case we are rather severely limited in time step size by the requirement
∆t ≤ ∆x
2
/(2ν).
There are many ways to manipulate the numerical stability of algorithms. One of
them is to introduce mixed time and space diﬀerencing, a possibility we have not yet
considered. For example, we introduced the DuFortFrankel method in Chapter 4:
u
(n+1)
j
= u
(n−1)
j
+
2ν∆t
∆x
2
u
(n)
j−1
−2
¸
u
(n+1)
j
+u
(n−1)
j
2
¸
+u
(n)
j+1
¸
¸
(7.29)
in which the central term in the space derivative in Eq. 4.2 has been replaced by its
average value at two diﬀerent time levels. Now let
α ≡
2ν∆t
∆x
2
and rearrange terms
(1 + α)u
(n+1)
j
= (1 −α)u
(n−1)
j
+α
u
(n)
j−1
+u
(n)
j+1
There is no obvious ODE between the basic PDE and this ﬁnal O∆E. Hence, there
is no intermediate λroot structure to inspect. Instead one proceeds immediately to
the σroots.
7.8. CONSISTENCY 145
The simplest way to carry this out is by means of the Fourier stability analysis
introduced in Section 7.7. This leads at once to
(1 +α)σ = (1 −α)σ
−1
+α
e
iκ∆x
+e
−iκ∆x
or
(1 +α)σ
2
−2ασ cos(k∆x) −(1 −α) = 0
The solution of the quadratic is
σ =
αcos κ∆x ±
1 −α
2
sin
2
κ∆x
1 +α
There are 2M σroots all of which are ≤ 1 for any real α in the range 0 ≤ α ≤ ∞.
This means that the method is unconditionally stable!
The above result seems too good to be true, since we have found an unconditionally
stable method using an explicit combination of Lagrangian interpolation polynomials.
The price we have paid for this is the loss of con
sistency with the original PDE.
To prove this, we expand the terms in Eq. 7.30 in a Taylor series and reconstruct
the partial diﬀerential equation we are actually solving as the mesh size becomes very
small. For the time derivative we have
1
2∆t
u
(n+1)
j
−u
(n−1)
j
= (∂
t
u)
(n)
j
+
1
6
∆t
2
(∂
ttt
u)
(n)
j
+· · ·
and for the mixed time and space diﬀerences
u
(n)
j−1
−u
(n+1)
j
−u
(n−1)
j
+u
(n)
j+1
∆x
2
= (∂
xx
u)
(n)
j
−
∆t
∆x
2
(∂
tt
u)
(n)
j
+
1
12
∆x
2
(∂
xxxx
u)
(n)
j
−
1
12
∆t
2
∆t
∆x
2
(∂
tttt
u)
(n)
j
+· · · (7.30)
Replace the terms in Eq. 7.30 with the above expansions and take the limit as
∆t , ∆x →0. We ﬁnd
∂u
∂t
= ν
∂
2
u
∂x
2
−νr
2
∂
2
u
∂t
2
(7.31)
where
r ≡
∆t
∆x
146 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
Eq. 7.27 is parabolic. Eq. 7.31 is hyperbolic. Thus if ∆t → 0 and ∆x →0 in such
a way that
∆t
∆x
remains constant, the equation we actually solve by the method in
Eq. 7.30 is a wave equation, not a diﬀusion equation. In such a case Eq. 7.30 is not
uniformly consistent with the equation we set out to solve even for vanishingly small
step sizes. The situation is summarized in Table 7.3. If ∆t →0 and ∆x →0 in such
a way that
∆t
∆x
2
remains constant, then r
2
is O(∆t), and the method is consistent.
However, this leads to a constraint on the time step which is just as severe as that
imposed by the stability condition of the secondorder RungeKutta method shown
in the table.
2ndorder RungeKutta DuFort–Frankel
For Stability
∆t ≤
∆x
2
2ν
∆t ≤ ∞
Conditionally Stable Unconditionally Stable
For Consistency
Uniformly Consistent Conditionally Consistent,
with Approximates
∂u
∂t
= ν
∂
2
u
∂x
2
∂u
∂t
= ν
∂
2
u
∂x
2
only if
ν
∆t
∆x
2
<
Therefore
∆t < ∆x
ν
Table 7.3: Summary of accuracy and consistency conditions for RK2 and
Du FortFrankel methods. = an arbitrary error bound.
7.9 Problems
1. Consider the ODE
u
=
du
dt
= Au +f
with
A =
−10 −0.1 −0.1
1 −1 1
10 1 −1
¸
¸
¸ , f =
−1
0
0
¸
¸
¸
7.9. PROBLEMS 147
(a) Find the eigenvalues of A using a numerical package. What is the steady
state solution? How does the ODE solution behave in time?
(b) Write a code to integrate from the initial condition u(0) = [1, 1, 1]
T
from
time t = 0 using the explicit Euler, implicit Euler, and MacCormack meth
ods. In all three cases, use h = 0.1 for 1000 time steps, h = 0.2 for 500
time steps, h = 0.4 for 250 time steps and h = 1.0 for 100 time steps.
Compare the computed solution with the exact steady solution.
(c) Using the λσ relations for these three methods, what are the expected
bounds on h for stability? Are your results consistent with these bounds?
2. (a) Compute a table of the numerical values of the σroots of the 2ndorder
AdamsBashforth method when λ = i. Take h in intervals of 0.05 from 0
to 0.80 and compute the absolute values of the roots to at least 6 places.
(b) Plot the trace of the roots in the complex σplane and draw the unit circle
on the same plot.
(c) Repeat the above for the RK2 method.
3. When applied to the linear convection equation, the widely known Lax–Wendroﬀ
method gives:
u
n+1
j
= u
n
j
−
1
2
C
n
(u
n
j+1
−u
n
j−1
) +
1
2
C
2
n
(u
n
j+1
−2u
n
j
+u
n
j−1
)
where C
n
, known as the Courant (or CFL) number, is ah/∆x. Using Fourier
stability analysis, ﬁnd the range of C
n
for which the method is stable.
4. Determine and plot the stability contours for the AdamsBashforth methods of
order 1 through 4. Compare with the RungeKutta methods of order 1 through
4.
5. Recall the compact centered 3point approximation for a ﬁrst derivative:
(δ
x
u)
j−1
+ 4(δ
x
u)
j
+ (δ
x
u)
j+1
=
3
∆x
(u
j+1
−u
j−1
)
By replacing the spatial index j by the temporal index n, obtain a timemarching
method using this formula. What order is the method? Is it explicit or implicit?
Is it a twostep LMM? If so, to what values of ξ, θ, and φ (in Eq. 6.59) does it
correspond? Derive the λσ relation for the method. Is it Astable, A
o
stable,
or Istable?
148 CHAPTER 7. STABILITY OF LINEAR SYSTEMS
6. Write the ODE system obtained by applying the 2ndorder centered diﬀerence
approximation to the spatial derivative in the model diﬀusion equation with
periodic boundary conditions. Using Appendix B.4, ﬁnd the eigenvalues of the
spatial operator matrix. Given that the λσ relation for the 2ndorder Adams
Bashforth method is
σ
2
−(1 + 3λh/2)σ +λh/2 = 0
show that the maximum stable value of λh for real negative λ, i.e., the point
where the stability contour intersects the negative real axis, is obtained with
λh = −1. Using the eigenvalues of the spatial operator matrix, ﬁnd the maxi
mum stable time step for the combination of 2ndorder centered diﬀerences and
the 2ndorder AdamsBashforth method applied to the model diﬀusion equa
tion. Repeat using Fourier analysis.
7. Consider the following PDE:
∂u
∂t
= i
∂
2
u
∂x
2
where i =
√
−1. Which explicit timemarching methods would be suitable for
integrating this PDE, if 2ndorder centered diﬀerencing is used for the spa
tial diﬀerences and the boundary conditions are periodic? Find the stability
condition for one of these methods.
8. Using Fourier analysis, analyze the stability of ﬁrstorder backward diﬀerenc
ing coupled with explicit Euler time marching applied to the linear convection
equation with positive a. Find the maximum Courant number for stability.
9. Consider the linear convection with a positive wave speed as in problem8. Apply
a Dirichlet boundary condition at the left boundary. No boundary condition is
permitted at the right boundary. Write the system of ODE’s which results from
ﬁrstorder backward spatial diﬀerencing in matrixvector form. Using Appendix
B.1, ﬁnd the λeigenvalues. Write the O∆E which results from the application
of explicit Euler time marching in matrixvector form, i.e.,
u
n+1
= Cu
n
−g
n
Write C in banded matrix notation and give the entries of g. Using the λ
σ relation for the explicit Euler method, ﬁnd the σeigenvalues. Based on
these, what is the maximum Courant number allowed for asymptotic stability?
Explain why this diﬀers from the answer to problem 8. Hint: is C normal?
Chapter 8
CHOICE OF TIMEMARCHING
METHODS
In this chapter we discuss considerations involved in selecting a timemarching method
for a speciﬁc application. Examples are given showing how timemarching methods
can be compared in a given context. An important concept underlying much of this
discussion is stiﬀness, which is deﬁned in the next section.
8.1 Stiﬀness Deﬁnition for ODE’s
8.1.1 Relation to λEigenvalues
The introduction of the concept referred to as “stiﬀness” comes about from the nu
merical analysis of mathematical models constructed to simulate dynamic phenom
ena containing widely diﬀerent time scales. Deﬁnitions given in the literature are
not unique, but fortunately we now have the background material to construct a
deﬁnition which is entirely suﬃcient for our purposes.
We start with the assumption that our CFD problem is modeled with suﬃcient
accuracy by a coupled set of ODE’s producing an A matrix typiﬁed by Eq. 7.1.
Any deﬁnition of stiﬀness requires a coupled system with at least two eigenvalues,
and the decision to use some numerical timemarching or iterative method to solve
it. The diﬀerence between the dynamic scales in physical space is represented by
the diﬀerence in the magnitude of the eigenvalues in eigenspace. In the following
discussion we concentrate on the transient part of the solution. The forcing function
may also be time varying in which case it would also have a time scale. However,
we assume that this scale would be adequately resolved by the chosen timemarching
method, and, since this part of the ODE has no eﬀect on the numerical stability of
149
150 CHAPTER 8. CHOICE OF TIMEMARCHING METHODS
Stable
Region
Accurate
Region
1
I( h)
R( h)
λ
λ
λh
Figure 8.1: Stable and accurate regions for the explicit Euler method.
the homogeneous part, we exclude the forcing function from further discussion in this
section.
Consider now the form of the exact solution of a system of ODE’s with a com
plete eigensystem. This is given by Eq. 6.27 and its solution using a oneroot, time
marching method is represented by Eq. 6.28. For a given time step, the time integra
tion is an approximation in eigenspace that is diﬀerent for every eigenvector x
m
. In
many numerical applications the eigenvectors associated with the small λ
m
 are well
resolved and those associated with the large λ
m
 are resolved much less accurately,
if at all. The situation is represented in the complex λh plane in Fig. 8.1. In this
ﬁgure the time step has been chosen so that time accuracy is given to the eigenvectors
associated with the eigenvalues lying in the small circle and stability without time
accuracy is given to those associated with the eigenvalues lying outside of the small
circle but still inside the large circle.
The whole concept of stiﬀness in CFD arises from the fact that we often do
not need the time resolution of eigenvectors associated with the large λ
m
 in
the transient solution, although these eigenvectors must remain coupled into
the system to maintain a high accuracy of the spatial resolution.
8.1. STIFFNESS DEFINITION FOR ODE’S 151
8.1.2 Driving and Parasitic Eigenvalues
For the above reason it is convenient to subdivide the transient solution into two
parts. First we order the eigenvalues by their magnitudes, thus
λ
1
 ≤ λ
2
 ≤ · · · ≤ λ
M
 (8.1)
Then we write
Transient
Solution
=
p
¸
m=1
c
m
e
λmt
x
m
. .. .
Driving
+
M
¸
m=p+1
c
m
e
λmt
x
m
. .. .
Parasitic
(8.2)
This concept is crucial to our discussion. Rephrased, it states that we can separate
our eigenvalue spectrum into two groups; one [λ
1
→λ
p
] called the driving eigenvalues
(our choice of a timestep and marching method must accurately approximate the time
variation of the eigenvectors associated with these), and the other, [λ
p+1
→λ
M
], called
the parasitic eigenvalues (no time accuracy whatsoever is required for the eigenvectors
associated with these, but their presence must not contaminate the accuracy of the
complete solution). Unfortunately, we ﬁnd that, although time accuracy requirements
are dictated by the driving eigenvalues, numerical stability requirements are dictated
by the parasitic ones.
8.1.3 Stiﬀness Classiﬁcations
The following deﬁnitions are somewhat useful. An inherently stable set of ODE’s is
stiﬀ if
λ
p
 λ
M

In particular we deﬁne the ratio
C
r
= λ
M
 / λ
p

and form the categories
Mildlystiﬀ C
r
< 10
2
Stronglystiﬀ 10
3
< C
r
< 10
5
Extremelystiﬀ 10
6
< C
r
< 10
8
Pathologicallystiﬀ 10
9
< C
r
It should be mentioned that the gaps in the stiﬀ category deﬁnitions are intentional
because the bounds are arbitrary. It is important to notice that these deﬁnitions
make no distinction between real, complex, and imaginary eigenvalues.
152 CHAPTER 8. CHOICE OF TIMEMARCHING METHODS
8.2 Relation of Stiﬀness to Space Mesh Size
Many ﬂow ﬁelds are characterized by a few regions having high spatial gradients of
the dependent variables and other domains having relatively low gradient phenomena.
As a result it is quite common to cluster mesh points in certain regions of space and
spread them out otherwise. Examples of where this clustering might occur are at a
shock wave, near an airfoil leading or trailing edge, and in a boundary layer.
One quickly ﬁnds that this grid clustering can strongly aﬀect the eigensystem of
the resulting A matrix. In order to demonstrate this, let us examine the eigensystems
of the model problems given in Section 4.3.2. The simplest example to discuss relates
to the model diﬀusion equation. In this case the eigenvalues are all real, negative
numbers that automatically obey the ordering given in Eq. 8.1. Consider the case
when all of the eigenvalues are parasitic, i.e., we are interested only in the converged
steadystate solution. Under these conditions, the stiﬀness is determined by the ratio
λ
M
/λ
1
. A simple calculation shows that
λ
1
= −
4ν
∆x
2
sin
2
π
2(M + 1)
≈ −
4ν
∆x
2
∆x
2
2
= −ν
λ
M
≈ −
4ν
∆x
2
sin
2
π
2
= −
4ν
∆x
2
and the ratio is
λ
M
/λ
1
≈
4
∆x
2
= 4
M + 1
π
2
The most important information found from this example is the fact that the
stiﬀness of the transient solution is directly related to the grid spacing. Furthermore,
in diﬀusion problems this stiﬀness is proportional to the reciprocal of the space mesh
size squared. For a mesh size M = 40, this ratio is about 680. Even for a mesh of
this moderate size the problem is already approaching the category of strongly stiﬀ.
For the biconvection model a similar analysis shows that
λ
M
 / λ
1
 ≈
1
∆x
Here the stiﬀness parameter is still spacemesh dependent, but much less so than for
diﬀusiondominated problems.
We see that in both cases we are faced with the rather annoying fact that the more
we try to increase the resolution of our spatial gradients, the stiﬀer our equations tend
to become. Typical CFD problems without chemistry vary between the mildly and
strongly stiﬀ categories, and are greatly aﬀected by the resolution of a boundary layer
since it is a diﬀusion process. Our brief analysis has been limited to equispaced
8.3. PRACTICAL CONSIDERATIONS FOR COMPARING METHODS 153
problems, but in general the stiﬀness of CFD problems is proportional to the mesh
intervals in the manner shown above where the critical interval is the smallest one in
the physical domain.
8.3 Practical Considerations for Comparing Meth
ods
We have presented relatively simple and reliable measures of stability and both the
local and global accuracy of timemarching methods. Since there are an endless
number of these methods to choose from, one can wonder how this information is
to be used to pick a “best” choice for a particular problem. There is no unique
answer to such a question. For example, it is, among other things, highly dependent
upon the speed, capacity, and architecture of the available computer, and technology
inﬂuencing this is undergoing rapid and dramatic changes as this is being written.
Nevertheless, if certain ground rules are agreed upon, relevant conclusions can be
reached. Let us now examine some ground rules that might be appropriate. It should
then be clear how the analysis can be extended to other cases.
Let us consider the problem of measuring the eﬃciency of a time–marching method
for computing, over a ﬁxed interval of time, an accurate transient solution of a coupled
set of ODE’s. The length of the time interval, T, and the accuracy required of the
solution are dictated by the physics of the particular problem involved. For example,
in calculating the amount of turbulence in a homogeneous ﬂow, the time interval
would be that required to extract a reliable statistical sample, and the accuracy
would be related to how much the energy of certain harmonics would be permitted
to distort from a given level. Such a computation we refer to as an event.
The appropriate error measures to be used in comparing methods for calculating
an event are the global ones, Er
a
, Er
λ
and Er
ω
, discussed in Section 6.6.5, rather
than the local ones er
λ
, er
a
, and er
p
discussed earlier.
The actual form of the coupled ODE’s that are produced by the semidiscrete
approach is
d
u
dt
=
F(
u, t)
At every time step we must evaluate the function
F(
u, t) at least once. This function
is usually nonlinear, and its computation usually consumes the major portion of the
computer time required to make the simulation. We refer to a single calculation of the
vector
F(
u, t) as a function evaluation and denote the total number of such evaluations
by F
ev
.
154 CHAPTER 8. CHOICE OF TIMEMARCHING METHODS
8.4 Comparing the Eﬃciency of Explicit Methods
8.4.1 Imposed Constraints
As mentioned above, the eﬃciency of methods can be compared only if one accepts a
set of limiting constraints within which the comparisons are carried out. The follow
assumptions bound the considerations made in this Section:
1. The timemarch method is explicit.
2. Implications of computer storage capacity and access time are ignored. In some
contexts, this can be an important consideration.
3. The calculation is to be timeaccurate, must simulate an entire event which
takes a total time T, and must use a constant time step size, h, so that
T = Nh
where N is the total number of time steps.
8.4.2 An Example Involving Diﬀusion
Let the event be the numerical solution of
du
dt
= −u (8.3)
from t = 0 to T = −ln(0.25) with u(0) = 1. Eq. 8.3 is obtained from our representa
tive ODE with λ = −1, a = 0. Since the exact solution is u(t) = u(0)e
−t
, this makes
the exact value of u at the end of the event equal to 0.25, i.e., u(T) = 0.25. To the
constraints imposed above, let us set the additional requirement
• The error in u at the end of the event, i.e., the global error, must be < 0.5%.
We judge the most eﬃcient method as the one that satisﬁes these conditions and
has the fewest number of evaluations, F
ev
. Three methods are compared — explicit
Euler, AB2, and RK4.
First of all, the allowable error constraint means that the global error in the am
plitude, see Eq. 6.48, must have the property:
Er
λ
e
λT
< 0.005
8.4. COMPARING THE EFFICIENCY OF EXPLICIT METHODS 155
Then, since h = T/N = −ln(0.25)/N, it follows that
1 −(σ
1
(ln(.25)/N))
N
/.25
< 0.005
where σ
1
is found from the characteristic polynomials given in Table 7.1. The results
shown in Table 8.1 were computed using a simple iterative procedure.
Method N h σ
1
F
ev
Er
λ
Euler 193 .00718 .99282 193 .001248 worst
AB2 16 .0866 .9172 16 .001137
RK4 2 .6931 .5012 8 .001195 best
Table 8.1: Comparison of timemarching methods for a simple dissipation problem.
In this example we see that, for a given global accuracy, the method with the
highest local accuracy is the most eﬃcient on the basis of the expense in evaluating
F
ev
. Thus the secondorder AdamsBashforth method is much better than the ﬁrst
order Euler method, and the fourthorder RungeKutta method is the best of all. The
main purpose of this exercise is to show the (usually) great superiority of secondorder
over ﬁrstorder timemarching methods.
8.4.3 An Example Involving Periodic Convection
Let us use as a basis for this example the study of homogeneous turbulence simulated
by the numerical solution of the incompressible NavierStokes equations inside a cube
with periodic boundary conditions on all sides. In this numerical experiment the
function evaluations contribute overwhelmingly to the CPU time, and the number of
these evaluations must be kept to an absolute minimum because of the magnitude of
the problem. On the other hand, a complete event must be established in order to
obtain meaningful statistical samples which are the essence of the solution. In this
case, in addition to the constraints given in Section 8.4.1, we add the following:
• The number of evaluations of
F(u, t) is ﬁxed.
Under these conditions a method is judged as best when it has the highest global
accuracy for resolving eigenvectors with imaginary eigenvalues. The above constraint
has led to the invention of schemes that omit the function evaluation in the cor
rector step of a predictorcorrector combination, leading to the socalled incomplete
156 CHAPTER 8. CHOICE OF TIMEMARCHING METHODS
predictorcorrector methods. The presumption is, of course, that more eﬃcient meth
ods will result from the omission of the second function evaluation. An example is
the method of Gazdag, given in Section 6.8. Basically this is composed of an AB2
predictor and a trapezoidal corrector. However, the derivative of the fundamental
family is never found so there is only one evaluation required to complete each cycle.
The λσ relation for the method is shown as entry 10 in Table 7.1.
In order to discuss our comparisions we introduce the following deﬁnitions:
• Let a kevaluation method be deﬁned as one that requires k evaluations of
F(u, t) to advance one step using that method’s time interval, h.
• Let K represent the total number of allowable F
ev
.
• Let h
1
be the time interval advanced in one step of a oneevaluation method.
The Gazdag, leapfrog, and AB2 schemes are all 1evaluation methods. The second
and fourth order RK methods are 2 and 4evaluation methods, respectively. For a 1
evaluation method the total number of time steps, N, and the number of evaluations,
K, are the same, one evaluation being used for each step, so that for these methods
h = h
1
. For a 2evaluation method N = K/2 since two evaluations are used for
each step. However, in this case, in order to arrive at the same time T after K
evaluations, the time step must be twice that of a one–evaluation method so h = 2h
1
.
For a 4evaluation method the time interval must be h = 4h
1
, etc. Notice that
as k increases, the time span required for one application of the method increases.
However, notice also that as k increases, the power to which σ
1
is raised to arrive
at the ﬁnal destination decreases; see the Figure below. This is the key to the true
comparison of timemarch methods for this type of problem.
0 T u
N
k = 1  • • • • • • •  [σ(λh
1
)]
8
k = 2  2h
1
• • •  [σ(2λh
1
)]
4
k = 4  4h
1
•  [σ(4λh
1
)]
2
Step sizes and powers of σ for kevaluation methods used to get to the same value
of T if 8 evaluations are allowed.
In general, after K evaluations, the global amplitude and phase error for k
evaluation methods applied to systems with pure imaginary λroots can be written
1
Er
a
= 1 −σ
1
(ikωh
1
)
K/k
(8.4)
1
See Eqs. 6.38 and 6.39.
8.4. COMPARING THE EFFICIENCY OF EXPLICIT METHODS 157
Er
ω
= ωT −
K
k
tan
−1
¸
[σ
1
(ikωh
1
)]
imaginary
[σ
1
(ikωh
1
)]
real
¸
(8.5)
Consider a convectiondominated event for which the function evaluation is very
time consuming. We idealize to the case where λ = iω and set ω equal to one. The
event must proceed to the time t = T = 10. We consider two maximum evaluation
limits K = 50 and K = 100 and choose from four possible methods, leapfrog, AB2,
Gazdag, and RK4. The ﬁrst three of these are oneevaluation methods and the last
one is a fourevaluation method. It is not diﬃcult to show that on the basis of local
error (made in a single step) the Gazdag method is superior to the RK4 method in
both amplitude and phase. For example, for ωh = 0.2 the Gazdag method produces
a σ
1
 = 0.9992276 whereas for ωh = 0.8 (which must be used to keep the number of
evaluations the same) the RK4 method produces a σ
1
 = 0.998324. However, we are
making our comparisons on the basis of global error for a ﬁxed number of evaluations.
First of all we see that for a oneevaluation method h
1
= T/K. Using this, and the
fact that ω = 1, we ﬁnd, by some rather simple calculations
2
made using Eqs. 8.4 and
8.5, the results shown in Table 8.2. Notice that to ﬁnd global error the Gazdag root
must be raised to the power of 50 while the RK4 root is raised only to the power of
50/4. On the basis of global error the Gazdag method is not superior to RK4 in either
amplitude or phase, although, in terms of phase error (for which it was designed) it
is superior to the other two methods shown.
K leapfrog AB2 Gazdag RK4
ωh
1
= .1 100 1.0 1.003 .995 .999
ωh
1
= .2 50 1.0 1.022 .962 .979
a. Amplitude, exact = 1.0.
K leapfrog AB2 Gazdag RK4
ωh
1
= .1 100 −.96 −2.4 .45 .12
ωh
1
= .2 50 −3.8 −9.8 1.5 1.5
b. Phase error in degrees.
Table 8.2: Comparison of global amplitude and phase errors for four methods.
2
The σ
1
root for the Gazdag method can be found using a numerical root ﬁnding routine to trace
the three roots in the σplane, see Fig. 7.3e.
158 CHAPTER 8. CHOICE OF TIMEMARCHING METHODS
Using analysis such as this (and also considering the stability boundaries) the
RK4 method is recommended as a basic ﬁrst choice for any explicit timeaccurate
calculation of a convectiondominated problem.
8.5 Coping With Stiﬀness
8.5.1 Explicit Methods
The ability of a numerical method to cope with stiﬀness can be illustrated quite nicely
in the complex λh plane. A good example of the concept is produced by studying
the Euler method applied to the representative equation. The transient solution is
u
n
= (1 + λh)
n
and the trace of the complex value of λh which makes 1 +λh = 1
gives the whole story. In this case the trace forms a circle of unit radius centered at
(−1, 0) as shown in Fig. 8.1. If h is chosen so that all λh in the ODE eigensystem
fall inside this circle the integration will be numerically stable. Also shown by the
small circle centered at the origin is the region of Taylor series accuracy. If some λh
fall outside the small circle but stay within the stable region, these λh are stiﬀ, but
stable. We have deﬁned these λh as parasitic eigenvalues. Stability boundaries for
some explicit methods are shown in Figs. 7.5 and 7.6.
For a speciﬁc example, consider the mildly stiﬀ system composed of a coupled
twoequation set having the two eigenvalues λ
1
= −100 and λ
2
= −1. If uncoupled
and evaluated in wave space, the time histories of the two solutions would appear as a
rapidly decaying function in one case, and a relatively slowly decaying function in the
other. Analytical evaluation of the time histories poses no problem since e
−100t
quickly
becomes very small and can be neglected in the expressions when time becomes large.
Numerical evaluation is altogether diﬀerent. Numerical solutions, of course, depend
upon [σ(λ
m
h)]
n
and no σ
m
 can exceed one for any λ
m
in the coupled system or else
the process is numerically unstable.
Let us choose the simple explicit Euler method for the time march. The coupled
equations in real space are represented by
u
1
(n) = c
1
(1 −100h)
n
x
11
+c
2
(1 −h)
n
x
12
+ (PS)
1
u
2
(n) = c
1
(1 −100h)
n
x
21
+c
2
(1 −h)
n
x
22
+ (PS)
2
(8.6)
We will assume that our accuracy requirements are such that suﬃcient accuracy is
obtained as long as λh ≤ 0.1. This deﬁnes a time step limit based on accuracy
considerations of h = 0.001 for λ
1
and h = 0.1 for λ
2
. The time step limit based
on stability, which is determined from λ
1
, is h = 0.02. We will also assume that
c
1
= c
2
= 1 and that an amplitude less than 0.001 is negligible. We ﬁrst run 66
time steps with h = 0.001 in order to resolve the λ
1
term. With this time step the
8.5. COPING WITH STIFFNESS 159
λ
2
term is resolved exceedingly well. After 66 steps, the amplitude of the λ
1
term
(i.e., (1 −100h)
n
) is less than 0.001 and that of the λ
2
term (i.e., (1 −h)
n
) is 0.9361.
Hence the λ
1
term can now be considered negligible. To drive the (1 − h)
n
term to
zero (i.e., below 0.001), we would like to change the step size to h = 0.1 and continue.
We would then have a well resolved answer to the problem throughout the entire
relevant time interval. However, this is not possible because of the coupled presence
of (1 −100h)
n
, which in just 10 steps at h = 0.1 ampliﬁes those terms by ≈ 10
9
, far
outweighing the initial decrease obtained with the smaller time step. In fact, with
h = 0.02, the maximum step size that can be taken in order to maintain stability,
about 339 time steps have to be computed in order to drive e
−t
to below 0.001. Thus
the total simulation requires 405 time steps.
8.5.2 Implicit Methods
Now let us reexamine the problem that produced Eq. 8.6 but this time using an
unconditionally stable implicit method for the time march. We choose the trapezoidal
method. Its behavior in the λh plane is shown in Fig. 7.4b. Since this is also a one
root method, we simply replace the Euler σ with the trapezoidal one and analyze the
result. It follows that the ﬁnal numerical solution to the ODE is now represented in
real space by
u
1
(n) = c
1
1 −50h
1 + 50h
n
x
11
+c
2
1 −0.5h
1 + 0.5h
n
x
12
+ (PS)
1
u
2
(n) = c
1
1 −50h
1 + 50h
n
x
21
+c
2
1 −0.5h
1 + 0.5h
n
x
22
+ (PS)
2
(8.7)
In order to resolve the initial transient of the term e
−100t
, we need to use a step size
of about h = 0.001. This is the same step size used in applying the explicit Euler
method because here accuracy is the only consideration and a very small step size
must be chosen to get the desired resolution. (It is true that for the same accuracy
we could in this case use a larger step size because this is a secondorder method,
but that is not the point of this exercise). After 70 time steps the λ
1
term has
amplitude less than 0.001 and can be neglected. Now with the implicit method we
can proceed to calculate the remaining part of the event using our desired step size
h = 0.1 without any problem of instability, with 69 steps required to reduce the
amplitude of the second term to below 0.001. In both intervals the desired solution
is secondorder accurate and well resolved. It is true that in the ﬁnal 69 steps one
σroot is [1 − 50(0.1)]/[1 + 50(0.1)] = 0.666 · · ·, and this has no physical meaning
whatsoever. However, its inﬂuence on the coupled solution is negligible at the end
of the ﬁrst 70 steps, and, since (0.666 · · ·)
n
< 1, its inﬂuence in the remaining 70
160 CHAPTER 8. CHOICE OF TIMEMARCHING METHODS
steps is even less. Actually, although this root is one of the principal roots in the
system, its behavior for t > 0.07 is identical to that of a stable spurious root. The
total simulation requires 139 time steps.
8.5.3 A Perspective
It is important to retain a proper perspective on a problem represented by the above
example. It is clear that an unconditionally stable method can always be called upon
to solve stiﬀ problems with a minimum number of time steps. In the example, the
conditionally stable Euler method required 405 time steps, as compared to about
139 for the trapezoidal method, about three times as many. However, the Euler
method is extremely easy to program and requires very little arithmetic per step. For
preliminary investigations it is often the best method to use for mildlystiﬀ diﬀusion
dominated problems. For reﬁned investigations of such problems an explicit method
of second order or higher, such as AdamsBashforth or RungeKutta methods, is
recommended. These explicit methods can be considered as eﬀective mildly stiﬀ
stable methods. However, it should be clear that as the degree of stiﬀness of the
problem increases, the advantage begins to tilt towards implicit methods, as the
reduced number of time steps begins to outweigh the increased cost per time step.
The reader can repeat the above example with λ
1
= −10, 000, λ
2
= −1, which is in
the stronglystiﬀ category.
There is yet another technique for coping with certain stiﬀ systems in ﬂuid dynamic
applications. This is known as the multigrid method. It has enjoyed remarkable
success in many practical problems; however, we need an introduction to the theory
of relaxation before it can be presented.
8.6 Steady Problems
In Chapter 6 we wrote the O∆E solution in terms of the principal and spurious roots
as follows:
u
n
= c
11
(σ
1
)
n
1
x
1
+· · · +c
m1
(σ
m
)
n
1
x
m
+· · · +c
M1
(σ
M
)
n
1
x
M
+P.S.
+c
12
(σ
1
)
n
2
x
1
+· · · +c
m2
(σ
m
)
n
2
x
m
+· · · +c
M2
(σ
M
)
n
2
x
M
+c
13
(σ
1
)
n
3
x
1
+· · · +c
m3
(σ
m
)
n
3
x
m
+· · · +c
M3
(σ
M
)
n
3
x
M
+etc., if there are more spurious roots (8.8)
When solving a steady problem, we have no interest whatsoever in the transient por
tion of the solution. Our sole goal is to eliminate it as quickly as possible. Therefore,
8.7. PROBLEMS 161
the choice of a timemarching method for a steady problem is similar to that for a
stiﬀ problem, the diﬀerence being that the order of accuracy is irrelevant. Hence the
explicit Euler method is a candidate for steady diﬀusion dominated problems, and
the fourthorder RungeKutta method is a candidate for steady convection dominated
problems, because of their stability properties. Among implicit methods, the implicit
Euler method is the obvious choice for steady problems.
When we seek only the steady solution, all of the eigenvalues can be considered to
be parasitic. Referring to Fig. 8.1, none of the eigenvalues are required to fall in the
accurate region of the timemarching method. Therefore the time step can be chosen
to eliminate the transient as quickly as possible with no regard for time accuracy.
For example, when using the implicit Euler method with local time linearization, Eq.
6.96, one would like to take the limit h →∞, which leads to Newton’s method, Eq.
6.98. However, a ﬁnite time step may be required until the solution is somewhat close
to the steady solution.
8.7 Problems
1. Repeat the timemarch comparisons for diﬀusion (Section 8.4.2) and periodic
convection (Section 8.4.3) using 2nd and 3rdorder RungeKutta methods.
2. Repeat the timemarch comparisons for diﬀusion (Section 8.4.2) and periodic
convection (Section 8.4.3) using the 3rd and 4thorder AdamsBashforth meth
ods. Considering the stability bounds for these methods (see problem 4 in
Chapter 7) as well as their memory requirements, compare and contrast them
with the 3rd and 4thorder RungeKutta methods.
3. Consider the diﬀusion equation (with ν = 1) discretized using 2ndorder central
diﬀerences on a grid with 10 (interior) points. Find and plot the eigenvalues
and the corresponding modiﬁed wavenumbers. If we use the explicit Euler time
marching method what is the maximum allowable time step if all but the ﬁrst
two eigenvectors are considered parasitic? Assume that suﬃcient accuracy is
obtained as long as λh ≤ 0.1. What is the maximum allowable time step if all
but the ﬁrst eigenvector are considered parasitic?
162 CHAPTER 8. CHOICE OF TIMEMARCHING METHODS
Chapter 9
RELAXATION METHODS
In the past three chapters, we developed a methodology for designing, analyzing,
and choosing timemarching methods. These methods can be used to compute the
timeaccurate solution to linear and nonlinear systems of ODE’s in the general form
du
dt
=
F(u, t) (9.1)
which arise after spatial discretization of a PDE. Alternatively, they can be used to
solve for the steady solution of Eq. 9.1, which satisﬁes the following coupled system
of nonlinear algebraic equations:
F(u) = 0 (9.2)
In the latter case, the unsteady equations are integrated until the solution converges
to a steady solution. The same approach permits a timemarching method to be used
to solve a linear system of algebraic equations in the form
Ax =
b (9.3)
To solve this system using a timemarching method, a time derivative is introduced
as follows
dx
dt
= Ax −
b (9.4)
and the system is integrated in time until the transient has decayed to a suﬃciently
low level. Following a timedependent path to steady state is possible only if all of
the eigenvalues of the matrix A (or −A) have real parts lying in the left halfplane.
Although the solution x = A
−1
b exists as long as A is nonsingular, the ODE given by
Eq. 9.4 has a stable steady solution only if A meets the above condition.
163
164 CHAPTER 9. RELAXATION METHODS
The common feature of all timemarching methods is that they are at least ﬁrst
order accurate. In this chapter, we consider iterative methods which are not time
accurate at all. Such methods are known as relaxation methods. While they are
applicable to coupled systems of nonlinear algebraic equations in the form of Eq. 9.2,
our analysis will focus on their application to large sparse linear systems of equations
in the form
A
b
u −
f
b
= 0 (9.5)
where A
b
is nonsingular, and the use of the subscript b will become clear shortly. Such
systems of equations arise, for example, at each time step of an implicit timemarching
method or at each iteration of Newton’s method. Using an iterative method, we seek
to obtain rapidly a solution which is arbitrarily close to the exact solution of Eq. 9.5,
which is given by
u
∞
= A
−1
b
f
b
(9.6)
9.1 Formulation of the Model Problem
9.1.1 Preconditioning the Basic Matrix
It is standard practice in applying relaxation procedures to precondition the basic
equation. This preconditioning has the eﬀect of multiplying Eq. 9.5 from the left
by some nonsingular matrix. In the simplest possible case the conditioning matrix
is a diagonal matrix composed of a constant D(b). If we designate the conditioning
matrix by C, the problem becomes one of solving for
u in
CA
b
u −C
f
b
= 0 (9.7)
Notice that the solution of Eq. 9.7 is
u = [CA
b
]
−1
C
f
b
= A
−1
b
C
−1
C
f
b
= A
−1
b
f
b
(9.8)
which is identical to the solution of Eq. 9.5, provided C
−1
exists.
In the following we will see that our approach to the iterative solution of Eq. 9.7
depends crucially on the eigenvalue and eigenvector structure of the matrix CA
b
, and,
equally important, does not depend at all on the eigensystem of the basic matrix A
b
.
For example, there are wellknown techniques for accelerating relaxation schemes if
the eigenvalues of CA
b
are all real and of the same sign. To use these schemes, the
9.1. FORMULATION OF THE MODEL PROBLEM 165
conditioning matrix C must be chosen such that this requirement is satisﬁed. A
choice of C which ensures this condition is the negative transpose of A
b
.
For example, consider a spatial discretization of the linear convection equation
using centered diﬀerences with a Dirichlet condition on the left side and no constraint
on the right side. Using a ﬁrstorder backward diﬀerence on the right side (as in
Section 3.6), this leads to the approximation
δ
x
u =
1
2∆x
0 1
−1 0 1
−1 0 1
−1 0 1
−2 2
¸
¸
¸
¸
¸
¸
¸
¸
u +
−u
a
0
0
0
0
¸
¸
¸
¸
¸
¸
¸
¸
(9.9)
The matrix in Eq. 9.9 has eigenvalues whose imaginary parts are much larger than
their real parts. It can ﬁrst be conditioned so that the modulus of each element is 1.
This is accomplished using a diagonal preconditioning matrix
D = 2∆x
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0
1
2
¸
¸
¸
¸
¸
¸
¸
¸
(9.10)
which scales each row. We then further condition with multiplication by the negative
transpose. The result is
A
2
= −A
T
1
A
1
=
0 1
−1 0 1
−1 0 1
−1 0 1
−1 −1
¸
¸
¸
¸
¸
¸
¸
¸
0 1
−1 0 1
−1 0 1
−1 0 1
−1 1
¸
¸
¸
¸
¸
¸
¸
¸
=
−1 0 1
0 −2 0 1
1 0 −2 0 1
1 0 −2 1
1 1 −2
¸
¸
¸
¸
¸
¸
¸
¸
(9.11)
166 CHAPTER 9. RELAXATION METHODS
If we deﬁne a permutation matrix P
1
and carry out the process P
T
[−A
T
1
A
1
]P (which
just reorders the elements of A
1
and doesn’t change the eigenvalues) we ﬁnd
0 1 0 0 0
0 0 0 1 0
0 0 0 0 1
0 0 1 0 0
1 0 0 0 0
¸
¸
¸
¸
¸
¸
¸
¸
−1 0 1
0 −2 0 1
1 0 −2 0 1
1 0 −2 1
1 1 −2
¸
¸
¸
¸
¸
¸
¸
¸
0 0 0 0 1
1 0 0 0 0
0 0 0 1 0
0 1 0 0 0
0 0 1 0 0
¸
¸
¸
¸
¸
¸
¸
¸
=
−2 1
1 −2 1
1 −2 1
1 −2 1
1 −1
¸
¸
¸
¸
¸
¸
¸
¸
(9.12)
which has all negative real eigenvalues, as given in Appendix B. Thus even when
the basic matrix A
b
has nearly imaginary eigenvalues, the conditioned matrix −A
T
b
A
b
is nevertheless symmetric negative deﬁnite (i.e., symmetric with negative real eigen
values), and the classical relaxation methods can be applied. We do not necessarily
recommend the use of −A
T
b
as a preconditioner; we simply wish to show that a broad
range of matrices can be preconditioned into a form suitable for our analysis.
9.1.2 The Model Equations
Preconditioning processes such as those described in the last section allow us to
prepare our algebraic equations in advance so that certain eigenstructures are guar
anteed. In the remainder of this chapter, we will thoroughly investigate some simple
equations which model these structures. We will consider the preconditioned system
of equations having the form
A
φ −
f = 0 (9.13)
where A is symmetric negative deﬁnite.
2
The symbol for the dependent variable has
been changed to φ as a reminder that the physics being modeled is no longer time
1
A permutation matrix (deﬁned as a matrix with exactly one 1 in each row and column and has
the property that P
T
= P
−1
) just rearranges the rows and columns of a matrix.
2
We use a symmetric negative deﬁnite matrix to simplify certain aspects of our analysis. Relax
ation methods are applicable to more general matrices. The classical methods will usually converge
if A
b
is diagonally dominant, as deﬁned in Appendix A.
9.1. FORMULATION OF THE MODEL PROBLEM 167
accurate when we later deal with ODE formulations. Note that the solution of Eq.
9.13,
φ = A
−1
f, is guaranteed to exist because A is nonsingular. In the notation of
Eqs. 9.5 and 9.7,
A = CA
b
and
f = C
f
b
(9.14)
The above was written to treat the general case. It is instructive in formulating the
concepts to consider the special case given by the diﬀusion equation in one dimension
with unit diﬀusion coeﬃcient ν:
∂u
∂t
=
∂
2
u
∂x
2
−g(x) (9.15)
This has the steadystate solution
∂
2
u
∂x
2
= g(x) (9.16)
which is the onedimensional form of the Poisson equation. Introducing the three
point central diﬀerencing scheme for the second derivative with Dirichlet boundary
conditions, we ﬁnd
d
u
dt
=
1
∆x
2
B(1, −2, 1)
u + (
bc) −
g (9.17)
where (
bc) contains the boundary conditions and
g contains the values of the source
term at the grid nodes. In this case
A
b
=
1
∆x
2
B(1, −2, 1)
f
b
=
g −(
bc) (9.18)
Choosing C = ∆x
2
I, we obtain
B(1, −2, 1)
φ =
f (9.19)
where
f = ∆x
2
f
b
. If we consider a Dirichlet boundary condition on the left side and
either a Dirichlet or a Neumann condition on the right side, then A has the form
A = B
1,
b, 1
168 CHAPTER 9. RELAXATION METHODS
b = [−2, −2, · · · ·, −2, s]
T
s = −2 or −1 (9.20)
Note that s = −1 is easily obtained from the matrix resulting from the Neumann
boundary condition given in Eq. 3.24 using a diagonal conditioning matrix. A tremen
dous amount of insight to the basic features of relaxation is gained by an appropriate
study of the onedimensional case, and much of the remaining material is devoted to
this case. We attempt to do this in such a way, however, that it is directly applicable
to two and threedimensional problems.
9.2 Classical Relaxation
9.2.1 The Delta Form of an Iterative Scheme
We will consider relaxation methods which can be expressed in the following delta
form:
H
φ
n+1
−
φ
n
= A
φ
n
−
f (9.21)
where H is some nonsingular matrix which depends upon the iterative method. The
matrix H is independent of n for stationary methods and is a function of n for
nonstationary ones. The iteration count is designated by the subscript n or the
superscript (n). The converged solution is designated
φ
∞
so that
φ
∞
= A
−1
f (9.22)
9.2.2 The Converged Solution, the Residual, and the Error
Solving Eq. 9.21 for
φ
n+1
gives
φ
n+1
= [I +H
−1
A]
φ
n
−H
−1
f = G
φ
n
−H
−1
f (9.23)
where
G ≡ I +H
−1
A (9.24)
Hence it is clear that H should lead to a system of equations which is easy to solve,
or at least easier to solve than the original system. The error at the nth iteration is
deﬁned as
e
n
≡
φ
n
−
φ
∞
=
φ
n
−A
−1
f (9.25)
9.2. CLASSICAL RELAXATION 169
where
φ
∞
was deﬁned in Eq. 9.22. The residual at the nth iteration is deﬁned as
r
n
≡ A
φ
n
−
f (9.26)
Multiply Eq. 9.25 by A from the left, and use the deﬁnition in Eq. 9.26. There results
the relation between the error and the residual
A
e
n
−
r
n
= 0 (9.27)
Finally, it is not diﬃcult to show that
e
n+1
= G
e
n
(9.28)
Consequently, G is referred to as the basic iteration matrix, and its eigenvalues, which
we designate as σ
m
, determine the convergence rate of a method.
In all of the above, we have considered only what are usually referred to as sta
tionary processes in which H is constant throughout the iterations. Nonstationary
processes in which H (and possibly C) is varied at each iteration are discussed in
Section 9.5.
9.2.3 The Classical Methods
Point Operator Schemes in One Dimension
Let us consider three classical relaxation procedures for our model equation
B(1, −2, 1)
φ =
f (9.29)
as given in Section 9.1.2. The PointJacobi method is expressed in point operator
form for the onedimensional case as
φ
(n+1)
j
=
1
2
φ
(n)
j−1
+φ
(n)
j+1
−f
j
(9.30)
This operator comes about by choosing the value of φ
(n+1)
j
such that together with
the old values of φ
j−1
and φ
j+1
, the jth row of Eq. 9.29 is satisﬁed. The GaussSeidel
method is
φ
(n+1)
j
=
1
2
φ
(n+1)
j−1
+φ
(n)
j+1
−f
j
(9.31)
This operator is a simple extension of the pointJacobi method which uses the most
recent update of φ
j−1
. Hence the jth row of Eq. 9.29 is satisﬁed using the new values of
φ
j
and φ
j−1
and the old value of φ
j+1
. The method of successive overrelaxation (SOR)
170 CHAPTER 9. RELAXATION METHODS
is based on the idea that if the correction produced by the GaussSeidel method tends
to move the solution toward
φ
∞
, then perhaps it would be better to move further in
this direction. It is usually expressed in two steps as
˜
φ
j
=
1
2
φ
(n+1)
j−1
+φ
(n)
j+1
−f
j
φ
(n+1)
j
= φ
(n)
j
+ω
˜
φ
j
−φ
(n)
j
(9.32)
where ω generally lies between 1 and 2, but it can also be written in the single line
φ
(n+1)
j
=
ω
2
φ
(n+1)
j−1
+ (1 −ω)φ
(n)
j
+
ω
2
φ
(n)
j+1
−
ω
2
f
j
(9.33)
The General Form
The general form of the classical methods is obtained by splitting the matrix A in
Eq. 9.13 into its diagonal, D, the portion of the matrix below the diagonal, L, and
the portion above the diagonal, U, such that
A = L +D +U (9.34)
Then the pointJacobi method is obtained with H = −D, which certainly meets
the criterion that it is easy to solve. The GaussSeidel method is obtained with
H = −(L +D), which is also easy to solve, being lower triangular.
9.3 The ODE Approach to Classical Relaxation
9.3.1 The Ordinary Diﬀerential Equation Formulation
The particular type of delta form given by Eq. 9.21 leads to an interpretation of
relaxation methods in terms of solution techniques for coupled ﬁrstorder ODE’s,
about which we have already learned a great deal. One can easily see that Eq. 9.21
results from the application of the explicit Euler timemarching method (with h = 1)
to the following system of ODE’s:
H
d
φ
dt
= A
φ −
f (9.35)
This is equivalent to
d
φ
dt
= H
−1
C
¸
A
b
φ −
f
b
= H
−1
[A
φ −
f] (9.36)
9.3. THE ODE APPROACH TO CLASSICAL RELAXATION 171
In the special case where H
−1
A depends on neither
u nor t, H
−1
f is also independent
of t, and the eigenvectors of H
−1
A are linearly independent, the solution can be
written as
φ = c
1
e
λ
1
t
x
1
+· · · +c
M
e
λ
M
t
x
M
. .. .
error
+
φ
∞
(9.37)
where what is referred to in timeaccurate analysis as the transient solution, is now
referred to in relaxation analysis as the error. It is clear that, if all of the eigenvalues
of H
−1
A have negative real parts (which implies that H
−1
A is nonsingular), then the
system of ODE’s has a steadystate solution which is approached as t →∞, given by
φ
∞
= A
−1
f (9.38)
which is the solution of Eq. 9.13. We see that the goal of a relaxation method is to
remove the transient solution from the general solution in the most eﬃcient way pos
sible. The λ eigenvalues are ﬁxed by the basic matrix in Eq. 9.36, the preconditioning
matrix in 9.7, and the secondary conditioning matrix in 9.35. The σ eigenvalues are
ﬁxed for a given λh by the choice of timemarching method. Throughout the remain
ing discussion we will refer to the independent variable t as “time”, even though no
true time accuracy is involved.
In a stationary method, H and C in Eq. 9.36 are independent of t, that is, they
are not changed throughout the iteration process. The generalization of this in our
approach is to make h, the “time” step, a constant for the entire iteration.
Suppose the explicit Euler method is used for the time integration. For this method
σ
m
= 1 +λ
m
h. Hence the numerical solution after n steps of a stationary relaxation
method can be expressed as (see Eq. 6.28)
φ
n
= c
1
x
1
(1 +λ
1
h)
n
+· · · +c
m
x
m
(1 +λ
m
h)
n
+· · · +c
M
x
M
(1 +λ
M
h)
n
. .. .
error
+
φ
∞
(9.39)
The initial amplitudes of the eigenvectors are given by the magnitudes of the c
m
.
These are ﬁxed by the initial guess. In general it is assumed that any or all of the
eigenvectors could have been given an equally “bad” excitation by the initial guess,
so that we must devise a way to remove them all from the general solution on an
equal basis. Assuming that H
−1
A has been chosen (that is, an iteration process has
been decided upon), the only free choice remaining to accelerate the removal of the
error terms is the choice of h. As we shall see, the three classical methods have all
been conditioned by the choice of H to have an optimum h equal to 1 for a stationary
iteration process.
172 CHAPTER 9. RELAXATION METHODS
9.3.2 ODE Form of the Classical Methods
The three iterative procedures deﬁned by Eqs. 9.30, 9.31 and 9.32 obey no apparent
pattern except that they are easy to implement in a computer code since all of the
data required to update the value of one point are explicitly available at the time
of the update. Now let us study these methods as subsets of ODE as formulated in
Section 9.3.1. Insert the model equation 9.29 into the ODE form 9.35. Then
H
d
φ
dt
= B(1, −2, 1)
φ −
f (9.40)
As a start, let us use for the numerical integration the explicit Euler method
φ
n+1
= φ
n
+hφ
n
(9.41)
with a step size, h, equal to 1. We arrive at
H(
φ
n+1
−
φ
n
) = B(1, −2, 1)
φ
n
−
f (9.42)
It is clear that the best choice of H from the point of view of matrix algebra is
−B(1, −2, 1) since then multiplication from the left by −B
−1
(1, −2, 1) gives the cor
rect answer in one step. However, this is not in the spirit of our study, since multi
plication by the inverse amounts to solving the problem by a direct method without
iteration. The constraint on H that is in keeping with the formulation of the three
methods described in Section 9.2.3 is that all the elements above the diagonal (or
below the diagonal if the sweeps are from right to left) are zero. If we impose this
constraint and further restrict ourselves to banded tridiagonals with a single constant
in each band, we are led to
B(−β,
2
ω
, 0)(
φ
n+1
−
φ
n
) = B(1, −2, 1)
φ
n
−
f (9.43)
where β and ω are arbitrary. With this choice of notation the three methods presented
in Section 9.2.3 can be identiﬁed using the entries in Table 9.1.
TABLE 9.1: VALUES OF β and ω IN EQ. 9.43 THAT
LEAD TO CLASSICAL RELAXATION METHODS
β ω Method Equation
0 1 PointJacobi 6.2.3
1 1 GaussSeidel 6.2.4
1 2/
1 + sin
π
M + 1
Optimum SOR 6.2.5
9.4. EIGENSYSTEMS OF THE CLASSICAL METHODS 173
The fact that the values in the tables lead to the methods indicated can be veriﬁed
by simple algebraic manipulation. However, our purpose is to examine the whole
procedure as a special subset of the theory of ordinary diﬀerential equations. In this
light, the three methods are all contained in the following set of ODE’s
d
φ
dt
= B
−1
(−β,
2
ω
, 0)
¸
B(1, −2, 1)
φ −
f
(9.44)
and appear from it in the special case when the explicit Euler method is used for its
numerical integration. The point operator that results from the use of the explicit
Euler scheme is
φ
(n+1)
j
=
ωβ
2
φ
(n+1)
j−1
+
ω
2
(h −β)φ
(n)
j−1
−
(ωh −1)φ
(n)
j
+
ωh
2
φ
(n)
j+1
−
ωh
2
f
j
(9.45)
This represents a generalization of the classical relaxation techniques.
9.4 Eigensystems of the Classical Methods
The ODE approach to relaxation can be summarized as follows. The basic equation
to be solved came from some timeaccurate derivation
A
b
u −
f
b
= 0 (9.46)
This equation is preconditioned in some manner which has the eﬀect of multiplication
by a conditioning matrix C giving
A
φ −
f = 0 (9.47)
An iterative scheme is developed to ﬁnd the converged, or steadystate, solution of
the set of ODE’s
H
d
φ
dt
= A
φ −
f (9.48)
This solution has the analytical form
φ
n
=
e
n
+
φ
∞
(9.49)
where
e
n
is the transient, or error, and
φ
∞
≡ A
−1
f is the steadystate solution. The
three classical methods, PointJacobi, GaussSeidel, and SOR, are identiﬁed for the
onedimensional case by Eq. 9.44 and Table 9.1.
174 CHAPTER 9. RELAXATION METHODS
Given our assumption that the component of the error associated with each eigen
vector is equally likely to be excited, the asymptotic convergence rate is determined
by the eigenvalue σ
m
of G (≡ I +H
−1
A) having maximum absolute value. Thus
Convergence rate ∼ σ
m

max
, m = 1, 2, · · · , M (9.50)
In this section, we use the ODE analysis to ﬁnd the convergence rates of the three
classical methods represented by Eqs. 9.30, 9.31, and 9.32. It is also instructive to
inspect the eigenvectors and eigenvalues in the H
−1
A matrix for the three methods.
This amounts to solving the generalized eigenvalue problem
A
x
m
= λ
m
H
x
m
(9.51)
for the special case
B(1, −2, 1)
x
m
= λ
m
B(−β,
2
ω
, 0)
x
m
(9.52)
The generalized eigensystem for simple tridigonals is given in Appendix B.2. The
three special cases considered below are obtained with a = 1, b = −2, c = 1, d = −β,
e = 2/ω, and f = 0. To illustrate the behavior, we take M = 5 for the matrix order.
This special case makes the general result quite clear.
9.4.1 The PointJacobi System
If β = 0 and ω = 1 in Eq. 9.44, the ODE matrix H
−1
A reduces to simply B(
1
2
, −1,
1
2
).
The eigensystem can be determined from Appendix B.1 since both d and f are zero.
The eigenvalues are given by the equation
λ
m
= −1 + cos
mπ
M + 1
, m = 1, 2, . . . , M (9.53)
The λσ relation for the explicit Euler method is σ
m
= 1 + λ
m
h. This relation can
be plotted for any h. The plot for h = 1, the optimum stationary case, is shown in
Fig. 9.1. For h < 1, the maximum σ
m
 is obtained with m = 1, Fig. 9.2 and for
h > 1, the maximum σ
m
 is obtained with m = M, Fig. 9.3. Note that for h > 1.0
(depending on M) there is the possibility of instability, i.e. σ
m
 > 1.0. To obtain the
optimal scheme we wish to minimize the maximum σ
m
 which occurs when h = 1,
σ
1
 = σ
M
 and the best possible convergence rate is achieved:
σ
m

max
= cos
π
M + 1
(9.54)
9.4. EIGENSYSTEMS OF THE CLASSICAL METHODS 175
For M = 40, we obtain σ
m

max
= 0.9971. Thus after 500 iterations the error content
associated with each eigenvector is reduced to no more than 0.23 times its initial level.
Again from Appendix B.1, the eigenvectors of H
−1
A are given by
x
j
= (x
j
)
m
= sin
¸
j
mπ
M + 1
, j = 1, 2, . . . , M (9.55)
This is a very “wellbehaved” eigensystem with linearly independent eigenvectors and
distinct eigenvalues. The ﬁrst 5 eigenvectors are simple sine waves. For M = 5, the
eigenvectors can be written as
x
1
=
1/2
√
3/2
1
√
3/2
1/2
¸
¸
¸
¸
¸
¸
¸
¸
,
x
2
=
√
3/2
√
3/2
0
−
√
3/2
−
√
3/2
¸
¸
¸
¸
¸
¸
¸
¸
,
x
3
=
1
0
−1
0
1
¸
¸
¸
¸
¸
¸
¸
¸
,
x
4
=
√
3/2
−
√
3/2
0
√
3/2
−
√
3/2
¸
¸
¸
¸
¸
¸
¸
¸
,
x
5
=
1/2
−
√
3/2
1
−
√
3/2
1/2
¸
¸
¸
¸
¸
¸
¸
¸
(9.56)
The corresponding eigenvalues are, from Eq. 9.53
λ
1
= −1 +
√
3
2
= −0.134 · · ·
λ
2
= −1 +
1
2
= −0.5
λ
3
= −1 = −1.0
λ
4
= −1 −
1
2
= −1.5
λ
5
= −1 −
√
3
2
= −1.866 · · ·
(9.57)
From Eq. 9.39, the numerical solution written in full is
φ
n
−
φ
∞
= c
1
[1 −(1 −
√
3
2
)h]
n
x
1
+ c
2
[1 −(1 −
1
2
)h]
n
x
2
176 CHAPTER 9. RELAXATION METHODS
σ
m
1.0
0.0
2.0
1.0 0.0
h = 1.0
Μ = 5
Values are equal
m=1
m=2
m=3
m=4
m=5
Figure 9.1: The σ, λ relation for PointJacobi, h = 1, M = 5.
+ c
3
[1 −(1 )h]
n
x
3
+ c
4
[1 −(1 +
1
2
)h]
n
x
4
+ c
5
[1 −(1 +
√
3
2
)h]
n
x
5
(9.58)
9.4.2 The GaussSeidel System
If β and ω are equal to 1 in Eq. 9.44, the matrix eigensystem evolves from the relation
B(1, −2, 1)
x
m
= λ
m
B(−1, 2, 0)
x
m
(9.59)
which can be studied using the results in Appendix B.2. One can show that the H
−1
A
matrix for the GaussSeidel method, A
GS
, is
A
GS
≡ B
−1
(−1, 2, 0)B(1, −2, 1) =
9.4. EIGENSYSTEMS OF THE CLASSICAL METHODS 177
σ
m
λ
m
h
1.0
0.0
2.0
1.0 0.0
h < 1.0
D
e
c
r
e
a
s
i
n
g
h
Approaches 1.0
D
e
c
r
e
a
s
i
n
g
h
Figure 9.2: The σ, λ relation for PointJacobi, h = 0.9, M = 5.
σ
m
Exceeds 1.0 at h 1.072 / Ustable h > 1.072
∼
∼
1.0
0.0
2.0
1.0 0.0
h > 1.0
I
n
c
r
e
a
s
i
n
g
h
λ
m
h
M = 5
I
n
c
r
e
a
s
i
n
g
h
Figure 9.3: The σ, λ relation for PointJacobi, h = 1.1, M = 5.
178 CHAPTER 9. RELAXATION METHODS
−1 1/2
0 −3/4 1/2
0 1/8 −3/4 1/2
0 1/16 1/8 −3/4 1/2
0 1/32 1/16 1/8 −3/4 1/2
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
0 1/2
M
· · ·
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
1
2
3
4
5
.
.
.
M
(9.60)
The eigenvector structure of the GaussSeidel ODE matrix is quite interesting. If M
is odd there are (M + 1)/2 distinct eigenvalues with corresponding linearly indepen
dent eigenvectors, and there are (M −1)/2 defective eigenvalues with corresponding
principal vectors. The equation for the nondefective eigenvalues in the ODE matrix
is (for odd M)
λ
m
= −1 + cos
2
(
mπ
M + 1
) , m = 1, 2, . . . ,
M + 1
2
(9.61)
and the corresponding eigenvectors are given by
x
m
=
¸
cos
mπ
M + 1
j−1
sin
¸
j
mπ
M + 1
, m = 1, 2, . . . ,
M + 1
2
(9.62)
The λσ relation for h = 1, the optimum stationary case, is shown in Fig. 9.4. The
σ
m
with the largest amplitude is obtained with m = 1. Hence the convergence rate is
σ
m

max
=
¸
cos
π
M + 1
2
(9.63)
Since this is the square of that obtained for the PointJacobi method, the error associ
ated with the “worst” eigenvector is removed in half as many iterations. For M = 40,
σ
m

max
= 0.9942. 250 iterations are required to reduce the error component of the
worst eigenvector by a factor of roughly 0.23.
The eigenvectors are quite unlike the PointJacobi set. They are no longer sym
metrical, producing waves that are higher in amplitude on one side (the updated side)
than they are on the other. Furthermore, they do not represent a common family for
diﬀerent values of M.
The Jordan canonical form for M = 5 is
X
−1
A
GS
X = J
GS
=
φ
1
φ
2
φ
3
1
φ
3
1
φ
3
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
(9.64)
9.4. EIGENSYSTEMS OF THE CLASSICAL METHODS 179
σ
m
λ
m
λ
m
1.0
0.0
2.0
1.0 0.0
h = 1.0
Μ = 5
1.0 1.0
2 defective
h
Figure 9.4: The σ, λ relation for GaussSeidel, h = 1.0, M = 5.
The eigenvectors and principal vectors are all real. For M = 5 they can be written
x
1
=
1/2
3/4
3/4
9/16
9/32
¸
¸
¸
¸
¸
¸
¸
¸
,
x
2
=
√
3/2
√
3/4
0
−
√
3/16
−
√
3/32
¸
¸
¸
¸
¸
¸
¸
¸
,
x
3
=
1
0
0
0
0
¸
¸
¸
¸
¸
¸
¸
¸
,
x
4
=
0
2
−1
0
0
¸
¸
¸
¸
¸
¸
¸
¸
,
x
5
=
0
0
4
−4
1
¸
¸
¸
¸
¸
¸
¸
¸
(9.65)
The corresponding eigenvalues are
λ
1
= −1/4
λ
2
= −3/4
λ
3
= −1
(4)
(5)
¸
Defective, linked to
λ
3
Jordan block
(9.66)
The numerical solution written in full is thus
φ
n
−
φ
∞
= c
1
(1 −
h
4
)
n
x
1
+ c
2
(1 −
3h
4
)
n
x
2
180 CHAPTER 9. RELAXATION METHODS
+
¸
c
3
(1 −h)
n
+c
4
h
n
1!
(1 −h)
n−1
+c
5
h
2
n(n −1)
2!
(1 −h)
n−2
¸
x
3
+
¸
c
4
(1 −h)
n
+c
5
h
n
1!
(1 −h)
n−1
x
4
+ c
5
(1 −h)
n
x
5
(9.67)
9.4.3 The SOR System
If β = 1 and 2/ω = x in Eq. 9.44, the ODE matrix is B
−1
(−1, x, 0)B(1, −2, 1).
One can show that this can be written in the form given below for M = 5. The
generalization to any M is fairly clear. The H
−1
A matrix for the SOR method,
A
SOR
≡ B
−1
(−1, x, 0)B(1, −2, 1), is
1
x
5
−2x
4
x
4
0 0 0
−2x
3
+x
4
x
3
−2x
4
x
4
0 0
−2x
2
+x
3
x
2
−2x
3
+x
4
x
3
−2x
4
x
4
0
−2x +x
2
x −2x
2
+x
3
x
2
−2x
3
+x
4
x
3
−2x
4
x
4
−2 +x 1 −2x +x
2
x −2x
2
+x
3
x
2
−2x
3
+x
4
x
3
−2x
4
¸
¸
¸
¸
¸
¸
¸
¸
(9.68)
Eigenvalues of the system are given by
λ
m
= −1 +
ωp
m
+z
m
2
2
, m = 1, 2, . . . M (9.69)
where
z
m
= [4(1 −ω) +ω
2
p
2
m
]
1/2
p
m
= cos[mπ/(M + 1)]
If ω = 1, the system is GaussSeidel. If 4(1 −ω) +ω
2
p
m
< 0, z
m
and λ
m
are complex.
If ω is chosen such that 4(1 − ω) + ω
2
p
2
1
= 0, ω is optimum for the stationary case,
and the following conditions hold:
1. Two eigenvalues are real, equal and defective.
2. If M is even, the remaining eigenvalues are complex and occur in conjugate
pairs.
3. If M is odd, one of the remaining eigenvalues is real and the others are complex
occurring in conjugate pairs.
9.4. EIGENSYSTEMS OF THE CLASSICAL METHODS 181
One can easily show that the optimum ω for the stationary case is
ω
opt
= 2/
¸
1 + sin
π
M + 1
(9.70)
and for ω = ω
opt
λ
m
= ζ
2
m
−1
x
m
= ζ
j−1
m
sin
¸
j
mπ
M + 1
(9.71)
where
ζ
m
=
ω
opt
2
¸
p
m
+i
p
2
1
−p
2
m
Using the explicit Euler method to integrate the ODE’s, σ
m
= 1 − h + hζ
2
m
, and if
h = 1, the optimum value for the stationary case, the λσ relation reduces to that
shown in Fig. 9.5. This illustrates the fact that for optimum stationary SOR all the
σ
m
 are identical and equal to ω
opt
−1. Hence the convergence rate is
σ
m

max
= ω
opt
−1 (9.72)
ω
opt
= 2/
¸
1 + sin
π
M + 1
For M = 40, σ
m

max
= 0.8578. Hence the worst error component is reduced to less
than 0.23 times its initial value in only 10 iterations, much faster than both Gauss
Seidel and PointJacobi. In practical applications, the optimum value of ω may have
to be determined by trial and error, and the beneﬁt may not be as great.
For odd M, there are two real eigenvectors and one real principal vector. The
remaining linearly independent eigenvectors are all complex. For M = 5 they can be
written
x
1
=
1/2
1/2
1/3
1/6
1/18
¸
¸
¸
¸
¸
¸
¸
¸
,
x
2
=
−6
9
16
13
6
¸
¸
¸
¸
¸
¸
¸
¸
,
x
3,4
=
√
3(1 )/2
√
3(1 ± i
√
2)/6
0
√
3(5 ± i
√
2)/54
√
3(7 ±4i
√
2)/162
¸
¸
¸
¸
¸
¸
¸
¸
,
x
5
=
1
0
1/3
0
1/9
¸
¸
¸
¸
¸
¸
¸
¸
(9.73)
The corresponding eigenvalues are
λ
1
= −2/3
(2) Defective linked to λ
1
λ
3
= −(10 −2
√
2i)/9
λ
4
= −(10 + 2
√
2i)/9
λ
5
= −4/3 (9.74)
182 CHAPTER 9. RELAXATION METHODS
σ
m
λ
m
λ
m
λ
m
1.0
0.0
2.0
1.0 0.0
h = 1.0
Μ = 5
1.0 1.0
2 real defective
1 real
2 complex
h λ
m
Figure 9.5: The σ, λ relation for optimum stationary SOR, M = 5, h = 1.
The numerical solution written in full is
φ
n
−
φ
∞
= [c
1
(1 −2h/3)
n
+c
2
nh(1 −2h/3)
n−1
]
x
1
+ c
2
(1 −2h/3)
n
x
2
+ c
3
[1 −(10 −2
√
2i)h/9]
n
x
3
+ c
4
[1 −(10 + 2
√
2i)h/9]
n
x
4
+ c
5
(1 −4h/3)
n
x
5
(9.75)
9.5 Nonstationary Processes
In classical terminology a method is said to be nonstationary if the conditioning
matrices, H and C, are varied at each time step. This does not change the steady
state solution A
−1
b
f
b
, but it can greatly aﬀect the convergence rate. In our ODE
approach this could also be considered and would lead to a study of equations with
nonconstant coeﬃcients. It is much simpler, however, to study the case of ﬁxed
H and C but variable step size, h. This process changes the PointJacobi method
to Richardson’s method in standard terminology. For the GaussSeidel and SOR
methods it leads to processes that can be superior to the stationary methods.
The nonstationary form of Eq. 9.39 is
9.5. NONSTATIONARY PROCESSES 183
φ
N
= c
1
x
1
N
¸
n=1
(1 +λ
1
h
n
) +· · · +c
m
x
m
N
¸
n=1
(1 +λ
m
h
n
)
+ · · · +c
M
x
M
N
¸
n=1
(1 +λ
M
h
n
) +
φ
∞
(9.76)
where the symbol Π stands for product. Since h
n
can now be changed at each step,
the error term can theoretically be completely eliminated in M steps by taking h
m
=
−1/λ
m
, for m = 1, 2, · · · , M. However, the eigenvalues λ
m
are generally unknown and
costly to compute. It is therefore unnecessary and impractical to set h
m
= −1/λ
m
for m = 1, 2, . . . , M. We will see that a few well chosen h’s can reduce whole clusters
of eigenvectors associated with nearby λ’s in the λ
m
spectrum. This leads to the
concept of selectively annihilating clusters of eigenvectors from the error terms as
part of a total iteration process. This is the basis for the multigrid methods discussed
in Chapter 10.
Let us consider the very important case when all of the λ
m
are real and nega
tive (remember that they arise from a conditioned matrix so this constraint is not
unrealistic for quite practical cases). Consider one of the error terms taken from
e
N
≡
φ
N
−
φ
∞
=
M
¸
m=1
c
m
x
m
N
¸
n=1
(1 +λ
m
h
n
) (9.77)
and write it in the form
c
m
x
m
P
e
(λ
m
) ≡ c
m
x
m
N
¸
n=1
(1 +λ
m
h
n
) (9.78)
where P
e
signiﬁes an “Euler” polynomial. Now focus attention on the polynomial
(P
e
)
N
(λ) = (1 +h
1
λ)(1 +h
2
λ) · · · (1 +h
N
λ) (9.79)
treating it as a continuous function of the independent variable λ. In the annihilation
process mentioned after Eq. 9.76, we considered making the error exactly zero by
taking advantage of some knowledge about the discrete values of λ
m
for a particular
case. Now we pose a less demanding problem. Let us choose the h
n
so that the
maximum value of (P
e
)
N
(λ) is as small as possible for all λ lying between λ
a
and λ
b
such that λ
b
≤ λ ≤ λ
a
≤ 0. Mathematically stated, we seek
max
λ
b
≤λ≤λa
(P
e
)
N
(λ) = minimum , with(P
e
)
N
(0) = 1 (9.80)
184 CHAPTER 9. RELAXATION METHODS
This problem has a well known solution due to Markov. It is
(P
e
)
N
(λ) =
T
N
2λ −λ
a
−λ
b
λ
a
−λ
b
T
N
−λ
a
−λ
b
λ
a
−λ
b
(9.81)
where
T
N
(y) = cos(N arccos y) (9.82)
are the Chebyshev polynomials along the interval −1 ≤ y ≤ 1 and
T
N
(y) =
1
2
¸
y +
y
2
−1
N
+
1
2
¸
y −
y
2
−1
N
(9.83)
are the Chebyshev polynomials for y > 1. In relaxation terminology this is generally
referred to as Richardson’s method, and it leads to the nonstationary step size choice
given by
1
h
n
=
1
2
−λ
b
−λ
a
+ (λ
b
−λ
a
) cos
¸
(2n −1)π
2N
¸¸
, n = 1, 2, . . . N (9.84)
Remember that all λ are negative real numbers representing the magnitudes of λ
m
in
an eigenvalue spectrum.
The error in the relaxation process represented by Eq. 9.76 is expressed in terms
of a set of eigenvectors,
x
m
, ampliﬁed by the coeﬃcients c
m
¸
(1 +λ
m
h
n
). With each
eigenvector there is a corresponding eigenvalue. Eq. 9.84 gives us the best choice of a
series of h
n
that will minimize the amplitude of the error carried in the eigenvectors
associated with the eigenvalues between λ
b
and λ
a
.
As an example for the use of Eq. 9.84, let us consider the following problem:
Minimize the maximum error asso
ciated with the λ eigenvalues in the
interval −2 ≤ λ ≤ −1 using only 3
iterations.
(9.85)
The three values of h which satisfy this problem are
h
n
= 2/
3 −cos
¸
(2n −1)π
6
¸
(9.86)
9.5. NONSTATIONARY PROCESSES 185
and the amplitude of the eigenvector is reduced to
(P
e
)
3
(λ) = T
3
(2λ + 3)/T
3
(3) (9.87)
where
T
3
(3) =
[3 +
√
8]
3
+ [3 −
√
8]
3
¸
/2 ≈ 99 (9.88)
A plot of Eq. 9.87 is given in Fig. 9.6 and we see that the amplitudes of all the
eigenvectors associated with the eigenvalues in the range −2 ≤ λ ≤ −1 have been
reduced to less than about 1% of their initial values. The values of h used in Fig. 9.6
are
h
1
= 4/(6 −
√
3)
h
2
= 4/(6 −0)
h
3
= 4/(6 +
√
3)
Return now to Eq. 9.76. This was derived from Eq. 9.37 on the condition that the
explicit Euler method, Eq. 9.41, was used to integrate the basic ODE’s. If instead
the implicit trapezoidal rule
φ
n+1
= φ
n
+
1
2
h(φ
n+1
+φ
n
) (9.89)
is used, the nonstationary formula
φ
N
=
M
¸
m=1
c
m
x
m
N
¸
n=1
¸
¸
¸
1 +
1
2
h
n
λ
m
1 −
1
2
h
n
λ
m
¸
+
φ
∞
(9.90)
would result. This calls for a study of the rational “trapezoidal” polynomial, P
t
:
(P
t
)
N
(λ) =
N
¸
n=1
¸
¸
¸
1 +
1
2
h
n
λ
1 −
1
2
h
n
λ
¸
(9.91)
under the same constraints as before, namely that
max
λ
b
≤λ≤λa
(P
t
)
N
(λ) = minimum , (9.92)
with (P
t
)
N
(0) = 1
186 CHAPTER 9. RELAXATION METHODS
−2 −1.8 −1.6 −1.4 −1.2 −1 −0.8 −0.6 −0.4 −0.2 0
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
λ
(
P
e
)
3
(
λ
)
−2 −1.8 −1.6 −1.4 −1.2 −1 −0.8 −0.6 −0.4 −0.2 0
−0.02
−0.015
−0.01
−0.005
0
0.005
0.01
0.015
0.02
λ
(
P
e
)
3
(
λ
)
Figure 9.6: Richardson’s method for 3 steps, minimization over −2 ≤ λ ≤ −1.
9.6. PROBLEMS 187
The optimum values of h can also be found for this problem, but we settle here for
the approximation suggested by Wachspress
2
h
n
= −λ
b
λ
a
λ
b
(n−1)/(N−1)
, n = 1, 2, · · · , N (9.93)
This process is also applied to problem 9.85. The results for (P
t
)
3
(λ) are shown in
Fig. 9.7. The error amplitude is about 1/5 of that found for (P
e
)
3
(λ) in the same
interval of λ. The values of h used in Fig. 9.7 are
h
1
= 1
h
2
=
√
2
h
3
= 2
9.6 Problems
1. Given a relaxation method in the form
H∆
φ
n
= A
φ
n
−
f
show that
φ
n
= G
n
φ
0
+ (I −G
n
)A
−1
f
where G = I +H
−1
A.
2. For a linear system of the form (A
1
+A
2
)x = b, consider the iterative method
(I +µA
1
)˜ x = (I −µA
2
)x
n
+µb
(I +µA
2
)x
n+1
= (I −µA
1
)˜ x +µb
where µ is a parameter. Show that this iterative method can be written in the
form
H(x
k+1
−x
k
) = (A
1
+A
2
)x
k
−b
Determine the iteration matrix G if µ = −1/2.
3. Using Appendix B.2, ﬁnd the eigenvalues of H
−1
A for the SOR method with
A = B(4 : 1, −2, 1) and ω = ω
opt
. (You do not have to ﬁnd H
−1
. Recall that
the eigenvalues of H
−1
A satisfy Ax
m
= λ
m
Hx
m
.) Find the numerical values,
not just the expressions. Then ﬁnd the corresponding σ
m
 values.
188 CHAPTER 9. RELAXATION METHODS
−2 −1.8 −1.6 −1.4 −1.2 −1 −0.8 −0.6 −0.4 −0.2 0
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
λ
(
P
t
)
3
(
λ
)
−2 −1.8 −1.6 −1.4 −1.2 −1 −0.8 −0.6 −0.4 −0.2 0
−2
−1.5
−1
−0.5
0
0.5
1
1.5
2
x 10
−3
λ
(
P
t
)
3
(
λ
)
Figure 9.7: Wachspress method for 3 steps, minimization over −2 ≤ λ ≤ −1.
9.6. PROBLEMS 189
4. Solve the following equation on the domain 0 ≤ x ≤ 1 with boundary conditions
u(0) = 0, u(1) = 1:
∂
2
u
∂x
2
−6x = 0
For the initial condition, use u(x) = 0. Use secondorder centered diﬀerences
on a grid with 40 cells (M = 39). Iterate to steady state using
(a) the pointJacobi method,
(b) the GaussSeidel method,
(c) the SOR method with the optimum value of ω, and
(d) the 3step Richardson method derived in Section 9.5.
Plot the solution after the residual is reduced by 2, 3, and 4 orders of mag
nitude. Plot the logarithm of the L
2
norm of the residual vs. the number of
iterations. Determine the asymptotic convergence rate. Compare with the the
oretical asymptotic convergence rate.
190 CHAPTER 9. RELAXATION METHODS
Chapter 10
MULTIGRID
The idea of systematically using sets of coarser grids to accelerate the convergence of
iterative schemes that arise from the numerical solution to partial diﬀerential equa
tions was made popular by the work of Brandt. There are many variations of the
process and many viewpoints of the underlying theory. The viewpoint presented here
is a natural extension of the concepts discussed in Chapter 9.
10.1 Motivation
10.1.1 Eigenvector and Eigenvalue Identiﬁcation with Space
Frequencies
Consider the eigensystem of the model matrix B(1, −2, 1). The eigenvalues and
eigenvectors are given in Sections 4.3.2 and 4.3.3, respectively. Notice that as the
magnitudes of the eigenvalues increase, the spacefrequency (or wavenumber) of the
corresponding eigenvectors also increase. That is, if the eigenvalues are ordered such
that
λ
1
 ≤ λ
2
 ≤ · · · ≤ λ
M
 (10.1)
then the corresponding eigenvectors are ordered from low to high space frequencies.
This has a rational explanation from the origin of the banded matrix. Note that
∂
2
∂x
2
sin(mx) = −m
2
sin(mx) (10.2)
and recall that
δ
xx
φ =
1
∆x
2
B(1, −2, 1)
φ = X
¸
1
∆x
2
D(
λ)
X
−1
φ (10.3)
191
192 CHAPTER 10. MULTIGRID
where D(
λ) is a diagonal matrix containing the eigenvalues. We have seen that
X
−1
φ represents a sine transform, and X
φ, a sine synthesis. Therefore, the opera
tion
1
∆x
2
D(
λ) represents the numerical approximation of the multiplication of the
appropriate sine wave by the negative square of its wavenumber, −m
2
. One ﬁnds
that
1
∆x
2
λ
m
=
M + 1
π
2
¸
−2 + 2 cos
mπ
M + 1
≈ −m
2
, m << M (10.4)
Hence, the correlation of large magnitudes of λ
m
with high spacefrequencies is to be
expected for these particular matrix operators. This is consistent with the physics of
diﬀusion as well. However, this correlation is not necessary in general. In fact, the
complete counterexample of the above association is contained in the eigensystem
for B(
1
2
, 1,
1
2
). For this matrix one ﬁnds, from Appendix B, exactly the opposite
behavior.
10.1.2 Properties of the Iterative Method
The second key motivation for multigrid is the following:
• Many iterative methods reduce error components corresponding to eigenvalues
of large amplitude more eﬀectively than those corresponding to eigenvalues of
small amplitude.
This is to be expected of an iterative method which is time accurate. It is also
true, for example, of the GaussSeidel method and, by design, of the Richardson
method described in Section 9.5. The classical pointJacobi method does not share
this property. As we saw in Section 9.4.1, this method produces the same value of σ
for λ
min
and λ
max
. However, the property can be restored by using h < 1, as shown
in Fig. 9.2.
When an iterative method with this property is applied to a matrix with the
above correlation between the modulus of the eigenvalues and the space frequency of
the eigenvectors, error components corresponding to high space frequencies will be
reduced more quickly than those corresponding to low space frequencies. This is the
key concept underlying the multigrid process.
10.2 The Basic Process
First of all we assume that the diﬀerence equations representing the basic partial
diﬀerential equations are in a form that can be related to a matrix which has certain
10.2. THE BASIC PROCESS 193
basic properties. This form can be arrived at “naturally” by simply replacing the
derivatives in the PDE with diﬀerence schemes, as in the example given by Eq. 3.27,
or it can be “contrived” by further conditioning, as in the examples given by Eq. 9.11.
The basic assumptions required for our description of the multigrid process are:
1. The problem is linear.
2. The eigenvalues, λ
m
, of the matrix are all real and negative.
3. The λ
m
are fairly evenly distributed between their maximum and minimum
values.
4. The eigenvectors associated with the eigenvalues having largest magnitudes can
be correlated with high frequencies on the diﬀerencing mesh.
5. The iterative procedure used greatly reduces the amplitudes of the eigenvectors
associated with eigenvalues in the range between
1
2
λ
max
and λ
max
.
These conditions are suﬃcient to ensure the validity of the process described next.
Having preconditioned (if necessary) the basic ﬁnite diﬀerencing scheme by a pro
cedure equivalent to the multiplication by a matrix C, we are led to the starting
formulation
C[A
b
φ
∞
−
f
b
] = 0 (10.5)
where the matrix formed by the product CA
b
has the properties given above. In Eq.
10.5, the vector
f
b
represents the boundary conditions and the forcing function, if
any, and
φ
∞
is a vector representing the desired exact solution. We start with some
initial guess for
φ
∞
and proceed through n iterations making use of some iterative
process that satisﬁes property 5 above. We do not attempt to develop an optimum
procedure here, but for clarity we suppose that the threestep Richardson method
illustrated in Fig. 9.6 is used. At the end of the three steps we ﬁnd
r, the residual,
where
r = C[A
b
φ −
f
b
] (10.6)
Recall that the
φ used to compute
r is composed of the exact solution
φ
∞
and the
error
e in such a way that
A
e −
r = 0 (10.7)
where
A ≡ CA
b
(10.8)
194 CHAPTER 10. MULTIGRID
If one could solve Eq. 10.7 for
e then
φ
∞
=
φ −
e (10.9)
Thus our goal now is to solve for
e. We can write the exact solution for
e in terms of
the eigenvectors of A, and the σ eigenvalues of the Richardson process in the form:
e =
M/2
¸
m=1
c
m
x
m
3
¸
n=1
[σ(λ
m
h
n
)] +
M
¸
m=M/2+1
c
m
x
m
3
¸
n=1
[σ(λ
m
h
n
)]
. .. .
very low amplitude
(10.10)
Combining our basic assumptions, we can be sure that the high frequency content of
e has been greatly reduced (about 1% or less of its original value in the initial guess).
In addition, assumption 4 ensures that the error has been smoothed.
Next we construct a permutation matrix which separates a vector into two parts,
one containing the odd entries, and the other the even entries of the original vector
(or any other appropriate sorting which is consistent with the interpolation approxi
mation to be discussed below). For a 7point example
e
2
e
4
e
6
e
1
e
3
e
5
e
7
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
=
0 1 0 0 0 0 0
0 0 0 1 0 0 0
0 0 0 0 0 1 0
1 0 0 0 0 0 0
0 0 1 0 0 0 0
0 0 0 0 1 0 0
0 0 0 0 0 0 1
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
e
1
e
2
e
3
e
4
e
5
e
6
e
7
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
;
¸
e
e
e
o
¸
= P
e (10.11)
Multiply Eq. 10.7 from the left by P and, since a permutation matrix has an inverse
which is its transpose, we can write
PA[P
−1
P]
e = P
r (10.12)
The operation PAP
−1
partitions the A matrix to form
A
1
A
2
A
3
A
4
¸
¸
¸
¸
e
e
e
o
¸
=
¸
r
e
r
o
¸
(10.13)
Notice that
A
1
e
e
+A
2
e
o
=
r
e
(10.14)
10.2. THE BASIC PROCESS 195
is an exact expression.
At this point we make our one crucial assumption. It is that there is some connec
tion between
e
e
and
e
o
brought about by the smoothing property of the Richardson
relaxation procedure. Since the top half of the frequency spectrum has been removed,
it is reasonable to suppose that the odd points are the average of the even points.
For example
e
1
≈
1
2
(e
a
+e
2
)
e
3
≈
1
2
(e
2
+e
4
)
e
5
≈
1
2
(e
4
+e
6
) or
e
o
= A
2
e
e
(10.15)
e
7
≈
1
2
(e
6
+e
b
)
It is important to notice that e
a
and e
b
represent errors on the boundaries where the
error is zero if the boundary conditions are given. It is also important to notice that we
are dealing with the relation between
e and
r so the original boundary conditions and
forcing function (which are contained in
f in the basic formulation) no longer appear
in the problem. Hence, no aliasing of these functions can occur in subsequent steps.
Finally, notice that, in this formulation, the averaging of
e is our only approximation,
no operations on
r are required or justiﬁed.
If the boundary conditions are Dirichlet, e
a
and e
b
are zero, and one can write for
the example case
A
2
=
1
2
1 0 0
1 1 0
0 1 1
0 0 1
¸
¸
¸
¸
¸
(10.16)
With this approximation Eq. 10.14 reduces to
A
1
e
e
+A
2
A
2
e
e
=
r
e
(10.17)
or
A
c
e
e
−
r
e
= 0 (10.18)
where
A
c
= [A
1
+A
2
A
2
] (10.19)
196 CHAPTER 10. MULTIGRID
The form of A
c
, the matrix on the coarse mesh, is completely determined by the
choice of the permutation matrix and the interpolation approximation. If the original
A had been B(7 : 1, −2, 1), our 7point example would produce
PAP
−1
=
−2 1 1
−2 1 1
−2 1 1
1 −2
1 1 −2
1 1 −2
1 −2
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
=
A
1
A
2
A
3
A
4
¸
¸
¸ (10.20)
and Eq. 10.18 gives
A
1
. .. .
−2
−2
−2
¸
¸
¸+
A
2
. .. .
1 1
1 1
1 1
¸
¸
¸·
1
2
A
2
. .. .
1
1 1
1 1
1
¸
¸
¸
¸
¸
=
Ac
. .. .
−1 1/2
1/2 −1 1/2
1/2 −1
¸
¸
¸ (10.21)
If the boundary conditions are mixed DirichletNeumann, A in the 1D model
equation is B(1,
b, 1) where
b = [−2, −2, ..., −2, −1]
T
. The eigensystem is given by
Eq. B.19. It is easy to show that the high spacefrequencies still correspond to the
eigenvalues with high magnitudes, and, in fact, all of the properties given in Section
10.1 are met. However, the eigenvector structure is diﬀerent from that given in
Eq. 9.55 for Dirichlet conditions. In the present case they are given by
x
jm
= sin
¸
j
(2m−1)π
2M + 1
¸
; m = 1, 2, · · · , M (10.22)
and are illustrated in Fig. 10.1. All of them go through zero on the left (Dirichlet)
side, and all of them reﬂect on the right (Neumann) side.
For Neumann conditions, the interpolation formula in Eq. 10.15 must be changed.
In the particular case illustrated in Fig. 10.1, e
b
is equal to e
M
. If Neumann conditions
are on the left, e
a
= e
1
. When e
b
= e
M
, the example in Eq. 10.16 changes to
A
2
=
1
2
1 0 0
1 1 0
0 1 1
0 0 2
¸
¸
¸
¸
¸
(10.23)
10.2. THE BASIC PROCESS 197
X
Figure 10.1: Eigenvectors for the mixed Dirichlet–Neumann case.
The permutation matrix remains the same and both A
1
and A
2
in the partitioned
matrix PAP
−1
are unchanged (only A
4
is modiﬁed by putting −1 in the lower right
element). Therefore, we can construct the coarse matrix from
A
1
. .. .
−2
−2
−2
¸
¸
¸+
A
2
. .. .
1 1
1 1
1 1
¸
¸
¸·
1
2
A
2
. .. .
1
1 1
1 1
2
¸
¸
¸
¸
¸
=
Ac
. .. .
−1 1/2
1/2 −1 1/2
1/2 −1/2
¸
¸
¸ (10.24)
which gives us what we might have “expected.”
We will continue with Dirichlet boundary conditions for the remainder of this
Section. At this stage, we have reduced the problem from B(1, −2, 1)
e =
r on the
ﬁne mesh to
1
2
B(1, −2, 1)
e
e
=
r
e
on the next coarser mesh. Recall that our goal is
to solve for
e, which will provide us with the solution
φ
∞
using Eq. 10.9. Given
e
e
computed on the coarse grid (possibly using even coarser grids), we can compute
e
o
using Eq. 10.15, and thus
e. In order to complete the process, we must now determine
the relationship between
e
e
and
e.
In order to examine this relationship, we need to consider the eigensystems of A
and A
c
:
A = XΛX
−1
, A
c
= X
c
Λ
c
X
−1
c
(10.25)
For A = B(M : 1, −2, 1) the eigenvalues and eigenvectors are
λ
m
= −2
¸
1 −cos
mπ
M + 1
,
x
m
= sin
¸
j
mπ
M + 1
,
j = 1, 2, · · · , M
m = 1, 2, · · · , M
(10.26)
198 CHAPTER 10. MULTIGRID
Based on our assumptions, the most diﬃcult error mode to eliminate is that with
m = 1, corresponding to
λ
1
= −2
¸
1 −cos
π
M + 1
,
x
1
= sin
¸
j
π
M + 1
, j = 1, 2, · · · , M (10.27)
For example, with M = 51, λ
1
= −0.003649. If we restrict our attention to odd M,
then M
c
= (M−1)/2 is the size of A
c
. The eigenvalue and eigenvector corresponding
to m = 1 for the matrix A
c
=
1
2
B(M
c
, 1, −2, 1) are
(λ
c
)
1
= −
¸
1 −cos
2π
M + 1
, (
x
c
)
1
= sin
¸
j
2π
M + 1
, j = 1, 2, · · · , M
c
(10.28)
For M = 51 (M
c
= 25), we obtain (λ
c
)
1
= −0.007291 = 1.998λ
1
. As M increases,
(λ
c
)
1
approaches 2λ
1
. In addition, one can easily see that (
x
c
)
1
coincides with
x
1
at
every second point of the latter vector, that is, it contains the even elements of
x
1
.
Now let us consider the case in which all of the error consists of the eigenvector
component
x
1
, i.e.,
e =
x
1
. Then the residual is
r = A
x
1
= λ
1
x
1
(10.29)
and the residual on the coarse grid is
r
e
= λ
1
(
x
c
)
1
(10.30)
since (
x
c
)
1
contains the even elements of
x
1
. The exact solution on the coarse grid
satisﬁes
e
e
= A
−1
c
r
e
= X
c
Λ
−1
c
X
−1
c
λ
1
(
x
c
)
1
(10.31)
= λ
1
X
c
Λ
−1
c
1
0
.
.
.
0
¸
¸
¸
¸
¸
¸
(10.32)
= λ
1
X
c
1/(λ
c
)
1
0
.
.
.
0
¸
¸
¸
¸
¸
¸
(10.33)
10.2. THE BASIC PROCESS 199
=
λ
1
(λ
c
)
1
(
x
c
)
1
(10.34)
≈
1
2
(
x
c
)
1
(10.35)
Since our goal is to compute
e =
x
1
, in addition to interpolating
e
e
to the ﬁne grid
(using Eq. 10.15), we must multiply the result by 2. This is equivalent to solving
1
2
A
c
e
e
=
r
e
(10.36)
or
1
4
B(M
c
: 1, −2, 1)
e
e
=
r
e
(10.37)
In our case, the matrix A = B(M : 1, −2, 1) comes from a discretization of the
diﬀusion equation, which gives
A
b
=
ν
∆x
2
B(M : 1, −2, 1) (10.38)
and the preconditioning matrix C is simply
C =
∆x
2
ν
I (10.39)
Applying the discretization on the coarse grid with the same preconditioning matrix
as used on the ﬁne grid gives, since ∆x
c
= 2∆x,
C
ν
∆x
2
c
B(M
c
: 1, −2, 1) =
∆x
2
∆x
2
c
B(M
c
: 1, −2, 1) =
1
4
B(M
c
: 1, −2, 1) (10.40)
which is precisely the matrix appearing in Eq. 10.37. Thus we see that the process is
recursive. The problem to be solved on the coarse grid is the same as that solved on
the ﬁne grid.
The remaining steps required to complete an entire multigrid process are relatively
straightforward, but they vary depending on the problem and the user. The reduction
can be, and usually is, carried to even coarser grids before returning to the ﬁnest level.
However, in each case the appropriate permutation matrix and the interpolation
approximation deﬁne both the down and upgoing paths. The details of ﬁnding
optimum techniques are, obviously, quite important but they are not discussed here.
200 CHAPTER 10. MULTIGRID
10.3 A TwoGrid Process
We now describe a twogrid process for the linear problemA
φ =
f, which can be easily
generalized to a process with an arbitrary number of grids due to the recursive nature
of multigrid. Extension to nonlinear problems requires that both the solution and the
residual be transferred to the coarse grid in a process known as full approximation
storage multigrid.
1. Perform n
1
iterations of the selected relaxation method on the ﬁne grid, starting
with
φ =
φ
n
. Call the result
φ
(1)
. This gives
1
φ
(1)
= G
n
1
1
φ
n
+ (I −G
n
1
1
) A
−1
f (10.41)
where
G
1
= I +H
−1
1
A
1
(10.42)
and H
1
is deﬁned as in Chapter 9 (e.g., Eq. 9.21). Next compute the residual based
on
φ
(1)
:
r
(1)
= A
φ
(1)
−
f = AG
n
1
1
φ
n
+A(I −G
n
1
1
) A
−1
f −
f
= AG
n
1
1
φ
n
−AG
n
1
1
A
−1
f (10.43)
2. Transfer (or restrict) r
(1)
to the coarse grid:
r
(2)
= R
2
1
r
(1)
(10.44)
In our example in the preceding section, the restriction matrix is
R
2
1
=
0 1 0 0 0 0 0
0 0 0 1 0 0 0
0 0 0 0 0 1 0
¸
¸
¸ (10.45)
that is, the ﬁrst three rows of the permutation matrix P in Eq. 10.11. This type of
restriction is known as “simple injection.” Some form of weighted restriction can also
be used.
3. Solve the problem A
2
e
(2)
= r
(2)
on the coarse grid exactly:
2
e
(2)
= A
−1
2
r
(2)
(10.46)
1
See problem 1 of Chapter 9.
2
Note that the coarse grid matrix denoted A
2
here was denoted A
c
in the preceding section.
10.3. A TWOGRID PROCESS 201
Here A
2
can be formed by applying the discretization on the coarse grid. In the
preceding example (eq. 10.40), A
2
=
1
4
B(M
c
: 1, −2, 1). It is at this stage that the
generalization to a multigrid procedure with more than two grids occurs. If this is the
coarsest grid in the sequence, solve exactly. Otherwise, apply the twogrid process
recursively.
4. Transfer (or prolong) the error back to the ﬁne grid and update the solution:
φ
n+1
=
φ
(1)
−I
1
2
e
(2)
(10.47)
In our example, the prolongation matrix is
I
1
2
=
1/2 0 0
1 0 0
1/2 1/2 0
0 1 0
0 1/2 1/2
0 0 1
0 0 1/2
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
(10.48)
which follows from Eq. 10.15.
Combining these steps, one obtains
φ
n+1
= [I −I
1
2
A
−1
2
R
2
1
A]G
n
1
1
φ
n
−[I −I
1
2
A
−1
2
R
2
1
A]G
n
1
1
A
−1
f +A
−1
f (10.49)
Thus the basic iteration matrix is
[I −I
1
2
A
−1
2
R
2
1
A]G
n
1
1
(10.50)
The eigenvalues of this matrix determine the convergence rate of the twogrid process.
The basic iteration matrix for a threegrid process is found from Eq. 10.50 by
replacing A
−1
2
with (I −G
2
3
)A
−1
2
, where
G
2
3
= [I −I
2
3
A
−1
3
R
3
2
A
2
]G
n
2
2
(10.51)
In this expression n
2
is the number of relaxation steps on grid 2, I
2
3
and R
3
2
are the
transfer operators between grids 2 and 3, and A
3
is obtained by discretizing on grid
3. Extension to four or more grids proceeds in similar fashion.
202 CHAPTER 10. MULTIGRID
10.4 Problems
1. Derive Eq. 10.51.
2. Repeat problem 4 of Chapter 9 using a fourgrid multigrid method together
with
(a) the GaussSeidel method,
(b) the 3step Richardson method derived in Section 9.5.
Solve exactly on the coarsest grid. Plot the solution after the residual is reduced
by 2, 3, and 4 orders of magnitude. Plot the logarithm of the L
2
norm of the
residual vs. the number of iterations. Determine the asymptotic convergence
rate. Calculate the theoretical asymptotic convergence rate and compare.
Chapter 11
NUMERICAL DISSIPATION
Up to this point, we have emphasized the secondorder centereddiﬀerence approxima
tions to the spatial derivatives in our model equations. We have seen that a centered
approximation to a ﬁrst derivative is nondissipative, i.e., the eigenvalues of the as
sociated circulant matrix (with periodic boundary conditions) are pure imaginary.
In processes governed by nonlinear equations, such as the Euler and NavierStokes
equations, there can be a continual production of highfrequency components of the
solution, leading, for example, to the production of shock waves. In a real phys
ical problem, the production of high frequencies is eventually limited by viscosity.
However, when we solve the Euler equations numerically, we have neglected viscous
eﬀects. Thus the numerical approximation must contain some inherent dissipation to
limit the production of highfrequency modes. Although numerical approximations
to the NavierStokes equations contain dissipation through the viscous terms, this
can be insuﬃcient, especially at high Reynolds numbers, due to the limited grid res
olution which is practical. Therefore, unless the relevant length scales are resolved,
some form of added numerical dissipation is required in the numerical solution of
the NavierStokes equations as well. Since the addition of numerical dissipation is
tantamount to intentionally introducing nonphysical behavior, it must be carefully
controlled such that the error introduced is not excessive. In this Chapter, we discuss
some diﬀerent ways of adding numerical dissipation to the spatial derivatives in the
linear convection equation and hyperbolic systems of PDE’s.
203
204 CHAPTER 11. NUMERICAL DISSIPATION
11.1 OneSided FirstDerivative Space Diﬀerenc
ing
We investigate the properties of onesided spatial diﬀerence operators in the context
of the biconvection model equation given by
∂u
∂t
= −a
∂u
∂x
(11.1)
with periodic boundary conditions. Consider the following point operator for the
spatial derivative term
−a(δ
x
u)
j
=
−a
2∆x
[−(1 +β)u
j−1
+ 2βu
j
+ (1 −β)u
j+1
]
=
−a
2∆x
[(−u
j−1
+u
j+1
) +β(−u
j−1
+ 2u
j
−u
j+1
)] (11.2)
The second formshown divides the operator into an antisymmetric component (−u
j−1
+
u
j+1
)/2∆x and a symmetric component β(−u
j−1
+ 2u
j
− u
j+1
)/2∆x. The antisym
metric component is the secondorder centered diﬀerence operator. With β = 0, the
operator is only ﬁrstorder accurate. A backward diﬀerence operator is given by β = 1
and a forward diﬀerence operator is given by β = −1.
For periodic boundary conditions the corresponding matrix operator is
−aδ
x
=
−a
2∆x
B
p
(−1 −β, 2β, 1 −β)
The eigenvalues of this matrix are
λ
m
=
−a
∆x
β
¸
1 −cos
2πm
M
+i sin
2πm
M
for m = 0, 1, . . . , M−1
If a is positive, the forward diﬀerence operator (β = −1) produces Re(λ
m
) > 0,
the centered diﬀerence operator (β = 0) produces Re(λ
m
) = 0, and the backward
diﬀerence operator produces Re(λ
m
) < 0. Hence the forward diﬀerence operator is
inherently unstable while the centered and backward operators are inherently stable.
If a is negative, the roles are reversed. When Re(λ
m
) = 0, the solution will either
grow or decay with time. In either case, our choice of diﬀerencing scheme produces
nonphysical behavior. We proceed next to show why this occurs.
11.2. THE MODIFIED PARTIAL DIFFERENTIAL EQUATION 205
11.2 The Modiﬁed Partial Diﬀerential Equation
First carry out a Taylor series expansion of the terms in Eq. 11.2. We are lead to the
expression
(δ
x
u)
j
=
1
2∆x
2∆x
∂u
∂x
j
−β∆x
2
∂
2
u
∂x
2
j
+
∆x
3
3
∂
3
u
∂x
3
j
−
β∆x
4
12
∂
4
u
∂x
4
j
+. . .
¸
¸
We see that the antisymmetric portion of the operator introduces odd derivative
terms in the truncation error while the symmetric portion introduces even derivatives.
Substituting this into Eq. 11.1 gives
∂u
∂t
= −a
∂u
∂x
+
aβ∆x
2
∂
2
u
∂x
2
−
a∆x
2
6
∂
3
u
∂x
3
+
aβ∆x
3
24
∂
4
u
∂x
4
+. . . (11.3)
This is the partial diﬀerential equation we are really solving when we apply the
approximation given by Eq. 11.2 to Eq. 11.1. Notice that Eq. 11.3 is consistent with
Eq. 11.1, since the two equations are identical when ∆x →0. However, when we use
a computer to ﬁnd a numerical solution of the problem, ∆x can be small but it is
not zero. This means that each term in the expansion given by Eq. 11.3 is excited to
some degree. We refer to Eq. 11.3 as the modiﬁed partial diﬀerential equation. We
proceed next to investigate the implications of this concept.
Consider the simple linear partial diﬀerential equation
∂u
∂t
= −a
∂u
∂x
+ν
∂
2
u
∂x
2
+γ
∂
3
u
∂x
3
+τ
∂
4
u
∂x
4
(11.4)
Choose periodic boundary conditions and impose an initial condition u = e
iκx
. Under
these conditions there is a wavelike solution to Eq. 11.4 of the form
u(x, t) = e
iκx
e
(r+is)t
provided r and s satisfy the condition
r +is = −iaκ −νκ
2
−iγκ
3
+τκ
4
or
r = −κ
2
(ν −τκ
2
), s = −κ(a +γκ
2
)
The solution is composed of both amplitude and phase terms. Thus
u = e
−κ
2
(ν−τκ
2
)
. .. .
amplitude
e
iκ[x−(a+γκ
2
)t]
. .. .
phase
(11.5)
206 CHAPTER 11. NUMERICAL DISSIPATION
It is important to notice that the amplitude of the solution depends only upon ν and
τ, the coeﬃcients of the even derivatives in Eq. 11.4, and the phase depends only on
a and γ, the coeﬃcients of the odd derivatives.
If the wave speed a is positive, the choice of a backward diﬀerence scheme (β = 1)
produces a modiﬁed PDE with ν −τκ
2
> 0 and hence the amplitude of the solution
decays. This is tantamount to deliberately adding dissipation to the PDE. Under the
same condition, the choice of a forward diﬀerence scheme (β = −1) is equivalent to
deliberately adding a destabilizing term to the PDE.
By examining the term governing the phase of the solution in Eq. 11.5, we see
that the speed of propagation is a + γκ
2
. Referring to the modiﬁed PDE, Eq. 11.3
we have γ = −a∆x
2
/6. Therefore, the phase speed of the numerical solution is less
than the actual phase speed. Furthermore, the numerical phase speed is dependent
upon the wavenumber κ. This we refer to as dispersion.
Our purpose here is to investigate the properties of onesided spatial diﬀerencing
operators relative to centered diﬀerence operators. We have seen that the three
point centered diﬀerence approximation of the spatial derivative produces a modiﬁed
PDE that has no dissipation (or ampliﬁcation). One can easily show, by using the
antisymmetry of the matrix diﬀerence operators, that the same is true for any cen
tered diﬀerence approximation of a ﬁrst derivative. As a corollary, any departure
from antisymmetry in the matrix diﬀerence operator must introduce dissipation (or
ampliﬁcation) into the modiﬁed PDE.
Note that the use of onesided diﬀerencing schemes is not the only way to in
troduce dissipation. Any symmetric component in the spatial operator introduces
dissipation (or ampliﬁcation). Therefore, one could choose β = 1/2 in Eq. 11.2. The
resulting spatial operator is not onesided but it is dissipative. Biased schemes use
more information on one side of the node than the other. For example, a thirdorder
backwardbiased scheme is given by
(δ
x
u)
j
=
1
6∆x
(u
j−2
−6u
j−1
+ 3u
j
+ 2u
j+1
)
=
1
12∆x
[(u
j−2
−8u
j−1
+ 8u
j+1
−u
j+2
)
+ (u
j−2
−4u
j−1
+ 6u
j
−4u
j+1
+u
j+2
)] (11.6)
The antisymmetric component of this operator is the fourthorder centered diﬀerence
operator. The symmetric component approximates ∆x
3
u
xxxx
/12. Therefore, this
operator produces fourthorder accuracy in phase with a thirdorder dissipative term.
11.3. THE LAXWENDROFF METHOD 207
11.3 The LaxWendroﬀ Method
In order to introduce numerical dissipation using onesided diﬀerencing, backward
diﬀerencing must be used if the wave speed is positive, and forward diﬀerencing must
be used if the wave speed is negative. Next we consider a method which introduces
dissipation independent of the sign of the wave speed, known as the LaxWendroﬀ
method. This explicit method diﬀers conceptually from the methods considered pre
viously in which spatial diﬀerencing and timemarching are treated separately.
Consider the following Taylorseries expansion in time:
u(x, t +h) = u +h
∂u
∂t
+
1
2
h
2
∂
2
u
∂t
2
+O(h
3
) (11.7)
First replace the time derivatives with space derivatives according to the PDE (in
this case, the linear convection equation
∂u
∂t
+a
∂u
∂x
= 0). Thus
∂u
∂t
= −a
∂u
∂x
,
∂
2
u
∂t
2
= a
2
∂
2
u
∂x
2
(11.8)
Now replace the space derivatives with threepoint centered diﬀerence operators, giv
ing
u
(n+1)
j
= u
(n)
j
−
1
2
ah
∆x
(u
(n)
j+1
−u
(n)
j−1
) +
1
2
ah
∆x
2
(u
(n)
j+1
−2u
(n)
j
+u
(n)
j−1
) (11.9)
This is the LaxWendroﬀ method applied to the linear convection equation. It is a
fullydiscrete ﬁnitediﬀerence scheme. There is no intermediate semidiscrete stage.
For periodic boundary conditions, the corresponding fullydiscrete matrix operator
is
u
n+1
= B
p
¸
1
2
ah
∆x
+
ah
∆x
2
¸
¸
, 1 −
ah
∆x
2
,
1
2
−
ah
∆x
+
ah
∆x
2
¸
¸
¸
u
n
The eigenvalues of this matrix are
σ
m
= 1 −
ah
∆x
2 ¸
1 −cos
2πm
M
−i
ah
∆x
sin
2πm
M
for m = 0, 1, . . . , M −1
For 
ah
∆x
 ≤ 1 all of the eigenvalues have modulus less than or equal to unity and hence
the method is stable independent of the sign of a. The quantity 
ah
∆x
 is known as the
Courant (or CFL) number. It is equal to the ratio of the distance travelled by a wave
in one time step to the mesh spacing.
208 CHAPTER 11. NUMERICAL DISSIPATION
The nature of the dissipative properties of the LaxWendroﬀ scheme can be seen
by examining the modiﬁed partial diﬀerential equation, which is given by
∂u
∂t
+a
∂u
∂x
= −
a
6
(∆x
2
−a
2
h
2
)
∂
3
u
∂x
3
−
a
2
h
8
(∆x
2
−a
2
h
2
)
∂
4
u
∂x
4
+. . .
This is derived by substituting Taylor series expansions for all terms in Eq. 11.9 and
converting the time derivatives to space derivatives using Eq. 11.8. The two leading
error terms appear on the right side of the equation. Recall that the odd derivatives on
the right side lead to unwanted dispersion and the even derivatives lead to dissipation
(or ampliﬁcation, depending on the sign). Therefore, the leading error term in the
LaxWendroﬀ method is dispersive and proportional to
−
a
6
(∆x
2
−a
2
h
2
)
∂
3
u
∂x
3
= −
a∆x
2
6
(1 −C
2
n
)
∂
3
u
∂x
3
The dissipative term is proportional to
−
a
2
h
8
(∆x
2
−a
2
h
2
)
∂
4
u
∂x
4
= −
a
2
h∆x
2
8
(1 −C
2
n
)
∂
4
u
∂x
4
This term has the appropriate sign and hence the scheme is truly dissipative as long
as C
n
≤ 1.
A closely related method is that of MacCormack. Recall MacCormack’s time
marching method, presented in Chapter 6:
˜ u
n+1
= u
n
+hu
n
u
n+1
=
1
2
[u
n
+ ˜ u
n+1
+h˜ u
n+1
] (11.10)
If we use ﬁrstorder backward diﬀerencing in the ﬁrst stage and ﬁrstorder forward
diﬀerencing in the second stage,
1
a dissipative secondorder method is obtained. For
the linear convection equation, this approach leads to
˜ u
(n+1)
j
= u
(n)
j
−
ah
∆x
(u
(n)
j
−u
(n)
j−1
)
u
(n+1)
j
=
1
2
[u
(n)
j
+ ˜ u
(n+1)
j
−
ah
∆x
(˜ u
(n+1)
j+1
− ˜ u
(n+1)
j
)] (11.11)
which can be shown to be identical to the LaxWendroﬀ method. Hence MacCor
mack’s method has the same dissipative and dispersive properties as the LaxWendroﬀ
method. The two methods diﬀer when applied to nonlinear hyperbolic systems, how
ever.
1
Or viceversa; for nonlinear problems, these should be applied alternately.
11.4. UPWIND SCHEMES 209
11.4 Upwind Schemes
In Section 11.1, we saw that numerical dissipation can be introduced in the spatial
diﬀerence operator using onesided diﬀerence schemes or, more generally, by adding
a symmetric component to the spatial operator. With this approach, the direction
of the onesided operator (i.e., whether it is a forward or a backward diﬀerence)
or the sign of the symmetric component depends on the sign of the wave speed.
When a hyperbolic system of equations is being solved, the wave speeds can be both
positive and negative. For example, the eigenvalues of the ﬂux Jacobian for the one
dimensional Euler equations are u, u +a, u −a. When the ﬂow is subsonic, these are
of mixed sign. In order to apply onesided diﬀerencing schemes to such systems, some
form of splitting is required. This is avoided in the LaxWendroﬀ scheme. However,
as a result of their superior ﬂexibility, schemes in which the numerical dissipation
is introduced in the spatial operator are generally preferred over the LaxWendroﬀ
approach.
Consider again the linear convection equation:
∂u
∂t
+a
∂u
∂x
= 0 (11.12)
where we do not make any assumptions as to the sign of a. We can rewrite Eq. 11.12
as
∂u
∂t
+ (a
+
+a
−
)
∂u
∂x
= 0 ; a
±
=
a ±a
2
If a ≥ 0, then a
+
= a ≥ 0 and a
−
= 0. Alternatively, if a ≤ 0, then a
+
= 0 and
a
−
= a ≤ 0. Now for the a
+
(≥ 0) term we can safely backward diﬀerence and for the
a
−
(≤ 0) term forward diﬀerence. This is the basic concept behind upwind methods,
that is, some decomposition or splitting of the ﬂuxes into terms which have positive
and negative characteristic speeds so that appropriate diﬀerencing schemes can be
chosen. In the next two sections, we present two splitting techniques commonly used
with upwind methods. These are by no means unique.
The above approach to obtaining a stable discretization independent of the sign
of a can be written in a diﬀerent, but entirely equivalent, manner. From Eq. 11.2, we
see that a stable discretization is obtained with β = 1 if a ≥ 0 and with β = −1 if
a ≤ 0. This is achieved by the following point operator:
−a(δ
x
u)
j
=
−1
2∆x
[a(−u
j−1
+u
j+1
) +a(−u
j−1
+ 2u
j
−u
j+1
)] (11.13)
This approach is extended to systems of equations in Section 11.5.
In this section, we present the basic ideas of ﬂuxvector and ﬂuxdiﬀerence splitting.
For more subtle aspects of implementation and application of such techniques to
210 CHAPTER 11. NUMERICAL DISSIPATION
nonlinear hyperbolic systems such as the Euler equations, the reader is referred to
the literature on this subject.
11.4.1 FluxVector Splitting
Recall from Section 2.5 that a linear, constantcoeﬃcient, hyperbolic system of partial
diﬀerential equations given by
∂u
∂t
+
∂f
∂x
=
∂u
∂t
+A
∂u
∂x
= 0 (11.14)
can be decoupled into characteristic equations of the form
∂w
i
∂t
+λ
i
∂w
i
∂x
= 0 (11.15)
where the wave speeds, λ
i
, are the eigenvalues of the Jacobian matrix, A, and the
w
i
’s are the characteristic variables. In order to apply a onesided (or biased) spatial
diﬀerencing scheme, we need to apply a backward diﬀerence if the wave speed, λ
i
, is
positive, and a forward diﬀerence if the wave speed is negative. To accomplish this,
let us split the matrix of eigenvalues, Λ, into two components such that
Λ = Λ
+
+ Λ
−
(11.16)
where
Λ
+
=
Λ +Λ
2
, Λ
−
=
Λ −Λ
2
(11.17)
With these deﬁnitions, Λ
+
contains the positive eigenvalues and Λ
−
contains the neg
ative eigenvalues. We can now rewrite the system in terms of characteristic variables
as
∂w
∂t
+ Λ
∂w
∂x
=
∂w
∂t
+ Λ
+
∂w
∂x
+ Λ
−
∂w
∂x
= 0 (11.18)
The spatial terms have been split into two components according to the sign of the
wave speeds. We can use backward diﬀerencing for the Λ
+∂w
∂x
term and forward
diﬀerencing for the Λ
−∂w
∂x
term. Premultiplying by X and inserting the product
X
−1
X in the spatial terms gives
∂Xw
∂t
+
∂XΛ
+
X
−1
Xw
∂x
+
∂XΛ
−
X
−1
Xw
∂x
= 0 (11.19)
11.4. UPWIND SCHEMES 211
With the deﬁnitions
2
A
+
= XΛ
+
X
−1
, A
−
= XΛ
−
X
−1
(11.20)
and recalling that u = Xw, we obtain
∂u
∂t
+
∂A
+
u
∂x
+
∂A
−
u
∂x
= 0 (11.21)
Finally the split ﬂux vectors are deﬁned as
f
+
= A
+
u, f
−
= A
−
u (11.22)
and we can write
∂u
∂t
+
∂f
+
∂x
+
∂f
−
∂x
= 0 (11.23)
In the linear case, the deﬁnition of the split ﬂuxes follows directly from the deﬁni
tion of the ﬂux, f = Au. For the Euler equations, f is also equal to Au as a result of
their homogeneous property, as discussed in Appendix C. Note that
f = f
+
+f
−
(11.24)
Thus by applying backward diﬀerences to the f
+
term and forward diﬀerences to the
f
−
term, we are in eﬀect solving the characteristic equations in the desired manner.
This approach is known as ﬂuxvector splitting.
When an implicit timemarching method is used, the Jacobians of the split ﬂux
vectors are required. In the nonlinear case,
∂f
+
∂u
= A
+
,
∂f
−
∂u
= A
−
(11.25)
Therefore, one must ﬁnd and use the new Jacobians given by
A
++
=
∂f
+
∂u
, A
−−
=
∂f
−
∂u
(11.26)
For the Euler equations, A
++
has eigenvalues which are all positive, and A
−−
has all
negative eigenvalues.
2
With these deﬁnitions A
+
has all positive eigenvalues, and A
−
has all negative eigenvalues.
212 CHAPTER 11. NUMERICAL DISSIPATION
11.4.2 FluxDiﬀerence Splitting
Another approach, more suited to ﬁnitevolume methods, is known as ﬂuxdiﬀerence
splitting. In a ﬁnitevolume method, the ﬂuxes must be evaluated at cell bound
aries. We again begin with the diagonalized form of the linear, constantcoeﬃcient,
hyperbolic system of equations
∂w
∂t
+ Λ
∂w
∂x
= 0 (11.27)
The ﬂux vector associated with this form is g = Λw. Now, as in Chapter 5, we
consider the numerical ﬂux at the interface between nodes j and j + 1, ˆ g
j+1/2
, as a
function of the states to the left and right of the interface, w
L
and w
R
, respectively.
The centered approximation to g
j+1/2
, which is nondissipative, is given by
ˆ g
j+1/2
=
1
2
(g(w
L
) +g(w
R
)) (11.28)
In order to obtain a onesided upwind approximation, we require
(ˆ g
i
)
j+1/2
=
λ
i
(w
i
)
L
if λ
i
> 0
λ
i
(w
i
)
R
if λ
i
< 0
(11.29)
where the subscript i indicates individual components of w and g. This is achieved
with
(ˆ g
i
)
j+1/2
=
1
2
λ
i
[(w
i
)
L
+ (w
i
)
R
] +
1
2
λ
i
 [(w
i
)
L
−(w
i
)
R
] (11.30)
or
ˆ g
j+1/2
=
1
2
Λ(w
L
+w
R
) +
1
2
Λ (w
L
−w
R
) (11.31)
Now, as in Eq. 11.19, we premultiply by X to return to the original variables and
insert the product X
−1
X after Λ and Λ to obtain
Xˆ g
j+1/2
=
1
2
XΛX
−1
X (w
L
+w
R
) +
1
2
XΛX
−1
X (w
L
−w
R
) (11.32)
and thus
ˆ
f
j+1/2
=
1
2
(f
L
+f
R
) +
1
2
A (u
L
−u
R
) (11.33)
where
A = XΛX
−1
(11.34)
11.5. ARTIFICIAL DISSIPATION 213
and we have also used the relations f = Xg, u = Xw, and A = XΛX
−1
.
In the linear, constantcoeﬃcient case, this leads to an upwind operator which is
identical to that obtained using ﬂuxvector splitting. However, in the nonlinear case,
there is some ambiguity regarding the deﬁnition of A at the cell interface j + 1/2.
In order to resolve this, consider a situation in which the eigenvalues of A are all of
the same sign. In this case, we would like our deﬁnition of
ˆ
f
j+1/2
to satisfy
ˆ
f
j+1/2
=
f
L
if all λ
i
s > 0
f
R
if all λ
i
s < 0
(11.35)
giving pure upwinding. If the eigenvalues of A are all positive, A = A; if they are
all negative, A = −A. Hence satisfaction of Eq. 11.35 is obtained by the deﬁnition
ˆ
f
j+1/2
=
1
2
(f
L
+f
R
) +
1
2
A
j+1/2
 (u
L
−u
R
) (11.36)
if A
j+1/2
satisﬁes
f
L
−f
R
= A
j+1/2
(u
L
−u
R
) (11.37)
For the Euler equations for a perfect gas, Eq. 11.37 is satisﬁed by the ﬂux Jacobian
evaluated at the Roeaverage state given by
u
j+1/2
=
√
ρ
L
u
L
+
√
ρ
R
u
R
√
ρ
L
+
√
ρ
R
(11.38)
H
j+1/2
=
√
ρ
L
H
L
+
√
ρ
R
H
R
√
ρ
L
+
√
ρ
R
(11.39)
where u and H = (e + p)/ρ are the velocity and the total enthalpy per unit mass,
respectively.
3
11.5 Artiﬁcial Dissipation
We have seen that numerical dissipation can be introduced by using onesided dif
ferencing schemes together with some form of ﬂux splitting. We have also seen that
such dissipation can be introduced by adding a symmetric component to an antisym
metric (dissipationfree) operator. Thus we can generalize the concept of upwinding
to include any scheme in which the symmetric portion of the operator is treated in
such a manner as to be truly dissipative.
3
Note that the ﬂux Jacobian can be written in terms of u and H only; see problem 6 at the end
of this chapter.
214 CHAPTER 11. NUMERICAL DISSIPATION
For example, let
(δ
a
x
u)
j
=
u
j+1
−u
j−1
2∆x
, (δ
s
x
u)
j
=
−u
j+1
+ 2u
j
−u
j−1
2∆x
(11.40)
Applying δ
x
= δ
a
x
+ δ
s
x
to the spatial derivative in Eq. 11.15 is stable if λ
i
≥ 0 and
unstable if λ
i
< 0. Similarly, applying δ
x
= δ
a
x
−δ
s
x
is stable if λ
i
≤ 0 and unstable if
λ
i
> 0. The appropriate implementation is thus
λ
i
δ
x
= λ
i
δ
a
x
+λ
i
δ
s
x
(11.41)
Extension to a hyperbolic system by applying the above approach to the characteristic
variables, as in the previous two sections, gives
δ
x
(Au) = δ
a
x
(Au) +δ
s
x
(Au) (11.42)
or
δ
x
f = δ
a
x
f +δ
s
x
(Au) (11.43)
where A is deﬁned in Eq. 11.34. The second spatial term is known as artiﬁcial
dissipation. It is also sometimes referred to as artiﬁcial diﬀusion or artiﬁcial viscosity.
With appropriate choices of δ
a
x
and δ
s
x
, this approach can be related to the upwind
approach. This is particularly evident from a comparison of Eqs. 11.36 and 11.43.
It is common to use the following operator for δ
s
x
(δ
s
x
u)
j
=
∆x
(u
j−2
−4u
j−1
+ 6u
j
−4u
j+1
+u
j+2
) (11.44)
where is a problemdependent coeﬃcient. This symmetric operator approximates
∆x
3
u
xxxx
and thus introduces a thirdorder dissipative term. With an appropriate
value of , this often provides suﬃcient damping of high frequency modes without
greatly aﬀecting the low frequency modes. For details of how this can be implemented
for nonlinear hyperbolic systems, the reader should consult the literature. A more
complicated treatment of the numerical dissipation is also required near shock waves
and other discontinuities, but is beyond the scope of this book.
11.6 Problems
1. A secondorder backward diﬀerence approximation to a 1st derivative is given
as a point operator by
(δ
x
u)
j
=
1
2∆x
(u
j−2
−4u
j−1
+ 3u
j
)
11.6. PROBLEMS 215
(a) Express this operator in banded matrix form (for periodic boundary condi
tions), then derive the symmetric and skewsymmetric matrices that have
the matrix operator as their sum. (See Appendix A.3 to see how to con
struct the symmetric and skewsymmetric components of a matrix.)
(b) Using a Taylor table, ﬁnd the derivative which is approximated by the
corresponding symmetric and skewsymmetric operators and the leading
error term for each.
2. Find the modiﬁed wavenumber for the ﬁrstorder backward diﬀerence operator.
Plot the real and imaginary parts of κ
∗
∆x vs. κ∆x for 0 ≤ κ∆x ≤ π. Using
Fourier analysis as in Section 6.6.2, ﬁnd σ for the combination of this spatial
operator with 4thorder RungeKutta time marching at a Courant number of
unity and plot vs. κ∆x for 0 ≤ κ∆x ≤ π.
3. Find the modiﬁed wavenumber for the operator given in Eq. 11.6. Plot the real
and imaginary parts of κ
∗
∆x vs. κ∆x for 0 ≤ κ∆x ≤ π. Using Fourier analysis
as in Section 6.6.2, ﬁnd σ for the combination of this spatial operator with
4thorder RungeKutta time marching at a Courant number of unity and plot
vs. κ∆x for 0 ≤ κ∆x ≤ π.
4. Consider the spatial operator obtained by combining secondorder centered dif
ferences with the symmetric operator given in Eq. 11.44. Find the modiﬁed
wavenumber for this operator with = 0, 1/12, 1/24, and 1/48. Plot the real
and imaginary parts of κ
∗
∆x vs. κ∆x for 0 ≤ κ∆x ≤ π. Using Fourier analysis
as in Section 6.6.2, ﬁnd σ for the combination of this spatial operator with
4thorder RungeKutta time marching at a Courant number of unity and plot
vs. κ∆x for 0 ≤ κ∆x ≤ π.
5. Consider the hyperbolic system derived in problem 8 of Chapter 2. Find the
matrix A. Form the plusminus split ﬂux vectors as in Section 11.4.1.
6. Show that the ﬂux Jacobian for the 1D Euler equations can be written in terms
of u and H. Show that the use of the Roe average state given in Eqs. 11.38 and
11.39 leads to satisfaction of Eq. 11.37.
216 CHAPTER 11. NUMERICAL DISSIPATION
Chapter 12
SPLIT AND FACTORED FORMS
In the next two chapters, we present and analyze split and factored algorithms. This
gives the reader a feel for some of the modiﬁcations which can be made to the basic
algorithms in order to obtain eﬃcient solvers for practical multidimensional applica
tions, and a means for analyzing such modiﬁed forms.
12.1 The Concept
Factored forms of numerical operators are used extensively in constructing and ap
plying numerical methods to problems in ﬂuid mechanics. They are the basis for a
wide variety of methods variously known by the labels “hybrid”, “time split”, and
“fractional step”. Factored forms are especially useful for the derivation of practical
algorithms that use implicit methods. When we approach numerical analysis in the
light of matrix derivative operators, the concept of factoring is quite simple to present
and grasp. Let us start with the following observations:
1. Matrices can be split in quite arbitrary ways.
2. Advancing to the next time level always requires some reference to a previous
one.
3. Time marching methods are valid only to some order of accuracy in the step
size, h.
Now recall the generic ODE’s produced by the semidiscrete approach
d
u
dt
= A
u −
f (12.1)
217
218 CHAPTER 12. SPLIT AND FACTORED FORMS
and consider the above observations. From observation 1 (arbitrary splitting of A) :
d
u
dt
= [A
1
+ A
2
]
u −
f (12.2)
where A = [A
1
+ A
2
] but A
1
and A
2
are not unique. For the time march let us choose
the simple, ﬁrstorder,
1
explicit Euler method. Then, from observation 2 (new data
u
n+1
in terms of old
u
n
):
u
n+1
= [ I +hA
1
+hA
2
]
u
n
−h
f +O(h
2
) (12.3)
or its equivalent
u
n+1
=
[ I +hA
1
][ I +hA
2
] −h
2
A
1
A
2
u
n
−h
f +O(h
2
)
Finally, from observation 3 (allowing us to drop higher order terms −h
2
A
1
A
2
u
n
):
u
n+1
= [ I +hA
1
][ I +hA
2
]
u
n
−h
f +O(h
2
) (12.4)
Notice that Eqs. 12.3 and 12.4 have the same formal order of accuracy and, in
this sense, neither one is to be preferred over the other. However, their numerical
stability can be quite diﬀerent, and techniques to carry out their numerical evaluation
can have arithmetic operation counts that vary by orders of magnitude. Both of these
considerations are investigated later. Here we seek only to apply to some simple cases
the concept of factoring.
12.2 Factoring Physical Representations — Time
Splitting
Suppose we have a PDE that represents both the processes of convection and dissi
pation. The semidiscrete approach to its solution might be put in the form
d
u
dt
= A
c
u + A
d
u +
(bc) (12.5)
where A
c
and A
d
are matrices representing the convection and dissipation terms,
respectively; and their sum forms the A matrix we have considered in the previous
sections. Choose again the explicit Euler time march so that
u
n+1
= [ I +hA
d
+hA
c
]
u
n
+h
(bc) +O(h
2
) (12.6)
1
Secondorder timemarching methods are considered later.
12.2. FACTORING PHYSICAL REPRESENTATIONS — TIME SPLITTING 219
Now consider the factored form
u
n+1
= [ I +hA
d
]
[ I +hA
c
]
u
n
+h
(bc)
= [ I +hA
d
+hA
c
]
u
n
+h
(bc)
. .. .
Original Unfactored Terms
+ h
2
A
d
A
c
u
n
+
(bc)
. .. .
Higher Order Terms
+O(h
2
) (12.7)
and we see that Eq. 12.7 and the original unfactored form Eq. 12.6 have identical
orders of accuracy in the time approximation. Therefore, on this basis, their selection
is arbitrary. In practical applications
2
equations such as 12.7 are often applied in a
predictorcorrector sequence. In this case one could write
˜ u
n+1
= [ I +hA
c
]
u
n
+h
(bc)
u
n+1
= [ I +hA
d
]˜ u
n+1
(12.8)
Factoring can also be useful to form split combinations of implicit and explicit
techniques. For example, another way to approximate Eq. 12.6 with the same order
of accuracy is given by the expression
u
n+1
= [ I −hA
d
]
−1
[ I +hA
c
]
u
n
+h
(bc)
= [ I +hA
d
+hA
c
]
u
n
+h
(bc)
. .. .
Original Unfactored Terms
+O(h
2
) (12.9)
where in this approximation we have used the fact that
[ I −hA
d
]
−1
= I +hA
d
+h
2
A
2
d
+· · ·
if h · A
d
 < 1, where A
d
 is some norm of [A
d
]. This time a predictorcorrector
interpretation leads to the sequence
˜ u
n+1
= [ I +hA
c
]
u
n
+h
(bc)
[ I −hA
d
]
u
n+1
= ˜ u
n+1
(12.10)
The convection operator is applied explicitly, as before, but the diﬀusion operator is
now implicit, requiring a tridiagonal solver if the diﬀusion term is central diﬀerenced.
Since numerical stiﬀness is generally much more severe for the diﬀusion process, this
factored form would appear to be superior to that provided by Eq. 12.8. However,
the important aspect of stability has yet to be discussed.
2
We do not suggest that this particular method is suitable for use. We have yet to determine its
stability, and a ﬁrstorder timemarch method is usually unsatisfactory.
220 CHAPTER 12. SPLIT AND FACTORED FORMS
We should mention here that Eq. 12.9 can be derived for a diﬀerent point of view
by writing Eq. 12.6 in the form
u
n+1
−u
n
h
= A
c
u
n
+ A
d
u
n+1
+
(bc) +O(h
2
)
Then
[ I −hA
d
]u
n+1
= [ I +hA
c
]u
n
+h
(bc)
which is identical to Eq. 12.10.
12.3 Factoring Space Matrix Operators in 2–D
12.3.1 Mesh Indexing Convention
Factoring is widely used in codes designed for the numerical solution of equations
governing unsteady two and threedimensional ﬂows. Let us study the basic concept
of factoring by inspecting its use on the linear 2D scalar PDE that models diﬀusion:
∂u
∂t
=
∂
2
u
∂x
2
+
∂
2
u
∂y
2
(12.11)
We begin by reducing this PDE to a coupled set of ODE’s by diﬀerencing the space
derivatives and inspecting the resulting matrix operator.
A clear description of a matrix ﬁnitediﬀerence operator in 2 and 3Drequires some
reference to a mesh. We choose the 3 × 4 point mesh
3
shown in the Sketch 12.12.
In this example M
x
, the number of (interior) x points, is 4 and M
y
, the number of
(interior) y points is 3. The numbers 11 , 12 , · · · , 43 represent the location in the
mesh of the dependent variable bearing that index. Thus u
32
represents the value of
u at j = 3 and k = 2.
M
y
13 23 33 43
k 12 22 32 42
1 11 21 31 41
1 j · · · M
x
Mesh indexing in 2D.
(12.12)
3
This could also be called a 5 × 6 point mesh if the boundary points (labeled in the sketch)
were included, but in these notes we describe the size of a mesh by the number of interior points.
12.3. FACTORING SPACE MATRIX OPERATORS IN 2–D 221
12.3.2 Data Bases and Space Vectors
The dimensioned array in a computer code that allots the storage locations of the
dependent variable(s) is referred to as a database. There are many ways to lay out
a database. Of these, we consider only two: (1), consecutively along rows that are
themselves consecutive from k = 1 to M
y
, and (2), consecutively along columns that
are consecutive from j = 1 to M
x
. We refer to each row or column group as a
space vector (they represent data along lines that are continuous in space) and label
their sum with the symbol U. In particular, (1) and (2) above are referred to as x
vectors and yvectors, respectively. The symbol U by itself is not enough to identify
the structure of the database and is used only when the structure is immaterial or
understood.
To be speciﬁc about the structure, we label a data–base composed of xvectors with
U
(x)
, and one composed of yvectors with U
(y)
. Examples of the order of indexing
for these space vectors are given in Eq. 12.16 part a and b.
12.3.3 Data Base Permutations
The two vectors (arrays) are related by a permutation matrix P such that
U
(x)
= P
xy
U
(y)
and U
(y)
= P
yx
U
(x)
(12.13)
where
P
yx
= P
T
xy
= P
−1
xy
Now consider the structure of a matrix ﬁnitediﬀerence operator representing 3
point centraldiﬀerencing schemes for both space derivatives in two dimensions. When
the matrix is multiplying a space vector U, the usual (but ambiguous) representation
is given by A
x+y
. In this notation the ODE form of Eq. 12.11 can be written
4
dU
dt
= A
x+y
U +
(bc) (12.14)
If it is important to be speciﬁc about the database structure, we use the notation
A
(x)
x+y
or A
(y)
x+y
, depending on the data–base chosen for the U it multiplies. Examples
are in Eq. 12.16 part a and b. Notice that the matrices are not the same although
they represent the same derivative operation. Their structures are similar, however,
and they are related by the same permutation matrix that relates U
(x)
to U
(y)
. Thus
A
(x)
x+y
= P
xy
· A
(y)
x+y
· P
yx
(12.15)
4
Notice that A
x+y
and U, which are notations used in the special case of space vectors, are
subsets of A and
u, used in the previous sections.
222 CHAPTER 12. SPLIT AND FACTORED FORMS
A
(x)
x+y
· U
(x)
=
• x  o 
x • x  o 
x • x  o 
x •  o 
o  • x  o
o  x • x  o
o  x • x  o
o  x •  o
 o  • x
 o  x • x
 o  x • x
 o  x •
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
·
11
21
31
41
−−
12
22
32
42
−−
13
23
33
43
a:Elements in 2dimensional, centraldiﬀerence, matrix
operator, A
x+y
, for 3×4 mesh shown in Sketch 12.12.
Data base composed of M
y
x–vectors stored in U
(x)
.
Entries for x →x, for y →o, for both →•.
A
(y)
x+y
· U
(y)
=
• o  x  
o • o  x  
o •  x  
x  • o  x 
x  o • o  x 
x  o •  x 
 x  • o  x
 x  o • o  x
 x  o •  x
  x  • o
  x  o • o
  x  o •
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
·
11
12
13
−−
21
22
23
−−
31
32
33
−−
41
42
43
b: Elements in 2dimensional, centraldiﬀerence, matrix
operator, A
x+y
, for 3×4 mesh shown in Sketch 12.12.
Data base composed of M
x
y–vectors stored in U
(y)
.
Entries for x →x, for y →o, for both →•.
(12.16)
12.3. FACTORING SPACE MATRIX OPERATORS IN 2–D 223
12.3.4 Space Splitting and Factoring
We are now prepared to discuss splitting in two dimensions. It should be clear that
the matrix A
(x)
x+y
can be split into two matrices such that
A
(x)
x+y
= A
(x)
x
+A
(x)
y
(12.17)
where A
(x)
x
and A
(x)
y
are shown in Eq. 12.22. Similarily
A
(y)
x+y
= A
(y)
x
+A
(y)
y
(12.18)
where the split matrices are shown in Eq. 12.23.
The permutation relation also holds for the split matrices so
A
(x)
y
= P
xy
A
(y)
y
P
yx
and
A
(x)
x
= P
xy
A
(y)
x
P
yx
The splittings in Eqs. 12.17 and 12.18 can be combined with factoring in the
manner described in Section 12.2. As an example (ﬁrstorder in time), applying the
implicit Euler method to Eq. 12.14 gives
U
(x)
n+1
= U
(x)
n
+h
A
(x)
x
+A
(x)
y
U
(x)
n+1
+h
(bc)
or
I −hA
(x)
x
−hA
(x)
y
U
(x)
n+1
= U
(x)
n
+h
(bc) +O(h
2
) (12.19)
As in Section 12.2, we retain the same ﬁrst order accuracy with the alternative
I −hA
(x)
x
I −hA
(x)
y
U
(x)
n+1
= U
(x)
n
+h
(bc) +O(h
2
) (12.20)
Write this in predictorcorrector form and permute the data base of the second row.
There results
I −hA
(x)
x
˜
U
(x)
= U
(x)
n
+h
(bc)
I −hA
(y)
y
U
(y)
n+1
=
˜
U
(y)
(12.21)
224 CHAPTER 12. SPLIT AND FACTORED FORMS
A
(x)
x
· U
(x)
=
x x  
x x x  
x x x  
x x  
 x x 
 x x x 
 x x x 
 x x 
  x x
  x x x
  x x x
  x x
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
· U
(x)
A
(x)
y
· U
(x)
=
o  o 
o  o 
o  o 
o  o 
o  o  o
o  o  o
o  o  o
o  o  o
 o  o
 o  o
 o  o
 o  o
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
· U
(x)
The splitting of A
(x)
x+y
.
(12.22)
12.3. FACTORING SPACE MATRIX OPERATORS IN 2–D 225
A
(y)
x
· U
(y)
=
x  x  
x  x  
x  x  
x  x  x 
x  x  x 
x  x  x 
 x  x  x
 x  x  x
 x  x  x
  x  x
  x  x
  x  x
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
· U
(y)
A
(y)
y
· U
(y)
=
o o   
o o o   
o o   
 o o  
 o o o  
 o o  
  o o 
  o o o 
  o o 
   o o
   o o o
   o o
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
· U
(y)
The splitting of A
(y)
x+y
.
(12.23)
226 CHAPTER 12. SPLIT AND FACTORED FORMS
12.4 SecondOrder Factored Implicit Methods
Secondorder accuracy in time can be maintained in a certain factored implicit meth
ods. For example, apply the trapezoidal method to Eq. 12.14 where the derivative
operators have been split as in Eq. 12.17 or 12.18. Let the data base be immaterial
and the
(bc) be time invariant. There results
¸
I −
1
2
hA
x
−
1
2
hA
y
U
n+1
=
¸
I +
1
2
hA
x
+
1
2
hA
y
U
n
+h
(bc) +O(h
3
) (12.24)
Factor both sides giving
¸ ¸
I −
1
2
hA
x
¸
I −
1
2
hA
y
−
1
4
h
2
A
x
A
y
U
n+1
=
¸ ¸
I +
1
2
hA
x
¸
I +
1
2
hA
y
−
1
4
h
2
A
x
A
y
U
n
+h
(bc) +O(h
3
) (12.25)
Then notice that the combination
1
4
h
2
[A
x
A
y
](U
n+1
−U
n
) is proportional to h
3
since
the leading term in the expansion of (U
n+1
−U
n
) is proportional to h. Therefore, we
can write
¸
I −
1
2
hA
x
¸
I −
1
2
hA
y
U
n+1
=
¸
I +
1
2
hA
x
¸
I +
1
2
hA
y
U
n
+h
(bc) +O(h
3
)(12.26)
and both the factored and unfactored form of the trapezoidal method are secondorder
accurate in the time march.
An alternative form of this kind of factorization is the classical ADI (alternating
direction implicit) method
5
usually written
¸
I −
1
2
hA
x
˜
U =
¸
I +
1
2
hA
y
U
n
+
1
2
hF
n
¸
I −
1
2
hA
y
U
n+1
=
¸
I +
1
2
hA
x
˜
U +
1
2
hF
n+1
+O(h
3
) (12.27)
For idealized commuting systems the methods given by Eqs. 12.26 and 12.27 diﬀer
only in their evaluation of a timedependent forcing term.
12.5 Importance of Factored Forms in 2 and 3 Di
mensions
When the timemarch equations are stiﬀ and implicit methods are required to permit
reasonably large time steps, the use of factored forms becomes a very valuable tool
5
A form of the Douglas or PeacemanRachford methods.
12.5. IMPORTANCE OF FACTORED FORMS IN 2 AND 3 DIMENSIONS 227
for realistic problems. Consider, for example, the problem of computing the time
advance in the unfactored form of the trapezoidal method given by Eq. 12.24
¸
I −
1
2
hA
x+y
U
n+1
=
¸
I +
1
2
hA
x+y
U
n
+h
(bc)
Forming the right hand side poses no problem, but ﬁnding U
n+1
requires the solution
of a sparse, but very large, set of coupled simultaneous equations having the matrix
form shown in Eq. 12.16 part a and b. Furthermore, in real cases involving the Euler
or NavierStokes equations, each symbol (o, x, •) represents a 4 × 4 block matrix with
entries that depend on the pressure, density and velocity ﬁeld. Suppose we were to
solve the equations directly. The forward sweep of a simple Gaussian elimination ﬁlls
6
all of the 4 × 4 blocks between the main and outermost diagonal
7
(e.g. between •
and o in Eq. 12.16 part b.). This must be stored in computer memory to be used to
ﬁnd the ﬁnal solution in the backward sweep. If N
e
represents the order of the small
block matrix (4 in the 2D Euler case), the approximate memory requirement is
(N
e
×M
y
) · (N
e
×M
y
) · M
x
ﬂoating point words. Here it is assumed that M
y
< M
x
. If M
y
> M
x
, M
y
and M
x
would be interchanged. A moderate mesh of 60 × 200 points would require over 11
million words to ﬁnd the solution. Actually current computer power is able to cope
rather easily with storage requirements of this order of magnitude. With computing
speeds of over one gigaﬂop,
8
direct solvers may become useful for ﬁnding steadystate
solutions of practical problems in two dimensions. However, a threedimensional
solver would require a memory of approximatly
N
2
e
· M
2
y
· M
2
z
· M
x
words and, for well resolved ﬂow ﬁelds, this probably exceeds memory availability for
some time to come.
On the other hand, consider computing a solution using the factored implicit equa
tion 12.25. Again computing the right hand side poses no problem. Accumulate the
result of such a computation in the array (RHS). One can then write the remaining
terms in the twostep predictorcorrector form
¸
I −
1
2
hA
(x)
x
˜
U
(x)
= (RHS)
(x)
¸
I −
1
2
hA
(y)
y
U
(y)
n+1
=
˜
U
(y)
(12.28)
6
For matrices as small as those shown there are many gaps in this “ﬁll”, but for meshes of
practical size the ﬁll is mostly dense.
7
The lower band is also computed but does not have to be saved unless the solution is to be
repeated for another vector.
8
One billion ﬂoatingpoint operations per second.
228 CHAPTER 12. SPLIT AND FACTORED FORMS
which has the same appearance as Eq. 12.21 but is secondorder time accurate. The
ﬁrst step would be solved using M
y
uncoupled block tridiagonal solvers
9
. Inspecting
the top of Eq. 12.22, we see that this is equivalent to solving M
y
onedimensional
problems, each with M
x
blocks of order N
e
. The temporary solution
˜
U
(x)
would then
be permuted to
˜
U
(y)
and an inspection of the bottom of Eq. 12.23 shows that the ﬁnal
step consists of solving M
x
onedimensional implicit problems each with dimension
M
y
.
12.6 The Delta Form
Clearly many ways can be devised to split the matrices and generate factored forms.
One way that is especially useful, for ensuring a correct steadystate solution in a
converged timemarch, is referred to as the “delta form” and we develop it next.
Consider the unfactored form of the trapezoidal method given by Eq. 12.24, and
let the
(bc) be time invariant:
¸
I −
1
2
hA
x
−
1
2
hA
y
U
n+1
=
¸
I +
1
2
hA
x
+
1
2
hA
y
U
n
+h
(bc) +O(h
3
)
From both sides subtract
¸
I −
1
2
hA
x
−
1
2
hA
y
U
n
leaving the equality unchanged. Then, using the standard deﬁnition of the diﬀerence
operator ∆,
∆U
n
= U
n+1
−U
n
one ﬁnds
¸
I −
1
2
hA
x
−
1
2
hA
y
∆U
n
= h
A
x+y
U
n
+
(bc)
+O(h
3
) (12.29)
Notice that the right side of this equation is the product of h and a term that is
identical to the right side of Eq. 12.14, our original ODE. Thus, if Eq. 12.29 converges,
it is guaranteed to converge to the correct steadystate solution of the ODE. Now we
can factor Eq. 12.29 and maintain O(h
2
) accuracy. We arrive at the expression
¸
I −
1
2
hA
x
¸
I −
1
2
hA
y
∆U
n
= h
A
x+y
U
n
+
(bc)
+O(h
3
) (12.30)
This is the delta form of a factored, 2ndorder, 2D equation.
9
A block tridiagonal solver is similar to a scalar solver except that small block matrix operations
replace the scalar ones, and matrix multiplications do not commute.
12.7. PROBLEMS 229
The point at which the factoring is made may not aﬀect the order of timeaccuracy,
but it can have a profound eﬀect on the stability and convergence properties of a
method. For example, the unfactored form of a ﬁrstorder method derived from the
implicit Euler time march is given by Eq. 12.19, and if it is immediately factored,
the factored form is presented in Eq. 12.20. On the other hand, the delta form of the
unfactored Eq. 12.19 is
[I −hA
x
−hA
y
]∆U
n
= h
A
x+y
U
n
+
(bc)
and its factored form becomes
10
[I −hA
x
][I −hA
y
]∆U
n
= h
A
x+y
U
n
+
(bc)
(12.31)
In spite of the similarities in derivation, we will see in the next chapter that the
convergence properties of Eq. 12.20 and Eq. 12.31 are vastly diﬀerent.
12.7 Problems
1. Consider the 1D heat equation:
∂u
∂t
= ν
∂
2
u
∂x
2
0 ≤ x ≤ 9
Let u(0, t) = 0 and u(9, t) = 0, so that we can simplify the boundary conditions.
Assume that second order central diﬀerencing is used, i.e.,
(δ
xx
u)
j
=
1
∆x
2
(u
j−1
−2u
j
+u
j+1
)
The uniform grid has ∆x = 1 and 8 interior points.
(a) Space vector deﬁnition
i. What is the space vector for the natural ordering (monotonically in
creasing in index), u
(1)
? Only include the interior points.
ii. If we reorder the points with the odd points ﬁrst and then the even
points, write the space vector, u
(2)
?
iii. Write down the permutation matrices,(P
12
,P
21
).
10
Notice that the only diﬀerence between the O(h
2
) method given by Eq. 12.30 and the O(h)
method given by Eq. 12.31 is the appearance of the factor
1
2
on the left side of the O(h
2
) method.
230 CHAPTER 12. SPLIT AND FACTORED FORMS
iv. The generic ODE representing the discrete form of the heat equation
is
du
(1)
dt
= A
1
u
(1)
+f
Write down the matrix A
1
. (Note f = 0, due to the boundary condi
tions) Next ﬁnd the matrix A
2
such that
du
(2)
dt
= A
2
u
(2)
Note that A
2
can be written as
A
2
=
D U
T
U D
¸
¸
¸
Deﬁne D and U.
v. Applying implicit Euler time marching, write the delta form of the
implicit algorithm. Comment on the form of the resulting implicit
matrix operator.
(b) System deﬁnition
In problem 1a, we deﬁned u
(1)
, u
(2)
, A
1
, A
2
, P
12
, and P
21
which partition
the odd points from the even points. We can put such a partitioning to
use. First deﬁne extraction operators
I
(o)
=
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
=
I
4
0
4
0
4
0
4
¸
¸
¸
I
(e)
=
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
=
0
4
0
4
0
4
I
4
¸
¸
¸
12.7. PROBLEMS 231
which extract the odd even points from u
(2)
as follows: u
(o)
= I
(o)
u
(2)
and
u
(e)
= I
(e)
u
(2)
.
i. Beginning with the ODE written in terms of u
(2)
, deﬁne a splitting
A
2
= A
o
+ A
e
, such that A
o
operates only on the odd terms, and A
e
operates only on the even terms. Write out the matrices A
o
and A
e
.
Also, write them in terms of D and U deﬁned above.
ii. Apply implicit Euler time marching to the split ODE. Write down the
delta form of the algorithm and the factored delta form. Comment on
the order of the error terms.
iii. Examine the implicit operators for the factored delta form. Comment
on their form. You should be able to argue that these are now trangu
lar matrices (a lower and an upper). Comment on the solution process
this gives us relative to the direct inversion of the original system.
232 CHAPTER 12. SPLIT AND FACTORED FORMS
Chapter 13
LINEAR ANALYSIS OF SPLIT
AND FACTORED FORMS
In Section 4.4 we introduced the concept of the representative equation, and used
it in Chapter 7 to study the stability, accuracy, and convergence properties of time
marching schemes. The question is: Can we ﬁnd a similar equation that will allow
us to evaluate the stability and convergence properties of split and factored schemes?
The answer is yes — for certain forms of linear model equations.
The analysis in this chapter is useful for estimating the stability and steadystate
properties of a wide variety of timemarching schemes that are variously referred
to as timesplit, fractionalstep, hybrid, and (approximately) factored. When these
methods are applied to practical problems, the results found from this analysis are
neither necessary nor suﬃcient to guarantee stability. However, if the results indicate
that a method has an instability, the method is probably not suitable for practical
use.
13.1 The Representative Equation for Circulant
Operators
Consider linear PDE’s with coeﬃcients that are ﬁxed in both space and time and with
boundary conditions that are periodic. We have seen that under these conditions
a semidiscrete approach can lead to circulant matrix diﬀerence operators, and we
discussed circulant eigensystems
1
in Section 4.3. In this and the following section
we assume circulant systems and our analysis depends critically on the fact that all
circulant matrices commute and have a common set of eigenvectors.
1
See also the discussion on Fourier stability analysis in Section 7.7.
233
234 CHAPTER 13. LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS
Suppose, as a result of space diﬀerencing the PDE, we arrive at a set of ODE’s
that can be written
d
u
dt
= A
a
p
u + A
b
p
u −
f(t) (13.1)
where the subscript p denotes a circulant matrix. Since both matrices have the same
set of eigenvectors, we can use the arguments made in Section 4.2.3 to uncouple the
set and form the M set of independent equations
w
1
= (λ
a
+λ
b
)
1
w
1
−g
1
(t)
.
.
.
w
m
= (λ
a
+λ
b
)
m
w
m
−g
m
(t)
.
.
.
w
M
= (λ
a
+λ
b
)
M
w
M
−g
M
(t) (13.2)
The analytic solution of the m’th line is
w
m
(t) = c
m
e
(λa+λ
b
)
m
t
+P.S.
Note that each λ
a
pairs with one, and only one
2
, λ
b
since they must share a common
eigenvector. This suggests (see Section 4.4:
The representative equation for split, circulant systems is
du
dt
= [λ
a
+λ
b
+λ
c
+· · ·]u +ae
µt
(13.3)
where λ
a
+ λ
b
+ λ
c
+ · · · are the sum of the eigenvalues in A
a
, A
b
, A
c
, · · · that
share the same eigenvector.
13.2 Example Analysis of Circulant Systems
13.2.1 Stability Comparisons of TimeSplit Methods
Consider as an example the linear convectiondiﬀusion equation:
∂u
∂t
+a
∂u
∂x
= ν
∂
2
u
∂x
2
(13.4)
2
This is to be contrasted to the developments found later in the analysis of 2D equations.
13.2. EXAMPLE ANALYSIS OF CIRCULANT SYSTEMS 235
If the space diﬀerencing takes the form
d
u
dt
= −
a
2∆x
B
p
(−1, 0, 1)
u +
ν
∆x
2
B
p
(1, −2, 1)
u (13.5)
the convection matrix operator and the diﬀusion matrix operator, can be represented
by the eigenvalues λ
c
and λ
d
, respectively, where (see Section 4.3.2):
(λ
c
)
m
=
ia
∆x
sin θ
m
(λ
d
)
m
= −
4ν
∆x
2
sin
2
θ
m
2
(13.6)
In these equations θ
m
= 2mπ/M , m = 0 , 1 , · · · , M − 1 , so that 0 ≤ θ
m
≤ 2π.
Using these values and the representative equation 13.4, we can analyze the stability
of the two forms of simple timesplitting discussed in Section 12.2. In this section we
refer to these as
1. the explicitimplicit Euler method, Eq. 12.10.
2. the explicitexplicit Euler method, Eq. 12.8.
1. The ExplicitImplicit Method
When applied to Eq. 13.4, the characteristic polynomial of this method is
P(E) = (1 −hλ
d
)E −(1 +hλ
c
)
This leads to the principal σ root
σ =
1 +i
ah
∆x
sinθ
m
1 + 4
hν
∆x
2
sin
2
θ
m
2
where we have made use of Eq. 13.6 to quantify the eigenvalues. Now introduce the
dimensionless numbers
C
n
=
ah
∆x
, Courant number
R
∆
=
a∆x
ν
, mesh Reynolds number
and we can write for the absolute value of σ
σ =
1 +C
2
n
sin
2
θ
m
1 + 4
C
n
R
∆
sin
2
θ
m
2
, 0 ≤ θ
m
≤ 2π (13.7)
236 CHAPTER 13. LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS
0
4 8
R
∆
C
n
0
0.4
0.8
1.2
C
n
R
∆
= 2/
ExplicitImplicit
0
4 8
R
∆
C
n
0
0.4
0.8
1.2
ExplicitExplicit
C
n
R
∆
= 2/ C
n
R
∆
=
/2
Figure 13.1:
˙
Stability regions for two simple timesplit methods.
A simple numerical parametric study of Eq. 13.7 shows that the critical range of
θ
m
for any combination of C
n
and R
∆
occurs when θ
m
is near 0 (or 2π). From this
we ﬁnd that the condition on C
n
and R
∆
that make σ ≈ 1 is
1 +C
2
n
sin
2
=
¸
1 + 4
C
n
R
∆
sin
2
2
2
As →0 this gives the stability region
C
n
<
2
R
∆
which is bounded by a hyperbola and shown in Fig. 13.1.
2. The ExplicitExplicit Method
An analysis similar to the one given above shows that this method produces
σ =
1 +C
2
n
sin
2
θ
m
¸
1 −4
C
n
R
∆
sin
2
θ
m
2
¸
, 0 ≤ θ
m
≤ 2π
Again a simple numerical parametric study shows that this has two critical ranges
of θ
m
, one near 0, which yields the same result as in the previous example, and the
13.2. EXAMPLE ANALYSIS OF CIRCULANT SYSTEMS 237
other near 180
o
, which produces the constraint that
C
n
<
1
2
R
∆
for R
∆
≤ 2
The resulting stability boundary is also shown in Fig. 13.1. The totaly explicit,
factored method has a much smaller region of stability when R
∆
is small, as we
should have expected.
13.2.2 Analysis of a SecondOrder TimeSplit Method
Next let us analyze a more practical method that has been used in serious compu
tational analysis of turbulent ﬂows. This method applies to a ﬂow in which there is
a combination of diﬀusion and periodic convection. The convection term is treated
explicitly using the secondorder AdamsBashforth method. The diﬀusion term is
integrated implicitly using the trapezoidal method. Our model equation is again the
linear convectiondiﬀusion equation 13.4 which we split in the fashion of Eq. 13.5. In
order to evaluate the accuracy, as well as the stability, we include the forcing func
tion in the representative equation and study the eﬀect of our hybrid, timemarching
method on the equation
u
= λ
c
u +λ
d
u +ae
µt
First let us ﬁnd expressions for the two polynomials, P(E) and Q(E). The char
acteristic polynomial follows from the application of the method to the homogeneous
equation, thus
u
n+1
= u
n
+
1
2
hλ
c
(3u
n
−u
n−1
) +
1
2
hλ
d
(u
n+1
+u
n
)
This produces
P(E) = (1 −
1
2
hλ
d
)E
2
−(1 +
3
2
hλ
c
+
1
2
hλ
d
)E +
1
2
hλ
c
The form of the particular polynomial depends upon whether the forcing function is
carried by the AB2 method or by the trapezoidal method. In the former case it is
Q(E) =
1
2
h(3E −1) (13.8)
and in the latter
Q(E) =
1
2
h(E
2
+E) (13.9)
238 CHAPTER 13. LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS
Accuracy
From the characteristic polynomial we see that there are two σ–roots and they are
given by the equation
σ =
1 +
3
2
hλ
c
+
1
2
hλ
d
±
1 +
3
2
hλ
c
+
1
2
hλ
d
2
−2hλ
c
1 −
1
2
hλ
d
2
1 −
1
2
hλ
d
(13.10)
The principal σroot follows from the plus sign and one can show
σ
1
= 1 + (λ
c
+λ
d
)h +
1
2
(λ
c
+λ
d
)
2
h
2
+
1
4
λ
3
d
+λ
c
λ
2
d
−λ
2
c
λ
d
−λ
3
c
h
3
From this equation it is clear that
1
6
λ
3
=
1
6
(λ
c
+ λ
d
)
3
does not match the coeﬃcient
of h
3
in σ
1
, so
er
λ
= O(h
3
)
Using P(e
µh
) and Q(e
µh
) to evaluate er
µ
in Section 6.6.3, one can show
er
µ
= O(h
3
)
using either Eq. 13.8 or Eq. 13.9. These results show that, for the model equation,
the hybrid method retains the secondorder accuracy of its individual components.
Stability
The stability of the method can be found from Eq. 13.10 by a parametric study of c
n
and R
∆
deﬁned in Eq. 13.7. This was carried out in a manner similar to that used
to ﬁnd the stability boundary of the ﬁrstorder explicitimplicit method in Section
13.2.1. The results are plotted in Fig. 13.2. For values of R
∆
≥ 2 this secondorder
method has a much greater region of stability than the ﬁrstorder explicitimplicit
method given by Eq. 12.10 and shown in Fig. 13.1.
13.3 The Representative Equation for SpaceSplit
Operators
Consider the 2D model
3
equations
∂u
∂t
=
∂
2
u
∂x
2
+
∂
2
u
∂y
2
(13.11)
3
The extension of the following to 3D is simple and straightforward.
13.3. THE REPRESENTATIVE EQUATION FORSPACESPLIT OPERATORS239
R
∆
0
4 8
0.4
0.8
1.2
C
n
Stable
Figure 13.2:
˙
Stability regions for the secondorder timesplit method.
and
∂u
∂t
+a
x
∂u
∂x
+a
y
∂u
∂y
= 0 (13.12)
Reduce either of these, by means of spatial diﬀerencing approximations, to the coupled
set of ODE’s:
dU
dt
= [A
x
+A
y
]U +
(bc) (13.13)
for the space vector U. The form of the A
x
and A
y
matrices for threepoint central
diﬀerencing schemes are shown in Eqs. 12.22 and 12.23 for the 3 × 4 mesh shown
in Sketch 12.12. Let us inspect the structure of these matrices closely to see how we
can diagonalize [A
x
+A
y
] in terms of the individual eigenvalues of the two matrices
considered separately.
First we write these matrices in the form
A
(x)
x
=
B
B
B
¸
¸
¸ A
(x)
y
=
˜
b
0
· I
˜
b
1
· I
˜
b
−1
· I
˜
b
0
· I
˜
b
1
· I
˜
b
−1
· I
˜
b
0
· I
¸
¸
¸
where B is a banded matrix of the form B(b
−1
, b
0
, b
1
). Now ﬁnd the block eigenvector
240 CHAPTER 13. LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS
matrix that diagonalizes B and use it to diagonalize A
(x)
x
. Thus
X
−1
X
−1
X
−1
¸
¸
¸
B
B
B
¸
¸
¸
X
X
X
¸
¸
¸ =
Λ
Λ
Λ
¸
¸
¸
where
Λ =
λ
1
λ
2
λ
3
λ
4
¸
¸
¸
¸
¸
Notice that the matrix A
(x)
y
is transparent to this transformation. That is, if we
set X ≡ diag(X)
X
−1
˜
b
0
· I
˜
b
1
· I
˜
b
−1
· I
˜
b
0
· I
˜
b
1
· I
˜
b
−1
· I
˜
b
0
· I
¸
¸
¸X =
˜
b
0
· I
˜
b
1
· I
˜
b
−1
· I
˜
b
0
· I
˜
b
1
· I
˜
b
−1
· I
˜
b
0
· I
¸
¸
¸
One now permutes the transformed system to the yvector database using the per
mutation matrix deﬁned by Eq. 12.13. There results
P
yx
· X
−1
A
(x)
x
+A
(x)
y
X · P
xy
=
λ
1
· I
λ
2
· I
λ
3
· I
λ
4
· I
¸
¸
¸
¸
¸
+
˜
B
˜
B
˜
B
˜
B
¸
¸
¸
¸
¸
¸
(13.14)
where
˜
B is the banded tridiagonal matrix B(
˜
b
−1
,
˜
b
0
,
˜
b
1
), see the bottom of Eq. 12.23.
Next ﬁnd the eigenvectors
˜
X that diagonalize the
˜
B blocks. Let
˜
B ≡ diag(
˜
B) and
˜
X ≡ diag(
˜
X) and form the second transformation
˜
X
−1
˜
B
˜
X =
˜
Λ
˜
Λ
˜
Λ
˜
Λ
¸
¸
¸
¸
¸
,
˜
Λ =
˜
λ
1
˜
λ
2
˜
λ
3
¸
¸
¸
This time, by the same argument as before, the ﬁrst matrix on the right side of
Eq. 13.14 is transparent to the transformation, so the ﬁnal result is the complete
diagonalization of the matrix A
x+y
:
˜
X
−1
· P
yx
· X
−1
A
(x)
x+y
X · P
xy
·
˜
X
=
λ
1
I +
˜
Λ
λ
2
I +
˜
Λ
λ
3
I +
˜
Λ
λ
4
I +
˜
Λ
¸
¸
¸
¸
¸
¸
(13.15)
13.3. THE REPRESENTATIVE EQUATION FORSPACESPLIT OPERATORS241
It is important to notice that:
• The diagonal matrix on the right side of Eq. 13.15 contains every possible com
bination of the individual eigenvalues of B and
˜
B.
Now we are ready to present the representative equation for two dimensional sys
tems. First reduce the PDE to ODE by some choice
4
of space diﬀerencing. This
results in a spatially split A matrix formed from the subsets
A
(x)
x
= diag(B) , A
(y)
y
= diag(
˜
B) (13.16)
where B and
˜
B are any two matrices that have linearly independent eigenvectors (this
puts some constraints on the choice of diﬀerencing schemes).
Although A
x
and A
y
do commute, this fact, by itself, does not ensure the prop
erty of “all possible combinations”. To obtain the latter property the structure of
the matrices is important. The block matrices B and
˜
B can be either circulant or
noncirculant; in both cases we are led to the ﬁnal result:
The 2–D representative equation for model linear systems is
du
dt
= [λ
x
+λ
y
]u +ae
µt
where λ
x
and λ
y
are any combination of eigenvalues from A
x
and A
y
, a and µ are
(possibly complex) constants, and where A
x
and A
y
satisfy the conditions in 13.16.
Often we are interested in ﬁnding the value of, and the convergence rate to, the
steadystate solution of the representative equation. In that case we set µ = 0 and
use the simpler form
du
dt
= [λ
x
+λ
y
]u +a (13.17)
which has the exact solution
u(t) = ce
(λx+λy)t
−
a
λ
x
+λ
y
(13.18)
4
We have used 3point central diﬀerencing in our example, but this choice was for convenience
only, and its use is not necessary to arrive at Eq. 13.15.
242 CHAPTER 13. LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS
13.4 Example Analysis of 2D Model Equations
In the following we analyze four diﬀerent methods for ﬁnding a ﬁxed, steadystate
solution to the 2D representative equation 13.16. In each case we examine
1. The stability.
2. The accuracy of the ﬁxed, steadystate solution.
3. The convergence rate to reach the steadystate.
13.4.1 The Unfactored Implicit Euler Method
Consider ﬁrst this unfactored, ﬁrstorder scheme which can then be used as a reference
case for comparison with the various factored ones. The form of the method is given
by Eq. 12.19, and when it is applied to the representative equation, we ﬁnd
(1 −h λ
x
−h λ
y
)u
n+1
= u
n
+ha
from which
P(E) = (1 −h λ
x
−h λ
y
)E −1
Q(E) = h (13.19)
giving the solution
u
n
= c
¸
1
1 −h λ
x
−h λ
y
¸
n
−
a
λ
x
+λ
y
Like its counterpart in the 1D case, this method:
1. Is unconditionally stable.
2. Produces the exact (see Eq. 13.18) steadystate solution (of the ODE) for any
h.
3. Converges very rapidly to the steadystate when h is large.
Unfortunately, however, use of this method for 2D problems is generally impractical
for reasons discussed in Section 12.5.
13.4. EXAMPLE ANALYSIS OF 2D MODEL EQUATIONS 243
13.4.2 The Factored Nondelta Form of the Implicit Euler
Method
Now apply the factored Euler method given by Eq. 12.20 to the 2D representative
equation. There results
(1 −h λ
x
)(1 −h λ
y
)u
n+1
= u
n
+ha
from which
P(E) = (1 −h λ
x
)(1 −h λ
y
)E −1
Q(E) = h (13.20)
giving the solution
u
n
= c
¸
1
(1 −h λ
x
)(1 −h λ
y
)
¸
n
−
a
λ
x
+λ
y
−hλ
x
λ
y
We see that this method:
1. Is unconditionally stable.
2. Produces a steady state solution that depends on the choice of h.
3. Converges rapidly to a steadystate for large h, but the converged solution is
completely wrong.
The method requires far less storage then the unfactored form. However, it is not
very useful since its transient solution is only ﬁrstorder accurate and, if one tries to
take advantage of its rapid convergence rate, the converged value is meaningless.
13.4.3 The Factored Delta Form of the Implicit Euler Method
Next apply Eq. 12.31 to the 2D representative equation. One ﬁnds
(1 −h λ
x
)(1 −h λ
y
)(u
n+1
−u
n
) = h(λ
x
u
n
+λ
y
u
n
+a)
which reduces to
(1 −h λ
x
)(1 −h λ
y
)u
n+1
=
1 +h
2
λ
x
λ
y
u
n
+ha
and this has the solution
u
n
= c
¸
1 +h
2
λ
x
λ
y
(1 −h λ
x
)(1 −h λ
y
)
¸
n
−
a
λ
x
+λ
y
This method:
244 CHAPTER 13. LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS
1. Is unconditionally stable.
2. Produces the exact steadystate solution for any choice of h.
3. Converges very slowly to the steady–state solution for large values of h, since
σ →1 as h →∞.
Like the factored nondelta form, this method demands far less storage than the un
factored form, as discussed in Section 12.5. The correct steady solution is obtained,
but convergence is not nearly as rapid as that of the unfactored form.
13.4.4 The Factored Delta Form of the Trapezoidal Method
Finally consider the delta form of a secondorder timeaccurate method. Apply Eq.
12.30 to the representative equation and one ﬁnds
1 −
1
2
hλ
x
1 −
1
2
hλ
y
(u
n+1
−u
n
) = h(λ
x
u
n
+λ
y
u
n
+a)
which reduces to
1 −
1
2
hλ
x
1 −
1
2
hλ
y
u
n+1
=
1 +
1
2
hλ
x
1 +
1
2
hλ
y
u
n
+ha
and this has the solution
u
n
= c
1 +
1
2
hλ
x
1 +
1
2
hλ
y
1 −
1
2
hλ
x
1 −
1
2
hλ
y
¸
¸
¸
¸
n
−
a
λ
x
+λ
y
This method:
1. Is unconditionally stable.
2. Produces the exact steady–state solution for any choice of h.
3. Converges very slowly to the steady–state solution for large values of h, since
σ →1 as h →∞.
All of these properties are identical to those found for the factored delta form of the
implicit Euler method. Since it is second order in time, it can be used when time
accuracy is desired, and the factored delta form of the implicit Euler method can be
used when a converged steadystate is all that is required.
5
A brief inspection of eqs.
12.26 and 12.27 should be enough to convince the reader that the σ’s produced by
those methods are identical to the σ produced by this method.
5
In practical codes, the value of h on the left side of the implicit equation is literally switched
from h to
1
2
h.
13.5. EXAMPLE ANALYSIS OF THE 3D MODEL EQUATION 245
13.5 Example Analysis of the 3D Model Equation
The arguments in Section 13.3 generalize to three dimensions and, under the condi
tions given in 13.16 with an A
(z)
z
included, the model 3D cases
6
have the following
representative equation (with µ = 0):
du
dt
= [λ
x
+λ
y
+λ
z
]u +a (13.21)
Let us analyze a 2ndorder accurate, factored, delta form using this equation. First
apply the trapezoidal method:
u
n+1
= u
n
+
1
2
h[(λ
x
+λ
y
+λ
z
)u
n+1
+ (λ
x
+λ
y
+λ
z
)u
n
+ 2a]
Rearrange terms:
¸
1 −
1
2
h(λ
x
+λ
y
+λ
z
)
u
n+1
=
¸
1 +
1
2
h(λ
x
+λ
y
+λ
z
)
u
n
+ha
Put this in delta form:
¸
1 −
1
2
h(λ
x
+λ
y
+λ
z
)
∆u
n
= h[(λ
x
+λ
y
+ λ
z
)u
n
+a]
Now factor the left side:
1 −
1
2
hλ
x
1 −
1
2
hλ
y
1 −
1
2
hλ
z
∆u
n
= h[(λ
x
+λ
y
+λ
z
)u
n
+a] (13.22)
This preserves second order accuracy since the error terms
1
4
h
2
(λ
x
λ
y
+λ
x
λ
z
+λ
y
λ
z
)∆u
n
and
1
8
h
3
λ
x
λ
y
λ
z
are both O(h
3
). One can derive the characteristic polynomial for Eq. 13.22, ﬁnd the
σ root, and write the solution either in the form
u
n
= c
1 +
1
2
h(λ
x
+λ
y
+λ
z
) +
1
4
h
2
(λ
x
λ
y
+λ
x
λ
z
+λ
y
λ
z
) −
1
8
h
3
λ
x
λ
y
λ
z
1 −
1
2
h(λ
x
+λ
y
+λ
z
) +
1
4
h
2
(λ
x
λ
y
+λ
x
λ
z
+λ
y
λ
z
) −
1
8
h
3
λ
x
λ
y
λ
z
¸
¸
¸
¸
n
−
a
λ
x
+λ
y
+λ
z
(13.23)
6
Eqs. 13.11 and 13.12, each with an additional term.
246 CHAPTER 13. LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS
or in the form
u
n
= c
1 +
1
2
hλ
x
1 +
1
2
hλ
y
1 +
1
2
hλ
z
−
1
4
h
3
λ
x
λ
y
λ
z
1 −
1
2
hλ
x
1 −
1
2
hλ
y
1 −
1
2
hλ
z
¸
¸
¸
¸
n
−
a
λ
x
+λ
y
+λ
z
(13.24)
It is interesting to notice that a Taylor series expansion of Eq. 13.24 results in
σ = 1 +h(λ
x
+λ
y
+λ
z
) +
1
2
h
2
(λ
x
+λ
y
+λ
z
)
2
(13.25)
+
1
4
h
3
λ
3
z
+ (2λ
y
+ 2λ
x
) +
2λ
2
y
+ 3λ
x
λ
y
+ 2λ
2
y
+λ
3
y
+ 2λ
x
λ
2
y
+ 2λ
2
x
λ
y
+λ
3
x
+· · ·
which veriﬁes the second order accuracy of the factored form. Furthermore, clearly,
if the method converges, it converges to the proper steadystate.
7
With regards to stability, it follows from Eq. 13.23 that, if all the λ’s are real and
negative, the method is stable for all h. This makes the method unconditionally stable
for the 3D diﬀusion model when it is centrally diﬀerenced in space.
Now consider what happens when we apply this method to the biconvection model,
the 3D form of Eq. 13.12 with periodic boundary conditions. In this case, central
diﬀerencing causes all of the λ’s to be imaginary with spectrums that include both
positive and negative values. Remember that in our analysis we must consider every
possible combination of these eigenvalues. First write the σ root in Eq. 13.23 in the
form
σ =
1 +iα −β +iγ
1 −iα −β +iγ
where α, β and γ are real numbers that can have any sign. Now we can always ﬁnd
one combination of the λ’s for which α, and γ are both positive. In that case since
the absolute value of the product is the product of the absolute values
σ
2
=
(1 −β)
2
+ (α +γ)
2
(1 −β)
2
+ (α −γ)
2
> 1
and the method is unconditionally unstable for the model convection problem.
From the above analysis one would come to the conclusion that the method rep
resented by Eq. 13.22 should not be used for the 3D Euler equations. In practical
cases, however, some form of dissipation is almost always added to methods that are
used to solve the Euler equations and our experience to date is that, in the presence
of this dissipation, the instability disclosed above is too weak to cause trouble.
7
However, we already knew this because we chose the delta form.
13.6. PROBLEMS 247
13.6 Problems
1. Starting with the generic ODE,
du
dt
= Au +f
we can split A as follows: A = A
1
+A
2
+A
3
+A
4
. Applying implicit Euler time
marching gives
u
n+1
−u
n
h
= A
1
u
n+1
+A
2
u
n+1
+A
3
u
n+1
+A
4
u
n+1
+f
(a) Write the factored delta form. What is the error term?
(b) Instead of making all of the split terms implicit, leave two explicit:
u
n+1
−u
n
h
= A
1
u
n+1
+A
2
u
n
+A
3
u
n+1
+A
4
u
n
+f
Write the resulting factored delta form and deﬁne the error terms.
(c) The scalar representative equation is
du
dt
= (λ
1
+λ
2
+λ
3
+λ
4
)u +a
For the fully implicit scheme of problem 1a, ﬁnd the exact solution to the
resulting scalar diﬀerence equation and comment on the stability, conver
gence, and accuracy of the converged steadystate solution.
(d) Repeat 1c for the explicitimplicit scheme of problem 1b.
248 CHAPTER 13. LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS
Appendix A
USEFUL RELATIONS AND
DEFINITIONS FROM LINEAR
ALGEBRA
A basic understanding of the fundamentals of linear algebra is crucial to our develop
ment of numerical methods and it is assumed that the reader is at least familar with
this subject area. Given below is some notation and some of the important relations
between matrices and vectors.
A.1 Notation
1. In the present context a vector is a vertical column or string. Thus
v =
v
1
v
2
.
.
.
v
m
¸
¸
¸
¸
¸
and its transpose
v
T
is the horizontal row
v
T
= [v
1
, v
2
, v
3
, . . . , v
m
] ,
v = [v
1
, v
2
, v
3
, . . . , v
m
]
T
2. A general m×m matrix A can be written
A = (a
ij
) =
a
11
a
12
· · · a
1m
a
21
a
22
· · · a
2m
.
.
.
a
m1
a
m2
· · · a
mm
¸
¸
¸
¸
¸
249
250APPENDIXA. USEFUL RELATIONS ANDDEFINITIONS FROMLINEAR ALGEBRA
3. An alternative notation for A is
A =
a
1
,
a
2
, . . . ,
a
m
and its transpose A
T
is
A
T
=
a
T
1
a
T
2
.
.
.
a
T
m
¸
¸
¸
¸
¸
¸
¸
4. The inverse of a matrix (if it exists) is written A
−1
and has the property that
A
−1
A = AA
−1
= I, where I is the identity matrix.
A.2 Deﬁnitions
1. A is symmetric if A
T
= A.
2. A is skewsymmetric or antisymmetric if A
T
= −A.
3. A is diagonally dominant if a
ii
≥
¸
j=i
a
ij
 , i = 1, 2, . . . , m and a
ii
>
¸
j=i
a
ij

for at least one i.
4. A is orthogonal if a
ij
are real and A
T
A = AA
T
= I
5.
¯
A is the complex conjugate of A.
6. P is a permutation matrix if P
v is a simple reordering of
v.
7. The trace of a matrix is
¸
i
a
ii
.
8. A is normal if A
T
A = AA
T
.
9. det[A] is the determinant of A.
10. A
H
is the conjugate transpose of A, (Hermitian).
11. If
A =
¸
a b
c d
then
det[A] = ad −bc
and
A
−1
=
1
det[A]
¸
d −b
−c a
A.3. ALGEBRA 251
A.3 Algebra
We consider only square matrices of the same dimension.
1. A and B are equal if a
ij
= b
ij
for all i, j = 1, 2, . . . , m.
2. A+ (B +C) = (C + A) +B, etc.
3. sA = (sa
ij
) where s is a scalar.
4. In general AB = BA.
5. Transpose equalities:
(A+B)
T
= A
T
+B
T
(A
T
)
T
= A
(AB)
T
= B
T
A
T
6. Inverse equalities (if the inverse exists):
(A
−1
)
−1
= A
(AB)
−1
= B
−1
A
−1
(A
T
)
−1
= (A
−1
)
T
7. Any matrix A can be expressed as the sum of a symmetric and a skewsymmetric
matrix. Thus:
A =
1
2
A+A
T
+
1
2
A−A
T
A.4 Eigensystems
1. The eigenvalue problem for a matrix A is deﬁned as
A
x = λ
x or [A−λI]
x = 0
and the generalized eigenvalue problem, including the matrix B, as
A
x = λB
x or [A −λB]
x = 0
2. If a square matrix with real elements is symmetric, its eigenvalues are all real.
If it is asymmetric, they are all imaginary.
252APPENDIXA. USEFUL RELATIONS ANDDEFINITIONS FROMLINEAR ALGEBRA
3. Gershgorin’s theorem: The eigenvalues of a matrix lie in the complex plane in
the union of circles having centers located by the diagonals with radii equal to
the sum of the absolute values of the corresponding oﬀdiagonal row elements.
4. In general, an m× m matrix A has n
x
linearly independent eigenvectors with
n
x
≤ m and n
λ
distinct eigenvalues (λ
i
) with n
λ
≤ n
x
≤ m.
5. A set of eigenvectors is said to be linearly independent if
a ·
x
m
+b ·
x
n
=
x
k
, m = n = k
for any complex a and b and for all combinations of vectors in the set.
6. If A posseses m linearly independent eigenvectors then A is diagonalizable, i.e.,
X
−1
AX = Λ
where X is a matrix whose columns are the eigenvectors,
X =
x
1
,
x
2
, . . . ,
x
m
and Λ is the diagonal matrix
Λ =
λ
1
0 · · · 0
0 λ
2
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. 0
0 · · · 0 λ
m
¸
¸
¸
¸
¸
¸
If A can be diagonalized, its eigenvectors completely span the space, and A is
said to have a complete eigensystem.
7. If A has m distinct eigenvalues, then A is always diagonalizable, and with
each distinct eigenvalue there is one associated eigenvector, and this eigenvector
cannot be formed from a linear combination of any of the other eigenvectors.
8. In general, the eigenvalues of a matrix may not be distinct, in which case the
possibility exists that it cannot be diagonalized. If the eigenvalues of a matrix
are not distinct, but all of the eigenvectors are linearly independent, the matrix
is said to be derogatory, but it can still be diagonalized.
9. If a matrix does not have a complete set of linearly independent eigenvectors,
it cannot be diagonalized. The eigenvectors of such a matrix cannot span the
space and the matrix is said to have a defective eigensystem.
A.4. EIGENSYSTEMS 253
10. Defective matrices cannot be diagonalized but they can still be put into a com
pact form by a similarity transform, S, such that
J = S
−1
AS =
J
1
0 · · · 0
0 J
2
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. 0
0 · · · 0 J
k
¸
¸
¸
¸
¸
¸
where there are k linearly independent eigenvectors and J
i
is either a Jordan
subblock or λ
i
.
11. A Jordan submatrix has the form
J
i
=
λ
i
1 0 · · · 0
0 λ
i
1
.
.
.
.
.
.
0 0 λ
i
.
.
. 0
.
.
.
.
.
.
.
.
. 1
0 · · · 0 0 λ
i
¸
¸
¸
¸
¸
¸
¸
¸
¸
12. Use of the transform S is known as putting A into its Jordan Canonical form.
A repeated root in a Jordan block is referred to as a defective eigenvalue. For
each Jordan submatrix with an eigenvalue λ
i
of multiplicity r, there exists one
eigenvector. The other r −1 vectors associated with this eigenvalue are referred
to as principal vectors. The complete set of principal vectors and eigenvectors
are all linearly independent.
13. Note that if P is the permutation matrix
P =
0 0 1
0 1 0
1 0 0
¸
¸
¸ , P
T
= P
−1
= P
then
P
−1
λ 1 0
0 λ 1
0 0 λ
¸
¸
¸P =
λ 0 0
1 λ 0
0 1 λ
¸
¸
¸
14. Some of the Jordan subblocks may have the same eigenvalue. For example, the
254APPENDIXA. USEFUL RELATIONS ANDDEFINITIONS FROMLINEAR ALGEBRA
matrix
λ
1
1
λ
1
1
λ
1
¸
¸
¸
λ
1
¸
λ
1
1
λ
1
¸
λ
2
1
λ
2
λ
3
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
is both defective and derogatory, having:
• 9 eigenvalues
• 3 distinct eigenvalues
• 3 Jordan blocks
• 5 linearly independent eigenvectors
• 3 principal vectors with λ
1
• 1 principal vector with λ
2
A.5 Vector and Matrix Norms
1. The spectral radius of a matrix A is symbolized by σ(A) such that
σ(A) = σ
m

max
where σ
m
are the eigenvalues of the matrix A.
2. A pnorm of the vector v is deﬁned as
v
p
=
¸
M
¸
j=1
v
j

p
¸
1/p
3. A pnorm of a matrix A is deﬁned as
A
p
= max
x=0
Av
p
v
p
A.5. VECTOR AND MATRIX NORMS 255
4. Let A and B be square matrices of the same order. All matrix norms must have
the properties
A ≥ 0, A = 0 implies A = 0
c · A = c · A
A+B ≤ A +B
A· B ≤ A · B
5. Special pnorms are
A
1
= max
j=1,···,M
¸
M
i=1
a
ij
 maximum column sum
A
2
=
σ(A
T
· A)
A
∞
= max
i=1,2,···,M
¸
M
j=1
a
ij
 maximum row sum
where A
p
is referred to as the L
p
norm of A.
6. In general σ(A) does not satisfy the conditions in 4, so in general σ(A) is not a
true norm.
7. When A is normal, σ(A) is a true norm, in fact, in this case it is the L
2
norm.
8. The spectral radius of A, σ(A), is the lower bound of all the norms of A.
256APPENDIXA. USEFUL RELATIONS ANDDEFINITIONS FROMLINEAR ALGEBRA
Appendix B
SOME PROPERTIES OF
TRIDIAGONAL MATRICES
B.1 Standard Eigensystemfor Simple Tridiagonals
In this work tridiagonal banded matrices are prevalent. It is useful to list some of
their properties. Many of these can be derived by solving the simple linear diﬀerence
equations that arise in deriving recursion relations.
Let us consider a simple tridiagonal matrix, i.e., a tridiagonal with constant scalar
elements a,b, and c, see Section 3.4. If we examine the conditions under which the
determinant of this matrix is zero, we ﬁnd (by a recursion exercise)
det[B(M : a, b, c)] = 0
if
b + 2
√
ac cos
mπ
M + 1
= 0 , m = 1, 2, · · · , M
From this it follows at once that the eigenvalues of B(a, b, c) are
λ
m
= b + 2
√
ac cos
mπ
M + 1
, m = 1, 2, · · · , M (B.1)
The righthand eigenvector of B(a, b, c) that is associated with the eigenvalue λ
m
satisﬁes the equation
B(a, b, c)
x
m
= λ
m
x
m
(B.2)
and is given by
x
m
= (x
j
)
m
=
a
c
j −1
2
sin
¸
j
mπ
M + 1
, m = 1, 2, · · · , M (B.3)
257
258 APPENDIX B. SOME PROPERTIES OF TRIDIAGONAL MATRICES
These vectors are the columns of the righthand eigenvector matrix, the elements of
which are
X = (x
jm
) =
a
c
j −1
2
sin
¸
jmπ
M + 1
,
j = 1, 2, · · · , M
m = 1, 2, · · · , M
(B.4)
Notice that if a = −1 and c = 1,
a
c
j −1
2
= e
i(j−1)
π
2
(B.5)
The lefthand eigenvector matrix of B(a, b, c) can be written
X
−1
=
2
M + 1
c
a
m−1
2
sin
¸
mjπ
M + 1
,
m = 1, 2, · · · , M
j = 1, 2, · · · , M
In this case notice that if a = −1 and c = 1
c
a
m−1
2
= e
−i(m−1)
π
2
(B.6)
B.2 Generalized Eigensystem for Simple Tridiag
onals
This system is deﬁned as follows
b c
a b c
a b
.
.
. c
a b
¸
¸
¸
¸
¸
¸
¸
¸
x
1
x
2
x
3
.
.
.
x
M
¸
¸
¸
¸
¸
¸
¸
¸
= λ
e f
d e f
d e
.
.
. f
d e
¸
¸
¸
¸
¸
¸
¸
¸
x
1
x
2
x
3
.
.
.
x
M
¸
¸
¸
¸
¸
¸
¸
¸
In this case one can show after some algebra that
det[B(a −λd, b −λe, c −λf] = 0 (B.7)
if
b −λ
m
e + 2
(a −λ
m
d)(c −λ
m
f) cos
mπ
M + 1
= 0 , m = 1, 2, · · · , M (B.8)
If we deﬁne
θ
m
=
mπ
M + 1
, p
m
= cos θ
m
B.3. THE INVERSE OF A SIMPLE TRIDIAGONAL 259
λ
m
=
eb −2(cd + af)p
2
m
+ 2p
m
(ec −fb)(ea −bd) + [(cd −af)p
m
]
2
e
2
−4fdp
2
m
The righthand eigenvectors are
x
m
=
¸
a −λ
m
d
c −λ
m
f
¸
j −1
2
sin[jθ
m
] ,
m = 1, 2, · · · , M
j = 1, 2, · · · , M
These relations are useful in studying relaxation methods.
B.3 The Inverse of a Simple Tridiagonal
The inverse of B(a, b, c) can also be written in analytic form. Let D
M
represent the
determinant of B(M : a, b, c)
D
M
≡ det[B(M : a, b, c)]
Deﬁning D
0
to be 1, it is simple to derive the ﬁrst few determinants, thus
D
0
= 1
D
1
= b
D
2
= b
2
−ac
D
3
= b
3
−2abc (B.9)
One can also ﬁnd the recursion relation
D
M
= bD
M−1
−acD
M−2
(B.10)
Eq. B.10 is a linear O∆E the solution of which was discussed in Section 4.2. Its
characteristic polynomial P(E) is P(E
2
− bE + ac) and the two roots to P(σ) = 0
result in the solution
D
M
=
1
√
b
2
−4ac
¸
b +
√
b
2
−4ac
2
¸
M+1
−
¸
b −
√
b
2
−4ac
2
¸
M+1
M = 0, 1, 2, · · · (B.11)
where we have made use of the initial conditions D
0
= 1 and D
1
= b. In the limiting
case when b
2
−4ac = 0, one can show that
D
M
= (M + 1)
b
2
M
; b
2
= 4ac
260 APPENDIX B. SOME PROPERTIES OF TRIDIAGONAL MATRICES
Then for M = 4
B
−1
=
1
D
4
D
3
−cD
2
c
2
D
1
−c
3
D
0
−aD
2
D
1
D
2
−cD
1
D
1
c
2
D
1
a
2
D
1
−aD
1
D
1
D
2
D
1
−cD
2
−a
3
D
0
a
2
D
1
−aD
2
D
3
¸
¸
¸
¸
¸
and for M = 5
B
−1
=
1
D
5
D
4
−cD
3
c
2
D
2
−c
3
D
1
c
4
D
0
−aD
3
D
1
D
3
−cD
1
D
2
c
2
D
1
D
1
−c
3
D
1
a
2
D
2
−aD
1
D
2
D
2
D
2
−cD
2
D
1
c
2
D
2
−a
3
D
1
a
2
D
1
D
1
−aD
2
D
1
D
3
D
1
−cD
3
a
4
D
0
−a
3
D
1
a
2
D
2
−aD
3
D
4
¸
¸
¸
¸
¸
¸
¸
The general element d
mn
is
Upper triangle:
m = 1, 2, · · · , M −1 ; n = m + 1, m+ 2, · · · , M
d
mn
= D
m−1
D
M−n
(−c)
n−m
/D
M
Diagonal:
n = m = 1, 2, · · · , M
d
mm
= D
M−1
D
M−m
/D
M
Lower triangle:
m = n + 1, n + 2, · · · , M ; n = 1, 2, · · · , M −1
d
mn
= D
M−m
D
n−1
(−a)
m−n
/D
M
B.4 Eigensystems of Circulant Matrices
B.4.1 Standard Tridiagonals
Consider the circulant (see Section 3.4.4) tridiagonal matrix
B
p
(M : a, b, c, ) (B.12)
B.4. EIGENSYSTEMS OF CIRCULANT MATRICES 261
The eigenvalues are
λ
m
= b + (a +c) cos
2πm
M
−i(a −c) sin
2πm
M
, m = 0, 1, 2, · · · , M −1
(B.13)
The righthand eigenvector that satisﬁes B
p
(a, b, c)
x
m
= λ
m
x
m
is
x
m
= (x
j
)
m
= e
i j (2πm/M)
, j = 0, 1, · · · , M −1 (B.14)
where i ≡
√
−1, and the righthand eigenvector matrix has the form
X = (x
jm
) = e
ij
2πm
M
,
j = 0, 1, · · · , M −1
m = 0, 1, · · · , M −1
The lefthand eigenvector matrix with elements x
is
X
−1
= (x
mj
) =
1
M
e
−im
2πj
M
,
m = 0, 1, · · · , M −1
j = 0, 1, · · · , M −1
Note that both X and X
−1
are symmetric and that X
−1
=
1
M
X
∗
, where X∗ is the
conjugate transpose of X.
B.4.2 General Circulant Systems
Notice the remarkable fact that the elements of the eigenvector matrices X and X
−1
for the tridiagonal circulant matrix given by eq. B.12 do not depend on the elements
a, b, c in the matrix. In fact, all circulant matrices of order M have the same set of
linearly independent eigenvectors, even if they are completely dense. An example of
a dense circulant matrix of order M = 4 is
b
0
b
1
b
2
b
3
b
3
b
0
b
1
b
2
b
2
b
3
b
0
b
1
b
1
b
2
b
3
b
0
¸
¸
¸
¸
¸
(B.15)
The eigenvectors are always given by eq. B.14, and further examination shows that
the elements in these eigenvectors correspond to the elements in a complex harmonic
analysis or complex discrete Fourier series.
Although the eigenvectors of a circulant matrix are independent of its elements,
the eigenvalues are not. For the element indexing shown in eq. B.15 they have the
general form
λ
m
=
M−1
¸
j=0
b
j
e
i(2πjm/M)
of which eq. B.13 is a special case.
262 APPENDIX B. SOME PROPERTIES OF TRIDIAGONAL MATRICES
B.5 Special Cases Found From Symmetries
Consider a mesh with an even number of interior points such as that shown in Fig.
B.1. One can seek from the tridiagonal matrix B(2M : a, b, a, ) the eigenvector subset
that has even symmetry when spanning the interval 0 ≤ x ≤ π. For example, we seek
the set of eigenvectors
x
m
for which
b a
a b a
a
.
.
.
.
.
. a
a b a
a b
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
x
1
x
2
.
.
.
.
.
.
x
2
x
1
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
= λ
m
x
1
x
2
.
.
.
.
.
.
x
2
x
1
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
This leads to the subsystem of order M which has the form
B(M : a,
b, a)
x
m
=
b a
a b a
a
.
.
.
.
.
. a
a b a
a b +a
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
x
m
= λ
m
x
m
(B.16)
By folding the known eigenvectors of B(2M : a, b, a) about the center, one can show
from previous results that the eigenvalues of eq. B.16 are
λ
m
= b + 2a cos
(2m−1)π
2M + 1
, m = 1, 2, · · · , M (B.17)
B.6. SPECIAL CASES INVOLVING BOUNDARY CONDITIONS 263
and the corresponding eigenvectors are
x
m
= sin
j(2m−1)π
2M + 1
,
j = 1, 2, · · · , M
Imposing symmetry about the same interval
but for a mesh with an odd number of points,
see Fig. B.1, leads to the matrix
B(M :
a, b, a) =
b a
a b a
a
.
.
.
.
.
. a
a b a
2a b
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
By folding the known eigenvalues of B(2M −
1 : a, b, a) about the center, one can show
from previous results that the eigenvalues of
eq. B.17 are
Line of Symmetry
x = 0 x = π
• ◦ ◦ ◦ ◦ ◦ ◦ •
j
= 1 2 3 4 5 6
M
j = 1 2 3
M
a. An evennumbered mesh
Line of Symmetry
x = 0 x = π
• ◦ ◦ ◦ ◦ ◦ •
j
= 1 2 3 4 5
M
j = 1 2 3
M
b. An odd–numbered mesh
Figure B.1 – Symmetrical folds for
special cases
λ
m
= b + 2a cos
(2m−1)π
2M
, m = 1, 2, · · · , M
and the corresponding eigenvectors are
x
m
= sin
j(2m−1)π
2M
, j = 1, 2, · · · , M
B.6 Special Cases Involving Boundary Conditions
We consider two special cases for the matrix operator representing the 3point central
diﬀerence approximation for the second derivative ∂
2
/∂x
2
at all points away from the
boundaries, combined with special conditions imposed at the boundaries.
264 APPENDIX B. SOME PROPERTIES OF TRIDIAGONAL MATRICES
Note: In both cases
m = 1, 2, · · · , M
j = 1, 2, · · · , M
−2 + 2 cos(α) = −4 sin
2
(α/2)
When the boundary conditions are Dirichlet on both sides,
−2 1
1 −2 1
1 −2 1
1 −2 1
1 −2
¸
¸
¸
¸
¸
¸
¸
¸
λ
m
= −2 + 2 cos
mπ
M + 1
x
m
= sin
j
mπ
M + 1
(B.18)
When one boundary condition is Dirichlet and the other is Neumann (and a diagonal
preconditioner is applied to scale the last equation),
−2 1
1 −2 1
1 −2 1
1 −2 1
1 −1
¸
¸
¸
¸
¸
¸
¸
¸
λ
m
= −2 + 2 cos
¸
(2m−1)π
2M + 1
x
m
= sin
¸
j
(2m−1)π
2M + 1
(B.19)
Appendix C
THE HOMOGENEOUS
PROPERTY OF THE EULER
EQUATIONS
The Euler equations have a special property that is sometimes useful in constructing
numerical methods. In order to examine this property, let us ﬁrst inspect Euler’s
theorem on homogeneous functions. Consider ﬁrst the scalar case. If F(u, v) satisﬁes
the identity
F(αu, αv) = α
n
F(u, v) (C.1)
for a ﬁxed n, F is called homogeneous of degree n. Diﬀerentiating both sides with
respect to α and setting α = 1 (since the identity holds for all α), we ﬁnd
u
∂F
∂u
+v
∂F
∂v
= nF(u, v) (C.2)
Consider next the theorem as it applies to systems of equations. If the vector
F(Q) satisﬁes the identity
F(αQ) = α
n
F(Q) (C.3)
for a ﬁxed n, F is said to be homogeneous of degree n and we ﬁnd
¸
∂F
∂q
¸
Q = nF(Q) (C.4)
265
266APPENDIXC. THE HOMOGENEOUS PROPERTY OF THE EULER EQUATIONS
Now it is easy to show, by direct use of eq. C.3, that both E and F in eqs. 2.11 and
2.12 are homogeneous of degree 1, and their Jacobians, A and B, are homogeneous
of degree 0 (actually the latter is a direct consequence of the former).
1
This being
the case, we notice that the expansion of the ﬂux vector in the vicinity of t
n
which,
according to eq. 6.105 can be written in general as,
E = E
n
+A
n
(Q−Q
n
) +O(h
2
)
F = F
n
+B
n
(Q−Q
n
) +O(h
2
) (C.5)
can be written
E = A
n
Q+O(h
2
)
F = B
n
Q+O(h
2
) (C.6)
since the terms E
n
− A
n
Q
n
and F
n
− B
n
Q
n
are identically zero for homogeneous
vectors of degree 1, see eq. C.4. Notice also that, under this condition, the constant
term drops out of eq. 6.106.
As a ﬁnal remark, we notice from the chain rule that for any vectors F and Q
∂F(Q)
∂x
=
¸
∂F
∂Q
¸
∂Q
∂x
= A
∂Q
∂x
(C.7)
We notice also that for a homogeneous F of degree 1, F = AQ and
∂F
∂x
= A
∂Q
∂x
+
¸
∂A
∂x
¸
Q (C.8)
Therefore, if F is homogeneous of degree 1,
¸
∂A
∂x
¸
Q = 0 (C.9)
in spite of the fact that individually [∂A/∂x] and Q are not equal to zero.
1
Note that this depends on the form of the equation of state. The Euler equations are homoge
neous if the equation of state can be written in the form p = ρf(), where is the internal energy
per unit mass.
Bibliography
[1] D. Anderson, J. Tannehill, and R. Pletcher. Computational Fluid Mechanics and
Heat Transfer. McGrawHill Book Company, New York, 1984.
[2] C. Hirsch. Numerical Computation of Internal and External Flows, volume 1,2.
John Wiley & Sons, New York, 1988.
267
Contents
1 INTRODUCTION 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.1 Problem Speciﬁcation and Geometry Preparation . . . . . . 1.2.2 Selection of Governing Equations and Boundary Conditions 1.2.3 Selection of Gridding Strategy and Numerical Method . . . 1.2.4 Assessment and Interpretation of Results . . . . . . . . . . . 1.3 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 CONSERVATION LAWS AND THE MODEL EQUATIONS 2.1 Conservation Laws . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 The NavierStokes and Euler Equations . . . . . . . . . . . . . . 2.3 The Linear Convection Equation . . . . . . . . . . . . . . . . . 2.3.1 Diﬀerential Form . . . . . . . . . . . . . . . . . . . . . . 2.3.2 Solution in Wave Space . . . . . . . . . . . . . . . . . . . 2.4 The Diﬀusion Equation . . . . . . . . . . . . . . . . . . . . . . . 2.4.1 Diﬀerential Form . . . . . . . . . . . . . . . . . . . . . . 2.4.2 Solution in Wave Space . . . . . . . . . . . . . . . . . . . 2.5 Linear Hyperbolic Systems . . . . . . . . . . . . . . . . . . . . . 2.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 FINITEDIFFERENCE APPROXIMATIONS 3.1 Meshes and FiniteDiﬀerence Notation . . . . . 3.2 Space Derivative Approximations . . . . . . . . 3.3 FiniteDiﬀerence Operators . . . . . . . . . . . 3.3.1 Point Diﬀerence Operators . . . . . . . . 3.3.2 Matrix Diﬀerence Operators . . . . . . . 3.3.3 Periodic Matrices . . . . . . . . . . . . . 3.3.4 Circulant Matrices . . . . . . . . . . . . iii 1 1 2 2 3 3 4 4 5 7 7 8 12 12 13 14 14 15 16 17 21 21 24 25 25 25 29 30
. . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
Constructing Diﬀerencing Schemes of Any Order . . . . . 3.4.1 Taylor Tables . . . . . . . . . . . . . . . . . . . . 3.4.2 Generalization of Diﬀerence Formulas . . . . . . . 3.4.3 Lagrange and Hermite Interpolation Polynomials 3.4.4 Practical Application of Pad´ Formulas . . . . . . e 3.4.5 Other HigherOrder Schemes . . . . . . . . . . . . 3.5 Fourier Error Analysis . . . . . . . . . . . . . . . . . . . 3.5.1 Application to a Spatial Operator . . . . . . . . . 3.6 Diﬀerence Operators at Boundaries . . . . . . . . . . . . 3.6.1 The Linear Convection Equation . . . . . . . . . 3.6.2 The Diﬀusion Equation . . . . . . . . . . . . . . . 3.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
31 31 34 35 37 38 39 39 43 44 46 47 51 52 52 53 54 54 55 57 59 61 61 62 63 65 67 68 71 72 73 73 74 74 75 77 78 80
4 THE SEMIDISCRETE APPROACH 4.1 Reduction of PDE’s to ODE’s . . . . . . . . . . . . . . . . . . . . . . 4.1.1 The Model ODE’s . . . . . . . . . . . . . . . . . . . . . . . . 4.1.2 The Generic Matrix Form . . . . . . . . . . . . . . . . . . . . 4.2 Exact Solutions of Linear ODE’s . . . . . . . . . . . . . . . . . . . . 4.2.1 Eigensystems of SemiDiscrete Linear Forms . . . . . . . . . . 4.2.2 Single ODE’s of First and SecondOrder . . . . . . . . . . . . 4.2.3 Coupled FirstOrder ODE’s . . . . . . . . . . . . . . . . . . . 4.2.4 General Solution of Coupled ODE’s with Complete Eigensystems 4.3 Real Space and Eigenspace . . . . . . . . . . . . . . . . . . . . . . . . 4.3.1 Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3.2 Eigenvalue Spectrums for Model ODE’s . . . . . . . . . . . . . 4.3.3 Eigenvectors of the Model Equations . . . . . . . . . . . . . . 4.3.4 Solutions of the Model ODE’s . . . . . . . . . . . . . . . . . . 4.4 The Representative Equation . . . . . . . . . . . . . . . . . . . . . . 4.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 FINITEVOLUME METHODS 5.1 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 Model Equations in Integral Form . . . . . . . . . . . . . . . . . . . 5.2.1 The Linear Convection Equation . . . . . . . . . . . . . . . 5.2.2 The Diﬀusion Equation . . . . . . . . . . . . . . . . . . . . . 5.3 OneDimensional Examples . . . . . . . . . . . . . . . . . . . . . . 5.3.1 A SecondOrder Approximation to the Convection Equation 5.3.2 A FourthOrder Approximation to the Convection Equation 5.3.3 A SecondOrder Approximation to the Diﬀusion Equation . 5.4 A TwoDimensional Example . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . .
5.5
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
83 85 86 87 88 89 90 91 91 92 93 93 95 95 96 97 97 98 99 100 101 102 102 103 105 106 107 110 110 111 112 115 117 121 122 123 123 123
6 TIMEMARCHING METHODS FOR ODE’S 6.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 Converting TimeMarching Methods to O∆E’s . . . . . . . . . . . . 6.3 Solution of Linear O∆E’s With Constant Coeﬃcients . . . . . . . . 6.3.1 First and SecondOrder Diﬀerence Equations . . . . . . . . 6.3.2 Special Cases of Coupled FirstOrder Equations . . . . . . . 6.4 Solution of the Representative O∆E’s . . . . . . . . . . . . . . . . 6.4.1 The Operational Form and its Solution . . . . . . . . . . . . 6.4.2 Examples of Solutions to TimeMarching O∆E’s . . . . . . . 6.5 The λ − σ Relation . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5.1 Establishing the Relation . . . . . . . . . . . . . . . . . . . . 6.5.2 The Principal σRoot . . . . . . . . . . . . . . . . . . . . . . 6.5.3 Spurious σRoots . . . . . . . . . . . . . . . . . . . . . . . . 6.5.4 OneRoot TimeMarching Methods . . . . . . . . . . . . . . 6.6 Accuracy Measures of TimeMarching Methods . . . . . . . . . . . 6.6.1 Local and Global Error Measures . . . . . . . . . . . . . . . 6.6.2 Local Accuracy of the Transient Solution (erλ , σ , erω ) . . . 6.6.3 Local Accuracy of the Particular Solution (erµ ) . . . . . . . 6.6.4 Time Accuracy For Nonlinear Applications . . . . . . . . . . 6.6.5 Global Accuracy . . . . . . . . . . . . . . . . . . . . . . . . 6.7 Linear Multistep Methods . . . . . . . . . . . . . . . . . . . . . . . 6.7.1 The General Formulation . . . . . . . . . . . . . . . . . . . . 6.7.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.7.3 TwoStep Linear Multistep Methods . . . . . . . . . . . . . 6.8 PredictorCorrector Methods . . . . . . . . . . . . . . . . . . . . . . 6.9 RungeKutta Methods . . . . . . . . . . . . . . . . . . . . . . . . . 6.10 Implementation of Implicit Methods . . . . . . . . . . . . . . . . . . 6.10.1 Application to Systems of Equations . . . . . . . . . . . . . 6.10.2 Application to Nonlinear Equations . . . . . . . . . . . . . . 6.10.3 Local Linearization for Scalar Equations . . . . . . . . . . . 6.10.4 Local Linearization for Coupled Sets of Nonlinear Equations 6.11 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 STABILITY OF LINEAR SYSTEMS 7.1 Dependence on the Eigensystem . . . 7.2 Inherent Stability of ODE’s . . . . . 7.2.1 The Criterion . . . . . . . . . 7.2.2 Complete Eigensystems . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
5. . . .3 Practical Considerations for Comparing Methods 8. . . . . . . . . . . . . . . . . Fourier Stability Analysis .1 Stiﬀness Deﬁnition for ODE’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8. . . . . . . . .1. . .4. . . .2 Some Examples . . . . . . . . . . .6 7. . . . . . . . . . . . . . . .3 A Perspective . . AStable Methods . . 8. . . . . . . . .4. . . . . . . . . . . . . . . . . . . . . .2 Complete Eigensystems . . . .3. . . . . . . . . . . . .3. . . . . . . . . .1 σRoot Traces Relative to the Unit Circle . . . . .6. . . . .6 Steady Problems . . . . . . . . . .6. . .7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 The Basic Procedure . . . . . . . . . 8. . . . . . . . . . . . . . . . . . . . 8. .7. . . . . . . . . . . . . . . . .5. . . . . . . . . . . . . . . . . . 123 124 124 125 125 125 128 128 132 135 135 136 137 141 141 142 143 143 146 149 149 149 151 151 152 153 154 154 154 155 158 158 159 160 160 161 8 CHOICE OF TIMEMARCHING METHODS 8. . . . . . . . . . . . . . .4 7. . . . . . . . . . . .3 Relation to Circulant Matrices . . . . . . . . . . . . . .2 Stability for Small ∆t . . . . . .1 The Criterion . . . .1 Explicit Methods . . . . . . . . . . . .5. . . . . . . .7. . . . . . . . . . . . . . . . 7. 7. . . . . 8. . . . . . . 7. . .4 Comparing the Eﬃciency of Explicit Methods . . . . . . . . . . . . . . 8. .3 An Example Involving Periodic Convection 8. . 8. .8 7. . . . . . . . . . . . . .7.3. . . . . . . . . . . . . 8. . . .5. . . 8. . . . .2 An Example Involving Diﬀusion . 8. . .5. . . . 7. 7. . . . . . . . .6. .7 7. . . . . . . . . . . . . .3 Defective Eigensystems . . . . .3 Defective Eigensystems . . .2 Implicit Methods . . . . . . . . . . Numerical Stability of O∆E ’s . .3 Stiﬀness Classiﬁcations .2 Unconditional Stability. . 8. . .4. . . . . .2. . . . . . . . . . . . . . . . . . . . . 7. . . . . . . . . . . . . . . 7. . . . . . TimeSpace Stability and Convergence of O∆E’s . .3 Stability Contours in the Complex λh Plane.9 7. . .5 7. . . . . . . . . . . . . . . . . . . . . . . .1 Imposed Constraints .2 Relation of Stiﬀness to Space Mesh Size . . . . . . . Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Numerical Stability Concepts in the Complex σPlane . . . . . . . . . .7 Problems . . 7. . . 7. . 8. . . . . . . . . . . . . . . .1. . . . . . . . . . . 8. . . . . . . .1 Relation to λEigenvalues . . . . . . . . . . . .1 Stability for Large h. . . . . . . . . . . . . . . .3 7. . . . . . Problems . . . . . . . . . . . . . . . .1. . . .5 Coping With Stiﬀness . . . . . . . . . . . . . . . . . . Numerical Stability Concepts in the Complex λh Plane 7. . . . . . . . .2 Driving and Parasitic Eigenvalues . . 7. . . . . . . .
. . . . . . . .3. . . . . . . . . . . . . . . .1 Formulation of the Model Problem . . . . . . . . . . . . . . . . . . . . . . .1 Preconditioning the Basic Matrix . . . . 9. . . . . . . . . . . . . . . . . . . . . . . . 11.3 The Classical Methods . . . . . . . . . . . . .1 The PointJacobi System . . . .4. . . . . . . . . . . . .5 Artiﬁcial Dissipation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9. . . . . . . . 9. . . . . . . . . . . . . . . 12 SPLIT AND FACTORED FORMS 12. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 The Ordinary Diﬀerential Equation Formulation . . . . . . . 9. .4 Upwind Schemes . . . . . . . . . . . . . . . . and the Error 9. . . .1. . . . . . . . 9. . 191 10. . . . . . . . . . . . .3. . .3 A TwoGrid Process . .2 Classical Relaxation . . . .6 Problems . . . . . . . . . . . . 12. . . .1. . . . . .4. . .2.2 FluxDiﬀerence Splitting . . . . . . . . . . 9. . . . 11. . . . . . . . . . . . . . . . .1. . . . . . . . . . . . . . . . . . . .2 Properties of the Iterative Method . . . . . . . 200 10. . . . . . . .2 The GaussSeidel System . . . . . . . . . . . . . . . . .2. . . . .2 The Basic Process . . .2 The Modiﬁed Partial Diﬀerential Equation . . . . .1. . . . . . .1 FluxVector Splitting . . . . . . 192 10. . . . . . . . . . . . 11. . . . . . . . . . . .4. . . . . . . . . . . . . . . 9. . . . 11. . . . 9. . 12. . . . .3 The ODE Approach to Classical Relaxation . . . 9. . . . . . . . . . . .6 Problems . . . . . . . . 192 10. . .4. . . . . . . . . . . . . . . . . . . . . . . .2 The Converged Solution. . . . .3 The LaxWendroﬀ Method . . . . . . . .1 Motivation . . . . . . . . . .1 The Delta Form of an Iterative Scheme . . 9. . . . .1 Eigenvector and Eigenvalue Identiﬁcation with Space Frequencies191 10. .1 OneSided FirstDerivative Space Diﬀerencing 11. . . . . . . .2 Factoring Physical Representations — Time Splitting . . . . . . 163 164 164 166 168 168 168 169 170 170 172 173 174 176 180 182 187 10 MULTIGRID 191 10. . . . . . 202 11 NUMERICAL DISSIPATION 11. .3 Factoring Space Matrix Operators in 2–D . . . . . . . . . . . . . . . . . . .4. . . . . . . . . . . . . . . . . . . . . . . . . . the Residual. 9. . . . . . . .2. . . . .2 The Model Equations . . . . . . . . . . . . . . . . . . 11. . . . . . . .9 RELAXATION METHODS 9. . . . . . . . . . . . . . .3 The SOR System . . .4 Eigensystems of the Classical Methods . . . . . . . . . .5 Nonstationary Processes .1 The Concept . . . . . . . . . . . . . . . . 9. 203 204 205 207 209 210 212 213 214 217 217 218 220 . . . .4 Problems . . . . . . . . . . . . . . . . . . . . . 9. . . . . . . . . . . . . . . .2 ODE Form of the Classical Methods . 11. . . . . . . . . . . . . . . . 9. .
. . SecondOrder Factored Implicit Methods Importance of Factored Forms in 2 and 3 The Delta Form . B. . . . . .5 Vector and Matrix Norms . . . . . . . . . . 13. .3 The Factored Delta Form of the Implicit Euler Method . . . . . . . . . . . . . . . . . 220 221 221 223 226 226 228 229 233 233 234 234 237 238 242 242 243 243 244 245 247 13 LINEAR ANALYSIS OF SPLIT AND FACTORED FORMS 13.2 Data Bases and Space Vectors . 13. . . . . . . . . . . 13. . . . . .4 12.3 The Inverse of a Simple Tridiagonal . .5 12. . . . . . . . . . . . . . . . . . . . . . . . .2 Generalized Eigensystem for Simple Tridiagonals . . . .3 Data Base Permutations . . . . 251 . . . . . . . . 249 . . . . . . . . . . B. . . 13. . . . . .7 12. .2 The Factored Nondelta Form of the Implicit Euler Method 13.2 Analysis of a SecondOrder TimeSplit Method . . . . . .1 Notation . . . . . . . . . .4 Eigensystems of Circulant Matrices . .5 Special Cases Found From Symmetries . . . . .6 Problems . . . . . . . . . . .6 12. . . . . . . . . . . . . . . . .4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Algebra . . . . . . . . . . . . . . . . . . . . . . .4. . . . . . . . . .1 The Unfactored Implicit Euler Method . . . . . . . Dimensions . . A USEFUL RELATIONS AND DEFINITIONS GEBRA A. . . . . . . . . . . . . .4 Space Splitting and Factoring . 13. . . . . . . . . . . . . . . . . . . . . . . . . . . .3. . . . . . . 254 257 257 258 259 260 260 261 262 263 B SOME PROPERTIES OF TRIDIAGONAL MATRICES B. . . .5 Example Analysis of the 3D Model Equation . . . . . . . .4 Eigensystems . . . . . B. . . . 13.1 Standard Eigensystem for Simple Tridiagonals . . . .3. . . . . .4 Example Analysis of 2D Model Equations . . . . . FROM LINEAR AL249 . . . . . . . . . 12. . . . . . .1 Standard Tridiagonals . . 13. B.3. . . . . . . . . .1 Mesh Indexing Convention . A. . A. B. . . . . 12. . . . . . .1 The Representative Equation for Circulant Operators . .4. . . . . . . . 13. . . . . . . . . . . . . . . . . . . . . . . . . . A. .4. . .4 The Factored Delta Form of the Trapezoidal Method . . . . .4. . . .3. . . . . . . . .3 The Representative Equation for SpaceSplit Operators . . . . . . 250 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 General Circulant Systems . . . . . . . . . . . . . Problems . . . . . . . . . . . . . . . .2 Deﬁnitions . . . . . . . . . . . 13. . 13. . . . . . . . . . . . . . .4. . . . .2. . . . . . . . 251 . .1 Stability Comparisons of TimeSplit Methods . . . . . . . . . . . . . . . .6 Special Cases Involving Boundary Conditions . . . . . . . 12. . . A. . . . . . . . .2. . . . . . . . . . . . . B. . . . . . . . . . . . . . . . . . . . . . . . . . .2 Example Analysis of Circulant Systems . . . . . . . B.12. . .
C THE HOMOGENEOUS PROPERTY OF THE EULER EQUATIONS265 .
Chapter 1 INTRODUCTION 1. and turbulence. as a consequence. As we shall see in a later chapter. diﬀusion. shock waves. in eﬀect. In the ﬁeld of aerodynamics. The mathematical structure is the theory of linear algebra and the attendant eigenanalysis of linear systems. often have no analytic solution. The principal such constraint was the demand for uniﬁcation. boundary layers. motivates the numerical solution of the associated partial diﬀerential equations. slip surfaces. As time went on and these attempts began to crystallize.1 Motivation The material in this book originated from attempts to understand and systemize numerical solution techniques for the partial diﬀerential equations governing the physics of ﬂuid ﬂow. These events are related to the action and interaction of phenomena such as dissipation. of course. convection. At the same time it would seem to invalidate the use of linear algebra for the classiﬁcation of the numerical methods. underlying constraints on the nature of the material began to form. Experience has shown that such is not the case. which 1 . The ultimate goal of the ﬁeld of computational ﬂuid dynamics (CFD) is to understand the physical events that occur in the ﬂow of ﬂuids around and within designated objects. Many of the most important aspects of these relations are nonlinear and. The new equations. but the authors believe the answer is aﬃrmative and present this book as justiﬁcation for that belief. can change the form of the basic partial diﬀerential equations themselves. This. the use of numerical methods to solve partial diﬀerential equations introduces an approximation that. all of these phenomena are governed by the compressible NavierStokes equations. Was there one mathematical structure which could be used to describe the behavior and results of most numerical methods in common use in the ﬁeld of ﬂuid dynamics? Perhaps the answer is arguable.
Mathematically. Independent of the speciﬁc application under study. There is nothing wrong. ﬂow conditions. they can lead to serious diﬃculties. This motivates studying the concepts of stiﬀness. All of these concepts we hope to clarify in this book. simulate the physical phenomena listed above in ways that are not exactly the same as an exact solution to the basic partial diﬀerential equation. This motivates studying the concepts of stability. This means that the errors caused by the numerical approximation result in a modiﬁed partial diﬀerential equation having additional terms that can be identiﬁed with the physics of dissipation in the ﬁrst case and dispersion in the second. The geometry may result from . factorization. producing answers that represent little. as long as the error remains in some engineering sense “small”. Since they are not precisely the same as the original equations. if any. these errors are often identiﬁed with a particular physical phenomenon on which they have a strong eﬀect. nor with deliberately directing an error to a speciﬁc physical process. if used and interpreted properly.1 Problem Speciﬁcation and Geometry Preparation The ﬁrst step involves the speciﬁcation of the problem. for example. these diﬀerences are usually referred to as truncation errors. 1. that most numerical methods in practical use for solving the nondissipative Euler equations create a modiﬁed partial diﬀerential equation that produces some form of dissipation. and the requirements of the simulation.2. the following sequence of steps generally must be followed in order to obtain a satisfactory solution. if their eﬀects are not thoroughly understood and controlled.2 CHAPTER 1. the theory associated with the numerical analysis of ﬂuid mechanics was developed predominantly by scientists deeply interested in the physics of ﬂuid ﬂow and. However. However. INTRODUCTION are the ones actually being solved by the numerical process. and probably will. even if the errors are kept small enough that they can be neglected (for engineering purposes). these methods give very useful information. Regardless of what the numerical errors are called.2 Background The ﬁeld of computational ﬂuid dynamics has a broad range of applicability. are often referred to as the modiﬁed partial diﬀerential equations. 1. with identifying an error with a physical process. the resulting simulation can still be of little practical use if ineﬃcient or inappropriate algorithms are used. It is safe to say. Thus methods are said to have a lot of “artiﬁcial viscosity” or said to be highly dispersive. including the geometry. physical reality. convergence. of course. as a consequence. On the other hand. and algorithm development in general. they can. and consistency.
the Reynolds number and Mach number for the ﬂow over an airfoil. or elements. momentum. must be selected. or iii) the details of the ﬂow at some speciﬁc location. no geometry need be supplied. BACKGROUND 3 measurements of an existing conﬁguration or may be associated with a design study. and the thinlayer NavierStokes equations. As an example of solution parameters of interest in computing the ﬂowﬁeld about an airfoil. These may be steady or unsteady and compressible or incompressible. in a design context. Flow conditions might include. The success of a simulation depends greatly on the engineering insight involved in selecting the governing equations and physical models based on the problem speciﬁcation. The ﬁrst two of these requirements are often in conﬂict and compromise is necessary. We concern ourselves here only with numerical methods requiring such a tessellation of the domain. The partial diﬀerential equations resulting from these conservation laws are referred to as the NavierStokes equations. the Euler equations. .3 Selection of Gridding Strategy and Numerical Method Next a numerical method and a strategy for dividing the ﬂow domain into cells.1. The boundary conditions which must be speciﬁed depend upon the governing equations. and energy. or mesh. one may be interested in i) the lift and pitching moment only. physical models must be chosen for processes which cannot be simulated within the speciﬁed constraints. the turnaround time required. while the NavierStokes equations require the noslip condition. for example. Boundary types which may be encountered include solid walls. Turbulence is an example of a physical process which is rarely simulated in a practical context (at the time of writing) and thus is often modelled. it is always prudent to consider solving simpliﬁed forms of the NavierStokes equations when the simpliﬁcations retain the physics which are essential to the goals of the simulation. For example. However. symmetry boundaries. Instead. an appropriate set of governing equations and boundary conditions must be selected. etc. Possible simpliﬁed governing equations include the potentialﬂow equations. and the solution parameters of interest. If necessary. 1. The requirements of the simulation include issues such as the level of accuracy needed. in the interest of eﬃciency. the Euler equations require ﬂow tangency to be enforced. It is generally accepted that the phenomena of importance to the ﬁeld of continuum ﬂuid dynamics are governed by the conservation of mass. 1. periodic boundaries. at a solid wall.2. which is known as a grid.2. a set of objectives and constraints must be speciﬁed.2 Selection of Governing Equations and Boundary Conditions Once the problem has been speciﬁed. inﬂow and outﬂow boundaries. ii) the drag as well as the lift and pitching moment.2. Alternatively.
and Pletcher [1] and Hirsch [2]. and understand most numerical methods used to ﬁnd solutions for the complete compressible NavierStokes equations. the success of a simulation can depend on appropriate choices for the problem or class of problems of interest. Tannehill. Although the model equations are extremely simple and easy to solve. we present a foundation for developing. ﬁnitevolume. analyzing. they have been carefully selected to be representative. We believe that a thorough understanding of what happens when . ﬁniteelement. when used intelligently. Fortunately.4 CHAPTER 1. 1. hybrid. which are still evolving.and threedimensional ﬂuid ﬂow simulations. of diﬃculties and complexities that arise in realistic two. with emphasis on ﬁnitediﬀerence and ﬁnitevolume methods for the Euler and NavierStokes equations. Hence their numerical solution can illustrate the properties of a given numerical method when applied to a more complicated system of equations which governs similar physical phenomena. and can be aided by sophisticated ﬂow visualization tools and error estimation techniques. we can make use of much simpler expressions. composite. including structured. Rather than presenting the details of the most advanced methods. This step can require postprocessing of the data. For example. for example calculation of forces and moments. 1. These model equations isolate certain aspects of the physics contained in the complete set of equations. or spectral methods. analyze. It is critical that the magnitude of both numerical and physicalmodel errors be well understood. Furthermore. Some of them are presented in the books by Anderson. unstructured.3 Overview It should be clear that successful simulation of ﬂuid ﬂows can involve a wide range of issues from grid generation to turbulence modelling to the applicability of various simpliﬁed forms of the NavierStokes equations. to develop. The choices of a numerical method and a gridding strategy are strongly interdependent.2. The numerical methods generally used in CFD can be classiﬁed as ﬁnitediﬀerence. Here again. the use of ﬁnitediﬀerence methods is typically restricted to structured grids.4 Assessment and Interpretation of Results Finally. and overlapping grids. Many of these issues are not addressed in this book. and understanding such methods. Instead we focus on numerical methods. the results of the simulation must be assessed and interpreted. the grid can be altered based on the solution in an approach known as solutionadaptive gridding. the socalled “model” equations. INTRODUCTION Many diﬀerent gridding strategies exist.
e. however. NOTATION 5 numerical approximations are applied to the model equations is a major ﬁrst step in making conﬁdent and competent use of numerical approximations to the Euler and NavierStokes equations. For example. usually a vector A term of order (i. Some of the abbreviations used throughout the text are listed and deﬁned below. proportional to) α . The vector symbol is used for the vectors (or column matrices) which contain the values of the dependent variable at the nodes of a grid. PDE ODE O∆E RHS P.4 Notation The notation is generally explained as it is introduced.1. Otherwise. it should be noted that.S. the use of a vector consisting of a collection of scalars should be apparent from the context and is not identiﬁed by any special notation.S. 1. the variable u can denote a scalar Cartesian velocity component in the Euler and NavierStokes equations. such as velocity. Bold type is reserved for real physical vectors. As a word of caution.. kD bc O(α) Partial diﬀerential equation Ordinary diﬀerential equation Ordinary diﬀerence equation Righthand side Particular solution of an ODE or system of ODE’s Fixed (timeinvariant) steadystate solution kdimensional space Boundary conditions. a scalar quantity in the linear convection and diﬀusion equations. although we can learn a great deal by studying numerical methods as applied to the model equations and can use that information in the design and application of numerical methods to practical problems.4. there are many aspects of practical problems which can only be understood in the context of the complete physical systems. S. and a vector consisting of a collection of scalars in our presentation of hyperbolic systems.
6 CHAPTER 1. INTRODUCTION .
They are linear partial diﬀerential equations (PDE’s) with coeﬃcients that are constant in both space and time. These equations contain many of the salient mathematical and physical features of the full NavierStokes equations.1) V (t2 ) In this equation. though.Chapter 2 CONSERVATION LAWS AND THE MODEL EQUATIONS We start out by casting our equations in the most general form. momentum. stability. 2.1 Conservation Laws Conservation laws. the integral conservationlaw form. The concepts of convection and diﬀusion are prevalent in our development of numerical methods for computational ﬂuid dynamics. mass. can be written in the following integral form: QdV − t2 t2 QdV + V (t1 ) t1 S(t) n. accuracy. and they represent phenomena of importance to the analysis of certain aspects of ﬂuid dynamic problems. will be on representative model equations. such as the Euler and NavierStokes equations and our model equations. which is natural for ﬁnitediﬀerence schemes. in particular. per unit volume. The model equations we study have two properties in common. The Euler and NavierStokes equations are brieﬂy discussed in this Chapter. and convergence. e. and the recurring use of these model equations allows us to develop a consistent framework of analysis for consistency. the convection and diﬀusion equations. The equations are then recast into divergence form.g. Q is a vector containing the set of variables which are conserved. The equation is a statement of 7 .. The main focus. which is useful in understanding the concepts involved in ﬁnitevolume schemes. and energy.FdSdt = t1 V (t) P dV dt (2.
2. 2.4) and i. by ∇. containing the ﬂux of Q per unit area per unit time. The divergence form of Eq. is the wellknown divergence operator given.FdS = P dV (2. then Eq. y. they can be written as ∂Q ∂E + =0 ∂t ∂x with Q = ρu . and z coordinate directions. On the other hand. the region of space.2 The NavierStokes and Euler Equations The NavierStokes equations form a coupled system of nonlinear PDE’s describing the conservation of mass. 2.2 is obtained by applying Gauss’s theorem to the ﬂux integral. is an area A(t) bounded by a closed contour C(t).3 and ﬁnd a solution for Q on that basis are referred to as ﬁnitediﬀerence methods. For a Newtonian ﬂuid in one dimension. j. In two dimensions. respectively. a partial derivative form of a conservation law can also be derived. and P is the rate of production of Q per unit volume per unit time. and k are unit vectors in the x. leading to ∂Q + ∇. The vector n is a unit vector normal to the surface pointing outward. in Cartesian coordinates. or tensor. If all variables are continuous in time. CONSERVATION LAWS AND THE MODEL EQUATIONS the conservation of these quantities in a ﬁnite region of space with volume V (t) and surface area S(t) over a ﬁnite interval of time t2 − t1 . momentum and energy for a ﬂuid.1 can be rewritten as d QdV + n. Many of the advanced codes written for CFD applications are based on the ﬁnitevolume concept.6) e u(e + p) + κ ∂T ∂x .F = P (2.2) dt V (t) S(t) V (t) Those methods which make various numerical approximations of the integrals in Eqs. (2. 2. F is a set of vectors. or cell. 2.2 and ﬁnd a solution for Q on that basis are referred to as ﬁnitevolume methods.5) ρ E = ρu2 + p ρu − 0 4 ∂u µ 3 ∂x 4 µu ∂u 3 ∂x (2. (2. ≡ i ∂ ∂ ∂ +j +k ∂x ∂y ∂z .1 and 2. Those methods which make various approximations of the derivatives in Eq.8 CHAPTER 2.3) ∂t where ∇.
and p. for example. THE NAVIERSTOKES AND EULER EQUATIONS 9 where ρ is the ﬂuid density. they can lead to quite diﬀerent numerical solutions in terms of shock strength and shock speed. and Pletcher [1] and Hirsch [2]. This form of the equations is called conservationlaw or conservative form. u is the velocity. Later on we will make use of the following form of the Euler equations as well: ∂Q ∂Q ∂Q +A +B =0 ∂t ∂x ∂y (2. or at least may be treated as such. Tannehill. p is the pressure. The ﬂux vectors ∂Q given above are written in terms of the primitive variables. Details can be found in Anderson. For such ﬂows. these can be written as ∂Q ∂E ∂F + + =0 ∂t ∂x ∂y with q1 ρ q ρu 2 Q = = .9) where u and v are the Cartesian velocity components. ρ. µ is the coeﬃcient of viscosity. The total energy e includes internal energy per unit volume ρ (where is the internal energy per unit mass) and kinetic energy per unit volume ρu2 /2. These equations must be supplemented by relations between µ and κ and the ﬂuid state as well as an equation of state. q3 ρv q4 e (2.7) If we neglect viscosity and heat conduction. Thus the conservative form is appropriate for solving ﬂows with features such as shock waves. e is the total energy per unit volume. Nonconservative forms can be obtained by expanding derivatives of products using the product rule or by introducing diﬀerent dependent variables. and κ is the thermal conductivity. Although nonconservative forms of the equations are analytically the same as the above form. the Euler equations are obtained. Note that the convective ﬂuxes lead to ﬁrst derivatives in space. while the viscous and heat conduction terms involve second derivatives.10) ∂F The matrices A = ∂ E and B = ∂Q are known as the ﬂux Jacobians. u.8) ρu ρu2 + p E = . v. T is the temperature.2. with no interest in the transient portion of the solution. we are often interested in the steadystate solution of the NavierStokes equations. The steady solution to the onedimensional NavierStokes equations must satisfy ∂E =0 ∂x (2. In order . In twodimensional Cartesian coordinates. such as u and p. Many ﬂows of engineering interest are steady (timeinvariant). such as the ideal gas law. ρuv u(e + p) ρv ρuv F = 2 ρv + p v(e + p) (2.2.
q3 .11) + q3 q2 2 q1 F1 q3 q2 F q1 2 F = = 2 2 F3 (γ − 1)q4 + 3−γ q3 − γ−1 q2 2 q1 2 q1 2 3 F4 q γ q4 1 3 − q γ−1 2 q2 q3 2 q1 q3 (2. where γ is the ratio of speciﬁc heats. CONSERVATION LAWS AND THE MODEL EQUATIONS to derive the ﬂux Jacobian matrices.10 CHAPTER 2.13) where γ − 1 q3 = 2 q1 2 a21 3 − γ q2 − 2 q1 2 . and q4 . q1 . From this it follows that the ﬂux Jacobian of E can be written in terms of the conservative variables as a21 ∂Ei A= = ∂qj − q2 q3 q1 q1 0 1 (3 − γ) q2 q1 q3 q1 a42 0 (1 − γ) q3 q1 q2 q1 a43 a41 γ q2 1 γ−1 0 q 0 (2.12) + q3 2 q1 We have assumed that the pressure satisﬁes p = (γ − 1)[e − ρ(u2 + v 2 )/2] from the ideal gas law. we must ﬁrst write the ﬂux vectors E and F in terms of the conservative variables. as follows: E1 2 2 (γ − 1)q + 3−γ q2 − γ−1 q3 E 4 2 q1 2 q1 2 = E= q3 q2 E3 q1 3 2 E 4 q γ q4 1 2 − q γ−1 2 q2 2 q1 q2 (2. cp /cv . q2 .
THE NAVIERSTOKES AND EULER EQUATIONS 11 a41 q2 = (γ − 1) q1 q4 = γ q1 3 q3 + q1 2 q2 q4 −γ q1 q1 q3 + q1 2 q2 q1 a42 γ − 1 q2 − 3 2 q1 q2 q1 q3 q1 2 a43 = −(γ − 1) (2. which are further discussed in Section 2. The eigenvalues of the ﬂux Jacobian matrices are purely real. Thus we also study linear hyperbolic systems of PDE’s.2. This is the deﬁning feature of hyperbolic systems of PDE’s. aspects of convective phenomena associated with coupled systems of equations such as the Euler equations are important in developing numerical methods and boundary conditions. Furthermore. This motivates the choice of our two scalar model equations associated with the physics of convection and diﬀusion.16) Derivation of the two forms of B = ∂F/∂Q is similar.14) and in terms of the primitive variables as a 21 A= −uv 0 1 0 (3 − γ)u (1 − γ)v v a42 u a43 (γ − 1) 0 0 (2.15) a41 where a21 = γu γ−1 2 3−γ 2 v − u 2 2 ue ρ a41 = (γ − 1)u(u2 + v 2 ) − γ e γ−1 (3u2 + v 2 ) a42 = γ − ρ 2 a43 = (1 − γ)uv (2. The NavierStokes equations include both convective and diﬀusive ﬂuxes. The homogeneous property of the Euler equations is discussed in Appendix C.5. .2.
t) is a scalar quantity propagating with speed a. It is easy to show by substitution that the exact solution to the linear convection equation is then u(x. simply.3 2. the process continues forever without any . No boundary condition is speciﬁed at the opposite side. t) = u0 (x − at) (2. what enters on one side of the domain must be the same as that which is leaving on the other. With periodic boundary conditions. the ﬂow being simulated is periodic. Note that the lefthand boundary is the inﬂow boundary when a is positive. the waveform travels through one boundary and reappears at the other boundary. (2) In the other type. corresponding to a wave entering the domain through this “inﬂow” boundary. without special consideration of boundaries. It represents most of the “usual” situations encountered in convecting systems. Hence the wave leaves the domain through the outﬂow boundary without distortion or reﬂection. Now let us consider a situation in which the initial condition is given by u(x.17) Here u(x. This type of phenomenon is referred to. as the convection problem. a real constant which may be positive or negative. the “outﬂow” boundary. we pay a great deal of attention to it in the initial chapters. eventually returning to its initial position. while the righthand boundary is the inﬂow boundary when a is negative.18) The initial waveform propagates unaltered with speed a to the right if a is positive and to the left if a is negative. It is the simplest to study and serves to illustrate many of the basic properties of numerical methods applied to problems involving convection. This is consistent in terms of the wellposedness of a 1st order PDE. In this case. Hence. The manner in which the boundary conditions are speciﬁed separates the following two phenomena for which this equation is a model: (1) In one type. This is referred to as the biconvection problem. 0) = u0 (x). the scalar quantity u is given on one boundary.3. CONSERVATION LAWS AND THE MODEL EQUATIONS 2.1 The Linear Convection Equation Diﬀerential Form The simplest linear model for convection and wave propagation is the linear convection equation given by the following PDE: ∂u ∂u +a =0 ∂t ∂x (2. At any given time. and the domain is inﬁnite.12 CHAPTER 2.
As we shall see later. Substituting this solution into the linear convection equation. we ﬁnd that f (t) satisﬁes the following ordinary diﬀerential equation (ODE) df = −iaκf dt which has the solution f (t) = f (0)e−iaκt (2.23) where the frequency. In order to satisfy the periodic boundary conditions. the wavenumber. a.17. We restrict our attention to initial conditions in the form u(x. one obtains M u(x.21) (2. 2. With such an initial condition. Let the domain be given by 0 ≤ x ≤ 2π. 2.19.20 gives the following solution u(x. Preserving the shape of the initial condition u0 (x) can be a diﬃcult challenge for a numerical method. An arbitrary initial waveform can be produced by summing initial conditions of the form of Eq. and the phase speed. t) = f (0)eiκ(x−at) = f (0)ei(κx−ωt) (2. 2.22) Substituting f (t) into Eq. It is a measure of the number of wavelengths within the domain.20) where the time dependence is contained in the complex function f (t). This means that the phase speed is the same for all wavenumbers. 2. 2.19) where f (0) is a complex constant. in a simulation.24) The relation between the frequency and the wavenumber is known as the dispersion relation. the solution can be written as u(x. κ. THE LINEAR CONVECTION EQUATION 13 change in the shape of the solution. ω. are related by ω = κa (2. κ must be an integer. 0) = f (0)eiκx (2.3. For M modes.24 is characteristic of wave propagation in a nondispersive medium. 0) = m=1 fm (0)eiκm x (2. Eq. most numerical methods introduce some dispersion. waves with diﬀerent wavenumbers travel at diﬀerent speeds. The linear relation given by Eq. t) = f (t)eiκx (2.3. that is. and κ is the wavenumber.25) .2 Solution in Wave Space We now examine the biconvection problem in more detail.2.
the steadystate solution must satisfy ∂2u =0 (2. giving M u(x. Other steadystate solutions are obtained if a source term g(x) is added to Eq. this parabolic PDE governs the diﬀusion of heat in one dimension.1 The Diﬀusion Equation Diﬀerential Form Diﬀusive ﬂuxes are associated with molecular motion in a continuum ﬂuid.14 CHAPTER 2. A simple linear model equation for a diﬀusive process is ∂u ∂2u =ν 2 ∂t ∂x (2. Since the wave equation is linear. t) = m=1 fm (0)eiκm (x−at) (2. which is one that satisﬁes the governing PDE with the partial derivative in time equal to zero. 2. 2. with u representing the temperature. For example. the solution is obtained by summing solutions of the form of Eq.29) . In contrast to the linear convection equation. the diﬀusion equation has a nontrivial steadystate solution. 2. Dirichlet (speciﬁed u). u must vary linearly with x at steady state such that the boundary conditions are satisﬁed. as follows: ∂u ∂2u =ν − g(x) ∂t ∂x2 giving a steady statesolution which satisﬁes ∂2 u − g(x) = 0 ∂x2 (2.26) Dispersion and dissipation resulting from a numerical approximation will cause the shape of the solution to change from that of the original waveform. CONSERVATION LAWS AND THE MODEL EQUATIONS where the wavenumbers are often ordered such that κ1 ≤ κ2 ≤ · · · ≤ κM .27) where ν is a positive real constant. In the case of Eq. Boundary conditions can be periodic.27.23. Neumann (speciﬁed ∂u/∂x). 2.27.4 2.4.28) ∂x2 Therefore.30) (2. or mixed Dirichlet/Neumann.
4. 2.e.27. It is clear that the steadystate solution is given by a linear function which satisﬁes the boundary conditions.33) where κ must be an integer in order to satisfy the boundary conditions. t) = m=1 fm (t) sin κm x + h(x) (2. i.4. 0) = m=1 fm (0) sin κm x + h(x) (2. 2.34) satisﬁes the initial and boundary conditions.37 shows that high wavenumber components (large κm ) of the solution decay more rapidly than low wavenumber components. y) = 0 ∂x2 ∂y 2 15 (2. The latter is known as the Poisson equation for nonzero g. THE DIFFUSION EQUATION In two dimensions.27 gives the following ODE for fm : dfm = −κ2 νfm m dt and we ﬁnd fm (t) = fm (0)e−κm νt 2 (2. Let the initial condition be M u(x. A solution of the form M u(x. y) is again a source term. The corresponding steady equation is ∂2u ∂2u + − g(x. Let the domain be given by 0 ≤ x ≤ π with boundary conditions u(0) = ua . . u(π) = ub . we obtain M u(x.32) While Eq.2 Solution in Wave Space We now consider a series solution to Eq.2. Substituting this form into Eq. y) ∂t ∂x2 ∂y 2 where g(x. the diﬀusion equation becomes ∂u ∂2 u ∂2 u =ν + − g(x. t) = m=1 fm (0)e−κm νt sin κm x + h(x) 2 (2.36) Substituting fm (t) into equation 2. consistent with the physics of diﬀusion. h(x) = ua + (ub − ua )x/π. 2..34. 2. Eq. 2. 2. and as Laplace’s equation for zero g.35) (2. Eq.31 is parabolic.37) The steadystate solution (t → ∞) is simply h(x).32 is elliptic.31) (2.
and noting that X and X −1 are constants.38 by X−1 . CONSERVATION LAWS AND THE MODEL EQUATIONS 2.41) where Λ is a diagonal matrix containing the eigenvalues of A.1 Thus Λ = X −1 AX (2. and X is the matrix of right eigenvectors. 2. Such a system is hyperbolic if A is diagonalizable with real eigenvalues.43) When written in this manner. this equation can also be written in the form ∂u ∂f + =0 ∂t ∂x (2.2. Premultiplying Eq. postmultiplying A by the product XX −1 .16 CHAPTER 2. this can be rewritten as ∂w ∂w +Λ =0 ∂t ∂x (2.5 Linear Hyperbolic Systems The Euler equations. Many aspects of numerical methods for such systems can be understood by studying a onedimensional constantcoeﬃcient linear system of the form ∂u ∂u +A =0 (2. The entries in the ∂u ﬂux Jacobian are aij = ∂fi ∂uj (2. Eq. For conservation laws. form a hyperbolic system of partial diﬀerential equations. are also of hyperbolic type. . t) is a vector of length m and A is a real m × m matrix.42) (2. the equations have been decoupled into m scalar equations of the form ∂wi ∂wi + λi =0 (2.38) ∂t ∂x where u = u(x. we obtain Λ ∂X −1 u ∂ X −1 AX X −1 u + =0 ∂t ∂x With w = X −1 u.8.44) ∂t ∂x 1 See Appendix A for a brief review of some basic relations and deﬁnitions from linear algebra.40) The ﬂux Jacobian for the Euler equations is derived in Section 2.39) where f is the ﬂux vector and A = ∂ f is the ﬂux Jacobian matrix. 2. such as the Maxwell equations describing the propagation of electromagnetic waves. Other systems of equations governing convection and wave propagation phenomena.
That is. u + c. where u < c. which corresponds to inﬂow for these variables.6 Problems 1. where u is the local ﬂuid velocity.2. which is the inﬂow boundary for these variables. PROBLEMS 17 The elements of w are known as characteristic variables. and c = γp/ρ is the local speed of sound. characteristic variables associated with positive eigenvalues can be speciﬁed at the left boundary. p]T as follows: ∂R ∂R +M =0 ∂t ∂x where u ρ 0 M = 0 u ρ−1 0 γp u . while u + c and u − c are associated with sound waves. Characteristic variables associated with negative eigenvalues can be speciﬁed at the right boundary. Each characteristic variable satisﬁes the linear convection equation with the speed given by the corresponding eigenvalue of A. of course. In a subsonic ﬂow. 2. wave speeds of both positive and negative sign are present. although they remain nonlinear. The speed u is associated with convection of the ﬂuid. The signs of the eigenvalues of the matrix A are also important in determining suitable boundary conditions. they must be consistent with this approach. we must ensure that our numerical methods are appropriate for wave speeds of arbitrary sign and possibly widely varying magnitudes. Therefore. u. are u. Based on the above. While other boundary condition treatments are possible. Show that the 1D Euler equations can be written in terms of the primitive variables R = [ρ. The eigenvalues of the ﬂux Jacobian matrix. While the scalar linear convection equation is clearly an excellent model equation for hyperbolic systems. in a supersonic ﬂow. we see that a hyperbolic system in the form of Eq. leading to three equations in the form of the linear convection equation. corresponding to the fact that sound waves can travel upstream in a subsonic ﬂow. where u > c. The characteristic variables each satisfy the linear convection equation with the wave speed given by the corresponding eigenvalue. the boundary conditions can be speciﬁed accordingly. and u − c. Therefore. The onedimensional Euler equations can also be diagonalized.6. 2. all of the wave speeds have the same sign. or wave speeds.38 has a solution given by the superposition of waves which can travel in either the positive or negative directions and at varying speeds.
2.2. Plot the solution at t = 1 with ν = 1. fm (0) = 0 for m > 3. 3. Find its eigenvalues and compare with those obtained in question 2. The CauchyRiemann equations are formed from the coupling of the steady compressible continuity (conservation of mass) equation ∂ρu ∂ρv + =0 ∂x ∂y and the vorticity deﬁnition ω=− ∂v ∂u + =0 ∂x ∂y . Derive the ﬂux Jacobian matrix A = ∂E/∂Q for the 1D Euler equations resulting from the conservative variable formulation (Eq. 0) = sin x deﬁned on 0 ≤ x ≤ 2π. in the form of Eq. Find the eigenvalues and eigenvectors of A. Write the 2D diﬀusion equation. Write the classical wave equation ∂2 u/∂t2 = c2 ∂ 2 u/∂x2 as a ﬁrstorder system. that is.18 CHAPTER 2. are related by a similarity transform. p = (γ − 1)(e − ρu2 /2). 1. 2. ﬁnd the necessary values of fm (0). 2. respectively. Plot the ﬁrst three basis functions used in constructing the exact solution to the diﬀusion equation in Section 2. Given the initial condition u(x. Find the eigenvalues and eigenvectors of the matrix M derived in question 1. ∂u/∂t]T . and initial conditions from Eq. 7. Eq. Find the values of fm (0) required to reproduce the initial condition at these discrete points using M = 4 with κm = m − 1. 8. in the form ∂U ∂U +A =0 ∂t ∂x where U = [∂u/∂x. 9. Show that the two matrices M and A derived in questions 1 and 3. Plot the initial condition on the domain 0 ≤ x ≤ π.25. 4. 2. write it in the form of Eq. 2. j = 0. Next consider a solution with boundary conditions ua = ub = 0. 6.) 5. (Hint: use M = 2 with κ1 = 1 and κ2 = −1.2. 2. i. (Hint: make use of the matrix S = ∂Q/∂R.33 with fm (0) = 1 for 1 ≤ m ≤ 3.5).31.e.) Next consider the same initial condition deﬁned only at x = 2πj/4. 2. CONSERVATION LAWS AND THE MODEL EQUATIONS Assume an ideal gas.. 3.4.
2. v f= −ρu . Combining the two PDE’s. (c) Determine the conditions (in terms of ρ and u) under which the system is hyperbolic. (b) Determine the eigenvalues of the ﬂux Jacobians.. (d) Are the above ﬂuxes homogeneous? (See Appendix C. For isentropic and homenthalpic ﬂow. v g= −ρv −u One approach to solving these equations is to add a timedependent term and ﬁnd the steady solution of the following equation: ∂q ∂f ∂g + + =0 ∂t ∂x ∂y (a) Find the ﬂux Jacobians of f and g with respect to q. the system is closed by the relation γ−1 2 ρ= 1− u + v2 − 1 2 1 γ−1 Note that the variables have been nondimensionalized.6.) . i. PROBLEMS 19 where ω = 0 for irrotational ﬂow. has real eigenvalues.e. we have ∂f (q) ∂g(q) + =0 ∂x ∂y where q= u .
CONSERVATION LAWS AND THE MODEL EQUATIONS .20 CHAPTER 2.
thereby producing a system of ordinary diﬀerential equations. These can be applied either to spatial derivatives or time derivatives. Inspection of this ﬁgure permits us to deﬁne the terms and notation needed to describe ﬁnite21 . One can approximate these simultaneously and then solve the resulting diﬀerence equations. 3.1 Meshes and FiniteDiﬀerence Notation The simplest mesh involving both time and space is shown in Figure 3. all the geometric variation is absorbed into variable coeﬃcients of the transformed equations. as shown in Figure 3. For this reason. Our emphasis in this chapter is on spatial derivatives. our model equations contain partial derivatives with respect to both space and time.Chapter 3 FINITEDIFFERENCE APPROXIMATIONS In common with the equations governing unsteady ﬂuid ﬂow. In this chapter. This is the approach emphasized here. we use an equispaced Cartesian system without being unduly restrictive or losing practical application. time derivatives are treated in Chapter 6. The time derivatives are approximated next. The computational space is uniform. All of the material below is presented in a Cartesian system. We emphasize the fact that quite general classes of meshes expressed in general curvilinear coordinates in physical space can be transformed to a uniform Cartesian mesh with equispaced intervals in a socalled computational space. leading to a timemarching method which produces a set of diﬀerence equations. Strategies for applying these ﬁnitediﬀerence approximations will be discussed in Chapter 4. in much of the following accuracy analysis.2. the concept of ﬁnitediﬀerence approximations to partial derivatives is presented. Alternatively.1. one can approximate the spatial derivatives ﬁrst.
t). For the ﬁrst several chapters we consider primarily the 1D case u = u(x. u. k. z. time is always a superscript and is usually enclosed with parentheses to distinguish it from an exponent. local context should make the meaning obvious. When only one variable is denoted. but when used together.1: Physical and computational spaces. The convention for subscript and superscript indexing is as follows: u(t + kh) = u([n + k]h) = un+k u(x + m∆x) = u([j + m]∆x) = uj+m u(x + m∆x. t + kh) = (n+k) uj+m (3. are functions of the independent variables t. and x.2. When n.22 CHAPTER 3. both the time and space indices appear as a subscript. l are used for other purposes (which is sometimes necessary). for example. dependence on the other is assumed.1) (3. . j.2) where ∆x is the spacing in x and ∆t the spacing in t. Then on an equispaced grid x = xj = j∆x t = tn = n∆t = nh (3. diﬀerence approximations. The mesh index for x is always j.3) Notice that when used alone. y. as shown in Figure 3. FINITEDIFFERENCE APPROXIMATIONS y x Figure 3. Note that h = ∆t throughout. the dependent variables. and that for t is always n. In general. Later k and l are used for y and z in a similar way.
(3.6) but the precise nature (and order) of the approximation is not carried in the symbol δ. ∆xj = xj+1 − xj . .3. but (with one exception) it is not unique. Other ways are used to determine its precise meaning. Thus ux will not be used to represent the ﬁrst derivative of u with respect to x. The one exception is the symbol ∆. ∂x2 etc. δx ≈ ∂x .7) When there is no subscript on ∆t or ∆x. ∂x ∂t u = ∂u .1. ∆un = un+1 − un. ∂t ∂xx u = ∂2 u . By this we mean that the symbol δ is used to represent a diﬀerence approximation to a derivative such that. Thus for partial derivatives in space or time we use interchangeably ∂x u = ∂u . etc.4) For the ordinary time derivative in the study of ODE’s we use u = du dt (3. δxx ≈ ∂xx (3.5) In this text. subscripts on dependent variables are never used to express derivatives. The notation for diﬀerence approximations follows the same philosophy.2: Spacetime grid arrangement. Derivatives are expressed according to the usual conventions. the spacing is uniform. for example. MESHES AND FINITEDIFFERENCE NOTATION t Grid or Node Points n+1 n n1 23 ∆x ∆t j2 j1 j j+1 j+2 x Figure 3. which is deﬁned such that ∆tn = tn+1 − tn . (3.
9) Now subtract uj and divide by ∆x to obtain uj+1 − uj = ∆x ∂u ∂x 1 ∂2u + (∆x) 2 ∂x2 +··· j (3. FINITEDIFFERENCE APPROXIMATIONS 3. For example. t) is continuously diﬀerentiable..10. it is clear that the discrete terms on the left side of the equation represent a ﬁrst derivative with a certain amount of error which appears on the right side of the equal sign. consider the Taylor series expansion for uj+1: uj+1 = uj + (∆x) ∂u ∂x 1 ∂2u + (∆x)2 2 ∂x2 +···+ j j 1 ∂nu (∆x)n n! ∂xn +··· j (3. It is also clear that the error depends on the grid spacing to a certain order. Then. The latter is referred to as the threepoint centered diﬀerence approximation. 3. Similarly. 3. t) with t ﬁxed. we see that the expression (uj+1 −uj )/∆x is a ﬁrstorder approximation to ∂ u .10) j Thus the expression (uj+1 −uj )/∆x is a reasonable approximation for ∂ u as long as ∂x j ∆x is small relative to some pertinent length scale. Next consider the space diﬀerence approximation (uj+1 − uj−1 )/(2∆x).12) We assume that u(x.11 shows that ∂x j (uj+1 − uj−1)/(2∆x) is a secondorder approximation to a ﬁrst derivative. and one often sees the summary result presented in the form ∂u ∂x 1 = j uj+1 − uj−1 + O(∆x2 ) 2∆x (3. The error term containing the grid spacing to the lowest power gives the order of the method.3. j (3. From Eq.8) j Local diﬀerence approximations to a given partial derivative can be formed from linear combinations of uj and uj+k for k = ±1. Expand the terms in the numerator about j and regroup the result to form the following equation uj+1 − uj−1 ∂u − 2∆x ∂x 1 ∂3 u = ∆x2 6 ∂x3 + j j 1 ∂5 u ∆x4 120 ∂x5 .11) When expressed in this manner.1 to 3.24 CHAPTER 3. 3. · · ·. consider u(x. x = j∆x and u(x + k∆x) = u(j∆x + k∆x) = uj+k . For example. ±2. . Eq.2 Space Derivative Approximations A diﬀerence approximation can be generated or evaluated by means of a simple Taylor series expansion.. following the notation convention given in Eqs. Expanding the latter term about x gives1 uj+k = uj + (k∆x) ∂u ∂x 1 ∂2u + (k∆x)2 2 ∂x2 j +···+ j 1 ∂nu (k∆x)n n! ∂xn + · · · (3.
1 FiniteDiﬀerence Operators Point Diﬀerence Operators Perhaps the most common examples of ﬁnitediﬀerence formulas are the threepoint centereddiﬀerence approximations for the ﬁrst and second derivatives:2 ∂u ∂x ∂2u ∂x2 = j = j 1 (uj+1 − uj−1) + O(∆x2 ) 2∆x (3. Notice that when we speak of “the number of points in a mesh”.13) (3.14) 1 (uj+1 − 2uj + uj−1) + O(∆x2 ) ∆x2 These are the basis for point diﬀerence operators since they give an approximation to a derivative at one discrete point in a mesh in terms of surrounding points.3. a 1 2 3 4 x= 0 − − − − j= 1 · · M b π Four point mesh. Now let us derive a matrix operator representation for the same approximation. Such additional information requires a more sophisticated formulation. FINITEDIFFERENCE OPERATORS 25 3. we mean the number of interior points excluding the boundaries. 3. . u(π) = ub and use the centered diﬀerence approximation given by Eq.3. However. 3.15) which is a point diﬀerence approximation to a second derivative.2 Matrix Diﬀerence Operators 1 (uj+1 − 2uj + uj−1 ) ∆x2 Consider the relation (δxx u)j = (3. Consider the four point mesh with boundary points at a and b shown below. u(0) = ua .3 3. neither of these expressions tells us how other points in the mesh are diﬀerenced or how boundary conditions are enforced.3. ∆x = π/(M + 1) Now impose Dirichlet boundary conditions.15 at every point in the mesh. We 2 We will derive the second derivative operator shortly.3.
none . 3.26 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS arrive at the four equations: (δxx u)1 = (δxx u)2 (δxx u)3 (δxx u)4 1 (ua − 2u1 + u2 ) ∆x2 1 = (u1 − 2u2 + u3) ∆x2 1 = (u2 − 2u3 + u4) ∆x2 1 = (u3 − 2u4 + ub) ∆x2 (3. and further.17 as δxx u = Au + bc (3.19) This example illustrates a matrix diﬀerence operator. bc = 1 ∆x2 ua 0 0 ub (3. For example.16) Writing these equations in the more suggestive form (δxx u)1 (δxx u)2 (δxx u)3 (δxx u)4 = = = = ( ua −2u1 +u2 ( u1 −2u2 +u3 ( u2 −2u3 +u4 ( u3 −2u4 +ub )/∆x2 )/∆x2 )/∆x2 )/∆x2 (3. In the extreme case of the matrix diﬀerence operator representing a spectral method. Each line of a matrix diﬀerence operator is based on a point diﬀerence operator.17) it is clear that we can express them in a vectormatrix form.20) (3. but the point operators used from line to line are not necessarily the same.18) and −2 1 1 1 −2 1 A= 2 1 −2 1 ∆x 1 −2 we can rewrite Eq. that the resulting matrix has a very special form. boundary conditions may dictate that the lines at or near the bottom or top of the matrix be modiﬁed. Introducing u= u1 u2 u3 u4 .
1 (3. c .3.): (δxx u)4 = 2 1 ∂u 3 ∆x ∂x − b 2 1 (u4 − u3 ) 3 ∆x2 (3.24) Each of these matrix diﬀerence operators is a square matrix with elements that are all zeros except for those along bands which are clustered around the central diagonal. replace the fourth line in Eq.6. 3.23) b Then the matrix operator for a threepoint centraldiﬀerencing scheme at interior points and a secondorder approximation for a Neumann condition on the right is given by −2 1 1 −2 1 = 1 ∆x2 δxx 1 −2 1 2/3 −2/3 (3. and the illustration is given for a simple tridiagonal matrix although any number of .3. ..22) where the boundary condition is ∂u ∂x = x=π ∂u ∂x (3. FINITEDIFFERENCE OPERATORS 27 of the lines is the same. b.16 by the following point operator for a Neumann boundary condition (See Section 3.25) M where the matrix dimensions are M × M . a b a b .21) As a further example. The matrix operators representing the threepoint centraldiﬀerence approximations for a ﬁrst and second derivative with Dirichlet boundary conditions on a fourpoint mesh are 0 1 1 −1 0 δx = −1 2∆x 1 0 1 −1 0 . Use of M in the argument is optional. δxx −2 1 1 1 −2 1 = 2 1 −2 1 ∆x 1 −2 (3.. c) = b c a b c . We call such a matrix a banded matrix and introduce the notation B(M : a.
29 carry more information than the point operator given by Eq. −2. 0.26) we can approximate the second derivative of u by δxx u = 1 B(1. Deﬁning u as3 u= u1 u2 u3 . 0.15. A tridiagonal matrix without constants along the bands can be expressed as B(a. In Eqs. −2. 3 ∆x ∂u ∂x T b 1 B(a. 3 . · · · . 3. 3. The ability to specify in the matrix derivative operator the exact nature of the approximation at the Note that u is a function of time only since each element corresponds to one speciﬁc spatial location.22. The arguments for a banded matrix are always odd in number and the central one always refers to the central diagonal.27 and 3. 0. −2. 3. 1)u + bc ∆x2 (3. 0.28) If we prescribe Neumann boundary conditions on the right side. . 1)u + bc ∆x2 (3. 3. We can now generalize our previous examples. b. c). the boundary conditions have been uniquely speciﬁed and it is clear that the same point operator has been applied at every point in the ﬁeld except at the boundaries.28 CHAPTER 3. · · · . as in Eqs. we ﬁnd δxx u = where a = [1. FINITEDIFFERENCE APPROXIMATIONS bands is a possibility.27) where bc stands for the vector holding the Dirichlet boundary conditions on the left and right sides: bc = 1 T 2 [ua .24 and 3. uM (3. · · · . · · · .27 and 3.29) Notice that the matrix operators given by Eqs. b.29. . ub ] ∆x (3. 2/3]T b = [−2. 1. −2/3]T bc = 1 2∆x 2 ua .
If the boundary conditions are periodic.3. or more suggestively on a circular mesh as in Figure 3. 0)u (3.3.3. b. a1 . A jshift in the point operator corresponds to a diagonal shift in the matrix operator. When the mesh is laid out on the perimeter of a circle. 3. the form of the matrix operator changes. The corresponding matrix operator has for its arguments the coeﬃcients giving the weights to the values of the function at the various locations. it doesn’t really matter where the numbering starts as long as it “ends” at the point just preceding its starting location. ∆x = 2π/M The matrix that represents diﬀerencing schemes for scalar equations on a periodic mesh is referred to as a periodic matrix. 3.30 is δx u = B(a2 .30) might be the point operator for a ﬁrst derivative. Consider the eightpoint periodic mesh shown below.3 Periodic Matrices The above illustrated cases in which the boundary conditions are ﬁxed.3. Since we make considerable use of both matrix and point operators. This can either be presented on a linear mesh with repeated entries. For example (δx u)j = a2 uj−2 + a1 uj−1 + buj + c1 uj+1 (3. it is important to establish a relation between them. A typical periodic tridiagonal matrix operator . c1 . ··· 7 8 1 2 3 4 5 6 7 8 1 x = − − 0 − − − − − − − 2π j= 0 1 · · · · · · M 2 ··· − Eight points on a linear periodic mesh. Thus the matrix equivalent of Eq. FINITEDIFFERENCE OPERATORS 29 various points in the ﬁeld including the boundaries permits the use of quite general constructions which will be useful later in considerations of stability. A point operator is generally written for some derivative at the reference point j in terms of neighboring values of the function.31) Note the addition of a zero in the ﬁfth element which makes it clear that b is the coeﬃcient of uj .
We will have much more to say about them later. as shown in the example. However. formed when the elements along the various bands are constant. with nonuniform entries is given for a 6point mesh by b1 c2 a6 a1 b2 c3 a2 b3 c4 Bp (6 : a. the elements along the diagonals of a periodic matrix are not constant. b.33) Notice that each row of a circulant matrix is shifted (see Figure 3. a special subset of a periodic matrix is the circulant matrix.30 CHAPTER 3.3: 4 Eight points on a circular mesh. Circulant matrices play a vital role in our analysis.3.4 Circulant Matrices In general. FINITEDIFFERENCE APPROXIMATIONS 1 8 2 7 3 6 5 Figure 3.3) one element to the right of the one above it. The special case of a tridiagonal circulant matrix is .32) 3. The most general circulant matrix of order 4 is b0 b3 b2 b1 b1 b0 b3 b2 b2 b1 b0 b3 b3 b2 b1 b0 (3. c) = a3 b4 c5 a4 b5 c6 c1 a5 b6 (3.
4. Later on we take advantage of this special property.. Notice that there are no boundary condition vectors since this information is all interior to the matrices themselves. As an example. 1 (3.8. consider Table 3.34) M When the standard threepoint centraldiﬀerencing approximations for a ﬁrst and second derivative.4 3.3.35) Clearly.1 Constructing Diﬀerencing Schemes of Any Order Taylor Tables The Taylor series expansion of functions about a ﬁxed point provides a means for constructing ﬁnitediﬀerence pointoperators of any order. −2. they take the form 0 1 −1 1 −1 1 0 1 (δx )p = = B (−1. 3.4. .” which makes extensive use of the expansion given by Eq. CONSTRUCTING DIFFERENCING SCHEMES OF ANY ORDER given by 31 Bp (M : a. a c b a b . c) = b c a b a c .21. are used with periodic boundary conditions.. A simple and straightforward way to carry this out is to construct a “Taylor table. which represents a Taylor table for an approximation of a second derivative using three values of the function centered about the point at which the derivative is to be evaluated. 3. b. 1) −1 0 1 2∆x p 2∆x 1 −1 0 and −2 1 1 1 1 −2 1 (δxx )p = 1 −2 1 ∆x2 1 1 −2 1 = Bp (1. . see Eq. 0.1. 1) ∆x2 (3. c . 3. these special cases of periodic operators are also circulant operators.
in this case. Each entry is the coeﬃcient of the term at the top of the corresponding column in the Taylor series expansion of the term to the left of the corresponding row. under the headings. For example.1. 1. Taylor table for centered 3point Lagrangian approximation to a second derivative. FINITEDIFFERENCE APPROXIMATIONS ∂2u ∂x2 uj ∆x2 · ∂2u ∂x2 j − j 1 (a uj−1 + b uj + c uj+1) = ? ∆x2 ∆x2 · ∂2u ∂x2 j ∆x· ∂u ∂x j ∆x3 · ∂3u ∂x3 j ∆x4 · ∂4u ∂x4 j 1 −a −b −c −a · (1) · 1 1 −c · (1) · 1 1 1 −a · (1)2 · 2 −c · (1)2 · 1 2 −a · (1)3 · 1 6 −c · (1)3 · 1 6 1 −a · (1)4 · 24 1 −c · (1)4 · 24 −a · uj−1 −b · uj −c · uj+1 Table 3. The table is constructed so that some of the algebra is simpliﬁed. 2. namely. ∆xk · ∂k u ∂xk k = 0. the last row in the table corresponds to the Taylor series expansion of −c uj+1: 1 ∂u −c uj+1 = −c uj − c · (1) · ∆x · 1 ∂x 1 ∂ u −c · (1)3 · ∆x3 · 6 ∂x3 3 1 ∂2 u − c · (1)2 · ∆x2 · 2 ∂x2 j 1 ∂ u ∆x4 · 24 ∂x4 4 j j − c · (1)4 · j − · · · (3.32 CHAPTER 3. This represents one of the questions that a study of this table can answer. . that is. make up the Taylor table. what is the local error caused by the use of this approximation? Notice that all of the terms in the equation appear in a column at the left of the table (although. At the top of the table we see an expression with a question mark. · · · j The columns to the right of the leftmost one.36) A Taylor table is simply a convenient way of forming linear combinations of Taylor series on a term by term basis. Then notice that at the head of each column there appears the common factor that occurs in the expansion of each term about the point j. ∆x2 has been multiplied into each term in order to simplify the terms to be put into the table).
To maximize the order of accuracy of the method. and c. We designate the ﬁrst nonvanishing sum to be ert . b.38) The Taylor table for a 3point backwarddiﬀerencing operator representing a ﬁrst derivative is shown in Table 3. −2. c] = [1.3. we proceed from left to right and force.2.4. We have just derived the familiar 3point centraldiﬀerencing point operator for a second derivative ∂2u ∂x2 − j 1 (uj−1 − 2uj + uj+1) = O(∆x2 ) ∆x2 (3. CONSTRUCTING DIFFERENCING SCHEMES OF ANY ORDER 33 Consider the sum of each of these columns. One can easily show that the sums of the ﬁrst three columns are zero if we satisfy the equation −1 −1 −1 a 0 0 −1 b = 0 1 −1 0 −1 c −2 The solution is given by [a. and refer to it as the Taylor series error. The columns that do not sum to zero constitute the error.37) j Note that ∆x2 has been divided through to make the error term consistent. these sums to be zero. . 1]. b. In this case ert occurs at the ﬁfth column in the table (for this example all even columns will vanish by symmetry) and one ﬁnds ert = 1 −a −c ∂4u ∆x4 + 24 ∂x4 ∆x2 24 = j −∆x2 ∂ 4 u 12 ∂x4 (3. by the proper choice of a.
39) j Thus we have derived a secondorder backwarddiﬀerence approximation of a ﬁrst derivative: ∂u ∂x − j 1 (uj−2 − 4uj−1 + 3uj ) = O(∆x2 ) 2∆x (3. Taylor table for backward 3point Lagrangian approximation to a ﬁrst derivative.4.34 CHAPTER 3. In this case the fourth column provides the leading 2 truncation error term: ert = 1 8a2 a1 ∂3u + ∆x3 ∆x 6 6 ∂x3 = j ∆x2 ∂ 3 u 3 ∂x3 (3.2. b] = 1 [1.40) 3. FINITEDIFFERENCE APPROXIMATIONS ∂u ∂x uj ∆x · ∂u ∂x j − j 1 (a2 uj−2 + a1 uj−1 + b uj ) = ? ∆x ∆x2 · ∂2u ∂x2 j ∆x· ∂u ∂x j ∆x3 · ∂3u ∂x3 j ∆x4 · ∂4u ∂x4 j 1 −a2 −a1 −b −a2 · (2) · 1 1 −a1 · (1) · 1 1 −a2 · (2)2 · 1 2 2 1 −a1 · (1) · 2 −a2 · (2)3 · 1 6 3 1 −a1 · (1) · 6 1 −a2 · (2)4 · 24 1 −a1 · (1)4 · 24 −a2 · uj−2 −a1 · uj−1 −b · uj Table 3. a1 . a diﬀerence approximation to the mth derivative at grid point j can be cast in terms of q + p + 1 neighboring points as ∂mu ∂xm q − j i=−p ai uj+i = ert (3.2 Generalization of Diﬀerence Formulas In general.41) . 3]. −4. This time the ﬁrst three columns sum to zero if −1 −1 −1 a2 0 1 0 a1 = −1 2 −4 −1 0 0 b which gives [a2 .
and zero when x takes any other discrete value in the set. 3. The construction of the ak (x) can be taken from the simple Lagrangian formula for quadratic interpolation (or extrapolation) with nonequispaced points (x1 − x)(x2 − x) (x0 − x)(x2 − x) + u1 (x1 − x0 )(x2 − x0 ) (x0 − x1 )(x2 − x1 ) (x0 − x)(x1 − x) +u2 (x0 − x2 )(x1 − x2 ) u(x) = u0 (3. CONSTRUCTING DIFFERENCING SCHEMES OF ANY ORDER 35 where the ai are coeﬃcients to be determined through the use of Taylor tables to produce approximations of a given order. that is from the point of view of interpolation formulas.42) where ak (x) are polynomials in x of degree K.44) k . impose an equispaced mesh. Hermite formulas use values of the function and its derivative(s) at given points in space. Clearly this process can be used to ﬁnd forward. and evaluate these derivatives at the appropriate discrete point. In order to do this. let us approach the subject in a slightly diﬀerent way. is the fact that it can be further generalized. To construct a polynomial for u(x).3 Lagrange and Hermite Interpolation Polynomials The Lagrangian interpolation polynomial is given by K u(x) = k=0 ak (x)uk (3. Finitediﬀerence schemes that can be derived from Eq.42 are referred to as Lagrangian approximations. 3.43) Notice that the coeﬃcient of each uk is one when x = xk . skewed.4. or central point operators of any order for any derivative. however. More important. It could be computer automated and extended to higher dimensions. we rederive the ﬁnitediﬀerence approximations just presented. A generalization of the Lagrangian approach is brought about by using Hermitian interpolation.3. backward. Our illustration is for the case in which discrete values of the function and its ﬁrst derivative are used.4. If we take the ﬁrst or second derivative of u(x). These formulas are discussed in many texts on numerical analysis. producing the expression u(x) = ak (x)uk + bk (x) ∂u ∂x (3.
Taylor table for central 3point Hermitian approximation to a ﬁrst derivative. which also must be expanded using Taylor series about point j. Here not only is the derivative at point j represented. A generalization of Eq. Consider next the Taylor table for an implicit space diﬀerencing scheme for a ﬁrst derivative arising from the use of a Hermite interpolation formula. The previous examples of a Taylor table constructed explicit point diﬀerence operators from Lagrangian interpolation formulas. 3.45) analogous to Eq. but here we need only the concept.44.3. FINITEDIFFERENCE APPROXIMATIONS Obviously higherorder derivatives could be included as the problems dictate. s bi i=−r ∂mu ∂xm q + j+i i=−p ai uj+i = ert (3. 3. An example formula is illustrated at the top of Table 3.8: ∂mu ∂xm = j+k n 1 n ∂ (k∆x) ∂xn n=0 n! ∞ ∂mu ∂xm (3. i. A complete discussion of these polynomials can be found in many references on numerical methods.. 3. .36 CHAPTER 3. but also included are derivatives at points j − 1 and j + 1.e. ∂u ∂x uj ∆x · d ∆x · ∆x · e ∂u ∂x j−1 ∂u ∂x j ∂u ∂x j+1 d + j−1 ∂u ∂x +e j ∂u ∂x ∆x · 2 ∂2u ∂x2 j − j+1 1 (auj−1 + buj + cuj+1 ) = ? ∆x ∆x3 · ∂3u ∂x3 j ∆x· ∂u ∂x j ∆x4 · ∂4u ∂x4 j ∆x5 · ∂5u ∂x5 j d 1 e −a −b −c −a · (1) · 1 1 −c · (1) · 1 1 d · (1) · 1 1 e · (1) · 1 1 2 1 −a · (1) · 2 −c · (1)2 · 1 2 d · (1)2 · 1 2 e · (1)2 · 1 2 3 1 −a · (1) · 6 −c · (1)3 · 1 6 d · (1)3 · 1 6 e · (1)3 · 1 6 1 4 −a · (1) · 24 1 −c · (1)4 · 24 1 d · (1)4 · 24 1 e · (1)4 · 24 1 −a · (1)5 · 120 1 −c · (1)5 · 120 −a · uj−1 −b · uj −c · uj+1 Table 3. This requires the following generalization of the Taylor series expansion given in Eq.3.41 can include derivatives at neighboring points.46) j The derivative terms now have coeﬃcients (the coeﬃcient on the j point is taken as one to simplify the algebra) which must be determined using the Taylor table approach as outlined below.
CONSTRUCTING DIFFERENCING SCHEMES OF ANY ORDER To maximize the order of accuracy.47) j ∂u ∂x + j ∂u ∂x − j+1 3 (−uj−1 + uj+1) = O(∆x4 ) ∆x (3. 4.4. 4. 0. 1.51) In this case the vector containing the boundary conditions would include values of both u and ∂u/∂x at both boundaries. 3.4. 4. In the form of a point operator it is probably not evident at ﬁrst just how Eq.49) in which Dirichlet boundary conditions have been imposed. 1)δx u = B(−1. 0. d. e] = 1 [−3. Under these conditions the sixth 4 column sums to ert = and the method can be expressed as ∂u ∂x +4 j−1 ∆x4 ∂ 5 u 120 ∂x5 (3. 1]. 1)u + bc 6 2∆x (3. the situation is quite easy to comprehend if we express the same method in the form of a matrix operator.50) which can be reexpressed by the “predictorcorrector” sequence 1 B(−1. 1)u + bc 2∆x δx u = 6 [B(1.4 Mathematically this is equivalent to δx u = 6[B(1. . 0.3. However. A banded matrix notation for Eq.4 Practical Application of Pad´ Formulas e It is one thing to construct methods using the Hermitian concept and quite another to implement them in a computer code. 0.48 is 1 1 B(1.48 can be applied. 1)]−1 1 B(−1. b. e 3. c. 3.48) This is also referred to as a Pad´ formula. 3. 1)]−1 u ˜ u = ˜ 4 (3. 1)u + bc 2∆x (3. we must satisfy the relation 37 −1 −1 −1 0 0 a 0 1 0 −1 1 1 b −1 0 −1 −2 2 c = 0 −1 1 d 0 0 −1 3 3 −1 0 −1 −4 4 e 0 having the solution [a.
For example 1 δxx u = 12 [B(1. a more subtle point is that they get this increase in accuracy using information that is still local to the point where the derivatives are being evaluated. 3. 4. However.38 CHAPTER 3. in this case a tridiagonal. An inverse matrix operator implies the solution of a coupled set of linear equations. and widely used.54) . 1)]−1 B(1.48 to 3. add the boundary conditions. but it still can be given a simple interpretation.5 Other HigherOrder Schemes Hermitian forms of the second derivative can also be easily derived by means of a Taylor table. although this is one of the principal issues in applying these schemes): ∂u 1 − Bp (1. It simply says – take the vector array u. How much this reduction in accuracy is oﬀset by the increased global continuity built into a spline ﬁt is not known. eﬃcient to run. The ˜ meaning of the second row is more subtle. 0. 1)u + bc (3. Hermitian or Pad´ approximations can e be practical when they can be implemented by the use of eﬃcient banded solvers. In general. FINITEDIFFERENCE APPROXIMATIONS With respect to practical implementation. A ﬁnal word on Hermitian approximations.53) ∆x2 but its order of accuracy is only O(∆x2 ). It is given e by 1 δxx u = 6 [B(1. which is simple to code.4. −8. two Lagrangian schemes with the same order of accuracy are (here we ignore the problem created by the boundary conditions. 3. the meaning of the predictor in this sequence should be clear. 1)u + bc (3. In application. of course. It is obvious. In particular. 8. These operators are very common in ﬁnite diﬀerence applications. −1)u = O ∆x4 ∂x 12∆x (3. The evaluation of [B(1.48 and 3. since it is demanding the evaluation of an inverse operator. and store the result in the intermediate array u. We note that the spline ﬁt of a ﬁrst derivative is identical to any of the expressions in Eqs. 1)]−1 is found by means of a tridiagonal “solver”. that ﬁve point schemes using Lagrangian approximations can be derived that have the same order of accuracy as the methods given in Eqs. this can be advantageous at boundaries and in the vicinity of steep gradients. 1)]−1 B(1.51. 4. −2. It should be mentioned that the spline approximation is one form of a Pad´ matrix diﬀerence operator. −2. 3. 10. but they will have a wider spread of space indices. diﬀerence it. They appear in the form of banded matrices having a small bandwidth.52) ∆x2 is O(∆x4 ) and makes use of only tridiagonal operations.52. Clearly they have an advantage over 3point Lagrangian schemes because of their increased accuracy.
The exact ﬁrst derivative of eiκx is ∂eiκx = iκeiκx ∂x (3. although the analysis is equally applicable to higher derivatives. a secondorder centered diﬀerence operator to uj = eiκxj . 3. it provides a fairly limited description. 16. which are in the form eiκx . FOURIER ERROR ANALYSIS 39 ∂2u 1 − Bp (−1. 16. We will concentrate here on ﬁrst derivative approximations.56) If we apply. where xj = j∆x. Further information about the error behavior of a ﬁnitediﬀerence scheme can be obtained using Fourier error analysis. −1)u = O ∆x4 2 ∂x 12∆x2 (3.3.5.55) 3. The accuracy of an operator is often expressed in terms of the order of the leading error term determined from a Taylor table. where κ is the wavenumber.1 Application to a Spatial Operator An arbitrary periodic function can be decomposed into its Fourier components. we get (δx u)j = = = = = = uj+1 − uj−1 2∆x eiκ∆x(j+1) − eiκ∆x(j−1) 2∆x iκ∆x (e − e−iκ∆x )eiκxj 2∆x 1 [(cos κ∆x + i sin κ∆x) − (cos κ∆x − i sin κ∆x)]eiκxj 2∆x sin κ∆x iκxj i e ∆x iκ∗ eiκxj (3. While this is a useful measure. for example.5 Fourier Error Analysis In order to select a ﬁnitediﬀerence scheme for a given application one must be able to assess the accuracy of the candidate schemes.5.57) . It is therefore of interest to examine how well a given ﬁnitediﬀerence operator approximates derivatives of eiκx . −30.
0 ≤ κ∆x ≤ π..5 1 1. appears in the exact expression. FINITEDIFFERENCE APPROXIMATIONS 3 2. Thus the degree to which the modiﬁed wavenumber approximates the actual wavenumber is a measure of the accuracy of the approximation. In general.58) Note that κ∗ approximates κ to secondorder accuracy. ﬁnitediﬀerence operators can be written in the form a s (δx )j = (δx )j + (δx )j .5 κ*∆ x 2 th 1. The modiﬁed wavenumber is so named because it appears where the wavenumber.5 2 Central 0 0 0. For the secondorder centered diﬀerence operator the modiﬁed wavenumber is given by κ∗ = sin κ∆x ∆x (3.58 is plotted in Figure 3..40 CHAPTER 3. κ. as is to be expected. along with similar relations for the standard fourthorder centered diﬀerence scheme and the fourthorder Pad´ scheme. where κ∗ is the modiﬁed wavenumber.5 3 κ∆ x Figure 3. since sin κ∆x κ3 ∆x2 =κ− +.5 th 4 Pade 4 Central nd 1 0.4.4: Modiﬁed wavenumber for various schemes. ∆x 6 Equation 3.5 2 2. The e expression for the modiﬁed wavenumber provides the accuracy with which a given wavenumber component of the solution is resolved for the entire wavenumber range available in a mesh of a given size.
6 Note that terms such as (δx u)j−1 are handled by letting (δx u)j = ik ∗ eiκj∆x and evaluating the shift in j. The fourthorder Pad´ scheme is given by e (δx u)j−1 + 4(δx u)j + (δx u)j+1 = 3 (uj+1 − uj−1) ∆x The modiﬁed wavenumber for this scheme satisﬁes6 iκ∗ e−iκ∆x + 4iκ∗ + iκ∗ eiκ∆x = which gives iκ∗ = 3i sin κ∆x (2 + cos κ∆x)∆x 3 iκ∆x (e − e−iκ∆x ) ∆x The modiﬁed wavenumber provides a useful tool for assessing diﬀerence approximations. the modiﬁed wavenumber is complex.3. the modiﬁed wavenumber is purely real. When the operator includes a symmetric component. In the context of the linear convection equation.5. with the imaginary component being entirely error. Consider once again the linear convection equation in the form ∂u ∂u +a =0 ∂t ∂x In terms of a circulant matrix operator A.59) When the ﬁnitediﬀerence operator is antisymmetric (centered). the antisymmetric part is obtained from (A − AT )/2 and the symmetric part from (A + AT )/2. the errors can be given a physical interpretation. FOURIER ERROR ANALYSIS 41 a s where (δx )j is an antisymmetric operator and (δx )j is a symmetric operator. then a (δx u)j = 1 [a1 (uj+1 − uj−1) + a2 (uj+2 − uj−2 ) + a3 (uj+3 − uj−3)] ∆x and s (δx u)j = 1 [d0 uj + d1 (uj+1 + uj−1) + d2 (uj+2 + uj−2) + d3 (uj+3 + uj−3 )] ∆x The corresponding modiﬁed wavenumber is iκ∗ = 1 [d0 + 2(d1 cos κ∆x + d2 cos 2κ∆x + d3 cos 3κ∆x) ∆x + 2i(a1 sin κ∆x + a2 sin 2κ∆x + a3 sin 3κ∆x) (3.5 If we restrict our interest to schemes extending from j − 3 to j + 3. 5 .
3. the following ODE is obtained for f (t): df sin κ∆x = −ia f = −iaκ∗ f dt ∆x (3. Since a∗ /a ≤ 1 for this example. t) = f (0)eiκ(x−at) If secondorder centered diﬀerences are applied to the spatial term.5 shows the numerical phase speed for the schemes considered previously. the secondorder centered diﬀerence scheme requires 80 P P W to produce an error in phase speed of less than 0. which is related to the modiﬁed wavenumber by a∗ κ∗ = a κ For the above example.1 percent. t) = f (0)eiκ(x−a ∗ t) (3. FINITEDIFFERENCE APPROXIMATIONS on a domain extending from −∞ to ∞. Since a∗ is a function of the wavenumber. The resolving eﬃciency of a scheme can be expressed in terms of the P P W required to produce errors below a speciﬁed level. The 5point fourthorder centered scheme and . the numerical approximation introduces dispersion. 3. For example. Recall from Section 2. we obtain unumerical (x.60 gives the exact solution as u(x. a∗ sin κ∆x = a κ∆x The numerical phase speed is the speed at which a harmonic function is propagated numerically.2 that a solution initiated by a harmonic function with wavenumber κ is u(x. a waveform consisting of many diﬀerent wavenumber components eventually loses its original form.60) Solving this ODE exactly (since we are considering the error from the spatial approximation only) and substituting into Eq.61) (3. although the original PDE is nondispersive. t) = f (t)eiκx where f (t) satisﬁes the ODE df = −iaκf dt Solving for f (t) and substituting into Eq. Figure 3.42 CHAPTER 3. As a result.62) where a∗ is the numerical (or modiﬁed) phase speed. The number of points per wavelength (P P W ) by which a given wave is resolved is given by 2π/κ∆x. the numerical solution propagates too slowly.60. 3.
2. Thus the antisymmetric portion of the spatial diﬀerence operator determines the error in speed and the symmetric portion the error in amplitude. while the imaginary part leads to an error in the amplitude of the solution. no special boundary operators are required. DIFFERENCE OPERATORS AT BOUNDARIES 43 1. This will be further discussed in Section 11.5 3 κ ∆x Figure 3.2. 3.3.3.6 Diﬀerence Operators at Boundaries As discussed in Section 3.61.5 2 2. the fourthorder Pad´ scheme require 15 and 10 P P W respectively to achieve the e same error level.2 0 0 0.5: Numerical phase speed for various schemes. 3. . but in the general case it can include an imaginary component as well.6 4 Central 4 Pade th 0. the error in the phase speed is determined from the real part of the modiﬁed wavenumber.59.6. 3. a matrix diﬀerence operator incorporates both the diﬀerence approximation in the interior of the domain and that at the boundaries.5 1 1. For our example using secondorder centered diﬀerences. the modiﬁed wavenumber is purely real. we consider the boundary operators needed for our model equations for convection and diﬀusion. In that case.4 0. as shown in Eq. as can be seen by inspecting Eq. In this section. In the case of periodic boundary conditions.8 * a _ a 2 Central th nd 0.2 1 0.
In the latter case. .65) However. The resulting diﬀerence operator has the form of Eq. we can use the following thirdorder operator at j = 1: 1 (−2u0 − 3u1 + 6u2 − u3 ) (3. It is clear that as long as the interior diﬀerence approximation does not extend beyond uj−1. with fourthorder centered diﬀerences. known as a numerical boundary scheme.. if we use the fourthorder interior operator given in Eq. with secondorder centered diﬀerences we obtain δx u = Au + bc with 0 1 −1 0 1 1 A= −1 0 2∆x 1 . the boundary conditions for the linear convection equation can be either periodic or of inﬂowoutﬂow type. then the approximation at j = 1 requires a value of uj−2.67) Proof of this theorem is beyond the scope of this book. the interested reader should consult the literature for further details.54.7 For example. .3. 1 bc = 12∆x −4u0 u0 0 . Such an operator. . while having the appropriate order of accuracy. and the righthand boundary is the outﬂow boundary. can have an order of accuracy which is one order lower than that of the interior scheme. 1 bc = 2∆x (3. while no condition is speciﬁed at the outﬂow boundary. a diﬀerent operator is required at j = 1 which extends only to j − 1.64) −u0 0 0 .64 with 1 A= 12∆x 7 −6 12 −2 −8 0 8 −1 1 −8 0 8 −1 . Consider ﬁrst the inﬂow boundary. thus the lefthand boundary is the inﬂow boundary. . .44 CHAPTER 3. Hence. . (3.. . For example.. 3.1 The Linear Convection Equation Referring to Section 2. which is outside the domain. . u2 . 3. then no special treatment is required for this boundary.66) (δx u)1 = 6∆x which is easily derived using a Taylor table. FINITEDIFFERENCE APPROXIMATIONS 3. . . Thus the vector of unknowns is u = [u1 .63) and u0 is speciﬁed.. . uM ]T (3. and the global accuracy will equal that of the interior scheme.6. (3. a Dirichlet boundary condition is given at the inﬂow boundary. Here we assume that the wave speed a is positive.
. Another approach to the development of boundary schemes is in terms of space extrapolation.71) Substituting this into the secondorder centereddiﬀerence operator applied at node M gives (δx u)M = 1 1 (uM +1 − uM −1 ) = (2uM − uM −1 − uM −1) 2∆x 2∆x 1 = (uM − uM −1) ∆x (3. no boundary condition is speciﬁed.73) which is identical to Eq. A backwarddiﬀerence formula must be used.. cannot be used at j = M . 0 1 .6. DIFFERENCE OPERATORS AT BOUNDARIES 45 This approximation is globally fourthorder accurate. .. with p = 2 we obtain (1 − 2E −1 + E −2 )uM +1 = uM +1 − 2uM + uM −1 = 0 which gives the following ﬁrstorder approximation to uM +1 : uM +1 = 2uM − uM −1 (3.68) This produces a diﬀerence operator with 0 −1 1 0 1 −1 0 −u0 0 0 . With a secondorder interior operator.69) −2 2 In the case of a fourthorder centered interior operator the last two rows of A require modiﬁcation. We must approximate ∂u/∂x at node M with no information about uM +1 . −1 (3. The following formula allows uM +1 to be extrapolated from the interior data to arbitrary order on an equispaced grid: (1 − E −1 )p uM +1 = 0 (3. . which requires uj+1. Thus the secondorder centereddiﬀerence operator.70) where E is the shift operator deﬁned by Euj = uj+1 and the order of the approximation is p − 1.72) (3. At the outﬂow boundary.68. 0 1 1 bc = 2∆x (3. 3. the following ﬁrstorder backward formula can be used: (δx u)M = 1 A= 2∆x 1 (uM − uM −1 ) ∆x . For example.3.
75) b Thus we design an operator at node M which is in the following form: (δxx u)M = 1 c (auM −1 + buM ) + 2 ∆x ∆x ∂u ∂x (3.4. b. and c. that is (δxx u)j = ∂u ∂x = M +1 ∂u ∂x (3.77) M +1 . and c are constants which can easily be determined using a Taylor table. we must consider Dirichlet and Neumann boundary conditions.74) ∆x2 no modiﬁcations are required near boundaries.76) M +1 where a. as shown in Table 3. we obtain the following operator: (δxx u)M = 1 2 (2uM −1 − 2uM ) + 2 3∆x 3∆x ∂u ∂x (3. The treatment of a Dirichlet boundary condition proceeds along the same lines as the inﬂow boundary for the convection equation discussed above. leading to the diﬀerence operator given in Eq.27.46 CHAPTER 3. 3. With the secondorder centered interior operator 1 (uj+1 − 2uj + uj−1 ) (3. ∂2u ∂x2 uj ∆x2 · ∂2u ∂x2 j j 1 c − 2 (auj−1 + buj ) + ∆x ∆x ∆x· ∂u ∂x j ∂u ∂x =? j+1 ∆x2 · 1 ∂2u ∂x2 j ∆x3 · ∂3u ∂x3 j ∆x4 · ∂4u ∂x4 j −a · uj−1 −b · uj −∆x · c · ∂u ∂x −a −b j+1 −a · (1) · 1 1 −c −a · (1)2 · 1 2 −c · (1) · 1 1 −a · (1)3 · 1 6 −c · (1)2 · 1 2 1 −a · (1)4 · 24 −c · (1)3 · 1 6 Table 3. we assume that ∂u/∂x is speciﬁed at j = M + 1. FINITEDIFFERENCE APPROXIMATIONS 3. Taylor table for Neumann boundary condition. Solving for a.4. For a Neumann boundary condition. b.6.2 The Diﬀusion Equation In solving the diﬀusion equation.
In the case of a numerical approximation to a Neumann boundary condition. Consider a secondorder backwarddiﬀerence approximation applied at node M + 1: ∂u ∂x = M +1 1 (uM −1 − 4uM + 3uM +1 ) + O(∆x2 ) 2∆x (3.78) Solving for uM +1 gives uM +1 = 1 ∂u 4uM − uM −1 + 2∆x 3 ∂x + O(∆x3 ) M +1 (3.7 Problems 1. ∂u ∂x (3.77 using the space extrapolation idea. 3. PROBLEMS 47 which produces the diﬀerence operator given in Eq. This contrasts with the numerical boundary schemes described previously which can be one order lower than the interior scheme.80) M +1 (3. 2. Derive a thirdorder ﬁnitediﬀerence approximation to a ﬁrst derivative in the form 1 (δx u)j = (auj−2 + buj−1 + cuj + duj+1) ∆x Find the leading error term.81) M +1 3.7. Derive a ﬁnitediﬀerence approximation to a ﬁrst derivative in the form a(δx u)j−1 + (δx u)j = Find the leading error term. We can also obtain the operator in Eq.29. this is necessary to obtain a globally secondorder accurate formulation.79) Substituting this into the secondorder centered diﬀerence operator for a second derivative applied at node M gives (δxx u)M = 1 (uM +1 − 2uM + uM −1 ) ∆x2 1 ∂u = 3uM −1 − 6uM + 4uM − uM −1 + 2∆x 2 3∆x ∂x 1 2 = (2uM −1 − 2uM ) + 3∆x2 3∆x which is identical to Eq. Notice that this operator is secondorder accurate. 3. 1 (buj−1 + cuj + duj+1) ∆x .77.3. 3.
Compare the real part with that obtained from the fourthorder centered operator (Eq. Find the modiﬁed wavenumber for the operator derived in question 1. 1 (auj−1 + buj + cuj+1 ) ∆x2 . Using a 4 (interior) point mesh. the noncompact fourthorder operator (Eq. Repeat question 2 with d = 0. Plot (κ∗ ∆x)2 vs. 9. FINITEDIFFERENCE APPROXIMATIONS 3. and the compact fourthorder operator derived in question 6. Find the modiﬁed wavenumber for the secondorder centered diﬀerence operator for a second derivative. κ∆x for 0 ≤ κ∆x ≤ π.1 percent for sixthorder noncompact and compact centered approximations to a ﬁrst derivative. Derive a compact (or Pad´) ﬁnitediﬀerence approximation to a second derivae tive in the form d(δxx u)j−1 + (δxx u)j + e(δxx u)j+1 = Find the leading error term. 7. Application of the secondderivative operator to the function eiκx gives ∂ 2 eiκx = −κ2 eiκx 2 ∂x Application of a diﬀerence operator for the second derivative gives (δxx eiκj∆x )j = −κ∗2 eiκx thus deﬁning the modiﬁed wavenumber κ∗ for a second derivative approximation. 3. 3. 5. (κ∆x)2 for 0 ≤ κ∆x ≤ π. 8. write out the 4×4 matrices and the boundarycondition vector formed by using the scheme derived in question 2 when both u and ∂u/∂x are given at j = 0 and u is given at j = 5.48 CHAPTER 3.54).55). 4. 6. Derive a ﬁnitediﬀerence approximation to a third derivative in the form (δxxx u)j = 1 (auj−2 + buj−1 + cuj + duj+1 + euj+2 ) ∆x3 Find the leading error term. Find the gridpointsperwavelength (P P W ) requirement to achieve a phase speed error less than 0. Plot the real and imaginary parts of κ∗ ∆x vs.
which are ﬁrst. second. respectively: (δx u)j = (uj − uj−1) /∆x (δx u)j = (3uj − 4uj−1 + uj−2 ) /(2∆x) (δx u)j = (11uj − 18uj−1 + 9uj−2 − 2uj−3) /(6∆x) Find the modiﬁed wavenumber for each of these schemes. .7. Plot the real and imaginary parts of κ∗ ∆x vs.3. and thirdorder. Derive the two leading terms in the truncation error for each scheme. Consider the following onesided diﬀerencing schemes. κ∆x for 0 ≤ κ∆x ≤ π. PROBLEMS 49 10.
50 CHAPTER 3. FINITEDIFFERENCE APPROXIMATIONS .
In the most general notation. we proceed with an analysis making considerable use of this concept.1) which. we ﬁnd (n+1) (n−1) (n) (n) (n) uj − uj uj+1 − 2uj + uj−1 = ν 2h ∆x2 or 2hν (n) (n+1) (n−1) (n) (n) uj = uj + (4. As an example let us consider the model equation for diﬀusion ∂u ∂2 u =ν 2 ∂t ∂x Using threepoint centraldiﬀerencing schemes for both the time and space derivatives.2) 2 uj+1 − 2uj + uj−1 ∆x 51 . but the rest of this chapter is devoted to a more specialized matrix notation described below.1) dt which includes all manner of nonlinear and timedependent possibilities. in turn. Another strategy for constructing a ﬁnitediﬀerence approximation to a PDE is to approximate all the partial derivatives at once. In the following chapters. without approximating the time derivative. t) (4. This generally leads to a point diﬀerence operator (see Section 3. On occasion. Diﬀerencing the space derivatives converts the basic PDE into a set of coupled ODE’s.Chapter 4 THE SEMIDISCRETE APPROACH One strategy for obtaining ﬁnitediﬀerence approximations to a PDE is to start by diﬀerencing the space derivatives only. we use this form. these ODE’s would be expressed in the form du = F (u.3. can be used for the time advance of the solution at any given point in the mesh. which we refer to as the semidiscrete approach.
. that the spatial and temporal discretizations are separable. Thus. THE SEMIDISCRETE APPROACH Clearly Eq. There are a number of issues concerning the accuracy. and convergence of Eqs.1.2 (n+1) (n−1) with the time average of u at that point. we can approximate the space derivatives with diﬀerence operators and express the resulting ODE’s with a matrix formulation. 4.3 is referred to as the DuFortFrankel method.52 CHAPTER 4. This results in the formula (n+1) uj = (n−1) uj u 2hν (n) j + 2 uj+1 − 2 ∆x (n+1) + uj 2 (n−1) + (n) uj−1 (4. we shall separate the space diﬀerence approximations from the time diﬀerencing. 4. It is a full discretization of the PDE. namely (uj + uj )/2.2 is sometimes called Richardson’s method of overlapping steps and Eq. stability. we reduce the governing PDE’s to ODE’s by discretizing the spatial terms and use the welldeveloped theory of ODE solutions to aid us in the development of an analysis of accuracy and stability. 4. 4. In these simple cases.3 which we cannot comment on until we develop a framework for such investigations. and no semidiscrete form exists. 4. We introduce these methods here only to distinguish between methods in which the temporal and spatial terms are discretized separately and those for which no such separation is possible. there are subtle points to be made about using these methods to ﬁnd a numerical solution to the diﬀusion equation. As we shall see later on. (n) Another possibility is to replace the value of uj in the right hand side of Eq. In this approach. This is a simple and natural formulation when the ODE’s are linear.1 Reduction of PDE’s to ODE’s The Model ODE’s First let us consider the model PDE’s for diﬀusion and biconvection described in Chapter 2. Equation 4.1 4. this method has an intermediate semidiscrete form and can be analyzed by the methods discussed in the next few chapters.2 and 4.2 is a diﬀerence equation which can be used at the space point j to advance the value of u from the previous time levels n and n − 1 to the level n + 1. In this case. the spatial and temporal discretizations are not separable. however.3) which can be solved for u(n+1) and time advanced at the point j. For the time being. Note.
It is used for the scalar convection model when the boundary conditions are periodic. even the Euler and NavierStokes equations can be expressed in the form of Eq. 4.3. 4. 4.6) Note that the elements in the matrix A depend upon both the PDE and the type of diﬀerencing scheme chosen for the space terms. the linear analysis presented in this book leads to diagnostics that are surprisingly accurate when used to evaluate many aspects of numerical methods as they apply to the Euler and NavierStokes equations. that is. using the 3point centraldiﬀerencing scheme to represent the second derivative in the scalar PDE governing diﬀusion leads to the following ODE diﬀusion model du ν B(1. −2.4) where the boundary condition vector is absent because the ﬂow is periodic. . REDUCTION OF PDE’S TO ODE’S Model ODE for Diﬀusion 53 For example. In general.6. the elements of A depend on the solution u and are usually derived by ﬁnding the Jacobian of a ﬂux vector. Although the equations are nonlinear.1. The vector f (t) is usually determined by the boundary conditions and possibly source terms. In this case. the 3point centraldiﬀerencing approximation produces the ODE model given by du a =− Bp (−1.5) (4. Model ODE for Biconvection The term biconvection was introduced in Section 2. 1)u + (bc) = dt ∆x2 with Dirichlet boundary conditions folded into the (bc) vector. 0.4.5 are the model ODE’s for diﬀusion and biconvection of a scalar in one dimension. Eqs.2 The Generic Matrix Form The generic matrix form of a semidiscrete approximation is expressed by the equation du = Au − f (t) dt (4. In such cases the equations are nonlinear. They are linear with coeﬃcient matrices which are independent of x and t.4 and 4.1. 1)u dt 2∆x (4.
THE SEMIDISCRETE APPROACH 4.6 can be written in terms of the eigenvalues and eigenvectors of A. . the system of ODE’s must be integrated using a timemarching method. 4. As we shall soon see. 4. xm . (4. 4. have the property that Axm = λm xm [A − λm I]xm = 0 The eigenvalues are the roots of the equation det[A − λI] = 0 We form the righthand eigenvector matrix of a complete system by ﬁlling its columns with the eigenvectors xm : X = x1 . whereas. if ∂F/∂u = A where A is independent of u). An eigenvector. The ODE’s represented by Eq. 4. we will make use of exact solutions of coupled systems of ODE’s.7) . xM The inverse is the lefthand eigenvector matrix. which exist under certain conditions.. λm . if A does not depend explicitly on t. A. .2. This holds regardless of whether or not the forcing function. and its corresponding eigenvalue. . 4. If A does depend explicitly on t. As we have already pointed out. In order to analyze timemarching methods. the general solution to Eq. These ideas are developed in the following sections. and together they have the property that X −1 AX = Λ where Λ is a diagonal matrix whose elements are the eigenvalues of A. x2 . depends explicitly on t.1 Eigensystems of SemiDiscrete Linear Forms Complete Systems An M × M matrix is represented by a complete eigensystem if it has a complete set of linearly independent eigenvectors (see Appendix A) .1 in time.8) or (4. is independent of u.54 CHAPTER 4. when the ODE’s are linear they can be expressed in a matrix notation as Eq.6 in which the coeﬃcient matrix. f . 4.e. This will lead us to a representative scalar equation for use in analyzing timemarching methods. the exact solution of Eq.6 can be written. the general solution cannot be written.1 are said to be linear if F is linearly dependent on u (i.2 Exact Solutions of Linear ODE’s In order to advance Eq.
.4. and an eigenvalue with multiplicity n within a Jordan block is said to be a defective eigenvalue.9) where λ. . It can. Jm . 4. and has a general solution because λ does not depend on t. In general. It has a steadystate solution if the righthand side is independent of t.. 4.. . ..and SecondOrder FirstOrder Equations The simplest nonhomogeneous ODE of interest is given by the single.e. n The matrix J is said to be in Jordan canonical form. i. λm 1 . all of which can be complex numbers..e. if µ = 0.2 Single ODE’s of First. there exists some S which transforms any matrix A such that S −1 AS = J where J = J1 J2 . 1 and (n) Jm = λm 1 λm . .9 displays many of the fundamental properties and issues involved in the construction and study of most popular timemarching methods. Although it is quite simple. . some of which may be scalars (see Appendix A). and it is said to be defective. and µ are scalars. be transformed to a diagonal set of blocks. . if a = 0. the numerical analysis of Eq. EXACT SOLUTIONS OF LINEAR ODE’S Defective Systems 55 If an M ×M matrix does not have a complete set of linearly independent eigenvectors.2. however.. a. ﬁrstorder equation du = λu + aeµt dt (4. i. . . This theme will be developed as we proceed. . Defective systems play a role in numerical stability analysis. it cannot be transformed to a diagonal matrix of scalars. and is homogeneous if the forcing function is zero. The equation is linear because λ does not depend on u.2...
u(t) = c1 eλt + aeµt µ−λ where c1 is a constant determined by the initial conditions. For ODE’s. 4. Now we introduce the diﬀerential operator D such that d D≡ dt and factor u(t) out of Eq. Characteristic polynomials are fundamental to the analysis of both ODE’s and O∆E’s. · · ·. this solution is required for the analysis of defective systems. In terms of the initial value of u.11.9 when µ = λ? This is easily found by setting µ = λ + . we ﬁnd that the solution to du = λu + aeλt dt is given by u(t) = [u(0) + at]eλt As we shall soon see. giving (D2 + a1 D + a0 ) u(t) = 0 The polynomial in D is referred to as a characteristic polynomial and designated P (D). . SecondOrder Equations The homogeneous form of a secondorder equation is given by d2 u du + a1 + a0 u = 0 2 dt dt (4.10) where a1 and a0 are complex constants. and then taking the limit as → 0. THE SEMIDISCRETE APPROACH The exact solution of Eq. λ2 . 4.56 CHAPTER 4. solving. 4. Using this limiting device. it can be written eµt − eλt u(t) = u(0)e + a µ−λ λt The interesting question can arise: What happens to the solution of Eq.11) (4. since the roots of these polynomials determine the solutions of the equations. we often label these roots in order of increasing modulus as λ1 . λm .9 is. for µ = λ.
these are indeed solutions to Eq. c2 .2. homogeneous equations is given by u1 = a11 u1 + a12 u2 u2 = a21 u1 + a22 u2 which can be written u = Au . u2 = u .4. 4.11 can be written u1 = −a1 u1 − a0 u2 u2 = u1 which is a subset of Eq. 4.14. a11 a12 a21 a22 x12 x22 = λ2 x12 x22 (4. The proof of this is simple and is found by substituting Eq. In our simple example. 4. Thus. λM . and t if and only if the λ’s satisfy Eq. 4.13) where c1 and c2 are constants determined from initial conditions.14 if and only if a11 a12 a21 a22 x11 x21 = λ1 x11 x21 . One ﬁnds the result c1 eλ1 t (λ2 + a1 λ1 + a0 ) + c2 eλ2 t (λ2 + a1 λ2 + a0 ) 1 2 which is identically zero for all c1 . u = [u1 . They are found by solving the equation P (λ) = 0.12) 4. 4.11.14) Consider the possibility that a solution is represented by u1 = c1 x11 eλ1 t + c2 x12 eλ2 t u2 = c1 x21 eλ1 t + c2 x22 eλ2 t By substitution.12. . (4. EXACT SOLUTIONS OF LINEAR ODE’S 57 · · ·.11 is given by u(t) = c1 eλ1 t + c2 eλ2 t (4. u2 ]T . ﬁrstorder.2.3 Coupled FirstOrder ODE’s A Complete System A set of coupled. A = (aij ) = a11 a12 a21 a22 (4. by setting u1 = u we ﬁnd Eq.13 into Eq.15) Notice that a higherorder equation can be reduced to a coupled set of ﬁrstorder equations by introducing a new set of dependent variables. 4. there would be two roots. 4.17) . (4. determined from P (λ) = λ2 + a1 λ + a0 = 0 and the solution to Eq. λ1 and λ2 .16) (4.
16 with A = Λ. 4. 4.58 A Derogatory System CHAPTER 4. THE SEMIDISCRETE APPROACH Eq. substituting this result into the second equation. b = u2 (0) . one ﬁnds du2 = λu2 + u1 (0)eλt dt which is identical in form to Eq. one can solve the top equation ﬁrst. A Defective System If A is defective. giving u1 (t) = u1 (0)eλt . in this case. then it can be represented by the Jordan canonical form u1 u2 = λ 0 1 λ u1 u2 (4.10 and has the solution u2 (t) = [u2 (0) + u1 (0)t]eλt From this the reader should be able to verify that u3 (t) = a + bt + ct2 eλt is a solution to u1 λ u1 u2 = 1 λ u2 u3 1 λ u3 if a = u3 (0) . . This is the case where A has a complete set of eigenvectors and is not defective.18) whose solution is not obvious. In this case λ 0 0 λ 1 1 =λ 0 0 and λ 0 0 λ 0 0 =λ 1 1 provide such a solution. 1 c = u1 (0) 2 (4. 4.14 if λ1 = λ2 = λ. Then. 4.19) The general solution to such defective systems is left as an exercise.15 is still a solution to Eq. provided two linearly independent vectors exist to satisfy Eq. However.
4. There results X −1 du = X −1 AX · X −1 u − X −1 f(t) dt (4.20 to a new algebraic form dw = Λw − g(t) dt (4. EXACT SOLUTIONS OF LINEAR ODE’S 59 4. Now let us multiply Eq. see Section 4. by introducing the new variables w and g such that w = X −1 u .1. Represent them by the equation du = Au − f (t) dt (4. not because they cannot be analyzed (the example at the conclusion of the previous section proves otherwise). nonhomogeneous. 4.2. but because they are only of limited interest in the general development of our theory. 1 . In the following.21 can be modiﬁed to d −1 X u = ΛX −1 u − X −1 f(t) dt Finally.4 General Solution of Coupled ODE’s with Complete Eigensystems Let us consider a set of coupled.2. 4. Notice that Eqs. X−1 and X. The only diﬀerence between them was brought about by algebraic manipulations which regrouped the variables.23) It is important at this point to review the results of the previous paragraph.20 and 4.4. linear.2. However. and Eq. this regrouping is crucial for the solution process because Eqs. we exclude defective systems.20) Our assumption is that the M × M matrix A has a complete eigensystem1 and can be transformed by the left and right eigenvector matrices. the elements in X−1 and X are also independent of both u and t. to a diagonal matrix Λ having diagonal elements which are the eigenvalues of A.23 are expressing exactly the same equality. 4.22) we reduce Eq.21) Since A is independent of both u and t. ﬁrstorder ODE’s with constant coeﬃcients which might have been derived by space diﬀerencing a set of PDE’s.20 from the left by X −1 and insert the identity combination XX −1 = I between A and u. g(t) = X −1 f (t) (4.
24 are also time invariant and the solution to any line in Eq. single. 4.23 are no longer coupled. They can be written line by line as a set of independent. using the inverse of the relations given in Eqs. thus w1 = λ1 w1 − g1 (t) . In such a case.23 and 4. 4.26) = m=1 Transient Steadystate . ﬁrstorder equations. . wM = λM wM − gM (t) (4. the gm in Eqs. i. . Transforming back to the usystem gives u(t) = X w(t) M = m=1 M wm (t)xm cm eλm t xm + m=1 M = = m=1 M 1 gm xm m=1 λm M cm eλm t xm + XΛ−1 X −1 f cm eλm t xm + A−1 f (4.20 when neither A nor f has any explicit dependence on t.24 is wm (t) = cm eλm t + 1 gm λm where the cm are constants that depend on the initial conditions. the eigenvector corresponding to λm . 4.. .24) For any given set of gm (t) each of these equations can be solved separately and then recoupled.60 CHAPTER 4.e. THE SEMIDISCRETE APPROACH 4. We next focus on the very important subset of Eq. 4.25) where xm is the m’th column of X. .22: u(t) = X w(t) M = m=1 wm (t)xm (4. wm = λm wm − gm (t) .
3. on the cleverness of the relator.and postmultiplication of A by the appropriate similarity matrices transforms A into a diagonal matrix. We begin by developing the concept of “spaces”. we get diﬀerent perspectives of the same solution. In our application to ﬂuid dynamics.1.3 4. This permits us to say a great deal about such problems and serves as the basis for this section. of Jordan blocks or. In this way. the elements of A are independent of both u and t. In particular. however.1 Real Space and Eigenspace Deﬁnition Following the semidiscrete approach discussed in Section 4. we reduce the partial diﬀerential equations to a set of ordinary diﬀerential equations represented by the generic form du = Au − f dt (4.27) 4. respectively.3. For the model problems on which we are focusing most of our attention. it is said to be in real space.2 that pre. it is more instructive to refer to these groups as the transient and steadystate solutions. We say If a solution is expressed in terms of u. composed. The ﬁrst group of terms on the right side of this equation is referred to classically as the complementary solution or the solution of the homogeneous equations. An alternative. and this can add signiﬁcantly to our understanding. The second group is referred to classically as the particular solution or the particular integral.2. we identify diﬀerent mathematical reference frames (spaces) and view our solutions from within each. another very useful frame. There is. to a much greater extent. we can develop some very important and fundamental concepts that underly the global properties of the numerical solutions to the model problems. REAL SPACE AND EIGENSPACE 61 Note that the steadystate solution is A−1 f . That is. How these relate to the numerical solutions of more practical problems depends upon the problem and. The most natural reference frame is the physical one. in the . but entirely equivalent. We saw in Sections 4.28) The dependent variable u represents some physical quantity or quantities which relate to the problem of interest.4. as might be expected.1 and 4. in the most general case. form of the solution is u(t) = c1 eλ1 t x1 + · · · + cm eλm t xm + · · · + cM eλM t xM + A−1 f (4.
for simplicity. We say If a solution is expressed in terms of w. The relations that transfer from one space to the other are: w = X −1 u g = X −1 f u = Xw f = Xg The elements of u relate directly to the local physics of the problem.1.2 Eigenvalue Spectrums for Model ODE’s It is instructive to consider the eigenvalue spectrums of the ODE’s formulated by central diﬀerencing the model equations for diﬀusion and biconvection.2 and. it is said to be in eigenspace (often referred to as wave space). using only complete systems for our examples. These model ODE’s are presented in Section 4. 4. When the forcing function f is independent of t. At this point we make two observations: 1. the transient portion of the solution in real space consists of a linear combination of contributions from each eigenvector. · · · . respectively. Equations for the eigenvalues of the simple . However. THE SEMIDISCRETE APPROACH simplest nondefective case. and 2. 4. m = 1. and individually they have no direct local physical interpretation. 2. the solutions in the two spaces are represented by u(t) = cm eλm t xm + XΛ−1 X −1 f and wm (t) = cm eλm t + 1 gm λm .62 CHAPTER 4.28 had the alternative form dw = Λw − g dt which is an uncoupled set of ﬁrstorder ODE’s that can be solved independently for the dependent variable vector w. Following Section 4. we found that Eq.3. M for real space and eigenspace. the transient portion of the solution in eigenspace provides the weighting of each eigenvector component of the transient solution in real space. of scalars. the elements of w are linear combinations of all of the elements of u.
REAL SPACE AND EIGENSPACE 63 tridiagonals. · · · .5. j = 1. 1. M − 1 sin κm ∆x m = 0. c) and Bp (M : a. we present results for a simple 4point mesh and then we give the general case. m = 1. the model ODE’s for diﬀusion. 1. The righthand eigenvector matrix X is given by sin (x1 ) sin (x ) 2 sin (x3 ) sin (x4 ) sin (2x1 ) sin (2x2 ) sin (2x3 ) sin (2x4 ) sin (3x1 ) sin (3x2 ) sin (3x3 ) sin (3x4 ) sin (4x1 ) sin (4x2 ) sin (4x3 ) sin (4x4 ) The columns of the matrix are proportional to the eigenvectors. 1. M (4. · · · .3. Notice that the diﬀusion eigenvalues are real and negative while those representing periodic convection are all pure imaginary. The interpretation of this result plays a very important role later in our stability analysis.4. M (4. b.4.3. 2. The Diﬀusion Model Consider Eq.29) 2(M + 1) ∆x2 and. b.30) = −iκ∗ a m where κ∗ = m m = 0. · · · . and ∆x = 2π/M .31) ∆x is the modiﬁed wavenumber from Section 3. κm = m. B(M : a. M − 1 (4. 4. First. to help visualize the matrix structure. M − 1 (4.32) . Recall that xj = j∆x = jπ/(M + 1).4. These follow as special cases from the results given in Appendix B. are given in Appendix B. 4. so in general the relation u = X w can be written as M uj = m=1 wm sin mxj . · · · . From Section B. from Section B.1. 2. c).3 Eigenvectors of the Model Equations Next we consider the eigenvectors of the two model equations. for the model biconvection equation λm = 2mπ −ia sin ∆x M m = 0. we ﬁnd for the model diﬀusion equation with Dirichlet boundary conditions mπ ν λm = 2 −2 + 2 cos M +1 ∆x mπ −4ν = sin2 . · · · .
The Biconvection Model Next consider the model ODE’s for periodic convection. · · · . For the model diﬀusion equation: w = X −1 u u = Xw is a sine transform from real space to (sine) wave space.33 represents a sine transform of the function u(x) for an M point sample between the boundaries x = 0 and x = π with the condition u(0) = u(π) = 0. 2. · · · .5. In general w = X−1 u gives M wm = j=1 uj sin mxj . THE SEMIDISCRETE APPROACH For the inverse.33. · · · .34) . 4. j = 0. 4. For our model ODE. the righthand eigenvectors are given by xm = ei j (2πm/M ) . Similarly. 1. j = 0. 4. The coeﬃcient matrices for these ODE’s are always circulant. M − 1 m = 0. m = 1. In summary. 4. M − 1 With xj = j · ∆x = j · 2π/M .33) In the ﬁeld of harmonic analysis.32 represents the sine synthesis that companions the sine transform given by Eq. Eq. or lefthand eigenvector matrix X−1 . Eq.64 CHAPTER 4. M − 1 (4. we can write u = X w as M −1 uj = m=0 wm eimxj . is a sine synthesis from wave space back to real space. · · · . 1. 1. M (4. we ﬁnd sin (x1 ) sin (x2 ) sin (x3 ) sin (x4 ) sin (2x ) sin (2x ) sin (2x ) sin (2x ) 1 2 3 4 sin (3x1 ) sin (3x2 ) sin (3x3 ) sin (3x4 ) sin (4x1 ) sin (4x2 ) sin (4x3 ) sin (4x4 ) The rows of the matrix are proportional to the eigenvectors. Eq.
Eq.27 becomes M uj (t) = m=1 cm eλm t sin mxj + (A−1 f )j . M − 1 This equation is identical to a discrete Fourier transform of the periodic dependent variable u using an M point sample between and including x = 0 and x = 2π − ∆x.4 Solutions of the Model ODE’s We can now combine the results of the previous sections to write the solutions of our model ODE’s.35) where λm = −4ν mπ 2 2 sin 2(M + 1) ∆x (4. The Diﬀusion Equation For the diﬀusion equation. we ﬁnd the following lefthand eigenvector matrix from Appendix B.3. M (4. 2.4. For circulant matrices. it is straightforward to establish the fact that the relation u = X w represents the Fourier synthesis of the variable w back to u. · · · .4: w1 1 1 w 1 e−2iπ/4 1 2 = w3 4 1 e−4iπ/4 w4 1 e−6iπ/4 In general wm = 1 M M −1 j=0 1 e−4iπ/4 e−8iπ/4 e−12iπ/4 u1 −6iπ/4 e u2 = X −1 u −12iπ/4 u3 e e−18iπ/4 u4 1 uj e−imxj . For any circulant system: w = X −1 u u = Xw is a complex Fourier transform from real space to wave space. In summary.3.36) . · · · . 4. 1. j = 1. REAL SPACE AND EIGENSPACE 65 For a 4point periodic mesh. 4. is a complex Fourier synthesis from wave space back to real space. m = 0.
66 CHAPTER 4. 1. while high wavenumber modes (if they have signiﬁcant amplitudes) can have large errors.37) and using κm = m.38) This can be compared with the exact solution to the PDE. The diﬀerence between the modiﬁed wavenumber and the actual wavenumber depends on the diﬀerencing scheme and the grid resolution. · · · . ∗ j = 0. This diﬀerence causes the various modes (or eigenvector components) to decay at rates which diﬀer from the exact solution.39) We see that the solutions are identical except for the steady solution and the modiﬁed wavenumber in the transient term. 2. Eq. 2 j = 1. 1. low wavenumber modes are accurately represented. With conventional diﬀerencing schemes. · · · . we obtain M −1 uj (t) = m=0 cm eλm t eiκm xj . 4. we can write the ODE solution as M uj (t) = m=1 cm e−νκm t sin κm xj + (A−1 f )j . · · · . The Convection Equation For the biconvection equation. ∗ 2 j = 1. M − 1 (4. · · · . M (4. We can write this ODE solution as M −1 uj (t) = m=0 cm e−iκm at eiκm xj . evaluated at the nodes of the grid: M uj (t) = m=1 cm e−νκm t sin κm xj + h(xj ).41) with the modiﬁed wavenumber deﬁned in Eq. THE SEMIDISCRETE APPROACH With the modiﬁed wavenumber deﬁned as κ∗ = m 2 κm ∆x sin ∆x 2 (4. The modiﬁed wavenumber is an approximation to the actual wavenumber. M (4.31. M − 1 (4. 2.42) .37. 2.40) where λm = −iκ∗ a m (4. j = 0.
THE REPRESENTATIVE EQUATION 67 and compare it to the exact solution of the PDE. the modiﬁed wavenumber is complex.44) The goal in our analysis is to study typical behavior of general situations. topics which are covered in Chapters 6 and 7.3. Our next objective is to ﬁnd a “typical” single ODE to analyze. this leads to an error in the speed with which various modes are convected. This is helpful both in analyzing the accuracy of timemarching methods and in studying their stability. 4.5. since κ∗ is real.43) Once again the diﬀerence appears through the modiﬁed wavenumber contained in λm . 1. j = 0.26.42 shows that the imaginary portion of the modiﬁed wavenumber produces nonphysical decay or growth in the numerical solution. Since the error in the phase speed depends on the wavenumber. We found the uncoupled solution to a set of ODE’s in Section 4. For such a purpose Eq. discussed in Chapter 11. 4. uncoupled equation and our conclusions will apply in general. .4.2. A typical member of the family is dwm = λm wm − gm (t) dt (4. 4.44 is not quite satisfactory. This leads to the following important concept: The numerical solution to a set of linear ODE’s (in which A is not a function of t) is entirely equivalent to the solution obtained if the equations are transformed to eigenspace.4. the result is erroneous numerical dispersion. and then returned as a coupled set to real space. we pointed out that Eqs. The importance of this concept resides in its message that we can analyze timemarching methods by applying them to a single. 4. solved there in their uncoupled form. it stands for some representative eigenvalue in the original A matrix. · · · . The role of λm is clear. evaluated at the nodes of the grid: M −1 uj (t) = m=0 fm (0)e−iκm at eiκm xj . M − 1 (4. while the actual phase speed is independent of the wavenumber.23 express identical results but in terms of diﬀerent groupings of the dependent variables. not particular problems. As discussed in Section 3. Eq.20 and 4. 2. However. which are related by algebraic manipulation.4 The Representative Equation In Section 4. The form of Eq. In the case of noncentered diﬀerencing.
Write the semidiscrete form resulting from the application of secondorder centered diﬀerences to the following equation on the domain 0 ≤ x ≤ 1 with boundary conditions u(0) = 0. For example −g(t) = k ak eikt for which Eq. This leads to The Representative ODE dw = λw + aeµt dt (4. one can express any one of the forcing terms gm (t) as a ﬁnite Fourier series. u(1) = 1: ∂u ∂2u = 2 − 6x ∂t ∂x . we note that. 4. In such evaluations the parameters λ and µ must be allowed to take the worst possible combination of values that might occur in the ODE eigensystem. The exact solution of the representative ODE is (for µ = λ): w(t) = ceλt + aeµt µ−λ (4. in principle. Consider the application of the operator given in Eq. 3. Write the resulting semidiscrete ODE form. 2. Consider the ﬁnitediﬀerence operator derived in question 1 of Chapter 3.44 has the exact solution: w(t) = ce + k λt ak eikt ik − λ From this we can extract the k’th term and replace ik with µ. Using this operator to approximate the spatial derivative in the linear convection equation.46) 4.68 CHAPTER 4.45) which can be used to evaluate all manner of timemarching methods.5 Problems 1.52 to the 1D diﬀusion equation with Dirichlet boundary conditions. Find the entries in the boundarycondition vector. 3. THE SEMIDISCRETE APPROACH the question is: What should we use for gm (t) when the time dependence cannot be ignored? To answer this question. write the semidiscrete form obtained with periodic boundary conditions on a 5point grid (M = 5).
Calculate and plot the corresponding ODE solutions. 0) = sin x. Calculate the numerical phase speed from the modiﬁed wavenumber corresponding to this initial condition and show that it is consistent with the ODE solution.4. PROBLEMS 69 4. The eigenvalues of the above matrix A.37. For initial conditions u(x. 0) = sin(mx) and boundary conditions u(0. Using these. Compare with the exact solution of the PDE. 1)/(2∆x) corresponding to the ODE form of the biconvection equation resulting from the application of secondorder central diﬀerencing on a 10point grid. The grid nodes are given by xj = j∆x. t) = 0. can be found from Appendix B. (Plot the solution at the grid nodes only. 0) = sin 2x. Consider a grid with 10 interior points spanning the domain 0 ≤ x ≤ π. . . 4. 0. 1. Consider the matrix A = −Bp (10. . j = 0. Repeat for the initial condition u(x.5. compute and plot the ODE solution at t = 2π for the initial condition u(x. −1. Note that the domain is 0 ≤ x ≤ 2π and ∆x = 2π/10. t) = u(π.4. . as well as the matrices X and X −1 . 5. plot the exact solution of the diﬀusion equation with ν = 1 at t = 1 with m = 1 and m = 3. 9.) Calculate the corresponding modiﬁed wavenumbers for the secondorder centered operator from Eq.
THE SEMIDISCRETE APPROACH .70 CHAPTER 4.
FdS = V (t) P dV (5. we will show how similar semidiscrete forms can be derived using ﬁnitevolume approximations in space. Consistent with our emphasis on semidiscrete methods. and energy are conserved in a discrete sense. we saw how to derive ﬁnitediﬀerence approximations to arbitrary derivatives.1 or Eq. 71 . we will study the latter form. ﬁnitevolume methods do not require a coordinate transformation in order to be applied on irregular meshes. Finitevolume methods are applied to the integral form of the governing equations. either in the form of Eq. This increased ﬂexibility can be used to great advantage in generating grids about arbitrary geometries. it is obtained naturally from a ﬁnitevolume formulation. As a result. Finitevolume methods have become popular in CFD as a result. In Chapter 4. momentum.1. 2. While this property can usually be obtained using a ﬁnitediﬀerence formulation. This will be followed by several examples which hopefully make these concepts clear. mass. we saw that the application of a ﬁnitediﬀerence approximation to the spatial derivatives in our model PDE’s produces a coupled set of ODE’s. 2. Next we will give our model equations in the form of Eq.Chapter 5 FINITEVOLUME METHODS In Chapter 3. In this Chapter. of two advantages. they ensure that the discretization is conservative. i.2.1) We will begin by presenting the basic concepts which apply to ﬁnitevolume strategies. First. 5. they can be applied on unstructured meshes consisting of arbitrary polyhedra in three dimensions or arbitrary polygons in two dimensions. which is d dt QdV + V (t) S(t) n..e. Second. primarily.
each cell will have a diﬀerent piecewise approximation to Q. they will generally produce diﬀerent approximations to the ﬂux at the boundary between two control volumes. we note that the average value of Q in a cell with volume V is 1 ¯ Q≡ V and Eq. Let us consider these approximations in more detail. Q can be represented within the cell by some piecewise approximation which produces ¯ the correct value of Q.FdS = S V QdV V (5. In order to evaluate the ﬂuxes. the source term P must be integrated over the control volume. which are a function of Q.3) P dV (5. As we shall see in our examples. Thus after applying a time¯ marching method. we see that several approximations must be made. 5. Next a timemarching method1 can be applied to ﬁnd the value of QdV V (5. A nondissipative scheme analogous to centered diﬀerencing is obtained by taking the average of these two ﬂuxes. When these are used to calculate F(Q).4) for a control volume which does not vary with time. Examining Eq. FINITEVOLUME METHODS 5. the ﬂux will be discontinuous. we have updated values of the cellaveraged quantities Q. that is.72 CHAPTER 5. at the controlvolume boundary.1. 5. This ﬂux must then be integrated to ﬁnd the net ﬂux through the boundary. The ﬂux is required at the boundary of the control volume. First. 5. Similarly. we will consider only control volumes which do not vary with time. .2) at the next time step.1 Basic Concepts The basic idea of a ﬁnitevolume method is to satisfy the integral form of the conservation law to some degree of approximation for each of many contiguous control volumes which cover the domain of interest.1 can be written as V d ¯ Q+ dt n. In our examples. The basic elements of a ﬁnitevolume method are thus the following: 1 Timemarching methods will be discussed in the next chapter. which is a closed surface in three dimensions and a closed contour in two dimensions. Thus the volume V in Eq. Another approach known as ﬂuxdiﬀerence splitting is described in Chapter 11.1 is that of a control volume whose shape is dependent on the nature of the grid. This is a form of interpolation often referred to as reconstruction.
Using this approximation. y. Advance the solution in time to obtain new values of Q.2. In order to include diﬀusive ﬂuxes. .1 Model Equations in Integral Form The Linear Convection Equation A twodimensional form of the linear convection equation can be written as ∂u ∂u ∂u + a cos θ + a sin θ =0 ∂t ∂x ∂y (5. 5. ¯ 4. z) in each control volume. y. Evaluate F(Q) at the boundary. two distinct values of the ﬂux will generally be obtained at any point on the boundary between two control volumes. the following relation between ∇Q and Q is sometimes used: ∇QdV = nQdS (5. Since there is a distinct approximation to Q(x.2 5.5. z) in each control volume.4. ﬁnd Q at the controlvolume boundary. A ∇QdA = nQdl C (5. Given the value of Q for each control volume. y. The order of accuracy of the method is dependent on each of the approximations. Integrate the ﬂux to ﬁnd the net ﬂux through the controlvolume boundary using some sort of quadrature. This issue is discussed in Section 11.7) This PDE governs a simple plane wave convecting the scalar quantity.2. 3. construct an approximation to Q(x. in two dimensions. These ideas should be clariﬁed by the examples in the remainder of this chapter. 2. MODEL EQUATIONS IN INTEGRAL FORM 73 ¯ 1. t) with speed a along a straight line making an angle θ with respect to the xaxis. u(x.5) V S or.2.6) where the unit vector n points outward from the surface or contour. Apply some strategy for resolving the discontinuity in the ﬂux at the controlvolume boundary to produce a single value of F(Q) at any point on the boundary. The onedimensional form is recovered with θ = 0.
We consider an equispaced grid with spacing ∆x.13) (5.10) Since Q is a scalar. 5.74 CHAPTER 5.12) (5. Substituting these expressions into a twodimensional form of Eq.16) to be the integral form of the twodimensional diﬀusion equation. Control volume j extends from xj − ∆x/2 to xj + ∆x/2.17) .9) (5.(iu cos θ + ju sin θ)ds = 0 (5.3 OneDimensional Examples We restrict our attention to a scalar dependent variable u and a scalar ﬂux f .2. 2.3. We will use the following notation: xj−1/2 = xj − ∆x/2. The nodes of the grid are located at xj = j∆x as usual.2 The Diﬀusion Equation The integral form of the twodimensional diﬀusion equation with no source term and unit diﬀusion coeﬃcient ν is obtained from the general divergence form. xj+1/2 = xj + ∆x/2 (5. Eq. 2.14) (5. F is simply a vector. 5. i ∂u ∂u +j ds ∂x ∂y (5. the twodimensional linear convection equation is obtained from the general divergence form. Eq. as shown in Fig.8) (5. FINITEVOLUME METHODS For unit speed a. 2. as in the model equations. 5. with Q = u F = iu cos θ + ju sin θ P = 0 (5. with Q = u F = −∇u ∂u ∂u = − i +j ∂x ∂y P = 0 Using these. we ﬁnd d dt udA = A C (5.2 gives the following integral form d dt udA + A C n.11) where A is the area of the cell which is bounded by the closed contour C.15) n.3.1.
fj±1/2 = f (uj±1/2) (5.19 in a Taylor series about xj (with t ﬁxed) to get uj ¯ 1 ≡ ∆x = uj + ∂u uj + ξ ∂x −∆x/2 ∆x 24 2 ∆x/2 ξ2 + 2 j 4 ∂2 u ∂x2 j ξ3 + 6 ∂3u ∂x3 + . becomes d¯j u ∆x + fj+1/2 − fj−1/2 = 0 (5. . uj±1/2 = u(xj±1/2 ). 5. 5. the cellaverage value becomes xj+1/2 1 uj (t) ≡ ¯ u(x.3. ONEDIMENSIONAL EXAMPLES 75 j1/2 ∆ x j+1/2 j2 j1 L R j L R j+1 j+2 Figure 5. dξ j ∂ u ∂x2 2 + j ∆x 1920 ∂ u ∂x4 4 + O(∆x6 ) j (5.1: Control volume in one dimension.23) dt .20) Now with ξ = x − xj . 5.3.18) With these deﬁnitions.21) or uj = uj + O(∆x2 ) ¯ (5. Eq. Hence the cellaverage value and the value at the center of the cell diﬀer by a term of second order.11. t)dx ∆x xj−1/2 and the integral form becomes d (∆x¯j ) + fj+1/2 − fj−1/2 = u dt xj+1/2 xj−1/2 (5.1 A SecondOrder Approximation to the Convection Equation In one dimension. we can expand u(x) in Eq.19) P dx (5. .5. the integral form of the linear convection equation.22) where uj is the value at the center of the cell.
this point operator produces the following semidiscrete form: du ¯ 1 =− Bp (−1. 0. We choose a piecewise constant approximation to u(x) in each cell such that u(x) = uj ¯ Evaluating this at j + 1/2 gives L fj+1/2 = f (uL ) = uL ¯ j+1/2 j+1/2 = uj xj−1/2 ≤ x ≤ xj+1/2 (5. Eq. Thus 1 L 1 R ˆ fj+1/2 = (fj+1/2 + fj+1/2 ) = (¯j + uj+1) u ¯ 2 2 and 1 L 1 R ˆ fj−1/2 = (fj−1/2 + fj−1/2 ) = (¯j−1 + uj ) u ¯ 2 2 (5. giving R fj−1/2 = uj ¯ (5. 5. gives R fj+1/2 = uj+1 ¯ (5.25) where the L indicates that this approximation to fj+1/2 is obtained from the approximation to u(x) in the cell to the left of xj+1/2 .31) With periodic boundary conditions. The cell to the right of xj+1/2 .28) We have now accomplished the ﬁrst step from the list in Section 5.76 CHAPTER 5.30 into the integral form. we obtain ∆x d¯j 1 u 1 d¯j 1 u + (¯j + uj+1) − (¯j−1 + uj ) = ∆x u ¯ u ¯ + (¯j+1 − uj−1 ) = 0 u ¯ dt 2 2 dt 2 (5.24) (5. which is cell j + 1.1.29) ˆ where f denotes a numerical ﬂux which is an approximation to the exact ﬂux. the discontinuity in the ﬂux at the cell boundary is resolved by taking the average of the ﬂuxes on either side of the boundary. 1)u ¯ dt 2∆x (5.27) and cell j − 1 is the cell to the left of xj−1/2 . 5. as shown in Fig.1. giving L fj−1/2 = uj−1 ¯ (5. In this example.23. Substituting Eqs.26) Similarly. FINITEVOLUME METHODS with f = u.30) (5. 5. cell j is the cell to the right of xj−1/2 .32) .29 and 5. we have deﬁned the ﬂuxes at the cell boundaries in terms of the cellaverage data.
rather than the nodal values.30.3. ONEDIMENSIONAL EXAMPLES 77 This is identical to the expression obtained using secondorder centered diﬀerences.5. and c are chosen to satisfy the following constraints: 1 ∆x 1 ∆x 1 ∆x These constraints lead to a= uj+1 − 2¯j + uj−1 ¯ u ¯ 2 2∆x uj+1 − uj−1 ¯ ¯ 2∆x (5.35) −∆x/2 −3∆x/2 ∆x/2 −∆x/2 3∆x/2 ∆x/2 u(ξ)dξ = uj−1 ¯ u(ξ)dξ = uj ¯ (5.33) where ξ is again equal to x − xj . Since the eigenvalues of Bp (−1. 0.29 and 5.36) . except it is written in terms of the cell average u. 1) are pure imaginary.3. and c.34) u(ξ)dξ = uj+1 ¯ b= −¯j−1 + 26¯j − uj+1 u u ¯ 24 With these values of a. we can conclude that the use of the average of the ﬂuxes on either side of the cell boundary. The three parameters a.3. 0. 5.1 with a piecewise quadratic approximation as follows u(ξ) = aξ 2 + bξ + c (5. u. 1) is relevant to ﬁnitevolume methods as well as ﬁnitediﬀerence methods. 5. ¯ Hence our analysis and understanding of the eigensystem of the matrix Bp (−1. b. can lead to a nondissipative ﬁnitevolume method. the piecewise quadratic approximation produces the following values at the cell boundaries: c= 1 uL u u ¯ j+1/2 = (2¯j+1 + 5¯j − uj−1 ) 6 (5. as in Eqs. b.2 A FourthOrder Approximation to the Convection Equation Let us replace the piecewise constant approximation in Section 5.
23.3.3. −1)u ¯ dt 12∆x This is a system of ODE’s governing the evolution of the cellaverage data. we again use the average of the ﬂuxes on either side of the boundary to obtain 1 ˆ fj+1/2 = [f (uL ) + f (uR )] j+1/2 j+1/2 2 1 = (−¯j+2 + 7¯j+1 + 7¯j − uj−1) u u u ¯ 12 and 1 ˆ fj−1/2 = [f (uL ) + f (uR )] j−1/2 j−1/2 2 1 = u u ¯ (−¯j+1 + 7¯j + 7¯j−1 − uj−2) u 12 Substituting these expressions into the integral form.41) This is a fourthorder approximation to the integral form of the equation. (5.40) (5. FINITEVOLUME METHODS 1 uR u u u j−1/2 = (−¯j+1 + 5¯j + 2¯j−1 ) 6 1 u u uR u j+1/2 = (−¯j+2 + 5¯j+1 + 2¯j ) 6 1 uL u u ¯ j−1/2 = (2¯j + 5¯j−1 − uj−2 ) 6 (5.1. the following semidiscrete form is obtained: du ¯ 1 =− Bp (1.38) (5.42) (5.39) using the notation deﬁned in Section 5. 0. we describe two approaches to deriving a ﬁnitevolume approximation to the diﬀusion equation. . With periodic boundary conditions. 8.3 A SecondOrder Approximation to the Diﬀusion Equation In this section.37) (5. −8. Recalling that f = u. Eq.43) 5. 5. as can be veriﬁed using Taylor series expansions (see question 1 at the end of this chapter). while the second approach is more suited to extension to higher order accuracy. The ﬁrst approach is simpler to extend to multidimensions. gives ∆x d¯j u 1 + (−¯j+2 + 8¯j+1 − 8¯j−1 + uj−2 ) = 0 u u u ¯ dt 12 (5.78 CHAPTER 5.
with Dirichlet boundary conditions.5. 5.45) We can thus write the following expression for the average value of the gradient of u over the interval xj ≤ x ≤ xj+1 : 1 ∆x xj+1 xj ∂u 1 dx = (uj+1 − uj ) ∂x ∆x (5. Hence. −2.22. 5. 5. 1 ˆ fj−1/2 = − ¯ (¯j − uj−1) u ∆x Substituting these into the integral form.44) with f = −∇u = −∂u/∂x. to secondorder.16. Also.3. du ¯ 1 = B(1. 5. we obtain ∆x d¯j u 1 = (¯j−1 − 2¯j + uj+1) u u ¯ dt ∆x (5.51) .47) or.2.49) (5.44.3. Eq. we can write ∂u ˆ fj+1/2 = − ∂x Similarly. we use a piecewise quadratic approximation as in Section 5.46) From Eq. becomes ∆x d¯j u + fj+1/2 − fj−1/2 = 0 dt (5. Eq. From Eq. −2.6 becomes b a ∂u dx = u(b) − u(a) ∂x (5. 1)u + bc ¯ dt ∆x2 (5. and we see that the properties of the matrix B(1. Eq. 5. the integral form of the diﬀusion equation. 1) are relevant to the study of ﬁnitevolume methods as well as ﬁnitediﬀerence methods. we know that the value of a continuous function at the center of a given interval is equal to the average value of the function over the interval to secondorder accuracy. ONEDIMENSIONAL EXAMPLES 79 In one dimension.48) =− j+1/2 1 (¯j+1 − uj ) u ¯ ∆x (5.50) This provides a semidiscrete ﬁnitevolume approximation to the diﬀusion equation. For our second approach.33 we have ∂u ∂u = = 2aξ + b ∂x ∂ξ (5.
∆ is the length of the sides of the triangles.56) n. The following relations hold between .2.3.80 CHAPTER 5. 5. The nodal data are stored at the vertices of the triangles formed by the grid. 1)u ¯ dt ∆x2 which is written entirely in terms of cellaverage data. as shown in Figure 5. FINITEVOLUME METHODS with a and b given in Eq.52) 1 ¯ (5. ∆.35.4 A TwoDimensional Example The above onedimensional examples of ﬁnitevolume approximations obscure some of the practical aspects of such methods.57) . This produces R L fj−1/2 = fj−1/2 = − d¯j u 1 = (¯j−1 − 2¯j + uj+1 ) u u ¯ (5.1. The resulting semidiscrete form with periodic boundary conditions is 1 du ¯ = Bp (1. and is the length of the sides of the hexagons. As in Section 5. 1 = √ ∆ 3 √ 3 3 2 A = 2 2 = A 3∆ The twodimensional form of the conservation law is d dt QdA + A C (5. 5. (5. we use a piecewise constant approximation in each control volume and the ﬂux at the control volume boundary is the average of the ﬂuxes obtained on either side of the boundary.49.55) 5. With f = −∂u/∂x. Thus our ﬁnal example is a ﬁnitevolume approximation to the twodimensional linear convection equation on a grid consisting of regular triangles. The control volumes are regular hexagons with area A.Fdl = 0 (5. this gives R L fj+1/2 = fj+1/2 = − 1 ¯ (¯j+1 − uj ) u ∆x (5. −2.54) dt ∆x2 which is identical to Eq. and A.53) (¯j − uj−1 ) u ∆x Notice that there is no discontinuity in the ﬂux at the cell boundary.
2: Triangular grid. n 3 j)/2 i √ (i + √ j)/2 3 (−i + 3 j)/2 −i √ (−i − 3 j)/2 (i − √ Table 5. . The contour in the line integral is composed of the sides of the hexagon.2. where we have ignored the source term. A TWODIMENSIONAL EXAMPLE c b 81 2 3 p d 4 1 l ∆ 0 5 a e f Figure 5. i and j are unit normals along x and y. as shown in Figure 5. ν 0 1 2 3 4 5 Outward Normal.2. the unit normals can be taken outside the integral and the ﬂux balance is given by d dt 5 Q dA + A ν=0 nν · Fdl = 0 ν where ν indexes a side of the hexagon.1.5. Outward normals. 5. Since these sides are all straight. A list of the normals for the mesh orientation shown is given in Table 5. see Fig. respectively.1. Side.4.
see Eq. Side. ν 0 1 2 3 4 5 nν · (i cos θ + j sin θ) √ (cos θ − 3 sin θ)/2 cos θ √ (cos θ + √ sin θ)/2 3 (− cos θ + 3 sin θ)/2 − cos θ √ (− cos θ − 3 sin θ)/2 Table 5. FINITEVOLUME METHODS For Eq. 5.82 CHAPTER 5. 5.2. we have d dt 5 u dA + A ν=0 nν · (i cos θ + j sin θ) 1/2 −1/2 u(z)dz ν =0 (5. Introducing the cell average. Making the change of variable z = ξ/ .60.11.2. if the integrals in the equation are evaluated exactly. the ap¯ proximation to the ﬂux integral becomes trivial. A u dA = A¯p u (5.62) . 5. There are no numerical approximations in Eq. That is.58) where ξ is a length measured from the middle of a side ν.60) The values of nν · (i cos θ + j sin θ) are given by the expressions in Table 5. one has the expression /2 − /2 1/2 u(ξ)dξ = −1/2 u(z)dz (5.59) Then. the integrated time rate of change of the integral of u over the area of the hexagon is known exactly. we have for side ν nν · Fdl = nν · (i cos θ + j sin θ) /2 − /2 ν uν (ξ)dξ (5. the twodimensional linear convection equation. Weights of ﬂux integrals. Taking the average of the ﬂux on either side of each edge of the hexagon gives for edge 1: u(z)dz = 1 u p + ua ¯ ¯ 2 1/2 −1/2 dz = u p + ua ¯ ¯ 2 (5. in terms of u and the hexagon area A.60.61) and the piecewiseconstant approximation u = up over the entire hexagon.
5. PROBLEMS Similarly.5.64) u(z)dz = 4 (5.5. Use Taylor series to verify that Eq.66) u(z)dz = 0 (5. that this is a secondorder approximation to the integral form of the twodimensional linear convection equation.63) u(z)dz = 3 (5. 5. 5.69) The reader can verify.60. 5.2. we obtain A √ d¯p u + [(2 cos θ)(¯a − ud ) + (cos θ + 3 sin θ)(¯b − ue ) u ¯ u ¯ dt 2 √ +(− cos θ + 3 sin θ)(¯c − uf )] = 0 u ¯ (5.68) or √ d¯p u 1 + [(2 cos θ)(¯a − ud ) + (cos θ + 3 sin θ)(¯b − ue ) u ¯ u ¯ dt 3∆ √ +(− cos θ + 3 sin θ)(¯c − uf )] = 0 u ¯ (5. Find the semidiscrete ODE form governing the cellaverage data resulting from the use of a linear approximation in developing a ﬁnitevolume method for the linear convection equation.65) u(z)dz = 5 (5. using Taylor series expansions.23. we have for the other ﬁve edges: u(z)dz = 2 83 u p + ub ¯ ¯ 2 u p + uc ¯ ¯ 2 u p + ud ¯ ¯ 2 u p + ue ¯ ¯ 2 u p + uf ¯ ¯ 2 (5.67) Substituting these into Eq. along with the expressions in Table 5.42 is a fourthorder approximation to Eq. Use the following linear approximation: u(ξ) = aξ + b . 2.5 Problems 1.
Using the ﬁrst approach given in Section 5. 4.3. derive a ﬁnitevolume approximation to the spatial terms in the twodimensional diﬀusion equation on a square grid. FINITEVOLUME METHODS a= uj+1 − uj−1 ¯ ¯ 2∆x and use the average ﬂux at the cell interface. . 3. Repeat question 3 for a grid consisting of equilateral triangles.3.84 where b = uj and ¯ CHAPTER 5.
For example.). For a steady ﬂow problem. timeaccuracy is not required.1) These can be integrated in time using a timemarching method to obtain a timeaccurate solution to an unsteady ﬂow problem. one can consider a timedependent path to the steady state and use a timemarching method to integrate the unsteady form of the equations until the solution is suﬃciently close to the steady solution.10. we obtain a coupled system of nonlinear ODE’s in the form du = F (u. spatial discretization leads to a coupled system of nonlinear algebraic equations in the form F (u) = 0 (6. one can consider the use of Newton’s method. 85 . Alternatively. The subject of the present chapter. When using a timemarching method to compute steady ﬂows. This produces an iterative method in which a coupled system of linear algebraic equations must be solved at each iteration. the goal is simply to remove the transient portion of the solution as quickly as possible.2) As a result of the nonlinearity of these equations. which will be discussed in Chapter 9. some sort of iterative method is required to obtain a solution. or directly using Gaussian elimination or some variation thereof. This motivates the study of stability and stiﬀness. timemarching methods for ODE’s. These can be solved iteratively using relaxation methods. which is widely used for nonlinear algebraic equations (See Section 6. topics which are covered in the next two chapters. t) dt (6. is thus relevant to both steady and unsteady ﬂow problems.3.Chapter 6 TIMEMARCHING METHODS FOR ODE’S After discretizing the spatial derivatives in the governing PDE’s (such as the NavierStokes equations).
u. The methods we study are to be applied to linear or nonlinear ODE’s. β−1 un−1 .1. 6. or the (n) superscript. 6. but the methods themselves are formed by linear combinations of the dependent variable and its derivative at various time intervals. rather than w. We then face a further task concerning the numerical stability of the resulting methods. always points to a discrete time value.3) Although we use u to represent the dependent variable. In earlier chapters. Application of a timemarching method to an ODE produces an ordinary diﬀerence equation (O∆E ). β0 un . we introduced the convention that the n subscript. 4. In this chapter we will present some basic theory of linear O∆E’s which closely parallels that for linear ODE’s. Our ﬁrst task is to ﬁnd numerical approximations that can be used to carry out the time integration of Eq. for the purpose of this chapter. tn + αh) ˜ u The choice of u or F to express the derivative in a scheme is arbitrary. t) dt (6. we need only consider the scalar case du = u = F (u. where accuracy can be measured either in a local or a global sense. and. For these we use the notation ˜ ¯ ˜ un+α = Fn+α = F (˜n+α. 6.3 to some given accuracy.3 gives un = Fn = F (un. · · · (6. α0 un . the reader should recall the arguments made in Chapter 4 to justify the study of a scalar ODE. and h represents the time interval ∆t. α−1 un−1 . we developed exact solutions to our model PDE’s and ODE’s. etc. · · · . They are represented conceptually by un+1 = f β1 un+1.1 Notation Using the semidiscrete approach. tn ) . They are both commonly used in the literature on ODE’s. TIMEMARCHING METHODS FOR ODE’S Application of a spatial discretization to a PDE produces a coupled system of ODE’s.4) . Combining this notation with Eq. using this theory. tn = nh Often we need a more sophisticated notation for intermediate time steps involving temporary calculations denoted by u.86 CHAPTER 6. However. we will develop exact solutions for the model O∆E’s arising from the application of timemarching methods to the model ODE’s. we reduce our PDE to a set of coupled ODE’s represented in general by Eq. In Chapter 2. but we postpone such considerations to the next chapter.
3 1 . we can analytically convert Here we give only MacCormack’s timemarching method.2 Converting TimeMarching Methods to O∆E’s Examples of some very common forms of methods used for timemarching general ODE’s are: un+1 = un + hun un+1 = un + hun+1 and un+1 = un + hun ˜ 1 un+1 = [un + un+1 + h˜n+1 ] ˜ u 2 (6. implicit methods require more complicated strategies to solve for un+1 than explicit methods. will be presented in Section 11.7) According to the conditions presented under Eq. that is. and therefore the time advance is simple. An explicit method is one in which the new predicted solution is only a function of known data. and un−1 for a method using two previous time levels. for systems of ODE’s and nonlinear problems. The value of this equation arises from the fact that. these methods can be constructed to give a local Taylor series accuracy of any order. un. at given time intervals. un+1 . un−1. These methods are simple recipes for the time advance of a function in terms of its value and the value of its derivative.1 respectively. The second is implicit and referred to as the implicit (or backward) Euler method. The methods are said to be explicit if β1 = 0 and implicit otherwise. For an implicit method.2. 6. the ﬁrst and third of these are examples of explicit methods. As we shall see. The material presented in Chapter 4 develops a basis for evaluating such methods by introducing the concept of the representative equation du = u = λu + aeµt dt (6. The method commonly referred to as MacCormack’s method. CONVERTING TIMEMARCHING METHODS TO O∆E’S 87 With an appropriate choice of the α s and β s. un . u.4. for example.6) (6. 6.8) written here in terms of the dependent variable. by applying a timemarching method. which is a fullydiscrete method.5) (6. the new predicted solution is also a function of the time derivative at the new time level.6. We refer to them as the explicit Euler method and the MacCormack predictorcorrector method.
6. to Eq. This is demonstrated by repeating some of the developments in Section 4. 6. TIMEMARCHING METHODS FOR ODE’S such a linear ODE into a linear O∆E . but for diﬀerence rather than diﬀerential equations. We next consider examples of this conversion process and then go into the general theory on solving O∆E’s. Eq. with constant coeﬃcients. and just as powerful as. the predictorcorrector sequence.11 is identical to Eq. 6. 6.2.8. gives un+1 − (1 + λh)un = aheµhn ˜ 1 1 1 u aheµh(n+1) − (1 + λh)˜n+1 + un+1 − un = (6.11 is obtained by noting that un+1 = F (˜n+1. Eq. .9.9) Eq.6. (6.12) Now we need to develop techniques for analyzing these diﬀerence equations so that we can compare the merits of the timemarching methods that generated them. the theory of ODE’s. expressed in terms of the dependent variable un and the independent variable n. There results un+1 = un + h(λun + aeµhn ) or un+1 − (1 + λh)un = haeµhn (6.7 is simply the explicit Euler method.5. The second line in Eq. As another example. we ﬁnd un+1 = un + h λun+1 + aeµh(n+1) or (1 − λh)un+1 − un = heµh · aeµhn As a ﬁnal example. The latter are subject to a whole body of analysis that is similar in many respects to. 6.11) 2 2 2 which is a coupled set of linear O∆E’s with constant coeﬃcients. Apply the simple explicit Euler scheme. applying the implicit Euler method. tn + h) ˜ u = λ˜n+1 + aeµh(n+1) u (6.10) 6. since the predictor step in Eq.3 Solution of Linear O∆E’s With Constant Coeﬃcients The techniques for solving linear diﬀerence equations with constant coeﬃcients is as well developed as that for ODE’s and the theory follows a remarkably parallel path. 6. to Eq. 6. Note that the ﬁrst line in Eq. Eq.7.9 is a linear O∆E .8. 6.88 CHAPTER 6. 6. 6.
and SecondOrder Diﬀerence Equations FirstOrder Equations The simplest nonhomogeneous O∆E of interest is given by the single ﬁrstorder equation un+1 = σun + abn (6. un+k = E k un Further notice that the displacement operator also applies to exponents. In terms of the initial value of u it can be written u n = u0 σ n + a bn − σn b−σ Just as in the development of Eq.3. 6. 4.1 First. we use for O∆E’s the diﬀerence operator E (commonly referred to as the displacement or shift operator) and deﬁned formally by the relations un+1 = Eun . and b are. a.3. un+1 = σun + aσn is un = u0 + anσ−1 σ n This can all be easily veriﬁed by substitution. in general. The exact solution of Eq. thus bα · bn = bn+α = E α · bn . one can readily show that the solution of the defective case.13 is un = c1 σ n + abn b−σ where c1 is a constant determined by the initial conditions. complex parameters.6. and since the equations are linear and have constant coeﬃcients. SOLUTION OF LINEAR O∆E’S WITH CONSTANT COEFFICIENTS 89 6. SecondOrder Equations The homogeneous form of a secondorder diﬀerence equation is given by un+2 + a1 un+1 + a0 un = 0 (6.14) d Instead of the diﬀerential operator D ≡ dt used for ODE’s. The independent variable is now n rather than t. (b = σ). σ is not a function of either n or u.10.13) where σ.
17 can be written (c11 − E) c12 c21 (c22 − E) u1 u2 (n) = [ C − E I ]un = 0 which must be zero for all u1 and u2 .14 for all c1 . TIMEMARCHING METHODS FOR ODE’S where α can be any fraction or irrational number. there are just two σ roots and in terms of them the solution can be written un = c1 (σ1 )n + c2 (σ2 )n (6. 6. The fact that Eq. The operational form contains a characteristic polynomial P (E) which plays the same role for diﬀerence equations that P (D) played for diﬀerential equations.15 is known as the operational form of Eq. 6.16) where c1 and c2 depend upon the initial conditions. 6. 6. · · ·. linear homogeneous diﬀerence equations have the form u1 u2 which can also be written un+1 = C un . and refer to them as the σroots. Eq. Thus Eq. etc. 6. 6.16 is a solution to Eq.2 Special Cases of Coupled FirstOrder Equations A Complete System Coupled.3. u2 (n) (n) T (n+1) (n+1) = c11 u1 + c12 u2 = c21 u1 + c22 u2 (n) (n) (n) (n) (6. The roles of D and E are the same insofar as once they have been introduced to the basic equations the value of u(t) or un can be factored out. un = u1 . this time having the form P (E) = det [ C − E I ]. we label these roots σ1 .17) .15) which must be zero for all un . c2 and n should be veriﬁed by substitution. ﬁrstorder. C= c11 c21 c12 c22 The operational form of Eq. 6. Again we are led to a characteristic polynomial.90 CHAPTER 6. In the simple example given above.14 can now be reexpressed in an operational notion as (E 2 + a1 E + a0 )un = 0 (6. its roots determine the solution to the O∆E.14. σ2 . They are found by solving the equation P (σ) = 0. that is. The σroots are found from P (σ) = det (c11 − σ) c12 =0 c21 (c22 − σ) . In the analysis of O∆E’s.
if x are its eigenvectors.17 is 2 un = k=1 ck (σk )n xk where ck are constants determined by the initial conditions. linear.2. one can show that the solution to un+1 ¯ σ ˆ un+1 = 1 un+1 is σ 1 un ¯ ˆ un σ un u n = u0 σ n ¯ ¯ un = u0 + u0 nσ−1 σ n ˆ ˆ ¯ un = u0 + u0 nσ−1 + u0 ˆ ¯ n(n − 1) −2 n σ σ 2 (6. the solution of Eq. produced by applying a timemarching method to the representative equation.11. SOLUTION OF THE REPRESENTATIVE O∆E’S 91 Obviously the σk are the eigenvalues of C and. Using the displacement operator.18) 6.4 6.2 for defective ODE’s.9 to 6.19) (6. 6.6.21) .4. following the logic of Section 4. 6. A Defective System The solution of O∆E’s with defective eigensystems follows closely the logic in Section 4.20) E −(1 + λh) 1 (1 + λh)E −2 E−1 2 u ˜ u =h· n 1 1E 2 · aeµhn (6. For example.2.4. E. these equations can be written [E − (1 + λh)]un = h · aeµhn [(1 − λh)E − 1]un = h · E · aeµhn (6. ﬁrstorder ordinary diﬀerence equations. are given by Eqs.1 Solution of the Representative O∆E’s The Operational Form and its Solution Examples of the nonhomogeneous.
˜ the important subset of this solution which occurs when µ = 0.23) where σk are the K roots of the characteristic polynomial.21. Eq. Keep in mind that for methods such as in Eq. respectively. Eq. 6.22 all manner of standard timemarching methods having multiple time steps and various types of intermediate predictorcorrector families. one for un and un and we are usually only interested in the ﬁnal solution un. 6.22) which is produced by applying timemarching methods to the representative ODE. Notice also.21. For the explicit Euler method. or a steady state. 6. TIMEMARCHING METHODS FOR ODE’S All three of these equations are subsets of the operational form of the representative O∆E P (E)un = Q(E) · aeµhn (6. we derive the solutions of Eqs. 6. representing a time invariant particular solution.22 can be expressed as K un = k=1 ck (σk ) + ae n µhn Q(eµh ) · P (eµh ) (6. 6.45.2 Examples of Solutions to TimeMarching O∆E’s As examples of the use of Eqs.19 to 6.92 CHAPTER 6. 4. 6.25) . In such a case K un = k=1 ck (σk )n + a · Q(1) P (1) 6.20. we have P (E) = E − 1 − λh Q(E) = h and the solution of its representative O∆E follows immediately from Eq. The terms P (E) and Q(E) are polynomials in E referred to as the characteristic polynomial and the particular polynomial. the ratio Q(E)/P (E) can be found by Kramer’s rule.23: un = c1 (1 + λh)n + aeµhn · For the implicit Euler method.24) h − 1 − λh (6. Eq.22 and 6. P (σ) = 0.4. We can express in terms of Eq.19.21 there are multiple (two in this case) solutions. as would be the case for Eq. The general solution of Eq. 6.23. 6. 6. When determinants are involved in the construction of P (E) and Q(E). we have P (E) = (1 − λh)E − 1 Q(E) = hE e µh (6.
26) 6. 4. one can write u(t) = c1 eλ1 h n x1 + · · · + cm eλm h n xm + · · · + cM eλM h n xM + P.6.1 The λ − σ Relation Establishing the Relation We have now been introduced to two basic kinds of roots. Eq.5. THE λ − σ RELATION so un = c1 1 1 − λh n 93 heµh · (1 − λh)eµh − 1 + ae µhn In the case of the coupled predictorcorrector equations. and there results P (E) = det E 1 (1 + λh)E −2 −(1 + λh) 1 = E E − 1 − λh − λ2 h2 1 E−2 2 1 h 1 hE = 2 hE(E + 1 + λh) 2 Q(E) = det The σroot is found from E − 1 (1 + λh)E 2 1 P (σ) = σ σ − 1 − λh − λ2 h2 = 0 2 which has only one nontrivial root (σ = 0 is simply a shift in the reference index). There is a fundamental relation between the two which can be used to identify many of the essential properties of a timemarch method. the λroots and the σroots. The former are the eigenvalues of the A matrix in the ODE’s found by space diﬀerencing the original PDE.S. This relation is ﬁrst demonstrated by developing it for the explicit Euler method. 6.5. The complete solution can therefore be written 1 un = c1 1 + λh + λ2 h2 2 n 1 h eµh + 1 + λh µhn 2 + ae · 1 eµh − 1 − λh − λ2 h2 2 (6.27. one solves for the ˜ ﬁnal family un (one can also ﬁnd a solution for the intermediate family u).21. (6. Remembering that t = nh. First we make use of the semidiscrete approach to ﬁnd a system of ODE’s and then express its solution in the form of Eq. and the latter are the roots of the characteristic polynomial in a representative O∆E found by applying a timemarching method to the representative ODE.27) .5 6.
instead of the Euler method.8 to Eq. Now the explicit Euler method produces for each λroot. Suppose.28. we see a correspondence between σm and eλm h . (6. Since the value of eλh can be expressed in terms of the series 1 1 1 eλh = 1 + λh + λ2 h2 + λ3 h3 + · · · + λn hn + · · · 2 6 n! the truncated expansion σ = 1 + λh is a reasonable3 approximation for small enough λh.3. .27 and Eq. So if we use the Euler method for the time advance of the ODE’s.29.32) This is an approximation to eλm h with an error O(λ3 h3 ). the solution2 of the resulting O∆E is un = c1 (σ1 )n x1 + · · · + cm (σm )n xm + · · · + cM (σM )n xM + P. λm h − 1 + λ2 h2 . so that for every λ the σ must satisfy the relation 2 σm − 2λm hσm − 1 = 0 (6.28) where the cm and the xm in the two equations are identical and σm = (1 + λm h). For one of these we ﬁnd σm = λm h + 1 + λ2 h2 m 1 1 = 1 + λm h + λ2 h2 − λ4 h4 + · · · m 2 8 m (6.5. The other root. 6. one σroot. 6. 6.30) Now we notice that each λ produces two σroots. we have the characteristic polynomial P (E) = E2 − 2λhE − 1.29) Applying Eq.4.S. The error is O(λ2 h2 ).31) (6. which is given by σ = 1 + λh. will be discussed in Section 6. TIMEMARCHING METHODS FOR ODE’S where for the present we are not interested in the form of the particular solution (P. 6. we use the leapfrog method for the time advance. which is deﬁned by un+1 = un−1 + 2hun (6. Comparing Eq. m 2 3 Based on Section 4.).S.94 CHAPTER 6.
6. which is the leapfrog method: un+1 = un−1 + 2hun (6.3 Spurious σRoots We saw from Eq.5. this is reduced by a factor of eight (considering the leading term only).5. 6. 4.33) has a leading truncation error which is O(h2 ). Therefore the error at the end of the simulation is reduced by a factor of four. Note that a secondorder approximation to a derivative written in the form (δt u)n = 1 (un+1 − un−1 ) 2h (6.34) (6. knowing only that its leading error is O hk+1 .30 that the λ − σ relation for the leapfrog method produces two σroots for each λ.2 The Principal σRoot Based on the above we make the following observation: Application of the same timemarching method to all of the equations in a coupled system linear ODE’s in the form of Eq. twice as many time steps are required to reach the time T . always produces one σroot for every λroot that satisﬁes the relation 1 1 σ = 1 + λh + λ2 h2 + · · · + λk hk + O hk+1 2 k! where k is the order of the timemarching method.34.5. However. The following example makes this clear. THE λ − σ RELATION 95 6. 6. One of these we identiﬁed as the principal root which always . We refer to the root that has the above property as the principal σroot. and designate it (σm )1 . 6. Thus the principal root is an approximation to eλh up to O hk .35) has a leading truncation error O(h3 ). This arises simply because of our notation for the timemarching method in which we have multiplied through by h to get an approximation for the function un+1 rather than the derivative as in Eq. Now consider the solution obtained using the same method with a time step h/2. Since the error per time step is O(h3 ). while the secondorder timemarching method which results from this approximation. consistent with a secondorder approximation. Consider a solution obtained at a given time T using a secondorder timemarching method with a time step h.6. The above property can be stated regardless of the details of the timemarching method.
. Such roots originate entirely from the numerical approximation of the timemarching method and have nothing to do with the ODE being solved. For example. It should be mentioned that methods with spurious roots are not self starting. the leapfrog method requires un−1 and thus cannot be started using only un . 6. In fact. if there is one spurious root to a method. the solution for the O∆E in the form shown in Eq. they play virtually no role in accuracy analysis. . This results because methods which produce spurious roots require data from time level n − 1 or earlier. and presumably the magnitudes of the spurious roots themselves are all less than one (see Chapter 7). except for the principal one. Following again the message of Section 4. We refer to them as oneroot methods. if there are more spurious roots (6. The initial vector u0 does not provide enough data to initialize all of the coeﬃcients.36 must be initialized by some starting procedure. If a timemarching method produces spurious σroots.96 CHAPTER 6.36) Spurious roots arise if a method uses data from time level n − 1 or earlier to advance the solution from time level n to n + 1.4. In general. generation of spurious roots does not. 3. we have un = c11 (σ1 )n x1 + · · · + cm1(σm )n xm + · · · + cM 1 (σM )n xM + P.28 must be modiﬁed. 6. All spurious roots are designated (σm )k where k = 2. The other is referred to as a spurious σroot and designated (σm )2 . after some ﬁnite time the amplitude of the error associated with the spurious roots is even smaller then when it was initialized. TIMEMARCHING METHODS FOR ODE’S has the property given in 6. make a method inferior.. Then the presence of spurious roots does not contaminate the answer. 6. many very accurate methods in practical use for integrating some forms of ODE’s have spurious roots. if one starts the method properly) the spurious coeﬃcients are all initialized with very small magnitudes. To express this fact we use the notation σ = σ(λh).33. in itself. are spurious.5. · · ·. 1 1 1 +c12 (σ1 )n x1 + · · · + cm2 (σm )n xm + · · · + cM 2 (σM )n xM 2 2 2 +c13 (σ1 )n x1 + · · · + cm3 (σm )n xm + · · · + cM 3 (σM )n xM 3 3 3 +etc. the λ − σ relation produced by a timemarching scheme can result in multiple σroots all of which. it is always some algebraic function of the product λh. For example. all of the coeﬃcients (cm )2 in Eq. That is.e. However. Thus while spurious roots must be considered in stability analysis. No matter whether a σroot is principal or spurious. They are also called onestep methods.4 OneRoot TimeMarching Methods There are a number of timemarching methods that produce only one σroot for each λroot.S. Presumably (i.
as we shall see in Chapter 8.6. these equations are identical.37) un = c1 (σ1 )n x1 + · · · + cm (σm )n xm + · · · + cM (σM )n xM + A−1 f respectively. it is of no use in: . For example. A popular method having this property.1 Accuracy Measures of TimeMarching Methods Local and Global Error Measures There are two broad categories of errors that can be used to derive and evaluate timemarching methods. see Section 3.4. Its λ − σ relation is σ= 1 + (1 − θ)λh 1 − θλh It is instructive to compare the exact solution to a set of ODE’s (with a complete eigensystem) having timeinvariant forcing terms with the exact solution to the O∆E’s for oneroot methods. This is a global error. The only error made by introducing the time marching is the error that σ makes in approximating eλh . Three oneroot methods were analyzed in Section 6.6 6.6. Notice that when t and n = 0. It is usually used as the basis for establishing the order of a method. is given by the formula un+1 = un + h (1 − θ)un + θun+1 The θmethod represents the explicit Euler (θ = 0). so that all the constants. The other is the error determined at the end of a given event which has covered a speciﬁc interval of time composed of many time steps.4. ACCURACY MEASURES OF TIMEMARCHING METHODS 97 They have the signiﬁcant advantage of being selfstarting which carries with it the very useful property that the timestep interval can be changed at will throughout the marching process. It is useful for comparing methods. It is quite common to judge a timemarching method on the basis of results found from a Taylor table. These are u(t) = c1 eλ1 h n x1 + · · · + cm eλm h n xm + · · · + cM eλM h n xM + A−1 f (6. vectors.2. the trapezoidal (θ = 1 ). 6. However. a Taylor series analysis is a very limited tool for ﬁnding the more subtle properties of a numerical timemarching method. and matrices are identical except the u and the terms inside the parentheses on the right hand sides. This is a local error such as that found from a Taylor table analysis. the socalled θmethod. One is the error made in each time step. respectively.6. and the 2 implicit Euler methods (θ = 1).
However. • evaluating numerical stability and separating the errors in phase and amplitude. is to some extent arbitrary. This is similar to the error found from a Taylor table.38) and the solution to the representative O∆E’s. a very natural local error measure for the transient solution is the value of the diﬀerence between solutions based on these two roots. a necessary condition for the choice should be that the measure can be used consistently for all methods. erω ) Transient error The particular choice of an error measure. either local or global.39) 6. The order of the method is the last power of λh matched exactly. The latter three of these are of concern to us here. TIMEMARCHING METHODS FOR ODE’S • ﬁnding spurious roots. Our error measures are based on the diﬀerence between the exact solution to the representative ODE.6. • analyzing the particular solution of predictorcorrector combinations. In the discussion of the λσ relation we saw that all timemarching methods produce a principal σroot for every λroot that exists in a set of linear ODE’s. including only the contribution from the principal root.98 CHAPTER 6. and to study them we make use of the material developed in the previous sections of this chapter. . Therefore. which can be written as un = c1 (σ1 ) + ae n µhn Q(eµh ) · P (eµh ) (6. given by u(t) = ceλt + aeµt µ−λ (6. We designate this by erλ and make the following deﬁnition erλ ≡ eλh − σ1 The leading error term can be found by expanding in a Taylor series and choosing the ﬁrst nonvanishing term. σ .2 Local Accuracy of the Transient Solution (erλ. • ﬁnding the global error.
Cn = ah/∆x. Such can indeed be the case when we study the equations governing periodic convection which produces harmonic motion.41) (σ1 )2 + (σ1 )2 i r Amplitude and phase errors are important measures of the suitability of timemarching methods for convection and wave propagation phenomena. as follows erp = erω tan−1 [(σ1 )i /(σ1 )r )] =1+ ωh Cn κ∆x (6. Introducing the Courant number. Thus one can obtain values of the principal root over the range 0 ≤ κ∆x ≤ π for a given value of the Courant number. Let λ = iω where ω is a real number representing a frequency. we have λh = −iCn κ∗ ∆x. ACCURACY MEASURES OF TIMEMARCHING METHODS Amplitude and Phase Error 99 Suppose a λ eigenvalue is imaginary.3 Local Accuracy of the Particular Solution (erµ ) The numerical error in the particular solution is found by comparing the particular solution of the ODE with that for the O∆E. The approach to error analysis described in Section 3. is found using λ = −iaκ∗ . that is era = 1 − σ1  = 1 − and the local error in phase can be deﬁned as erω ≡ ωh − tan−1 [(σ1 )i /(σ1 )r )] (6. The above expression for erω can be normalized to give the error in the phase speed. where κ∗ is the modiﬁed wavenumber of the spatial discretization.6.(ODE) = aeµt · 1 (µ − λ) . σ1 (λh).5 can be extended to the combination of a spatial discretization and a timemarching method applied to the linear convection equation. 6.40) From this it follows that the local error in amplitude is measured by the deviation of σ1  from unity. Then the numerical method must produce a principal σroot that is complex and expressible in the form σ1 = σr + iσi ≈ eiωh (6. A positive value of erp corresponds to phase lag (the numerical phase speed is too small). For such cases it is more meaningful to express the error in terms of amplitude and phase.S.42) where ω = −aκ. while a negative value corresponds to phase lead (the numerical phase speed is too large).6. The principal root. We have found these to be given by P.6.
6. An illustration of this is given in the section on RungeKutta methods.S.43) The multiplication by h converts the error from a global measure to a local one. Eq. a timemarching method is said to be of order k if erλ = c1 · (λh)k1 +1 erµ = c2 · (λh)k2 +1 where k = smallest of(k1 . If the forcing function is independent of time.S.S. More precisely. so that the order of erλ and erµ are consistent. TIMEMARCHING METHODS FOR ODE’S Q(eµh ) = ae · P (eµh ) µt P.100 and CHAPTER 6.4 Time Accuracy For Nonlinear Applications In practice. The algebra involved in ﬁnding the order of erµ can be quite tedious.43 can be written in terms of the characteristic and particular polynomials as erµ = where co = lim h→0 co · (µ − λ)Q eµh − P eµh µ−λ h(µ − λ) P eµh (6. k2 ) (6. µ is equal to zero.(O∆E) respectively.46) (6. For a measure of the local error in the particular solution we introduce the deﬁnition erµ ≡ h P.47) .44) The value of co is a methoddependent constant that is often equal to one. 6. and it is necessary that the advertised order of accuracy be valid for the nonlinear cases as well as for the linear ones. this order is quite important in determining the true order of a timemarching method by the process that has been outlined. timemarching methods are usually applied to nonlinear ODE’s.45) (6. and for this case. A necessary condition for this to occur is that the local accuracies of both the transient and the particular solutions be of the same order.(O∆E) −1 P.(ODE) (6. In order to determine the leading error term.6. many numerical methods generate an erµ that is also zero. However.
we can also deﬁne global error measures. Suppose we wish to compute some timeaccurate phenomenon over a ﬁxed interval of time using a constant time step. This subject is covered in Chapter 8 after our introduction to stability in Chapter 7. However.50) . to derive all of the necessary conditions for the fourthorder RungeKutta method presented later in this chapter the derivation must be performed for a nonlinear ODE.6. Let T be the ﬁxed time of the event and h be the chosen step size. is N. These are useful when we come to the evaluation of timemarching methods for speciﬁc purposes. These are given by Era = 1 − and Erω ≡ N ωh − tan−1 (σ1 )i (σ1 )r = ωT − N tan−1 [(σ1 )i /(σ1 )r ] N (6.5 Global Accuracy In contrast to the local error measures which have just been discussed. For example. given by the relation T = Nh Global error in the transient A natural extension of erλ to cover the error in an entire event is given by Erλ ≡ eλT − (σ1 (λh))N Global error in amplitude and phase If the event is periodic.6. ACCURACY MEASURES OF TIMEMARCHING METHODS 101 The reader should be aware that this is not suﬃcient.48) (σ1 )2 + (σ1 )2 i r (6.49) (6. 6. the analysis based on a linear nonhomogeneous ODE produces the appropriate conditions for the majority of timemarching methods used in CFD.6. we are more concerned with the global error in amplitude and phase. We refer to such a computation as an “event”. Then the required number of time steps.
51) where the notation for F is deﬁned in Section 6. TIMEMARCHING METHODS FOR ODE’S Global error in the particular solution Finally.102 CHAPTER 6. one ﬁnds 1 k=1−K αk E k un = h 1 k=1−K βk E k (λun + aeµhn ) (6. When Eq.4. In the subsequent sections.51 is applied to the representative equation. we introduce classes of methods along with their associated error analysis. The Linear Multistep Methods (LMM’s) are probably the most natural extension to time marching of the space diﬀerencing schemes introduced in Chapter 3 and can be analyzed for accuracy or designed using the Taylor table approach of Section 3. It can be measured by Erµ ≡ (µ − λ) Q eµh P eµh −1 6. h. and the result is expressed in operational form. 6. Eq. The methods are said to be linear because the α’s and β’s are independent of u and n.8.7. developmental or design issues.52) .1. They are explicit if β1 = 0 and implicit otherwise. 6.1 The General Formulation du = u = F(u. t) dt When applied to the nonlinear ODE all linear multistep methods can be expressed in the general form 1 1 αk un+k = h k=1−K k=1−K βk Fn+k (6. we have developed the framework of error analysis for time advance methods and have randomly introduced a few methods without addressing motivational. the global error in the particular solution follows naturally by comparing the solutions to the ODE and the O∆E. We shall not spend much time on development or design of these methods.7 Linear Multistep Methods In the previous sections. since most of them have historic origins from a wide variety of disciplines and applications. and they are said to be Kstep because K timelevels of data are required to marching the solution one timestep. 6.
6. 6.5. 6. 6.44. α0 = −1. k = −1.7.51 with α1 = 1. and it is certainly a necessary condition for the accuracy of any time marching method.54 can be generated as un+1 −un −hβ1 un+1 −hβ0 un −hβ−1 un−1 −hβ−2 un−2 un 1 −1 h · un 1 −β1 −β0 −β−1 −(−2)0 β−2 h2 · un 1 2 −β1 β−1 −(−2)1 β−2 h3 · un 1 6 −β1 1 2 −(−2)2 β−2 1 2 −β−1 1 2 h4 · un 1 24 −β1 1 6 −(−2)3 β−2 1 6 β−1 1 6 (6.6. 6. LINEAR MULTISTEP METHODS 103 We recall from Section 6.4. αk = 0. We can also agree that. to be of any value in time accuracy. 6. −2. that is σ → (1 + λh) as h → 0.2 that a timemarching method when applied to the representative equation must provide a σroot. these methods are often “normalized” by requiring βk = 1 k Under this condition co = 1 in Eq. Taylor table for the AdamsMoulton threestep linear multistep method.53) The AdamsBashforth family has the same α’s with the additional constraint that β1 = 0. One can show that these conditions are met by any method represented by Eq. can be designed using the Taylor table approach of Section 3.7. Two wellknown families of them. referred to as AdamsBashforth (explicit) and AdamsMoulton (implicit). that approximates eλh through the order of the method. The threestep AdamsMoulton method can be written in the following form un+1 = un + h(β1 un+1 + β0 un + β−1 un−1 + β−2 un−2 ) A Taylor table for Eq. The AdamsMoulton family is obtained from Eq.2 Examples There are many special explicit and implicit forms of linear multistep methods.54) Table 6.51 can be multiplied by an arbitrary constant. .51 if αk = 0 and k k βk = k (K + k − 1)αk Since both sides of Eq.1. labeled σ1 . · · · (6. a method should at least be ﬁrstorder accurate. The condition referred to as consistency simply means that σ → 1 as h → 0.
One can verify that the Adams type schemes given below satisfy Eqs.58) This is the thirdorder AdamsBashforth method. .5.55 and 6. some of which are very common in CFD applications. β−2 = 1/24 (6.104 CHAPTER 6. In the following material AB(n) and AM(n) are used as abbreviations for the (n)th order AdamsBashforth and (n)th order AdamsMoulton methods.56) which produces a method which is fourthorder accurate.4 With β1 = 0 one obtains 1 1 1 β0 1 0 −2 −4 β−1 = 1 0 3 12 1 β−2 giving β0 = 23/12. β−1 = −5/24.57) (6. A list of simple methods. β1 β0 β−1 β−2 = 1 1 1 1 (6.2 that a kthorder timemarching method has a leading truncation error term which is O(hk+1 ). TIMEMARCHING METHODS FOR ODE’S This leads to the linear system 1 2 3 4 1 1 1 0 −2 −4 0 3 12 0 −4 −32 β0 = 19/24. resulting in β1 = 9/24. β−1 = −16/12. Explicit Methods un+1 un+1 un+1 un+1 Implicit Methods un+1 un+1 un+1 un+1 4 = un + hun = un−1 + 2hun 1 = un + 2 h 3un − un−1 h = un + 12 23un − 16un−1 + 5un−2 Euler Leapfrog AB2 AB3 = un + hun+1 = un + 1 h un + un+1 2 1 4u − u =3 n n−1 + 2hun+1 h = un + 12 5un+1 + 8un − un−1 Implicit Euler Trapezoidal (AM2) 2ndorder Backward AM3 Recall from Section 6.55) to solve for the β’s. 6. is given below together with identifying names that are sometimes associated with them. β−2 = 5/12 (6.57 up to the order of the method.
One can show after a little algebra that both erµ and erλ are reduced to 0(h3 ) (i.2.59 is given in Table 6. which corresponds to α−1 = 0 in Eq.51. which corresponds to α0 = 0 in Eq. see Eq.7. 6. 6. 6.51). 6.59.59) Clearly the methods are explicit if θ = 0 and implicit otherwise.e. This limits the interest in multistep methods to about two time levels. Notice that the Adams methods have ξ = 0. are known as Milne methods. 6. Some linear one. that is at least ﬁrstorder accurate. K=2 in Eq. can be written as (1 + ξ)un+1 = [(1 + 2ξ)un − ξun−1] + h θun+1 + (1 − θ + ϕ)un − ϕun−1 (6. LINEAR MULTISTEP METHODS 105 6. the methods are 2ndorder accurate) if 1 ϕ=ξ−θ+ 2 The class of all 3rdorder methods is determined by imposing the additional constraint 5 ξ = 2θ − 6 Finally a unique fourthorder method is found by setting θ = −ϕ = −ξ/3 = 1 . A list of methods contained in Eq.2.and twostep methods.7..e. 6 . θ 0 1 1/2 1 3/4 1/3 1/2 5/9 0 0 0 1/3 5/12 1/6 ξ 0 0 0 1/2 0 −1/2 −1/2 −1/6 −1/2 0 −5/6 −1/6 0 −1/2 ϕ 0 0 0 0 −1/4 −1/3 −1/2 −2/9 0 1/2 −1/3 0 1/12 −1/6 Method Euler Implicit Euler Trapezoidal or AM2 2nd Order Backward Adams type Lees Type Two–step trapezoidal A–contractive Leapfrog AB2 Most accurate explicit Third–order implicit AM3 Milne Order 1 1 2 2 2 2 2 2 2 2 3 3 3 4 Table 6.. Methods with ξ = −1/2.51.6. The most general twostep linear multistep method (i.3 TwoStep Linear Multistep Methods High resolution CFD problems usually require very large data sets to store the spatial information from which the time derivative is calculated.
63) Such as alternating direction. There may be many families in the sequence.61) (6. 6.6. and hybrid methods. following the example leading to Eq. each of which is referred to as a family in the solution process. 5 (6. TIMEMARCHING METHODS FOR ODE’S 6. αβ = 1 2 which provides two equations for three unknowns. and usually the ﬁnal family has a higher Taylorseries order of accuracy than the intermediate ones. Their use in solving ODE’s is relatively easy to illustrate and understand. where measures of eﬃciency are discussed in the next two chapters. to the following observations. A simple onepredictor. One concludes.62) Considering only local accuracy. but it is not diﬃcult to show using Eq.61 that γ+β =1 . One can analyze this sequence by applying it to the representative equation and using the operational techniques outlined in Section 6.26. Predictorcorrector methods constructed to timemarch linear or nonlinear ODE’s are composed of sequences of linear multistep methods. fractionalstep. therefore. For the method to be secondorder accurate both erλ and erµ must be O(h3 ). It is easy to show. Their use in solving PDE’s can be much more subtle and demands concepts5 which have no counterpart in the analysis of ODE’s. it is obvious from Eq. 6. The situation for erµ requires some algebra. Their use is motivated by ease of application and increased eﬃciency. onecorrector example is given by un+α = un + αhun ˜ un+1 = un + h β un+α + γun ˜ (6. that the predictorcorrector sequence un+α = un + αhun ˜ 1 1 2α − 1 un+1 = un + h un+α + ˜ un 2 α α is a secondorder accurate method for any α.106 CHAPTER 6. that P (E) = E α · E − 1 − (γ + β)λh − αβλ2h2 Q(E) = E α · h · [βE α + γ + αβλh] (6. For this to hold for erλ . 6.4.8 PredictorCorrector Methods There are a wide variety of predictorcorrector schemes created and used for a variety of purposes. β and γ are arbitrary parameters to be determined. .60) where the parameters α. one is led. by following the discussion in Section 6.44 that the same conditions also make it O(h3 ).
6. If the order of the corrector is (k). referred to as RungeKutta methods. is un+1 = un + hun ˜ 1 un+1 = ˜ u [un + un+1 + h˜n+1 ] 2 Note that MacCormack’s method can also be written as un+1 = un + hun ˜ 1 un+1 = un + h[un + un+1 ] ˜ 2 from which it is clear that it is obtained from Eq.67) (6. MacCormack’s method. 6 .66) (6. is 1 u ˜ un+1 = un + h 3˜n − un−1 ˜ 2 1 un+1 = un + h un + un+1 ˜ ˜ 2 The Burstein method.64) 12 Some simple. 6.68) 6.9. secondorder accurate methods are given below.6 that produce just one σroot for each λroot such that σ(λh) corresponds Although implicit and multistep RungeKutta methods exist. The AdamsBashforthMoulton sequence for k = 3 is 1 un+1 = un + h 3un − un−1 ˜ 2 h un+1 = un + 5˜n+1 + 8un − un−1 u (6. (6. which we discuss in Chapter 8. obtained from Eq.63 with α = 1.6.65) (6. we will consider only singlestep. The Gazdag method.63 with α = 1/2 is 1 un+1/2 = un + hun ˜ 2 un+1 = un + h˜n+1/2 u and. we refer to these as ABM(k) methods. explicit RungeKutta methods here. presented earlier in this chapter. RUNGEKUTTA METHODS 107 A classical predictorcorrector sequence is formed by following an AdamsBashforth predictor of any order with an AdamsMoulton corrector having an order one higher. speciﬁc. The order of the combination is then equal to the order of the corrector.9 RungeKutta Methods There is a special subset of predictorcorrector methods. ﬁnally.
First of all. It is usually introduced in the form k1 k2 k3 k4 followed by u(tn + h) − u(tn ) = µ1 k1 + µ2 k2 + µ3 k3 + µ4 k4 (6.72) Appearing in Eqs. but. α1 .69) 2 k! It is not particularly diﬃcult to build this property into a method. 6.71 and 6. it is not suﬃcient to guarantee k’th order accuracy for the solution of u = F (u.71) = = = = hF (un . tn + α1 h) hF (un + β2 k1 + γ2 k2 + δ2 k3 .108 CHAPTER 6.6. the choices for the time samplings.70) and this is much more diﬃcult.73) . tn + αh) hF (un + β1 k1 + γ1 k2 . tn + α2 h) However. TIMEMARCHING METHODS FOR ODE’S to the Taylor series expansion of eλh out through the order of the method and then truncates. tn ) hF (un + βk1 . The most widely publicized RungeKutta process is the one that leads to the fourthorder method. They must satisfy the relations α = β α1 = β1 + γ1 α2 = β2 + γ2 + δ2 (6. We present it below in some detail. Thus. we prefer to present it using predictorcorrector notation. are not arbitrary. 6. the principal (and only) σroot is given by 1 1 σ = 1 + λh + λ2 h2 + · · · + λk hk (6. a scheme entirely equivalent to 6. Thus for a RungeKutta method of order k (up to 4th order). t) or for the representative equation.4.69 and 6.72 are a total of 13 parameters which are to be determined such that the method is fourthorder according to the requirements in Eqs.71 is un+α un+α1 ˜ un+α2 un+1 = = = = un + βhun un + β1 hun + γ1 hun+α un + β2 hun + γ2 hun+α + δ2 h˜n+α1 u un + µ1 hun + µ2 hun+α + µ3 h˜n+α1 + µ4 hun+α2 u (6. as we pointed out in Section 6. α. To ensure k’th order accuracy. the method must further satisfy the constraint that erµ = O(hk+1 ) (6.70. and α2 .
It results in the “standard” fourthorder RungeKutta method expressed in predictorcorrector form as 1 un+1/2 = un + hun 2 1 un+1/2 = un + hun+1/2 ˜ 2 un+1 = un + h˜n+1/2 u 1 un+1 = un + h un + 2 un+1/2 + un+1/2 + un+1 ˜ 6 7 (6. 6. Using Eq.74) These four relations guarantee that the ﬁve terms in σ exactly match the ﬁrst 5 terms in the expansion of eλh . RUNGEKUTTA METHODS 109 The algebra involved in ﬁnding algebraic equations for the remaining 10 parameters is not trivial.75 which must be satisﬁed by the 10 unknowns. 6. Since the equations are overdetermined.9. As discussed in Section 6.74 and the ﬁrst equation in 6. but the equations follow directly from ﬁnding P (E) and Q(E) and then satisfying the conditions in Eqs.69 the four conditions µ1 + µ2 + µ3 + µ4 µ2 α + µ3 α1 + µ4 α2 µ3 αγ1 + µ4 (αγ2 + α1 δ2 ) µ4 αγ1 δ2 = 1 = 1/2 = 1/6 = 1/24 (1) (2) (3) (4) (6. To satisfy the condition that erµ = O(k 5 ). two parameters can be set arbitrarily. .73 to eliminate the β’s we ﬁnd from Eq.76) The present approach based on a linear inhomogeneous equation provides all of the necessary conditions for RungeKutta methods of up to third order.4.6.70. the fourth condition in Eq. we have to fulﬁll four more conditions 2 2 µ2 α2 + µ3 α1 + µ4 α2 3 3 µ2 α3 + µ3 α1 + µ4 α2 2 2 2 µ3 α γ1 + µ4 (α γ2 + α1 δ2 ) µ3 αα1 γ1 + µ4 α2 (αγ2 + α1 δ2 ) = 1/3 = 1/4 = 1/12 = 1/8 (3) (4) (4) (4) (6.75 are all satisﬁed. Thus if the ﬁrst 3 equations in 6.74 and 6. 6. 6. which is based on a linear nonhomogenous representative ODE.75) The number in parentheses at the end of each equation indicates the order that is the basis for the equation. Several choices for the parameters have been proposed.69 and 6. the resulting method would be thirdorder accurate.75 cannot be derived using the methodology presented here. A more general derivation based on a nonlinear ODE can be found in several books.7 There are eight equations in 6.6. but the most popular one is due to Runge.
2.77) using the implicit Euler method. Our presentation of the timemarching methods in the context of a linear scalar equation obscures some of the issues involved in implementing an implicit method for systems of equations and nonlinear equations.72 by setting µ4 = 0 and satisfying only Eqs. as Euler Predictor Euler Corrector Leapfrog Predictor Milne Corrector ≡ RK4 One can easily show that both the Burstein and the MacCormack methods given by Eqs. but they require more derivative evaluations than their order.74 and the ﬁrst equation in 6. 6. respectively.110 CHAPTER 6.66 and 6.78) . For example. 6. 6. Higherorder RungeKutta methods can be developed. It is clear that for orders one through four. These are covered in this Section. storage requirements reduce the usefulness of RungeKutta methods of order higher than four for CFD applications. This leads to various tradeoﬀs which must be considered in selecting a method for a speciﬁc application. we will see that these methods can have widely diﬀerent properties with respect to stability.67 are secondorder RungeKutta methods.1 Application to Systems of Equations Consider ﬁrst the numerical solution of our representative ODE u = λu + aeµt (6. 6.10 Implementation of Implicit Methods We have presented a wide variety of timemarching methods and shown how to derive their λ − σ relations.10. We shall discuss the consequences of this in Chapter 8.75. In any event. a ﬁfthorder method requires six evaluations to advance the solution one step. 6. TIMEMARCHING METHODS FOR ODE’S Notice that this represents the simple sequence of conventional linear multistep methods referred to. RK methods of order k require k evaluations of the derivative function to advance the solution one time step. and thirdorder methods can be derived from Eqs. In the next chapter. Following the steps outlined in Section 6. we obtained (1 − λh)un+1 − un = heµh · aeµhn (6.
84) This is a nonlinear diﬀerence equation.2 Application to Nonlinear Equations du = F (u. 6. 6. Now the equivalent to Eq. t) dt Now consider the general nonlinear scalar ODE given by (6. For our onedimensional examples.79) This calculation does not seem particularly onerous in comparison with the application of an explicit method to this ODE.85) . tn+1 ) (6. as we shall see in Chapter 8. consider the nonlinear ODE du 1 2 + u =0 dt 2 (6. A = −aBp (−1.81 as a linear system of equations. but rather we solve Eq. for biconvection..83) Application of the implicit Euler method gives un+1 = un + hF (un+1. requiring only an additional division. 1)/2∆x).81) The inverse is not actually performed. IMPLEMENTATION OF IMPLICIT METHODS Solving for un+1 gives un+1 = 1 (un + heµh · aeµhn ) 1 − λh 111 (6.10.78 is (I − hA)un+1 − un = −hf (t + h) or un+1 = (I − hA)−1 [un − hf (t + h)] (6. but in multidimensions the bandwidth can be very large. the system of equations which must be solved is tridiagonal (e.6. Now let us apply the implicit Euler method to our generic system of equations given by u = Au − f (t) (6.80) where u and f are vectors and we still assume that A is not a function of u or t. The primary area of application of implicit methods is in the solution of stiﬀ ODE’s. and hence its solution is inexpensive. In general.10. the cost per time step of an implicit method is larger than that of an explicit method. 0. 6.82) (6.g. As an example.
89) . t) = Fn + (u − un) + n (t − tn ) + O(h2 ) n (6.86) which requires a nontrivial method to solve for un+1 . both (t − tn )k and (u − un)k are O(hk ).83. There are several diﬀerent approaches one can take to solving this nonlinear diﬀerence equation. An iterative method. A Taylor series expansion about these reference quantities gives ∂F ∂u ∂F ∂t F (u. and Eq. which gives 1 un+1 + h u2 = un 2 n+1 (6. In order to implement the linearization. 6. TIMEMARCHING METHODS FOR ODE’S solved using implicit Euler time marching. the expansion of u(t) in terms of the independent variable t is ∂u ∂t 1 ∂2 u + (t − tn )2 2 ∂t2 n u(t) = un + (t − tn ) +··· n (6. 6.87) On the other hand. Such an approach is described in the next Section.10.87 can be written ∂F ∂u ∂F ∂t F (u. Designate the reference value by tn and the corresponding value of the dependent variable by un . tn ) + 1 ∂2 F + 2 ∂u2 + 1 ∂2 F 2 ∂t2 (u − un ) + n 2 (t − tn ) n ∂2F (u − un ) + ∂u∂t n (t − tn )2 + · · · n (u − un )(t − tn ) n (6.88) If t is within h of tn .3 Local Linearization for Scalar Equations General Development Let us start the process of local linearization by considering Eq.112 CHAPTER 6. which implies that a linearization approach may be quite successful. t) = F (un . since the “initial guess” is simply the solution at the previous time step. can be used. we expand F (u. such as Newton’s method (see below). t) about some reference point in time. In practice. the “initial guess” for this nonlinear problem can be quite close to the solution. 6.
6.6. The important point to be made here is that local linearization updated at each time step has not reduced the order of accuracy of a secondorder timemarching process. t) that is valid in the vicinity of the reference station tn and the corresponding un = u(tn ).90) Implementation of the Trapezoidal Method As an example of how such an expansion can be used. Using Eq. 6.91) where we write hO(h2 ) to emphasize that the method is second order accurate. There are. combine to make a secondorderaccurate numerical integration process. other secondorder implicit timemarching methods that can be used. and the trapezoidal time march. namely du = dt ∂F ∂u u + Fn − n ∂F ∂u un + n ∂F ∂t (t − tn ) + O(h2 ) n (6. The trapezoidal method is given by 1 un+1 = un + h[Fn+1 + Fn ] + hO(h2 ) 2 (6. IMPLEMENTATION OF IMPLICIT METHODS 113 Notice that this is an expansion of the derivative of the function.92) Note that the O(h2 ) term within the brackets (which is due to the local linearization) is multiplied by h and therefore is the same order as the hO(h2 ) error from the Trapezoidal Method.92 results in the expression 1 ∂F 1− h 2 ∂u 1 ∂F ∆un = hFn + h2 2 ∂t (6. it represents a secondorderaccurate. of course. tn+1 ).10.83. With this we obtain the locally (in the neighborhood of tn ) timelinear representation of Eq.89 to evaluate Fn+1 = F (un+1 . relative to the order of expansion of the function.83. Thus. locallylinear approximation to F (u. 6. 6. In many ﬂuid mechanic applications the nonlinear function F is not an explicit function .93) n n which is now in the delta form which will be formally introduced in Section 12. consider the mechanics of applying the trapezoidal method for the time integration of Eq. The use of local time linearization updated at the end of each time step. A very useful reordering of the terms in Eq. one ﬁnds 1 ∂F un+1 = un + h Fn + 2 ∂u +hO(h2 ) (un+1 − un ) + h n ∂F ∂t + O(h2 ) + Fn n (6. 6.
Implementation of the Implicit Euler Method We have seen that the ﬁrstorder implicit Euler method can be written un+1 = un + hFn+1 (6. TIMEMARCHING METHODS FOR ODE’S of t. In such cases the partial derivative of F (u) with respect to t is zero and Eq. thus un+1 = ∆un + un The solution for un+1 is generally stored such that it overwrites the value of un and the process is repeated. Find un+1 by adding ∆un to un . and remove the explicit dependence on time. 6.114 CHAPTER 6. (Very seldom does one ﬁnd B−1 in carrying out this step). Let this storage area be referred to as B. 3.94 is usually implemented as follows: 1. A numerical timemarching procedure using Eq.96) . Solve for the elements of hFn . it is generally the spacediﬀerenced form of the steadystate equation of some ﬂuid ﬂow problem. 6. and save un. store them in an array say R.90 into this method. the basic equation was the simple scalar equation 6. 6. 4. 2.94) Notice that the RHS is extremely simple.83. rearrange terms. It is the product of h and the RHS of the basic equation evaluated at the previous time step.95) if we introduce Eq. Solve the coupled set of linear equations B∆un = R for ∆un . we arrive at the form 1−h ∂F ∂u ∆un = hFn n (6. but for our applications.93 simpliﬁes to the secondorder accurate expression 1 ∂F 1− h 2 ∂u ∆un = hFn n (6. In this example. Solve for the elements of the matrix multiplying ∆un and store in some appropriate manner making use of sparseness or bandedness of the matrix if possible.
97) Fn n (6. IMPLEMENTATION OF IMPLICIT METHODS 115 We see that the only diﬀerence between the implementation of the trapezoidal method 1 and the implicit Euler method is the factor of 2 in the brackets of the left side of Eqs. 6. Omission of this factor degrades the method in time accuracy by one order of h. 6. Our task is to apply the implicit trapezoidal method to the equations d3 f d2 f df + f 2 + β 1 − 3 dt dt dt 2 =0 (6.10. i. Newton’s Method Consider the limit h → ∞ of Eq. 6.98) This is the wellknown Newton method for ﬁnding the roots of a nonlinear equation F (u) = 0.10.94 and 6. 6. We choose the FalknerSkan equations from classical boundarylayer theory. let us consider an example involving some simple boundarylayer equations. There results − or un+1 = un − ∂F ∂u −1 ∂F ∂u ∆un = Fn n (6. We shall see later that this method is an excellent choice for steady problems. By quadratic convergence.88 (remember the dependence on t has been eliminated for this case).87 and 6. . the error at a given iteration is some multiple of the error at the previous iteration.96 obtained by dividing both sides by h and setting 1/h = 0. Use of a ﬁnite value of h in Eq.96 leads to linear convergence. where the error is the diﬀerence between the current solution and the converged solution.4 Local Linearization for Coupled Sets of Nonlinear Equations In order to present this concept. we mean that the error after a given iteration is proportional to the square of the error at the previous iteration.. The reader should ponder the meaning of letting h → ∞ for the trapezoidal method. 6. The fact that it has quadratic convergence is veriﬁed by a glance at Eqs. and β is a scaling factor.6.96.99) Here f represents a dimensionless stream function.e.94. 6. Quadratic convergence is thus a very powerful property. given by Eq.
6.90.103) A = ∂F2 ∂u1 ∂F1 ∂u1 ∂F1 ∂u2 ∂F2 ∂u2 ∂F3 ∂u2 ∂F1 ∂u3 ∂F2 ∂u3 (6. u2 = .87.8 Let us refer to this matrix as A. The required extension requires the evaluation of a matrix. u3 = f 2 dt dt This gives the coupled set of three nonlinear equations u1 = u1 = F1 = −u1 u3 − β 1 − u2 2 u2 = F2 = u1 u3 = F3 = u2 and these can be represented in vector notation as du = F (u) (6.100) (6.102 by the following process A = (aij ) = ∂Fi /∂uj For the general case involving a third order matrix this is (6.99 to a set of ﬁrstorder nonlinear equations by the transformations d2 f df . 6. rather than a simple nonlinear scalar function. called the Jacobian matrix. a matrix times a vector for the second term. 9 . It is derived from Eq. except that this time we are faced with a nonlinear vector function.2 The Taylor series expansion of a vector contains a vector for the ﬁrst term. TIMEMARCHING METHODS FOR ODE’S First of all we reduce Eq.104) ∂F3 ∂u1 ∂F3 ∂u3 The expansion of F (u) about some reference state un can be expressed in a way similar to the scalar expansion given by eq 6. Omitting the explicit dependency on the independent variable t. one has 9 8 Recall that we derived the Jacobian matrices for the twodimensional Euler equations in Section 2. and deﬁning F n as F (un ).101) (6.102) dt Now we seek to make the same local expansion that derived Eq.116 CHAPTER 6. and tensor products for the terms of higher order. 6.
2.105) where t − tn and the argument for O(h2 ) is the same as in the derivation of Eq. Note that the series can be expressed as the solution to a diﬀerence equation of the form un+1 = un + un−1 . Using this we can write the local linearization of Eq. 6. and the manner in which. Without any iterating within a step advance.88. which is given by Eq. Thus for the FalknerSkan equations the trapezoidal method results in 1 + h (u3 )n −βh(u2 )n h (u1)n 2 2 −h 1 0 2 0 −h 1 2 − (u1 u3 )n − β(1 − u2 )n (∆u1 )n 2 (u1 )n (∆u2 )n = h (∆u3 )n (u2 )n We ﬁnd un+1 from ∆un + un . 6.106) dt “constant” which is a locallylinear. 8. 6. the terms in the Jacobian matrix are updated as the solution proceeds depends.6. on the nature of the problem. Any ﬁrst.11. What is u25 ? (Let the ﬁrst term given above be u0 . we ﬁnd the Jacobian matrix to be − u3 2βu2 −u1 0 0 A= 1 (6. 1.102 as du = An u + F n − An un +O(h2 ) (6. . could be used to integrate the equations without loss in accuracy with respect to order. The number of times. 6.) . Returning to our simple boundarylayer example. explicit or implicit.or secondorder timemarching method. Reevaluate the elements using un+1 and continue. Find an expression for the nth term in the Fibonacci series.11 Problems 1. PROBLEMS 117 F (u) = F n + An u − un + O(h2 ) (6. secondorderaccurate approximation to a set of coupled nonlinear ordinary diﬀerential equations that is valid for t ≤ tn + h. 5.93. . of course.101. and the solution is now advanced one step.107) 0 1 0 The student should be able to derive results for this example that are equivalent to those given for the scalar case in Eq. which is given by 1. the solution will be secondorderaccurate in time. 3. 6. .
6. TIMEMARCHING METHODS FOR ODE’S 2. (b) Derive the λσ relation.118 CHAPTER 6. Solve for the σroots and identify them as principal or spurious. The 2ndorder backward method is given by un+1 = 1 4un − un−1 + 2hun+1 3 (a) Write the O∆E for the representative equation. 6. (d) Perform a σroot trace relative to the unit circle for both diﬀusion and convection. The trapezoidal method un+1 = un + 1 h(un+1 + un ) is used to solve the repre2 sentative ODE. Identify the polynomials P (E) and Q(E). 5. (c) Find erλ and the ﬁrst two nonvanishing terms in a Taylor series expansion of the spurious root. (c) Find erλ .65) to the representative equation. (b) Derive the λ − σ relation. Find the λσ relation. Consider the timemarching scheme given by un+1 = un−1 + 2h (u + un + un−1 ) 3 n+1 (a) Write the O∆E for the representative equation. (a) What is the resulting O∆E? (b) What is its exact solution? (c) How does the exact steadystate solution of the O∆E compare with the exact steadystate solution of the ODE if µ = 0? 3. Consider the following timemarching method: un+1/3 = un + hun /3 ˜ un+1/2 = un + h˜n+1/3 /2 ¯ u un+1 = un + h¯n+1/2 u . Identify the polynomials P (E) and Q(E). 4. Find the diﬀerence equation which results from applying the Gazdag predictorcorrector method (Eq.
11. use u(x.5[(x−0. implicit Euler. 7. What order is the homogeneous solution? What order is the particular solution? Find the particular solution if the forcing term is ﬁxed. 8. plot the error given by (uj − uexact)2 j M j=1 M where M is the number of grid nodes and uexact is the exact solution. use a Courant number. For the initial condition.1. 10. including the homogeneous and particular solutions. for the other methods. Find erλ and erµ . PROBLEMS 119 Find the diﬀerence equation which results from applying this method to the representative equation.5)/σ] 2 with σ = 0. 9. For the explicit Euler and AB2 methods. Write a computer program to solve the onedimensional linear convection equation with periodic boundary conditions and a = 1 on the domain 0 ≤ x ≤ 1. Let u(0. Use 2ndorder centered diﬀerences in space and a grid of 50 points. On a loglog scale. use a Courant number of unity.6. 0) = e−0. 11. 200. Run until a periodic steady state is reached which is independent of the initial condition and plot your solution . Find the global order of accuracy from the plot.08. and 4thorder RungeKutta methods. Repeat problem 7 using 4thorder (noncompact) diﬀerences in space. Find the λσ relation. Find the solution to the diﬀerence equation. Use the explicit Euler. compute the solution at t = 1 using 2ndorder centered diﬀerences in space coupled with the 4thorder RungeKutta method for grids of 100. Show solutions at t = 1 and t = 10 compared to the exact solution. Plot the solutions obtained at t = 1 compared to the exact solution (which is identical to the initial condition). of 0. Use only 4thorder RungeKutta time marching at a Courant number of unity. Write a computer program to solve the onedimensional linear convection equation with inﬂowoutﬂow boundary conditions and a = 1 on the domain 0 ≤ x ≤ 1. Using the computer program written for problem 7. t) = sin ωt with ω = 10π. trapezoidal. repeat problem 9 using 4thorder (noncompact) diﬀerences in space. ah/∆x. 2ndorder AdamsBashforth (AB2). Using the computer program written for problem 8. and 400 nodes.
as described in problem 9. Use 2ndorder centered diﬀerences in space with a 1storder backward diﬀerence at the outﬂow boundary (as in Eq. ﬁnd the phase speed error. that obtained using the spatial discretization alone. At the last grid node. 6.6. 12. Note that the required σroots for the various RungeKutta methods can be deduced from Eq. TIMEMARCHING METHODS FOR ODE’S compared with the exact solution. derive and use a 3rdorder backward operator (using nodes j − 3. Find the global order of accuracy. Explain your results. j − 2. Use a thirdorder forwardbiased operator at the inﬂow boundary (as in Eq. the number of grid nodes. erp . Also plot the phase speed error obtained using exact integration in time.69. 2nd. 3. . and 4thorder RungeKutta timemarching at a Courant number of unity.e. j − 1. i. j. 13. 3.67). for the combination of secondorder centered differences and 1st. Repeat problem 11 using 4thorder (noncompact) centered diﬀerences.69) together with 4thorder RungeKutta time marching.. Using the approach described in Section 6.120 CHAPTER 6. and the amplitude error. j − 1. and j + 1. and 400 nodes and plot the error vs. Use grids with 100. use a 3rdorder backwardbiased operator (using nodes j − 2. era . without actually deriving the methods. see problem 1 in Chapter 3). 3rd.2. and j) and at the second last node. 200.
For a onestep method. Note also that only methods in which the time and space discretizations are treated separately can be written in an intermediate semidiscrete form such as Eq.2. For example. In these terms a system is said to be stable in a certain domain if. These equations are represented by du = Au − f (t) dt and un+1 = Cun − g n (7.2 by introducing new dependent variables. 7. the explicit Euler method leads to C = I + hA. Chapters 4 and 6 developed the representative forms of ODE’s generated from the basic PDE’s by the semidiscrete approach. The fullydiscrete form. but we do not dwell on them in this work. 7. Methods involving two or more steps can always be written in the form of Eq. 7. Our basic concern is with timedependent ODE’s and O∆E ’s in which the coeﬃcient matrices are independent of both u and t. and then the O∆E’s generated from the representative ODE’s by application of timemarching methods. These are important and interesting concepts. We will refer to such matrices as stationary.1) . some norm of its solution is always attracted to the same ﬁxed point.1. 121 (7. In the case of the nonlinear ODE’s of interest in ﬂuid dynamics. Eq. stability is often discussed in terms of ﬁxed points and attractors. from within that domain.Chapter 7 STABILITY OF LINEAR SYSTEMS A general deﬁnition of stability is neither simple nor universal and depends on the particular phenomenon being considered. see Section 4. the latter form is obtained by applying a timemarching method to the generic ODE form in a fairly straightforward manner.2) respectively.2 and the associated stability deﬁnitions and analysis are applicable to all methods. and gn = f(nh).
3.2. In many interesting cases. with a reminder that a complete system can be arbitrarily close to a defective one.122 CHAPTER 7.2. If A and C are stationary. estimate their fundamental properties. a stability deﬁnition can depend on both the time and space diﬀerencing.1 and 7. For example. we designated these eigenvalues as λm and σm for Eqs. 7. 1 This is not the case if the coeﬃcient matrix depends on t even if it is linear. 7. These expectations are referred to many times in the following analysis of stability properties. and we will ﬁnd it convenient to examine the stability of various methods in both the complex λ and complex σ planes. see Section 4. The stability of Eq.2 are suﬃcient to determine the stability. we found from our model ODE’s for diffusion and periodic convection what could be expected for the eigenvalue spectrums of practical physical problems containing these phenomena. In previous chapters. respectively.2 can often also be related to the eigensystem of its matrix. Consideration will be given to matrices that have both complete and defective eigensystems. we can. They are important enough to be summarized by the following: • For diﬀusion dominated ﬂows the λeigenvalues tend to lie along the negative real axis.1 and 7. This is discussed in Section 7. in which case practical applications can make the properties of the latter appear to dominate. • For periodic convectiondominated ﬂows the λeigenvalues tend to lie along the imaginary axis.1 depends entirely on the eigensystem1 of A. in this case the situation is not quite so simple since.2. 7. The stability of Eq.3.1 Dependence on the Eigensystem Our deﬁnitions of stability are based entirely on the behavior of the homogeneous parts of Eqs. 7.2. However. .1 and 7. 7. the eigenvalues of the matrices in Eqs.4. in our applications to partial diﬀerential equations (especially hyperbolic ones). Analysis of these eigensystems has the important added advantage that it gives an estimate of the rate at which a solution approaches a steadystate if a system is stable. in Section 4. STABILITY OF LINEAR SYSTEMS 7. in theory at least.
that the ODE’s are inherently stable if and only if (λm ) ≤ 0 for all m (7. for example. Finally we note that for ODE’s with complete eigensystems the eigenvectors play no role in the inherent stability criterion.27. 7.3 and especially on Eqs. or to the left of. we inspect the nature of their solutions in eigenspace. In an eigenspace related to defective systems the form of the representative equation changes from a single equation to a Jordan block.1 Inherent Stability of ODE’s The Criterion Here we state the standard stability criterion used for ordinary diﬀerential equations.2 7. Eq. (7.2.18 to 4. For a stationary matrix A.2. the imaginary axis in the complex λ plane.3 Defective Eigensystems In order to understand the stability of ODE’s that have defective eigensystems. 6. 4. u remains bounded as t → ∞. INHERENT STABILITY OF ODE’S 123 7. all of its eigenvectors are linearly independent. 7.2. for inherent stability.2 Complete Eigensystems If a matrix has a complete eigensystem. when f is constant. 4.2.19 in that section. In such a case it follows at once from Eq. This criterion is satisﬁed for the model ODE’s representing both diﬀusion and biconvection. 7. For this we draw on the results in Sections 4.1 is inherently stable if. but neither does it grow. instead of Eq. It should be emphasized (as it is an important practical consideration in convectiondominated systems) that the special case for which λ = ±i is included in the domain of stability.3) Note that inherent stability depends only on the transient solution of the ODE’s. so the above condition is met. In this case it is true that u does not decay as t → ∞.4) This states that.45 a typical form of the homogeneous part might be u1 λ u1 u2 = 1 λ u2 u3 1 λ u3 . For example.2. and the matrix can be diagonalized by a similarity transformation. all of the λ eigenvalues must lie on.7.
A timespace domain is ﬁxed and stability is deﬁned in terms of what happens to some norm of the solution within this domain as the mesh intervals go to zero at some constant ratio. Note that the stability condition 7. condition 7.1 Numerical Stability of O∆E ’s The Criterion The O∆E companion to Statement 7. 7.7) We see that numerical stability depends only on the transient solution of the O∆E ’s. Eq. on a computer such a growth might destroy the solution process before it could be terminated. We discuss this point of view in Section 7. This deﬁnition of stability is sometimes referred to as asymptotic or time stability. However.3 is For a stationary matrix C. In such cases it is appropriate to consider simultaneously the eﬀects of both the time and space approximations.2 is numerically stable if.6) since for pure imaginary λ.6 excludes the imaginary axis which tends to be occupied by the eigenvalues related to biconvection problems. As we stated at the beginning of this chapter. STABILITY OF LINEAR SYSTEMS for which one ﬁnds the solution u1 (t) = u1 (0)eλt u2 (t) = [u2 (0) + u1 (0)t]eλt u3 (t) = 1 u3 (0) + u2 (0)t + u1 (0)t2 eλt 2 (7. .6 is of little or no practical importance if signiﬁcant amounts of dissipation are present. (7. un remains bounded as n → ∞. 7.5) Inspecting this solution. in practical applications the criterion may be worthless since there may be a very large growth of the polynomial before the exponential “takes over” and brings about the decay. Theoretically this condition is suﬃcient for stability in the sense of Statement 7.3 7. we see that for such cases condition 7.3. Furthermore. A deﬁnition often used in CFD literature stems from the development of PDE solutions that do not necessarily follow the semidiscrete route.3 since tk e− t → 0 as t → ∞ for all nonzero . u2 and u3 would grow without bound (linearly or quadratically) if u2 (0) = 0 or u1(0) = 0.4. when g is constant. stability deﬁnitions are not unique.124 CHAPTER 7. However.4 must be modiﬁed to the form (λm ) < 0 for all m (7.
.5 if σ = 1 and either u2(0) or u1 (0) = 0 we get linear or quadratic growth. follows at once from a study of Eq. Examine Eq. TIMESPACE STABILITY AND CONVERGENCE OF O∆E’S 125 7. This deﬁnition of stability for O∆E ’s is consistent with the stability deﬁnition for ODE’s. Further. Timemarch methods are developed which guarantee that σ(λh) ≤ 1 and this is taken to be the condition for numerical stability. 7.8) This condition states that. 7. 7. Again the sensitive case occurs for the periodicconvection model which places the “correct” location of the principal σ–root precisely on the unit circle where the solution is only neutrally stable. 2. according to the condition set in Eq. for a complete eigensystem. all of the σ eigenvalues (both principal and spurious. The stability criterion. 7.18 and note its similarity with Eq.28 and its companion for multiple σroots. Eq. 6. 6.2 Complete Eigensystems Consider a set of O∆E ’s governed by a complete eigensystem. if there are any) must lie on or inside the unit circle in the complex σplane. for example in Eq. We see that for defective O∆E’s the required modiﬁcation to 7. Clearly. for numerical stability. 7.3.7.4 TimeSpace Stability and Convergence of O∆E’s Let us now examine the concept of stability in a diﬀerent way. The PDE’s are converted to ODE’s by approximating the space derivatives on a ﬁnite mesh.36.8 is (σm )k  < 1 for all m and k (7.9) since defective systems do not guarantee boundedness for σ = 1. the eigenvectors play no role in the numerical stability assessment.4. 6.7.5. 3. In the previous discussion we considered in some detail the following approach: 1. for such systems a timemarching method is numerically stable if and only if (σm )k  ≤ 1 for all m and k (7. Inherent stability of the ODE’s is established by guaranteeing that (λ) ≤ 0.3.3 Defective Eigensystems The discussion for these systems parallels the discussion for defective ODE’s.
2. A method is convergent if it converges to the exact solution as the mesh spacing and time step go to zero in this manner. First construct a ﬁnite timespace domain lying within 0 ≤ x ≤ L and 0 ≤ t ≤ T .126 CHAPTER 7. Cover this domain with a grid that is equispaced in both time and space and ﬁx the mesh ratio by the equation2 cn = ∆t ∆x Next reduce our O∆E approximation of the PDE to a twolevel (i. Later we will refer to the numerical solution converging to a steady state solution.10) Eq. must go to inﬁnity in order to cover the entire ﬁxed domain. Applying simple recursion to Eq. we ﬁnd un = Cn u0 and using vector and matrix pnorms (see Appendix A) and their inequality relations. we have un = Cn u0  ≤ Cn  · u0 ≤ Cn · u0 2 (7.10 is said to be stable if any bounded initial vector. Now let us deﬁne stability in the timespace sense. which states that. will have a numerical solution that is bounded as t = nh → ∞. two timeplanes) formula in the form of Eq. This is the classical deﬁnition of stability.3 Clearly. 7.11) This ratio is appropriate for hyperbolic problems.7 is a necessary condition for this stability criterion.e. The signiﬁcance of this deﬁnition of stability arises through Lax’s Theorem. u0 . un .10. This is further discussed in Section 7. the number of time steps. if a numerical method is stable (in the sense of Lax) and consistent then it is convergent. . as the mesh shrinks to zero for a ﬁxed cn . in the hyperbolic case). a diﬀerent ratio may be needed for parabolic problems such as the model diﬀusion equation. generated from a PDE on some ﬁxed space mesh. Clearly as the mesh intervals go to zero. so the criterion in 7. produces a bounded solution vector. this is an important property. It is often referred to as Lax or LaxRichtmyer stability.. STABILITY OF LINEAR SYSTEMS This does guarantee that a stationary system. 7. The homogeneous part of this formula is un+1 = Cun (7. 7. 3 In the CFD literature.8. A method is consistent if it produces no error (in the Taylor series sense) in the limit as the mesh spacing and the time step go to zero (with cn ﬁxed. the word converge is used with two entirely diﬀerent meanings. This does not guarantee that desirable solutions are generated in the time march process as both the time and space mesh intervals approach zero. N . Here we refer to a numerical solution converging to the exact solution of the PDE.
12. It is clear that the criteria are the same when the spectral radius is a true pnorm. when C is normal. the solution vector is bounded if C ≤ 1 127 (7. and circulant matrices. The spectral radius of a matrix is its L2 norm when the matrix is normal. stability is related to a pnorm of C. asymmetric... its eigenvalue of maximum magnitude.7. i.12 becomes both necessary and suﬃcient for stability.8 and 7. This includes symmetric. The spectral radius is the lower bound of all norms.8 and 7. Two facts about the relation between spectral radii and matrix norms are well known: 1.9.9 with that given in Eq. 2. • If the spectral radius of any governing matrix is greater than one. 7. In this case.11 becomes an equality. Actually the necessary condition is that the spectral radius of C be less than or equal to 1 + O(∆t).12.4. it commutes with its transpose. In Eq. 7.e. the second inequality in Eq.12 are identical for stationary systems when the governing matrix is normal. stability is related to the spectral radius of C. 7. 7. 7. i. the spectral radius condition is necessary4 but not suﬃcient for stability. In Eqs. • The stability criteria in Eqs.e. the method is unstable by any criterion. Eq. These criteria are both necessary and suﬃcient for methods that generate such matrices and depend solely upon the eigenvalues of the matrices. 7. Now we need to relate the stability deﬁnitions given in Eqs. but this distinction is not critical for our purposes here.8 and 7. Thus for general matrices. This is often used as a suﬃcient condition for stability. 7. Furthermore.12) where C represents any pnorm of C. From these relations we draw two important conclusions about the numerical stability of methods used to solve PDE’s. TIMESPACE STABILITY AND CONVERGENCE OF O∆E’S Since the initial data vector is bounded. 4 .
λh is real and negative. The fact that it lies on the unit circle means only that the amplitude of the representation is correct. STABILITY OF LINEAR SYSTEMS 7. 5 . the ﬁnal diﬀerence equations can be represented by un+1 = Cun − gn Furthermore if C has a complete5 eigensystem. Locus of the exact trace Figure 7. which it never leaves. the trace representing the dissipation model heads towards the origin as λh =→ −∞.41). The phase error relates to the position on the unit circle. respectively. As the magnitude of λh increases. for the biconvection model. For this reason we will proceed to trace the locus of the σroots as a function of the parameter λh for the equations modeling diﬀusion and periodic convection6 . This serves as a very convenient guide as to where we might expect the σroots to lie relative to the unit circle in the complex σplane. For the diﬀusion model. the solution to the homogeneous part can always be expressed as n n n un = c1 σ1 x1 + · · · + cm σm xm + · · · + cM σM xM where the σm are the eigenvalues of C. If the semidiscrete approach is used. it tells us nothing of the phase error (see Eq. the parameter h for ﬁxed values of λ equal to 1 and i for the diﬀusion and biconvection cases. 6 Or.128 CHAPTER 7. On the other hand. the trace representing the biconvection model travels around the circumference of the unit circle.1 Numerical Stability Concepts in the Complex σPlane σRoot Traces Relative to the Unit Circle Whether or not the semidiscrete approach was taken to ﬁnd the diﬀerencing approximation of a set of PDE’s. We must be careful in interpreting σ when it is representing eiωh . if you like. From now on we will omit further discussion of this special case.5. 6. In both cases the • represents the starting value where h = 0 and σ = 1. we can ﬁnd a relation between the σ and the λ eigenvalues. The subject of defective eigensystems has been addressed.1 shows the exact trace of the σroot if it is generated by eλh representing either diﬀusion or biconvection. λh = iωh is always imaginary. As the magnitude of ωh increases.5 7.
Table 7. Some λ − σ Relations a. and the method has no range of stability in this case. 1 2 3 4 5 6 7 8 9 10 11 12 13 σ − 1 − λh = 0 Explicit Euler σ 2 − 2λhσ − 1 = 0 Leapfrog 3 λh)σ + 1 λh = 0 2 σ − (1 + 2 AB2 2 5 σ 3 − (1 + 23 λh)σ2 + 16 λhσ − 12 λh = 0 AB3 12 12 ( σ 1 − λh) − 1 = 0 Implicit Euler 1 λh) − (1 + 1 λh) = 0 σ(1 − 2 Trapezoidal 2 1 σ 2 (1 − 2 λh) − 4 σ + 3 = 0 2nd O Backward 3 3 5 λh) − (1 + 8 λh)σ + 1 λh = 0 2 σ (1 − 12 AM3 12 12 1 σ 2 − (1 + 13 λh + 15 λ2 h2 )σ + 12 λh(1 + 5 λh) = 0 ABM3 12 24 2 3 σ 3 − (1 + 2λh)σ2 + 2 λhσ − 1 λh = 0 Gazdag 2 1 λ2 h2 = 0 σ − 1 − λh − 2 RK2 1 λ2 h2 − 1 λ3 h3 − 1 λ4 h4 = 0 RK4 σ − 1 − λh − 2 6 24 σ 2 (1 − 1 λh) − 4 λhσ − (1 + 1 λh) = 0 Milne 4th 3 3 3 Table 7.3 illustrate the results produced by various methods when they are applied to the model ODE’s for diﬀusion and periodicconvection. When used for dissipationdominated cases it is stable for the range 2≤ λh ≤0.1 shows the λσ relations for a variety of methods. Explicit Euler Method Figure 7.5.4 and 4. It is implied that the behavior shown is typical of what will happen if the methods are applied to diﬀusion.7.trace falls outside the unit circle for all ﬁnite h.1. .2 and 7. Eqs. 4. Figures 7. When used for biconvection the σ.2 shows results for the explicit Euler method. Most of the important possibilities are covered by the illustrations. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX σPLANE 129 Examples of some methods Now let us compare the exact σroot traces with some that are produced by actual timemarching methods. (Usually the magnitude of λ has to be estimated and often it is found by trial and error).5.(or dissipation) dominated or periodic convectiondominated problems as well as what does happen in the model cases.
λh a) Dissipation .0 since at that point the spurious root leaves the circle and σ2  becomes greater than one. More is said about this in Chapter 8. when λ is pure imaginary. The situation is quite diﬀerent when the . the method is not only stable. c. for biconvection cases. this does not mean. b. 7. Leapfrog Method This is a tworoot method. it says nothing about the error in phase. The ﬁgure shows that the range ends when λh < −1. the spurious root starts at the origin. unlike the leapfrog scheme. since there are two σ’s produced by every λ. STABILITY OF LINEAR SYSTEMS Ι(σ) Ι(σ) λ h= 0 ωh= 0 λ h= οο R (σ) R (σ) i ωh .oo σ=e oo b) Convection Figure 7. Therefore. 7. Although the ﬁgure shows that there is a range of ωh in which the leapfrog method produces no error in amplitude.2 that the principal root is stable for a range of λh. see Fig. there is a range of real negative λh for which the method will be stable.1: Exact traces of σroots for model equations. When applied to dissipation dominated problems we see from Fig. In fact. rather than on the unit circle. As was pointed out above. ωh σ = e λ h . the spurious root starts on the unit circle and falls outside of it for all (λh) < 0. SecondOrder AdamsBashforth Method This is also a tworoot method but. but the spurious root is not.2. that the method is without error. but it also produces a σ that falls precisely on the unit circle in the range 0 ≤ ωh ≤ 1.130 CHAPTER 7. However.
the principal root falls outside the unit circle for all ωh > 0. and for the biconvection model equation the method is unstable for all h. Its σroots fall on or inside the unit circle for both the dissipating and the periodic convecting case and. Just like the leapfrog method it has the capability of . it is stable for all values of λh for which λ itself is inherently stable. in fact. Trapezoidal Method The trapezoidal method is a very popular one for reasons that are partially illustrated in Fig. In that case as ωh increases away from zero the spurious root remains inside the circle and remains stable for a range of ωh.3. d. However.2: Traces of σroots for various methods. λroot is pure imaginary. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX σPLANE 131 σ1 σ1 λ h=2 a) Euler Explicit ωh = 1 σ2 σ1 σ2 σ1 b) Leapfrog σ2 λ h=1 σ1 σ2 σ1 c) AB2 Diffusion Convection Figure 7.7. 7.5.
Since its characteristic polynomial for σ is a cubic (Table 7. whereas RK4 remains inside the unit circle √ for 0 ≤ ωh ≤ 2 2. a spurious root limits the stability. One can show that the RK4 stability limit is about λh < 2. if the transient solution of interest can be resolved in a limited number of time steps that are small in the sense of the ﬁgure. Such cases are typiﬁed by the AB2 and RK2 methods when they are applied to a biconvection problem. the stability of a method that is required to resolve a transient solution over a relatively short time span may be a moot issue. but there is a major diﬀerence between the two since the trapezoidal method produces no amplitude error for any ωh — not just a limited range between 0 ≤ ωh ≤ 1.3.and fourthorder RungeKutta methods are shown in Fig. Second. 7. Note that both spurious roots are located at the origin when λ = 0. Figs. RK2 and RK4 Traces of the σroots for the second. These are shown in Fig. 10). On the other hand for biconvection RK2 is unstable for all ωh.5.132 CHAPTER 7. it must have two spurious roots in addition to the principal one. In both the dissipation and biconvection cases. Gazdag Method The Gazdag method was designed to produce low phase error. Therefore.g. when ωh > 2 . but that the range of stability for RK4 is greater. and for the biconvecting case. Situations for which this is not the case are considered in Chapter 8.2c and 7. going almost all the way to 2. whereas RK2 is limited to 2. no. e. 7. the error caused by this instability may be relatively unimportant. However. 3 f. The ﬁgures show that both methods are stable for a range of λh when λh is real and negative. h. on the basis of the principal root.3.1. STABILITY OF LINEAR SYSTEMS producing only phase error for the periodic convecting case. Mild instability All conventional timemarching methods produce a principal root that is very close to eλh for small values of λh.8. If the root had fallen inside the circle the .and FourthOrder RungeKutta Methods.2 Stability for Small ∆t It is interesting to pursue the question of stability when the time step size. 1 a spurious root leaves the unit circle when λh < −2 . 7. For the dissipating case. 7. is small so that accuracy of all the λroots is of importance.8 for all complex λh for which (λ) ≤ 0.3f show that for both methods the principal root falls outside the unit circle and is unstable for all ωh.
. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX σPLANE 133 σ1 σ1 λ h=.5.οο σ3 σ2 σ1 d) Trapezoidal σ2 σ1 ω h = 2/3 σ3 e) Gazdag σ1 σ1 λ h=2 f) RK2 σ1 σ1 g) RK4 λ h=2.3: Traces of σroots for various methods (cont’d).7.8 Diffusion ωh= 2 2 Convection Figure 7.
so from an accuracy point of view it is attractive. that this method is thirdorder accurate both in terms of erλ and erµ . and the special procedures chosen to start them initialize the coeﬃcients of the spurious roots.3. just in the opposite direction. Milne and Adams type methods If we inspect the σroot traces of the multiple root methods in Figs. twostep.13 it follows that for λh = 0.2 and 7. and if the method is stable. P (E) = E 2 + 4E − 5. In this case a spurious root falls on the unit circle when h → 0. 7. one can see disaster is imminent because (5)10 ≈ 107 . linear multistep method (see Table 7. using the methods given in Section 6.36. we ﬁnd them to be of two types. Factoring P (σ) = 0 we ﬁnd P (σ) = (σ − 1)(σ + 5) = 0. σ1 . However. This kind of instability is referred to as mild instability and is not a serious problem under the circumstances discussed. the cmk for k > 1 in Eq. equal to 5!! In order to evaluate the consequences of this result. one must understand how methods with spurious roots work in practice. From Eq. One type is typiﬁed by the leapfrog method. One of the best illustrations of this kind of problem stems from a critical study of the most accurate. 7. Such methods are called catastrophically unstable and are worthless for most. For this reason the AB2 and the RK2 methods have both been used in serious quantitative studies involving periodic convection.134 CHAPTER 7. computations. the principal one. Since at least one spurious root for a Milne method always starts on the unit circle. if the magnitude of one of the spurious σ is equal to 5. let us inspect its stability even for very small values of λh. In this case all spurious roots fall on the origin when h → 0. There are two σroots. This can easily be accomplished by studying its characteristic polynomial when λh → 0. However. equal to 1. The former type is referred to as a Milne Method. STABILITY OF LINEAR SYSTEMS method would have been declared stable but an error of the same magnitude would have been committed. a spurious one. 6. The other type is exempliﬁed by the 2ndorder AdamsBashforth and Gazdag methods. If the starting process is well designed these coeﬃcients are forced to be very small. if not all.6. Catastrophic instability There is a much more serious stability problem for small h that can be brought about by the existence of certain types of spurious roots. explicit. they get smaller with increasing n.13) One can show.1): un+1 = −4un + 5un−1 + 2h 2un + un−1 (7. and σ2 . Even a very small initial value of cmk is quickly overwhelmed. the method is likely to become . We know that they are not self starting.
we seek to make the size of this step as small as possible.7. and to minimize this we wish to make the step size as large as possible. they suﬀer. Since for these methods all spurious methods start at the origin for h = 0. The latter type is referred to as an Adams Method. On the basis of a Taylor series expansion.6. they have a guaranteed range of stability for small enough h. However. the order of accuracy of the two types is generally equivalent. the situation in which all of the transient terms are resolved constitutes a rather minor role in stability considerations. stability can play a part. however. from accuracy. This situation is . the cost of the computation generally depends on the number of steps taken to compute a solution. however. Presumably we are seeking to resolve some transient and. The reason to study stability for small values of h is fairly easy to comprehend.6. Aside from ruling out catastrophically unstable methods. or • We seek only to ﬁnd a steadystate solution using a path that includes the unwanted transient. In both of these cases there exist in the eigensystems relatively large values of λh associated with eigenvectors that we wish to drive through the solution process without any regard for their individual accuracy in eigenspace. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX λH PLANE135 unstable for some complex λ as h proceeds away from zero. By far the most important aspect of numerical stability occurs under conditions when: • One has inherently stable.6 7. In the compromise. and stability requirements in CFD applications generally override the (usually small) increase in accuracy provided by a coeﬃcient with lower magnitude. these methods are generally the most accurate insofar as they minimize the coeﬃcient in the leading term for ert . on the basis of the magnitude of the coeﬃcient in the leading Taylor series error term. For a given amount of computational work. On the other hand. coupled systems with λ–eigenvalues having widely separated magnitudes. relatively speaking. 7.1 Numerical Stability Concepts in the Complex λh Plane Stability for Large h. since the accuracy of the transient solution for all of our methods depends on the smallness of the timestep.
for coupled sets with a complete eigensystem.2 Unconditional Stability.15) (7.6.3. A method is Ao stable if the region of stability contains the negative real axis in the complex λh plane. By applying a fairly simple test for Astability in terms of positive real functions to the class of twostep LMM’s given in Section 6. furthermore . Some unconditionally stable (Astable) implicit methods. and Istable if it contains the entire imaginary axis. we start with the deﬁnition: A numerical method is unconditionally stable if it is stable for all ODE’s that are inherently stable. and. one ﬁnds these methods to be Astable if and only if 1 2 1 ξ≥− 2 1 ξ ≤θ+ϕ− 2 θ ≥ϕ+ A set of Astable implicit methods is shown in Table 7.2 and.7. This serves as an excellent reference frame to discuss and deﬁne the general stability features of timemarching methods.2.14) (7. STABILITY OF LINEAR SYSTEMS the major motivation for the study of numerical stability. or to the left of. It leads to the subject of stiﬀness discussed in the next chapter. For example. θ ξ ϕ Method Implicit Euler Trapezoidal 2nd O Backward Adams type Lees Twostep trapezoidal Acontractive Order 1 2 2 2 2 2 2 (7. Notice that none of these methods has an accuracy higher than secondorder. It can be proved that the order of an Astable LMM cannot exceed two. it amounted to the requirement that the real parts of all λ eigenvalues must lie on. AStable Methods Inherent stability of a set of ODE’s was deﬁned in Section 7. 7.136 CHAPTER 7.2. A method with this property is said to be Astable. the imaginary axis in the complex λ plane.16) 1 0 0 1/2 0 0 1 1/2 0 3/4 0 −1/4 1/3 −1/2 −1/3 1/2 −1/2 −1/2 5/8 −1/6 −2/9 Table 7.
This includes all explicit methods and predictorcorrector methods made up of explicit sequences. the parameters (θ.21) Hence. Therefore. ϕ) are related by the condition 1 ϕ=ξ−θ+ 2 and the two sets of inequalities reduce to the same set which is ξ ≤ 2θ − 1 1 ξ≥− 2 (7. The proof follows from the fact that the coeﬃcients of such a polynomial are sums of various combinations of products of all its roots.6. θ) parameter space. secondorder accurate LMM’s that are Astable and Ao stable share the same (ϕ. 7 .3 Stability Contours in the Complex λh Plane.17 to 7. the trapezoidal method has the smallest truncation error. however.20) (7. Although the order of accuracy of an Astable method cannot exceed two. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX λH PLANE137 that of all 2ndorder Astable methods. therefore. such that the resulting Or can be made equal to unity by a trivial normalization (division by a constant independent of λh). Returning to the stability test using positive real functions one can show that a twostep LMM is Ao stable if and only if θ ≥ϕ+ 1 2 1 ξ≥− 2 0≤θ−ϕ (7.16. It is not diﬃcult to prove that methods having a characteristic polynomial for which the coeﬃcient of the highest order term in E is unity7 can never be unconditionally stable.7. Such methods are referred to. ξ.19) For ﬁrstorder accuracy.14 to 7. ξ. no further discussion is necessary for the special case of Istability. 7. A very convenient way to present the stability properties of a timemarching method is to plot the locus of the complex λh for which σ = 1. For secondorder accuracy. It has been shown that for a method to be Istable it must also be Astable.6. the inequalities 7. Ao stable LMM methods exist which have an accuracy of arbitrarily high order.18) (7. as conditionally stable methods.17) (7. twostep.19 are less stringent than 7.
4a shows the stability contour for the explicit Euler method. this method includes no part of the imaginary axis (except for the origin) and so it is unstable for the model biconvection problem. 7. Notice in particular that the third.g.4c shows the stability contour for the implicit Euler method.5. leapfrog. It follows from Section 7.9i and ±2 2i.5 and 7. therefore. 2. Fig.4. it is unstable.4. 7. contour goes through the point λh = 0.g..138 I( λ h) CHAPTER 7. RK2). However. which is derived from the oneroot θmethod given in Section 6.6.6. several others do not (e. Contours for unconditionally stable implicit methods Fig. Typical stability contours for both explicit and implicit methods are illustrated in Fig. In the following two ways it is typical of all stability contours for explicit methods: 1. Notice that the . Contours for explicit methods Fig. STABILITY OF LINEAR SYSTEMS I λ h) ( I λ h) ( Stable Unstable 1 Stable Stable Unstable 1 R(λ h) R(λ h) Unstable R( λ h) θ=1 a) Euler Explicit θ=0 b) Trapezoid Implicit θ = 1/2 c) Euler Implicit Figure 7. that is a root to the characteristic polynomial for a given λh. and therefore. RK3. RK4). 7. see Figs.. The region of stability is inside the boundary. respectively.3 that on one side of this contour the numerical method is stable while on the other. It is typical of many stability contours for unconditionally stable implicit methods. Here σ refers to the maximum absolute value of any σ. We refer to it. principal or spurious. Although several explicit methods share this deﬁciency (e.and fourthorder RungeKutta methods. it is conditional. Gazdag. The contour encloses a ﬁnite portion of the lefthalf complex λhplane. 7.4: Stability contours for the θmethod. 7. as a stability contour. AB2. include a portion of the imaginary axis √ out to ±1.
6.0 2.0 RK2 1.0 1. I( λh) 3.5: Stability contours for some explicit methods.7. NUMERICAL STABILITY CONCEPTS IN THE COMPLEX λH PLANE139 Figure 7.0 RK1 R(λ h) 3.0 Figure 7.6: Stability contours for RungeKutta methods. .0 Stable Regions RK4 RK3 2.
13.5b). Contours for conditionally stable implicit methods Just because a method is implicit does not mean that it is unconditionally stable.7. In all of these cases the imaginary axis is part of the stable region. This means that the method is numerically stable even when the ODE’s that it is being used to integrate are inherently unstable. i. The classic example of a method that is stable only when the generating ODE’s are themselves inherently stable is the trapezoidal method. Two other methods that have this property are the twostep trapezoidal method un+1 = un−1 + h un+1 + un−1 and a method due to Lees 2 un+1 = un−1 + h un+1 + un + un−1 3 Notice that both of these methods are of the Milne type. One of these is the AdamsMoulton 3rdorder method (no.1 as no. Some other implicit unconditionally stable methods with the same property are shown in Fig.0 2. 7.0 3. 7.0 2. It is stable only for λ = ±iω when 0 ≤ ω ≤ 3. 7.0 R( λh) a) 2nd Order Backward Implicit b) Adams Type Figure 7. The boundary is the imaginary axis and the numerical method is stable for λh lying on or to the left of this axis.0 I(λh) Stable Outside 1. 8. Table 7.0 1. Another is the 4thorder Milne method given by the point operator 1 un+1 = un−1 + h un+1 + 4un + un−1 3 √ and shown in Table 7. method is stable for the entire range of complex λh that fall outside the boundary. Not all unconditionally stable methods are stable in some regions where the ODE’s they are integrating are inherently unstable.4b. the special case of the θmethod for which θ = 1 . 7. Two illustrations of this are shown in Fig.7: Stability contours for the 2 unconditionally stable implicit methods. Its stability boundary is very similar to that for the leapfrog method (see Fig.0 1..0 4. .0 2.1).140 I(λh) CHAPTER 7. The 2 stability boundary for this case is shown in Fig. STABILITY OF LINEAR SYSTEMS Stable Outside 2.0 Unstable R(λ h) Unstable 1.e.8.
7.7.7 Fourier Stability Analysis By far the most popular form of stability analysis for numerical schemes is the Fourier or von Neumann approach.8 In practical application it is often used as a guide for estimating the worthiness of a method for more general problems.0 141 I(λh) 3.0 1. Strictly speaking it applies only to diﬀerence approximations of PDE’s that produce O∆E’s which are linear.0 Stable 1. and have periodic boundary conditions. t) = eαt · eiκx (7. 7. FOURIER STABILITY ANALYSIS I(λh) 3.1 The Basic Procedure One takes data from a “typical” point in the ﬂow ﬁeld and uses this as constant throughout time and space according to the assumptions given above. have no space or time varying coeﬃcients. uj+m = eα(t+ ∆t) · eiκ(x+m∆x) = eα ∆t · eiκm∆x · uj 8 (n+ ) (n) Another way of viewing this is to consider it as an initial value problem on an inﬁnite space domain.22) is a solution to the diﬀerence equation. This analysis is usually carried out on point operators and it does not depend on an intermediate stage of ODE’s. . Then one imposes a spatial harmonic as an initial value on the mesh and asks the question: Will its amplitude grow or decay in time? The answer is determined by ﬁnding the conditions under which u(x.0 2.0 2. Since.0 R(λ h) R(λ h) b) Milne 4th Order a) AM3 Figure 7. but it is by no means a suﬃcient one. It serves as a fairly reliable necessary stability condition. 7.7.0 Stable only on Imag Axis 2. where κ is real and κ∆x lies in the range 0 ≤ κ∆x ≤ π.8: Stability contours for the 2 conditionally stable implicit methods.0 3. for the general term.0 1.
7.2 Some Examples The procedure can best be explained by examples. since eαt = eα∆t n = σ n . 9 . 7. in the worst possible combination of parameters.23) For numerical stability σ ≤ 1 and the problem is to solve for the σ’s produced by any given method and.24) Substitution of Eq.2 = −b ± b2 + 1 from which it is clear that one σ is always > 1. Richardson’s method of overlapping steps is unstable for all ν. Consider as a ﬁrst example the ﬁnite diﬀerence approximation to the model diﬀusion equation known as Richardson’s method of overlapping steps.25 are √ σ1. make sure that.142 (n) CHAPTER 7. 4. condition 7. 7. we ﬁnd the term eα∆t which we represent by σ.1: uj (n+1) = uj (n−1) +ν 2∆t (n) (n) (n) u − 2uj + uj−1 ∆x2 j+1 (7.23 is satisﬁed9 .22 into Eq.1 and given as Eq. 7. 7. the condition is often presented as σ ≤ 1 + O(∆t).22 is a solution of Eq.24 gives the relation σ = σ−1 + ν or σ2 + 4ν∆t (1 − cos κ∆x) σ − 1 = 0 ∆x2 2b (7. This was mentioned in Section 4.25) 2∆t iκ∆x − 2 + e−iκ∆x 2 e ∆x Thus Eq. If boundedness is required in a ﬁnite time domain. therefore. The two roots of 7. it is clear that (7. thus: σ ≡ eα∆t Then. as a necessary condition for stability. 7.25.24 if σ is a root of Eq. STABILITY OF LINEAR SYSTEMS the quantity uj is common to every term and can be factored out. In the remaining expressions. We ﬁnd. κ and ∆t.7. that by the Fourier stability test.
On the other hand.3 Relation to Circulant Matrices The underlying assumption in a Fourier stability analysis is that the C matrix. the results present rather obvious conclusions. is circulant. is not an accident. by the Fourier stability test. From Appendix B we ﬁnd that the eigenvalues of this matrix are real negative numbers. −2. 7. and according to Appendix B. 1) where s is a positive scalar coeﬃcient. CONSISTENCY 143 As another example consider the ﬁnitediﬀerence approximation for the model biconvection equation uj In this case (n+1) = uj − (n) a∆t (n) (n) u − uj−1 2∆x j+1 (7. and the two examples just presented outline a simple procedure for ﬁnding the eigenvalues of the circulant matrices formed by application of the two methods to the model problems.2. determined when the diﬀerencing scheme is put in the form of Eq. therefore.8. the space matrix in Eq. The space diﬀerencing in Richardson’s method produces the matrix s · Bp (1. from Fig. 7.8 Consistency ∂u ∂2u =ν 2 ∂t ∂x Consider the model equation for diﬀusion analysis (7.26 is Bp (−1. The choice of σ for the stability parameter in the Fourier analysis. 0. the eiκx in Eq. σ =1− 7. this method is unstable for all eigenvalues with negative real parts. the timemarching is being carried out by the leapfrog method and. this matrix has pure imaginary eigenvalues. It is exactly the same σ we have been using in all of our previous discussions. 7. If we examine the preceding examples from the viewpoint of circulant matrices and the semidiscrete approach.7. this method is always unstable for such conditions. Clearly. 1). but arrived at from a diﬀerent perspective.7.22 represents an eigenvector of the system.4. 7. 7. is unstable for any choice of the free parameters. in this case the explicit Euler method is being used for the timemarch and. However.5. 7.27) . Thus we have another ﬁnitediﬀerence approximation that. according to Fig.26) a∆t · i · sin κ∆x ∆x from which it is clear that σ > 1 for all nonzero a and κ. Such being the case.
we introduced the DuFortFrankel method in Chapter 4: (n+1) uj = (n−1) uj u 2ν∆t (n) j + 2 uj−1 − 2 ∆x (n+1) + uj 2 (n−1) + (n) uj+1 (7. 1)u + (bc) dt ∆x2 (7. Our analysis in this Chapter revealed that this is numerically unstable since the λroots of B(1. −2. The method was used by Richardson for weather prediction.28) with the leapfrog method used for the time march. 4. In Richardson’s time. and this fact can now be a source of some levity. One of them is to introduce mixed time and space diﬀerencing. We presented his method in Eq. . see Fig. Now let α≡ and rearrange terms (1 + α)uj (n+1) 2ν∆t ∆x2 = (1 − α)uj (n−1) + α uj−1 + uj+1 (n) (n) There is no obvious ODE between the basic PDE and this ﬁnal O∆E. in fact).2 and analyzed its stability by the Fourier method in Section 7. Richardson proposed a method for integrating equations of this type. Instead one proceeds immediately to the σroots. 7. Hence.2 has been replaced by its average value at two diﬀerent time levels. For example. However. As a semidiscrete method it can be expressed in matrix notation as the system of ODE’s: du ν = B(1.28 since it is stable for real negative λ’s. There are many ways to manipulate the numerical stability of algorithms.144 CHAPTER 7.1) are all real and negative and the spurious σroot in the leapfrog method is unstable for all such cases. the concept is quite clear today and we now know immediately that his approach would be unstable. however. conditionally stable and for this case we are rather severely limited in time step size by the requirement ∆t ≤ ∆x2 /(2ν). 7. however.29) in which the central term in the space derivative in Eq. It is.2.5b. the concept of numerical instability was not known. We could. 4. use the 2ndorder RungeKutta method to integrate Eq. of course. Lewis F. there is no intermediate λroot structure to inspect. the hand calculations (the only approach available at the time) were not carried far enough to exhibit strange phenomena.7. a possibility we have not yet considered. In all probability. STABILITY OF LINEAR SYSTEMS Many years before computers became available (1910.
This leads at once to (1 + α)σ = (1 − α)σ−1 + α eiκ∆x + e−iκ∆x or (1 + α)σ2 − 2ασ cos(k∆x) − (1 − α) = 0 The solution of the quadratic is σ= α cos κ∆x ± 1 − α2 sin2 κ∆x 1+α There are 2M σroots all of which are ≤ 1 for any real α in the range 0 ≤ α ≤ ∞. we expand the terms in Eq. We ﬁnd ∂u ∂2u ∂2 u = ν 2 − νr2 2 ∂t ∂x ∂t where r≡ ∆t ∆x (7. 7. since we have found an unconditionally stable method using an explicit combination of Lagrangian interpolation polynomials.8. CONSISTENCY 145 The simplest way to carry this out is by means of the Fourier stability analysis introduced in Section 7. ∆x → 0. For the time derivative we have 1 1 (n+1) (n−1) uj − uj = (∂t u)(n) + ∆t2 (∂ttt u)(n) + · · · j j 2∆t 6 and for the mixed time and space diﬀerences uj−1 − uj (n) (n+1) − uj ∆x2 (n−1) + uj+1 (n) = (∂xx u)(n) j ∆t − ∆x 2 (∂tt u)(n) + j 1 ∆x2 (∂xxxx u)(n) − j 12 2 1 ∆t ∆t2 (∂tttt u)(n) + · · · j 12 ∆x (7. This means that the method is unconditionally stable! The above result seems too good to be true. To prove this.7.31) .7. The price we have paid for this is the loss of consistency with the original PDE.30 in a Taylor series and reconstruct the partial diﬀerential equation we are actually solving as the mesh size becomes very small.30) Replace the terms in Eq.30 with the above expansions and take the limit as ∆t . 7.
1 −0. 7. not a diﬀusion equation. −1 f = 0 0 .30 is not uniformly consistent with the equation we set out to solve even for vanishingly small step sizes. and the method is consistent.27 is parabolic.3: Summary of accuracy and consistency conditions for RK2 and Du FortFrankel methods. If ∆t → 0 and ∆x → 0 in such a way that ∆t2 remains constant. then r2 is O(∆t). The situation is summarized in Table 7.9 Problems u = with 1.1 1 −1 1 A= 10 1 −1 . 7.3. 7. 2ndorder RungeKutta For Stability 2 ∆t ≤ ∆x 2ν DuFort–Frankel ∆t ≤ ∞ Unconditionally Stable For Consistency Conditionally Consistent.146 CHAPTER 7. In such a case Eq. this leads to a constraint on the time step which is just as severe as that imposed by the stability condition of the secondorder RungeKutta method shown in the table. ∆x However. 7. = an arbitrary error bound. 7. Consider the ODE du = Au + f dt −10 −0.30 is a wave equation. Eq. Thus if ∆t → 0 and ∆x → 0 in such ∆t a way that ∆x remains constant. STABILITY OF LINEAR SYSTEMS Eq. Approximates ∂ u = ν ∂ u ∂t ∂x2 only if ∆t 2 ν ∆x < Therefore ∆t < ∆x ν 2 Conditionally Stable Uniformly Consistent with 2 ∂u = ν∂ u ∂t ∂x2 Table 7. the equation we actually solve by the method in Eq.31 is hyperbolic.
(b) Plot the trace of the roots in the complex σplane and draw the unit circle on the same plot. Compare with the RungeKutta methods of order 1 through 4.7. 6.80 and compute the absolute values of the roots to at least 6 places. Recall the compact centered 3point approximation for a ﬁrst derivative: (δx u)j−1 + 4(δx u)j + (δx u)j+1 = 3 (uj+1 − uj−1) ∆x By replacing the spatial index j by the temporal index n.0 for 100 time steps. implicit Euler. h = 0. what are the expected bounds on h for stability? Are your results consistent with these bounds? 2. What is the steadystate solution? How does the ODE solution behave in time? (b) Write a code to integrate from the initial condition u(0) = [1. (c) Using the λσ relations for these three methods. ﬁnd the range of Cn for which the method is stable. θ. 5. is ah/∆x. PROBLEMS 147 (a) Find the eigenvalues of A using a numerical package. Is it Astable. and φ (in Eq. known as the Courant (or CFL) number.2 for 500 time steps. Compare the computed solution with the exact steady solution. Ao stable. When applied to the linear convection equation.05 from 0 to 0. obtain a timemarching method using this formula. Using Fourier stability analysis. 3.59) does it correspond? Derive the λσ relation for the method. 1.4 for 250 time steps and h = 1.1 for 1000 time steps.9. 4. use h = 0. In all three cases. Take h in intervals of 0. Determine and plot the stability contours for the AdamsBashforth methods of order 1 through 4. h = 0. (a) Compute a table of the numerical values of the σroots of the 2ndorder AdamsBashforth method when λ = i. or Istable? . to what values of ξ. and MacCormack methods. (c) Repeat the above for the RK2 method. What order is the method? Is it explicit or implicit? Is it a twostep LMM? If so. 1]T from time t = 0 using the explicit Euler. the widely known Lax–Wendroﬀ method gives: 1 1 2 un+1 = un − Cn (un − un ) + Cn (un − 2un + un ) j j j+1 j−1 j+1 j j−1 2 2 where Cn .
4. Consider the following PDE: ∂u ∂2 u =i 2 ∂t ∂x √ where i = −1. ﬁnd the λeigenvalues. the point where the stability contour intersects the negative real axis. Repeat using Fourier analysis.e. is obtained with λh = −1. ﬁnd the maximum stable time step for the combination of 2ndorder centered diﬀerences and the 2ndorder AdamsBashforth method applied to the model diﬀusion equation. Apply a Dirichlet boundary condition at the left boundary. Consider the linear convection with a positive wave speed as in problem 8. STABILITY OF LINEAR SYSTEMS 6. 9. Hint: is C normal? . Given that the λσ relation for the 2ndorder AdamsBashforth method is σ 2 − (1 + 3λh/2)σ + λh/2 = 0 show that the maximum stable value of λh for real negative λ. analyze the stability of ﬁrstorder backward diﬀerencing coupled with explicit Euler time marching applied to the linear convection equation with positive a.1. Write the ODE system obtained by applying the 2ndorder centered diﬀerence approximation to the spatial derivative in the model diﬀusion equation with periodic boundary conditions. Based on these. if 2ndorder centered diﬀerencing is used for the spatial diﬀerences and the boundary conditions are periodic? Find the stability condition for one of these methods. 8. 7. what is the maximum Courant number allowed for asymptotic stability? Explain why this diﬀers from the answer to problem 8. ﬁnd the eigenvalues of the spatial operator matrix.e.148 CHAPTER 7. Using Appendix B. Using the λσ relation for the explicit Euler method. i. Find the maximum Courant number for stability. un+1 = Cun − gn Write C in banded matrix notation and give the entries of g. Write the system of ODE’s which results from ﬁrstorder backward spatial diﬀerencing in matrixvector form. Using the eigenvalues of the spatial operator matrix. Using Fourier analysis. Which explicit timemarching methods would be suitable for integrating this PDE. ﬁnd the σeigenvalues.. Using Appendix B. No boundary condition is permitted at the right boundary.. i. Write the O∆E which results from the application of explicit Euler time marching in matrixvector form.
An important concept underlying much of this discussion is stiﬀness. 7. However.1 Stiﬀness Deﬁnition for ODE’s Relation to λEigenvalues The introduction of the concept referred to as “stiﬀness” comes about from the numerical analysis of mathematical models constructed to simulate dynamic phenomena containing widely diﬀerent time scales. since this part of the ODE has no eﬀect on the numerical stability of 149 .1. 8. In the following discussion we concentrate on the transient part of the solution. and. Examples are given showing how timemarching methods can be compared in a given context.Chapter 8 CHOICE OF TIMEMARCHING METHODS In this chapter we discuss considerations involved in selecting a timemarching method for a speciﬁc application. The forcing function may also be time varying in which case it would also have a time scale. but fortunately we now have the background material to construct a deﬁnition which is entirely suﬃcient for our purposes.1 8. We start with the assumption that our CFD problem is modeled with suﬃcient accuracy by a coupled set of ODE’s producing an A matrix typiﬁed by Eq. which is deﬁned in the next section. and the decision to use some numerical timemarching or iterative method to solve it. we assume that this scale would be adequately resolved by the chosen timemarching method. Any deﬁnition of stiﬀness requires a coupled system with at least two eigenvalues. The diﬀerence between the dynamic scales in physical space is represented by the diﬀerence in the magnitude of the eigenvalues in eigenspace. Deﬁnitions given in the literature are not unique.1.
In many numerical applications the eigenvectors associated with the small λm  are well resolved and those associated with the large λm  are resolved much less accurately.28. This is given by Eq. 6. 6. The whole concept of stiﬀness in CFD arises from the fact that we often do not need the time resolution of eigenvectors associated with the large λm  in the transient solution. In this ﬁgure the time step has been chosen so that time accuracy is given to the eigenvectors associated with the eigenvalues lying in the small circle and stability without time accuracy is given to those associated with the eigenvalues lying outside of the small circle but still inside the large circle.1.27 and its solution using a oneroot. although these eigenvectors must remain coupled into the system to maintain a high accuracy of the spatial resolution.150 CHAPTER 8. the homogeneous part. timemarching method is represented by Eq. . The situation is represented in the complex λh plane in Fig. 8. we exclude the forcing function from further discussion in this section. Consider now the form of the exact solution of a system of ODE’s with a complete eigensystem. CHOICE OF TIMEMARCHING METHODS Stable Region λh I(λ h) 1 R(λ h) Accurate Region Figure 8. if at all. For a given time step. the time integration is an approximation in eigenspace that is diﬀerent for every eigenvector xm .1: Stable and accurate regions for the explicit Euler method.
one [λ1 → λp ] called the driving eigenvalues (our choice of a timestep and marching method must accurately approximate the time variation of the eigenvectors associated with these). called the parasitic eigenvalues (no time accuracy whatsoever is required for the eigenvectors associated with these. complex. STIFFNESS DEFINITION FOR ODE’S 151 8.3 Stiﬀness Classiﬁcations The following deﬁnitions are somewhat useful. we ﬁnd that. First we order the eigenvalues by their magnitudes. It is important to notice that these deﬁnitions make no distinction between real. . An inherently stable set of ODE’s is stiﬀ if λM  λp  In particular we deﬁne the ratio Cr = λM  / λp  and form the categories Mildlystiﬀ Stronglystiﬀ Extremelystiﬀ Cr 103 < Cr 106 < Cr < 102 < 105 < 108 Pathologicallystiﬀ 109 < Cr It should be mentioned that the gaps in the stiﬀ category deﬁnitions are intentional because the bounds are arbitrary. 8. it states that we can separate our eigenvalue spectrum into two groups. Unfortunately.2) Driving Parasitic This concept is crucial to our discussion. and imaginary eigenvalues. but their presence must not contaminate the accuracy of the complete solution). although time accuracy requirements are dictated by the driving eigenvalues. numerical stability requirements are dictated by the parasitic ones.1.1. [λp+1 → λM ].8. thus λ1  ≤ λ2  ≤ · · · ≤ λM  Then we write M Transient cm eλm t xm + cm eλm t xm = Solution m=1 m=p+1 p (8.2 Driving and Parasitic Eigenvalues For the above reason it is convenient to subdivide the transient solution into two parts. and the other. Rephrased.1) (8.1.
3. CHOICE OF TIMEMARCHING METHODS 8. In order to demonstrate this. We see that in both cases we are faced with the rather annoying fact that the more we try to increase the resolution of our spatial gradients. this ratio is about 680. negative numbers that automatically obey the ordering given in Eq. The simplest example to discuss relates to the model diﬀusion equation. Typical CFD problems without chemistry vary between the mildly and strongly stiﬀ categories. Our brief analysis has been limited to equispaced . One quickly ﬁnds that this grid clustering can strongly aﬀect the eigensystem of the resulting A matrix. near an airfoil leading or trailing edge. For the biconvection model a similar analysis shows that λM /λ1 ≈ λM  / λ1  ≈ 1 ∆x Here the stiﬀness parameter is still spacemesh dependent.1.152 CHAPTER 8. the stiﬀness is determined by the ratio λM /λ1 .e. Examples of where this clustering might occur are at a shock wave. in diﬀusion problems this stiﬀness is proportional to the reciprocal of the space mesh size squared. For a mesh size M = 40. In this case the eigenvalues are all real.2. i. and are greatly aﬀected by the resolution of a boundary layer since it is a diﬀusion process.2 Relation of Stiﬀness to Space Mesh Size Many ﬂow ﬁelds are characterized by a few regions having high spatial gradients of the dependent variables and other domains having relatively low gradient phenomena. Consider the case when all of the eigenvalues are parasitic.. As a result it is quite common to cluster mesh points in certain regions of space and spread them out otherwise. the stiﬀer our equations tend to become. and in a boundary layer. A simple calculation shows that λ1 = − 4ν π 2 2 sin 2(M + 1) ∆x λM ≈ − and the ratio is ≈− 4ν ∆x2 =− 4ν ∆x2 ∆x 2 2 = −ν 4ν 2 π 2 sin 2 ∆x 4 M +1 2 =4 π ∆x2 The most important information found from this example is the fact that the stiﬀness of the transient solution is directly related to the grid spacing. we are interested only in the converged steadystate solution. 8. but much less so than for diﬀusiondominated problems. Furthermore. Under these conditions. let us examine the eigensystems of the model problems given in Section 4. Even for a mesh of this moderate size the problem is already approaching the category of strongly stiﬀ.
t) as a function evaluation and denote the total number of such evaluations by Fev . and the accuracy would be related to how much the energy of certain harmonics would be permitted to distort from a given level. The length of the time interval. Nevertheless. Let us now examine some ground rules that might be appropriate. . Since there are an endless number of these methods to choose from. the time interval would be that required to extract a reliable statistical sample. and technology inﬂuencing this is undergoing rapid and dramatic changes as this is being written. T . relevant conclusions can be reached. The appropriate error measures to be used in comparing methods for calculating an event are the global ones. one can wonder how this information is to be used to pick a “best” choice for a particular problem. Erλ and Erω . rather than the local ones erλ . highly dependent upon the speed. It should then be clear how the analysis can be extended to other cases. if certain ground rules are agreed upon. and architecture of the available computer.3 Practical Considerations for Comparing Methods We have presented relatively simple and reliable measures of stability and both the local and global accuracy of timemarching methods. PRACTICAL CONSIDERATIONS FOR COMPARING METHODS 153 problems.5. t) at least once. but in general the stiﬀness of CFD problems is proportional to the mesh intervals in the manner shown above where the critical interval is the smallest one in the physical domain. For example. This function is usually nonlinear. and erp discussed earlier. There is no unique answer to such a question. We refer to a single calculation of the vector F (u. The actual form of the coupled ODE’s that are produced by the semidiscrete approach is du = F (u.3. in calculating the amount of turbulence in a homogeneous ﬂow. Let us consider the problem of measuring the eﬃciency of a time–marching method for computing. For example. it is. capacity. era . 8. among other things. an accurate transient solution of a coupled set of ODE’s. discussed in Section 6. over a ﬁxed interval of time. and its computation usually consumes the major portion of the computer time required to make the simulation. t) dt At every time step we must evaluate the function F (u. and the accuracy required of the solution are dictated by the physics of the particular problem involved. Such a computation we refer to as an event.6. Era .8.
CHOICE OF TIMEMARCHING METHODS 8.154 CHAPTER 8. Implications of computer storage capacity and access time are ignored. In some contexts. must have the property: Erλ < 0.1 Comparing the Eﬃciency of Explicit Methods Imposed Constraints As mentioned above. the allowable error constraint means that the global error in the amplitude. let us set the additional requirement • The error in u at the end of the event. 8. To the constraints imposed above. u(T ) = 0. see Eq. The follow assumptions bound the considerations made in this Section: 1. and RK4. so that T = Nh where N is the total number of time steps. Fev . the global error.005 eλT .25.5%. the eﬃciency of methods can be compared only if one accepts a set of limiting constraints within which the comparisons are carried out.4. Eq. Since the exact solution is u(t) = u(0)e−t .. i.48. First of all.e.3 is obtained from our representative ODE with λ = −1. h. must be < 0.25) with u(0) = 1. 2. this makes the exact value of u at the end of the event equal to 0. and must use a constant time step size. must simulate an entire event which takes a total time T . We judge the most eﬃcient method as the one that satisﬁes these conditions and has the fewest number of evaluations. 3.4.e.2 An Example Involving Diﬀusion du = −u dt Let the event be the numerical solution of (8.4 8. 6. i.. 8. AB2. Three methods are compared — explicit Euler.25. The timemarch method is explicit.3) from t = 0 to T = − ln(0. a = 0. The calculation is to be timeaccurate. this can be an important consideration.
8. and the number of these evaluations must be kept to an absolute minimum because of the magnitude of the problem.4.5012 8 . for a given global accuracy. a complete event must be established in order to obtain meaningful statistical samples which are the essence of the solution. it follows that 1 − (σ1 (ln(. Thus the secondorder AdamsBashforth method is much better than the ﬁrstorder Euler method. and the fourthorder RungeKutta method is the best of all.25 < 0. COMPARING THE EFFICIENCY OF EXPLICIT METHODS Then.4.1. leading to the socalled incomplete .25)/N))N /. In this numerical experiment the function evaluations contribute overwhelmingly to the CPU time.4.001195 best Table 8.0866 .3 An Example Involving Periodic Convection Let us use as a basis for this example the study of homogeneous turbulence simulated by the numerical solution of the incompressible NavierStokes equations inside a cube with periodic boundary conditions on all sides.1: Comparison of timemarching methods for a simple dissipation problem.001248 worst AB2 16 . On the other hand. In this case.001137 RK4 2 .00718 .1. Under these conditions a method is judged as best when it has the highest global accuracy for resolving eigenvectors with imaginary eigenvalues. The main purpose of this exercise is to show the (usually) great superiority of secondorder over ﬁrstorder timemarching methods. In this example we see that.8.6931 . in addition to the constraints given in Section 8. the method with the highest local accuracy is the most eﬃcient on the basis of the expense in evaluating Fev . we add the following: • The number of evaluations of F (u.005 155 where σ1 is found from the characteristic polynomials given in Table 7.9172 16 .1 were computed using a simple iterative procedure. The results shown in Table 8. t) is ﬁxed. since h = T /N = − ln(0. Method N h σ1 Fev Erλ Euler 193 .25)/N.99282 193 . The above constraint has led to the invention of schemes that omit the function evaluation in the corrector step of a predictorcorrector combination.
h. In order to discuss our comparisions we introduce the following deﬁnitions: • Let a kevaluation method be deﬁned as one that requires k evaluations of F (u. The presumption is. the time span required for one application of the method increases. given in Section 6. of course. etc. and the number of evaluations. For a 1evaluation method the total number of time steps. notice also that as k increases. 0 k=1  k=2  k=4  T    uN [σ(λh1 )]8 [σ(2λh1 )]4 [σ(4λh1 )]2 • 2h1 • • 4h1 • • • • • • • • Step sizes and powers of σ for kevaluation methods used to get to the same value of T if 8 evaluations are allowed. the derivative of the fundamental family is never found so there is only one evaluation required to complete each cycle. In general.8.156 CHAPTER 8. However. For a 4evaluation method the time interval must be h = 4h1 . in order to arrive at the same time T after K evaluations. the power to which σ1 is raised to arrive at the ﬁnal destination decreases. t) to advance one step using that method’s time interval. see the Figure below. • Let K represent the total number of allowable Fev . The Gazdag. For a 2evaluation method N = K/2 since two evaluations are used for each step. K. respectively. An example is the method of Gazdag. Notice that as k increases. that more eﬃcient methods will result from the omission of the second function evaluation. The λσ relation for the method is shown as entry 10 in Table 7.38 and 6. in this case. The second and fourth order RK methods are 2. so that for these methods h = h1 . one evaluation being used for each step. • Let h1 be the time interval advanced in one step of a oneevaluation method. the global amplitude and phase error for kevaluation methods applied to systems with pure imaginary λroots can be written1 Era = 1 − σ1 (ikωh1 )K/k 1 (8.1. 6. N . and AB2 schemes are all 1evaluation methods. This is the key to the true comparison of timemarch methods for this type of problem.4) See Eqs. . CHOICE OF TIMEMARCHING METHODS predictorcorrector methods. Basically this is composed of an AB2 predictor and a trapezoidal corrector. However.39. However. after K evaluations. are the same. leapfrog. the time step must be twice that of a one–evaluation method so h = 2h1 .and 4evaluation methods.
7.8 1. the results shown in Table 8.2 the Gazdag method produces a σ1  = 0. Amplitude.1 100 −. see Fig. It is not diﬃcult to show that on the basis of local error (made in a single step) the Gazdag method is superior to the RK4 method in both amplitude and phase.8 (which must be used to keep the number of evaluations the same) the RK4 method produces a σ1  = 0. leapfrog. .995 .003 . Table 8.2: Comparison of global amplitude and phase errors for four methods. Phase error in degrees. in terms of phase error (for which it was designed) it is superior to the other two methods shown.0.8. For example.5 b. On the basis of global error the Gazdag method is not superior to RK4 in either amplitude or phase.9992276 whereas for ωh = 0. We consider two maximum evaluation limits K = 50 and K = 100 and choose from four possible methods.998324.45 . 8. The ﬁrst three of these are oneevaluation methods and the last one is a fourevaluation method. by some rather simple calculations2 made using Eqs. The event must proceed to the time t = T = 10. Notice that to ﬁnd global error the Gazdag root must be raised to the power of 50 while the RK4 root is raised only to the power of 50/4. exact = 1.4.5.962 . COMPARING THE EFFICIENCY OF EXPLICIT METHODS Erω = ωT − K [σ1 (ikωh1 )]imaginary tan−1 k [σ1 (ikωh1 )]real 157 (8. AB2. Gazdag. 2 The σ1 root for the Gazdag method can be found using a numerical root ﬁnding routine to trace the three roots in the σplane. and the fact that ω = 1. although.1 100 1. First of all we see that for a oneevaluation method h1 = T /K.2 50 1. we are making our comparisons on the basis of global error for a ﬁxed number of evaluations.2.96 −2.979 a.999 ωh1 = .4 and 8. We idealize to the case where λ = iω and set ω equal to one.5) Consider a convectiondominated event for which the function evaluation is very time consuming.0 1. However.0 1. Using this. K leapfrog AB2 Gazdag RK4 ωh1 = . and RK4.5 1.2 50 −3.3e.8 −9.022 .12 ωh1 = .4 . for ωh = 0. K leapfrog AB2 Gazdag RK4 ωh1 = . we ﬁnd.
which is determined from λ1 .5 and 7. 0) as shown in Fig.1. The coupled equations in real space are represented by u1 (n) = c1 (1 − 100h)n x11 + c2 (1 − h)n x12 + (P S)1 u2 (n) = c1 (1 − 100h)n x21 + c2 (1 − h)n x22 + (P S)2 (8. Numerical solutions. 7. If uncoupled and evaluated in wave space. 8. Also shown by the small circle centered at the origin is the region of Taylor series accuracy. The transient solution is un = (1 + λh)n and the trace of the complex value of λh which makes 1 + λh = 1 gives the whole story. A good example of the concept is produced by studying the Euler method applied to the representative equation. consider the mildly stiﬀ system composed of a coupled twoequation set having the two eigenvalues λ1 = −100 and λ2 = −1. CHOICE OF TIMEMARCHING METHODS Using analysis such as this (and also considering the stability boundaries) the RK4 method is recommended as a basic ﬁrst choice for any explicit timeaccurate calculation of a convectiondominated problem. Analytical evaluation of the time histories poses no problem since e−100t quickly becomes very small and can be neglected in the expressions when time becomes large.001 for λ1 and h = 0. these λh are stiﬀ. We ﬁrst run 66 time steps with h = 0. Stability boundaries for some explicit methods are shown in Figs. In this case the trace forms a circle of unit radius centered at (−1. is h = 0. 8.6. If some λh fall outside the small circle but stay within the stable region. Numerical evaluation is altogether diﬀerent.5.001 is negligible. We have deﬁned these λh as parasitic eigenvalues. This deﬁnes a time step limit based on accuracy considerations of h = 0. but stable.1 Coping With Stiﬀness Explicit Methods The ability of a numerical method to cope with stiﬀness can be illustrated quite nicely in the complex λh plane. and a relatively slowly decaying function in the other. of course. The time step limit based on stability. Let us choose the simple explicit Euler method for the time march. We will also assume that c1 = c2 = 1 and that an amplitude less than 0. depend upon [σ(λm h)]n and no σm  can exceed one for any λm in the coupled system or else the process is numerically unstable. the time histories of the two solutions would appear as a rapidly decaying function in one case.02.5 8.158 CHAPTER 8. For a speciﬁc example. With this time step the .1.001 in order to resolve the λ1 term. If h is chosen so that all λh in the ODE eigensystem fall inside this circle the integration will be numerically stable.1 for λ2 .6) We will assume that our accuracy requirements are such that suﬃcient accuracy is obtained as long as λh ≤ 0.
8.5. COPING WITH STIFFNESS
159
λ2 term is resolved exceedingly well. After 66 steps, the amplitude of the λ1 term (i.e., (1 − 100h)n ) is less than 0.001 and that of the λ2 term (i.e., (1 − h)n ) is 0.9361. Hence the λ1 term can now be considered negligible. To drive the (1 − h)n term to zero (i.e., below 0.001), we would like to change the step size to h = 0.1 and continue. We would then have a well resolved answer to the problem throughout the entire relevant time interval. However, this is not possible because of the coupled presence of (1 − 100h)n , which in just 10 steps at h = 0.1 ampliﬁes those terms by ≈ 109 , far outweighing the initial decrease obtained with the smaller time step. In fact, with h = 0.02, the maximum step size that can be taken in order to maintain stability, about 339 time steps have to be computed in order to drive e−t to below 0.001. Thus the total simulation requires 405 time steps.
8.5.2
Implicit Methods
Now let us reexamine the problem that produced Eq. 8.6 but this time using an unconditionally stable implicit method for the time march. We choose the trapezoidal method. Its behavior in the λh plane is shown in Fig. 7.4b. Since this is also a oneroot method, we simply replace the Euler σ with the trapezoidal one and analyze the result. It follows that the ﬁnal numerical solution to the ODE is now represented in real space by u1 (n) = c1 u2 (n) = c1 1 − 50h 1 + 50h 1 − 50h 1 + 50h
n
x11 + c2
n
x21 + c2
1 − 0.5h 1 + 0.5h 1 − 0.5h 1 + 0.5h
n
x12 + (P S)1
n
x22 + (P S)2
(8.7)
In order to resolve the initial transient of the term e−100t , we need to use a step size of about h = 0.001. This is the same step size used in applying the explicit Euler method because here accuracy is the only consideration and a very small step size must be chosen to get the desired resolution. (It is true that for the same accuracy we could in this case use a larger step size because this is a secondorder method, but that is not the point of this exercise). After 70 time steps the λ1 term has amplitude less than 0.001 and can be neglected. Now with the implicit method we can proceed to calculate the remaining part of the event using our desired step size h = 0.1 without any problem of instability, with 69 steps required to reduce the amplitude of the second term to below 0.001. In both intervals the desired solution is secondorder accurate and well resolved. It is true that in the ﬁnal 69 steps one σroot is [1 − 50(0.1)]/[1 + 50(0.1)] = 0.666 · · ·, and this has no physical meaning whatsoever. However, its inﬂuence on the coupled solution is negligible at the end of the ﬁrst 70 steps, and, since (0.666 · · ·)n < 1, its inﬂuence in the remaining 70
160
CHAPTER 8. CHOICE OF TIMEMARCHING METHODS
steps is even less. Actually, although this root is one of the principal roots in the system, its behavior for t > 0.07 is identical to that of a stable spurious root. The total simulation requires 139 time steps.
8.5.3
A Perspective
It is important to retain a proper perspective on a problem represented by the above example. It is clear that an unconditionally stable method can always be called upon to solve stiﬀ problems with a minimum number of time steps. In the example, the conditionally stable Euler method required 405 time steps, as compared to about 139 for the trapezoidal method, about three times as many. However, the Euler method is extremely easy to program and requires very little arithmetic per step. For preliminary investigations it is often the best method to use for mildlystiﬀ diﬀusion dominated problems. For reﬁned investigations of such problems an explicit method of second order or higher, such as AdamsBashforth or RungeKutta methods, is recommended. These explicit methods can be considered as eﬀective mildly stiﬀstable methods. However, it should be clear that as the degree of stiﬀness of the problem increases, the advantage begins to tilt towards implicit methods, as the reduced number of time steps begins to outweigh the increased cost per time step. The reader can repeat the above example with λ1 = −10, 000, λ2 = −1, which is in the stronglystiﬀ category. There is yet another technique for coping with certain stiﬀ systems in ﬂuid dynamic applications. This is known as the multigrid method. It has enjoyed remarkable success in many practical problems; however, we need an introduction to the theory of relaxation before it can be presented.
8.6
Steady Problems
In Chapter 6 we wrote the O∆E solution in terms of the principal and spurious roots as follows: un = c11 (σ1 )n x1 + · · · + cm1(σm )n xm + · · · + cM 1 (σM )n xM + P.S. 1 1 1 +c12 (σ1 )n x1 + · · · + cm2 (σm )n xm + · · · + cM 2 (σM )n xM 2 2 2 +c13 (σ1 )n x1 + · · · + cm3 (σm )n xm + · · · + cM 3 (σM )n xM 3 3 3 +etc., if there are more spurious roots (8.8)
When solving a steady problem, we have no interest whatsoever in the transient portion of the solution. Our sole goal is to eliminate it as quickly as possible. Therefore,
8.7. PROBLEMS
161
the choice of a timemarching method for a steady problem is similar to that for a stiﬀ problem, the diﬀerence being that the order of accuracy is irrelevant. Hence the explicit Euler method is a candidate for steady diﬀusion dominated problems, and the fourthorder RungeKutta method is a candidate for steady convection dominated problems, because of their stability properties. Among implicit methods, the implicit Euler method is the obvious choice for steady problems. When we seek only the steady solution, all of the eigenvalues can be considered to be parasitic. Referring to Fig. 8.1, none of the eigenvalues are required to fall in the accurate region of the timemarching method. Therefore the time step can be chosen to eliminate the transient as quickly as possible with no regard for time accuracy. For example, when using the implicit Euler method with local time linearization, Eq. 6.96, one would like to take the limit h → ∞, which leads to Newton’s method, Eq. 6.98. However, a ﬁnite time step may be required until the solution is somewhat close to the steady solution.
8.7
Problems
1. Repeat the timemarch comparisons for diﬀusion (Section 8.4.2) and periodic convection (Section 8.4.3) using 2nd and 3rdorder RungeKutta methods. 2. Repeat the timemarch comparisons for diﬀusion (Section 8.4.2) and periodic convection (Section 8.4.3) using the 3rd and 4thorder AdamsBashforth methods. Considering the stability bounds for these methods (see problem 4 in Chapter 7) as well as their memory requirements, compare and contrast them with the 3rd and 4thorder RungeKutta methods. 3. Consider the diﬀusion equation (with ν = 1) discretized using 2ndorder central diﬀerences on a grid with 10 (interior) points. Find and plot the eigenvalues and the corresponding modiﬁed wavenumbers. If we use the explicit Euler timemarching method what is the maximum allowable time step if all but the ﬁrst two eigenvectors are considered parasitic? Assume that suﬃcient accuracy is obtained as long as λh ≤ 0.1. What is the maximum allowable time step if all but the ﬁrst eigenvector are considered parasitic?
162
CHAPTER 8. CHOICE OF TIMEMARCHING METHODS
Chapter 9 RELAXATION METHODS
In the past three chapters, we developed a methodology for designing, analyzing, and choosing timemarching methods. These methods can be used to compute the timeaccurate solution to linear and nonlinear systems of ODE’s in the general form du = F (u, t) dt (9.1)
which arise after spatial discretization of a PDE. Alternatively, they can be used to solve for the steady solution of Eq. 9.1, which satisﬁes the following coupled system of nonlinear algebraic equations: F (u) = 0 (9.2)
In the latter case, the unsteady equations are integrated until the solution converges to a steady solution. The same approach permits a timemarching method to be used to solve a linear system of algebraic equations in the form Ax = b (9.3)
To solve this system using a timemarching method, a time derivative is introduced as follows dx = Ax − b dt (9.4)
and the system is integrated in time until the transient has decayed to a suﬃciently low level. Following a timedependent path to steady state is possible only if all of the eigenvalues of the matrix A (or −A) have real parts lying in the left halfplane. Although the solution x = A−1 b exists as long as A is nonsingular, the ODE given by Eq. 9.4 has a stable steady solution only if A meets the above condition. 163
9.7) which is identical to the solution of Eq. In the simplest possible case the conditioning matrix is a diagonal matrix composed of a constant D(b). RELAXATION METHODS The common feature of all timemarching methods is that they are at least ﬁrstorder accurate.1. To use these schemes. there are wellknown techniques for accelerating relaxation schemes if the eigenvalues of CAb are all real and of the same sign. 9. Such methods are known as relaxation methods.8) (9. In this chapter.1 9. provided C−1 exists.7 is u = [CAb ]−1 C f b = A−1 C −1 C f b = A−1 f b b b (9. at each time step of an implicit timemarching method or at each iteration of Newton’s method. 9. 9.5.164 CHAPTER 9. we consider iterative methods which are not time accurate at all. the problem becomes one of solving for u in CAb u − C f b = 0 Notice that the solution of Eq.5.7 depends crucially on the eigenvalue and eigenvector structure of the matrix CAb .5) where Ab is nonsingular. does not depend at all on the eigensystem of the basic matrix Ab . For example. and.5 from the left by some nonsingular matrix.6) 9. our analysis will focus on their application to large sparse linear systems of equations in the form Ab u − f b = 0 (9. equally important. 9. Using an iterative method. If we designate the conditioning matrix by C. and the use of the subscript b will become clear shortly. 9.1 Formulation of the Model Problem Preconditioning the Basic Matrix It is standard practice in applying relaxation procedures to precondition the basic equation. Such systems of equations arise. This preconditioning has the eﬀect of multiplying Eq. In the following we will see that our approach to the iterative solution of Eq. which is given by u∞ = A−1 f b b (9.2. the . While they are applicable to coupled systems of nonlinear algebraic equations in the form of Eq. for example. we seek to obtain rapidly a solution which is arbitrarily close to the exact solution of Eq.
It can ﬁrst be conditioned so that the modulus of each element is 1.11) . FORMULATION OF THE MODEL PROBLEM 165 conditioning matrix C must be chosen such that this requirement is satisﬁed. For example. this leads to the approximation 0 1 −1 0 1 1 −1 0 1 δx u = 2∆x −1 0 −2 u+ 1 2 −ua 0 0 0 0 (9. Using a ﬁrstorder backward diﬀerence on the right side (as in Section 3. The result is A2 = −AT A1 = 1 0 −1 1 0 −1 1 0 −1 1 0 1 −1 −1 0 −1 1 0 −1 1 0 −1 1 0 −1 1 1 = −1 0 1 0 −2 0 1 1 0 −2 0 1 1 0 −2 1 1 1 −2 (9.9) The matrix in Eq. A choice of C which ensures this condition is the negative transpose of Ab . We then further condition with multiplication by the negative transpose.9 has eigenvalues whose imaginary parts are much larger than their real parts. This is accomplished using a diagonal preconditioning matrix D = 2∆x 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 2 (9.10) which scales each row.1. consider a spatial discretization of the linear convection equation using centered diﬀerences with a Dirichlet condition on the left side and no constraint on the right side.9.6). 9.
e. we will thoroughly investigate some simple equations which model these structures. 1 . as deﬁned in Appendix A. RELAXATION METHODS If we deﬁne a permutation matrix P 1 and carry out the process P T [−AT A1 ]P (which 1 just reorders the elements of A1 and doesn’t change the eigenvalues) we ﬁnd 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 −1 0 1 0 −2 0 1 1 0 −2 0 1 1 0 −2 1 1 1 −2 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 −2 1 1 −2 1 1 −2 1 = 1 −2 1 1 −1 (9. In the remainder of this chapter. symmetric with negative real eigenvalues).12) which has all negative real eigenvalues.166 CHAPTER 9. and the classical relaxation methods can be applied. Relaxation methods are applicable to more general matrices.2 The Model Equations Preconditioning processes such as those described in the last section allow us to prepare our algebraic equations in advance so that certain eigenstructures are guaranteed. 9. we simply wish to show that a broad b range of matrices can be preconditioned into a form suitable for our analysis. The classical methods will usually converge if Ab is diagonally dominant..2 The symbol for the dependent variable has been changed to φ as a reminder that the physics being modeled is no longer time A permutation matrix (deﬁned as a matrix with exactly one 1 in each row and column and has the property that P T = P −1 ) just rearranges the rows and columns of a matrix. We do not necessarily recommend the use of −AT as a preconditioner.1. Thus even when the basic matrix Ab has nearly imaginary eigenvalues. the conditioned matrix −AT Ab b is nevertheless symmetric negative deﬁnite (i. We will consider the preconditioned system of equations having the form Aφ − f = 0 (9. as given in Appendix B.13) where A is symmetric negative deﬁnite. 2 We use a symmetric negative deﬁnite matrix to simplify certain aspects of our analysis.
−2.1.13. In this case 1 B(1. Introducing the threepoint central diﬀerencing scheme for the second derivative with Dirichlet boundary conditions. −2. 9.7. It is instructive in formulating the concepts to consider the special case given by the diﬀusion equation in one dimension with unit diﬀusion coeﬃcient ν: ∂u ∂2 u = 2 − g(x) ∂t ∂x This has the steadystate solution ∂2u = g(x) ∂x2 (9. 1 .16) which is the onedimensional form of the Poisson equation. 1) ∆x2 = g − (bc) Ab = fb Choosing C = ∆x2 I. 9.15) (9. is guaranteed to exist because A is nonsingular.18) B(1. −2. Note that the solution of Eq. FORMULATION OF THE MODEL PROBLEM 167 accurate when we later deal with ODE formulations. 1)u + (bc) − g dt ∆x2 (9. we obtain (9.14) The above was written to treat the general case. b. If we consider a Dirichlet boundary condition on the left side and either a Dirichlet or a Neumann condition on the right side. In the notation of Eqs. A = CAb and f = Cf b (9. then A has the form A = B 1. we ﬁnd du 1 = B(1.5 and 9.9. φ = A−1 f .17) where (bc) contains the boundary conditions and g contains the values of the source term at the grid nodes.19) where f = ∆x2 fb . 1)φ = f (9.
1 Classical Relaxation The Delta Form of an Iterative Scheme We will consider relaxation methods which can be expressed in the following delta form: H φn+1 − φn = Aφn − f (9. s]T s = −2 or −1 (9. and much of the remaining material is devoted to this case.2. We attempt to do this in such a way.2 9.24) Hence it is clear that H should lead to a system of equations which is easy to solve.22) 9. or at least easier to solve than the original system.23) where G ≡ I + H −1 A (9. RELAXATION METHODS b = [−2.2 The Converged Solution.25) . 3. that it is directly applicable to two. the Residual. The matrix H is independent of n for stationary methods and is a function of n for nonstationary ones.20) Note that s = −1 is easily obtained from the matrix resulting from the Neumann boundary condition given in Eq.21 for φn+1 gives (9.and threedimensional problems. · · · ·. The iteration count is designated by the subscript n or the superscript (n). −2. and the Error φn+1 = [I + H −1 A]φn − H −1 f = Gφn − H −1 f Solving Eq.24 using a diagonal conditioning matrix. The error at the nth iteration is deﬁned as en ≡ φn − φ∞ = φn − A−1 f (9. A tremendous amount of insight to the basic features of relaxation is gained by an appropriate study of the onedimensional case. 9. however.21) where H is some nonsingular matrix which depends upon the iterative method. 9. The converged solution is designated φ∞ so that φ∞ = A−1 f (9.168 CHAPTER 9. −2.2.
5.22.28) (9.27) Consequently. The GaussSeidel method is φj (n+1) = 1 (n+1) (n) φ + φj+1 − fj 2 j−1 (9.25 by A from the left. 9. G is referred to as the basic iteration matrix.2.1. −2. 9. and its eigenvalues. The PointJacobi method is expressed in point operator form for the onedimensional case as φ(n+1) = j 1 (n) φj−1 + φ(n) − fj j+1 2 (n+1) (9. CLASSICAL RELAXATION 169 where φ∞ was deﬁned in Eq. The method of successive overrelaxation (SOR) . Hence the jth row of Eq. Nonstationary processes in which H (and possibly C) is varied at each iteration are discussed in Section 9.3 The Classical Methods Point Operator Schemes in One Dimension Let us consider three classical relaxation procedures for our model equation B(1. There results the relation between the error and the residual Aen − rn = 0 Finally. the jth row of Eq.2.26.30) This operator comes about by choosing the value of φj such that together with the old values of φj−1 and φj+1. 9. 9.31) This operator is a simple extension of the pointJacobi method which uses the most recent update of φj−1 . 9.2.9. 9. determine the convergence rate of a method.29) as given in Section 9. it is not diﬃcult to show that en+1 = Gen (9.29 is satisﬁed. The residual at the nth iteration is deﬁned as rn ≡ Aφn − f (9.26) Multiply Eq. we have considered only what are usually referred to as stationary processes in which H is constant throughout the iterations. 1)φ = f (9.29 is satisﬁed using the new values of φj and φj−1 and the old value of φj+1. In all of the above. and use the deﬁnition in Eq. which we designate as σm .
but it can also be written in the single line φj (n+1) = ω (n+1) ω (n) ω (n) φj−1 + (1 − ω)φj + φj+1 − fj 2 2 2 (9. 9. the portion of the matrix below the diagonal. which is also easy to solve.32) where ω generally lies between 1 and 2. 9.33) The General Form The general form of the classical methods is obtained by splitting the matrix A in Eq. 9. then perhaps it would be better to move further in this direction. such that A=L+D+U (9. D.21 leads to an interpretation of relaxation methods in terms of solution techniques for coupled ﬁrstorder ODE’s.13 into its diagonal. about which we have already learned a great deal. and the portion above the diagonal.3 9.3. being lower triangular. 9.34) Then the pointJacobi method is obtained with H = −D. It is usually expressed in two steps as 1 (n+1) (n) ˜ φj−1 + φj+1 − fj φj = 2 (n+1) (n) (n) ˜ φj = φj + ω φj − φj (9.170 CHAPTER 9. U.36) dφ = Aφ − f dt (9.21 results from the application of the explicit Euler timemarching method (with h = 1) to the following system of ODE’s: H This is equivalent to dφ = H −1 C Ab φ − f b = H −1 [Aφ − f] dt (9.35) . One can easily see that Eq. L.1 The ODE Approach to Classical Relaxation The Ordinary Diﬀerential Equation Formulation The particular type of delta form given by Eq. The GaussSeidel method is obtained with H = −(L + D). which certainly meets the criterion that it is easy to solve. RELAXATION METHODS is based on the idea that if the correction produced by the GaussSeidel method tends to move the solution toward φ∞ .
The λ eigenvalues are ﬁxed by the basic matrix in Eq. It is clear that. The generalization of this in our approach is to make h. if all of the eigenvalues of H −1 A have negative real parts (which implies that H −1 A is nonsingular). that is.36 are independent of t. These are ﬁxed by the initial guess.35. We see that the goal of a relaxation method is to remove the transient solution from the general solution in the most eﬃcient way possible. 9. they are not changed throughout the iteration process. 9.28) φn = c1 x1 (1 + λ1 h)n + · · · + cm xm (1 + λm h)n + · · · + cM xM (1 + λM h)n +φ∞ (9. Assuming that H −1 A has been chosen (that is. H −1 f is also independent of t. In a stationary method.38) which is the solution of Eq. the “time” step. Throughout the remaining discussion we will refer to the independent variable t as “time”. Hence the numerical solution after n steps of a stationary relaxation method can be expressed as (see Eq. a constant for the entire iteration. As we shall see.39) error The initial amplitudes of the eigenvectors are given by the magnitudes of the cm . the solution can be written as φ = c1 eλ1 t x1 + · · · + cM eλM t xM +φ∞ error (9. even though no true time accuracy is involved. H and C in Eq. an iteration process has been decided upon). then the system of ODE’s has a steadystate solution which is approached as t → ∞. and the eigenvectors of H −1 A are linearly independent. Suppose the explicit Euler method is used for the time integration.9. given by φ∞ = A−1 f (9.37) where what is referred to in timeaccurate analysis as the transient solution. the only free choice remaining to accelerate the removal of the error terms is the choice of h. THE ODE APPROACH TO CLASSICAL RELAXATION 171 In the special case where H−1 A depends on neither u nor t. and the secondary conditioning matrix in 9. 9. For this method σm = 1 + λm h.7. so that we must devise a way to remove them all from the general solution on an equal basis. In general it is assumed that any or all of the eigenvectors could have been given an equally “bad” excitation by the initial guess.13. The σ eigenvalues are ﬁxed for a given λh by the choice of timemarching method.3. the three classical methods have all been conditioned by the choice of H to have an optimum h equal to 1 for a stationary iteration process. . the preconditioning matrix in 9. 6. is now referred to in relaxation analysis as the error.36.
3 6.2. 1)φn − f ω (9.2.40) As a start.2 ODE Form of the Classical Methods The three iterative procedures deﬁned by Eqs.3 is that all the elements above the diagonal (or below the diagonal if the sweeps are from right to left) are zero.1.5 .3. If we impose this constraint and further restrict ourselves to banded tridiagonals with a single constant in each band. 9.42) (9.3 can be identiﬁed using the entries in Table 9. 1) gives the correct answer in one step. The constraint on H that is in keeping with the formulation of the three methods described in Section 9.35.30. TABLE 9. −2.2.31 and 9. With this choice of notation the three methods presented in Section 9.43 THAT LEAD TO CLASSICAL RELAXATION METHODS β 0 1 1 ω 1 1 2/ 1 + sin M π 1 + Method PointJacobi GaussSeidel Optimum SOR Equation 6. . 9. since multiplication by the inverse amounts to solving the problem by a direct method without iteration. 0)(φn+1 − φn ) = B(1.43) where β and ω are arbitrary.172 CHAPTER 9.3. Insert the model equation 9.1: VALUES OF β and ω IN EQ. h.32 obey no apparent pattern except that they are easy to implement in a computer code since all of the data required to update the value of one point are explicitly available at the time of the update. However. 1) since then multiplication from the left by −B−1 (1.4 6.1. We arrive at H(φn+1 − φn ) = B(1. we are led to 2 B(−β. this is not in the spirit of our study.2.41) It is clear that the best choice of H from the point of view of matrix algebra is −B(1. 1)φn − f (9. Then H dφ = B(1.29 into the ODE form 9. −2. −2. let us use for the numerical integration the explicit Euler method φn+1 = φn + hφn with a step size. RELAXATION METHODS 9. equal to 1. 9. 1)φ − f dt (9. −2. −2. Now let us study these methods as subsets of ODE as formulated in Section 9.2.
In this light. . The three classical methods.45) 2 2 2 2 This represents a generalization of the classical relaxation techniques.4 Eigensystems of the Classical Methods The ODE approach to relaxation can be summarized as follows. and φ∞ ≡ A−1 f is the steadystate solution. PointJacobi. −2. and SOR.44) and appear from it in the special case when the explicit Euler method is used for its numerical integration. our purpose is to examine the whole procedure as a special subset of the theory of ordinary diﬀerential equations. the three methods are all contained in the following set of ODE’s 2 dφ = B −1 (−β.49) where en is the transient. 9.48) This solution has the analytical form φn = en + φ∞ (9. 9. solution of the set of ODE’s H dφ = Aφ − f dt (9.47) An iterative scheme is developed to ﬁnd the converged. 0) B(1. or steadystate. EIGENSYSTEMS OF THE CLASSICAL METHODS 173 The fact that the values in the tables lead to the methods indicated can be veriﬁed by simple algebraic manipulation. or error.4.1.44 and Table 9. 1)φ − f dt ω (9. However. The point operator that results from the use of the explicit Euler scheme is φj (n+1) = ωβ (n+1) ω ωh (n) ωh (n) (n) φj−1 + (h − β)φj−1 − (ωh − 1)φj + φj+1 − fj(9.46) This equation is preconditioned in some manner which has the eﬀect of multiplication by a conditioning matrix C giving Aφ − f = 0 (9. GaussSeidel. . The basic equation to be solved came from some timeaccurate derivation Ab u − f b = 0 (9.9. are identiﬁed for the onedimensional case by Eq.
Thus Convergence rate ∼ σm max .52) (9. M +1 m = 1. This amounts to solving the generalized eigenvalue problem Axm = λm H xm for the special case 2 B(1. c = 1. and f = 0. M (9.3. b = −2.0. 9. 1 ). 9. The eigenvalues are given by the equation λm = −1 + cos mπ . Note that for h > 1. The three special cases considered below are obtained with a = 1.30.1 since both d and f are zero.2.44. To illustrate the behavior. RELAXATION METHODS Given our assumption that the component of the error associated with each eigenvector is equally likely to be excited. i.4. e = 2/ω. 9. is shown in Fig.e. · · · .0 (depending on M ) there is the possibility of instability. 9. . M (9.51) The generalized eigensystem for simple tridigonals is given in Appendix B. d = −β. −1. we use the ODE analysis to ﬁnd the convergence rates of the three classical methods represented by Eqs. Fig. For h < 1. the maximum σm  is obtained with m = M . we take M = 5 for the matrix order. the asymptotic convergence rate is determined by the eigenvalue σm of G (≡ I + H −1 A) having maximum absolute value.31. It is also instructive to inspect the eigenvectors and eigenvalues in the H−1 A matrix for the three methods.1. This special case makes the general result quite clear. The plot for h = 1. . and 9. 0)xm ω (9. . σ1  = σM  and the best possible convergence rate is achieved: σm max = cos π M +1 (9.53) The λσ relation for the explicit Euler method is σm = 1 + λm h. 2. 9. . 2 2 The eigensystem can be determined from Appendix B. 2. 1)xm = λm B(−β. . the ODE matrix H −1 A reduces to simply B( 1 .174 CHAPTER 9. 9. σm  > 1.2 and for h > 1. the optimum stationary case.54) . Fig. 9. To obtain the optimal scheme we wish to minimize the maximum σm  which occurs when h = 1.1 The PointJacobi System If β = 0 and ω = 1 in Eq.32. −2. m = 1.50) In this section. the maximum σm  is obtained with m = 1. This relation can be plotted for any h.
4. we obtain σm max = 0. 9.55) This is a very “wellbehaved” eigensystem with linearly independent eigenvectors and distinct eigenvalues. M (9. .0 (9. 9. . the eigenvectors can be written as x1 = 1/2 √ 3/2 √1 3/2 1/2 3/2 1 √ 0 3/2 . 2.53 √ λ1 = −1 + 23 = −0. .5 = −1.866 · · · From Eq. 0 √ − 3/2 0 √ 1 − 3/2 √ 3/2 1/2 √ √ − 3/2 − 3/2 .1.56) The corresponding eigenvalues are. The ﬁrst 5 eigenvectors are simple sine waves. from Eq.57) −1 − 1 = −1. Thus after 500 iterations the error content associated with each eigenvector is reduced to no more than 0. the eigenvectors of H−1 A are given by xj = (xj )m = sin j mπ M +1 . x2 = . x3 = −1 .9971.5 2 √ λ5 = −1 − 23 = −1.134 · · · λ2 = λ3 = λ4 = −1 + 1 2 −1 = −0. Again from Appendix B.23 times its initial level.39. EIGENSYSTEMS OF THE CLASSICAL METHODS 175 For M = 40. x5 = x4 = √ 0 1 √ 3/2 − 3/2 √ 1/2 − 3/2 √ (9.9. . For M = 5. the numerical solution written in full is √ 3 n φn − φ∞ = c1 [1 − (1 − )h] x1 2 1 + c2 [1 − (1 − )h]n x2 2 . j = 1.
176
CHAPTER 9. RELAXATION METHODS
1.0
m=5
h = 1.0
m=1
Values are equal
σm
m=4 m=2
Μ=5
0.0 2.0
m=3
1.0
0.0
Figure 9.1: The σ, λ relation for PointJacobi, h = 1, M = 5.
+ c3 [1 − (1
)h]n x3
1 + c4 [1 − (1 + )h]n x4 2 √ 3 n + c5 [1 − (1 + )h] x5 2
(9.58)
9.4.2
The GaussSeidel System
If β and ω are equal to 1 in Eq. 9.44, the matrix eigensystem evolves from the relation B(1, −2, 1)xm = λm B(−1, 2, 0)xm (9.59)
which can be studied using the results in Appendix B.2. One can show that the H−1 A matrix for the GaussSeidel method, AGS , is AGS ≡ B −1 (−1, 2, 0)B(1, −2, 1) =
9.4. EIGENSYSTEMS OF THE CLASSICAL METHODS
177
1.0
Approaches 1.0
h < 1.0
h
0.0 2.0
De
cr
ea
h
si
σm
ng
ng
Figure 9.2: The σ, λ relation for PointJacobi, h = 0.9, M = 5.
Exceeds 1.0 at h
De cr ea si
1.0
0.0
λ mh
∼ ∼
1.072 / Ustable h > 1.072
1.0
h > 1.0
ng si ea cr In
In
cr
ea
si
σm
ng
h
h
M = 5
0.0 2.0
1.0
0.0
λmh
Figure 9.3: The σ, λ relation for PointJacobi, h = 1.1, M = 5.
178
CHAPTER 9. RELAXATION METHODS −1 1/2 0 −3/4 1/2 0 1/8 −3/4 1/2 0 1/16 1/8 −3/4 1/2 0 1/32 1/16 1/8 −3/4 1/2 . . ... ... ... ... ... . . . . 0 1/2M · · ·
1 2 3 4 5 . . . M
(9.60)
The eigenvector structure of the GaussSeidel ODE matrix is quite interesting. If M is odd there are (M + 1)/2 distinct eigenvalues with corresponding linearly independent eigenvectors, and there are (M − 1)/2 defective eigenvalues with corresponding principal vectors. The equation for the nondefective eigenvalues in the ODE matrix is (for odd M ) λm = −1 + cos2 ( mπ ) , M +1 mπ M +1 m = 1, 2, . . . , M +1 2 M +1 2 (9.61)
and the corresponding eigenvectors are given by xm = cos mπ M +1
j−1
sin j
,
m = 1, 2, . . . ,
(9.62)
The λσ relation for h = 1, the optimum stationary case, is shown in Fig. 9.4. The σm with the largest amplitude is obtained with m = 1. Hence the convergence rate is σm max = cos π M +1
2
(9.63)
Since this is the square of that obtained for the PointJacobi method, the error associated with the “worst” eigenvector is removed in half as many iterations. For M = 40, σm max = 0.9942. 250 iterations are required to reduce the error component of the worst eigenvector by a factor of roughly 0.23. The eigenvectors are quite unlike the PointJacobi set. They are no longer symmetrical, producing waves that are higher in amplitude on one side (the updated side) than they are on the other. Furthermore, they do not represent a common family for diﬀerent values of M . The Jordan canonical form for M = 5 is
X −1 AGS X = JGS =
φ1 φ2
1
φ3
1 φ3
(9.64)
φ3
9.4. EIGENSYSTEMS OF THE CLASSICAL METHODS
179
1.0
h = 1.0
σm Μ=5
2 defective λ m 0.0 2.0 1.0
0.0
λmh
Figure 9.4: The σ, λ relation for GaussSeidel, h = 1.0, M = 5. The eigenvectors and principal vectors are all real. For M = 5 they can be written √ 1/2 1 0 0 √3/2 3/4 0 2 0 3/4 , x3 = 0 , x4 = −1 , x5 = 4 (9.65) x1 = 3/4 , x2 = √0 9/16 −√3/16 0 0 −4 9/32 0 0 1 − 3/32 The corresponding eigenvalues are λ1 = −1/4 λ2 = −3/4 λ3 = −1 (4) (5) Defective, linked to λ3 Jordan block (9.66)
The numerical solution written in full is thus h φn − φ∞ = c1 (1 − )n x1 4 3h n + c2 (1 − ) x2 4
180 +
CHAPTER 9. RELAXATION METHODS n n(n − 1) c3 (1 − h)n + c4 h (1 − h)n−1 + c5 h2 (1 − h)n−2 x3 1! 2! n + c4 (1 − h)n + c5 h (1 − h)n−1 x4 1! n + c5 (1 − h) x5 (9.67)
9.4.3
The SOR System
If β = 1 and 2/ω = x in Eq. 9.44, the ODE matrix is B−1 (−1, x, 0)B(1, −2, 1). One can show that this can be written in the form given below for M = 5. The generalization to any M is fairly clear. The H −1 A matrix for the SOR method, ASOR ≡ B −1 (−1, x, 0)B(1, −2, 1), is
1 x5
−2x4 x4 0 0 0 −2x3 + x4 x3 − 2x4 x4 0 0 −2x2 + x3 x2 − 2x3 + x4 x3 − 2x4 x4 0 2 2 3 2 3 4 3 −2x + x x − 2x + x x − 2x + x x − 2x4 x4 −2 + x 1 − 2x + x2 x − 2x2 + x3 x2 − 2x3 + x4 x3 − 2x4
(9.68)
Eigenvalues of the system are given by λm = −1 + where zm = [4(1 − ω) + ω2 p2 ]1/2 m pm = cos[mπ/(M + 1)] If ω = 1, the system is GaussSeidel. If 4(1 − ω) + ω2 pm < 0, zm and λm are complex. If ω is chosen such that 4(1 − ω) + ω2p2 = 0, ω is optimum for the stationary case, 1 and the following conditions hold: 1. Two eigenvalues are real, equal and defective. 2. If M is even, the remaining eigenvalues are complex and occur in conjugate pairs. 3. If M is odd, one of the remaining eigenvalues is real and the others are complex occurring in conjugate pairs. ωpm + zm 2
2
,
m = 1, 2, . . . M
(9.69)
the optimum value for the stationary case. The remaining linearly independent eigenvectors are all complex.8578. Hence the convergence rate is σm max = ωopt − 1 ωopt π = 2/ 1 + sin M +1 (9.70) mπ M +1 (9. Hence the worst error component is reduced to less than 0. This illustrates the fact that for optimum stationary SOR all the σm  are identical and equal to ωopt − 1. x5 = 1/3 (9.4. x3. For odd M . 9.72) For M = 40.74) . EIGENSYSTEMS OF THE CLASSICAL METHODS One can easily show that the optimum ω for the stationary case is π ωopt = 2/ 1 + sin M +1 and for ω = ωopt 2 λm = ζm − 1 j−1 xm = ζm sin j 181 (9. For M = 5 they can be written √ 3(1 )/2 1/2 −6 1 √ √ 1/2 9 0 3(1 ± i 2)/6 . there are two real eigenvectors and one real principal vector. the optimum value of ω may have to be determined by trial and error. σm max = 0. In practical applications.23 times its initial value in only 10 iterations.5.71) where ζm = ωopt pm + i p2 − p2 1 m 2 2 Using the explicit Euler method to integrate the ODE’s.4 = √ 1/6 13 √ 3(5 ± i 2)/54 0 √ 1/18 6 1/9 3(7 ± 4i 2)/162 The corresponding eigenvalues are λ1 (2) λ3 λ4 λ5 = = = = −2/3 Defective linked to λ1 √ −(10 − 2 2i)/9 √ −(10 + 2 2i)/9 −4/3 (9. the λσ relation reduces to that shown in Fig. much faster than both GaussSeidel and PointJacobi.9. and the beneﬁt may not be as great. σm = 1 − h + hζm . x2 = 16 .73) 0√ x1 = 1/3 . and if h = 1.
λ relation for optimum stationary SOR.0 h = 1.5 Nonstationary Processes In classical terminology a method is said to be nonstationary if the conditioning matrices. H and C. 9.5: The σ. The nonstationary form of Eq.182 CHAPTER 9.39 is . h = 1. however.0 0. The numerical solution written in full is φn − φ∞ = [c1 (1 − 2h/3)n + c2 nh(1 − 2h/3)n−1 ]x1 + c2 (1 − 2h/3)n x2 √ + c3 [1 − (10 − 2 2i)h/9]n x3 √ + c4 [1 − (10 + 2 2i)h/9]n x4 + c5 (1 − 4h/3)n x5 (9. This does not change the steadystate solution A−1 f b .75) 9. M = 5.0 λmh Figure 9. are varied at each time step.0 1. to study the case of ﬁxed H and C but variable step size. but it can greatly aﬀect the convergence rate. In our ODE b approach this could also be considered and would lead to a study of equations with nonconstant coeﬃcients.0 2 real defective λ m σm 2 complex λ m 1 real λ m Μ=5 0. h. For the GaussSeidel and SOR methods it leads to processes that can be superior to the stationary methods.0 2. This process changes the PointJacobi method to Richardson’s method in standard terminology. It is much simpler. RELAXATION METHODS 1.
M . . .76.78) where Pe signiﬁes an “Euler” polynomial. Consider one of the error terms taken from eN ≡ φN − φ∞ = and write it in the form cm xm Pe (λm ) ≡ cm xm N M N cm xm m=1 (1 + λm hn ) n=1 (9.77) (1 + λm hn ) n=1 (9. the eigenvalues λm are generally unknown and costly to compute. We will see that a few well chosen h’s can reduce whole clusters of eigenvectors associated with nearby λ’s in the λm spectrum. Let us choose the hn so that the maximum value of (Pe )N (λ) is as small as possible for all λ lying between λa and λb such that λb ≤ λ ≤ λa ≤ 0.76) where the symbol Π stands for product. 2. M . Now focus attention on the polynomial (Pe )N (λ) = (1 + h1 λ)(1 + h2 λ) · · · (1 + hN λ) (9. 9. . Mathematically stated.80) . This leads to the concept of selectively annihilating clusters of eigenvectors from the error terms as part of a total iteration process. Since hn can now be changed at each step. the error term can theoretically be completely eliminated in M steps by taking hm = −1/λm .5. we seek λb ≤λ≤λa max (Pe )N (λ) = minimum . 2. It is therefore unnecessary and impractical to set hm = −1/λm for m = 1. NONSTATIONARY PROCESSES 183 N φN = c1 x1 (1 + λ1 hn ) + · · · + cm xm N N (1 + λm hn ) n=1 n=1 + · · · + cM xM (1 + λM hn ) + φ∞ n=1 (9. Now we pose a less demanding problem.79) treating it as a continuous function of the independent variable λ. we considered making the error exactly zero by taking advantage of some knowledge about the discrete values of λm for a particular case. with(Pe )N (0) = 1 (9. However. In the annihilation process mentioned after Eq. . · · · . for m = 1. This is the basis for the multigrid methods discussed in Chapter 10. Let us consider the very important case when all of the λm are real and negative (remember that they arise from a conditioned matrix so this constraint is not unrealistic for quite practical cases).9.
84 gives us the best choice of a series of hn that will minimize the amplitude of the error carried in the eigenvectors associated with the eigenvalues between λb and λa . As an example for the use of Eq.184 CHAPTER 9. 2. RELAXATION METHODS This problem has a well known solution due to Markov. In relaxation terminology this is generally referred to as Richardson’s method. .82) + 1 y− 2 y2 − 1 N (9. 9.83) are the Chebyshev polynomials for y > 1.84) Remember that all λ are negative real numbers representing the magnitudes of λm in an eigenvalue spectrum.86) (9.81) where TN (y) = cos(N arccos y) are the Chebyshev polynomials along the interval −1 ≤ y ≤ 1 and TN (y) = 1 y+ 2 y2 − 1 N (9.76 is expressed in terms of a set of eigenvectors. n = 1. ampliﬁed by the coeﬃcients cm (1 + λm hn ). The error in the relaxation process represented by Eq.85) . With each eigenvector there is a corresponding eigenvalue. xm . . and it leads to the nonstationary step size choice given by 1 1 (2n − 1)π = −λb − λa + (λb − λa ) cos hn 2 2N . let us consider the following problem: Minimize the maximum error associated with the λ eigenvalues in the interval −2 ≤ λ ≤ −1 using only 3 iterations.84. The three values of h which satisfy this problem are hn = 2/ 3 − cos (2n − 1)π 6 (9. 9. N (9. It is TN (Pe )N (λ) = 2λ − λa − λb λa − λb −λa − λb TN λa − λb (9. Eq. 9. .
92) . 9.76.6 are √ h1 = 4/(6 − 3) h2 = 4/(6 − 0) √ h3 = 4/(6 + 3) Return now to Eq. 9.87) (9. This calls for a study of the rational “trapezoidal” polynomial.90) would result. NONSTATIONARY PROCESSES and the amplitude of the eigenvector is reduced to (Pe )3 (λ) = T3 (2λ + 3)/T3 (3) where T3 (3) = [3 + √ 8]3 + [3 − √ 8]3 /2 ≈ 99 185 (9.5. Eq.37 on the condition that the explicit Euler method. This was derived from Eq.88) A plot of Eq. with (Pt )N (0) = 1 (9.41. The values of h used in Fig. 9. 9.6 and we see that the amplitudes of all the eigenvectors associated with the eigenvalues in the range −2 ≤ λ ≤ −1 have been reduced to less than about 1% of their initial values.91) under the same constraints as before. 9. If instead the implicit trapezoidal rule 1 φn+1 = φn + h(φn+1 + φn ) 2 is used.9. namely that λb ≤λ≤λa max (Pt )N (λ) = minimum .87 is given in Fig. the nonstationary formula 1 1 + hn λm 2 +φ φN = cm xm ∞ 1 m=1 n=1 1 − h λ n m 2 M N (9. was used to integrate the basic ODE’s. Pt : 1 N 1 + hn λ 2 (Pt )N (λ) = 1 n=1 1 − h λ n 2 (9.89) (9. 9.
2 0 0.4 −1.6 (Pe )3 (λ) 0.1 0 −2 −1.6 −1.015 −0.2 0 Figure 9.8 −1.4 −0.8 0. .186 CHAPTER 9.02 −2 −1.01 0.9 0.4 0.8 −0.015 0.005 −0. RELAXATION METHODS 1 0.2 0.2 −1 λ −0. minimization over −2 ≤ λ ≤ −1.4 −1.8 −1.6 −0.5 0.005 (P ) (λ) e 3 0 −0.6 −0.01 −0.8 −0.2 −1 λ −0.6: Richardson’s method for 3 steps.4 −0.7 0.02 0.6 −1.3 0.
N (9. Recall that the eigenvalues of H −1 A satisfy Axm = λm Hxm . · · · . PROBLEMS 187 The optimum values of h can also be found for this problem. 2.2. Using Appendix B. consider the iterative method x (I + µA1 )˜ = (I − µA2 )xn + µb (I + µA2 )xn+1 = (I − µA1 )˜ + µb x where µ is a parameter.7.7 are h1 = 1 √ h2 = 2 h3 = 2 9. −2. For a linear system of the form (A1 + A2 )x = b. 9. 1. n = 1. . Show that this iterative method can be written in the form H(xk+1 − xk ) = (A1 + A2 )xk − b Determine the iteration matrix G if µ = −1/2. Given a relaxation method in the form 2.85. The error amplitude is about 1/5 of that found for (Pe )3 (λ) in the same interval of λ. 9. 1) and ω = ωopt .6 Problems H∆φn = Aφn − f show that φn = Gn φ0 + (I − Gn )A−1 f where G = I + H −1 A. ﬁnd the eigenvalues of H−1 A for the SOR method with A = B(4 : 1. Then ﬁnd the corresponding σm  values. (You do not have to ﬁnd H −1 . not just the expressions.) Find the numerical values. The values of h used in Fig. but we settle here for the approximation suggested by Wachspress 2 λa = −λb hn λb (n−1)/(N −1) . The results for (Pt )3 (λ) are shown in Fig.93) This process is also applied to problem 9.6.9. 3.
6 −1.4 −1.6 −0.5 −2 −2 −1.2 −1 λ −0.8 −0.3 0.5 0.5 −1 −1.4 0.188 CHAPTER 9.4 −0.8 −1.6 −0. RELAXATION METHODS 1 0. .8 0.2 0 2 x 10 −3 1.8 −0.1 0 −2 −1.7 0.2 0.4 −1.7: Wachspress method for 3 steps. minimization over −2 ≤ λ ≤ −1.9 0.2 −1 λ −0.6 −1.4 −0.6 (Pt )3 (λ) 0.5 1 0.2 0 Figure 9.5 (Pt )3 (λ) 0 −0.8 −1.
use u(x) = 0. the number of iterations. PROBLEMS 189 4. 3. Iterate to steady state using (a) the pointJacobi method. u(1) = 1: ∂2u − 6x = 0 ∂x2 For the initial condition. Plot the logarithm of the L2 norm of the residual vs.5. Compare with the theoretical asymptotic convergence rate. Use secondorder centered diﬀerences on a grid with 40 cells (M = 39). and 4 orders of magnitude. . (b) the GaussSeidel method.9. and (d) the 3step Richardson method derived in Section 9.6. (c) the SOR method with the optimum value of ω. Solve the following equation on the domain 0 ≤ x ≤ 1 with boundary conditions u(0) = 0. Plot the solution after the residual is reduced by 2. Determine the asymptotic convergence rate.
RELAXATION METHODS .190 CHAPTER 9.
Notice that as the magnitudes of the eigenvalues increase. This has a rational explanation from the origin of the banded matrix. The eigenvalues and eigenvectors are given in Sections 4.1) then the corresponding eigenvectors are ordered from low to high space frequencies. Note that ∂2 sin(mx) = −m2 sin(mx) ∂x2 and recall that δxx φ = 1 1 D(λ) X −1 φ 2 B(1.3. 1)φ = X ∆x ∆x2 191 (10. if the eigenvalues are ordered such that λ1  ≤ λ2  ≤ · · · ≤ λM  (10.2) .1. the spacefrequency (or wavenumber) of the corresponding eigenvectors also increase. −2.3.1 10.1 Motivation Eigenvector and Eigenvalue Identiﬁcation with Space Frequencies Consider the eigensystem of the model matrix B(1. There are many variations of the process and many viewpoints of the underlying theory.2 and 4. 10. −2. The viewpoint presented here is a natural extension of the concepts discussed in Chapter 9. respectively.Chapter 10 MULTIGRID The idea of systematically using sets of coarser grids to accelerate the convergence of iterative schemes that arise from the numerical solution to partial diﬀerential equations was made popular by the work of Brandt. 1).3) (10. That is.3.
the property can be restored by using h < 1.4) Hence. However. of the GaussSeidel method and. −m2 . One ﬁnds that 1 M +1 2 λm = π ∆x 2 −2 + 2 cos mπ M +1 ≈ −m2 . and Xφ. error components corresponding to high space frequencies will be reduced more quickly than those corresponding to low space frequencies. 1. It is also true.4. this correlation is not necessary in general.192 CHAPTER 10. as shown in Fig.2 The Basic Process First of all we assume that the diﬀerence equations representing the basic partial diﬀerential equations are in a form that can be related to a matrix which has certain . from Appendix B. of the Richardson method described in Section 9.5. In fact. for example. Therefore. This is the key concept underlying the multigrid process.1. 9.2 Properties of the Iterative Method The second key motivation for multigrid is the following: • Many iterative methods reduce error components corresponding to eigenvalues of large amplitude more eﬀectively than those corresponding to eigenvalues of small amplitude.2. However. 2 ). 10. by design. As we saw in Section 9. When an iterative method with this property is applied to a matrix with the above correlation between the modulus of the eigenvalues and the space frequency of the eigenvectors. 10. the complete counterexample of the above association is contained in the eigensystem 1 for B( 1 . this method produces the same value of σ for λmin and λmax . For this matrix one ﬁnds. We have seen that X −1 φ represents a sine transform. exactly the opposite 2 behavior. The classical pointJacobi method does not share this property. m << M (10. a sine synthesis. the operation 1 2 D(λ) represents the numerical approximation of the multiplication of the ∆x appropriate sine wave by the negative square of its wavenumber.1. the correlation of large magnitudes of λm with high spacefrequencies is to be expected for these particular matrix operators. This is consistent with the physics of diﬀusion as well. MULTIGRID where D(λ) is a diagonal matrix containing the eigenvalues. This is to be expected of an iterative method which is time accurate.
as in the example given by Eq. We do not attempt to develop an optimum procedure here. of the matrix are all real and negative. 2. The problem is linear. or it can be “contrived” by further conditioning. In Eq.6 is used. 10.5) where the matrix formed by the product CAb has the properties given above. The eigenvectors associated with the eigenvalues having largest magnitudes can be correlated with high frequencies on the diﬀerencing mesh. This form can be arrived at “naturally” by simply replacing the derivatives in the PDE with diﬀerence schemes.27. The λm are fairly evenly distributed between their maximum and minimum values. 9.10. 3. the residual.11. we are led to the starting formulation C[Ab φ∞ − f b ] = 0 (10.6) Recall that the φ used to compute r is composed of the exact solution φ∞ and the error e in such a way that Ae − r = 0 where A ≡ CAb (10. but for clarity we suppose that the threestep Richardson method illustrated in Fig. 4. The basic assumptions required for our description of the multigrid process are: 1. and φ∞ is a vector representing the desired exact solution. The eigenvalues. THE BASIC PROCESS 193 basic properties. 5. where r = C[Ab φ − f b ] (10. 2 These conditions are suﬃcient to ensure the validity of the process described next. At the end of the three steps we ﬁnd r. the vector f b represents the boundary conditions and the forcing function. if any.8) (10. Having preconditioned (if necessary) the basic ﬁnite diﬀerencing scheme by a procedure equivalent to the multiplication by a matrix C. as in the examples given by Eq.5.7) . 9.2. 3. The iterative procedure used greatly reduces the amplitudes of the eigenvectors associated with eigenvalues in the range between 1 λmax and λmax. We start with some initial guess for φ∞ and proceed through n iterations making use of some iterative process that satisﬁes property 5 above. λm .
and the other the even entries of the original vector (or any other appropriate sorting which is consistent with the interpolation approximation to be discussed below). 10. For a 7point example e2 e4 e6 e1 e3 e5 e7 = 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 e1 e2 e3 e4 e5 e6 e7 .7 for e then φ∞ = φ − e CHAPTER 10. we can be sure that the high frequency content of e has been greatly reduced (about 1% or less of its original value in the initial guess). we can write P A[P −1P ]e = P r The operation P AP −1 partitions the A matrix to form (10. ee eo = Pe (10. We can write the exact solution for e in terms of the eigenvectors of A.12) A1 A2 A3 A4 ee eo = re ro (10.13) Notice that A1 ee + A2 eo = re (10.10) very low amplitude Combining our basic assumptions. since a permutation matrix has an inverse which is its transpose. Next we construct a permutation matrix which separates a vector into two parts. one containing the odd entries.11) Multiply Eq.9) Thus our goal now is to solve for e. assumption 4 ensures that the error has been smoothed.194 If one could solve Eq. 10. MULTIGRID (10. In addition. and the σ eigenvalues of the Richardson process in the form: M/2 3 M 3 e= m=1 cm xm n=1 [σ(λm hn )] + m=M/2+1 cm xm n=1 [σ(λm hn )] (10.7 from the left by P and.14) .
ea and eb are zero. no operations on r are required or justiﬁed.14 reduces to A1 ee + A2 A2 ee = re or Ac ee − re = 0 where Ac = [A1 + A2 A2 ] (10. 10. THE BASIC PROCESS 195 is an exact expression. no aliasing of these functions can occur in subsequent steps. it is reasonable to suppose that the odd points are the average of the even points. Finally. It is that there is some connection between ee and eo brought about by the smoothing property of the Richardson relaxation procedure. For example e1 ≈ e3 e5 e7 1 (ea + e2 ) 2 1 ≈ (e2 + e4 ) 2 1 (e4 + e6 ) ≈ 2 1 (e6 + eb ) ≈ 2 or eo = A2 ee (10. in this formulation.10. Since the top half of the frequency spectrum has been removed.2. It is also important to notice that we are dealing with the relation between e and r so the original boundary conditions and forcing function (which are contained in f in the basic formulation) no longer appear in the problem.16) With this approximation Eq. notice that.15) It is important to notice that ea and eb represent errors on the boundaries where the error is zero if the boundary conditions are given. the averaging of e is our only approximation.18) (10. and one can write for the example case A2 = 1 2 1 1 0 0 0 1 1 0 0 0 1 1 (10. If the boundary conditions are Dirichlet.19) (10.17) . At this point we make our one crucial assumption. Hence.
10.19. 10.22) and are illustrated in Fig. m = 1. −2.21) Ac 1/2 −1 If the boundary conditions are mixed DirichletNeumann. M (10. .196 CHAPTER 10.23) .1 are met. the matrix on the coarse mesh. 10. All of them go through zero on the left (Dirichlet) side. 10. In the particular case illustrated in Fig. · · · . B. MULTIGRID The form of Ac . The eigensystem is given by Eq. the eigenvector structure is diﬀerent from that given in Eq.15 must be changed. 1) where b = [−2. 9.. It is easy to show that the high spacefrequencies still correspond to the eigenvalues with high magnitudes. When eb = eM . b. For Neumann conditions.55 for Dirichlet conditions.18 gives A2 A1 1 −2 1 1 1 1 1 −2 1 1 + · 1 1 2 −2 1 1 1 A2 −1 1/2 = 1/2 −1 1/2 (10.1.20) and Eq. our 7point example would produce −1 P AP = −2 1 −2 −2 −2 1 1 1 1 1 1 1 1 1 1 −2 −2 −2 1 A1 = A3 A2 A4 (10. is completely determined by the choice of the permutation matrix and the interpolation approximation. and. in fact.1. A in the 1D model equation is B(1. −2. eb is equal to eM . −2. If the original A had been B(7 : 1. 1). 2.16 changes to A2 = 1 2 1 1 0 0 0 1 1 0 0 0 1 2 (10. 10... the interpolation formula in Eq. −1]T . In the present case they are given by xjm = sin j (2m − 1)π 2M + 1 . ea = e1 . and all of them reﬂect on the right (Neumann) side. all of the properties given in Section 10. If Neumann conditions are on the left. the example in Eq. However.
Recall that our goal is 2 to solve for e. THE BASIC PROCESS 197 X Figure 10. −1 Ac = Xc Λc Xc (10.” We will continue with Dirichlet boundary conditions for the remainder of this Section. we must now determine the relationship between ee and e. · · · . and thus e.24) 1/2 −1/2 which gives us what we might have “expected.15. j = 1. 2. In order to examine this relationship. −2.10. −2.25) For A = B(M : 1. Given ee computed on the coarse grid (possibly using even coarser grids). Therefore. we can compute eo using Eq. xm = sin j mπ M +1 .9. 1)ee = re on the next coarser mesh.2. The permutation matrix remains the same and both A1 and A2 in the partitioned matrix P AP −1 are unchanged (only A4 is modiﬁed by putting −1 in the lower right element). M .26) m = 1. M (10.1: Eigenvectors for the mixed Dirichlet–Neumann case. we need to consider the eigensystems of A and Ac : A = XΛX −1. 1)e = r on the ﬁne mesh to 1 B(1. we have reduced the problem from B(1. At this stage. which will provide us with the solution φ∞ using Eq. · · · . 10. −2. we can construct the coarse matrix from A2 A1 1 −2 1 1 1 1 1 −2 1 1 + · 1 1 2 −2 1 1 2 A2 −1 1/2 = 1/2 −1 Ac 1/2 (10. 1) the eigenvalues and eigenvectors are λm = −2 1 − cos mπ M +1 . In order to complete the process. 10. 2.
.33) . 1) are 2 (λc )1 = − 1 − cos 2π M +1 . In addition. . it contains the even elements of x1 . 0 (10. that is. The eigenvalue and eigenvector corresponding to m = 1 for the matrix Ac = 1 B(Mc . 1. Then the residual is r = Ax1 = λ1 x1 and the residual on the coarse grid is re = λ1 (xc )1 (10.198 CHAPTER 10. If we restrict our attention to odd M . x1 = sin j π M +1 .27) For example. j = 1.31) = λ1 Xc 1/(λc )1 0 . j = 1. e = x1 . M (10. · · · . then Mc = (M − 1)/2 is the size of Ac . (xc )1 = sin j 2π M +1 . we obtain (λc )1 = −0. corresponding to λ1 = −2 1 − cos π M +1 . As M increases.28) For M = 51 (Mc = 25).998λ1 . the most diﬃcult error mode to eliminate is that with m = 1. .29) since (xc )1 contains the even elements of x1 . 0 (10.30) (10.007291 = 1. one can easily see that (xc )1 coincides with x1 at every second point of the latter vector.32) (10. λ1 = −0. (λc )1 approaches 2λ1 . i.003649. · · · . with M = 51. −2.e. 2. Now let us consider the case in which all of the error consists of the eigenvector component x1 . MULTIGRID Based on our assumptions. 2. . Mc (10. The exact solution on the coarse grid satisﬁes −1 ee = A−1 re = Xc Λ−1 Xc λ1 (xc )1 c c = λ1 Xc Λ−1 c 1 0 ..
40) which is precisely the matrix appearing in Eq.37. and usually is. carried to even coarser grids before returning to the ﬁnest level. we must multiply the result by 2.38) and the preconditioning matrix C is simply C= ∆x2 I ν (10. The problem to be solved on the coarse grid is the same as that solved on the ﬁne grid. 1) = B(Mc : 1.2. −2. . quite important but they are not discussed here.15). The remaining steps required to complete an entire multigrid process are relatively straightforward. since ∆xc = 2∆x. THE BASIC PROCESS = λ1 (xc )1 (λc )1 199 (10. −2. This is equivalent to solving 1 Ac ee = re 2 or 1 B(Mc : 1. −2.and upgoing paths. 1)ee = re 4 (10. −2.39) Applying the discretization on the coarse grid with the same preconditioning matrix as used on the ﬁne grid gives. 1) 2 2 ∆xc 4 ∆xc (10. obviously. but they vary depending on the problem and the user. 10. 1) ∆x2 (10.37) (10.34) 1 ≈ (xc )1 2 (10. the matrix A = B(M : 1. −2. −2. The details of ﬁnding optimum techniques are. 1) = B(Mc : 1. in addition to interpolating ee to the ﬁne grid (using Eq. in each case the appropriate permutation matrix and the interpolation approximation deﬁne both the down. However.35) Since our goal is to compute e = x1 . The reduction can be. 10.10. C ν ∆x2 1 B(Mc : 1. 1) comes from a discretization of the diﬀusion equation. Thus we see that the process is recursive.36) In our case. which gives Ab = ν B(M : 1.
43) 2. which can be easily generalized to a process with an arbitrary number of grids due to the recursive nature of multigrid. 10.” Some form of weighted restriction can also be used. Note that the coarse grid matrix denoted A2 here was denoted Ac in the preceding section.g. .44) In our example in the preceding section.45) that is. MULTIGRID 10. 3. This type of restriction is known as “simple injection. the restriction matrix is 0 1 0 0 0 0 0 2 R1 = 0 0 0 1 0 0 0 0 0 0 0 0 1 0 (10. Transfer (or restrict) r(1) to the coarse grid: 2 r(2) = R1 r(1) (10. This gives1 φ(1) = Gn1 φn + (I − Gn1 ) A−1 f 1 1 where −1 G1 = I + H1 A1 (10. 1.11. Next compute the residual based on φ(1) : r(1) = Aφ(1) − f = AGn1 φn + A (I − Gn1 ) A−1 f − f 1 1 = AGn1 φn − AGn1 A−1 f 1 1 (10.21). Perform n1 iterations of the selected relaxation method on the ﬁne grid. starting with φ = φn .200 CHAPTER 10. Solve the problem A2 e(2) = r(2) on the coarse grid exactly:2 e(2) = A−1 r(2) 2 1 2 (10.46) See problem 1 of Chapter 9.42) and H1 is deﬁned as in Chapter 9 (e.. Extension to nonlinear problems requires that both the solution and the residual be transferred to the coarse grid in a process known as full approximation storage multigrid. Call the result φ(1) . 9. the ﬁrst three rows of the permutation matrix P in Eq. Eq.3 A TwoGrid Process We now describe a twogrid process for the linear problem Aφ = f .41) (10.
10.3. A TWOGRID PROCESS
201
Here A2 can be formed by applying the discretization on the coarse grid. In the 1 preceding example (eq. 10.40), A2 = 4 B(Mc : 1, −2, 1). It is at this stage that the generalization to a multigrid procedure with more than two grids occurs. If this is the coarsest grid in the sequence, solve exactly. Otherwise, apply the twogrid process recursively. 4. Transfer (or prolong) the error back to the ﬁne grid and update the solution:
1 φn+1 = φ(1) − I2 e(2)
(10.47)
In our example, the prolongation matrix is
1 I2 =
1/2 0 0 1 0 0 1/2 1/2 0 0 1 0 0 1/2 1/2 0 0 1 0 0 1/2
(10.48)
which follows from Eq. 10.15. Combining these steps, one obtains
1 2 1 2 φn+1 = [I − I2 A−1 R1 A]Gn1 φn − [I − I2 A−1 R1 A]Gn1 A−1 f + A−1 f 2 1 2 1
(10.49)
Thus the basic iteration matrix is
1 2 [I − I2 A−1 R1 A]Gn1 2 1
(10.50)
The eigenvalues of this matrix determine the convergence rate of the twogrid process. The basic iteration matrix for a threegrid process is found from Eq. 10.50 by replacing A−1 with (I − G2 )A−1 , where 2 2 3
2 3 G2 = [I − I3 A−1 R2 A2 ]Gn2 3 3 2
(10.51)
2 3 In this expression n2 is the number of relaxation steps on grid 2, I3 and R2 are the transfer operators between grids 2 and 3, and A3 is obtained by discretizing on grid 3. Extension to four or more grids proceeds in similar fashion.
202
CHAPTER 10. MULTIGRID
10.4
Problems
1. Derive Eq. 10.51. 2. Repeat problem 4 of Chapter 9 using a fourgrid multigrid method together with (a) the GaussSeidel method, (b) the 3step Richardson method derived in Section 9.5. Solve exactly on the coarsest grid. Plot the solution after the residual is reduced by 2, 3, and 4 orders of magnitude. Plot the logarithm of the L2 norm of the residual vs. the number of iterations. Determine the asymptotic convergence rate. Calculate the theoretical asymptotic convergence rate and compare.
Chapter 11
NUMERICAL DISSIPATION
Up to this point, we have emphasized the secondorder centereddiﬀerence approximations to the spatial derivatives in our model equations. We have seen that a centered approximation to a ﬁrst derivative is nondissipative, i.e., the eigenvalues of the associated circulant matrix (with periodic boundary conditions) are pure imaginary. In processes governed by nonlinear equations, such as the Euler and NavierStokes equations, there can be a continual production of highfrequency components of the solution, leading, for example, to the production of shock waves. In a real physical problem, the production of high frequencies is eventually limited by viscosity. However, when we solve the Euler equations numerically, we have neglected viscous eﬀects. Thus the numerical approximation must contain some inherent dissipation to limit the production of highfrequency modes. Although numerical approximations to the NavierStokes equations contain dissipation through the viscous terms, this can be insuﬃcient, especially at high Reynolds numbers, due to the limited grid resolution which is practical. Therefore, unless the relevant length scales are resolved, some form of added numerical dissipation is required in the numerical solution of the NavierStokes equations as well. Since the addition of numerical dissipation is tantamount to intentionally introducing nonphysical behavior, it must be carefully controlled such that the error introduced is not excessive. In this Chapter, we discuss some diﬀerent ways of adding numerical dissipation to the spatial derivatives in the linear convection equation and hyperbolic systems of PDE’s. 203
204
CHAPTER 11. NUMERICAL DISSIPATION
11.1
OneSided FirstDerivative Space Diﬀerencing
We investigate the properties of onesided spatial diﬀerence operators in the context of the biconvection model equation given by ∂u ∂u = −a ∂t ∂x (11.1)
with periodic boundary conditions. Consider the following point operator for the spatial derivative term −a(δx u)j = −a [−(1 + β)uj−1 + 2βuj + (1 − β)uj+1] 2∆x −a = [(−uj−1 + uj+1) + β(−uj−1 + 2uj − uj+1)] 2∆x
(11.2)
The second form shown divides the operator into an antisymmetric component (−uj−1+ uj+1)/2∆x and a symmetric component β(−uj−1 + 2uj − uj+1)/2∆x. The antisymmetric component is the secondorder centered diﬀerence operator. With β = 0, the operator is only ﬁrstorder accurate. A backward diﬀerence operator is given by β = 1 and a forward diﬀerence operator is given by β = −1. For periodic boundary conditions the corresponding matrix operator is −aδx = −a Bp (−1 − β, 2β, 1 − β) 2∆x
The eigenvalues of this matrix are λm = −a 2πm β 1 − cos ∆x M + i sin 2πm M for m = 0, 1, . . . , M − 1
If a is positive, the forward diﬀerence operator (β = −1) produces Re(λm ) > 0, the centered diﬀerence operator (β = 0) produces Re(λm ) = 0, and the backward diﬀerence operator produces Re(λm ) < 0. Hence the forward diﬀerence operator is inherently unstable while the centered and backward operators are inherently stable. If a is negative, the roles are reversed. When Re(λm ) = 0, the solution will either grow or decay with time. In either case, our choice of diﬀerencing scheme produces nonphysical behavior. We proceed next to show why this occurs.
11.2. THE MODIFIED PARTIAL DIFFERENTIAL EQUATION
205
11.2
The Modiﬁed Partial Diﬀerential Equation
First carry out a Taylor series expansion of the terms in Eq. 11.2. We are lead to the expression 1 ∂u (δx u)j = 2∆x 2∆x ∂x − β∆x2
j
∂2u ∂x2
j
∆x3 + 3
∂3 u ∂x3
j
β∆x4 − 12
∂4 u ∂x4
+ . . .
j
We see that the antisymmetric portion of the operator introduces odd derivative terms in the truncation error while the symmetric portion introduces even derivatives. Substituting this into Eq. 11.1 gives ∂u ∂u aβ∆x ∂ 2 u a∆x2 ∂ 3 u aβ∆x3 ∂ 4 u = −a + − + +... ∂t ∂x 2 ∂x2 6 ∂x3 24 ∂x4
(11.3)
This is the partial diﬀerential equation we are really solving when we apply the approximation given by Eq. 11.2 to Eq. 11.1. Notice that Eq. 11.3 is consistent with Eq. 11.1, since the two equations are identical when ∆x → 0. However, when we use a computer to ﬁnd a numerical solution of the problem, ∆x can be small but it is not zero. This means that each term in the expansion given by Eq. 11.3 is excited to some degree. We refer to Eq. 11.3 as the modiﬁed partial diﬀerential equation. We proceed next to investigate the implications of this concept. Consider the simple linear partial diﬀerential equation ∂3u ∂4u ∂u ∂u ∂2u = −a +ν 2 +γ 3 +τ 4 ∂t ∂x ∂x ∂x ∂x (11.4)
Choose periodic boundary conditions and impose an initial condition u = eiκx . Under these conditions there is a wavelike solution to Eq. 11.4 of the form u(x, t) = eiκx e(r+is)t provided r and s satisfy the condition r + is = −iaκ − νκ2 − iγκ3 + τ κ4 or r = −κ2 (ν − τ κ2 ),
2 2
s = −κ(a + γκ2 )
2 )t]
The solution is composed of both amplitude and phase terms. Thus u = e−κ (ν−τ κ ) eiκ[x−(a+γκ amplitude phase (11.5)
11. and the phase depends only on a and γ.2. the coeﬃcients of the even derivatives in Eq. by using the antisymmetry of the matrix diﬀerence operators. any departure from antisymmetry in the matrix diﬀerence operator must introduce dissipation (or ampliﬁcation) into the modiﬁed PDE. Therefore. 11. The symmetric component approximates ∆x3 uxxxx/12. this operator produces fourthorder accuracy in phase with a thirdorder dissipative term. The resulting spatial operator is not onesided but it is dissipative.5. By examining the term governing the phase of the solution in Eq. Note that the use of onesided diﬀerencing schemes is not the only way to introduce dissipation. One can easily show. Referring to the modiﬁed PDE. We have seen that the threepoint centered diﬀerence approximation of the spatial derivative produces a modiﬁed PDE that has no dissipation (or ampliﬁcation). Under the same condition. the phase speed of the numerical solution is less than the actual phase speed. the numerical phase speed is dependent upon the wavenumber κ. Our purpose here is to investigate the properties of onesided spatial diﬀerencing operators relative to centered diﬀerence operators. . Eq. Therefore.4. the coeﬃcients of the odd derivatives. Biased schemes use more information on one side of the node than the other. one could choose β = 1/2 in Eq. 11. a thirdorder backwardbiased scheme is given by 1 (uj−2 − 6uj−1 + 3uj + 2uj+1) 6∆x 1 = [(uj−2 − 8uj−1 + 8uj+1 − uj+2) 12∆x + (uj−2 − 4uj−1 + 6uj − 4uj+1 + uj+2)] (δx u)j = (11. For example. As a corollary. Therefore. If the wave speed a is positive. we see that the speed of propagation is a + γκ2 . Furthermore.3 we have γ = −a∆x2 /6.6) The antisymmetric component of this operator is the fourthorder centered diﬀerence operator.206 CHAPTER 11. that the same is true for any centered diﬀerence approximation of a ﬁrst derivative. Any symmetric component in the spatial operator introduces dissipation (or ampliﬁcation). 11. the choice of a forward diﬀerence scheme (β = −1) is equivalent to deliberately adding a destabilizing term to the PDE. This we refer to as dispersion. the choice of a backward diﬀerence scheme (β = 1) produces a modiﬁed PDE with ν − τ κ2 > 0 and hence the amplitude of the solution decays. This is tantamount to deliberately adding dissipation to the PDE. NUMERICAL DISSIPATION It is important to notice that the amplitude of the solution depends only upon ν and τ .
3 The LaxWendroﬀ Method In order to introduce numerical dissipation using onesided diﬀerencing. Next we consider a method which introduces dissipation independent of the sign of the wave speed. ∂t ∂x ∂2u ∂2u = a2 2 ∂t2 ∂x (11. . . This explicit method diﬀers conceptually from the methods considered previously in which spatial diﬀerencing and timemarching are treated separately. known as the LaxWendroﬀ method. It is a fullydiscrete ﬁnitediﬀerence scheme. 1. . backward diﬀerencing must be used if the wave speed is positive. Thus ∂t ∂x ∂u ∂u = −a . For periodic boundary conditions. and forward diﬀerencing must be used if the wave speed is negative. . It is equal to the ratio of the distance travelled by a wave in one time step to the mesh spacing. t + h) = u + h ∂u 1 2 ∂ 2 u + h + O(h3) 2 ∂t 2 ∂t (11.7) First replace the time derivatives with space derivatives according to the PDE (in this case.11. Consider the following Taylorseries expansion in time: u(x. The quantity ∆x  is known as the Courant (or CFL) number.3. the linear convection equation ∂ u + a ∂ u = 0). There is no intermediate semidiscrete stage.9) This is the LaxWendroﬀ method applied to the linear convection equation. giving (n+1) uj = (n) uj 1 ah (n) 1 (n) (uj+1 − uj−1) + − 2 ∆x 2 ah ∆x 2 (uj+1 − 2uj + uj−1 ) (n) (n) (n) (11. .8) Now replace the space derivatives with threepoint centered diﬀerence operators. − + un un+1 = Bp 2 ∆x ∆x ∆x 2 ∆x ∆x The eigenvalues of this matrix are ah σm = 1 − ∆x 2 1 − cos 2πm M −i ah 2πm sin ∆x M for m = 0. the corresponding fullydiscrete matrix operator is 2 2 2 1 ah ah ah 1 ah ah + .1− . THE LAXWENDROFF METHOD 207 11. M − 1 ah For  ∆x  ≤ 1 all of the eigenvalues have modulus less than or equal to unity and hence ah the method is stable independent of the sign of a.
For the linear convection equation. 11. these should be applied alternately. NUMERICAL DISSIPATION The nature of the dissipative properties of the LaxWendroﬀ scheme can be seen by examining the modiﬁed partial diﬀerential equation. The two methods diﬀer when applied to nonlinear hyperbolic systems.11) which can be shown to be identical to the LaxWendroﬀ method. 11. . Recall MacCormack’s timemarching method. presented in Chapter 6: un+1 = un + hun ˜ 1 [un + un+1 + h˜n+1 ] un+1 = ˜ u 2 (11. however.1 a dissipative secondorder method is obtained. Therefore. Recall that the odd derivatives on the right side lead to unwanted dispersion and the even derivatives lead to dissipation (or ampliﬁcation. the leading error term in the LaxWendroﬀ method is dispersive and proportional to 3 a ∂3u a∆x2 2 ∂ u − (∆x2 − a2 h2 ) 3 = − (1 − Cn ) 3 6 ∂x 6 ∂x The dissipative term is proportional to − 4 a2 h ∂4u a2 h∆x2 2 ∂ u (∆x2 − a2 h2 ) 4 = − (1 − Cn ) 4 8 ∂x 8 ∂x This term has the appropriate sign and hence the scheme is truly dissipative as long as Cn ≤ 1.10) If we use ﬁrstorder backward diﬀerencing in the ﬁrst stage and ﬁrstorder forward diﬀerencing in the second stage. this approach leads to uj ˜ uj (n+1) (n+1) ah (n) (n) (uj − uj−1) ∆x 1 (n) ah (n+1) (n+1) (n+1) = [uj + uj ˜ − (˜j+1 − uj u ˜ )] 2 ∆x = uj − (n) (11. depending on the sign). The two leading error terms appear on the right side of the equation.208 CHAPTER 11. . for nonlinear problems. . 1 Or viceversa.8.9 and converting the time derivatives to space derivatives using Eq. A closely related method is that of MacCormack. ∂t ∂x 6 ∂x 8 ∂x This is derived by substituting Taylor series expansions for all terms in Eq. Hence MacCormack’s method has the same dissipative and dispersive properties as the LaxWendroﬀ method. which is given by ∂u a ∂u ∂ 3 u a2 h ∂4u +a = − (∆x2 − a2 h2 ) 3 − (∆x2 − a2 h2 ) 4 + .
12) where we do not make any assumptions as to the sign of a. Consider again the linear convection equation: ∂u ∂u +a =0 ∂t ∂x (11. some form of splitting is required. This is the basic concept behind upwind methods. For example. u − a. manner. the eigenvalues of the ﬂux Jacobian for the onedimensional Euler equations are u. whether it is a forward or a backward diﬀerence) or the sign of the symmetric component depends on the sign of the wave speed. then a+ = 0 and a− = a ≤ 0.5. that is. When a hyperbolic system of equations is being solved.e.. these are of mixed sign.11. then a = a ≥ 0 and a = 0. UPWIND SCHEMES 209 11. In order to apply onesided diﬀerencing schemes to such systems. by adding a symmetric component to the spatial operator. as a result of their superior ﬂexibility. From Eq. These are by no means unique. When the ﬂow is subsonic. if a ≤ 0. For more subtle aspects of implementation and application of such techniques to . but entirely equivalent. The above approach to obtaining a stable discretization independent of the sign of a can be written in a diﬀerent.12 as ∂u a ± a ∂u + (a+ + a− ) = 0 . In this section. the direction of the onesided operator (i. u + a. some decomposition or splitting of the ﬂuxes into terms which have positive and negative characteristic speeds so that appropriate diﬀerencing schemes can be chosen. Alternatively. 11.1. more generally. This is achieved by the following point operator: −a(δx u)j = −1 [a(−uj−1 + uj+1) + a(−uj−1 + 2uj − uj+1)] 2∆x (11. However. we see that a stable discretization is obtained with β = 1 if a ≥ 0 and with β = −1 if a ≤ 0.13) This approach is extended to systems of equations in Section 11. In the next two sections. the wave speeds can be both positive and negative.2. 11. We can rewrite Eq. we saw that numerical dissipation can be introduced in the spatial diﬀerence operator using onesided diﬀerence schemes or.4. schemes in which the numerical dissipation is introduced in the spatial operator are generally preferred over the LaxWendroﬀ approach. Now for the a+ (≥ 0) term we can safely backward diﬀerence and for the a− (≤ 0) term forward diﬀerence. a± = ∂t ∂x 2 + − If a ≥ 0. we present two splitting techniques commonly used with upwind methods. we present the basic ideas of ﬂuxvector and ﬂuxdiﬀerence splitting. With this approach. This is avoided in the LaxWendroﬀ scheme.4 Upwind Schemes In Section 11.
λi . In order to apply a onesided (or biased) spatial diﬀerencing scheme. Λ. NUMERICAL DISSIPATION nonlinear hyperbolic systems such as the Euler equations. and the wi ’s are the characteristic variables. 11.5 that a linear. Λ+ contains the positive eigenvalues and Λ− contains the negative eigenvalues. the reader is referred to the literature on this subject. are the eigenvalues of the Jacobian matrix. To accomplish this. A. λi . is positive.210 CHAPTER 11.18) The spatial terms have been split into two components according to the sign of the wave speeds.19) . constantcoeﬃcient. Premultiplying by X and inserting the product ∂x X −1 X in the spatial terms gives ∂Xw ∂XΛ+ X −1 Xw ∂XΛ− X −1 Xw + + =0 ∂t ∂x ∂x (11. We can now rewrite the system in terms of characteristic variables as ∂w ∂w ∂w ∂w ∂w +Λ = + Λ+ + Λ− =0 ∂t ∂x ∂t ∂x ∂x (11.1 FluxVector Splitting Recall from Section 2. We can use backward diﬀerencing for the Λ+ ∂ w term and forward ∂x diﬀerencing for the Λ− ∂ w term.15) (11.17) (11.16) With these deﬁnitions. 2 Λ− = Λ − Λ 2 (11. we need to apply a backward diﬀerence if the wave speed. let us split the matrix of eigenvalues.14) where the wave speeds.4. and a forward diﬀerence if the wave speed is negative. into two components such that Λ = Λ+ + Λ− where Λ+ = Λ + Λ . hyperbolic system of partial diﬀerential equations given by ∂u ∂f ∂u ∂u + = +A =0 ∂t ∂x ∂t ∂x can be decoupled into characteristic equations of the form ∂wi ∂wi + λi =0 ∂t ∂x (11.
as discussed in Appendix C.26) For the Euler equations. ∂u ∂f − = A− ∂u (11. This approach is known as ﬂuxvector splitting. Note that f = f+ + f− (11. . f = Au. For the Euler equations. 2 With these deﬁnitions A+ has all positive eigenvalues. and we can write ∂u ∂f + ∂f − + + =0 ∂t ∂x ∂x f − = A− u A− = XΛ−X −1 211 (11. the Jacobians of the split ﬂux vectors are required. UPWIND SCHEMES With the deﬁnitions2 A+ = XΛ+X −1 . one must ﬁnd and use the new Jacobians given by A++ = ∂f + . f is also equal to Au as a result of their homogeneous property. the deﬁnition of the split ﬂuxes follows directly from the deﬁnition of the ﬂux. A++ has eigenvalues which are all positive. and A− has all negative eigenvalues.21) (11.24) Thus by applying backward diﬀerences to the f+ term and forward diﬀerences to the f − term. and A−− has all negative eigenvalues.11.20) (11. When an implicit timemarching method is used.23) In the linear case.25) Therefore. ∂u A−− = ∂f − ∂u (11. we obtain ∂u ∂A+ u ∂A− u + + =0 ∂t ∂x ∂x Finally the split ﬂux vectors are deﬁned as f + = A+ u.4. ∂f + = A+ .22) (11. and recalling that u = Xw. we are in eﬀect solving the characteristic equations in the desired manner. In the nonlinear case.
29) where the subscript i indicates individual components of w and g.32) . constantcoeﬃcient. more suited to ﬁnitevolume methods. we require (ˆi )j+1/2 = g λi (wi)L if λi > 0 λi (wi )R if λi < 0 (11. wL and wR . 11.33) (11. which is nondissipative. as in Eq. is known as ﬂuxdiﬀerence splitting. we premultiply by X to return to the original variables and insert the product X −1 X after Λ and Λ to obtain 1 1 X gj+1/2 = XΛX −1 X (wL + wR ) + XΛX −1X (wL − wR ) ˆ 2 2 and thus 1 1 ˆ fj+1/2 = (fL + fR ) + A (uL − uR ) 2 2 where A = XΛX −1 (11. as a ˆ function of the states to the left and right of the interface. gj+1/2 . hyperbolic system of equations ∂w ∂w +Λ =0 ∂t ∂x (11.212 CHAPTER 11. This is achieved with 1 1 (ˆi)j+1/2 = λi [(wi )L + (wi )R ] + λi  [(wi )L − (wi )R ] g 2 2 or 1 1 gj+1/2 = Λ (wL + wR ) + Λ (wL − wR ) ˆ 2 2 (11. We again begin with the diagonalized form of the linear. the ﬂuxes must be evaluated at cell boundaries.27) The ﬂux vector associated with this form is g = Λw. respectively. NUMERICAL DISSIPATION 11. In a ﬁnitevolume method.31) (11. we consider the numerical ﬂux at the interface between nodes j and j + 1.2 FluxDiﬀerence Splitting Another approach. Now.19. The centered approximation to gj+1/2 .28) In order to obtain a onesided upwind approximation.4. is given by gj+1/2 = ˆ 1 (g(wL) + g(wR)) 2 (11.30) Now.34) (11. as in Chapter 5.
this leads to an upwind operator which is identical to that obtained using ﬂuxvector splitting. there is some ambiguity regarding the deﬁnition of A at the cell interface j + 1/2. 3 . if they are all negative. Hence satisfaction of Eq.37 is satisﬁed by the ﬂux Jacobian evaluated at the Roeaverage state given by √ √ ρL uL + ρR uR uj+1/2 = (11. in the nonlinear case.39) √ √ ρL + ρR where u and H = (e + p)/ρ are the velocity and the total enthalpy per unit mass.11.5 Artiﬁcial Dissipation We have seen that numerical dissipation can be introduced by using onesided differencing schemes together with some form of ﬂux splitting. constantcoeﬃcient case. consider a situation in which the eigenvalues of A are all of ˆ the same sign.38) √ √ ρL + ρR √ √ ρL HL + ρR HR Hj+1/2 = (11.37) (11.35 is obtained by the deﬁnition 1 1 ˆ fj+1/2 = (fL + fR ) + Aj+1/2  (uL − uR ) 2 2 if Aj+1/2 satisﬁes fL − fR = Aj+1/2 (uL − uR ) (11. 11. see problem 6 at the end of this chapter. u = Xw. A = A. Note that the ﬂux Jacobian can be written in terms of u and H only. A = −A.3 11. Thus we can generalize the concept of upwinding to include any scheme in which the symmetric portion of the operator is treated in such a manner as to be truly dissipative. In the linear. respectively. we would like our deﬁnition of fj+1/2 to satisfy ˆ fj+1/2 = fL if all λi s > 0 fR if all λi s < 0 (11. ARTIFICIAL DISSIPATION 213 and we have also used the relations f = Xg. In this case.36) For the Euler equations for a perfect gas. Eq. We have also seen that such dissipation can be introduced by adding a symmetric component to an antisymmetric (dissipationfree) operator. 11.5. and A = XΛX−1 . However. In order to resolve this. If the eigenvalues of A are all positive.35) giving pure upwinding.
applying δx = δx − δx is stable if λi ≤ 0 and unstable if λi > 0. Similarly. 11. let a (δx u)j = CHAPTER 11. 11.36 and 11. but is beyond the scope of this book. A secondorder backward diﬀerence approximation to a 1st derivative is given as a point operator by (δx u)j = 1 (uj−2 − 4uj−1 + 3uj ) 2∆x . a s With appropriate choices of δx and δx .43) where A is deﬁned in Eq. as in the previous two sections.42) or a s δx f = δx f + δx (Au) (11. gives a s δx (Au) = δx (Au) + δx (Au) (11. The second spatial term is known as artiﬁcial dissipation.41) Extension to a hyperbolic system by applying the above approach to the characteristic variables. A more complicated treatment of the numerical dissipation is also required near shock waves and other discontinuities. this often provides suﬃcient damping of high frequency modes without greatly aﬀecting the low frequency modes. This is particularly evident from a comparison of Eqs. 11. NUMERICAL DISSIPATION uj+1 − uj−1 .6 Problems 1. s It is common to use the following operator for δx s (δx u)j = ∆x (uj−2 − 4uj−1 + 6uj − 4uj+1 + uj+2) (11.44) where is a problemdependent coeﬃcient.40) a s Applying δx = δx + δx to the spatial derivative in Eq. The appropriate implementation is thus a s λi δx = λi δx + λi δx (11.15 is stable if λi ≥ 0 and a s unstable if λi < 0. For details of how this can be implemented for nonlinear hyperbolic systems. the reader should consult the literature. this approach can be related to the upwind approach. This symmetric operator approximates ∆x3 uxxxx and thus introduces a thirdorder dissipative term.34. 11. 2∆x s (δx u)j = −uj+1 + 2uj − uj−1 2∆x (11. It is also sometimes referred to as artiﬁcial diﬀusion or artiﬁcial viscosity. With an appropriate value of .214 For example.43.
ﬁnd σ for the combination of this spatial operator with 4thorder RungeKutta time marching at a Courant number of unity and plot vs. Find the modiﬁed wavenumber for this operator with = 0. Show that the use of the Roe average state given in Eqs. 11.39 leads to satisfaction of Eq. κ∆x for 0 ≤ κ∆x ≤ π. 1/12. Find the matrix A. 2. Using Fourier analysis as in Section 6. Consider the hyperbolic system derived in problem 8 of Chapter 2. PROBLEMS 215 (a) Express this operator in banded matrix form (for periodic boundary conditions). Plot the real and imaginary parts of κ∗ ∆x vs.1. ﬁnd σ for the combination of this spatial operator with 4thorder RungeKutta time marching at a Courant number of unity and plot vs.44. 11. 4. Show that the ﬂux Jacobian for the 1D Euler equations can be written in terms of u and H. 11. κ∆x for 0 ≤ κ∆x ≤ π.38 and 11. Consider the spatial operator obtained by combining secondorder centered differences with the symmetric operator given in Eq. Plot the real and imaginary parts of κ∗ ∆x vs. (See Appendix A. κ∆x for 0 ≤ κ∆x ≤ π.) (b) Using a Taylor table. Find the modiﬁed wavenumber for the operator given in Eq. .6. then derive the symmetric and skewsymmetric matrices that have the matrix operator as their sum.6. Using Fourier analysis as in Section 6. ﬁnd σ for the combination of this spatial operator with 4thorder RungeKutta time marching at a Courant number of unity and plot vs. 6. Find the modiﬁed wavenumber for the ﬁrstorder backward diﬀerence operator. 1/24.2. Plot the real and imaginary parts of κ∗ ∆x vs. 5. 11.37.11. Form the plusminus split ﬂux vectors as in Section 11. Using Fourier analysis as in Section 6.4.6. and 1/48. κ∆x for 0 ≤ κ∆x ≤ π. ﬁnd the derivative which is approximated by the corresponding symmetric and skewsymmetric operators and the leading error term for each. κ∆x for 0 ≤ κ∆x ≤ π.6. κ∆x for 0 ≤ κ∆x ≤ π.6. 3.3 to see how to construct the symmetric and skewsymmetric components of a matrix.2.2.
NUMERICAL DISSIPATION .216 CHAPTER 11.
the concept of factoring is quite simple to present and grasp. Now recall the generic ODE’s produced by the semidiscrete approach du = Au − f dt 217 (12. When we approach numerical analysis in the light of matrix derivative operators. This gives the reader a feel for some of the modiﬁcations which can be made to the basic algorithms in order to obtain eﬃcient solvers for practical multidimensional applications. Let us start with the following observations: 1. They are the basis for a wide variety of methods variously known by the labels “hybrid”. and “fractional step”. Advancing to the next time level always requires some reference to a previous one. Matrices can be split in quite arbitrary ways. Factored forms are especially useful for the derivation of practical algorithms that use implicit methods. h. 3. 2. “time split”.1 The Concept Factored forms of numerical operators are used extensively in constructing and applying numerical methods to problems in ﬂuid mechanics. Time marching methods are valid only to some order of accuracy in the step size. we present and analyze split and factored algorithms.Chapter 12 SPLIT AND FACTORED FORMS In the next two chapters. and a means for analyzing such modiﬁed forms. 12.1) .
from observation 2 (new data un+1 in terms of old un ): un+1 = [ I + hA1 + hA2 ]un − hf + O(h2 ) or its equivalent un+1 = [ I + hA1 ][ I + hA2 ] − h2 A1 A2 un − hf + O(h2 ) Finally. Choose again the explicit Euler time march so that un+1 = [ I + hAd + hAc ]un + h(bc) + O(h2 ) 1 (12. The semidiscrete approach to its solution might be put in the form du = Ac u + Ad u + (bc) dt (12.5) where Ac and Ad are matrices representing the convection and dissipation terms.2 Factoring Physical Representations — Time Splitting Suppose we have a PDE that represents both the processes of convection and dissipation. respectively. For the ti