
Vemuri Balakotaiah, Ram R. Ratnakar
Applied Linear Analysis for Chemical Engineers
Also of Interest
Chemical Reaction Technology
Dmitry Yu. Murzin, 2022
ISBN 978-3-11-071252-0, e-ISBN (PDF) 978-3-11-071255-1,
e-ISBN (EPUB) 978-3-11-071260-5

Non-equilibrium Thermodynamics and Physical Kinetics


Halid Bikkin, Igor I. Lyapilin, 2021
ISBN 978-3-11-072706-7, e-ISBN (PDF) 978-3-11-072719-7,
e-ISBN (EPUB) 978-3-11-072738-8

Process Technology. An Introduction


André B. de Haan, Johan T. Padding, 2022
ISBN 978-3-11-071243-8, e-ISBN (PDF) 978-3-11-071244-5,
e-ISBN (EPUB) 978-3-11-071246-9

Multi-level Mixed-Integer Optimization. Parametric Programming Approach
Styliani Avraamidou, Efstratios Pistikopoulos, 2022
ISBN 978-3-11-076030-9, e-ISBN (PDF) 978-3-11-076031-6,
e-ISBN (EPUB) 978-3-11-076038-5

Data Science. Time Complexity, Inferential Uncertainty, and Spacekime Analytics
Ivo D. Dinov, Milen Velchev Velev, 2021
ISBN 978-3-11-069780-3, e-ISBN (PDF) 978-3-11-069782-7,
e-ISBN (EPUB) 978-3-11-069797-1

Outliers in Control Engineering. Fractional Calculus Perspective


Paweł D. Domański, YangQuan Chen, Maciej Ławryńczuk, 2022
ISBN 978-3-11-072907-8, e-ISBN (PDF) 978-3-11-072912-2,
e-ISBN (EPUB) 978-3-11-072913-9
Vemuri Balakotaiah, Ram R. Ratnakar

Applied Linear Analysis for Chemical Engineers
A Multi-scale Approach with Mathematica®
Authors

Prof. Vemuri Balakotaiah
University of Houston
Dept of Chemical and Biomolecular Engineering
4800 Calhoun Road
Houston, TX 77204-4004
USA
bala@uh.edu

Dr. Ram R. Ratnakar
Shell International Exploration & Production
Houston, TX 77082
USA
Ram.Ratnakar@shell.com

The citation of registered names, trade names, trade marks, etc. in this work does not imply, even in
the absence of a specific statement, that such names are exempt from laws and regulations
protecting trade marks etc. and therefore free for general use.

ISBN 978-3-11-073969-5
e-ISBN (PDF) 978-3-11-073970-1
e-ISBN (EPUB) 978-3-11-073978-7

Library of Congress Control Number: 2022944897

Bibliographic information published by the Deutsche Nationalbibliothek


The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2023 Walter de Gruyter GmbH, Berlin/Boston


Cover image: akinbostanci / iStock / Getty Images Plus
Typesetting: VTeX UAB, Lithuania
Printing and binding: CPI books GmbH, Leck

www.degruyter.com
Preface
This book is based on a course that the first author taught at the University of Hous-
ton for about 30 years. This course was a requirement for all first-year graduate stu-
dents and was a prerequisite for two other optional courses, taken mostly by graduate
students whose research involved modeling, computational and nonlinear analysis.
As we state in the Introduction, while this book deals only with the solution of lin-
ear equations, linear analysis is the foundation of all numerical and nonlinear tech-
niques.
Since there are many books already available on applied mathematics for chem-
ical engineers, it is fair to ask the question: why another book? For this, our response
is that every author has a unique perspective that may be appealing to others. Fur-
ther, the authors are not aware of any book that deals exclusively with the solution
of various linear equations that arise in engineering in a unified manner, and with
examples.
The senior author had the pleasure of taking the applied mathematics course
from Professor Neal R. Amundson and later teaching the same course when Pro-
fessor Amundson retired. Both authors have used the material extensively in their
own research and would like to point out the following highlights of the material
presented: (i) use of symbolic software (Mathematica® ) for illustrating and enhanc-
ing the impact of physical parameter changes on solutions, (ii) multiscale analysis
of chemical engineering problems with physical interpretation of time and length
scales in terms of eigenvalues and eigenvectors/eigenfunctions, (iii) detailed dis-
cussion of compartment models for various finite-dimensional problems and their
solution in phase spaces, (iv) evaluation and illustration of functions of matrices
(and use of symbolic manipulation) to solve multicomponent diffusion-convection-
reaction problems, (v) illustration of the techniques and interpretation of solutions
to several classical chemical engineering and related problems, (vi) emphasis on
the connection between discrete (matrix algebra) and continuum models (initial,
boundary and initial-boundary value problems), (vii) physical interpretation of ad-
joint operator and adjoint systems and their application in solving inverse problems
and (viii) use of complex analysis and algebra in the solution of practical engineering
problems.
The senior author has taught most of the contents of Parts I, III, IV and V in a single
semester (14 weeks or 28 lectures of 90 minutes duration). However, the entire con-
tents of the book can be taught in a two-semester course. For a single semester course,
we recommend covering Chapters 1 to 5, 14, 17, selected sections of Chapters 18 to 21
and 23 to 25.
We wish to acknowledge many colleagues, former students and our mentors who
over the years contributed to our understanding and organization of the subject.


We also want to thank Karin Sora, Nadja Schedensack and Vilma Vaičeliūnienė of
De Gruyter for their help during production.
The second author wishes to acknowledge the constant encouragement and support of his family, especially his eldest brother Siddhesh Satyakar.
Finally, the first author wishes to acknowledge the patience and understanding of his wife, Nalini Vemuri, and dedicates this book to her with affection and gratitude.
Introduction
This book deals with the solution of linear equations. We discuss the solutions of lin-
ear algebraic equations, linear initial value problems, linear boundary value prob-
lems, linear integral equations and linear partial differential equations along with
their application to various chemical engineering problems.
It should be pointed out that most practical problems encountered by engineers
are nonlinear and are often solved on a computer using numerical techniques. In most
cases, the nonlinear problem is linearized around a known or approximate solution
and a more accurate solution is obtained by solving a sequence of linear problems. The
nonlinear methods of analysis as well as the numerical techniques used by engineers
draw heavily from the linear analysis. In other words, linear analysis is the foundation
of all nonlinear and numerical techniques.
Generally speaking, most linear problems that arise in applications may be clas-
sified into two groups: (i) problems describing the steady state or equilibrium state of
a physical system and (ii) problems describing the dynamic or transient behavior of
a physical system. The first type of problems are described by linear equations of the
form

Lu = f (1)

where L is a linear operator, u is a state vector and f is a source function. For example,
in finite dimensions, equation (1) may be a set of n linear algebraic equations in n
unknowns,

Au = b (2)

where A is an n × n matrix, and u and b are n × 1 vectors. When the state vector u belongs to an
infinite-dimensional space, equation (1) may be a two-point boundary value problem
such as

$$-\frac{d}{dx}\left(p(x)\frac{du}{dx}\right) + q(x)\,u = f(x), \quad a < x < b \tag{3}$$
$$u(a) = u(b) = 0 \tag{4}$$

or an integral equation such as the Fredholm integral equation of the first kind given
by

$$\int_a^b K(x, s)\,u(s)\,ds = f(x) \tag{5}$$

or a partial differential equation such as the Poisson’s equation


$$-\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) = f(x, y) \quad \text{in } \Omega \tag{6}$$
$$u = 0 \quad \text{on } \partial\Omega \tag{7}$$

where Ω is some domain in the x-y plane and 𝜕Ω is its boundary.


The second class of problems are of the form

$$\frac{du}{dt} = Lu, \quad t > 0 \tag{8}$$
$$u = u_0 \quad \text{at } t = 0 \tag{9}$$

where t is the time and the evolution equation (8) describes the system behavior for
t > 0, while equation (9) gives the initial condition. In the simpler case of the finite-
dimensional problems, equations (8)–(9) may be of the form

$$\frac{du}{dt} = Au \tag{10}$$
$$u = u_0 \quad \text{at } t = 0 \tag{11}$$

where A is a constant coefficient n × n matrix, u is an n × 1 vector of state variables and u0
is a n × 1 vector of initial conditions. An example of an initial value problem in infinite
dimensions is the heat equation in one spatial coordinate and time:

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}; \quad 0 < x < 1,\ t > 0 \tag{12}$$
$$u(0, t) = u(1, t) = 0 \quad \text{(boundary conditions)} \tag{13}$$
$$u(x, 0) = f(x) \quad \text{(initial condition)} \tag{14}$$

We shall see that many of the concepts involved in the solution of linear ordinary
and partial differential equations are generalizations of the ideas involved in the solu-
tion of the finite-dimensional problems represented by equations (2) and (10). There-
fore, we shall focus first on the finite-dimensional case.

Properties of solutions to linear equations


When the matrix A is invertible, the solution of equation (2) may be expressed as
$$u = \sum_{j=1}^{n} \frac{1}{\lambda_j}\, c_j\, x_j \tag{15}$$

where the scalars λj (eigenvalues) and the (eigen)vectors xj depend only on the ma-
trix A, while the constants cj are given by

$$c_j = \frac{\langle b, y_j \rangle}{\langle x_j, y_j \rangle}, \tag{16}$$

where yj are known as the left eigenvectors of A. Here, ⟨x, y⟩ denotes the dot or inner
product of vectors. When the matrix A is symmetric (self-adjoint), xj = yj and the
eigenvectors are normalized to have unit length (⟨xj , xj ⟩ = 1), the expression for the
constants cj simplifies to

$$c_j = \langle b, x_j \rangle. \tag{17}$$

The above form of the solution has advantages over the direct solution (e. g., by Gaus-
sian elimination) when n is large. For example, when A is symmetric and the eigenval-
ues are well separated (0 < |λ1 | ≪ |λ2 | ≪ |λ3 | ≪ ⋅ ⋅ ⋅ ≪ |λn |), the first few terms may be
sufficient to compute the solution if the desired accuracy is not high. A second advan-
tage is that the solution has the same form for all linear equations of the form given
by (1). For example, when the linear (differential/integral) operator L is symmetric,
the same solution is applicable with a slight modification:

$$u = \sum_{j=1}^{\infty} \frac{1}{\lambda_j}\, c_j\, \phi_j; \qquad c_j = \langle f, \phi_j \rangle, \tag{18}$$

where λj are the eigenvalues and ϕj are the normalized eigenfunctions of the opera-
tor L.
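
For a finite-dimensional symmetric A, the expansion (15) with coefficients (17) is easy to verify in Mathematica. The following is a minimal sketch; the matrix and right-hand side are made up for illustration:

(* Eigenvector-expansion solution of Au = b for symmetric A, per (15) and (17) *)
a = N[{{4, 1, 0}, {1, 3, 1}, {0, 1, 2}}];  (* illustrative symmetric matrix *)
b = {1., 2., 3.};
{vals, vecs} = Eigensystem[a];
vecs = Normalize /@ vecs;  (* unit-length eigenvectors, so cj = <b, xj> *)
u = Sum[(vecs[[j]].b/vals[[j]]) vecs[[j]], {j, Length[b]}];
Chop[u - LinearSolve[a, b]]  (* {0, 0, 0}: the expansion matches the direct solution *)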
The solution of the initial value problem, equations (10)–(11) may be expressed as

$$u(t) = \sum_{j=1}^{n} c_j\, e^{\lambda_j t}\, x_j; \qquad c_j = \frac{\langle u_0, y_j \rangle}{\langle x_j, y_j \rangle} \tag{19}$$

which for the case of symmetric matrix simplifies to cj = ⟨u0 , xj ⟩. The generalization
of this result for the case of a symmetric differential operator is

$$u(t) = \sum_{j=1}^{\infty} c_j\, e^{\lambda_j t}\, \phi_j; \qquad c_j = \langle f, \phi_j \rangle. \tag{20}$$
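
The modal form (19) can be checked the same way; for a symmetric matrix, cj = ⟨u0 , xj ⟩, and the expansion must agree with the matrix exponential solution u(t) = e^{At} u0. A sketch with a made-up matrix:

(* Modal solution of du/dt = Au, u(0) = u0, per equation (19) *)
a = N[{{-2, 1}, {1, -3}}]; u0 = {1., 0.};
{vals, vecs} = Eigensystem[a];
vecs = Normalize /@ vecs;
u[t_] := Sum[(vecs[[j]].u0) Exp[vals[[j]] t] vecs[[j]], {j, 2}];
Chop[u[1.] - MatrixExp[1. a].u0]  (* {0, 0}: agrees with the matrix exponential *)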

An important observation regarding the various solutions to the linear equations is


that they are all expressed in terms of the eigenvalues and eigenfunctions of the oper-
ator appearing in the equation. These eigenvalues and eigenfunctions correspond to
various time and length scales that are of interest in the physical system. An important
task of linear analysis is the identification of these length and time scales and relating
them to the parameters of the physical system. We hope to illustrate this for various
chemical engineering problems.
Contents
Preface | V

Introduction | VII

Part I: Applied matrix algebra

1 Matrices and linear algebraic equations | 3


1.1 Simultaneous linear equations | 3
1.2 Review of basic matrix operations | 4
1.2.1 Matrix addition and subtraction | 5
1.2.2 Matrix multiplication | 5
1.2.3 Special matrices | 6
1.3 Elementary row operations and row echelon form of a matrix | 7
1.3.1 Representation of elementary row operations | 8
1.4 Rank of a matrix and condition for existence of solutions | 10
1.4.1 The homogeneous system Au = 0 | 10
1.4.2 The inhomogeneous system Au = b | 12
1.5 Gaussian elimination and LU decomposition | 15
1.5.1 Lower and upper triangular systems | 15
1.5.2 Gaussian elimination | 16
1.5.3 LU decomposition/factorization | 18
1.6 Inverse of a square matrix | 19
1.6.1 Properties of inverse | 19
1.6.2 Calculation of inverse | 20
1.7 Vector-matrix formulation of some chemical engineering
problems | 21
1.7.1 Batch reactor: evolution equations with multiple reactions | 22
1.7.2 Continuous-flow stirred tank reactor (CSTR): transient and steady-state
models with multiple reactions | 24
1.7.3 Two interacting tank system: transient model for mixing with in- and
outflows | 25
1.7.4 Models for transient diffusion, convection and diffusion-convection
(compartment models) | 27
1.8 Application of elementary matrix concepts | 30
1.9 Application of computer algebra and symbolic manipulation | 33
1.9.1 Example 1: mass transfer disguised matrix for a five species
system | 35
1.9.2 Example 2: mass transfer disguised matrix for a ten species
system | 36

2 Determinants | 44
2.1 Definition of determinant | 44
2.2 Properties of the determinant | 46
2.3 Computation of determinant by pivotal condensation | 48
2.4 Minors, cofactors and Laplace’s expansion | 49
2.4.1 Classical adjoint and inverse matrices | 51
2.5 Determinant of the product of two matrices | 52
2.6 Rank of a matrix defined in terms of determinants | 53
2.7 Solution of Au = 0 and Au = b by Cramer’s rule | 54
2.8 Differentiation of a determinant | 56
2.9 Applications of determinants | 57

3 Vectors and vector expansions | 65


3.1 Linear dependence, basis and dimension | 66
3.2 Dot or scalar product of vectors | 67
3.3 Linear algebraic equations | 70
3.4 Applications of vectors and vector expansions | 72
3.4.1 Stoichiometry | 72
3.4.2 Dimensional analysis | 74
3.5 Application of computer algebra and symbolic manipulation | 76
3.5.1 Determination of independent reactions | 76

4 Solution of linear equations by eigenvector expansions | 82


4.1 The matrix eigenvalue problem | 82
4.2 Left eigenvectors and the adjoint eigenvalue problem (eigenrows) | 85
4.3 Properties of eigenvectors/eigenrows | 87
4.4 Orthogonal and biorthogonal expansions | 96
4.4.1 Vector expansions | 96
4.4.2 Orthogonal expansions | 96
4.4.3 Biorthogonal expansions | 98
4.5 Solution of linear equations using eigenvector expansions | 99
4.5.1 Solution of linear algebraic equations Au = b | 99
4.5.2 (Fredholm alternative): solution of linear algebraic equations Au = b
when A is singular | 100
4.5.3 Linear coupled first-order differential equations with constant
coefficients | 102
4.5.4 Linear coupled inhomogeneous equations | 104
4.5.5 A second-order vector initial value problem | 105
4.5.6 Multicomponent diffusion and reaction in a catalyst pore | 107
4.6 Diagonalization of matrices and similarity transforms | 109
4.6.1 Examples of similarity transforms | 110
4.6.2 Canonical form | 112

4.6.3 Similarity transform when AT = A | 114

5 Solution of linear equations containing a square matrix | 122


5.1 Cayley–Hamilton theorem | 122
5.2 Functions of matrices | 125
5.3 Formal solutions of linear differential equations containing a square
matrix | 129
5.4 Sylvester’s theorem | 131
5.5 Spectral theorem | 135
5.6 Projection operators and vector projections | 142
5.6.1 Standard basis and projection in ℝ2 | 142
5.6.2 Nonorthogonal projections | 143
5.6.3 Geometric interpretation with real and negative eigenvalues | 145
5.6.4 Geometrical interpretation with complex eigenvalues with negative real
part | 148
5.6.5 Geometrical interpretation with one zero eigenvalue | 149
5.6.6 Physical and geometrical interpretation of transient behavior of
interacting tank systems for various initial conditions | 150

6 Generalized eigenvectors and canonical forms | 160


6.1 Repeated eigenvalues and generalized eigenvectors | 160
6.1.1 Linearly independent solutions of du/dt = Au with repeated eigenvalues | 161
6.1.2 Examples of repeated EVs and GEVs | 162
6.2 Jordan canonical forms | 164
6.3 Multiple eigenvalues and generalized eigenvectors | 166
6.4 Determination of f (A) when A has multiple eigenvalues | 173
6.5 Application of Jordan canonical form to differential equations | 175

7 Quadratic forms, positive definite matrices and other applications | 179


7.1 Quadratic forms | 179
7.2 Positive definite matrices | 183
7.3 Rayleigh quotient | 184
7.4 Maxima/minima for a function of several variables | 185
7.5 Linear difference equations | 190
7.6 Generalized inverse and least square solutions | 196

Part II: Abstract vector space concepts

8 Vector space over a field | 207


8.1 Definition of a field | 207

8.2 Definition of an abstract vector or linear space | 208


8.2.1 Subspaces | 209
8.2.2 Bases and dimension | 210
8.2.3 Coordinates | 211

9 Linear transformations | 214


9.1 Definition of a linear transformation | 214
9.2 Matrix representation of a linear transformation | 216
9.2.1 Change of basis | 221
9.2.2 Kernel and range of a linear transformation | 222
9.2.3 Relation to linear equations | 223
9.2.4 Isomorphism | 224
9.2.5 Inverse of a linear transformation | 225

10 Normed and inner product vector spaces | 229


10.1 Definition of normed linear spaces | 229
10.2 Inner product vector spaces | 231
10.2.1 Gram–Schmidt orthogonalization procedure | 236
10.3 Linear functionals and adjoints | 238

11 Applications of finite-dimensional linear algebra | 253


11.1 Weighted dot/inner product in ℝn | 253
11.2 Application of weighted inner product to interacting tank
systems | 258
11.3 Application of weighted inner product to monomolecular
kinetics | 262

Part III: Linear ordinary differential equations-initial value problems, complex variables and Laplace transform

12 The linear initial value problem | 277


12.1 The vector initial value problem | 277
12.2 The n-th order initial value problem | 280
12.2.1 The n-th order inhomogeneous equation | 284
12.3 Linear IVPs with constant coefficients | 286

13 Linear systems with periodic coefficients | 292


13.1 Scalar equation with a periodic coefficient | 292
13.2 Vector equation with periodic coefficient matrix | 295

14 Analytic solutions, adjoints and integrating factors | 302



14.1 Analytic solutions | 302


14.2 Adjoints and integrating factors | 307
14.2.1 First-order equation | 307
14.2.2 Second-order equation | 308
14.3 Relationship between solutions of Lu = 0 and L∗ v = 0 | 310
14.4 Vector initial value problem | 310

15 Introduction to the theory of functions of a complex variable | 318


15.1 Complex valued functions | 318
15.1.1 Algebraic operations with complex numbers | 318
15.1.2 Polar form of complex numbers | 318
15.1.3 Roots of complex numbers | 320
15.1.4 Complex-valued functions | 320
15.2 Limits, continuity and differentiation | 321
15.2.1 Limits | 321
15.2.2 Continuity | 321
15.2.3 Derivative | 321
15.2.4 The Cauchy–Riemann equations | 322
15.2.5 Some elementary functions of a complex variable | 323
15.2.6 Zeros and singular points of complex-valued functions | 325
15.3 Complex integration, Cauchy’s theorem and integral formulas | 326
15.3.1 Simply and multiply connected domains | 327
15.3.2 Contour integrals and traversal of a closed path | 327
15.3.3 Cauchy’s theorem | 328
15.3.4 Cauchy’s integral formulas | 330
15.4 Infinite series: Taylor’s and Laurent’s series | 332
15.4.1 Taylor’s series | 333
15.4.2 Practical methods of obtaining power series | 334
15.4.3 Laurent series | 334
15.5 The residue theorem and integration by the method of residues | 335
15.5.1 Other methods for evaluating residues | 338
15.5.2 Residue theorem | 340

16 Series solutions and special functions | 344


16.1 Series solution of a first-order ODE | 344
16.2 Ordinary and regular singular points | 345
16.3 Series solutions of second-order ODEs | 350
16.4 Special functions defined by second-order ODEs | 353
16.4.1 Airy equation | 353
16.4.2 Bessel equation | 354
16.4.3 Modified Bessel equation | 354
16.4.4 Spherical Bessel equation | 356

16.4.5 Legendre equation | 357


16.4.6 Associated Legendre equation | 358
16.4.7 Hermite’s equation | 359
16.4.8 Laguerre’s equation | 360
16.4.9 Chebyshev’s equation | 360

17 Laplace transforms | 361


17.1 Definition of Laplace transform | 361
17.2 Properties of Laplace transform | 363
17.2.1 Examples of Laplace transform | 366
17.3 Inversion of Laplace transform | 369
17.3.1 Bromwich’s complex inversion formula | 371
17.4 Solution of linear differential equations by Laplace transform | 374
17.4.1 Initial value problems with constant coefficients | 374
17.4.2 Elementary derivation of Heaviside’s formula | 377
17.4.3 Two-point boundary value problems | 381
17.4.4 Linear ODEs with variable coefficients | 382
17.4.5 Simultaneous ODEs with constant coefficients | 383
17.5 Solution of linear differential/partial differential equations by Laplace
transform | 384
17.5.1 Heat transfer in a finite slab | 385
17.5.2 TAP reactor model | 386
17.5.3 Dispersion of tracers in unidirectional flow | 389
17.5.4 Unsteady-state operation of a packed-bed | 398
17.6 Control system with delayed feedback | 404
17.6.1 PI control with delayed feedback | 404

Part IV: Linear ordinary differential equations-boundary value problems

18 Two-point boundary value problems | 423


18.1 The adjoint differential operator | 423
18.1.1 The Lagrange identity for an n-th order linear differential
operator | 426
18.2 Two-point boundary value problems | 428
18.3 The adjoint boundary value problem | 434
18.3.1 Adjoint BCs and conditions for self-adjointness of the BVP | 438

19 The nonhomogeneous BVP and Green’s function | 445


19.1 Introduction to Green’s function | 445
19.2 Green’s function for second-order self-adjoint TPBVP | 447

19.3 Properties of the Green’s function for the second-order self-adjoint


BVP | 454
19.4 Green’s function for the n-th order TPBVP | 458
19.4.1 Physical interpretation of the Green’s function | 466
19.5 Solution of TPBVP with inhomogeneous boundary conditions | 471

20 Eigenvalue problems for differential operators | 478


20.1 Definition of eigenvalue problems | 478
20.2 Determination of the eigenvalues | 480
20.2.1 Relationship between the n-th order eigenvalue problem and the vector
eigenvalue problem | 481
20.3 Properties of the characteristic equation | 483

21 Sturm–Liouville theory and eigenfunction expansions | 496


21.1 Sturm–Liouville theory | 496
21.2 Eigenfunction expansions | 503
21.3 Convergence in function spaces and introduction to Banach and Hilbert
spaces | 505
21.3.1 Cauchy sequence | 505
21.3.2 Riemann and Lebesgue integration | 506
21.3.3 Banach and Hilbert spaces | 506
21.3.4 Convergence theorems for eigenfunction expansions | 507
21.3.5 Fourier series (eigenfunction expansions) and Parseval’s
theorem | 508
21.3.6 Example of Fourier series (eigenfunction expansions) | 509
21.3.7 Fourier series (eigenfunction expansion) of the Green’s function | 514

22 Introduction to the solution of linear integral equations | 520


22.1 Introduction | 520
22.2 Transformation of an IVP into an IE of Volterra type | 521
22.3 Transformation of TPBVP into an IE of Fredholm type | 523
22.4 Solution of Fredholm integral equations with separable kernels | 524
22.4.1 Homogeneous equation | 524
22.4.2 Inhomogeneous equation | 526
22.5 Solution procedure for Volterra integral equations of the second
kind | 530
22.5.1 Method of successive approximation | 530
22.5.2 Adomian decomposition method | 533
22.6 Solution procedure for Volterra integral equations of the first
kind | 534
22.6.1 Differentiation approach | 534
22.6.2 Integration approach | 535

22.7 Volterra integral equations with convolution kernel | 536


22.8 Fredholm integral equations of the second kind | 538
22.8.1 Solution by successive substitution | 538
22.8.2 Solution by Adomian decomposition method | 539
22.9 Fredholm integral equations with symmetric kernels | 540
22.10 Adjoint operator and Fredholm alternative | 542
22.11 Solution of FIE of the second kind with symmetric kernels | 543

Part V: Fourier transforms and solution of boundary and initial-boundary value problems

23 Finite Fourier transforms | 551


23.1 Definition and general properties | 551
23.1.1 Example 1 (solution of Poisson’s equation) | 552
23.1.2 Example 2 (solution of heat/diffusion equation) | 553
23.1.3 Example 3 (solution of the wave equation) | 553
23.2 Application of FFT for BVPs in 1D | 554
23.2.1 Example 1 (Poisson’s equation in 1-D) | 554
23.2.2 Example 2: higher-order boundary value problems (coupled equations)
in 1D | 559
23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent
variables) | 560
23.3.1 Example 3: heat/diffusion equation in a finite domain | 560
23.3.2 Example 4: Green’s function for the heat/diffusion equation in a finite
domain | 565
23.3.3 Example 5: heat/diffusion equation in the finite domain with time
dependent boundary condition | 566
23.3.4 Example 6: heat/diffusion equation in a finite domain with general
initial and boundary conditions | 569
23.3.5 Example 7 (wave equation) | 569
23.3.6 Example 8 (Poisson’s equation in 2-D) | 571
23.4 Additional applications of FFT in rectangular coordinates | 576
23.4.1 Example 9 (diffusion and reaction in a catalyst cube) | 576
23.4.2 Example 10 (axial dispersion model) | 579
23.4.3 Example 11 (Fourier’s ring problem) | 588
23.4.4 Example 12: (coupled equations) reaction–diffusion equations | 590

24 Fourier transforms on infinite intervals | 598


24.1 Fourier transform on (−∞, ∞) | 598
24.1.1 Fourier integral formula | 600
24.2 Finite Fourier transform and the Fourier transform | 602

24.2.1 Physical interpretation | 604


24.2.2 Properties of the Fourier transform | 605
24.2.3 Moments theorem for Fourier transform | 607
24.2.4 Fourier transform in spatial and cyclic frequencies | 609
24.2.5 Fourier transform and Plancherel’s theorem | 611
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 612
24.3.1 Heat equation in an infinite rod | 612
24.3.2 Solution of the heat equation in semi-infinite domain | 618
24.3.3 Transforms on the half-line | 623
24.3.4 Solution of heat/diffusion equation with radiation BC | 625
24.3.5 Fourier transforms on an infinite domain: solution of the wave
equation | 628
24.3.6 Laplace’s equation in infinite and semi-infinite domains | 631
24.3.7 Multiple Fourier transforms | 635
24.4 Relationship between Fourier and Laplace transforms | 637

25 Fourier transforms in cylindrical and spherical geometries | 642


25.1 BVP and IBVP in cylindrical and spherical geometries | 642
25.1.1 Cylindrical geometries | 643
25.1.2 Spherical geometries | 644
25.1.3 3D eigenvalue problems in cylindrical geometries | 646
25.1.4 3D eigenvalue problems in spherical geometries | 648
25.2 FFT method for 1D problems in spherical and cylindrical
geometries | 651
25.2.1 Steady-state diffusion and reaction in a cylindrical catalyst | 651
25.2.2 Transient heat/mass transfer in a 1D infinite cylinder | 654
25.2.3 Steady-state 1D diffusion and reaction in a spherical catalyst
particle | 658
25.2.4 Transient 1D heat conduction in a spherical geometry | 660
25.3 2D and 3D problems in cylindrical geometry | 662
25.3.1 Solution of Laplace’s equation inside a unit circle | 662
25.3.2 Vibration of a circular membrane | 664
25.3.3 Three-dimensional problems in cylindrical geometry | 669
25.4 2D and 3D problems in spherical geometry | 671
25.4.1 Poisson’s equation in a sphere | 671

Part VI: Formulation and solution of some classical chemical engineering problems

26 The classical Graetz–Nusselt problem | 683


26.1 Model formulations and formal solution | 683

26.1.1 Analysis of constant wall temperature boundary condition | 684


26.2 Parallel plate with fully-developed velocity profile | 688
26.3 Circular channel with fully-developed velocity profile | 690

27 Friction factors for steady-state laminar flow in ducts | 695


27.1 Model formulations and formal solution | 695
27.2 Specific example: parallel plates | 697
27.2.1 Direct solution | 697
27.2.2 FFT approach | 698
27.3 Specific case: elliptical ducts | 699

28 Multicomponent diffusion and reaction | 704


28.1 Generalized effectiveness factor problem | 704
28.1.1 Effectiveness factor | 706
28.1.2 Sherwood number (for internal mass-transfer coefficient) | 707
28.1.3 Exact expressions for Shi for some common geometries | 708
28.2 Multicomponent diffusion and reaction in the washcoat layer of a
monolith reactor | 709
28.3 Isothermal monolith reactor model for multiple reactions | 712
28.3.1 Example: reversible sequential reactions | 713

29 Packed-bed chromatography | 721


29.1 Model formulation | 721
29.1.1 Adsorption isotherm | 722
29.1.2 Nondimensional form | 724
29.1.3 Limiting case: p → 0 | 725
29.2 Similarity with heat transfer in packed-beds | 727
29.3 Impact of interphase mass transfer | 727
29.3.1 Pseudo-homogeneous model | 729
29.4 Solution of the hyperbolic model by Laplace transform | 729
29.5 Chromatography model with dispersion in fluid phase | 731
29.5.1 Limiting cases | 732
29.5.2 Lumped model for p → 0 | 732
29.5.3 Lumped model for p > 0 | 733
29.5.4 Chromatography model with dispersion in fluid phase for unit impulse
input | 735
29.5.5 Finite stage chromatography model | 736
29.6 Impact of intraparticle gradients | 737

30 Stability of transport and reaction processes | 740


30.1 Lapwood convection in a porous rectangular box | 740
30.1.1 Model formulation | 740

30.1.2 Conduction state and its stability | 742


30.1.3 Neutral curve and critical Rad | 745
30.2 Chemical reactor stability and dynamics | 748
30.2.1 Model of a cooled CSTR | 749
30.2.2 Dimensionless form of model for a single reaction | 750
30.2.3 Stability analysis | 751

Bibliography | 759

Index | 761
Part I: Applied matrix algebra
1 Matrices and linear algebraic equations
1.1 Simultaneous linear equations
We consider m simultaneous linear equations in n unknowns:

$$\begin{aligned} a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n &= b_1 \\ a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n &= b_2 \\ &\ \vdots \\ a_{m1}u_1 + a_{m2}u_2 + \cdots + a_{mn}u_n &= b_m \end{aligned} \tag{1.1}$$

or in matrix notation,

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & & & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$
$$Au = b \tag{1.2}$$

where A is the coefficient matrix with m rows and n columns (m × n matrix), u is the
unknown vector (n × 1 matrix) and b is a m × 1 vector of constants. The elements aij of
the matrix A and bi of the vector b may be real or complex numbers. The matrix

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} & b_2 \\ \vdots & & & & \vdots & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} & b_m \end{bmatrix}$$

with m rows and (n + 1) columns is called the augmented matrix and is denoted by

aug A = [A b]

and the matrix A will often be written as

A = [aij ]; i = 1, 2, . . . , m; j = 1, 2, . . . , n

where aij is the element of A in the i-th row and j-th column. When b = 0, we obtain
the homogeneous system of equations


Au = 0 (1.3)

or

$$\begin{aligned} a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n &= 0 \\ a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n &= 0 \\ &\ \vdots \\ a_{m1}u_1 + a_{m2}u_2 + \cdots + a_{mn}u_n &= 0 \end{aligned} \tag{1.4}$$

As stated in the Introduction, many of the ideas involved in the solution of linear differ-
ential equations are generalizations of those involved in the solution of the homoge-
neous algebraic equation (1.3) and the inhomogeneous algebraic equation (1.2). Gen-
erally speaking, linear equations have either 0, 1 or infinitely many solutions. In what
follows, we shall discuss the conditions under which equations (1.1)–(1.2) have no so-
lution (inconsistent), a unique solution and an infinite number of solutions.

1.2 Review of basic matrix operations


We review here briefly some terminology, basic matrix operations and some special
matrices. We shall refer to the m × n matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & & & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{bmatrix}$$

as real-valued if all its elements are real numbers or real-valued functions. It will be
called complex-valued if one or more of the elements is a complex number or complex-
valued function.
By convention, the elements of a matrix are double subscripted to denote location.
For example, aij refers to the element appearing in the i-th row and j-th column. If
the number of rows equals the number of columns (m = n), the matrix is referred
to as a square matrix of order n. (Square matrices appear in most of our applications.)
In a square matrix, the elements aii (i = 1, 2, 3, . . . , n) are called diagonal elements. For
the special case n = 1, the matrix is called a column vector (with m elements) while
for m = 1, we have a row vector. The transpose of an m × n matrix A is the n × m matrix
obtained by interchanging the rows and columns of A and is denoted by AT .

1.2.1 Matrix addition and subtraction

Let A = [aij ] be an m1 × n1 matrix and B = [bij ] be an m2 × n2 matrix. Then A and B can


be added only if m1 = m2 and n1 = n2 , i. e., the number of rows and columns in A and
B are equal. The sum is given by

C = A + B,

where

cij = aij + bij

i. e., the sum is obtained by adding the corresponding elements. Similarly, we define
for any scalar k,

A ± kB = [aij ± kbij ]

1.2.2 Matrix multiplication

Let A = [aij ] be an m × n matrix and B = [bij ] be another p × r matrix. If the number of


columns of A equals the number of rows of B (i. e., n = p), we say that A and B are
conformable to multiplication or the product of AB is defined. We define

AB = C

where the elements of the m × r matrix C are given by

$$c_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}; \quad i = 1, 2, \ldots, m,\ j = 1, 2, \ldots, r$$

From this definition, it can be shown that matrix multiplication is associative and dis-
tributes over addition. However, it is not commutative. Thus,

A(BC) = (AB)C
A(B + C) = AB + AC,

whenever the products are defined. In general,

AB ≠ BA

even in the cases in which both the products are defined. Two square matrices A and
B for which AB = BA are said to commute with each other. Also, it may be shown that

(AB)T = BT AT

where the superscript T on the matrix denotes the transpose.
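
These properties are easy to illustrate in Mathematica, where Dot ('.') is matrix multiplication; the matrices below are made up for the check:

(* Matrix multiplication is associative and distributive but not commutative *)
a = {{1, 2}, {3, 4}}; b = {{0, 1}, {1, 0}}; c = {{2, 0}, {0, 3}};
a.(b.c) == (a.b).c  (* True: associativity *)
a.b == b.a  (* False: A and B do not commute *)
Transpose[a.b] == Transpose[b].Transpose[a]  (* True *)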

1.2.3 Special matrices

We now review some special types of square matrices that play an important role in
our applications.
A diagonal matrix is a square matrix of all zero elements except possibly those on
the main diagonal (aij = 0 if i ≠ j). The zero matrix is a matrix having all its elements
equal to zero.
An identity matrix of order n is a diagonal matrix of order n having all its diagonal
elements equal to one. It is usually denoted by In, or simply by I when the order is not
specified. Thus,

$$I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

A matrix with real elements is called symmetric if it is equal to its transpose, i. e.,

A = AT

or

aij = aji

for a real symmetric matrix. A square matrix with complex elements is called Hermi-
tian if it equals its conjugate transpose, i. e.,

$$A = (\bar{A})^T = A^*$$
or
$$a_{ij} = \bar{a}_{ji},$$

where ’∗’ stands for the transpose and complex conjugation and the overbar stands
for only complex conjugation. Thus,

$$A = \begin{bmatrix} 1 & 2 \\ 2 & -4 \end{bmatrix}$$

is a (real) symmetric matrix while

$$B = \begin{bmatrix} 1 & i & 2+3i \\ -i & -4 & 3 \\ 2-3i & 3 & 6 \end{bmatrix}$$

is a Hermitian matrix.
A square matrix is said to be normal if

AA∗ = A∗ A,

i. e., if it commutes with its conjugate transpose. If A has real elements then AT has
real elements and A is normal if it commutes with its transpose.
A square matrix A is called lower triangular if aij = 0 for j > i, i. e., all the elements
above the diagonal are zero. Similarly, A is called upper triangular if aij = 0 for i > j,
or equivalently all the elements below the diagonal are zero. For example,

$$A = \begin{bmatrix} 1 & 3 & 5 \\ 0 & 7 & 9 \\ 0 & 0 & 8 \end{bmatrix}$$

is an upper triangular matrix of order 3.


A square matrix A is called tridiagonal if aij = 0 for |i − j| > 1, i. e., all elements
except those on the diagonal and two main off-diagonals are zero. For example,

$$A = \begin{bmatrix} -2 & 1 & 0 & 0 & 0 \\ 1 & -3 & 1 & 0 & 0 \\ 0 & 1 & -4 & 1 & 0 \\ 0 & 0 & 1 & -5 & 1 \\ 0 & 0 & 0 & 1 & -6 \end{bmatrix}$$

is a tridiagonal matrix of order 5.


There are many other special matrices that will appear in our applications. We
shall discuss them as they arise.

1.3 Elementary row operations and row echelon form of a matrix


Consider the m simultaneous linear algebraic equations in n-unknowns

Au = b

and recall the following operations that are used to simplify the system and obtain a
solution:

(a) Rearrangement (or reordering) of the rows.


(b) Multiplication of any row by a nonzero constant
(c) Multiplication of any row by a constant and adding to another row.

We note that these operations do not change the solution. Thus, we define elementary
row operations (ERO) of three basic types on the rows of a matrix:
(E1): interchange of any two rows of a matrix
(E2): multiplication of any row by a nonzero scalar
(E3): multiplication of a row by a constant and add to another row, element by ele-
ment

(Similarly, we can define elementary column operations on the columns of a matrix


but that is not of interest here.) Given any matrix A, we can use elementary row opera-
tions on its rows to reduce it to row echelon form. A matrix is said to be in row echelon
form if
(i) Any nonzero row is above that of any zero row
(ii) The first nonzero element in any nonzero row is unity
(iii) If the first nonzero element in a row appears in column r, then all elements in
column r in succeeding rows are zero.
(iv) The first nonzero element in row j occurs to the right of the first nonzero element
in row i if j > i.

Examples 1.1.

$$A = \begin{pmatrix} 1 & 3 & 1 & 5 & 4 \\ 0 & 1 & 2 & 3 & 1 \\ 0 & 0 & 1 & 4 & 6 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

is in row echelon form.

$$B = \begin{pmatrix} 1 & 0 & 5 \\ 2 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}; \qquad C = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 5 \\ 0 & 3 & 6 & 1 \end{pmatrix}$$

B and C are not in row echelon form.
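
As a machine check, Mathematica's RowReduce can be applied to such matrices; note that it returns the reduced row echelon form (zeros above each pivot as well), a slightly stronger normal form than conditions (i)-(iv) require:

(* Reduced row echelon form of the matrix C of Examples 1.1 *)
c = {{1, 2, 3, 4}, {0, 1, 1, 5}, {0, 3, 6, 1}};
RowReduce[c] // MatrixForm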

1.3.1 Representation of elementary row operations

Elementary row operations on an m × n matrix may be represented by using matrix


multiplication. To illustrate, we consider the 3 × 4 matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{pmatrix}$$

and show that all EROs on A can be performed by doing the same operations on the
m × m identity matrix and premultiplying A by the resulting matrix. For this example,
we take

$$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Suppose that we interchange rows 2 and 3 (R2 ←→ R3 ). This transforms I3 to

$$E_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$

We note that

$$E_1 A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{21} & a_{22} & a_{23} & a_{24} \end{pmatrix}$$

Similarly, let

$$E_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad k \neq 0; \qquad E_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & k \\ 0 & 0 & 1 \end{pmatrix}$$

Then

$$E_2 A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ k a_{21} & k a_{22} & k a_{23} & k a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{pmatrix}$$
$$E_3 A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} + k a_{31} & a_{22} + k a_{32} & a_{23} + k a_{33} & a_{24} + k a_{34} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{pmatrix}$$

Thus, every elementary operation on A can be represented as a premultiplication of A


by Ei (i = 1, 2, 3, . . .). This property implies that given any A, we can find an m × m matrix
P such that PA is in row echelon form. [The matrix P is the product of the elementary
matrices Ei ].
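
This premultiplication property is easy to check symbolically; a small Mathematica sketch for the 3 × 4 case above:

(* Elementary row operations as premultiplication by the matrices E_i *)
a = Array[Subscript[a, #1, #2] &, {3, 4}];  (* symbolic 3 x 4 matrix *)
e1 = {{1, 0, 0}, {0, 0, 1}, {0, 1, 0}};  (* interchange rows 2 and 3 *)
e3 = {{1, 0, 0}, {0, 1, k}, {0, 0, 1}};  (* add k times row 3 to row 2 *)
MatrixForm[e1.a]
MatrixForm[e3.a]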

1.4 Rank of a matrix and condition for existence of solutions


We define the rank (or more precisely the row rank) of a matrix A as the number of
nonzero rows in its row echelon form (Note: There are many other equivalent defini-
tions of rank and it can be shown that the row rank and column rank are identical).

Definition. If A is a square matrix of order n, then it is called nonsingular (or invert-


ible) if rank A = n. If rank of A < n, then A is called singular.

Example 1.2. We consider the matrices

$$A = \begin{pmatrix} 2 & 1 & 0 \\ 3 & 6 & 1 \\ 5 & 7 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 2 & -1 \\ 3 & 8 & 9 \\ 2 & -1 & 2 \end{pmatrix}$$

and note that their row-echelon forms are given by


$$A_R = \begin{pmatrix} 1 & \tfrac{1}{2} & 0 \\ 0 & 1 & \tfrac{2}{9} \\ 0 & 0 & 0 \end{pmatrix}, \qquad B_R = \begin{pmatrix} 1 & 2 & -1 \\ 0 & 1 & 6 \\ 0 & 0 & 1 \end{pmatrix}$$

Thus, rank A = 2 while rank B = 3; that is, A is a singular matrix while B is nonsingular.
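
The same conclusion follows from Mathematica's MatrixRank:

(* Ranks of the matrices of Example 1.2 *)
a = {{2, 1, 0}, {3, 6, 1}, {5, 7, 1}};
b = {{1, 2, -1}, {3, 8, 9}, {2, -1, 2}};
{MatrixRank[a], MatrixRank[b]}  (* {2, 3} *)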

We now consider the linear equations Au = 0 and Au = b and state the conditions
under which they have solutions.

1.4.1 The homogeneous system Au = 0

The following theorem may be stated for the homogeneous system:

Theorem. Consider the m simultaneous linear homogeneous algebraic equations in n


unknowns:

Au = 0, u ∈ ℝn /ℂn (1.5)

A necessary and sufficient condition for (1.5) to have a nontrivial (nonzero) solution is

rank(A) < n

Proof. The necessity is clear, for suppose that rank A = n. Then reducing A to echelon
form gives the following equivalent set of equations:

$$\begin{aligned} u_1 + \gamma_{12}u_2 + \gamma_{13}u_3 + \cdots + \gamma_{1n}u_n &= 0 \\ u_2 + \gamma_{23}u_3 + \cdots + \gamma_{2n}u_n &= 0 \\ u_3 + \cdots + \gamma_{3n}u_n &= 0 \\ &\ \vdots \\ u_{n-1} + \gamma_{n-1,n}u_n &= 0 \\ u_n &= 0 \end{aligned} \tag{1.6}$$

Here, γij are the elements in the echelon form of A. We note that the only solution to
equations (1.6) is the trivial one.
To prove sufficiency (i. e., there is a nonzero solution when rank A < n), we let
rank A = r. Then, based on the row echelon form of A, the reduced equivalent system
may be written as

$$\begin{aligned} u_1 + \gamma_{12}u_2 + \gamma_{13}u_3 + \cdots + \gamma_{1n}u_n &= 0 \\ u_2 + \gamma_{23}u_3 + \cdots + \gamma_{2n}u_n &= 0 \\ &\ \vdots \\ u_r + \gamma_{r\,r+1}u_{r+1} + \cdots + \gamma_{rn}u_n &= 0 \end{aligned} \tag{1.7}$$

Now, we can choose nonzero values for (ur+1 , . . . , un ) and evaluate (u1 , u2 , . . . , ur )
uniquely from equations (1.7). Hence, we get a nontrivial solution when r < n.

Example 1.3. Consider the homogeneous system in three variables

u1 − u2 = 0
u2 − u3 = 0
u1 + u3 = 0

for which rank A = 3. Thus, the only solution is the trivial one.

Example 1.4. Consider the homogeneous system in four variables (with complex co-
efficients)

u1 − iu2 = 0
u2 + u3 = 0
u1 + u2 − u4 = 0
u2 + iu3 + iu4 = 0

or Au = 0 with

$$A = \begin{pmatrix} 1 & -i & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & -1 \\ 0 & 1 & i & i \end{pmatrix}; \qquad i = \sqrt{-1}.$$

It may be verified that rank A = 3 and

$$u = \alpha \begin{pmatrix} i \\ 1 \\ -1 \\ 1+i \end{pmatrix}$$

is a solution for any α (real or complex constant).
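
Such solutions can be obtained in Mathematica with NullSpace, which returns a basis for all solutions of Au = 0; for Example 1.4:

(* Null space of the complex matrix of Example 1.4 *)
a = {{1, -I, 0, 0}, {0, 1, 1, 0}, {1, 1, 0, -1}, {0, 1, I, I}};
MatrixRank[a]  (* 3 *)
NullSpace[a]  (* one basis vector, proportional to {i, 1, -1, 1 + i} *)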

Example 1.5. Consider the homogeneous system in four variables

u1 − 2u2 − u4 = 0
−2u1 + 3u2 + 3u3 = 0
−u2 + 3u3 − 2u4 = 0
3u1 − 7u2 + 3u3 − 5u4 = 0

for which rank A = 2. It may be verified that

$$u = c_1 \begin{pmatrix} 3 \\ 0 \\ 2 \\ 3 \end{pmatrix} + c_2 \begin{pmatrix} 0 \\ 1 \\ -1 \\ -2 \end{pmatrix}$$

is a solution for any constants c1 and c2 . [We discuss in Chapter 3 how to obtain this
solution.]

1.4.2 The inhomogeneous system Au = b

We now consider the inhomogeneous (or non-homogeneous) system and examine


when it has 0, 1 or an infinite number of solutions. We use the elementary row opera-
tions to reduce the augmented matrix to row echelon form. Without loss of generality,
we can assume the echelon form of the augmented matrix is given by

$$\begin{bmatrix} 1 & \gamma_{12} & \gamma_{13} & \cdots & \gamma_{1r} & \gamma_{1\,r+1} & \cdots & \gamma_{1n} & \alpha_1 \\ 0 & 1 & \gamma_{23} & \cdots & \gamma_{2r} & \gamma_{2\,r+1} & \cdots & \gamma_{2n} & \alpha_2 \\ \vdots & & & & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & \gamma_{r\,r+1} & \cdots & \gamma_{rn} & \alpha_r \\ 0 & 0 & 0 & \cdots & 0 & 0 & \cdots & 0 & \alpha_{r+1} \\ \vdots & & & & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 0 & \cdots & 0 & \alpha_m \end{bmatrix} \tag{1.8}$$
We now consider various cases.
Case 1: r ≤ m ≤ n (more unknowns than equations). The equations are consistent only if
$$\alpha_{r+1} = \alpha_{r+2} = \cdots = \alpha_m = 0$$

If αi ≠ 0 for any r + 1 ≤ i ≤ m, the rank of A and aug A are different and the equations
are inconsistent. Hence, no solution exists in this case. Thus, a necessary condition
for solutions to exist is

rank A = rank(aug A) (1.9)

We now show that the above condition is also sufficient. Suppose that (1.9) is satisfied
and let

rank A = r

We can rearrange (1.8) as follows:

$$\begin{aligned} u_r &= -\gamma_{r\,r+1}u_{r+1} - \gamma_{r\,r+2}u_{r+2} - \cdots - \gamma_{rn}u_n + \alpha_r \\ u_{r-1} &= -\gamma_{r-1\,r}u_r - \gamma_{r-1\,r+1}u_{r+1} - \cdots - \gamma_{r-1\,n}u_n + \alpha_{r-1} \\ &\ \vdots \\ u_1 &= -\gamma_{12}u_2 - \gamma_{13}u_3 - \cdots - \gamma_{1\,r+1}u_{r+1} - \cdots - \gamma_{1n}u_n + \alpha_1 \end{aligned} \tag{1.10}$$

Thus, we can choose (n − r) of the variables (ur+1 , . . . , un ) as we please and obtain the
values of remaining variables using (1.10) above to get a solution. [In Chapter 3, we
shall show that the solution space has dimension (n − r)].
Case 2: r ≤ n ≤ m (more equations than unknowns). In this case again, for consistency, we require
$$\alpha_{r+1} = \alpha_{r+2} = \cdots = \alpha_m = 0$$

and the last m − r equations are redundant. If r = n, then there is a unique solution.
If r < n, we can assign the values of (n − r) variables at pleasure and determine the
values of the other variables using (1.10). Thus, we have the following theorem.

Theorem.
(a) A necessary and sufficient condition for the system Au = b to have solutions is
rank A = rank(aug A)
(b) If rank A = rank(aug A) = r and n is the number of unknowns, we can assign val-
ues of (n − r) of the unknowns and determine the remaining r unknowns uniquely
provided the matrix of coefficients of these unknowns has rank r.

An important (and very useful) corollary that follows from this theorem is given below.

Corollary. Suppose that A is a square matrix and consider the inhomogeneous system
Au = b. This system has a unique solution for any b iff (if and only if) the only solution
to the corresponding homogeneous system Au = 0 is the trivial one.

The generalization of the above theorem to the case in which A is replaced by a linear
operator is called Fredholm alternative (theorem) and will be discussed later.

Example 1.6. Consider the inhomogeneous system in three variables

u1 − u2 = b1
u2 − u3 = b2
u1 + u3 = b3

for which rank(A) = 3 (see Example 1.3). Thus, there is a unique solution to the above
equations for any choice of b1 , b2 and b3 .

Example 1.7. Consider the inhomogeneous system in four variables

u1 − 2u2 − u4 = b1
−2u1 + 3u2 + 3u3 = b2
−u2 + 3u3 − 2u4 = b3
3u1 − 7u2 + 3u3 − 5u4 = b4

for which rank(A) = 2. It may be verified that the system is consistent iff b3 = 2b1 + b2
and b4 = 5b1 + b2 . We shall return to this example in Chapter 3.
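
This consistency condition is easy to verify numerically: pick b1 and b2, set b3 = 2b1 + b2 and b4 = 5b1 + b2, and compare rank A with rank of [A b]. A Mathematica sketch with b1 = 1, b2 = 2:

(* Consistency check for Example 1.7 *)
a = {{1, -2, 0, -1}, {-2, 3, 3, 0}, {0, -1, 3, -2}, {3, -7, 3, -5}};
b = {1, 2, 4, 7};  (* b3 = 2 b1 + b2, b4 = 5 b1 + b2 *)
{MatrixRank[a], MatrixRank[Join[a, Transpose[{b}], 2]]}  (* {2, 2}: consistent *)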

1.5 Gaussian elimination and LU decomposition


We now discuss the Gaussian elimination algorithm for determining (or numerically
computing) the solution(s) of the linear system Ax = b. [For notational convenience,
we shall use x in place of u in this section]. Before we illustrate the Gaussian algorithm,
we consider two special cases of the linear system

Ax = b (1.11)

in which A is an n × n nonsingular upper or lower triangular matrix.

1.5.1 Lower and upper triangular systems

We first consider the case in which A is lower triangular and write equation (1.11) as

Lx = b (1.12)

or in expanded form

$$\begin{aligned} l_{11}x_1 &= b_1 \\ l_{21}x_1 + l_{22}x_2 &= b_2 \\ l_{31}x_1 + l_{32}x_2 + l_{33}x_3 &= b_3 \\ &\ \vdots \\ l_{n1}x_1 + l_{n2}x_2 + \cdots + l_{nn}x_n &= b_n \end{aligned} \tag{1.13}$$

Since we assumed L is nonsingular, lii ≠ 0 for any i and we can solve equations (1.13)
by forward substitution:

$$\begin{aligned} x_1 &= b_1/l_{11} \\ x_2 &= (b_2 - l_{21}x_1)/l_{22} \\ &\ \vdots \\ x_k &= \frac{b_k - \sum_{j=1}^{k-1} l_{kj}x_j}{l_{kk}}; \quad k = 1, 2, \ldots, n \end{aligned} \tag{1.14}$$

It is of practical interest to count the number of arithmetic operations (additions or


subtractions denoted by AS and multiplications or divisions denoted by MD) needed
to obtain the solution. It follows from (1.14) that the operation count in solving the
lower triangle system is given by

$$\#AS = 0 + 1 + 2 + \cdots + (n-1) = \frac{n(n-1)}{2} \approx \frac{n^2}{2} \quad \text{for large } n \tag{1.15}$$
$$\#MD = 1 + 2 + \cdots + n = \frac{n(n+1)}{2} \approx \frac{n^2}{2} \quad \text{for large } n \tag{1.16}$$

Usually, when n is large, AS is equal to MD and hereafter, we shall only count MD.
(Another reason for this is that multiplication or division on the computer takes much
longer than addition or subtraction). Thus, the operation count (OC) for solving a
lower triangular system, given by (1.12), by forward substitution is $0.5n^2$ (n ≫ 1).
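
A direct transcription of the forward-substitution recursion (1.14) in Mathematica (a minimal sketch; the triangular system below is made up):

(* Forward substitution for Lx = b, per equation (1.14) *)
forwardSolve[l_, b_] := Module[{n = Length[b], x = ConstantArray[0., Length[b]]},
  Do[x[[k]] = (b[[k]] - Sum[l[[k, j]] x[[j]], {j, 1, k - 1}])/l[[k, k]], {k, n}];
  x]
l = {{2., 0., 0.}, {1., 3., 0.}, {1., 1., 4.}};
forwardSolve[l, {2., 7., 11.}]  (* {1., 2., 2.}; LinearSolve[l, {2., 7., 11.}] agrees *)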
Next, we consider the upper triangular system

Ux = c (1.17)

or in expanded form

$$\begin{aligned} u_{11}x_1 + u_{12}x_2 + \cdots + u_{1n}x_n &= c_1 \\ u_{22}x_2 + \cdots + u_{2n}x_n &= c_2 \\ &\ \vdots \\ u_{nn}x_n &= c_n \end{aligned} \tag{1.18}$$

Again, we assume that U is not singular, i. e., uii ≠ 0 for any i. The solution of (1.18)
can be obtained by back substitution as

$$\begin{aligned} x_n &= c_n/u_{nn} \\ x_{n-1} &= (c_{n-1} - u_{n-1\,n}x_n)/u_{n-1,n-1} \\ &\ \vdots \\ x_k &= \Big(c_k - \sum_{j=k+1}^{n} u_{kj}x_j\Big)\Big/u_{kk}, \quad k = n, n-1, \ldots, 1 \end{aligned} \tag{1.19}$$

We note that the operation count for solving the upper triangular system is also $0.5n^2$
(for n ≫ 1).

1.5.2 Gaussian elimination

We now describe the Gaussian elimination algorithm for solving the general linear
system given by equation (1.11). In this method, we first reduce Ax = b to an equivalent

upper triangular system Ux = c by using elementary row operations of type 3. The


upper triangular system is then solved by back substitution.
Denote the augmented matrix of the original system by

$$\text{aug } A^{(1)} = [A^{(1)}\ b^{(1)}] = \begin{bmatrix} a^{(1)}_{11} & a^{(1)}_{12} & \cdots & a^{(1)}_{1n} & b^{(1)}_{1} \\ a^{(1)}_{21} & a^{(1)}_{22} & \cdots & a^{(1)}_{2n} & b^{(1)}_{2} \\ \vdots & & & \vdots & \vdots \\ a^{(1)}_{n1} & a^{(1)}_{n2} & \cdots & a^{(1)}_{nn} & b^{(1)}_{n} \end{bmatrix}$$

In the first step, we assume $a^{(1)}_{11} \neq 0$ and define the row multipliers
$$m_{i1} = a^{(1)}_{i1}/a^{(1)}_{11}; \quad i = 2, 3, \ldots, n$$

Multiply row 1 by mi1 and subtract from row i (i = 2, . . . , n). At the end of the step, the
form of the augmented system is given by

$$\text{aug } A^{(2)} = \begin{bmatrix} a^{(1)}_{11} & a^{(1)}_{12} & \cdots & a^{(1)}_{1n} & b^{(1)}_{1} \\ 0 & a^{(2)}_{22} & \cdots & a^{(2)}_{2n} & b^{(2)}_{2} \\ \vdots & & & \vdots & \vdots \\ 0 & a^{(2)}_{n2} & \cdots & a^{(2)}_{nn} & b^{(2)}_{n} \end{bmatrix}$$
where $a^{(2)}_{ij} = a^{(1)}_{ij} - m_{i1}\, a^{(1)}_{1j}$ (and, correspondingly, $b^{(2)}_{i} = b^{(1)}_{i} - m_{i1}\, b^{(1)}_{1}$); i, j = 2, . . . , n. In the second step, we assume $a^{(2)}_{22} \neq 0$ and
continue to eliminate the unknowns leaving the first row undisturbed. After (n − 1)
steps, we obtain the upper triangular system

$$\text{aug } A^{(n)} = \begin{bmatrix} a^{(1)}_{11} & a^{(1)}_{12} & \cdots & \cdots & a^{(1)}_{1n} & b^{(1)}_{1} \\ 0 & a^{(2)}_{22} & \cdots & \cdots & a^{(2)}_{2n} & b^{(2)}_{2} \\ 0 & 0 & a^{(3)}_{33} & \cdots & a^{(3)}_{3n} & b^{(3)}_{3} \\ \vdots & & & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & \cdots & a^{(n)}_{nn} & b^{(n)}_{n} \end{bmatrix}$$

or equivalently,

Ux = c (1.20)

This completes the elimination procedure. The upper triangular system given by (1.20)
can be solved by back substitution as shown earlier. [Remark: The element $a^{(i)}_{ii}$ which
is at the upper left corner after i − 1 steps is called the pivot. When the Gaussian elim-
ination algorithm is implemented in practice, the rows are interchanged so that the
pivot element has the maximum absolute value. This partial pivoting procedure min-
imizes round off errors when solving large systems of linear equations. However, this
procedure does not preserve the initial matrix.]

1.5.3 LU decomposition/factorization

Let

$$U = \begin{bmatrix} a^{(1)}_{11} & a^{(1)}_{12} & \cdots & a^{(1)}_{1n} \\ 0 & a^{(2)}_{22} & \cdots & a^{(2)}_{2n} \\ 0 & 0 & a^{(3)}_{33} & \vdots \\ \vdots & & & \ddots \\ 0 & 0 & \cdots & a^{(n)}_{nn} \end{bmatrix}, \qquad L = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ m_{21} & 1 & 0 & \cdots & 0 \\ m_{31} & m_{32} & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ m_{n1} & m_{n2} & \cdots & m_{n\,n-1} & 1 \end{bmatrix}$$

where mij are the row multipliers determined in the elimination process (Remark:
These row multipliers can be stored in place of zeros during the elimination process).
A straightforward but tedious calculation shows that

A = LU (1.21)

We also note that the number of operations (AS or MD) needed to factorize A as in
(1.21) is given by

$$OC = (n-1)^2 + (n-2)^2 + \cdots + 1^2 = \frac{(n-1)n(2n-1)}{6} \approx \frac{1}{3}n^3 \quad (\text{for } n \gg 1)$$

Thus, the total operation count for solving Ax = b is $\frac{1}{3}n^3 + n^2$ (for n ≫ 1). Hence,
for large n, the major part of the work is the LU decomposition. We also note that a
matrix-vector multiplication involves $n^2$ operations while multiplication of two
n × n matrices requires $n^3$ operations.

Example 1.8. Consider
$$\begin{aligned} x_1 + 2x_2 + x_3 &= 3 \\ 2x_1 + 3x_2 - x_3 &= -6 \\ 3x_1 - 2x_2 - 4x_3 &= -2 \end{aligned}$$
$$\text{aug } A^{(1)} = \begin{bmatrix} 1 & 2 & 1 & 3 \\ 2 & 3 & -1 & -6 \\ 3 & -2 & -4 & -2 \end{bmatrix}$$
$$\text{aug } A^{(2)} = \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & -1 & -3 & -12 \\ 0 & -8 & -7 & -11 \end{bmatrix}$$
$$\text{aug } A^{(3)} = \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & -1 & -3 & -12 \\ 0 & 0 & 17 & 85 \end{bmatrix}$$
Back substitution gives
$$17x_3 = 85 \implies x_3 = 5$$
$$-x_2 - 3x_3 = -12 \implies x_2 = 12 - 3x_3 = -3$$
$$x_1 + 2x_2 + x_3 = 3 \implies x_1 = 3 + 6 - 5 = 4$$
We also note that
$$U = \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & -3 \\ 0 & 0 & 17 \end{bmatrix}, \qquad L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 8 & 1 \end{bmatrix}$$
$$LU = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 8 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & -3 \\ 0 & 0 & 17 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & -1 \\ 3 & -2 & -4 \end{bmatrix} = A$$
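
The factorization of Example 1.8 can be reproduced with Mathematica's LUDecomposition. Note that the built-in routine uses partial pivoting and packs L and U into a single matrix, so the factors must be unpacked and compared against the rows of A permuted by p (for this particular matrix no row exchanges should be needed, so l and u match the L and U above):

(* LU factorization of Example 1.8 via LUDecomposition *)
a = {{1, 2, 1}, {2, 3, -1}, {3, -2, -4}};
{lu, p, cond} = LUDecomposition[a];
l = LowerTriangularize[lu, -1] + IdentityMatrix[3];  (* unit lower triangle *)
u = UpperTriangularize[lu];
l.u == a[[p]]  (* True *)
LinearSolve[a, {3, -6, -2}]  (* {4, -3, 5}, as found above *)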

1.6 Inverse of a square matrix


If A is a square matrix of order n, the inverse of A is another square matrix B such that

AB = BA = In (1.22)

The inverse of A is often denoted by A−1 . When A has an inverse, it is said to be non-
singular or invertible. If A does not have an inverse, it is said to be singular.

1.6.1 Properties of inverse

The following properties may be verified from the definition of the inverse:
1. A has an inverse if and only if it has rank n.
2. When it exists, the inverse of A is unique.
3. When A is nonsingular,

$$(A^{-1})^{-1} = A \tag{1.23}$$

4. If A and B are square matrices of the same order and both have inverses, then

$$(AB)^{-1} = B^{-1} A^{-1} \tag{1.24}$$

5. If A is invertible, so is AT and
$$(A^T)^{-1} = (A^{-1})^T \tag{1.25}$$

1.6.2 Calculation of inverse

Suppose that A and B are nonsingular square matrices of order n and

AB = I. (1.26)

Then it may be shown that A and B commute and

BA = I (1.27)

Thus, to calculate A−1 , it is sufficient to satisfy the relation given by (1.26). Suppose
that the columns of B are denoted by b1 , b2 , b3 , . . . , bn and let ej (j = 1, 2, . . . , n) be
the n-dimensional column vector having unity element in row j and zeros everywhere
else. Then (1.26) is equivalent to

Abj = ej ; j = 1, 2, 3, . . . , n (1.28)

and the j-th column of A−1 can be found by solving the linear equations given by equa-
tion (1.28). Thus, we have the following two methods for finding the inverse of a non-
singular matrix A.
Method 1: Use LU decomposition to factor A = LU. Then solve LUbj = ej ; j =
1, 2, . . . , n. We note that this procedure gives A−1 with a total of $\frac{1}{3}n^3$ operations (for the LU
decomposition) plus $n \times n^2$ operations (for solving equation (1.28)). Thus, the total
operation count is $\frac{4}{3}n^3$.
Method 2: We form the n × 2n augmented matrix [A I] and use elementary row
operations to transform it to the form [I B], where B = A−1 . It can be shown that the
operation count for this procedure is the same as that for method 1.

Example 1.9. Consider
$$A = \begin{bmatrix} 5 & 8 & 1 \\ 0 & 2 & 1 \\ 4 & 3 & -1 \end{bmatrix}$$
We use method 2 and form the augmented matrix
$$\begin{bmatrix} 5 & 8 & 1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 0 & 1 & 0 \\ 4 & 3 & -1 & 0 & 0 & 1 \end{bmatrix}$$
$(-\tfrac{4}{5})R_1 + R_3$ gives
$$\begin{bmatrix} 5 & 8 & 1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 0 & 1 & 0 \\ 0 & -\tfrac{17}{5} & -\tfrac{9}{5} & -\tfrac{4}{5} & 0 & 1 \end{bmatrix}$$
$\tfrac{R_1}{5}$, $(\tfrac{17}{10})R_2 + R_3$ and $\tfrac{R_2}{2}$ gives
$$\begin{bmatrix} 1 & \tfrac{8}{5} & \tfrac{1}{5} & \tfrac{1}{5} & 0 & 0 \\ 0 & 1 & \tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & -\tfrac{1}{10} & -\tfrac{4}{5} & \tfrac{17}{10} & 1 \end{bmatrix}$$
$R_3 \times 5 + R_2$ and $R_3 \times (-10)$ gives
$$\begin{bmatrix} 1 & \tfrac{8}{5} & \tfrac{1}{5} & \tfrac{1}{5} & 0 & 0 \\ 0 & 1 & 0 & -4 & 9 & 5 \\ 0 & 0 & 1 & 8 & -17 & -10 \end{bmatrix}$$
$R_3(-\tfrac{1}{5}) + R_1$ and $R_2(-\tfrac{8}{5}) + R_1$ gives
$$\begin{bmatrix} 1 & 0 & 0 & 5 & -11 & -6 \\ 0 & 1 & 0 & -4 & 9 & 5 \\ 0 & 0 & 1 & 8 & -17 & -10 \end{bmatrix}$$
Thus,
$$A^{-1} = \begin{bmatrix} 5 & -11 & -6 \\ -4 & 9 & 5 \\ 8 & -17 & -10 \end{bmatrix}$$
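
Both methods are available in Mathematica: Inverse computes A−1 directly, and row-reducing the augmented block [A I] reproduces method 2:

(* Inverse of the matrix of Example 1.9 *)
a = {{5, 8, 1}, {0, 2, 1}, {4, 3, -1}};
Inverse[a]  (* {{5, -11, -6}, {-4, 9, 5}, {8, -17, -10}} *)
RowReduce[Join[a, IdentityMatrix[3], 2]]  (* row-reduce [A I] to obtain [I A^-1] *)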

1.7 Vector-matrix formulation of some chemical engineering problems
In this section, we consider some chemical engineering applications of elementary
matrix concepts. First, we present the formulation of some flow and reaction problems
in the vector-matrix notation. Next, we illustrate the application of the elementary
matrix concepts discussed above. Further analysis of these and other similar models
will be considered in later chapters.

1.7.1 Batch reactor: evolution equations with multiple reactions

Consider a batch reactor of constant volume in which the reactions


S
∑ νij Aj = 0; i = 1, 2, . . . , R (1.29)
j=1

occur. There are R reactions among S species. Let VR be the volume of reactor contents
(assumed to be constant) and Cj be the molar concentration of species Aj . Further as-
sumptions are: (i) the reactor contents are well mixed so that there are no spatial gra-
dients and Cj is uniform throughout the tank, (ii) the density of the fluid is constant,
(iii) isothermal system and (iv) the volume of fluid in the tank remains constant. Let
ri (C1 , . . . , CS ) be the rate of reaction i and νij be the stoichiometric coefficient of species
Aj in reaction i. The mole balance for species Aj is

$$\{\text{Rate of accumulation of moles of } A_j\} = \left\{\begin{array}{c}\text{Rate of production of moles of } A_j\\ \text{due to various chemical reactions}\end{array}\right\}$$

In the notation introduced above, this leads to

$$\frac{d}{dt}\{V_R C_j\} = \Big(\sum_{i=1}^{R} \nu_{ij}\, r_i\Big) V_R; \quad j = 1, 2, \ldots, S \tag{1.30}$$

Since VR is assumed to be constant, the above balance may be simplified and written
in the following vector form:

$$\frac{dc}{dt} = \nu^T r(c) \tag{1.31}$$

or in expanded form

$$\frac{d}{dt}\begin{bmatrix} C_1 \\ C_2 \\ \vdots \\ C_S \end{bmatrix} = \begin{bmatrix} \nu_{11} & \nu_{21} & \cdots & \nu_{R1} \\ \nu_{12} & \nu_{22} & \cdots & \nu_{R2} \\ \vdots & & & \vdots \\ \nu_{1S} & \nu_{2S} & \cdots & \nu_{RS} \end{bmatrix}\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_R \end{bmatrix} \tag{1.32}$$

Here, c is the S × 1 vector of concentrations, r(c) is the R × 1 vector of reaction rates (as
a function of various concentrations) and ν is the R × S matrix of stoichiometric coeffi-
cients. To complete the model, we also specify the initial condition corresponding to
the species concentrations at time zero, i. e.,

c(t = 0) = c0 (1.33)

[Note that the same model is obtained for the case of an ideal isothermal tubular plug
flow reactor (PFR) with time replaced by space time. In this case, the initial condition
is the vector of inlet concentrations]. For the special case of linear kinetics, we have

r(c) = K̂.c,    (1.34)

where K̂ is the R × S matrix of first-order rate constants. Defining the S × S matrix of


rate constants K by

K = ν^T.K̂    (1.35)

we obtain the batch reactor evolution equation (initial value problem):

dc/dt = K.c,   t > 0;   c(t = 0) = c0    (1.36)

As an example, we consider the monomolecular reaction scheme (shown in Figure 1.1)


where kji is the first-order rate constant for the formation of species Aj from Ai . Here,
S = 3, R = 6 and ordering the six reactions as (A1 → A2 , A2 → A1 , A1 → A3 , A3 →
A1 , A2 → A3 , A3 → A2 ) the various matrices may be expressed as

ν^T = ( −1   1  −1   1   0   0
         1  −1   0   0  −1   1
         0   0   1  −1   1  −1 )    (1.37)

K̂ = ( k21  0    0
      0    k12  0
      k31  0    0
      0    0    k13
      0    k32  0
      0    0    k23 );      K = ( −(k21 + k31)    k12             k13
                                   k21           −(k12 + k32)     k23
                                   k31            k32            −(k13 + k23) )    (1.38)

[Remark: The ordering of the reactions changes the matrices ν and K̂ but K depends
only on the ordering of the species]. Since the total concentration is fixed for this spe-
cific reaction system, i. e., C1 + C2 + C3 = C10 + C20 + C30 = C0 , we can define the mole
fraction of species Aj as xj = Cj /C0 and write the evolution equation (1.36) as

dx/dt = K.x,   t > 0;   x(t = 0) = x0 .    (1.39)
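As a small illustration (a sketch only; the numerical rate-constant values below are arbitrary), the matrices of equations (1.37)–(1.38) can be assembled and the initial value problem (1.39) integrated in Mathematica®:

(* Build K = nuT.Khat for the three-species scheme and integrate dx/dt = K.x. *)
nuT  = {{-1, 1, -1, 1, 0, 0}, {1, -1, 0, 0, -1, 1}, {0, 0, 1, -1, 1, -1}};
Khat = {{k21, 0, 0}, {0, k12, 0}, {k31, 0, 0},
        {0, 0, k13}, {0, k32, 0}, {0, 0, k23}};
K = nuT . Khat;   (* reproduces the 3 x 3 matrix in equation (1.38) *)
rates = {k21 -> 1., k12 -> 0.5, k31 -> 0.4, k13 -> 0.2, k32 -> 0.3, k23 -> 0.1};
x = NDSolveValue[{c'[t] == (K /. rates) . c[t], c[0] == {1, 0, 0}}, c, {t, 0, 10}];
x[10]   (* the mole fractions approach their equilibrium values *)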

Figure 1.1: Schematic diagram of monomolecular first-order reaction scheme between 3 species.

1.7.2 Continuous-flow stirred tank reactor (CSTR): transient and steady-state models with multiple reactions

We now extend the above batch reactor model to include the flow terms. With the same
assumptions as those in the batch reactor case, the species balance equations for the
isothermal case may be expressed as

{Rate of accumulation of moles of Aj } = {Inlet molar flow rate of Aj } − {Outlet molar flow rate of Aj }
    + {Rate of production of moles of Aj due to various chemical reactions},

which in mathematical form may be expressed as

d/dt [VR Cj ] = qin Cj,in (t) − qout Cj + (∑_{i=1}^{R} νij ri )VR    (1.40)

For the special case of constant and equal in and out flow rates qin = qout = q and
constant VR , the above equation simplifies to

d[Cj ]/dt = (Cj,in (t) − Cj )/τ + ∑_{i=1}^{R} νij ri ,    (1.41)

where τ is the residence or space time, defined as the volume of reactor contents over
the volumetric flow rate (τ = VR /q). In vector-matrix form, the above equations may
be expressed as

dc/dt = (1/τ)(cin (t) − c) + ν^T r(c),   t > 0    (1.42)

along with the initial condition given by equation (1.33). Here, the S × 1 vector c rep-
resents the species molar concentration and ν is the R × S stoichiometric coefficient
matrix. If the inlet concentrations are independent of time, the steady-state reactor ef-

fluent concentrations are described by the following set of nonlinear algebraic equa-
tions:
(1/τ)(cin − cs ) + ν^T r(cs ) = 0.    (1.43)

For the special case of linear kinetics and constant feed/inlet concentrations, we ob-
tain the linear system of equations

(I + K∗ τ)cs = cin ;    K∗ = −ν^T.K̂    (1.44)

where cs is the vector of steady-state effluent concentrations.
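In Mathematica®, equation (1.44) is a single LinearSolve call; a hedged sketch with made-up numbers (the K∗ entries below are illustrative only, chosen so that each column sums to zero):

(* Steady-state CSTR effluent for linear kinetics: (I + Kstar tau).cs == cin. *)
Kstar = {{1.5, -0.5, 0.}, {-1., 0.75, -0.25}, {-0.5, -0.25, 0.25}};
tau = 2.; cin = {1., 0., 0.};
cs = LinearSolve[IdentityMatrix[3] + tau Kstar, cin]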

1.7.3 Two interacting tank system: transient model for mixing with in- and outflows

Consider the interacting two tank system shown in Figure 1.2. To develop a mathemat-
ical model for describing the transient behavior of the system, we make the follow-
ing assumptions: (i) each tank is an ideal mixer so that the concentration is uniform
within each tank so that the concentration of species A (salt, tracer or a chemical) in
the stream leaving each tank is equal to that in the tank, (ii) the flow rate entering
each tank (q) is constant (independent of time) but the inlet concentration to tank 1,
Cin (t), may change with time, (iii) the exchange or circulation flow rate (qe ) between
the tanks is constant, (iv) the density of the fluid is constant and the volume of fluid
that each tank holds is constant at VR1 and VR2 [This assumption implies that the to-
tal volumetric flow rate of the streams entering must be equal to that of the streams
leaving the tank], (v) no chemical reaction takes place in either tank. The notation for
various quantities (volumes of tanks, flow rates and concentrations of species A in
each tank) is as shown in the figure.

Figure 1.2: Schematic diagram of two interacting tanks with in and outflows.

Mass or mole balance of species A in tank 1 gives

VR1 dC1 /dt = qCin (t) + qe C2 − (q + qe )C1    (1.45)

Similarly, mass/mole balance of species A in tank 2 gives

VR2 dC2 /dt = (q + qe )C1 − (q + qe )C2    (1.46)

[These are mass balances on species if concentration is measured in kg/m3 , and mole
balances on species A if the concentration is measured in molar units, moles/m3 or
moles/liter].
To complete the model, we have to supplement it with initial conditions which
specify the concentration of the species at time zero, i. e.,

C1 (at t = 0) = C10 (1.47)


C2 (at t = 0) = C20 (1.48)

In vector-matrix form, the above model may be written as

dc/dt = Ac + b(t);    c (at t = 0) = c0 = (C10 , C20 )^T    (1.49)

c = ( C1
      C2 );    A = ( −(q + qe )/VR1     qe /VR1
                      (q + qe )/VR2    −(q + qe )/VR2 );    b(t) = ( qCin (t)/VR1
                                                                     0 )    (1.50)

Note that the matrix A can be written as the sum of a diffusive (or exchange) matrix and a convective (flow) matrix:

A = ( −qe /VR1     qe /VR1
       qe /VR2    −qe /VR2 ) + ( −q/VR1     0
                                  q/VR2    −q/VR2 )

  = Ad + Ac

One special case of this model is obtained when the two tanks are of equal volume (VR1 = VR2 = VR ). In this case, we can define a dimensionless time (t′ = qe t/VR ) and Peclet number (PeD = q/qe ; qe ≠ 0) and write it as

dc/dt′ = Âc + PeD b̂(t′ );    c (at t′ = 0) = c0    (1.51)

Â = ( −1   1
       1  −1 ) + PeD ( −1   0
                        1  −1 );      b̂(t′ ) = ( Cin (t′ )
                                                 0 ).    (1.52)

The model defined by equations (1.51)–(1.52) is the simplest example of a transient


discrete diffusion-convection system. When there is no inflow or outflow from the sys-
tem, i. e., PeD = 0, we obtain a homogeneous initial value problem describing tran-
sient mixing in the system. A second limiting case is that of equal volume tanks with
no exchange (or diffusive) flow between them, i. e., qe = 0. In this case, we define the
total residence time (or space time as there is no change in moles or volumetric flow
rate),

τ = 2VR /q

and write the model in the form

(τ/2) dc/dt = ( −1   0
                 1  −1 ) c + ( Cin (t)
                               0 );      c (at t′ = 0) = c0 .    (1.53)
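A minimal sketch (our own parameter choice) of the step response of the two-tank model (1.51)–(1.52):

(* Step response of the two equal-volume tanks for a unit step in Cin. *)
PeD = 2.;
Ahat = {{-1, 1}, {1, -1}} + PeD {{-1, 0}, {1, -1}};
sol = NDSolveValue[{c'[t] == Ahat . c[t] + PeD {1., 0.},
    c[0] == {0., 0.}}, c, {t, 0, 10}];
sol[10]   (* both tank concentrations approach the inlet value {1., 1.} *)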

1.7.4 Models for transient diffusion, convection and diffusion-convection (compartment models)

The example above of two interacting tanks (or cells) can be generalized to any num-
ber of cells which interact through exchange (diffusion), imposed flow (convection)
and with or without reaction. We consider here these models without reaction so that
the structure of the models can be seen more clearly. These models are referred to
as cell or compartment models and are discrete (or finite-dimensional) analogs of the
continuous diffusion–convection–reaction models. With the same assumptions as
above, the formulation of the transient models for these cases is similar to the two
tank (cell) system. We provide here only the final model equations for different cases
as their derivation is straightforward. [Remark: The compartment models formulated
here for discrete interacting systems also appear when partial differential equations
of diffusion–convection–reaction type are discretized using finite difference or finite
volume methods.]

Discrete transient diffusion model


For the case of N identical (equal volume) interacting tanks arranged in a linear array
with equal forward and backward exchange flows (and no imposed external flow), the
evolution is described by

dc/dt′ = Ad c;    c (at t′ = 0) = c0    (1.54)

where c is the N × 1 vector of species concentrations and the N × N dimensionless diffusion (exchange) matrix Ad is given by

−1 1 0 0 . 0 0 0
1 −2 1 0 . 0 0 0
( 0 1 −2 1 . 0 0 0 )
( )
Ad = ( . . . . . . . . ) (1.55)
( )
0 0 0 0 . −2 1 0
0 0 0 0 . 1 −2 1
( 0 0 0 0 . 0 1 −1 )

Note that the matrix is symmetric and sum of each row and column is zero. The same
matrix is obtained when the one-dimensional transient diffusion equation (with zero
flux boundary conditions at the ends) is discretized using the second order finite dif-
ference or finite volume method. If the cells are arranged in a circular array (so that
cell N is connected to cell N − 1 as well as cell 1), the exchange matrix is modified to

−2 1 0 0 . 0 0 1
1 −2 1 0 . 0 0 0
( 0 1 −2 1 . 0 0 0 )
( )
Ad = ( . . . . . . . . ). (1.56)
( )
0 0 0 0 . −2 1 0
0 0 0 0 . 1 −2 1
( 1 0 0 0 . 0 1 −2 )

Again, this discrete model (with the symmetric circulant matrix) is obtained when the
one-dimensional transient diffusion equation on a circle (and periodic boundary con-
ditions) is discretized using second-order finite difference or finite volume methods.
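Both exchange matrices are conveniently assembled with SparseArray; a sketch for the linear array of equation (1.55):

(* N x N diffusion matrix for a linear array of identical cells, eq. (1.55). *)
n = 8;
Ad = SparseArray[{{1, 1} -> -1, {n, n} -> -1, {i_, i_} -> -2,
    Band[{1, 2}] -> 1, Band[{2, 1}] -> 1}, {n, n}];
Total[Normal[Ad]]   (* all column sums are zero, as noted above *)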

Discrete transient diffusion-convection model


When convective flow is superimposed on the exchange flow, we have a generalization
of the two-cell model to the N-cell system. The model equations for a linear array of N
identical cells are given by equation (1.51) with

̂ = Ad + PeD Ac
A (1.57)

where Ad is as defined by equation (1.55) and the N × N convective flow matrix Ac and the N × 1 forcing vector b̂(t′ ) (assuming that there is only a single inlet stream entering tank 1) are given by

−1 0 0 0 . 0 0 0
1 −1 0 0 . 0 0 0
( 0 1 −1 0 . 0 0 0 )
( )
Ac = ( . . . . . . . . ) (1.58)
( )
0 0 0 0 . −1 0 0
0 0 0 0 . 1 −1 0
( 0 0 0 0 . 0 1 −1 )
b̂(t′ ) = Cin (t′ )e1 .    (1.59)

Here, e1 is the N × 1 unit vector (corresponding to unity in the first element and zeros
in all other elements). Once again, the discrete model in this case is obtained from the
continuum transient diffusion-convection model by using differencing methods.

Model for discrete transient convective loop


Consider a set of 3 identical cells arranged in a convective loop as shown in Figure 1.3.
For simplicity, assume that all the cells have the same volume and the flow rate
through the loop is constant (at q). The transient model of the system is given by

dc
= AL c; c (at t ′ = 0) = c0 (1.60)
dt ′

where the 3 × 3 dimensionless loop connectivity matrix AL is given by

−1 0 1
AL = ( 1 −1 0 ) (1.61)
0 1 −1

Generalizing to the case of a loop consisting of N identical cells, we obtain equation (1.60) with

−1 0 0 0 . 0 0 1
1 −1 0 0 . 0 0 0
( 0 1 −1 0 . 0 0 0 )
( )
AL = (
( . . . . . . . . ).
) (1.62)
( 0 0 0 0 . −1 0 0 )
0 0 0 0 . 1 −1 0
( 0 0 0 0 . 0 1 −1 )

We note that AL is not a symmetric matrix but is a special case of a circulant matrix.
In general, a circulant matrix is a matrix of the form

( a1   a2   .  .  .   aN
  aN   a1   .  .  .   aN−1
  .     .   .  .  .    .
  .     .   .  .  .    .
  a2    .   .  .  .   a1 )    (1.63)

in which every row starting with the second can be obtained from the previous row
by moving each of its elements one column to the right with the last element circling
to become the first. Each diagonal of a circulant matrix consists of identical elements.

Figure 1.3: Schematic diagram of a discrete convective loop with 3 cells.

It is a special case of a Toeplitz matrix for which elements on all the diagonals are
constants. The matrices appearing in the above examples are also known as banded
Toeplitz matrices.
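Circulant matrices are generated by cyclic shifts of their first row; a small sketch generating AL of equation (1.62) for any N:

(* The N-cell loop matrix (1.62) as a circulant built from its first row. *)
n = 6;
row1 = Join[{-1}, ConstantArray[0, n - 2], {1}];   (* (-1, 0, ..., 0, 1) *)
AL = Table[RotateRight[row1, i - 1], {i, n}];
AL // MatrixForm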

1.8 Application of elementary matrix concepts


Example 1.10 (Mass transfer disguised reaction rate constant matrix). Consider a
fluid-solid system in which reactions occur only on the solid (catalyst) surface (Fig-
ure 1.4).

Figure 1.4: Schematic diagram of flow past a particle on the surface of which chemical reactions
occur.

A simplified model of such a system at steady state is obtained by using the concept
of the mass transfer coefficient between the fluid and solid. For the case of a single re-

action of the form A → B, the steady-state model is obtained by equating the external
flux

je = kc (CAb − CAs ) (1.64)

to the reaction rate (the rate of disappearance of A), which for the case of linear kinet-
ics (first-order reaction) may be expressed as

r = ks av CAs = kv CAs . (1.65)

Here, kc is the external mass transfer coefficient, av is the solid-fluid exchange area
per unit volume, ks is the first-order surface reaction rate constant (kv = ks av is the
first-order rate constant) and CAb and CAs are the bulk (cup-mixing) and surface con-
centrations of the reactant species A, respectively. For the case of several reactions
among N species with linear kinetics, the external flux vector may be expressed as

je = Kc (Cb − Cs ) (1.66)

while the reaction rate vector is of the form

r = KRs Cs (1.67)

where Kc is an N × N matrix of mass transfer coefficients, KRs is an N × N reaction rate constant matrix and Cb (Cs ) is the N × 1 bulk (surface) concentration vector. The i-th
component of the vector on the LHS of equation (1.66) is the rate of transport of the
species i from the bulk to the solid (catalytic) surface while the i-th component of the
vector (KRs Cs ) is the net rate of consumption of species i in various chemical reactions
on the surface. Eliminating the unknown surface concentration vector Cs gives the
rate in terms of the bulk concentration as

r = K∗ Cb (1.68)

where the apparent or mass transfer disguised rate constant matrix K∗ is defined by

K∗ = KRs (KRs + Kc )^−1 Kc .    (1.69)

For numerical calculations and physical interpretation of this result, it is convenient to define

KRs = ks A, Kc = kc M (1.70)

where A is the matrix of relative rate constants and M is the matrix of relative mass transfer coefficients. This allows us to write (1.69) as

K∗ = KRs H    (1.71)

where the external effectiveness factor matrix is defined by

H = (Dapm A + M)^−1 M = (I + Dapm M^−1 A)^−1    (1.72)

and Dapm is the particle (or local) Damköhler number defined by

Dapm = ks /kc = kv /(kc av )

[Remarks: The matrices A, M and H are dimensionless or have elements with no units.
In simplifying the expression for H, we have assumed that M is invertible and used
the property given by equation (1.24).] To illustrate how the external mass transfer
can disguise the true reaction network (leading to the so called falsified kinetics), we
consider the following numerical values:

Dapm = 1,    M = ( 1  0  0            A = ( 1     −1/2    0
                   0  1  0 ),               −1     1     −1/4
                   0  0  1 )                 0    −1/2    1/4 ),

which gives

H = (1/33) ( 19   5    1
             10   20   4
             4    8    28 )

K∗ = (ks /33) ( 14   −5   −1
               −10   13   −4
                −4   −8    5 )

The true and disguised rate constants (and reaction networks) are shown in Figure 1.5.
We note that two extra reactions (P → R and R → P) (that are not present in the true
reaction network) appear in the mass transfer disguised reaction network.
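The matrix algebra above is easily checked in Mathematica® (a verification sketch for the numbers quoted; the entries of Kstar come out in units of ks ):

(* Effectiveness-factor matrix H and disguised rate constants, 3-species case. *)
A = {{1, -1/2, 0}, {-1, 1, -1/4}, {0, -1/2, 1/4}};
M = IdentityMatrix[3]; Dapm = 1;
H = Inverse[IdentityMatrix[3] + Dapm Inverse[M] . A];
Kstar = A . H;
{H, Kstar}   (* (1/33){{19,5,1},{10,20,4},{4,8,28}}, (1/33){{14,-5,-1},...} *)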
In the general case of several reactions among many species, the true reaction net-
work may be represented by a sparse matrix of about N nonzero rate constants while
the mass transfer disguised rate constant matrix may have N 2 nonzero rate constants.
Thus, the number of spurious reactions that appear in the mass transfer disguised rate
constant matrix is N^2 − N, which is quite large for large N. See below for further illustration for the case of large N, where symbolic manipulation and computer algebra software (such as Mathematica®, Matlab® or Fortran/Python) can be used, as solving by hand would take much longer.

Figure 1.5: Schematic representation of the true (left) and mass transfer disguised reaction net-
works.

1.9 Application of computer algebra and symbolic manipulation


Consider the true reaction network among S species Aj as given by

A1 ⇌ A2 ⇌ A3 . . . ⇌ Ai ⇌ Ai+1 . . . ⇌ As    (1.73)

where ki and k−i are the forward and backward rate constants for the step between species Ai and Ai+1 , with linear kinetics. Thus, the net rates of formation of these species are given by

d[A1 ]/dt = −k1 [A1 ] + k−1 [A2 ]    (1.74)
d[A2 ]/dt = k1 [A1 ] − (k−1 + k2 )[A2 ] + k−2 [A3 ]    (1.75)
. . .
d[Ai ]/dt = ki−1 [Ai−1 ] − (k−(i−1) + ki )[Ai ] + k−i [Ai+1 ]    (1.76)
. . .
d[As−1 ]/dt = ks−2 [As−2 ] − (k−(s−2) + ks−1 )[As−1 ] + k−(s−1) [As ]    (1.77)
d[As ]/dt = ks−1 [As−1 ] − k−(s−1) [As ]    (1.78)

Thus, defining xi as the mole fraction,

xi = [Ai ] / ∑_{j=1}^{S} [Aj ],    (1.79)

we can write

dx/dt = −KRs x,   t > 0;   x = x0 at t = 0    (1.80)

(the minus sign appears because KRs , written as in Section 1.8, is a net consumption matrix), where the true rate constant matrix is given by

KRs = ( k1     −k−1        0            0               .       0                  0
        −k1    k−1 + k2   −k−2          0               .       0                  0
         0       .          .           .               0       0                  0
         0       0        −k(i−1)    k−(i−1) + ki     −k−i      0                  0
         0       0          0           .               .       .                  0
         0       0          0           0           −k(s−2)   k−(s−2) + ks−1    −k−(s−1)
         0       0          0           0               0     −k(s−1)            k−(s−1) )    (1.81)

and the mole fraction vector x = ( x1  x2  ⋅ ⋅ ⋅  xs )^T .    (1.82)

Then, depending on the transfer coefficient matrix Kc , the apparent (mass-transfer disguised) rate constant matrix K∗ given by equation (1.69) can have all elements nonzero, i. e.,

K∗ = {kij∗ }    where kij∗ ≠ 0    (1.83)

and new reactions appear between species Ai and Aj (where |i − j| > 1):

Ai ⇌ Aj    (forward rate constant kij∗ , backward rate constant kji∗ ),   |i − j| > 1.    (1.84)

The number of new reactions appearing for the true reaction network (equation (1.73)) is shown in Table 1.1. Thus, for a large number of species, the simple sequential network is observed as a densely connected reaction network because of the mass transfer, and a significant number of disguised reactions appear.

Table 1.1: Number of new mass-transfer disguised reactions.

Number of    Number of reactions in true    Number of reactions in mass-    Spurious reactions
species S    network (equation (1.73))      transfer disguised network      (difference)
2            2                              2                               0
3            4                              6                               2
4            6                              12                              6
5            8                              20                              12
10           18                             90                              72
50           98                             2450                            2352
100          198                            9900                            9702
S            2(S − 1)                       S(S − 1)                        (S − 1)(S − 2)

Note that when the number of species S is large, the evaluation of the mass-transfer disguised rate constants can be time consuming, and hence symbolic/numerical software such as Mathematica®, Matlab® or Maple can be utilized for this purpose. Here, we use Mathematica® to demonstrate some examples.
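For instance, the tridiagonal matrix of equation (1.81) can be generated for any S with a short helper function (a sketch; the name trueK is ours):

(* True rate-constant matrix (1.81) from the forward (kf) and backward (kb)
   rate-constant lists; kf and kb each have length S - 1. *)
trueK[kf_List, kb_List] := Module[{s = Length[kf] + 1},
  SparseArray[{
    Band[{1, 1}] -> Join[{First[kf]},
      Table[kb[[i - 1]] + kf[[i]], {i, 2, s - 1}], {Last[kb]}],
    Band[{1, 2}] -> -kb, Band[{2, 1}] -> -kf}, {s, s}]];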

1.9.1 Example 1: mass transfer disguised matrix for a five species system

Assuming S = 5 and the true rate constants (see equation (1.73)) are

k1 = ks ;        k−1 = (1/2)ks ;
k2 = (1/2)ks ;   k−2 = (1/4)ks ;
k3 = (1/4)ks ;   k−3 = (1/8)ks ;
k4 = (1/10)ks ;  k−4 = (1/20)ks .

KRs = ks A   where   A = ( 1     −1/2     0      0       0
                           −1     1      −1/4    0       0
                           0     −1/2     1/2   −1/8     0
                           0      0      −1/4    9/40   −1/20
                           0      0       0     −1/10    1/20 )

In addition, assuming that the transfer coefficient matrix Kc is diagonal with all diagonal elements the same, i. e.,

Kc = kc M,    where M = I5 (the 5 × 5 identity matrix),

such that Dapm = ks /kc = 1. Then the mass-transfer disguised rate constant matrix can be obtained using equations (1.71) and (1.72) as

K∗ = KRs H = ks A(I + Dapm M^−1 A)^−1 = ks A(I5 + A)^−1

   = ks ( 4/5   −1/5   −1/5   −1/5   −1/5
          −1/5   4/5   −1/5   −1/5   −1/5
          −1/5  −1/5    4/5   −1/5   −1/5
          −1/5  −1/5   −1/5    4/5   −1/5
          −1/5  −1/5   −1/5   −1/5    4/5 )

which has all elements nonzero. In other words, S(S − 1) = 20 reactions are observed due to mass transfer. The true and mass transfer disguised reaction networks are shown in Figure 1.6, where the additional reactions with the corresponding rate constants are depicted.
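This calculation takes a few lines with the trueK helper defined above (a sketch; we simply verify that every entry of K∗ becomes nonzero):

(* Example 1: S = 5, Dapm = 1, M = identity. *)
kf = {1, 1/2, 1/4, 1/10}; kb = {1/2, 1/4, 1/8, 1/20};  (* in units of ks *)
A = Normal[trueK[kf, kb]];
Kstar = A . Inverse[IdentityMatrix[5] + A];
Count[Flatten[Kstar], 0]   (* 0: all 25 entries of Kstar are nonzero *)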

Figure 1.6: Effect of mass-transfer on observed kinetics: (left) true reaction network and (right) mass-
transfer disguised reaction network for Example 1 (5 species leading to 12 new reactions).

1.9.2 Example 2: mass transfer disguised matrix for a ten species system

Assuming S = 10 and the true rate constants (see equation (1.73)) are

k1 = ks ; k−1 = 0.5ks ;
k2 = 0.5ks ; k−2 = 0.25ks ;
k3 = 0.25ks ; k−3 = 0.125ks ;
k4 = 0.1ks ; k−4 = 0.05ks ;
k5 = 0.25ks ; k−5 = 0.1ks
k6 = 0.1ks ; k−6 = 0.15ks
k7 = 0.2ks ; k−7 = 0.3ks
k8 = 0.1ks ; k−8 = 0.05ks
k9 = 0.2ks ; k−9 = 0.1ks

⇒ KRs = ks A where

1 −0.5 0 0 0 0 0 0 0 0
−1 1 −0.25 0 0 0 0 0 0 0
(
( 0 −0.5 0.5 −0.125 0 0 0 0 0 0 )
)
( 0 0 −0.25 0.225 −0.05 0 0 0 0 0 )
( )
( 0 0 0 −0.1 0.3 −0.1 0 0 0 0 )
( )
A=( )
( 0 0 0 0 −0.25 0.2 −0.15 0 0 0 )
( )
( 0 0 0 0 0 −0.1 0.35 −0.3 0 0 )
( )
( 0 0 0 0 0 0 −0.2 0.4 −0.05 0 )
0 0 0 0 0 0 0 −0.1 0.25 −0.1
( 0 0 0 0 0 0 0 0 −0.2 0.1 )

In addition, assuming that the transfer coefficient matrix Kc is diagonal as given by


Kc = kc M,    where

1 0 0 0 0 0 0 0 0 0
0 2 0 0 0 0 0 0 0 0
( 0 0 3 0 0 0 0 0 0 0 )
( )
( 0 0 0 1 0 0 0 0 0 0 )
( )
( 0 0 0 0 4 0 0 0 0 0 )
( )
M=( ),
( 0 0 0 0 0 2 0 0 0 0 )
( )
( 0 0 0 0 0 0 1 0 0 0 )
( )
( 0 0 0 0 0 0 0 3 0 0 )
0 0 0 0 0 0 0 0 5 0
( 0 0 0 0 0 0 0 0 0 1 )

such that Dapm = ks /kc = 2. Then the mass-transfer disguised rate constant matrix can be obtained using equations (1.71) and (1.72) as

K∗ = KRs H = ks A(I + Dapm M^−1 A)^−1 = ks A(I10 + 2M^−1 A)^−1

        (  0.478  −0.043  −0.065  −0.022  −0.087  −0.043  −0.022  −0.065  −0.109  −0.022
          −0.043   0.913  −0.13   −0.043  −0.174  −0.087  −0.043  −0.13   −0.217  −0.043
          −0.065  −0.13    1.304  −0.065  −0.261  −0.13   −0.065  −0.196  −0.326  −0.065
          −0.022  −0.043  −0.065   0.478  −0.087  −0.043  −0.022  −0.065  −0.109  −0.022
K∗ = ks ( −0.087  −0.174  −0.261  −0.087   1.652  −0.174  −0.087  −0.261  −0.434  −0.087
          −0.043  −0.087  −0.13   −0.043  −0.174   0.913  −0.043  −0.13   −0.217  −0.043
          −0.022  −0.043  −0.065  −0.022  −0.087  −0.043   0.478  −0.065  −0.109  −0.022
          −0.065  −0.13   −0.196  −0.065  −0.261  −0.13   −0.065   1.304  −0.326  −0.065
          −0.109  −0.217  −0.326  −0.109  −0.434  −0.217  −0.109  −0.326   1.956  −0.109
          −0.022  −0.043  −0.065  −0.022  −0.087  −0.043  −0.022  −0.065  −0.109   0.478 )

which has all elements nonzero. Note that all diagonal elements are positive while the off-diagonal elements are negative, as expected. In this case, the total number of observed reactions is S(S − 1) = 90, i. e., 72 new reactions appear, as shown in Figure 1.7, where the true and the mass-transfer disguised reaction networks are depicted.

Problems
1. (Formulation of linear models): Consider the following simplified economic model
of a small town represented by three major industries: coal mining, transportation
and electricity. It is estimated that the production of a one dollar value of coal re-
quires the purchase of 10 cents of electricity and 20 cents of transportation. Simi-
larly, the production of one dollar output of transportation requires the purchase
of 2 cents of coal and 35 cents of electricity, while production of one dollar value
of electricity requires purchase of 10 cents of electricity, 50 cents of coal and 30
cents of transportation. The town has external contracts for $1,000,000 of coal,
$1,600,000 of transportation and $500,000 of electricity. Show that the problem

Figure 1.7: Effect of mass transfer on kinetics: (a) true reaction network and (b) mass-transfer dis-
guised reaction network for example 2 (10 species leading to 72 new reactions).

of determining how much coal, electricity and transportation is required to sup-


ply external demand without a surplus is equivalent to solving a system of (three)
linear equations of the form

x = Ax + d or (I − A)x = d

where x is called the production vector, d is the demand vector and A is the con-
sumption matrix. Determine the production vector by solving the linear equa-
tions. [Remark: Models of this type were first developed by W. W. Leontief, who
won the 1973 economics Nobel prize.]
2. (Formulation of linear models): Paul, Jim and Mike decide to help each other build
houses. Paul will spend half of his time on his own house and a quarter of his
time on each of the houses of Jim and Mike. Jim will spend one-third of his time
on each of the three houses under construction. Mike will spend one-sixth of his
time on Paul’s house, one-third on Jim’s house and one-half on his own house.
For tax purposes, each must place a price on his labor, but they want to do so in a
way that each will break even. Formulate the relevant equations, solve them and
suggest a price on the labor of each person such that the hourly wages for each
exceed the minimum wage.
3. (Gaussian elimination and LU decomposition):
(a) Consider the linear system

x1 + 2x2 − x3 = 0
2x1 + x2 + 2x3 = 8
6x1 + 2x2 + 2x3 = 14

i. Solve the above system by Gaussian elimination.


ii. Determine the matrices L and U in the decomposition A = LU.
iii. Use the result in (ii) to find A−1 .

(b) i. Develop an algorithm to decompose A = LU where A is a tridiagonal matrix.
    ii. Use the algorithm to solve Ax = b where

4 −1 0 0 0 1
−1 4 −1 0 0 0
A=( 0 −1 4 −1 0 ), b=( 0 )
0 0 −1 4 −1 0
0 0 0 −1 4 0

4. (Operation count for LU decomposition and matrix inverse calculations):


(a) Show that the number of operations required to do the decomposition of A =
LU of a square matrix is given by

MD = (n − 1)n(n + 1)/3 ≈ (1/3)n^3    (n ≫ 1)
AS = n(n − 1)(2n − 1)/6 ≈ (1/3)n^3    (n ≫ 1)

where AS = additions/subtractions and MD = multiplications/divisions.


How many additional operations are needed to solve Ax = b?
(b) Show that (for n ≫ 1) the calculation of A−1 is only four times the expense of
solving Ax = b.
(c) Obtain the results analogous to those in (a) for a tridiagonal system of equa-
tions.
5. (Solution of linear homogeneous equations):
(a) Solve the following system and determine the number of linearly independent
solutions:

u2 − 2u3 − 4u4 = 0
u3 − 3u4 = 0
2u1 + u2 + 3u3 + 7u4 = 0
6u1 + 2u2 + 10u3 + 28u4 = 0

(b) Determine the relationship between the parameters k and Ra for which the
following set of equations has a nontrivial solution:

c1 − (π 2 + k 2 )c2 = 0
−(π 2 + k 2 )c1 + Ra k 2 c2 = 0.

[Remark: This relation is called the neutral stability curve and defines the on-
set of convection in a fluid layer heated from below. Here, k is the wave num-

ber and Ra is the Rayleigh number]. Determine the minimum value of Ra for
which a nontrivial solution exists.
(c) Determine all of the nontrivial solutions for the following system of homoge-
neous equations:

u1 − iu2 = 0
u2 + u3 = 0
u1 + u2 − u4 = 0
u2 + iu3 + iu4 = 0, where i = √−1

6. (Solution of linear inhomogeneous equations):


(a) Determine the values of λ for which the following system has (i) a unique so-
lution, (ii) no solution and (iii) more than one solution:

u1 − 3u3 = −3
2u1 + λu2 − u3 = −2
u1 + 2u2 + λu3 = 1.

(b) Verify that the following linear system is inconsistent, and hence has no so-
lution:

u1 + 4u2 − u3 = −5
5u1 + 2u2 − 3u3 = −1
−2u1 + u2 + u3 = −2
u1 + 5u2 = −2

(c) Obtain a general solution to the following system of equations:

u1 − u3 + 2u4 + u5 + 6u6 = −3
u2 + u3 + 3u4 + 2u5 + 4u6 = 1
u1 − 4u2 + 3u3 + u4 + 2u6 = 0
2u1 − 4u2 + 2u3 + 3u4 + u5 + 8u6 = −3.

7. (Simultaneous linear equations: Countercurrent extraction process): The models


used to describe the steady-state (and also transient) behavior of plate, gas ab-
sorbers, extraction units and rectifying and stripping sections of a distillation
column are essentially the same. As an illustration, consider a 3-stage absorber
(Figure 1.8) in which a heavy (liquid) phase and a light (gas) phase pass coun-
tercurrent to each other. The contacting may be assumed to be uniform so that
equilibrium is attained in each stage and the equilibrium relationship is linear
(y = Kx). (a) Show that the steady-state model of the system is of the form

0 = αxj−1 − (α + β)xj + βxj+1 j = 1, 2, 3

where α = L/h, β = GK/h, L is the liquid (heavy-phase) flow rate, h is the holdup (which
is assumed to be constant and same for all stages), G is the gas (light-phase) flow
rate and xj is the composition of the transferable component in the liquid stream
leaving stage j. State any other assumptions involved. (b) Generalize the model
in (a) to the case of N stages and put it in vector-matrix form. Assuming that the compositions x0 and xN+1 = yN+1 /K of the entering streams are known, show that
the model may be written in the form

Ax = b

Identify the vectors x, b and the matrix A. (c) Compute the steady-state values for
y1 and x3 when N = 3 (three-stage process) and other parameters are given as
follows:

L = 5, G = 3, h = 1, K = 1, x0 = 0, y4 = 0.5

Figure 1.8: Schematic diagram of three stage extraction process.

8. (Formulation of interacting cell/discrete diffusion models): Consider the flow sys-


tem shown in Figure 1.9. Assume that each tank is well mixed and a unit mass
(e. g., 1 kg) of species A is suddenly dumped into the larger tank at time zero and
that all the tanks are free of species A at t < 0. Further assume that VR = 1 m3 and
qe = 1 m3 /min:
(a) Formulate the differential equations describing the transient behavior of the
system and put them in vector/matrix form.
(b) Determine the steady-state concentrations in each tank.
(c) Generalize the model for a system of N interacting tanks/cells of equal volume
arranged in a circular array with all equal forward and reverse exchange flow
rates. Put the equations in dimensionless form and identify the structure of
the matrix that appears.
9. (Mass transfer disguised rate constant matrix): Determine the mass-transfer dis-
guised rate constant matrix for the following reaction network (Figure 1.10):
Assume that the mass transfer coefficients for all the species are equal and
Dapm = 1. How many new reactions appear? Generalize the result for the case of

Figure 1.9: Schematic diagram of three interacting tanks.

Figure 1.10: Sequential reversible reaction network with five species.

2N consecutive (reversible) reactions among (N +1) species. [Use of Mathematica®


is recommended for this exercise.]
10. (Discrete or compartmental loop diffusion–convection–reaction model): Consider
a discrete convective loop consisting of N identical cells of equal volume and as-
sume that on the main convective flow (QL ), we superimpose weak flows that cor-
respond to the entry or exit of one or more streams containing reactants and/or
products in any particular cell. Assume a constant density system with a single
reaction occurring in each cell. Using the same notation as in Section 1.7, show
that the reactant species balance in vector-matrix form is given by

Qc = VR [dc/dt + r(c)] − qin cin (t) + qe c    (1.85)

with an appropriate initial condition. Here, c (cin (t)) is the vector representing the
limiting reactant (inlet) concentrations in various cells, qin is a diagonal matrix
representing the inlet volumetric flow rates to the cells, qe is a matrix representing
the auxiliary flow rates leaving various cells (excluding the main convective flow),
Q is the N × N (loop or cell connectivity) matrix defined by

1 0 ... 0 −1
−1 1 ... 0 0
Q = QL ( 0 0 0 )
( −1 ... ), (1.86)
.. .. .. .. ..
. . . . .
( 0 0 ... −1 1 )
1.9 Application of computer algebra and symbolic manipulation | 43

and r(c) is the vector of reaction rates. Cast the model in dimensionless form and
identify the various matrices for the special case of one entering stream in cell 1
and one exit stream in cell j (1 ≤ j ≤ N).
11. Compartment models for 2D and 3D transient diffusion: Consider the arrangement
of cells shown in Figure 1.11. Assuming that all cells are of equal volume and ex-
change flow rates are identical in magnitude, determine the coupling matrix for
each case.

Figure 1.11: Schematic of interacting tanks/cells in two and three dimensions.

12. Discrete interacting convective loops: Determine the coupling matrix for the dis-
crete interacting convective loops shown in Figure 1.12. Assume all cells to be of
identical volume and magnitude of all exchange/convective flows to be equal.

Figure 1.12: Schematic of interacting loops with a common cell.


2 Determinants
The theory of determinants plays a very important role in the solution of linear al-
gebraic and differential equations. Specifically, the conditions for the existence and
uniqueness of solutions to linear equations are often expressed in terms of determi-
nants. The conditions for the existence of new (bifurcating) solutions to nonlinear
equations may also be expressed in terms of the determinant of the linearized (Jaco-
bian) matrix. In this chapter, we review the properties of determinants and illustrate
their use with some examples.

2.1 Definition of determinant


Let A be an n × n square matrix with real or complex entries

A = ( a11  a12  a13  .  .  .  a1n
      a21  a22  a23  .  .  .  a2n
       .    .    .   .  .  .   .
       .    .    .   .  .  .   .
      an1  an2  an3  .  .  .  ann )

The determinant of A, denoted by |A|, det A or

| a11  a12  a13  .  .  .  a1n |
| a21  a22  a23  .  .  .  a2n |
|  .    .    .   .  .  .   .  |
|  .    .    .   .  .  .   .  |
| an1  an2  an3  .  .  .  ann |

is defined by

det A = ∑(−1)h a1k1 a2k2 . . . ankn (2.1)

where the summation is taken over all possible products a1k1 a2k2 . . . ankn in which each
product has n elements with exactly one element arising from each row and each col-
umn of A. The value of h is the number of transpositions required to put the sequence
(k1 , k2 , . . . , kn ) in its natural order. Though h is not unique, it may be shown that it is
always even or odd for a given sequence. Note that there are n! terms in the summation
in equation (2.1) and (k1 , k2 , . . . , kn ) is a permutation of the sequence (1, 2, 3, . . . , n).


Example 2.1. Consider the 2 × 2 matrix

A = ( a11  a12
      a21  a22 )

det A = ∑(−1)^h a1k1 a2k2

(k1 , k2 ) = (1, 2) ⇒ h = 0
(k1 , k2 ) = (2, 1) ⇒ h = 1

Therefore, we get

|A| = a11 a22 − a12 a21

Example 2.2. Consider the 3 × 3 matrix

A = ( a11  a12  a13
      a21  a22  a23
      a31  a32  a33 )

|A| = ∑(−1)h a1k1 a2k2 a3k3

(k1 , k2 , k3 ) = (1, 2, 3) ⇒ h = 0
(k1 , k2 , k3 ) = (2, 3, 1) ⇒ h = 2
(k1 , k2 , k3 ) = (3, 1, 2) ⇒ h = 2
(k1 , k2 , k3 ) = (3, 2, 1) ⇒ h = 1
(k1 , k2 , k3 ) = (1, 3, 2) ⇒ h = 1
(k1 , k2 , k3 ) = (2, 1, 3) ⇒ h = 1

Thus, the determinant of a 3 × 3 matrix is given by

|A| = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 − a11 a23 a32 − a12 a21 a33
= a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a22 a31 )

Example 2.3. What is the sign of the term a13 a24 a31 a42 in the expansion of a 4 × 4
determinant?
Since the first subscripts are ordered, we need to look at the sequence formed
by the second subscripts (3, 4, 1, 2). Permuting 1 and 3 gives (1, 4, 3, 2). Permuting 2
and 4 gives (1, 2, 3, 4). Thus, h = 2 and the term a13 a24 a31 a42 appears with a positive
sign.

2.2 Properties of the determinant


The following properties of a determinant may be established from the definition.
1. (a) The determinant of a matrix A and its transpose are identical, i. e., det A =
det AT . From the definition, we have

det A = ∑(−1)^h a1k1 a2k2 a3k3 . . . ankn    (rows ordered, columns permuted)
det A^T = ∑(−1)^h′ ak1 1 ak2 2 ak3 3 . . . akn n    (columns ordered, rows permuted)

The same n! terms appear with the same sign.


(b) det(conj A) = conj(det A), i. e., the determinant of the elementwise complex conjugate of A equals the complex conjugate of det A. This follows from the corresponding property of the complex numbers:

conj(α1 α2 ) = conj(α1 ) conj(α2 )

For a square matrix with complex entries, we have

det A∗ = det((conj A)^T ) = det(conj A) = conj(det A).

2. If A has a row (or column) of zeros, then |A| = 0


3. When two rows of A are interchanged, |A| changes sign. Suppose rows i and j of A are interchanged to obtain Ã. Then

det A = ∑(−1)^h a1k1 . . . aiki . . . ajkj . . . ankn    (2.2)

det Ã = ∑(−1)^h′ a1k1 . . . ajki . . . aikj . . . ankn    (2.3)

Corresponding terms in equations (2.2) and (2.3) differ by one transposition. Thus, h′ = h ± 1 and

|Ã| = −|A|

4. If two rows (or columns) are identical, then

|A| = 0

Using property (3), we get

det A = − det A (interchanging rows)


⇒ det A = 0

5. Multiplication of a row (or column) by a nonzero scalar α:

∑(−1)^h a1k1 a2k2 . . . (αaiki ) . . . ankn = α ∑(−1)^h a1k1 a2k2 . . . ankn = α|A|

This implies that if all the elements in a row (or column) are multiplied by α, then the determinant is multiplied by α.
6. If row j is multiplied by α (≠ 0) and added to row i, the determinant is unchanged:

det A = ∑(−1)^h a1k1 . . . aiki . . . ajkj . . . ankn

det Ã = ∑(−1)^h a1k1 . . . (aiki + αajki ) . . . ajkj . . . ankn
      = ∑(−1)^h a1k1 . . . aiki . . . ankn + α ∑(−1)^h a1k1 . . . ajki . . . ajkj . . . ankn
      = det A + 0,

since the second sum is the expansion of a determinant with two identical rows.

7. If A is upper or lower triangular matrix, then det A is the product of all diagonal
elements. From the definition, we have

det A = ∑(−1)h a1k1 . . . aiki . . . ajkj . . . ankn

Now, to get a nonzero product we must have k1 = 1, k2 = 2, . . . , kn = n, so that det A = a11 a22 . . . ann .


8. Let In = n × n identity matrix. Then

det In = 1

(a) Let E1n = elementary matrix obtained by performing row operation of type 1 (interchange of rows) on In . Then

|E1n | = −1

(b) Let E2n = elementary matrix obtained by performing row operation of type 2
on In . Then

|E2n | = k

(c) Let E3n = matrix obtained by performing elementary row operation of type 3
on In . Then

|E3n | = 1

Let B = Ein A (i = 1, 2 or 3). Then |B| = |Ein ||A|. Now suppose that

B = Em Em−1 . . . E1 A

Then

|B| = |Em ||Em−1 | . . . |E1 ||A|

It follows from this property that if a square matrix is not singular, then its echelon
form is an upper triangular matrix with nonzero elements along the diagonal.

2.3 Computation of determinant by pivotal condensation


The above properties of the determinant may be used to calculate any n-th order de-
terminant numerically. Let

det A = | a11  a12  .  .  .  a1n |
        | a21  a22  .  .  .  a2n |
        |  .    .   .  .  .   .  |
        |  .    .   .  .  .   .  |
        | an1  an2  .  .  .  ann |

Then the pivotal condensation algorithm may be stated as follows:
1. Set det A = 1.
2. Perform elementary row operations on A to reduce it to a triangular matrix. Reset det A after each operation as follows:
   type 1 → multiply det A by (−1)
   type 2 (multiplication of a row by k ≠ 0) → multiply det A by 1/k
   type 3 → multiply det A by 1
3. Compute the determinant as the product of the diagonal elements and all the factors in step (2).

Example 2.4.

A = ( 1  3   5
      2  0  −1
      1  4   3 )

Set D0 = det A = 1. Multiply row 1 by 2 and subtract from row 2. Subtract row 1 from row 3. D1 = 1.

A → ( 1   3    5
      0  −6  −11
      0   1   −2 )

Multiply row 2 by 1/6. D2 = 6.

A → ( 1   3     5
      0  −1  −11/6
      0   1    −2 )

Add row 2 to row 3. D3 = 6.

A → ( 1   3     5
      0  −1  −11/6
      0   0  −23/6 )

Thus,

det A = 6 × 1 × (−1) × (−23/6) = 23
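A compact sketch of this algorithm in Mathematica® (our own helper; it assumes a nonsingular matrix, and the built-in Det gives the same answer):

(* Pivotal condensation: reduce to triangular form, tracking sign flips. *)
pivotalDet[m_] := Module[{a = N[m], n = Length[m], sign = 1, p},
  Do[
   If[a[[k, k]] == 0,   (* bring up a nonzero pivot; each swap flips the sign *)
    p = k + First[FirstPosition[a[[k + 1 ;;, k]], _?(# != 0 &)]];
    a[[{k, p}]] = a[[{p, k}]]; sign = -sign];
   Do[a[[i]] -= (a[[i, k]]/a[[k, k]]) a[[k]], {i, k + 1, n}],
   {k, 1, n - 1}];
  sign Times @@ Diagonal[a]];
pivotalDet[{{1, 3, 5}, {2, 0, -1}, {1, 4, 3}}]   (* 23. *)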

2.4 Minors, cofactors and Laplace’s expansion

We consider the n-th order determinant

det A = | a11  a12  .  a1j  .  a1n |
        | a21  a22  .  a2j  .  a2n |
        |  .    .   .   .   .   .  |
        | ai1  ai2  .  aij  .  ain |
        |  .    .   .   .   .   .  |
        | an1  an2  .  anj  .  ann |

      = ∑(−1)^h a1k1 a2k2 . . . ankn

and the (n − 1) × (n − 1) matrix obtained from A by deleting the i-th row and j-th column
of A. Denote this matrix by Mij . This is called a minor. The cofactor of element aij is
defined by

Aij = (−1)^(i+j) |Mij |

Note that the cofactor is a number whereas minor is a matrix of order (n − 1) × (n − 1).
Laplace’s expansion of an n-th order determinant in terms of determinants of order
(n − 1) may be stated as follows:

n
det A = ∑ aij Aij for any i = 1, 2, 3, . . . , n
j=1

That is, take any row and determine the cofactors of the elements of this row. Multiply
the elements by the corresponding cofactors and sum to get the determinant. Simi-
larly, the expansion in terms of columns is given by

n
det A = ∑ aij Aij for any j = 1, 2, 3, . . . , n
i=1

Proof of Laplace’s expansion.

det A = ∑(−1)^h a1k1 a2k2 . . . aiki . . . ankn    (2.4)

      = ∑_{j=1}^{n} aij (−1)^(i+j) |Mij |    (2.5)

To prove Laplace’s expansion (equations (2.4) and (2.5) are identical), first we note
that (2.5) has n! terms and each term contains exactly one element from each row and
column of A. This follows from the fact that each |Mij | has (n − 1)! terms, which do not
include any elements from the i-th row or j-th column of A. Thus, the sum in (2.5) has
(n)(n − 1)! = n! terms. To account for the signs of the terms, we consider the matrix
 obtained by moving the i-th row of A to the last row and the j-th column of A to
the last column. Then, when we expand the determinant of Â, each term in it differs
from that of A by exactly (n − i) row transpositions and (n − j) column transpositions.
Equivalently, the sign of the term differs by a factor (−1)(n−i)+(n−j) = (−1)2n−(i+j) = (−1)i+j .
Thus, the expansion given by (2.5) gives the determinant of A.

Classical adjoint of a square matrix A: (also called the adjugate of A)


The matrix obtained by replacing each element aij of A by its corresponding cofactor
and transposing the rows and columns is called the classical adjoint or adjugate of A
and is denoted by adj A, i. e.,

adj A = {Aji } (2.6)

Alien cofactor expansion


Suppose we multiply the elements of row i by the cofactors of the k-th row (k ≠ i) and
sum the result. We claim
n
∑ aij Akj = 0 (k ≠ i) (2.7)
j=1

Akj is called the alien cofactor of aij and equation (2.7) is called the alien cofactor ex-
pansion. To establish this expansion, we consider the identity

| a11  a12  a13  .  .  .  a1n |
| a21  a22  a23  .  .  .  a2n |
|  .    .    .   .  .  .   .  |
| ai1  ai2  ai3  .  .  .  ain |
|  .    .    .   .  .  .   .  |   = ak1 Ak1 + ak2 Ak2 + ⋅ ⋅ ⋅ + akn Akn
| ak1  ak2  ak3  .  .  .  akn |
|  .    .    .   .  .  .   .  |
| an1  an2  an3  .  .  .  ann |

and replace (ak1 , ak2 , . . . , akn ) by (ai1 , ai2 , . . . , ain ), i. e., replace the elements of the k-th
row by the elements of the i-th row. Then, on the left-hand side (LHS) we have two
identical rows, and hence LHS = 0. This replacement of the k-th row by elements of
the i-th row does not change the cofactors Aki (i = 1, . . . , n). Now, RHS = ∑nj=1 aij Akj (we
are multiplying the cofactors of the k-th row by elements of the i-th row). Therefore,
we have ∑nj=1 aij Akj = 0, i ≠ k.

2.4.1 Classical adjoint and inverse matrices

The Laplace and alien cofactor expansions may be used to prove the following theo-
rem.

Theorem. Let adj A = adjugate of the n × n matrix A (Classical adjoint). Then

A(adj A) = (adj A)A = (det A)I

Proof.

adj A = ( A11  A21  .  .  .  An1
          A12  A22  .  .  .  An2
           .    .   .  .  .   .
          A1n  A2n  .  .  .  Ann )

A adj A = ( a11  a12  .  a1n     ( A11  A21  .  An1
            a21  a22  .  a2n       A12  A22  .  An2
             .    .   .   .         .    .   .   .
            an1  an2  .  ann )     A1n  A2n  .  Ann )

        = ( det A    0     .     0
             0     det A   .     0
             .       .     .     .
             0       0     .  det A )   = (det A)In

Similarly,

(adj A)A = (det A)In

Theorem. If det A ≠ 0, then the inverse matrix A−1 exists and is given by

A−1 = (1/|A|)(adj A)

Proof. It follows from the definition and the previous theorem.
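A sketch checking this formula numerically (adjugate below is our own two-line helper; newer versions of Mathematica® also provide a built-in Adjugate):

(* A^-1 = adj(A)/det(A), checked on the matrix of Example 1.9. *)
adjugate[m_] := Transpose@Table[(-1)^(i + j) Det[Drop[m, {i}, {j}]],
    {i, Length[m]}, {j, Length[m]}];
A = {{5, 8, 1}, {0, 2, 1}, {4, 3, -1}};
adjugate[A]/Det[A] == Inverse[A]    (* True *)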

2.5 Determinant of the product of two matrices


Suppose that A and B are two square matrices of order n and C = AB. Then |C| =
|A||B|, i. e., the determinant of a product of two matrices is equal to the product of
determinants. This property can be established in several ways. We show it here more
directly and discuss further implications.
Since the proof (for the general case) is notationally complicated (but otherwise
straightforward), we sketch the procedure for the case n = 2. We let

P = ( a11  a12   0    0
      a21  a22   0    0
      −1    0   b11  b12
       0   −1   b21  b22 )

and show that |P| = |A||B| = |AB|. We expand P by row 1 to get

|P| = a11 | a22   0    0  |  − a12 | a21   0    0  |
          |  0   b11  b12 |        | −1   b11  b12 |
          | −1   b21  b22 |        |  0   b21  b22 |

Expand again each 3 × 3 determinant by row 1 to get

|P| = (a11 a22 − a12 a21 ) | b11  b12 |
                           | b21  b22 |
    = |A||B|

To show |P| = |AB|, we use elementary row operation of type 3 to transform P to P.̂
Since ERO of type 3 does not change the value of a determinant, |P|̂ = |P|. To get P,̂ we
multiply row 3 by a11 and add to row 1, row 4 by a12 and add to row 1, row 3 by a21 and
add to row 2, row 4 by a22 and add to row 2. This gives

P̂ = (  0    0   c11  c12
        0    0   c21  c22
       −1    0   b11  b12
        0   −1   b21  b22 )

where

cij = ∑_{k=1}^{2} aik bkj ;    i, j = 1, 2

or C = AB. Expanding P̂ by column 1, we get

|P̂| = (−1) |  0   c11  c12 |
           |  0   c21  c22 |
           | −1   b21  b22 |

Expanding again by column 1 gives the result.


It follows from the above result that if A is nonsingular,

det(A^−1 ) = 1/(det A)    (2.8)

This relation is useful in many applications.

2.6 Rank of a matrix defined in terms of determinants

Recall our earlier definition of the rank of a matrix as the number of nonzero rows in
the row echelon from of A. Now, let Ei be the elementary matrix obtained by perform-
ing ERO of type i on the identity matrix I. Then we have seen that

|E1 | = −1
|E2 | = k(k ≠ 0)
|E3 | = 1

Now, we let A be any m × n matrix and Ae be the row echelon form of A. Then

Ae = PA (2.9)

where P is the product of elementary matrices Ei . Thus, |P| ≠ 0. Suppose that rank of
A = r. Then, without loss of generality, we may assume that Ae is of the form

1 â 12 â 13 . . â 1r . â 1n
[
[ 0 1 â 23 . . â 2r . â 2n ]
]
[ ]
[ 0 0 . . . â 3r . â 3n ]
[ ]
Ae = [
[ . . . . . . . . ]
]
[ 0 0 0 . . 1 . â rn ]
[ ]
[ ]
[ 0 0 0 . . 0 . 0 ]
[ 0 0 0 . . 0 . 0 ]

Here, Ae has m − r zero rows and the first nonzero element in row i appears in the i-th
column (this can always be arranged by renumbering the columns of A). Thus, Ae has
a r × r minor whose determinant is not zero. From equation (2.9), it follows that A also
has a r × r minor with a nonzero determinant. Thus, if A has rank r, there is at least
one r × r minor of A whose determinant is not zero. Conversely, if A has rank r then all
k × k minors (k > r) of A have a zero determinant.

2.7 Solution of Au = 0 and Au = b by Cramer’s rule


If A is an n × n nonsingular matrix, the system of equations

Au = b (2.10)

or the homogeneous system

Au = 0 (2.11)

has a unique solution. This solution may be expressed in terms of determinants using
Cramer’s rule. Let D = |A| ≠ 0. Then we have

uj D = | a11  a12  .  .  a1j uj  .  .  a1n |
       | a21  a22  .  .  a2j uj  .  .  a2n |
       |  .    .   .  .    .     .  .   .  |    (2.12)
       |  .    .   .  .    .     .  .   .  |
       | an1  an2  .  .  anj uj  .  .  ann |

Next, for each k ≠ j, add uk times column k to column j of the matrix in equation (2.12).
This does not change the value of the determinant. Thus,

uj D = | a11  a12  .  .  a11 u1 + a12 u2 + ⋅ ⋅ ⋅ + a1n un  .  .  a1n |
       | a21  a22  .  .  a21 u1 + a22 u2 + ⋅ ⋅ ⋅ + a2n un  .  .  a2n |
       |  .    .   .  .                .                   .  .   .  |
       | an1  an2  .  .  an1 u1 + an2 u2 + ⋅ ⋅ ⋅ + ann un  .  .  ann |

But

n
∑ aik uk = bi ; i = 1, 2, . . . , n
k=1

Thus,

uj D = | a11  a12  .  .  b1  .  .  a1n |
       | a21  a22  .  .  b2  .  .  a2n |
       |  .    .   .  .  .   .  .   .  |
       | an1  an2  .  .  bn  .  .  ann |

We define Dj to be the determinant obtained by replacing the j-th column of A by the vector b. Thus, if D ≠ 0, we have

uj = Dj /D,    j = 1, 2, . . . , n    (2.13)

This is the explicit solution of the linear system given by equation (2.10). The result
given by equation (2.13) is referred to as Cramer’s rule.

Example 2.5. For a 2 × 2 system,

a11 u1 + a12 u2 = b1
a21 u1 + a22 u2 = b2

we have

u1 = | b1  a12 | / | a11  a12 | ,     u2 = | a11  b1 | / | a11  a12 |    (2.14)
     | b2  a22 |   | a21  a22 |            | a21  b2 |   | a21  a22 |

For the 3 × 3 system,

a11 u1 + a12 u2 + a13 u3 = b1
a21 u1 + a22 u2 + a23 u3 = b2
a31 u1 + a32 u2 + a33 u3 = b3

we have

u1 = | b1  a12  a13 | / D,    u2 = | a11  b1  a13 | / D,    u3 = | a11  a12  b1 | / D
     | b2  a22  a23 |              | a21  b2  a23 |              | a21  a22  b2 |
     | b3  a32  a33 |              | a31  b3  a33 |              | a31  a32  b3 |

where D = det A is the determinant of the 3 × 3 coefficient matrix.

It follows from Cramer’s rule that for the special case of the homogeneous system
(b = 0), we obtain u = 0. Thus, as already seen before, when det A ≠ 0, the only
solution to the homogeneous system is the trivial one.
It should also be noted that when det A ≠ 0, the solution given by Cramer’s rule
is unique. To show this, suppose that there are two solutions and call them u and y.
We have

Au = b
Ay = b

Subtracting, we get

Az = 0 (2.15)

where

z=u−y

Since D ≠ 0, the only solution to the homogeneous system (2.15) is the trivial one.
Thus, z = 0 and u = y.
Cramer’s rule is not used in practice for higher order systems (e. g., n > 10) as it
requires more computational time than the Gaussian elimination procedure.
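Still, Cramer's rule is convenient for small systems; a sketch (the helper cramer is ours; since det M = det M^T, we may replace row j of A^T instead of column j of A):

(* Cramer's rule: u_j = D_j / D, with column j of A replaced by b. *)
cramer[a_, b_] := With[{d = Det[a]},
  Table[Det[ReplacePart[Transpose[a], j -> b]]/d, {j, Length[b]}]];
A = {{1, 2, -1}, {2, 1, 2}, {6, 2, 2}}; b = {0, 8, 14};
cramer[A, b] == LinearSolve[A, b]   (* True *)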

2.8 Differentiation of a determinant


In our later applications, we will be dealing with square matrices whose elements de-
pend continuously on a parameter t. We consider now determinants of such matrices
and the derivatives of the determinant. First, we illustrate the problem with a 2 × 2
matrix. Let

A(t) = ( a11 (t)  a12 (t)
         a21 (t)  a22 (t) )    (2.16)

Then

D(t) ≡ det A = a11 (t)a22 (t) − a12 (t)a21 (t) (2.17)



If we assume that aij (t) is differentiable for all i and j, then

dD/dt = (da11 /dt) a22 + a11 (da22 /dt) − (da12 /dt) a21 − a12 (da21 /dt)

      = | da11 /dt  da12 /dt |  +  | a11       a12      |
        | a21       a22      |     | da21 /dt  da22 /dt |

      = D1 + D2

Thus, the derivative of a determinant is the sum of two determinants in which a single
row is differentiated. This result is easily generalized to the n-th order determinant:

D = ∑(−1)^h a1k1 (t)a2k2 (t) . . . ankn (t)

dD/dt = ∑_{j=1}^{n} ( ∑(−1)^h a1k1 (t)a2k2 (t) . . . (dajkj (t)/dt) . . . ankn (t) )

      = ∑_{j=1}^{n} Dj = D1 + D2 + ⋅ ⋅ ⋅ + Dn

where
Dj = | a11 (t)       a12 (t)       .  .  a1n (t)      |
     |   .             .           .  .    .          |
     | daj1 (t)/dt   daj2 (t)/dt   .  .  dajn (t)/dt  |
     |   .             .           .  .    .          |
     | an1 (t)       an2 (t)       .  .  ann (t)      |

is the determinant obtained by differentiating only the j-th row.
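A one-line check of this formula on a 2 × 2 example (our own test functions):

(* Verify dD/dt = D1 + D2 for a 2 x 2 matrix of functions of t. *)
a = {{Sin[t], t^2}, {Exp[t], Cos[t]}};
Simplify[D[Det[a], t] ==
  Det[{D[a[[1]], t], a[[2]]}] + Det[{a[[1]], D[a[[2]], t]}]]   (* True *)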

2.9 Applications of determinants


In this section, we consider some elementary applications of determinants. As out-
lined above, the most important use of a determinant is in determining the conditions
under which a system of homogeneous equations has a nontrivial solution. This is
illustrated in the following examples as well as in later chapters.

Example 2.6 (Linear dependence/independence of functions). A set of functions


{wi (x); i = 1, 2, . . . , n} is called linearly independent if the only solution to the homoge-
neous equation

c1 w1 (x) + ⋅ ⋅ ⋅ + cn wn (x) = 0 (2.18)



is c1 = c2 = ⋅ ⋅ ⋅ = cn = 0. Suppose that the set {wi (x); i = 1, 2, . . . , n} is linearly indepen-


dent. Differentiating equation (2.18) w. r. t. x once, twice, and (n − 1) times gives

c1 w1′ (x) + ⋅ ⋅ ⋅ + cn wn′ (x) = 0 (2.19)


c1 w1′′ (x)
+ ⋅⋅⋅ + cn wn′′ (x)
=0 (2.20)
...
...
...
c1 w1[n−1] (x) + ⋅ ⋅ ⋅ + cn wn[n−1] (x) = 0 (2.21)

Now, equations (2.18) to (2.21) are a set of n homogeneous linear equations for the
coefficients {c1 , c2 , . . . , cn } whose only solution is the trivial one. Thus, the determinant
of the coefficient matrix (also called the Wronskian determinant), defined by
W(x) = | w1 (x)          w2 (x)          .  .  wn (x)          |
       | w1′ (x)         w2′ (x)         .  .  wn′ (x)         |
       |   .               .             .  .    .             |   ≠ 0
       | w1^[n−1] (x)    w2^[n−1] (x)    .  .  wn^[n−1] (x)    |

Conversely, if the Wronskian determinant is zero, then there exists a nontrivial solution {c1 , c2 , . . . , cn } and the set {wi (x); i = 1, 2, . . . , n} is said to be linearly dependent.
As an illustration, we note that the set {1, x, x^2 } is linearly independent since

W(x) = | 1  x  x^2 |
       | 0  1  2x  |   = 2
       | 0  0  2   |

while the set {2x, 1 + x^2 , (1 − x)^2 } is linearly dependent since

W(x) = | 2x  1 + x^2   (1 − x)^2  |
       | 2   2x        −2(1 − x)  |   = 0.
       | 0   2          2         |
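Mathematica® has a built-in Wronskian that reproduces both results:

Wronskian[{1, x, x^2}, x]                          (* 2 *)
Simplify[Wronskian[{2 x, 1 + x^2, (1 - x)^2}, x]]  (* 0 *)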

Example 2.7 (Bifurcation of solutions to nonlinear equations). Another important


application of the theory of determinants is in the solution of nonlinear equations. In
contrast to the linear system Au = b, which can have either zero (inconsistent), one
(when rank A = rank[Ab] = n) or an infinite number (when rank A = rank[Ab] = r < n)
of solutions, the nonlinear parametrized system of equations

fi (u1 , u2 , . . . , un , α) = 0; i = 1, 2, . . . , n, (2.22)

where α is a vector of m parameters (m ≥ 1) can have any number of solutions


(0, 1, 2, . . . , ∞). However, if the functions fi are continuous and have continuous deriva-

tives, the implicit function theorem of multivariable calculus states that if the deter-
minant of the (linearized) Jacobian matrix of equations (2.22) does not vanish, then
the solution is a continuous function of the parameter α. Equivalently, the number of
solutions to equations (2.22) can change only when the determinant of the Jacobian
matrix

J = [𝜕fi /𝜕uj (u, α)];    i = 1, 2, . . . , n;  j = 1, 2, . . . , n

vanishes, i. e.,

det J = 0 (2.23)

The elimination of the state variables u from equations (2.22) and (2.23) gives a locus
in the α parameter space. This locus is called the bifurcation set, as new solutions can
emerge (or bifurcate) only when α values cross this set. [Remark: It can be shown that if
the zero eigenvalue of J is simple, the number of solutions of equations (2.22) changes
if and only if the parameters a cross the bifurcation set. However, when the eigen-
value is not simple, the vanishing of the Jacobian determinant is necessary but not a
sufficient condition for bifurcation.] Thus, the solution of the original set of equations
(2.22) along with the vanishing of the Jacobian determinant can be used to determine
the bifurcation set for nonlinear problems.
As an example, we consider the steady-state equation describing the temperature
(u) in an adiabatic CSTR with parameters B > 0 and Da > 0:

f (u, B, Da) = u − B Da e^u /(1 + Da e^u ) = 0    (2.24)

Differentiating equation (2.24) w. r. t. u and setting the derivative to zero gives

df /du = 1 − B Da e^u /(1 + Da e^u )^2 = 0.    (2.25)

Writing

t = Da eu , (2.26)

the two equations may be solved for B and Da and the bifurcation set may be expressed
in a parametric form as

(1 + t)2
B= (2.27)
t
Da = t exp{−1 − t}, t>0 (2.28)

This is plotted in Figure 2.1. The bifurcation set consists of two branches, the upper
ignition branch and the lower extinction branch. It divides the (B, Da) space into two

Figure 2.1: A schematic diagram of the bifurcation set of equation (2.24).

regions, corresponding to unique and three solutions of equation (2.24). [Remark: The
cusp point in Figure 2.1 where the ignition and extinction branches meet is at B = 4
and Da = e−2 . We note that for B < 4, there is only a unique solution for any value of
the dimensionless residence time, Da]. For higher-dimensional problems, symbolic
manipulation or computer programming can be used to determine the bifurcation set.
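The parametric locus (2.27)–(2.28) and the cusp are easily examined in Mathematica® (a sketch):

(* Bifurcation set of eq. (2.24) in parametric form, eqs. (2.27)-(2.28). *)
Bc[t_] := (1 + t)^2/t;  Dac[t_] := t Exp[-1 - t];
Solve[Bc'[t] == 0 && t > 0, t]   (* t -> 1, i.e. the cusp B = 4, Da = E^-2 *)
ParametricPlot[{Dac[t], Bc[t]}, {t, 0.01, 10},
  AxesLabel -> {"Da", "B"}]   (* reproduces the two branches of Figure 2.1 *)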

Example 2.8 (Lorenz equations). Consider the following discrete model of convection
(where the fluid density is assumed to vary only with temperature):

(1/Pr) dx/dt = y − x = f1    (2.29)
dy/dt = −xz + Rx − y = f2    (2.30)
dz/dt = xy − (8/3)z = f3    (2.31)
where R is the (scaled) Rayleigh number and Pr is the Prandtl number (taken to be
unity here).
Writing the vector of variables as ψ = (x, y, z)^T , equations (2.29)–(2.31) can be written as

dψ/dt = f(ψ) = (f1 , f2 , f3 )^T .    (2.32)
It can easily be verified that equations (2.29)–(2.31) or (2.32) has a trivial steady-state
solution, i. e., ψs = (x, y, z)T = 0 is a steady-state solution for any R and Pr. [Remark:
The steady-state solution does not depend on Pr.] The Jacobian of the function f can
be calculated easily and is given by

J = {𝜕fi /𝜕ψj } = \begin{pmatrix} -1 & 1 & 0 \\ R - z & -1 & -x \\ y & x & -8/3 \end{pmatrix}    (2.33)

Evaluation of the Jacobian matrix at the trivial solution ψs = 0 gives



⇒ Js = J|ψs =0 = \begin{pmatrix} -1 & 1 & 0 \\ R & -1 & 0 \\ 0 & 0 & -8/3 \end{pmatrix}.    (2.34)

Thus, the Jacobian matrix at the trivial steady-state is of rank 3 except when R = 1
(where it has rank 2, i. e., det Js = 0 when R = 1).
Note that the linearized system of equations at R = 1,

Js ψ = 0

has a nontrivial solution ψ = (α, α, 0)T for any arbitrary α. Thus, the linearized matrix Js has
a simple zero eigenvalue and R = 1 is a bifurcation point, i. e., new solutions appear
or disappear when the R-value crosses unity.
In this specific case, we can determine all solutions since at steady-state (equa-
tions (2.29) and (2.31)) give

x = y    (2.35)
z = (3/8) xy = (3/8) x²    (2.36)

and equation (2.30) gives

−(3/8) x³ − x + Rx = 0

or

x (R − 1 − (3/8) x²) = 0

⇒ x = 0 or x = ±√((8/3)(R − 1))    (2.37)

In other words, for R > 1, the system has three steady-state solutions (including a
trivial steady state).
The solution diagram (referred to as pitchfork bifurcation in the literature) is
shown in Figure 2.2. In this figure, x = 0 is the trivial conduction solution while x ≠ 0
correspond to convective solutions. As stated in the Introduction to this chapter, the
condition for the existence of a nontrivial solution to a system of linearized equa-
tions (expressed in terms of a determinant) may be used to determine the possible
bifurcation points of many nonlinear systems.
For further discussion on the theory of determinants, we refer to the books by
Amundson [3] and Lipschutz and Lipson [22].

Figure 2.2: Steady-state solution diagram of Lorenz equations illustrating the pitchfork bifurcation.

Problems
1. (Simplification of determinants):
(a) Show that
D_n = \begin{vmatrix} 1+c_1 & 1 & 1 & \cdots & 1 \\ 1 & 1+c_2 & 1 & \cdots & 1 \\ 1 & 1 & 1+c_3 & \cdots & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & 1+c_n \end{vmatrix} = c_1 c_2 \cdots c_n \left(1 + \frac{1}{c_1} + \frac{1}{c_2} + \cdots + \frac{1}{c_n}\right)

Hint: First, show that Dn satisfies the recursion formula Dn = cn Dn−1 + c1 c2 . . . cn−1 .
(b) Show without expanding that
D_n = \begin{vmatrix} x & 1 & 1 & \cdots & 1 \\ 1 & x & 1 & \cdots & 1 \\ 1 & 1 & x & \cdots & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & x \end{vmatrix} = (x - 1)^{n-1} (x + n - 1)

for this n-th order determinant.


(c) Show without expanding the determinant that the Vandermonde determi-
nant
\begin{vmatrix} x_1^{n-1} & x_1^{n-2} & \cdots & x_1 & 1 \\ x_2^{n-1} & x_2^{n-2} & \cdots & x_2 & 1 \\ \vdots & \vdots & & \vdots & \vdots \\ x_n^{n-1} & x_n^{n-2} & \cdots & x_n & 1 \end{vmatrix}

has the value



Dn = (x1 − x2 )(x1 − x3 ) . . . (x1 − xn ) ⋅ (x2 − x3 )(x2 − x4 ) . . . (x2 − xn )
     ⋅ (x3 − x4 ) . . . (x3 − xn ) . . . (xn−1 − xn )
   = ∏_{i=1}^{n−1} ∏_{j>i} (xi − xj )

2. (Linear homogeneous equations and determinants): Let A be a square matrix of or-


der n, x be the vector of n unknowns and consider the homogeneous equations
Ax = 0. Use the properties of determinant to prove that a necessary and sufficient
condition for Ax = 0 to possess a nontrivial solution is det (A) = 0. (Hint: Verify
the result for n = 1 and use induction along with the properties of the determi-
nant).
3. (Determinant and neutral stability curve): Determine the relationship between the
parameters k, Le, Rat and Rac for which the following set of equations has non-
trivial solution:

c1 − (π² + k²) c2 = 0
Le c1 − (π² + k²) c3 = 0
−(π² + k²) c1 + Rat k² c2 + Rac k² c3 = 0.

[Remark: This relation is called the neutral stability curve and defines the onset of
convection in a fluid layer heated from below. Here, k is the wave number, Le is the
Lewis number and Rat , Rac are the thermal and concentration Rayleigh numbers,
respectively.]
4. (Equation for a plane and a circle in terms of determinants)
(a) Show that the equation of a plane passing through the points (xi , yi , zi ), i = 1, 2, 3
may be expressed as

\begin{vmatrix} x & y & z & 1 \\ x_1 & y_1 & z_1 & 1 \\ x_2 & y_2 & z_2 & 1 \\ x_3 & y_3 & z_3 & 1 \end{vmatrix} = 0

(b) (Equation for a circle in terms of determinants) Show that the equation of a
circle passing through the points (xi , yi ), i = 1, 2, 3 may be expressed as

\begin{vmatrix} x^2 + y^2 & x & y & 1 \\ x_1^2 + y_1^2 & x_1 & y_1 & 1 \\ x_2^2 + y_2^2 & x_2 & y_2 & 1 \\ x_3^2 + y_3^2 & x_3 & y_3 & 1 \end{vmatrix} = 0

5. (Common root condition for polynomial equations) Show that the necessary and
sufficient conditions for the equations

x 3 + ax2 + bx + c = 0
x 2 + αx + β = 0

to have a common root may be expressed as

\begin{vmatrix} 1 & a & b & c & 0 \\ 0 & 1 & a & b & c \\ 1 & \alpha & \beta & 0 & 0 \\ 0 & 1 & \alpha & \beta & 0 \\ 0 & 0 & 1 & \alpha & \beta \end{vmatrix} = 0
3 Vectors and vector expansions
In this chapter, we review some elementary concepts about vectors and vector expan-
sions. A more general discussion will be given in Part II when we deal with abstract
vector space concepts.
For the purpose of this chapter, we define a vector to be an n-tuple of real or com-
plex numbers arranged in a single row or column:

u = (u1 , u2 , . . . , un )^T   (column vector)
u^T = ( u1  u2  u3  . . .  un )   (row vector)

For simplicity, we shall deal with only column vectors in the discussion below. How-
ever, all the concepts and properties of column vectors are also applicable to row vec-
tors. Also, when the elements of the column vector u are complex numbers, we define
the corresponding row vector by

u∗ = ( ū1  ū2  ū3  . . .  ūn )   (row vector)

where ūi is the complex conjugate of ui and u∗ denotes the complex conjugate transpose


of the vector u.
Let V be the collection of all such vectors with two operations specifying the vec-
tor addition and multiplication of a vector by a scalar be defined. It is assumed that
these two operations are defined such that the usual rules (associative, commutative,
distributive, etc.) are satisfied. This set V is denoted by ℝn (the space of n-tuples of real
numbers) or ℂn (the space of n-tuples of complex numbers). The sum of two vectors u
and v is defined by

u + v = (u1 + v1 , u2 + v2 , . . . , un + vn )^T ,

and the product (scalar multiplication) of a vector u by a real (or complex) number α
by

αu = (αu1 , αu2 , . . . , αun )^T


The set V with the above operations is called a vector space. We now deal with the
algebraic and geometric properties of this set.

3.1 Linear dependence, basis and dimension


Suppose that V is the collection of vectors, all having n elements. We define the zero
vector in V as the n-tuple whose elements are all zero. It will be denoted by the symbol
0n or simply by 0, when the number of elements is clear. By definition, a nonzero
vector contains at least one element which is not zero. Now, suppose that S is a subset
of vectors {u1 , u2 , . . . , ur } in V. This subset is called linearly independent if the relation

c1 u1 + c2 u2 + ⋅ ⋅ ⋅ + cr ur = 0 (3.1)

implies

c1 = c2 = ⋅ ⋅ ⋅ = cr = 0

Otherwise, the set is called linearly dependent. We note that equation (3.1) defines a
system of n homogeneous equations in r unknowns.

Example 3.1. Consider the set u1 = (1, 2)^T , u2 = (3, 5)^T in ℝ2 . It is linearly independent since


c1 u1 + c2 u2 = 0 implies c1 + 3c2 = 0, 2c1 + 5c2 = 0, whose only solution is c1 = c2 = 0.
Example 3.2. Consider the set u1 = (2, 6, −2)^T , u2 = (3, 1, 2)^T , u3 = (8, 16, −3)^T in ℝ3 . To check if this
set is linearly independent, we form the homogeneous equations
set is linearly independent, we form the homogeneous equations

c1 u1 + c2 u2 + c3 u3 = 0,

which is equivalent to the system

2c1 + 3c2 + 8c3 = 0


6c1 + c2 + 16c3 = 0
−2c1 + 2c2 − 3c3 = 0

Using the elementary row operations, we reduce this system to the following echelon
form:
\begin{pmatrix} 1 & 3/2 & 4 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix} c = 0.

Thus, we have c2 = −c3 , c1 = −(5/2) c3 , and can get a nontrivial solution (e. g., by taking
c3 = 2, c2 = −2 and c1 = −5), and hence, the vectors are linearly dependent.

The following facts may be established from the above definitions of linear depen-
dence:
1. The zero vector is linearly dependent (since 1 · 0 = 0).
2. Any single nonzero vector is linearly independent.
3. If a set of vectors is linearly dependent, then any larger set containing this set is
also linearly dependent.
4. Any subset of a linearly independent set is also linearly independent.
5. Any set of vectors containing the zero vector is linearly dependent.
6. If r > n, the set {u1 , u2 , . . . , ur } is linearly dependent, i. e., there can be at most
n linearly independent vectors in a set where all the vectors have n elements. As
already noted, equation (3.1) defines a set of n linear homogeneous equations in
r unknowns. When r > n, there are more unknowns than equations and we can
always find a nontrivial solution.

The collection of all vectors, which are linear combinations of elements of the set S =
{u1 , u2 , . . . , ur } is called the subspace spanned by S. A set S = {u1 , u2 , . . . , ur } is called
a basis for a vector space V if it is linearly independent and spans V. The number
of elements in a basis is called the dimension of the vector space V. The following
theorem may be established easily from the above properties of n-tuples.

Theorem. The vector space V of all n-tuples of real numbers ℝn (or complex num-
bers, ℂn ) has dimension n.

Example 3.3. Consider the set

e1 = (1, 0)^T ,  e2 = (0, 1)^T

in ℝ2 . This is called the standard basis. The set in Example 3.1 is another basis for ℝ2 .

3.2 Dot or scalar product of vectors


The vector space of n-tuples as defined above (i. e., with vector addition and multi-
plication of a vector by a scalar) has only algebraic structure. The concept of scalar
or dot product of vectors allows us to introduce geometrical properties and extend the
familiar geometric concepts such as distances, lengths, angles and orthogonality from
two or three dimensions (ℝ2 or ℝ3 ) to other finite- (and also to infinite-) dimensional
vector spaces.
Let V be a vector space consisting of n-tuples of real or complex numbers. Sup-
pose that to each pair of vectors u, v ∈ V we assign a scalar denoted by u.v, or more
generally (anticipating our later notation) ⟨u, v⟩, which is a real or complex number.

This function is called scalar or dot product (or more generally, an inner product) if it
satisfies the following three rules:
(i) ⟨αu + βv, w⟩ = α⟨u, w⟩ + β⟨v, w⟩; for u, v, w ∈ V and α, β are scalars
(ii) ⟨u, v⟩ = ⟨v, u⟩
(iii) ⟨u, u⟩ ≥ 0 and ⟨u, u⟩ = 0 iff u = 0

It is important to note that the scalar product maps pairs of vectors in V to the set of
real or complex numbers. The first property requires linearity in the first variable. The
second property is called Hermitian symmetry. For the case in which u and v contain
real elements, this simply requires the scalar product to be symmetric. The third prop-
erty, known as positive definiteness, requires that the scalar product of a vector with
itself be positive for all vectors in V except the zero vector. A vector space in which
a scalar product is defined has a geometric structure (we can change this geometric
structure by properly choosing the scalar product for a particular application; this
will be demonstrated in Part II). We define the length of a vector by

‖u‖ = √⟨u, u⟩ (3.2)

and the distance between two vectors u and v by

d(u, v) = ‖u − v‖ = √⟨u − v, u − v⟩ (3.3)

Using Schwarz’s inequality,

|⟨u, v⟩|² ≤ ⟨u, u⟩⟨v, v⟩    (3.4)

we can also define the angle between two vectors. [A proof of Schwarz’s inequality is
given in Part II]. When V is the set of n-tuples of real numbers, we define the angle
between two vectors u, v ∈ V as

cos θ = ⟨u, v⟩/(‖u‖ ‖v‖)    (3.5)

When V is the set of n-tuples of complex numbers, we define the angle between two
vectors u, v ∈ V as

cos θ = |⟨u, v⟩|/(‖u‖ ‖v‖)    (3.6)

[Remark: It can be shown that the angle defined by equation (3.5) satisfies 0 ≤ θ ≤ π
while that defined by equation (3.6) satisfies 0 ≤ θ ≤ π/2.] The vectors u, v ∈ V are
said to be orthogonal if ⟨u, v⟩ = 0. A vector u is said to be normalized (or is a unit
vector) if ‖u‖ = 1. If the set of vectors {u1 , u2 , . . . , un } is linearly independent and forms
a basis for V, then this basis is called an orthonormal basis if each vector in the set is

orthogonal to other vectors and is normalized to have unit length. In terms of scalar
product, an orthonormal basis satisfies the condition

⟨ui , uj ⟩ = δij (3.7)

where δij is the Kronecker delta function (δij = 1 for i = j and zero otherwise).

Example 3.4. Let V = ℝn and for u, v ∈ V define


⟨u, v⟩ = ∑_{i=1}^{n} ui vi

This is the usual inner (dot) product and it may be verified that it satisfies all the three
axioms. The length of a vector with respect to this inner product is given by

‖u‖ = √(u1² + u2² + ⋅ ⋅ ⋅ + un²)

and the distance between the vectors u and v is given by

d(u, v) = ‖u − v‖ = √((u1 − v1 )² + (u2 − v2 )² + ⋅ ⋅ ⋅ + (un − vn )²)

The set consisting of the unit vectors e1 = (1, 0, . . . , 0), e2 = (0, 1, . . . , 0), . . . , en =
(0, 0, . . . , 1) is one possible orthonormal basis for this space. This vector space is of-
ten referred to as the n-dimensional Euclidean space.

Example 3.5. Let V = ℂn and for u, v ∈ V, define


⟨u, v⟩ = ∑_{i=1}^{n} ui v̄i ,

where the bar denotes complex conjugate. Again, it may be verified that all the three
axioms are satisfied. The length of a vector with respect to this inner product is given
by

‖u‖ = √(u1 ū1 + u2 ū2 + ⋅ ⋅ ⋅ + un ūn )
    = √(|u1 |² + |u2 |² + ⋅ ⋅ ⋅ + |un |²),

and the distance between the vectors u and v is given by

d(u, v) = ‖u − v‖ = √(|u1 − v1 |² + |u2 − v2 |² + ⋅ ⋅ ⋅ + |un − vn |²)

This is the space of n-tuples of complex numbers and has a geometric structure similar
to that of the n-dimensional Euclidean space. It is an example of a finite-dimensional
Hilbert space.

It may be shown that every finite-dimensional inner product space has an or-
thonormal basis. If {u1 , u2 , . . . , un } is a basis for V but is not orthogonal, the following
Gram–Schmidt procedure may be used to transform it to an orthogonal basis. Define

v1 = u1

and for k > 1,

vk = uk − ∑_{i=1}^{k−1} (⟨uk , vi ⟩/‖vi ‖²) vi

Then it is easily verified that vk is orthogonal to {v1 , v2 , . . . , vk−1 } and that vk ≠ 0, since
vk = 0 would mean that the vectors {u1 , u2 , u3 , . . . , uk } are linearly dependent. Thus,
{v1 , v2 , . . . , vn } is an orthogonal basis and by dividing each vector by its length we get
an orthonormal basis.

3.3 Linear algebraic equations


We return again to the set of m linear equations in n unknowns:

Au = 0 (3.8)

Suppose that the rank of A is r(r ≤ m, r ≤ n). Then there is at least one r × r minor of
A whose determinant is not zero. Without loss of generality, we can assume that the
nonzero r × r minor is at the upper left corner. We can solve for the first r variables in
terms of the remaining (n − r) variables to obtain

u1 = γ12 u2 + γ13 u3 + ⋅ ⋅ ⋅ + γ1,r+1 ur+1 + γ1,r+2 ur+2 + ⋅ ⋅ ⋅ + γ1,n un


u2 = γ23 u3 + ⋅ ⋅ ⋅ + γ2,r+1 ur+1 + γ2,r+2 ur+2 + ⋅ ⋅ ⋅ + γ2,n un
. (3.9)
.
ur = γr,r+1 ur+1 + γr,r+2 ur+2 + ⋅ ⋅ ⋅ + γr,n un

Suppose that we choose values for the variables {ur+1 , ur+2 , . . . , un } and calculate
{u1 , u2 , . . . , ur } from equation (3.9). Suppose that we make (n − r + 1) choices for
{ur+1 , ur+2 , . . . , un } and arrange the solution in rows. Then, in this solution matrix,
the first r columns are obtained as linear combination of the last (n − r) columns.
Hence, the rank of this matrix is at most (n − r). If the choices of {ur+1 , ur+2 , . . . , un }
are such that the rank of the solution matrix is equal to (n − r), then the last row is
a linear combination of the first (n − r) rows. Thus, there can be at most (n − r) lin-
early independent solutions. The (n − r) linearly independent solutions are called a

fundamental set of solutions. The following theorem may be stated for the solutions of
equation (3.8).

Theorem. Every solution of the homogeneous system (3.8) is of the form

uh = c1 u1 + c2 u2 + ⋅ ⋅ ⋅ + cn−r un−r (3.10)

where r is the rank of A and {u1 , u2 , . . . , un−r } is a set of fundamental (linearly indepen-
dent) solutions and ci are arbitrary constants.

We have already seen that the inhomogeneous system Au = b is consistent (has


solutions) iff rank(A) = rank(aug A). When this condition is satisfied, the following
theorem may be stated for the solutions of the inhomogeneous system.

Theorem. Suppose that rank(A) = rank(aug A) = r and 1 ≤ r ≤ n. Then the general


solution to the inhomogeneous system Au = b may be written in the form

u = uh + up (3.11)

where uh is the general solution of the homogeneous system given by (3.10) and up is any
(particular) solution of the inhomogeneous equations.
[Remark: The theorem is also valid for the case r = 0 but this is omitted as it corre-
sponds to zero equations in n unknowns.]

Example 3.6. Consider the homogeneous system in four variables:

u1 − 2u2 − u4 = 0; −2u1 + 3u2 + 3u3 = 0


−u2 + 3u3 − 2u4 = 0; 3u1 − 7u2 + 3u3 − 5u4 = 0

for which rank A = 2. We have already seen that [see Example 1.7] every solution to
the homogeneous system is of the form

uh = c1 (3, 0, 2, 3)^T + c2 (0, 1, −1, −2)^T

where c1 and c2 are constants. The inhomogeneous system

u1 − 2u2 − u4 = b1
−2u1 + 3u2 + 3u3 = b2
−u2 + 3u3 − 2u4 = b3
3u1 − 7u2 + 3u3 − 5u4 = b4

is consistent only if b3 = 2b1 + b2 and b4 = 5b1 + b2 , or equivalently,



b = b1 (1, 0, 2, 5)^T + b2 (0, 1, 1, 1)^T

Taking bT = (0, 1, 1, 1), the general solution of the inhomogeneous system may be writ-
ten as

u = c1 (3, 0, 2, 3)^T + c2 (0, 1, −1, −2)^T + (−2, −1, 0, 0)^T

3.4 Applications of vectors and vector expansions


The vector space concepts discussed above find applications in several topics of in-
terest to chemical engineers. We discuss here two of them: namely, stoichiometry and
dimensional analysis.

3.4.1 Stoichiometry

A single chemical reaction occurring in a homogeneous system among S species de-


noted by A1 , A2 , . . . , AS may be written as

∑_{j=1}^{S} νj Aj = 0    (3.12)

where νj is the stoichiometric coefficient of species Aj in the reaction. By convention,


νj is positive if Aj is a product and negative if Aj is a reactant. For example, consider
the methanol synthesis reaction

CO + 2H2 = CH3 OH, (3.13)

and denote A1 = CH3 OH, A2 = CO and A3 = H2 . With this notation, we can write
equation (3.13) in the form of equation (3.12) as

A1 − A2 − 2A3 = 0 (3.14)

with ν1 = 1, ν2 = −1 and ν3 = −2.



A set of R reactions occurring among S species may be written as

∑_{j=1}^{S} νij Aj = 0;  (i = 1, 2, . . . , R)    (3.15)

where νij is the stoichiometric coefficient of species Aj in the i-th reaction. For obvi-
ous reasons, the R × S matrix {νij } is called the stoichiometric coefficient matrix. For
example, consider the above methanol synthesis reaction with the side reactions

CO2 + H2 = H2 O + CO (3.16)
CO2 + 3H2 = H2 O + CH3 OH (3.17)

Denoting A4 = CO2 and A5 = H2 O, we can write these as

A2 + A5 − A3 − A4 = 0 (3.18)
A1 + A5 − 3A3 − A4 = 0 (3.19)

The three reactions may be written together as

νa = 0

where ν is the 3 × 5 stoichiometric coefficient matrix defined by

ν = \begin{pmatrix} 1 & -1 & -2 & 0 & 0 \\ 0 & 1 & -1 & -1 & 1 \\ 1 & 0 & -3 & -1 & 1 \end{pmatrix}

and a is the species vector defined by aT = (A1 A2 A3 A4 A5 ). The two reactions (3.13)
and (3.16) are independent of each other while the third reaction (3.17) is the sum of
the other two reactions. It is important to know just how many independent reactions
there are in a given system. This can be answered using the vector space concepts in
two different ways:
1. When we already know the system of reactions, we can determine the number of
independent reactions and pick one such set by looking at the stoichiometric co-
efficient matrix. In the above example, the rank of ν is two and only two reactions
are independent.
2. We can also determine the number of independent reactions between S species
(A1 , A2 , . . . , AS ) by determining the rank of the atomic matrix. Suppose that each
species Aj is made up of atoms αi and let the number of atoms αi in species Aj be
denoted by λij . A table may be made up listing the species Aj along the top row
and the building blocks of the species (i. e., the atoms) αi vertically at the left so
that the element at the intersection of the i-th row and j-th column is λij . The n × S
matrix {λij } is called the atomic matrix:

A1 A2 A3 . . AS
α1 λ11 λ12 λ13 . . λ1S
α2 λ21 λ22 λ23 . . λ2S
. . . . . . .
. . . . . . .
αn λn1 λn2 λn3 . . λnS

In this notation, each species Aj is represented as a vector in the n-dimensional


atom space. The elements of the j-th column of the atomic matrix represent the
various atoms in Aj and the vector representation of this species.

Suppose that the rank of the atomic matrix is r. Then the number of independent vec-
tors is r and the remaining (S − r) vectors (species) may be represented as a linear
combination of r basis vectors (species). These (S − r) relations are nothing but the
independent reactions between the species.

Example 3.7. Consider a reaction mixture consisting of CH3 OH, CO, H2 , CO2 and H2 O.
There are five species and the three distinct atoms. We form the atomic matrix and
see that it has rank 3. Thus, there are two independent reactions between these five
species.

3.4.2 Dimensional analysis

Dimensional analysis is useful to analyze and correlate the behavior of a physical sys-
tem when it is not possible to write down the governing equations explicitly or when
they are too complicated to solve. In such cases, the Buckingham method may be used
to determine the dimensionless groups that characterize the behavior of the system.
In this method, one lists all the variables that are significant in a given problem and
determines the number of independent dimensionless groups formed by these vari-
ables by using the Buckingham pi-theorem. This theorem states that the number of
dimensionless groups used to describe a system involving n variables is equal to n − r,
where r is the rank of the dimensional matrix of the variables. Thus,

i = n − r,

where i is the number of independent dimensionless groups, n is the number of variables
and r is the rank of the dimensional matrix of these variables. The dimensional ma-
trix is simply the matrix formed by tabulating the exponents of the fundamental di-
mensions (such as mass, M; length L; time, t; temperature, T; electric current, A and
so on).
To see how the pi-theorem arises, we assume that the n physical variables may be
expressed in terms of m fundamental dimensions (usually with integer exponents).

The exponents of the fundamental dimensions may be used to represent each variable
as a vector in the m-dimensional space. Suppose that the rank of this matrix is r(≤ m).
Then only r of these vectors are linearly independent, and hence, (n−r) of these vectors
may be expressed as a linear combination of the r independent vectors. These (n − r)
relations are the dimensionless groups formed by the variables.

Example 3.8. Consider the motion of a solid body through a fluid. The drag force ex-
erted by the fluid (FD ) depends on the velocity V0 of this solid body, the size of the solid
body (such as diameter, D), the fluid density (ρ) and the fluid viscosity (μ). Determine
the relevant dimensionless groups.

Variable       Symbol   Dimensions   Vector representation
Drag force     FD       M L t⁻²      (1, 1, −2)
Velocity       Vo       L t⁻¹        (0, 1, −1)
Density        ρ        M L⁻³        (1, −3, 0)
Viscosity      μ        M L⁻¹ t⁻¹    (1, −1, −1)
Size of body   D        L            (0, 1, 0)

We note that there are five vectors (variables) in a three-dimensional space and only
three of them can be linearly independent. We take the three linearly independent
vectors to be V0 , ρ and D. The two dimensionless groups may be formed by expanding
the remaining two vectors in terms of these three linearly independent ones. Equiv-
alently, we can form the product of each of the other variables with these three and
choose the exponents so that the resulting combination has no dimensions. (This is
the pi-method.) To form the first group, we write

π1 = (FD )^a (Vo )^b (ρ)^c (D)^d
   = (M L t⁻²)^a (L t⁻¹)^b (M L⁻³)^c (L)^d
   = M^(a+c) L^(a+b−3c+d) t^(−2a−b)

To make π1 dimensionless, each of the above exponents must be zero. Solving these
three homogeneous equations, we get a = −c, d = 2c and b = 2c. Thus,
π1 = (ρ D² Vo² / FD )^c

The value for c is arbitrary and we take it to be −1 so that π1 becomes the familiar Euler
number:

Eu = FD /(ρ D² Vo²)

To form the second group, we write

π2 = (Vo )^a (ρ)^b (μ)^c (D)^d
   = M^(b+c) L^(a−3b−c+d) t^(−a−c)

To make π2 dimensionless, we choose c = −b, a = b and d = b. Thus,

π2 = (D Vo ρ/μ)^b

We choose b = 1 and identify the dimensionless group as the Reynolds number:

Re = D Vo ρ/μ

Thus, we can relate the five variables in terms of two dimensionless groups Eu and
Re. A relationship of the form Eu = f (Re) may be determined experimentally. In the
literature, the Euler number is often replaced by the drag coefficient, which is defined
by

CD = FD /((1/2) ρ Vo² Ac )

where Ac is the projected area of the body in the direction of flow (For a sphere, Ac =
πD2 /4). Plots of experimentally determined drag coefficient curves (CD versus Re) for
various shapes (e. g., sphere) may be found in standard fluid mechanics textbooks.

3.5 Application of computer algebra and symbolic manipulation


3.5.1 Determination of independent reactions

It is often the case that many species are found in a chemical reactor and various re-
action pathways are conjectured. For example, during oxidative dehydrogenation of
methane in a catalytic reactor, the following gas-phase species are found:

sT = (CH4 , C2 H6 , H2 O, H2 , O2 , CO, CO2 , C2 H4 , C2 H2 ) (3.20)

and the following reactions may occur:

2CH4 + (1/2) O2 → C2 H6 + H2 O    (3.21)
2CH4 + O2 → C2 H4 + 2H2 O    (3.22)
CH4 + (1/2) O2 → CO + 2H2    (3.23)
CO + (1/2) O2 → CO2    (3.24)
2CH4 󴀕󴀬 C2 H6 + H2    (3.25)
C2 H6 󴀕󴀬 C2 H4 + H2    (3.26)
C2 H4 󴀕󴀬 C2 H2 + H2    (3.27)
CH4 + H2 O 󴀕󴀬 CO + 3H2    (3.28)
CO + H2 O 󴀕󴀬 CO2 + H2 ,    (3.29)

where not all of the reactions are independent. There are many ways to determine the
independent reactions, as described earlier; however, when the number of species is
large, the determination may be cumbersome and computer programming can be
utilized for the task. As an illustration, we consider the above example of oxidative
dehydrogenation of methane, where the nine species (given in equation (3.20)) are
made up of three atoms, leading to the following atomic matrix:
      CH4  C2H6  H2O  H2  O2  CO  CO2  C2H4  C2H2
C  (   1    2     0    0   0   1    1     2     2  )
H  (   4    6     2    2   0   0    0     4     2  )    (3.30)
O  (   0    0     1    0   2   1    2     0     0  )

It may be verified that the rank of this matrix is 3 from its row echelon form:

\begin{pmatrix} 1 & 0 & 0 & 2 & -4 & -5 & -7 & -2 & -4 \\ 0 & 1 & 0 & -1 & 2 & 3 & 4 & 2 & 3 \\ 0 & 0 & 1 & 0 & 2 & 1 & 2 & 0 & 0 \end{pmatrix}.    (3.31)

Thus, the number of independent reactions is 9 − 3 = 6. Similarly, the number of


independent reactions can also be determined from stoichiometry. For example, the
stoichiometric matrix ν can be given from equations (3.21)–(3.29) as follows:

ν = \begin{pmatrix} -2 & 1 & 1 & 0 & -1/2 & 0 & 0 & 0 & 0 \\ -2 & 0 & 2 & 0 & -1 & 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 2 & -1/2 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -1/2 & -1 & 1 & 0 & 0 \\ -2 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 & 1 \\ -1 & 0 & -1 & 3 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 & 0 & -1 & 1 & 0 & 0 \end{pmatrix}    (3.32)

where species numbers are assigned as the order they appear in equation (3.20). The
row-echelon form of this matrix can be obtained as follows:

ν̂ = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & -3/2 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & -2 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 & -1/2 & -5/4 & 3/2 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 & -1 & -1/2 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & -1/2 & 1/4 & -1/2 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}    (3.33)

which shows that the rank of the stoichiometric matrix is 6, i. e., the number of inde-
pendent reactions is 6. These independent reactions can be obtained by multiplying ν̂
(given in equation (3.33)) with the species vector given in equation (3.20).
Note that this set of independent reactions is not unique and can be determined in
many other ways, e. g., another method is shown below, which is based on eliminating
each atom as follows:

Step 1: Write each species in terms of atoms


CH4 = C + 4H
C2 H6 = 2C + 6H
H2 O = 2H + O
H2 = 2H
O2 = 2O
CO = C + O
CO2 = C + 2O
C2 H4 = 2C + 4H
C2 H2 = 2C + 2H

Step 2: Eliminate the atoms


Eliminating H from the above nine equations, we get

CH4 = C + 2H2
C2 H6 = 2C + 3H2
H2 O = H2 + O
H2 = H2
O2 = 2O
CO = C + O
CO2 = C + 2O
C2 H4 = 2C + 2H2
C2 H2 = 2C + H2

Eliminating O from the above eight equations, we get

CH4 = C + 2H2
C2 H6 = 2C + 3H2
H2 O = H2 + (1/2) O2
H2 = H2
CO = C + (1/2) O2
CO2 = C + O2
C2 H4 = 2C + 2H2
C2 H2 = 2C + H2

Finally, eliminating C from the above seven equations, we get the six linearly indepen-
dent reactions:

2CH4 = C2 H6 + H2 : dimerization/pyrolysis
H2 + (1/2) O2 = H2 O : hydrogen oxidation
CH4 + (1/2) O2 = CO + 2H2 : partial oxidation to syngas
CO + (1/2) O2 = CO2 : CO oxidation
C2 H6 = C2 H4 + H2 : dehydrogenation of ethane
C2 H4 = C2 H2 + H2 : dehydrogenation of ethylene

Problems
1. (Linear dependence and independence of vectors): Which of the following sets of
vectors are linearly independent? Find the corresponding linear relations:
(i) (5, 4, 3), (3, 3, 2), (8, 1, 3)
(ii) (4, −5, 2, 6), (2, −2, 1, 3), (6, −3, 3, 9), (4, −1, 5, 6)
(iii) Suppose we have a set of vectors

aTi = (ai1 , ai2 , . . . , ain ) i = 1, 2, . . . , n

such that
|ajj | > ∑_{i=1, i≠j}^{n} |aij | ,  j = 1, 2, . . . , n

Show that this set of vectors is linearly independent.


2. (Application of vector expansions to stoichiometry):

(a) Determine the number of independent reactions in the following set by exam-
ining the rank of the stoichiometric coefficient matrix:

4NH3 + 5O2 = 4NO + 6H2 O


4NH3 + 3O2 = 2N2 + 6H2 O
O2 + 2NO = 2NO2
N2 + O2 = 2NO

(b) Determine the number of independent reactions in the above system by ex-
amining the rank of the atomic matrix.
(c) The following species are found to be present in the pyrolysis of a low molec-
ular hydrocarbon:

C2 H6 , H, C2 H5 , CH3 , CH4 , H2 , C2 H4 , C3 H8 , C4 H10 .

Determine the number of independent reactions and write down a set of in-
dependent reactions.
3. (Application of vector expansions to dimensional analysis):
(a) When a gas and liquid flow simultaneously in a horizontal pipe, several dif-
ferent flow patterns are obtained (e. g., stratified flow, bubble flow, slug flow,
annular flow, etc.). The type of flow pattern obtained in a particular system
depends on the pipe diameter (D), the liquid and gas superficial velocities
(ULS , UGs ), the density and viscosities of the phases (ρL , ρG , μL , μG ), the inter-
facial (surface) tension (σ) and the gravitational acceleration (g). Determine
the relevant dimensionless groups and give a physical interpretation.
(b) Small droplets of liquid are formed when a liquid jet breaks up in spray and
fuel injection processes. Assume that the droplet diameter (d) depends on the
liquid density, viscosity and surface tension, as well as the jet speed (V) and
diameter (D). Determine the relationship between these quantities by dimen-
sional analysis. Give a physical interpretation of the dimensionless groups.
4. (Application of vector expansions to dimensional analysis): It was shown by G. I.
Taylor that the energy (E) released in a nuclear explosion may be estimated from
the relation
R = (E/ρ0 )^{1/5} c t^{2/5} ,

where R is the radius of the spherical shock wave generated by the explosion, ρ0 is
the ambient density, t is the time and c is a constant. Taylor suggested to determine
the constant c (which turns out to be close to unity) by using experimentation
with lighter explosives (such as TNT) and E by using photographic data of R as a
function of time.

(a) Assuming that R depends on E, ρ0 , t and the ambient pressure p0 , derive the
relevant dimensionless groups.
(b) Discuss the additional assumptions or approximations involved in obtaining
Taylor’s formula from the result in (a).
5. (Gas phase microkinetics): Consider a gas phase system consisting of molecules
H2 , Br2 , HBr and free radicals H and Br.
(a) Determine the number of independent reactions and write down one such set.
(b) Determine the number of reactions if the system has no free radicals.
6. (Catalytic microkinetics) In the oxidation of CO on a catalytic site (s), the following
gas phase and surface species are present:

CO, CO2 , O2 , s, CO.s, O2 .s, O.s, CO2 .s

Determine the number of independent reactions and write down one such set.
4 Solution of linear equations by eigenvector expansions
The main goal of this chapter is to solve the linear algebraic equations

Au = b, (4.1)

the linear initial value problem

du/dt = Au,  t > 0;  u(t = 0) = u0 ,    (4.2)

and related equations containing a square matrix A by eigenvector expansions. As


stated in the Introduction, the solution of these equations reveals the structure of the
solutions of many other linear equations containing differential and integral opera-
tors.

4.1 The matrix eigenvalue problem


Let A be an n × n square matrix with real or complex entries. Consider the system of
homogeneous equations

Ax = λx (4.3)

where λ is a scalar.

Definition. A real or complex number λ is called an eigenvalue of A if the system of ho-


mogeneous equations (4.3) has a nontrivial solution. The nontrivial solution is called
the eigenvector, or more precisely, the right eigenvector of A corresponding to eigen-
value λ.

Eigenvalues are of fundamental importance in most physical systems as they


represent the time or length scales (temporal or spatial frequencies) associated with
the system. The eigenvectors corresponding to the eigenvalues describe the different
modes (or states of the system). We give here a geometrical interpretation and defer
their physical interpretation until we consider specific physical examples.
In order to interpret equation (4.3) and the concept of right eigenvectors geomet-
rically, we consider the case of two dimensions. Let

x = (x1 , x2 )^T ,  A = \begin{pmatrix} a11 & a12 \\ a21 & a22 \end{pmatrix}

and


y = Ax = \begin{pmatrix} a11 & a12 \\ a21 & a22 \end{pmatrix} \begin{pmatrix} x1 \\ x2 \end{pmatrix} = \begin{pmatrix} a11 x1 + a12 x2 \\ a21 x1 + a22 x2 \end{pmatrix} ≡ \begin{pmatrix} y1 \\ y2 \end{pmatrix}

The matrix A operating on the vector x gives the vector y. In general, the length of y
is different from that of x as the operator A stretches (or contracts) and rotates x to
obtain y. However, when y = λx (with λ real) we see that when A operates on x we get
only a stretching (or contraction) of x but there is no rotation (Figure 4.1 shows this for
λ real and positive).

Figure 4.1: Schematic diagrams giving geometrical interpretation of y = Ax.

We note that equation (4.3) is a system of n homogeneous equations in n unknowns


and may be written as

(a11 − λ)x1 + a12 x2 + ⋅ ⋅ ⋅ + a1n xn = 0


a21 x1 + (a22 − λ)x2 + ⋅ ⋅ ⋅ + a2n xn = 0
. (4.4)
.
an1 x1 + an2 x2 + ⋅ ⋅ ⋅ + (ann − λ)xn = 0

We have seen that this homogeneous system has a nontrivial solution iff det(A−λI) = 0.

⇒ Pn (λ) ≡ \begin{vmatrix} a_{11}-\lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22}-\lambda & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn}-\lambda \end{vmatrix} = 0    (4.5)

Equation (4.5) is called the characteristic equation of the square matrix A. The LHS of
(4.5) is a polynomial of degree n in λ and may be written as

Pn (λ) = (−λ)^n + a1 (−λ)^{n−1} + ⋅ ⋅ ⋅ + an−1 (−λ) + an = 0    (4.6)
       = (λ1 − λ)(λ2 − λ) . . . (λn − λ)

In order to determine the number of solutions of the characteristic equation, we invoke


the fundamental theorem of algebra.

Theorem. Every polynomial of degree n has exactly n roots (real or complex with count-
ing of repetition or multiplicity).
It follows from this theorem that a square matrix A of order n has n eigenvalues.

Definition. An eigenvalue λi of A is called simple if

Pn (λi ) = 0 and Pn′ (λi ) = dPn (λ)/dλ |_{λ=λi} ≠ 0

Theorem. If λi is a simple eigenvalue of A, then the corresponding eigenvector obtained


by solving

(A − λi I)xi = 0

is determined uniquely except for a nonzero multiplicative constant.

Proof. We have

Pn (λ) = \begin{vmatrix} a_{11}-\lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22}-\lambda & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn}-\lambda \end{vmatrix}

Using the formula for the differentiation of a determinant (differentiate one row at a
time and add the results), we have

Pn′ (λ) = − ∑_{k=1}^{n} Δk (λ),

where Δk (λ) is the (n − 1) × (n − 1) determinant obtained from |A − λI| by deleting its
k-th row and k-th column.

If Pn′ (λi ) ≠ 0, then at least one of the above (n − 1) × (n − 1) determinants is not zero.

⇒ rank(A − λi I) = n − 1.
∴ There is only one linearly independent solution to the homogeneous system (A −
λi I)xi = 0.
∴ The result.

Remark. The eigenvalues are also called characteristic values, characteristic roots or
latent roots.

Example 4.1 (Characteristic equation for 2 × 2 and 3 × 3 matrices). We have

P2 (λ) = \begin{vmatrix} a_{11}-\lambda & a_{12} \\ a_{21} & a_{22}-\lambda \end{vmatrix} = λ² − (a11 + a22 )λ + (a11 a22 − a12 a21 )
       = λ² − (tr A)λ + det A = (λ1 − λ)(λ2 − λ)

where tr A is the trace of A (sum of diagonal elements) and det A is the determinant.
For the 3 × 3 case, we have

P3 (λ) = \begin{vmatrix} a_{11}-\lambda & a_{12} & a_{13} \\ a_{21} & a_{22}-\lambda & a_{23} \\ a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}
       = −λ³ + (tr A)λ² − (a11 a22 + a11 a33 + a22 a33 − a12 a21 − a13 a31 − a23 a32 )λ + det A
       = (λ1 − λ)(λ2 − λ)(λ3 − λ)

For the general case, the characteristic equation may be written as

Pn (λ) = (λ1 − λ)(λ2 − λ) . . . (λn − λ) = ∏_{j=1}^{n} (λj − λ)

and we have the following relations:

∏_{j=1}^{n} λj = det A;  ∑_{j=1}^{n} λj = tr A.

4.2 Left eigenvectors and the adjoint eigenvalue problem (eigenrows)

Let A be an n × n square matrix with real or complex entries. Let y∗ be a row vector:

y∗ = (ȳ)^T = ( ȳ1  ȳ2  ȳ3  . . .  ȳn )

where ∗ denotes the complex conjugate transpose and y = (y1 , y2 , . . . , yn )^T .

We consider the eigenvalue problem

y∗ A = μy∗ (4.7)

To illustrate, consider the case of n = 2. Then equation (4.7) gives

( y1  y2 ) \begin{pmatrix} a11 & a12 \\ a21 & a22 \end{pmatrix} = μ ( y1  y2 )

Multiplying A on the left by a row vector gives another row vector. Thus, we get

y∗ A = ( a11 y1 + a21 y2   a12 y1 + a22 y2 ) = ( μy1   μy2 )

This gives the homogeneous equations:

a11 y1 + a21 y2 − μy1 = 0


a12 y1 + a22 y2 − μy2 = 0

Definition. A real or complex number μ for which (4.7) has nontrivial solutions is
called an eigenvalue of A and the nontrivial solution y∗ is called an eigenrow or, more
precisely, the left eigenvector of A corresponding to the eigenvalue μ.

Taking the complex conjugate transpose operation, equation (4.7) may also be
written as

(y∗ A)∗ = (μy∗ )∗
⇒ A∗ y = μ̄ y    (4.8)

Thus, the left eigenvalue problem for matrix A (also called the adjoint eigenvalue prob-
lem) is an eigenvalue problem for A∗ [Considered as an operator, the matrix A∗ is
called the adjoint of A]. We shall refer to the column vector y as the adjoint eigen-
vector. (The complex conjugate of the transpose of y, namely y∗ will be referred to as
the eigenrow or left eigenvector of A).
When A has real elements, equation (4.8) reduces to

A^T y = μ̄ y    (4.9)

Theorem. The set of eigenvalues defined by equation (4.8) is identical to that defined
by equation (4.3).

Proof. The eigenvalues of equation (4.8) are the roots of the polynomial

Qn (μ) = |A∗ − μ̄ I| = |(Ā − μ̄ I)^T | = |Ā − μ̄ I| = \overline{|A − μI|} = P̄n (μ),

and P̄n (μ) = 0 iff Pn (μ) = 0.

Thus, the adjoint problem has the same set of eigenvalues. This can be seen more
directly from equation (4.7). This equation may be written as the homogeneous system

y∗ (A − μI) = 0 (4.10)

The condition for a nontrivial solution is

|A − μI| = Pn (μ) = 0. (4.11)

Thus, the eigenvalues defined by (4.3) and (4.7) are the same. However, note that if (4.7)
is written in the form given by equation (4.8), then the adjoint eigenvalue problem has
eigenvalues

μ̄ = λ̄ .    (4.12)

Thus, if λ is an eigenvalue of A, then λ̄ is an eigenvalue of A∗ .

4.3 Properties of eigenvectors/eigenrows


We now consider some properties of eigenvectors and eigenrows that follow from the
definition:
1.(a) If xj is an eigenvector of A corresponding to eigenvalue λj , then so is αxj where
α is any nonzero constant:

Axj = λj xj
⇒ A(αxj ) = λj (αxj )
⇒ αxj (≠ 0) is also an eigenvector.

1.(b) If λi is a simple eigenvalue, the eigenvector is uniquely determined except for an


arbitrary nonzero factor.

2.(a) If xi and xj are eigenvectors of A corresponding to eigenvalues λi and λj (λj ≠ λi ),


then xi and xj are linearly independent.

Proof. Suppose that xi and xj are linearly dependent. Then there exist constants ci
and cj such that

ci xi + cj xj = 0

and at least one of these constants is not zero. Assume that ci ≠ 0. Then

xi = −(cj /ci ) xj
⇒ Axi = −(cj /ci ) Axj
⇒ λi xi = −(cj /ci ) λj xj
⇒ λi xi = λj (−(cj /ci ) xj ) = λj xi
⇒ (λi − λj )xi = 0

or

xi = 0

But xi cannot be zero since it is an eigenvector. We have also assumed that the eigen-
values are distinct, i. e., λi ≠ λj . Therefore, we arrive at a contradiction. ⇒ xi and xj
are linearly independent.

2.(b) Suppose that A has simple and distinct eigenvalues {λ1 , λ2 , . . . , λn } with eigen-
vectors {x1 , x2 , . . . , xn }. Then the set of eigenvectors {x1 , x2 , x3 , . . . , xn } is linearly
independent.

Proof. Suppose that {x1 , x2 , x3 , . . . , xn } is linearly dependent. Then ∃ constants ci not


all zero such that

c1 x1 + c2 x2 + ⋅ ⋅ ⋅ + cn xn = 0 (4.13)

Premultiplying (4.13) by (A − λ1 I) gives

c1 (A − λ1 I)x1 + c2 (A − λ1 I)x2 + ⋅ ⋅ ⋅ + cn (A − λ1 I)xn = 0

Using the fact that

Axi = λi xi ⇒
0 + c2 (λ2 − λ1 )x2 + ⋅ ⋅ ⋅ + cn (λn − λ1 )xn = 0 (4.14)

Premultiply (4.14) by (A − λ2 I) ⇒

c3 (λ3 − λ2 )(λ3 − λ1 )x3 + ⋅ ⋅ ⋅ + cn (λn − λ2 )(λn − λ1 )xn = 0

⇒ ∑_{j=3}^{n} cj [ ∏_{i=1}^{2} (λj − λi ) ] xj = 0

Continuing this procedure, we get after (n − 1) steps

cn [ ∏_{i=1}^{n−1} (λn − λi ) ] xn = 0    (4.15)

Since the eigenvalues are all distinct and xn ≠ 0, (4.15) ⇒ cn = 0. Repeating the same
argument with the remaining part of equation (4.13), we can show that

cn−1 = cn−2 = ⋅ ⋅ ⋅ = c1 = 0

Thus, all constants are zero and we have a contradiction. This implies that the eigen-
vectors are linearly independent.
3. Properties (i) and (ii) are also valid for the eigenrows.
4.(a) If xi is an eigenvector of A corresponding to eigenvalue λi and y∗j is the eigenrow
of A corresponding to eigenvalue λj (≠ λi ), then we have

y∗j xi = 0 (4.16)

This important property is referred to as the biorthogonality property, i. e., eigen-


rows and eigenvectors corresponding to different eigenvalues are orthogonal.

Proof. We have from the definition,

Axi = λi xi (4.17)
y∗j A = λj y∗j (4.18)

Multiply equation (4.18) on the right by xi ⇒

y∗j Axi = λj y∗j xi

Using (4.17) ⇒

y∗j λi xi = λj y∗j xi
λi y∗j xi = λj y∗j xi (since λi is a scalar)

(λi − λj )y∗j xi = 0

Since λi ≠ λj ⇒ y∗j xi = 0. In terms of the dot product, this result may be written as
⟨xi , yj ⟩ = 0; i ≠ j.
4.(b) Suppose that A has simple and distinct eigenvalues λ1 , λ2 , . . . , λn with eigenvec-
tors {x1 , x2 , . . . , xn } and eigenrows {y∗1 , y∗2 , . . . , y∗n }. Then

y∗j xi = ⟨xi , yj ⟩ = 0 i ≠ j (4.19)


y∗j xj ≠ 0 (4.20)

We have already proved equation (4.19). To prove (4.20), we use the property
that the eigenvectors and eigenrows are linearly independent. Now, if y∗j xj = 0,
xj is orthogonal to n linearly independent vectors yi , i = 1, . . . , n. However, the
only such vector is the zero vector. But xj ≠ 0 ⇒ y∗j xj ≠ 0. We can normalize the
eigenrows (or eigenvectors) such that

⟨xi , yj ⟩ = δij (4.21)

[Note: The symbol δij is the Kronecker delta, which takes a value of unity when
the indices are equal and zero otherwise].

We now present some examples illustrating the calculation of the eigenvalues,


eigenvectors and eigenrows.
Example 4.2.

A = \begin{pmatrix} -3 & 2 \\ 4 & -5 \end{pmatrix}

This is a real matrix and is not symmetric.


Eigenvalues:

P(λ) = \begin{vmatrix} -3-\lambda & 2 \\ 4 & -5-\lambda \end{vmatrix} = λ² + 8λ + 7 = (λ + 1)(λ + 7)

P(λ) = 0 ⇒ λ1 = −1,  λ2 = −7

eigenvectors:

(A − λ1 I)x1 = 0 ⇒ \begin{pmatrix} -2 & 2 \\ 4 & -4 \end{pmatrix} x1 = 0 ⇒ x1 = (1, 1)^T

(A − λ2 I)x2 = 0 ⇒ \begin{pmatrix} 4 & 2 \\ 4 & 2 \end{pmatrix} x2 = 0 ⇒ x2 = (1, −2)^T

eigenrows:

yT1 (A − λ1 I) = 0 ⇒ yT1 \begin{pmatrix} -2 & 2 \\ 4 & -4 \end{pmatrix} = 0 ⇒ yT1 = (2, 1)

yT2 (A − λ2 I) = 0 ⇒ yT2 \begin{pmatrix} 4 & 2 \\ 4 & 2 \end{pmatrix} = 0 ⇒ yT2 = (1, −1)

Thus, we have
eigenvalues λ1 = −1, λ2 = −7;
eigenvectors x1 = (1, 1)^T , x2 = (1, −2)^T ;
eigenrows yT1 = (2, 1), yT2 = (1, −1).
Biorthogonality relations:

yT1 x1 = 3 ≠ 0 yT2 x1 = 0
yT1 x2 = 0 yT2 x2 = 3 ≠ 0

Figure 4.2 shows a plot of the eigenvectors and eigenrows (dashed lines). The biorthog-
onality relationship can be seen clearly.
Normalizing the eigenrows such that

yTi xj = δij

gives the normalized eigenrows as yT1 = (2/3, 1/3), yT2 = (1/3, −1/3).

Example 4.3.

A = \begin{pmatrix} -1 & 1 \\ -1 & -1 \end{pmatrix}

Here, A is a real matrix but not symmetric.



Figure 4.2: Schematic plot of the eigenvectors and eigenrows of the matrix in Example (4.2).

A∗ = A^T = \begin{pmatrix} -1 & -1 \\ 1 & -1 \end{pmatrix}

P(λ) = |A − λI| = \begin{vmatrix} -1-\lambda & 1 \\ -1 & -1-\lambda \end{vmatrix} = λ² + 2λ + 2

P(λ) = 0 ⇒ λ1 = −1 + i,  λ2 = −1 − i

Eigenvectors:

(A − λ1 I)x1 = 0 ⇒ \begin{pmatrix} -i & 1 \\ -1 & -i \end{pmatrix} x1 = 0 ⇒ x1 = (1, i)^T

(A − λ2 I)x2 = 0 ⇒ \begin{pmatrix} i & 1 \\ -1 & i \end{pmatrix} x2 = 0 ⇒ x2 = (1, −i)^T

eigenrows:

y∗1 (A − λ1 I) = 0 ⇒ ( ȳ11  ȳ12 ) \begin{pmatrix} -i & 1 \\ -1 & -i \end{pmatrix} = ( 0  0 ) ⇒ y∗1 = ( i  1 ),  or y1 = (−i, 1)^T

y∗2 (A − λ2 I) = 0 ⇒ ( ȳ21  ȳ22 ) \begin{pmatrix} i & 1 \\ -1 & i \end{pmatrix} = ( 0  0 ) ⇒ y∗2 = ( −i  1 ),  or y2 = (i, 1)^T

To summarize, we have

eigenvalues: λ1 = −1 + i, λ2 = −1 − i;
eigenvectors: x1 = (1, i)^T , x2 = (1, −i)^T ;
eigenrows: y∗1 = ( i  1 ), y∗2 = ( −i  1 ), or adjoint eigenvectors y1 = (−i, 1)^T , y2 = (i, 1)^T .

Biorthogonality

y∗1 x1 = ⟨x1 , y1 ⟩ = i + i = 2i ≠ 0
y∗2 x1 = ⟨x1 , y2 ⟩ = −i + i = 0
y∗1 x2 = ⟨x2 , y1 ⟩ = i − i = 0
y∗2 x2 = ⟨x2 , y2 ⟩ = −i − i = −2i ≠ 0

Example 4.4.

A = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}

Here, A is a real symmetric matrix.


P(λ) = \begin{vmatrix} -1-\lambda & 1 \\ 1 & -1-\lambda \end{vmatrix} = λ² + 2λ = 0 ⇒ λ1 = 0,  λ2 = −2

(A − λ1 I)x1 = 0 ⇒ \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} x1 = 0 ⇒ x1 = (1, 1)^T

(A − λ2 I)x2 = 0 ⇒ \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} x2 = 0 ⇒ x2 = (1, −1)^T

Since A^T = A, the adjoint eigenvectors are the same as the eigenvectors, i. e., y1 = x1
and y2 = x2 . Note also that x1 and x2 are orthogonal to each other. Normalizing these
vectors such that ‖xi ‖ = 1, we obtain the orthonormal set of eigenvectors (and adjoint
eigenvectors):

x1 = (1/√2, 1/√2)^T ,  x2 = (1/√2, −1/√2)^T .

Example 4.5. We consider the Hermitian matrix

A = \begin{pmatrix} 1 & 1-i \\ 1+i & 2 \end{pmatrix}

Recall that a Hermitian matrix is characterized by A∗ = A:

Ā = \begin{pmatrix} 1 & 1+i \\ 1-i & 2 \end{pmatrix},  (Ā)^T = A∗ = \begin{pmatrix} 1 & 1-i \\ 1+i & 2 \end{pmatrix} = A

P(λ) = \begin{vmatrix} 1-\lambda & 1-i \\ 1+i & 2-\lambda \end{vmatrix} = λ² − 3λ

P(λ) = 0 ⇒ λ1 = 0,  λ2 = 3

Eigenvectors

(A − λ1 I)x1 = 0 ⇒ \begin{pmatrix} 1 & 1-i \\ 1+i & 2 \end{pmatrix} x1 = 0 ⇒ x1 = (−1 + i, 1)^T

y∗1 (A − λ1 I) = 0 ⇒ ( y11  y12 ) \begin{pmatrix} 1 & 1-i \\ 1+i & 2 \end{pmatrix} = ( 0  0 ) ⇒ y∗1 = ( −1 − i  1 ),  or y1 = (−1 + i, 1)^T = x1

(A − λ2 I)x2 = 0 ⇒ \begin{pmatrix} -2 & 1-i \\ 1+i & -1 \end{pmatrix} x2 = 0 ⇒ x2 = (1 − i, 2)^T = y2

y∗2 x1 = ( 1 + i  2 ) (−1 + i, 1)^T = 0,  y∗1 x2 = 0.

We now prove an important theorem about the eigenvalues of a real symmetric


matrix (AT = A) or a complex Hermitian matrix (A∗ = A).

Theorem. Suppose that the square matrix A is such that A∗ = A. Then the eigenvalues
of A are real and the left and right eigenvectors of A are related by y∗i = x∗i , i. e., the

eigenrows are the conjugate transposes of the eigenvectors (equivalently, the eigenvec-
tors and adjoint eigenvectors are the same).

Proof. (a) Let λ be an eigenvalue of A and x be the corresponding eigenvector. From


the definition, we have

Ax = λx (4.22)

Premultiplying (4.22) by x∗ (or taking the dot or inner product with x) we get

x∗ Ax = λx∗ x (4.23)

Now, x∗ Ax is a scalar and equation (4.23) is a scalar identity. We take the ∗ operation
(complex conjugation and transpose) on both sides of (4.23) ⇒

(x∗ Ax)∗ = (λx∗ x)∗
⇒ x∗ A∗ x = λ̄ x∗ x    (4.24)
⇒ x∗ Ax = λ̄ x∗ x  (since A∗ = A)
⇒ x∗ λx = λ̄ x∗ x
⇒ λ x∗ x = λ̄ x∗ x  (since λ is a scalar)
⇒ (λ − λ̄ ) x∗ x = 0

But x∗ x = ‖x‖2 ≠ 0 since x is an eigenvector

󳨐⇒ λ = λ̄
󳨐⇒ λ is real

(b) The eigenrows are defined by

y∗i A = λi y∗i (4.25)

Taking the complex conjugate transpose of (4.22) gives

x∗i A∗ = λ̄i x∗i ⇒ x∗i A = λi x∗i  (since A∗ = A and λi is real)    (4.26)

Comparing (4.25) and (4.26), we see that y∗i may be chosen to be a scalar multiple
of x∗i . Thus, we choose y∗i = x∗i . Because of this very important property (real eigen-
values and orthogonal set of eigenvectors) of real symmetric (or complex Hermitian)
matrices, many problems involving such matrices can be solved using only orthogonal
expansions.

4.4 Orthogonal and biorthogonal expansions


4.4.1 Vector expansions

Let {x1 , x2 , . . . , xn } be a set of n linearly independent vectors each containing n ele-


ments. Let z be any other vector with n elements. Then z can be expanded as
z = ∑_{i=1}^{n} αi xi    (4.27)

This expansion is unique, i. e., the coefficients αi are uniquely determined for each vec-
tor z. These coefficients are called the coordinates of z w. r. t the basis {x1 , x2 , . . . , xn }.
In general, we have to solve a set of n linear equations in n unknowns to determine
{αi }.

Example 4.6. Consider ℝ2 and take x1 = (1, 2)^T , x2 = (2, 5)^T as a basis. Find the coordinates
of z = (1, −1)^T with respect to this basis. Writing

(1, −1)^T = α1 (1, 2)^T + α2 (2, 5)^T

gives the linear equations

gives the linear equations

α1 + 2α2 = 1
2α1 + 5α2 = −1.

Solving these equations, we obtain

α1 = 7
α2 = −3

4.4.2 Orthogonal expansions

Now, suppose that {x1 , x2 , . . . , xn } is an orthogonal set, i. e.,

x∗i xj = 0 if i ≠ j (4.28)

Then the determination of the coefficients in equation (4.27) is simplified greatly as


shown below. Multiply (4.27) by x∗j on the left (or take scalar product of equation (4.27)
with xj ), 󳨐⇒

x∗j z = ∑_{i=1}^{n} αi x∗j xi    (4.29)

Since x∗j xj = ‖xj ‖² ≠ 0, there is only one nonzero term on the RHS of equation (4.29).
Now, the linear equations for αj are decoupled and we can solve for αj as

αj = x∗j z / x∗j xj    (4.30)

Thus, we obtain the expansion

z = ∑_{j=1}^{n} (⟨z, xj ⟩/‖xj ‖²) xj    (4.31)

Note that if each xj is normalized so that ‖xj ‖ = 1, then we have

αj = x∗j z = ⟨z, xj ⟩.

Thus, the coordinates of any vector z w. r. t. an orthonormal basis can be obtained by


simply taking the dot product of z with each basis vector.

Example 4.7.
(a) Consider ℝ2 and take e1 = (1, 0)^T , e2 = (0, 1)^T as the orthogonal set. Taking

z = (1, −1)^T ,

we have z = α1 e1 + α2 e2 with

α1 = eT1 z = 1  (first element of z)
α2 = eT2 z = −1  (second element of z)

(b) Consider ℝ2 and take x1 = (1/√2, 1/√2)^T , x2 = (1/√2, −1/√2)^T . Then

z = α1 x1 + α2 x2 ,
α1 = xT1 z = 1/√2 − 1/√2 = 0,
α2 = xT2 z = 1/√2 + 1/√2 = √2.

4.4.3 Biorthogonal expansions

Let {x1 , x2 , . . . , xn } be a set of n linearly independent column vectors and {y∗1 , y∗2 , . . . , y∗n }
be another set of n linearly independent row vectors. Suppose that

y∗j xi = 0 if i ≠ j (4.32)

These vectors satisfy the biorthogonality relation stated above. Now, let z be any vector
and consider the expansion of z in terms of the set {x1 , x2 , x3 , . . . , xn } :

z = ∑_{i=1}^{n} αi xi    (4.33)

To determine αi , we multiply equation (4.33) on the left by y∗j 󳨐⇒

y∗j z = ∑_{i=1}^{n} αi y∗j xi

Again, due to the biorthogonality property, there is only one nonzero term (corre-
sponding to i = j) in the sum and we get

αj = y∗j z / y∗j xj    (4.34)

Substituting (4.34) in (4.33), we get the identity

z = ∑_{j=1}^{n} (y∗j z / y∗j xj ) xj    (4.35)

Given the two sets of vectors {xi } and {y∗j }, we can define a transformation that con-
verts the vector z into the vector α through equation (4.34). Given the vector α, we can
recover z through equation (4.33). Thus, we have a transform:

z → α,  αi = y∗i z / y∗i xi ,    (4.36)

and an inverse transform:


α → z,  z = ∑_{i=1}^{n} αi xi    (4.37)

This procedure may be used to decouple and solve many linear equations containing
a square matrix A. This is illustrated in the next section.

4.5 Solution of linear equations using eigenvector expansions


In this section, we show how the eigenvector expansions may be used to solve many
types of linear equations containing a square matrix A. Here, we shall assume that the
eigenvalues of A are simple, and hence there are n eigenvectors (and n eigenrows when
A is not symmetric). The case of repeated eigenvalues will be considered in Chapter 6.

4.5.1 Solution of linear algebraic equations Au = b

Consider the solution of the linear system

Au = b (4.38)

where A is a square matrix of order n and u, b are n × 1 vectors. First, we consider the
case in which rank A = n. Premultiply equation (4.38) by y∗j , 󳨐⇒

y∗j Au = y∗j b

󳨐⇒

λj y∗j u = y∗j b (4.39)

[Remark: Taking the dot product of equation (4.38) with yj or multiplying on the left
by y∗j and using the fact that y∗j is a left eigenvector of A, decouples the equations.]
Now, equation (4.39) 󳨐⇒

y∗j u = y∗j b/λj (assuming λj ≠ 0)

󳨐⇒

y∗j u / y∗j xj = (y∗j b / y∗j xj ) (1/λj )

⇒

u = ∑_{j=1}^{n} (y∗j b / y∗j xj ) (1/λj ) xj    (4.40)

is the solution to the linear equations (4.38) in terms of the eigenvalues, eigenvectors
and eigenrows of the matrix A. We shall see later that this formula is also applicable
for many other types of linear equations. A special case of the above result for the case
of a symmetric matrix A with normalized eigenvectors (orthonormal set) is
u = ∑_{j=1}^{n} (⟨b, xj ⟩/λj ) xj    (4.41)

where ⟨b, xj ⟩ = xTj b is the standard dot (inner) product. The above solution in terms of
the eigenvector expansion should be compared with that of direct solution methods
(e. g., by Gaussian elimination). When n is large and the eigenvalues are well separated
(e. g., |λ1 | ≪ |λ2 | ≪ |λ3 | ⋅ ⋅ ⋅ ≪ |λn |), only a few terms of the expansion may be sufficient
to compute the solution to the desired accuracy (the extreme case being one eigen-
value being very small in magnitude compared to all others, requiring only one term
in the expansion). In such cases, the eigenvector expansion is more useful than the di-
rect solution method, especially for large values of n, where the number of operations
required to solve the system varies as (1/3) n³. A second application of the solution in terms
of the eigenfunctions is in the development of the so-called multigrid methods for the
solution of large sparse systems (obtained by discretization of Laplace–Poisson-type
equations).

Example 4.8. Consider the symmetric 3 × 3 matrix

A = \begin{pmatrix} 4 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 4 \end{pmatrix}

that arises in the solution of the discretized Laplace/Poisson equation. The eigenvalues
are given by λ1 = 4 − √2, λ2 = 4, λ3 = 4 + √2, while the corresponding normalized
eigenvectors are

x1 = (1/2, 1/√2, 1/2)^T ,  x2 = (−1/√2, 0, 1/√2)^T ,  x3 = (1/2, −1/√2, 1/2)^T .

Taking bT = (1, 1, 1) and calculating the three terms in equation (4.41), we get

u = (0.3301, 0.4668, 0.3301)^T + (0, 0, 0)^T + (0.0270, −0.0382, 0.0270)^T
  = (0.3571, 0.4286, 0.3571)^T = (5/14, 3/7, 5/14)^T .

In this specific case, though the separation between the eigenvalues is not extreme
(λ3 /λ1 = 2.09), the first term still gives the solution to within about 10 % error.

4.5.2 (Fredholm alternative): solution of linear algebraic equations Au = b when A is singular

If rank A < n, then the solution given by equation (4.40) needs to be modified. Suppose
that rank A = r (r ≥ 1). Then we have seen that the homogeneous system

Au = 0

has (n − r) linearly independent solutions. This implies that A has (n − r) eigenvectors


corresponding to the zero eigenvalue. This implies that A has (n − r) zero eigenvalues.
Assume that λr+1 = λr+2 = ⋅ ⋅ ⋅ = λn = 0. Then it follows from equation (4.39) that

y∗j b = 0, j = r + 1, . . . , n, (4.42)

where yj are the linearly independent solutions of the adjoint homogeneous system

A∗ yj = 0, j = r + 1, . . . , n. (4.43)

Equation (4.42) is another way of expressing the consistency of the linear system, i. e.,
if A has rank r (< n), then Au = b is consistent iff b is orthogonal to the adjoint eigen-
vectors corresponding to the zero eigenvalue. In this case, the general solution of

Au = b

is of the form
r y∗j b 1 n
u = ∑( ) x + ∑ c j xj
j=1
y∗j xj λj j j=r+1

= up + uh (4.44)

Here, uh is the solution of the homogeneous equations Au = 0 containing arbitrary


constants cj (j = r + 1, . . . , n) and up is a particular solution defined by the first sum-
mation in equation (4.44). [A common case that occurs in solving nonlinear equations
near bifurcation points is r = n − 1, corresponding to a simple zero eigenvalue of the
linearized matrix. In this case, uh is a scalar multiple of the eigenvector corresponding
to the zero eigenvalue].

Example 4.9. Consider the linear system (from Example 3.6)

u1 − 2u2 − u4 = b1
−2u1 + 3u2 + 3u3 = b2
−u2 + 3u3 − 2u4 = b3
3u1 − 7u2 + 3u3 − 5u4 = b4 .

We have seen that the rank of the 4 × 4 coefficient matrix is two, implying that it has
two zero eigenvalues. We note that the adjoint homogeneous system

1 −2 0 3
T −2 3 −1 −7
A y=( )y = 0
0 3 3 3
−1 0 −2 −5
102 | 4 Solution of linear equations by eigenvector expansions

has two linearly independent solutions

−5 −2
−1 −1
y1 = ( ), y2 = ( )
0 1
1 0

Thus, the solvability conditions, equations (4.42) lead to the relations b4 = 5b1 + b2
and b3 = 2b1 + b2 . Equivalently, the system is consistent and has solutions if and only
if b is of the form

1 0
0 1
b = b1 ( ) + b2 ( ).
2 1
5 1

Taking bT = (0, 1, 1, 1), the general solution of the inhomogeneous system may be writ-
ten as

3 0 −2
0 1 −1
u = c1 ( ) + c2 ( )+( ).
2 −1 0
3 −2 0

4.5.3 Linear coupled first-order differential equations with constant coefficients

Consider the initial value problem

du
= Au (4.45)
dt
u = u0 @ t = 0 (4.46)

Equation (4.45) defines a set of n coupled linear equations while (4.46) gives the initial
conditions. For the case of n = 2, equation (4.45) in component form is

du1
= a11 u1 + a12 u2
dt
du2
= a21 u1 + a22 u2
dt

while the initial condition (4.46) takes the form

u1 = u10 , u2 = u20 @ t = 0.
4.5 Solution of linear equations using eigenvector expansions | 103

To solve equations (4.45) and (4.46), we use the biorthogonal expansion. Multiply
(4.45) on the left by y∗j 󳨐⇒

du
y∗j = y∗j Au
dt

since y∗j is a constant vector (independent of t)󳨐⇒

d ∗
(y u) = λj y∗j u
dt j

This is a scalar differential equation for y∗j u. Solving we get

y∗j u = cj eλj t

At t = 0, u = u0 󳨐⇒ cj = y∗j u0

y∗j u = (y∗j u0 )eλj t

󳨐⇒

y∗j u (y∗j u0 )
= e λj t
y∗j xj y∗j xj

󳨐⇒
n y∗j u0
u = ∑( )eλj t xj (4.47)
j=1
y∗j xj

This is the formal solution. Let

y∗j u0
ĉj =
y∗j xj

Then (4.47) gives


n
u = ∑ ĉj xj eλj t (4.48)
j=1

Thus, the solution is a linear combination of terms of the form xj eλj t . Equivalently, the
state of the system at any time is a linear combination of the eigenvectors. For this
reason, the eigenvectors are also called the fundamental modes or basic states of the
system. Note that if u0 = αxi then equation (4.47) simplifies to

u = u0 eλi t
104 | 4 Solution of linear equations by eigenvector expansions

Thus, if the initial state corresponds to one of the eigenvectors then the system will
be in that state at all times. For this reason, the eigenvectors are called invariants for
the flow (or trajectory) defined by equation (4.45). Note also that the reciprocal of the
eigenvalue λj determines the rate of change or the time constant for the system evolu-
tion for this special initial condition.

4.5.4 Linear coupled inhomogeneous equations

Consider the initial value problem

du
= Au + b, t > 0
dt
u = u0 @ t = 0

We first determine the steady-state solution by setting the time derivative to zero. As-
suming that A is invertible,

Aus + b = 0 ⇒ us = −A−1 b

Define

z = u − us
dz
= A(us + z) + b
dt
= Az
z = z0 @ t = 0; z0 = u0 − us

Using the result of equation (4.47), we have


n y∗j z0
z = ∑( )eλj t xj
j=1
y∗j xj
n y∗j z0
󳨐⇒ u = us + ∑( )xj eλj t (4.49)
j=1
y∗j xj

Physical interpretation of the solution (4.49):


If Re λj < 0 for all j, then u → us for t → ∞, i. e., the system approaches the steady-
state. Suppose the eigenvalues are all simple and real and arranged in the order

λ1 < λ2 < λ3 < ⋅ ⋅ ⋅ < λn

If λn > 0, the term containing eλn t will be the dominant term in the solution as it in-
creases without bound for t → ∞. If λn < 0 (and hence is smaller in magnitude than
4.5 Solution of linear equations using eigenvector expansions | 105

all other λi ), then the term containing eλn t will determine the time taken by the sys-
tem to approach the steady state. All other terms decay to zero more rapidly than this
term. Thus, the time constant of the system (time required for the system to approach
steady state with say < 5 % deviation ≈ |λ3 | ) is determined by the eigenvalue having
n
the smallest magnitude.
Now consider the case of complex eigenvalues. If λj = aj + ibj ,

eλj t = eaj t (cos bj t + i sin bj t), i = √−1

If aj < 0, then the approach to steady state is oscillatory. Once again, the eigenvalue
with the smallest real part (in absolute value) determines the time constant of the sys-
tem.

4.5.5 A second-order vector initial value problem

The vibrations of many systems such as coupled point masses and springs, molecules
and structures are described by the equations of the form

d2 u
M = −Ku (4.50)
dt 2
u = u0 @ t = 0 (initial displacement) (4.51)
du
= v0 @ t = 0 (initial velocity) (4.52)
dt

where M is an n × n matrix (called the inertia matrix) and K is n × n matrix (called the
stiffness matrix or matrix of spring constants) and u is the displacement vector. We
assume that M be invertible and let M−1 K = A,

d2 u
󳨐⇒ = −Au (4.53)
dt 2

Multiplying (4.53) on the left by y∗j 󳨐⇒

d2 u
y∗j = −y∗j Au
dt 2
d2 ∗
(y u) = −λj y∗j u
dt 2 j
󳨐⇒

y∗j u = c1j sin √λj t + c2j cos √λj t (4.54)


u = u0 @ t = 0 󳨐⇒ c2j = y∗j u0
106 | 4 Solution of linear equations by eigenvector expansions

du
= v0 @ t = 0 󳨐⇒ c1j √λj = y∗j v0
dt

sin √λj t
y∗j u = (y∗j v0 ) + (y∗j u0 ) cos √λj t
√λj

y∗j u y∗j v0 sin √λj t y∗j u0


=( ) + ( ) cos √λj t
y∗j xj y∗j xj √λj y∗j xj

n y∗j v0 sin √λj t y∗j u0


u = ∑[( ) + ( ) cos √λj t]xj (4.55)
j=1
y∗j xj √λj y∗j xj

This is the formal solution to the initial value problem defined by equations (4.50) to
(4.52).

Remarks and physical interpretation


1. Suppose that the initial conditions are given by

u0 = αxi , v0 = 0

Then (4.55) gives

u = (cos √λi t)u0 (4.56)

Thus, if the system is initially in a state corresponding to eigenvector xi then it


will be in that state for all t > 0. Also, the solution given by equation (4.56) is
periodic with a period T = 2π . Equivalently, the frequency of vibration is given
√λi
√λi
by fi = 2π . Therefore, the eigenvalues of A give the frequencies of vibration while
the eigenvectors correspond to pure modes of vibration.
2. The solution of the inhomogeneous system

d2 u
= −Au + b
dt 2
u = u0
@ t = 0, { du
dt
= v0

is given by

n y∗j v0 sin √λj t y∗j z0


u = us + ∑[( ) + ( ) cos √λj t]xj (4.57)
j=1
y∗j xj √λj y∗j xj
4.5 Solution of linear equations using eigenvector expansions | 107

where

us = A−1 b and z0 = u0 − us

is the equilibrium solution.

4.5.6 Multicomponent diffusion and reaction in a catalyst pore

The problem of diffusion and reaction in a catalyst pore (with no radial and only lon-
gitudinal gradients) or a slab (or plate) of catalyst is described by the vector boundary
value problem

d2 c
D = Kc, 0<s<L
ds2

with boundary conditions

c = c0 @ s = 0 (bulk or surface condition)


dc
= 0@s = L (no flux through pore end or mid-plane symmetry)
ds

Here, K is the matrix of first-order rate constants, c is the vector of species concentra-
tions and D is a (positive definite matrix) of diffusivities. Define
s
ξ = , A = D−1 KL2 (= Φ2 ; Φ = Thiele matrix)
L
d2 c
󳨐⇒ = Ac (4.58)
dξ 2
c = c0 @ ξ = 0 (4.59)
dc
= 0@ξ = 1 (4.60)

Let {λ1 , λ2 , . . . , λn } be the eigenvalues, {x1 , x2 , . . . , xn } be the eigenvectors and {y∗1 , y∗2 , . . . ,
y∗n } be the eigenrows of A, respectively. Then, from equation (4.58) we obtain after
premultiplying by y∗j ,

d2 ∗
(y c) = y∗j Ac = λj (y∗j c)
dξ 2 j
y∗j c = α1j cosh √λj ξ + α2j cosh √λj (1 − ξ )

Equation (4.60) gives

α1j = 0
108 | 4 Solution of linear equations by eigenvector expansions

Boundary condition (4.59) 󳨐⇒ α2j = (y∗j c0 )/ cosh √λj

n y∗j c0 cosh √λj (1 − ξ )


󳨐⇒ c(ξ ) = ∑( ) xj (4.61)
j=1
y∗j xj cosh √λj

This is the formal solution to the multicomponent diffusion–reaction problem defined


by equations (4.58) to (4.60). This solution may be used to determine the average re-
action rate in the pore in terms of the bulk concentrations. The average (or observed)
reaction rate vector is defined by

L
1
robs = ∫ Kc(s) ds
L
0
D dc 󵄨󵄨
= 2 ( − 󵄨󵄨󵄨󵄨 )
L dξ 󵄨ξ =0
D n yj c0

= ∑ (√λj tanh √λj )xj (4.62)
L2 j=1 y∗j xj

The diffusion disguised rate constant matrix K∗ is defined by

robs = K∗ c0 (4.63)

Comparing equations (4.62) and (4.63), we get

D n xj yj

K∗ = ∑ (√λj tanh √λj )
L2 j=1 y∗j xj
D √ −1 2
= D KL tanh(√D−1 KL2 ) (4.64)
L2

The second equality follows from the spectral theorem to be discussed in the next
chapter. In terms of the Thiele matrix, equation (4.64) may be expressed as

D
K∗ = Φ tanh(Φ) (4.65)
L2

It follows from equation (4.64) that when the pore diffusional effects are negligible
(L → 0) the diffusion disguised rate constant matrix is equal to the true rate constant
matrix (K∗ = K) while for the case of strong pore diffusional limitations (L → ∞) or
more precisely |λj | ≫ 1 for all j, we have K∗ = L1 √DK. (Remark: The square root of a
matrix is uniquely defined only for positive definite matrices. See Chapter 7 for further
details).
4.6 Diagonalization of matrices and similarity transforms | 109

4.6 Diagonalization of matrices and similarity transforms


Definition. Let A and B be two square matrices of order n. They are called similar if ∃
an invertible matrix T such that

B = TAT−1 (4.66)

From this definition, the following properties may be established:


1. If B is similar to A, then A is similar to B,

A = T−1 BT (4.67)

2. Similar matrices have the same eigenvalues. To prove this, let λ be an eigenvalue
of A and x be the corresponding eigenvector. Then

Ax = λx (4.68)

A and B are similar 󳨐⇒

T−1 BTx = λx

Since T is nonsingular, we can rewrite this expression as

B(Tx) = λ(Tx)

or

By = λy, y = Tx (4.69)

Thus, if λ is an eigenvalue of A with eigenvector x, then λ is also an eigenvalue of


B with eigenvector Tx.
3. If A has n distinct eigenvalues, then it is similar to a diagonal matrix. Let

T = (x1 , x2 , . . . , xn ) = modal matrix,

whose columns are eigenvectors of A. It may be shown (using biorthogonality


property) that

T−1 AT = Λ (diagonal matrix).

In applications, similar matrices appear when the same physical phenomenon


(described by linear equations) is modeled using different coordinate systems (not
necessarily orthogonal). The eigenvalues (which represent the system time con-
stants or frequencies) do not change but the eigenvectors change with the coordi-
nate system used. The matrix T usually defines the relationship between the two
coordinate systems.
110 | 4 Solution of linear equations by eigenvector expansions

4.6.1 Examples of similarity transforms

Consider the 3-tank interacting system shown in Figure 4.3. The model describing the
transient behavior of this system may be expressed as

dc
VT = Qc, (4.70)
dt ′

where VT is the diagonal capacitance matrix (of tank volumes) and Q is the (symmet-
ric) matrix of exchange flows. With proper normalization of time and dividing by the
respective capacitances, the dimensionless form of the model may be written as

dc
= Ac, t>0 and c = c0 @ t = 0, (4.71)
dt
−2 1 1
A = ( 1/2 −1 1/2 ) . (4.72)
1 1 −2

Figure 4.3: Three interacting tanks: Configuration 1.

[Note that even though Q is symmetric, the matrix A is not symmetric.] If we label the
tanks differently, i. e., denoting the large tank by subscript 1 as shown in Figure 4.4,
the system is now described by

dc
= Bc, t>0 and c = c0 @ t = 0, (4.73)
dt
4.6 Diagonalization of matrices and similarity transforms | 111

Figure 4.4: Three interacting tanks: Configuration 2.

−1 1/2 1/2
B=( 1 −2 1 ). (4.74)
1 1 −2

It can be seen that A and B are similar matrices. They have the same eigenvalues but
different eigenvectors.
The difference between the two cases is flipping of tanks 1 and 2. Thus, if we take

0 1 0
T=( 1 0 0 ) = T−1 , (4.75)
0 0 1

which is obtained by flipping the first and second row of the identity matrix I3 , then

0 1 0 −2 1 1 0 1 0
T−1 AT = ( 1 0 0 ) ( 1/2 −1 1/2 ) ( 1 0 0 )
0 0 1 1 1 −2 0 0 1
−1 1/2 1/2
=( 1 −2 1 ) = B.
1 1 −2

In addition, we can relate the eigenvectors as follows:

Ax = λx 󳨐⇒ TAx = λTx (4.76)


112 | 4 Solution of linear equations by eigenvector expansions

If we assume y = Tx ⇐⇒ x = T−1 y, and we can write

TAT−1 y = λy 󳨐⇒ By = λy (4.77)

In other words, eigenvalues of A and B are the same but eigenvectors are related by
flipping the first and second element of x. The eigenvalues of A are given by λ1 = 0,
1 1
λ2 = −2 and λ3 = −3 with corresponding eigenvectors x1 = ( 1 ), x2 = ( −1 ) and x3 =
1 1
1
( 0 ). Similarly, the eigenvalues of B are the same as A but corresponding eigenvectors
−1
1 0
have flipped first and second element as y1 = ( 1 ), y2 = ( 1 ) and y3 = ( 1 ).
−1
1 1 −1

Remark. Though we use the same symbol, the matrix T in the above example is not
the modal matrix. It relates the two coordinate systems.

4.6.2 Canonical form

Now, consider again the initial value problem

dc
= Ac, t>0 and c = c0 @ t = 0.
dt

If we change the coordinates by using the transform

c = Tĉ; ĉ = T−1 c (4.78)

1 1 1
where T = (x1 , x2 , x3 ) = ( 1 −1 0 ) is the modal matrix, i. e., matrix whose columns are
1 1 −1
eigenvectors xj of A. Then

dc dĉ
= Ac 󳨐⇒ T = ATĉ
dt dt
dĉ
󳨐⇒ = T−1 ATĉ = Λĉ, (4.79)
dt

which is in canonical (decoupled) form:

dĉ1
= 0,
dt
dĉ2
= −2ĉ2
dt
dĉ3
= −3ĉ3 (4.80)
dt

Here, ĉi are the canonical variables in which the original system becomes diagonal.
Thus, the solution can be expressed in these canonical variables as
4.6 Diagonalization of matrices and similarity transforms | 113

ĉ1 = ĉ10
ĉ2 = ĉ20 e−2t
ĉ3 = ĉ30 e−3t . (4.81)

Further, since the rows of T−1 define the left eigenvectors, each left eigenvector defines
a canonical variable:

yT1 1 2 1
1
T −1
= ( yT2 ) = ( 1 −2 1 )
4
yT3 2 0 −2

where yTi are (normalized) eigenrows of matrix A. Thus,

1 2 1 c1
1
ĉ = T−1 c = ( 1 −2 1 ) ( c2 ) ,
4
2 0 −2 c3

which leads to
1
ĉ1 = (c + 2c2 + c3 )
4 1
1
ĉ2 = (c1 − 2c2 + c3 )
4
1
ĉ3 = (c1 − c3 ). (4.82)
2
Similarly,

1
ĉ10 = (c + 2c20 + c30 )
4 10
1
ĉ20 = (c10 − 2c20 + c30 )
4
1
ĉ30 = (c10 − c30 )
2
Thus, we can also write

c1 + 2c2 + c3 = 4ĉ1 = 4ĉ10 = c10 + 2c20 + c30


(c1 − 2c2 + c3 ) = 4ĉ2 = 4ĉ20 e−2t = (c10 − 2c20 + c30 )e−2t
(c1 − c3 ) = 2ĉ1 = 2ĉ30 e−3t = (c10 − c30 )e−3t .

These relations also define the initial conditions in the phase space for which the tran-
sient behavior is confined to a subspace. For example, if c10 = c30 , the concentration
vector at any time is in the subspace spanned by the first two eigenvectors. Since the
system is decoupled in the variables ĉj , we refer to them as the “canonical variables.”
114 | 4 Solution of linear equations by eigenvector expansions

4.6.3 Similarity transform when AT = A

When A is real and symmetric, the eigenvalues are real and eigenvectors can be nor-
malized to have unit length. They form an orthonormal set, i. e.,

xTi xj = δij

In this case, the modal matrix is an orthogonal matrix, i. e.,

TT−1 = T−1 T = I or T−1 = TT ,

and represents either a rotation of the axes, i. e., the two coordinate systems are either
related by a pure rotation of the axes (about the origin) or a combination of rotations
and reflections.

Example 4.10. Consider a 2 × 2 real symmetric matrix A = ( −1 1


1 −1 ). The eigenvalues
of this matrix are real: λ1 = 0 and λ2 = −2 corresponding to normalized eigenvectors
x1 = √12 ( 11 ) and x2 = √12 ( −1
1 ). Thus, the modal matrix is given by

1 1 1
T = (x1 x2 ) = ( ) (4.83)
√2 1 −1

󳨐⇒

1 1 1
T−1 = ( ) = TT . (4.84)
√2 1 −1

Now, the initial value problem

dc
= Ac (4.85)
dt

can be converted into canonical form as discussed earlier by using the transform

c = Tĉ 󳨐⇒ ĉ = T−1 c

󳨐⇒
c1 + c2
ĉ1 =
√2
c1 − c2
ĉ2 = .
√2

We note that this transformation represents a 45∘ counterclockwise rotation of axes as


shown in Figure 4.5.
4.6 Diagonalization of matrices and similarity transforms | 115

Figure 4.5: Canonical transform and rotation of axes for initial value problems.

For example, in ℝ2 , the matrix

cos θ − sin θ
Rot(θ) = ( ) (4.86)
sin θ cos θ

represents a (counterclockwise) rotation about the origin by an angle θ, while the ma-
trix

cos 2ϕ sin 2ϕ
Ref(θ) = ( ) (4.87)
sin 2ϕ − cos 2ϕ

represents a reflection about a line through the origin that makes an angle ϕ with the
x-axis. The set of all 2×2 orthogonal matrices describing rotations and reflections form
an orthogonal group, denoted by O(2). As discussed in Part II, orthogonal or unitary
matrices represent rotations and reflections.

Problems
1. (a) Given the matrix

−3 0 2
A=( 1 −2 1 )
1 1 −4

i. Determine the eigenvalues.


ii. Determine the eigenvectors and a set of eigenrows (adjoint eigenvectors)
and verify by direct computation that they form two biorthogonal sets.
iii. Show by direct computation that the sets of eigenvectors and eigenrows
are linearly independent.
116 | 4 Solution of linear equations by eigenvector expansions

(b) Given the matrix

−1 2 2
A = ( −5 −3 −1 )
−3 −2 −2

(c) Repeat parts (i), (ii) and (iii).


2. Given the matrix

−3 0 2
A=( 1 −2 1 )
1 1 −4

(a) Solve the initial value problem

1
du
= Au; u(t = 0) = ( 0 )
dt
−1

(b) Solve the initial value problem

2 0
d2 y dy
= Ay; y(0) = ( 2 ) (0) = ( 0 )
dt 2 dt
0 0

3. Consider the system of first-order reactions (shown in Figure 4.6) occurring in a


batch reactor of constant volume:

Figure 4.6: Monomolecular reaction network.

k12 = 0.5s−1 k21 = 0.25s−1 k13 = 0.2s−1


k31 = 0.05s−1 k23 = 0.3s−1 k32 = 0.15s−1

(a) Formulate the differential equations that describe the evolution of the con-
centration vector.
(b) Determine the eigenvalues and eigenvectors of the matrix in (a).
4.6 Diagonalization of matrices and similarity transforms | 117

(c) Give a physical interpretation to the eigenvalues and eigenvectors (Hint: Con-
sider what happens when the initial concentration vector is equal to the equi-
librium vector plus a constant multiple of one of the other two eigenvectors).
4. Consider the vibration of the spring-mass system shown in Figure 4.7. Suppose
that the masses are equal and the springs are identical.

Figure 4.7: Schematic of spring-mass system.

(a) Formulate the Newton’s equations of motion and cast them in dimensionless
form.
(b) Determine the eigenvalues and eigenvectors of the matrix and give a physical
interpretation. Sketch the different modes of vibration.
(c) Generalize the above results to N equal masses with identical springs.
5. The thermal conductivity tensor of an anisotropic solid is given by

6 2 −2
K=( 2 6 −2 )
−2 −2 10

(a) Determine the principal conductivities (eigenvalues) and the principal axes
of conductivity (eigenvectors).
(b) Write the expanded form of the heat conduction equation

𝜕T
ρc = 󳶚.(K. 󳶚 T)
𝜕t

in the two coordinate systems.


(c) Show that with proper scaling and transformations, the heat conduction
equation can be reduced to the form

𝜕T
ρc = k 󳶚2 T
𝜕t

where k is a scalar.
6. (a) A real square matrix A is called idempotent if A2 = A. Show that the eigenval-
ues of an idempotent matrix are equal to either 0 or 1.
(b) A real square matrix A is called orthogonal if A−1 = At (transpose of A). Show
that the real eigenvalues of an orthogonal matrix are equal ±1. Show also that
the complex eigenvalues (if any) of an orthogonal matrix have an absolute
value of unity.
118 | 4 Solution of linear equations by eigenvector expansions

(c) A complex square matrix A is called unitary if A−1 = A∗ (conjugate transpose


of A). Prove that the eigenvalues of a unitary matrix are on the unit circle in
the complex plane.
(d) A complex square matrix A is called skew-Hermitian if A∗ = −A. Prove that
the eigenvalues of a skew-Hermitian matrix are purely imaginary.
(e) A complex square matrix A is called positive definite if A = S∗ S where S is
nonsingular (det S ≠ 0). Prove that any eigenvalue of A must be real and pos-
itive.

Remark. The above special matrices play a very important role in group theory,
tensor analysis, numerical computation of eigenvalues, etc.

7. Let A and B be real k × k matrices and consider the matrices

A B
L1 = ( )
B A
A B B
L2 = ( B A B )
B B A

(a) Show that the eigenvalues of the 2k × 2k matrix L1 are the same as those of
k × k matrices (A + B) and (A − B).
(b) Show that the eigenvalues of the 3k × 3k matrix L2 are the same as those of
(A + 2B) and (A − B) (repeated twice).
(c) Discuss how you would find the eigenvectors of L1 and L2 from the lower-
dimensional matrices.

Remark. Matrices of the above type appear in the transient analysis of coupled
systems such as cells (catalyst particles), reactors, distillation columns, etc.

8. Consider the differential algebraic equation (DAE) system

du
C = Au u(t = 0)u0
dt

where A and C are real n × n matrices and C is not necessarily invertible.


(a) Show that the substitution u = yeλt leads to the eigenvalue problem Ay = λCy
If the rank of C is r(0 ≤ r ≤ n), how many eigenvalues are there?
(b) What is the adjoint eigenvalue problem? Determine the form of the biorthog-
onality relations.
(c) Obtain a formal solution of the DAE in terms of the eigenvalues, eigenvectors,
etc.
(d) Can u0 be arbitrary? What conditions must be satisfied by u0 so that the DAE
is consistent and has a unique solution?
4.6 Diagonalization of matrices and similarity transforms | 119

(e) Solve the DAE

0 = u1 + 2u2 + u3
du2
= u2 + 6u3
dt
du3
= −u1 + u2 + 3u3 .
dt

9. Circulant matrices play an important role in digital signal processing and in the
computation of discrete and fast Fourier transforms. A circulant is a constant di-
agonal matrix with the special form

f0 f1 f2 . . fn−1
fn−1 f0 f1 . . fn−2
A = ( fn−2 fn−1 f0 . . fn−3 )
. . . . . .
( 1f f2 f3 . . f0 )

(a) Show that the eigenvalues of A are given by

λ = f0 + f1 ω + f2 ω2 + ⋅ ⋅ ⋅ + fn−1 ωn−1 ,

where ω is the k-th root of unity, i. e.,

ωk = 1; k = 1, 2, . . . , n

i. Determine the eigenvectors. Let F (for Fourier) be the matrix of eigen-


vectors of A. Determine F and show that it is independent of the num-
bers (f0 , f1 , f2 , . . . , fn−1 ), i. e., all circulant matrices of a given order have the
same eigenvectors.
ii. Show that FF = nI. Here, F is the complex conjugate of F.
(b) the discrete Fourier transform c of a sequence f = (f0 , f1 , f2 , . . . , fn−1 ) is a set of
n-numbers defined by

Fc = f.

Show that the discrete transform (and its inverse) can be computed by a sim-
ple matrix multiplication. Write the explicit form of these formulas (for the
transform and its inverse) in terms of the components of the Fourier matrix.
Note: The discrete transform is extremely useful in applications. Its computa-
tion involves n2 multiplications. However, by taking advantage of the special
structure of the Fourier matrix and choosing n as a power of 2, we can reduce
the number of multiplications to n log2 n. This is the basis for the fast Fourier
transform.
120 | 4 Solution of linear equations by eigenvector expansions

10. Consider the initial value problem:

du
= Au, u(t = 0) = u0
dt

where A is a n × n matrix with real elements and u is a n × 1 vector. Suppose that all
the eigenvalues of A are simple. (a) Write the general form of the solution to the
initial value problem in terms of eigenvalues, eigenvectors and eigenrows of A.
(b) Discuss the asymptotic form (t → ∞) of the solution if (i) all eigenvalues have
negative real part, (ii) one zero eigenvalue while all others have negative real part
and (iii) if A has a pair of purely imaginary eigenvalues while all others have neg-
ative real part.
11. Consider the linear system Ax = b, where A is an m × n matrix and x, b are n × 1
and m×1 vectors, respectively. Reason that a necessary and sufficient condition for
the system Ax = b to have a solution is that every solution of the system y∗ A = 0
(where y∗ is a row vector) should also satisfy y∗ b = 0.
12. A square matrix A is called normal if AA∗ = A∗ A. Show that the eigenvalues of a
normal matrix are real.
13. Suppose that A = XΛX−1 , where Λ is a diagonal matrix. Find a similarity transfor-
mation Y that diagonalizes the matrix

0 A
B=( ).
A 0

What is the diagonal form of B? Here, all matrices except B are n × n while B are
2n × 2n.
14. The same similarity transformation diagonalizes both matrices A and B. Show
that A and B must commute. (a) Two Hermitian matrices A and B have the same
eigenvalues. Show that A and B are related by unitary similarity transformation.
15. The transient response of a two-phase system (containing a solid and fluid) is de-
scribed by the equation

dc
= Ac + b(c, t) (1)
dt

c − ϵ1 1
ϵf
where c = ( cfs ) (subscript f refers to the fluid and s to the solid) A = ( 1
f
),
1−ϵf
− 1−ϵ1
f
b (c ,c ,t)
b = ( b1 (cf ,cs ,t) ), ϵf = volume fraction/capacitance of fluid phase and b is some
2 f s

nonlinear function of c as well as time. (a) Determine the eigenvalues and eigen-
vectors of A and give physical interpretation. (b) What is the transformation that
will reduce the linear part of (1) to a diagonal form? Perform this transformation
and write equation (1) in terms of the canonical variables. Give a physical inter-
pretation of the canonical variables.
4.6 Diagonalization of matrices and similarity transforms | 121

16. Consider the linear system Au = b, where A is an n × n matrix and u and b are
n × 1 vectors. Suppose that rank of A = n − 2 (a) Write the conditions the vector b
has to satisfy so that the system is consistent and has solutions. (b) Assuming that
the conditions in (a) are satisfied, write down the general form of the solution in
terms of the eigenvalues, eigenvectors and eigenrows of A. (c) What is the general
form of the solution to the initial value problem dudt
= Au − b, u(0) = 0, where A
and b are as above?
17. Consider the 3-tank interacting system with the tanks having different volumes
but the exchange flows being all identical. (a) Formulate the transient model
and cast it in dimensionless form (but without dividing by the capacitances).
(b) Rewrite the model in (a) by dividing each equation by the respective capaci-
tance term and reason that the resulting model may be interpreted as three equal
sized tanks but with different exchange flow rates. (c) Give a physical interpreta-
tion of the transient system obtained when the matrix appearing in (b) is replaced
by its transpose (or adjoint matrix).
5 Solution of linear equations containing a square
matrix
While the biorthogonal expansions were useful to express the solution of linear equa-
tions in terms of eigenvectors and eigenrows, they may not be convenient for numer-
ical calculations, especially when the order of the matrix is large. In this chapter, we
discuss other important properties of square matrices and alternate methods for deter-
mining the solutions of linear equations containing a square matrix A. The methods
discussed here also give a procedure to calculate functions defined on square matri-
ces.

5.1 Cayley–Hamilton theorem


A very important property of square matrix is given by the following Cayley–Hamilton
theorem.

Theorem. Every square matrix satisfies its own characteristic equation, i. e.,

Pn (A) = 0

where

Pn (λ) = |A − λI|
= (−λ)n + a1 (−λ)n−1 + a2 (−λ)n−2 + ⋅ ⋅ ⋅ + an−1 (−λ) + an

is the characteristic polynomial.

Before we give a proof of this theorem, we illustrate it with a simple example.

Example 5.1. Let

1 2
A=( )
3 4
󵄨󵄨 󵄨󵄨
󵄨 1−λ 2 󵄨󵄨
P2 (λ) = 󵄨󵄨󵄨󵄨 󵄨󵄨
󵄨󵄨 3 4−λ 󵄨󵄨
󵄨
= λ2 − 5λ − 2

Cayley–Hamilton theorem states that

A2 − 5A − 2I = 0.

To verify this, we compute

https://doi.org/10.1515/9783110739701-005
5.1 Cayley–Hamilton theorem | 123

1 2 1 2 7 10
A2 = ( )( )=( )
3 4 3 4 15 22
5 10 2 0 7 10
5A + 2I = ( )+( )=( )
15 20 0 2 15 22

7 10 7 10
A2 − 5A − 2I = ( )−( )
15 22 15 22
0 0
=( )=0
0 0

Thus, we have verified the Cayley–Hamilton theorem.

Proof. Let

Pn (λ) = (−λ)n + a1 (−λ)n−1 + a2 (−λ)n−2 + ⋅ ⋅ ⋅ + an−1 (−λ) + an (5.1)

Consider the identity

(A − λI) adj(A − λI) = det(A − λI) ⋅ I


= Pn (λ) ⋅ I (5.2)

where, adj(A − λI) = classical adjoint of (A − λI). As seen in Chapter 2, this is the matrix
formed from (n − 1) × (n − 1) determinants of the minors of (A − λI). Thus, each element
of the matrix adj(A − λI) is at most a polynomial of degree (n − 1) and by collecting the
coefficients of various powers of λ, we can write

adj(A − λI) = B0 λn−1 + B1 λn−2 + ⋅ ⋅ ⋅ + Bn−2 λ + Bn−1 (5.3)

where B0 , B1 , . . . , Bn−1 are n × n constant matrices. Substituting (5.1) and (5.3) in (5.2)
󳨐⇒

(A−λI)(B0 λn−1 +B1 λn−2 +⋅ ⋅ ⋅+Bn−2 λ+Bn−1 ) = [(−λ)n +a1 (−λ)n−1 +a2 (−λ)n−2 +⋅ ⋅ ⋅+an ]I (5.4)

Comparing the coefficients of various powers of λ, we get

λn : −B0 = (−1)n I (5.5)


n−1 n−1
λ : AB0 − B1 = (−1) a1 I (5.6)
n−2 n−2
λ : AB1 − B2 = (−1) a2 I (5.7)
.
.
λ: ABn−2 − Bn−1 = (−1)an−1 I (5.8)
0
λ : ABn−1 = an I (5.9)
124 | 5 Solution of linear equations containing a square matrix

Multiply (5.5) by An , (5.6) by An−1 , . . . , and (5.9) by I to obtain

−An B0 = (−1)n An
An B0 − An−1 B1 = (−1)n a1 An−1
An−1 B1 − An−2 B2 = (−1)n−2 a2 An−2
.
.
2
A Bn−2 − ABn−1 = (−1)an−1 A
ABn−1 = an I

Adding all these equations, we have

0 = (−A)n + a1 (−A)n−1 + a2 (−A)n−2 + ⋅ ⋅ ⋅ + an−1 (−A) + an I

Thus, Pn (A) = 0 and the Cayley–Hamilton theorem is proved.

Consequences of Cayley–Hamilton theorem


Let

Pn (λ) = (−λ)n + a1 (−λ)n−1 + a2 (−λ)n−2 + ⋅ ⋅ ⋅ + an−1 (−λ) + an

be the characteristic equation of A. Then Cayley–Hamilton theorem gives

Pn (A) = 0 (5.10)

Rewrite equation (5.10) as

An = α1 An−1 + α2 An−2 + ⋅ ⋅ ⋅ + αn−1 A + αn I, (5.11)

where αi = (−1)1−i ai . Thus, An can be expressed as a polynomial of degree (n − 1) in A.


Equation (5.11) gives

An+1 = α1 An + α2 An−1 + ⋅ ⋅ ⋅ + αn A
= α1 (α1 An−1 + ⋅ ⋅ ⋅ + αn I) + α2 An−1 + ⋅ ⋅ ⋅ + αn A
= β1 An−1 + β2 An−2 + ⋅ ⋅ ⋅ + βn I (5.12)

where β1 = α12 +α2 , . . . , βn = α1 αn are some constants. It follows from equation (5.12) that
An+1 can be expressed as a polynomial of degree (n−1) in A. Continuing this procedure,
we see that if

Q(λ) = ∑ ci λi (5.13)
i=0
5.2 Functions of matrices | 125

is any function that has a Maclaurin’s series expansion, then Q(A) can be expressed
as a polynomial of degree (n − 1) in A.
Now, suppose that A is invertible, i. e., A−1 exists. Then, multiplying both sides of
(5.11) by A−1 , we get

An−1 = α1 An−2 + α2 An−3 + ⋅ ⋅ ⋅ + αn−1 I + αn A−1

Assuming that αn ≠ 0, (this is the case if A is invertible), we can rewrite the above
equation as

A−1 = γ1 An−1 + γ2 An−2 + ⋅ ⋅ ⋅ + γn−1 A + γn I (5.14)

Thus, A−1 can be expressed as a polynomial of degree (n − 1) in A. Continuing this


procedure, we see that A−k (k ≥ 1) can be expressed as a polynomial of degree (n − 1)
in A.
i
󳨐⇒ If Q(λ) = ∑∞ i=−∞ ci λ , then Q(A) can be expressed as a polynomial of degree
(n − 1) in A.
This is a very profound result as it implies that any function of A can be expressed
as a polynomial of degree (n − 1) in A. It is used in the following section for defining
and computing functions of a square matrix.

5.2 Functions of matrices


Suppose that A is a square matrix with eigenvalues {λ1 , λ2 , . . . , λn }. The set of eigen-
values of A is called the spectrum of A. Let f (λ) be any function for which f (λj ), j =
1, 2, . . . , n is defined. Then we say that the function f is defined on the spectrum of A.
It follows from Cayley–Hamilton theorem that f (A) is at most a polynomial of degree
(n − 1) in A, i. e.,

f (A) = c1 An−1 + c2 An−2 + ⋅ ⋅ ⋅ + cn−1 A + cn I (5.15)

for some constants {c1 , c2 , . . . , cn }. This result can be used to extend the definition of
familiar scalar functions to functions of matrices as well as to compute them. For ex-
ample, we define the exponential matrix function by the series


Ai A2 A3
eA = ∑ =I+A+ + + ⋅⋅⋅ (5.16)
i=0
i! 2! 3!

or the trigonometric function by


(−1)i A2i+1 A3 A5
sin A = ∑ =A− + + ⋅⋅⋅ (5.17)
i=0
(2i + 1)! 3! 5!
126 | 5 Solution of linear equations containing a square matrix

or the Bessel function of order zero by

∞ (−1)i ( A2 )2i (A/2)2 (A/2)4


J0 (A) = ∑ =I− + − ⋅⋅⋅ (5.18)
i=0
(i!)2 (1!)2 (2!)2

It follows from Cayley–Hamilton theorem that the above matrix functions can be ex-
pressed as polynomials of degree (n − 1) in A. These function may also be computed if
we can evaluate the coefficients in equation (5.15).

Procedure for the calculation of f (A)


Let

B = f (A) = c1 An−1 + c2 An−2 + ⋅ ⋅ ⋅ + cn−1 A + cn I (5.19)

We can calculate B if we can determine the n-unknown constants in equation (5.19).


We now develop a procedure for calculating these constants. The following lemmas
are useful in this procedure.

Lemma 5.1. Let λj be an eigenvalue of A with eigenvector xj . Then λjk is an eigenvalue of


Ak with the same eigenvector. Here, k is any positive or negative integer (including zero).

Proof. For k = 0, A0 xj = Ixj = xj = λj0 xj .


For k > 1, we have

A2 xj = A(Axj ) = A(λj xj ) = λj (Axj ) = λj2 xj


A3 xj = A(A2 xj ) = λj3 xj

and in general,

Ak xj = λjk xj for any k ≥ 0.

Now, consider the equation

Axj = λj xj

Assuming that A is invertible, multiply both sides by A−1 .


󳨐⇒

A−1 Axj = λj A−1 xj


xj = λj A−1 xj
(λj )−1 xj = A−1 xj

Similarly,
5.2 Functions of matrices | 127

(λj )−k xj = A−k xj , k≥1

Thus, Ak xj = λjk xj for all integer values of k.

Lemma 5.2. If f (x) is any function of the form,


f (x) = ∑ ck x k
k=−∞

then

f (A)xj = f (λj )xj

This lemma follows directly from the previous one.

Now, let

B = f (A) 󳨐⇒ Bxj = f (A)xj = f (λj )xj (5.20)

But C-H theorem gives equation (5.19). From this equation, we get

f (A)xj = (c1 An−1 + c2 An−2 + ⋅ ⋅ ⋅ + cn I)xj


= (c1 λjn−1 + c2 λjn−2 + ⋅ ⋅ ⋅ + cn )xj (5.21)

Comparing (5.20) and (5.21) gives

f (λj )xj = (c1 λjn−1 + c2 λjn−2 + ⋅ ⋅ ⋅ + cn )xj

This can be true only if

{f (λj ) − [c1 λjn−1 + c2 λjn−2 + ⋅ ⋅ ⋅ + cn ]}xj = 0

Since xj ≠ 0 󳨐⇒

f (λj ) = (c1 λjn−1 + c2 λjn−2 + ⋅ ⋅ ⋅ + cn ); j = 1, . . . , n (5.22)

Thus, the n constants (c1 , . . . , cn ) can be determined by solving the linear equations
defined by equation (5.22). If an eigenvalue λi is repeated r times (r ≥ 1), then we
consider the relation

f (λ) = (c1 λn−1 + c2 λn−2 + ⋅ ⋅ ⋅ + cn ) (5.23)

and differentiate it k times and set λ = λi for k = 0, 1, . . . , r − 1. This gives r linear


equations for the constants (c1 , . . . , cn ).
128 | 5 Solution of linear equations containing a square matrix

Example 5.2. Develop a formula for f (A) when A is a 2 × 2 matrix with distinct eigen-
values.
From the Cayley–Hamilton theorem, we can write

f (A) = c1 A + c2 I,

where c1 and c2 are determined by the equations

f (λ1 ) = c1 λ1 + c2
f (λ2 ) = c1 λ2 + c2

Solving these two linear equations gives

f (λ1 ) − f (λ2 ) λ2 f (λ1 ) − λ1 f (λ2 )


c1 = , c2 =
(λ1 − λ2 ) (λ2 − λ1 )

Thus, for any 2 × 2 matrix with distinct eigenvalues we obtain the formula

f (λ1 ) − f (λ2 ) λ f (λ ) − λ1 f (λ2 )


f (A) = A+ 2 1 I (5.24)
(λ1 − λ2 ) (λ2 − λ1 )

Example 5.3. Develop a formula for f (A) when A is a 2 × 2 matrix with repeated eigen-
values.
Now, the constants c1 and c2 are given by

f (λ1 ) = c1 λ1 + c2
f ′ (λ1 ) = c1 󳨐⇒ c2 = f (λ1 ) − λ1 f ′ (λ1 )

Therefore, for a 2 × 2 matrix with repeated eigenvalues we have the formula

f (A) = [f ′ (λ1 )]A + [f (λ1 ) − λ1 f ′ (λ1 )]I

Example 5.4. Develop a formula for f (At) when A is a 2 × 2 matrix with distinct eigen-
values.
First, we note that if λ1 and λ2 are eigenvalues of A, then λ1 t and λ2 t are eigenvalues
of At. This follows from the fact that

|At − λtI| = 󵄨󵄨󵄨t(A − λI)󵄨󵄨󵄨 = t n 󵄨󵄨󵄨(A − λI)󵄨󵄨󵄨 = 0


󵄨 󵄨 󵄨 󵄨

where the last equality assumes that λi is an eigenvalue of A, and hence satisfies the
characteristic equation. Now replace λ1 by λ1 t and λ2 by λ2 t and A by At in the formula
of Example (5.2). This gives

f (λ1 t) − f (λ2 t) λ f (λ t) − λ1 f (λ2 t)


f (At) = A+ 2 1 I
(λ1 − λ2 ) (λ2 − λ1 )
5.3 Formal solutions of linear differential equations containing a square matrix | 129

For λ1 = λ2 , this formula may be simplified to

f (At) = f ′ (λt)At + [f (λt) − λtf ′ (λt)]I

As a specific example, when λ1 ≠ λ2 , we have

e λ1 t − e λ2 t λ eλ2 t − λ2 eλ1 t
eAt = A+ 1 I
(λ1 − λ2 ) (λ1 − λ2 )
1 λ2 1 λ1
= e λ1 t [ A− I] + eλ2 t [− A+ I]
(λ1 − λ2 ) (λ1 − λ2 ) (λ1 − λ2 ) (λ1 − λ2 )

while for λ1 = λ2 = λ, we get

eAt = eλt At + [eλt − λteλt ]I


= eλt [At + I − λtI]
= eλt I + teλt [A − λI].

5.3 Formal solutions of linear differential equations containing a


square matrix
It follows from Cayley–Hamilton theorem that f (A) commutes with A, i. e.,

Af (A) = f (A)A (5.25)

This property may be used to write the solution of many vector differential equations
containing a square matrix A with constant coefficients in terms of functions of A using
the solution for the scalar case. This is illustrated in this section with several examples.

Example 5.5. Consider the initial value problem

du
= Au, u = u0 @ t = 0 (5.26)
dt
We claim that

u = eAt u0 (5.27)

is a solution to the above initial value problem. To prove this, first we verify that (5.27)
satisfies the initial condition:

u(t = 0) = eA.0 u0 = e0 u0 = Iu0 = u0

To show that (5.27) satisfies the differential equation, we differentiate it w. r. t. time to


obtain
130 | 5 Solution of linear equations containing a square matrix

du
= eAt Au0 = AeAt u0 (since A commutes with eAt )
dt
= Au

Thus, the expression given by equation (5.27) is the solution to the initial value prob-
lem defined by (5.26).

Example 5.6. We consider the second-order vector initial value problem

d2 u
= −Au, (5.28)
dt 2
du
u(0) = u0 , (0) = v0 (5.29)
dt
Using the Cayley–Hamilton theorem, the general solution of equation (5.28) may be
written as

u = [cos √At]c1 + [sin √At]c2 (5.30)

The constant vectors c1 and c2 may be determined from the initial conditions:

u(0) = u0 󳨐⇒ c1 = u0
du
= −√A sin(√At)c1 + √A cos(√At)c2
dt
du
(0) = v0 󳨐⇒ √Ac2 = v0 󳨐⇒ c2 = (√A)−1 v0
dt
Thus,

u = [cos √At]u0 + [sin √At](√A)−1 v0

is a formal solution. Note that cos √At and [sin √At](√A)−1 contain only integral pow-
ers of A, and hence are polynomials of degree (n − 1) in A.

Example 5.7. We consider the vector two-point boundary value problem

d2 c
= Φ2 c, 0 < ξ < 1
dξ 2
c = c0 @ ξ = 0
dc
= 0@ξ = 1

representing diffusion and reaction in a flat plate geometry (see Section 4.5.6). The
solution may be written as

󳨐⇒ c = [cosh Φξ ]α1 + [cosh Φ(1 − ξ )]α2


dc
= [Φ sinh Φξ ]α1 + [sinh Φ(1 − ξ )](−Φ)α2

5.4 Sylvester’s theorem | 131

dc
= 0 @ ξ = 1 󳨐⇒ α1 = 0

c = c0 @ ξ = 0 󳨐⇒ c0 = [cosh Φ]α2
󳨐⇒ c(ξ ) = [cosh Φ(1 − ξ )](cosh Φ)−1 c0

is the solution. The quantity of practical interest is the average (or observed) reaction
rate vector in the pore defined by
1

robs = ∫ K.c(ξ ) dξ
0

= KΦ−1 tanh(Φ)c0
= K∗ c0

where Φ2 = L2 D−1 K and K∗ is the diffusion disguised rate constant matrix given by

K∗ = KΦ−1 tanh(Φ) = KH (5.31)

[Remark: H = Φ−1 tanh(Φ) is the so-called effectiveness factor matrix]. Writing D =


dm M (M = matrix of relative species diffusivities), K = kA (A = matrix of relative rate
constants), K∗ = kA∗ (A∗ = matrix of diffusion-disguised relative rate constants) and
2
ϕ2 = Ld k , Ψ2 = M−1 A, we have
m

A∗ = A(ϕΨ)−1 tanh(ϕΨ) (5.32)

The two limiting cases of equation (5.32) can be seen more easily now. For the case of
negligible pore diffusional limitations (ϕ → 0), we have A∗ = A while for the case of
strong pore diffusional limitations (ϕ → ∞), we have A∗ = ϕ1 √AM.

5.4 Sylvester’s theorem


Sylvester’s theorem and its generalization, known as the spectral theorem are impor-
tant in understanding and computing the solutions of linear equations in which the
square matrix A appears. Consider a scalar polynomial of degree (n − 1) in Lagrangian
form

Qn−1 (x) = c1 (x − x2 )(x − x3 ) . . . (x − xn ) + c2 (x − x1 )(x − x3 ) . . . (x − xn ) + ⋅ ⋅ ⋅


+ cn (x − x1 )(x − x2 ) . . . (x − xn−1 ) (5.33)

If we require that the polynomial has to pass through the point (xj , Qn−1 (xj )), we get

Qn−1 (xj )
cj = (5.34)
∏ni=1,j (xj − xi )
132 | 5 Solution of linear equations containing a square matrix

Here, the notation ∏ni=1,j means that the product excludes the term corresponding to
i = j. Thus,

n ∏ni=1,j (x − xi )
Qn−1 (x) = ∑ Qn−1 (xj ) (5.35)
j=1 ∏ni=1,j (xj − xi )

We now convert equation (5.35) into a matrix identity by the following assumptions:
(i) Assume that the square matrix A has n distinct eigenvalues {λ1 , λ2 , . . . , λn } and re-
place xj by λj and (ii) Replace x by A on both sides of equation (5.35). This gives

n ∏ni=1,j (A − λi I)
Qn−1 (A) = ∑ Qn−1 (λj ) (5.36)
j=1 ∏ni=1,j (λj − λi )

Note that the order of the products on the RHS. of this equation is immaterial since
(A−λi I) and (A−λj I) commute. As it stands, equation (5.36) is valid for any polynomial
of degree (n−1) in A. However, we note that any arbitrary polynomial in A and A−1 may
be expressed as a polynomial of degree (n − 1) in A (Cayley–Hamilton theorem). Now,
if f (λ) is any function of the form

f (λ) = ∑ ck λk (5.37)
k=−∞

then f (λj ) can be expressed as a polynomial of degree (n − 1) in λj using the relation


Pn (λj ) = 0. Thus, we can replace Qn−1 in equation (5.36) by any arbitrary function of
the form given by equation (5.37). Thus, we have

n ∏ni=1,j (A − λi I)
f (A) = ∑ f (λj ) (5.38)
j=1 ∏ni=1,j (λj − λi )

This is Sylvester’s formula. We can simplify it further by using the identity (see proof
below).
n
∏ (A − λi I) = (−1)n−1 adj(A − λj I). (5.39)
i=1,j

We note that
n n
∏ (λj − λi ) = (−1)n−1 ∏ (λi − λj ) (5.40)
i=1,j i=1,j

Substituting equations (5.39) and (5.40) into (5.38) gives


5.4 Sylvester’s theorem | 133

n (−1)n−1 adj(A − λj I)
f (A) = ∑ f (λj )
j=1 (−1)n−1 ∏ni=1,j (λi − λj )
n adj(A − λj I)
= ∑ f (λj ) (5.41)
j=1 ∏ni=1,j (λi − λj )

Defining

adj(A − λj I)
Ej = (5.42)
∏ni=1,j (λi − λj )

We get the final form of Sylvester’s formula


n
f (A) = ∑ f (λj )Ej (5.43)
j=1

It can be shown that the matrices Ej (j = 1, 2, 3, . . . , n) have rank one and satisfy the
relations
n
Ei Ej = 0, i ≠ j and ∑ Ei = I (5.44)
i=1

The matrix Ei is called a projection. We shall prove the above relations as well as estab-
lish other properties of projections in the next section when we deal with the spectral
theorem.

Proof of the identity given by equation (5.39).


n
∏ (A − λi I) = (−1)n−1 adj(A − λj I)
i=1,j

Write the characteristic polynomial as Pn (λ) = (−1)n [λn + α1 λn−1 + ⋅ ⋅ ⋅ + αn−1 λ + αn ]

󳨐⇒ Pn (y) − Pn (x) = (−1)n [yn − x n + α1 (yn−1 − x n−1 ) + ⋅ ⋅ ⋅ + αn−1 (y − x)]


= (−1)n (y − x)[(yn−1 + yn−2 x + ⋅ ⋅ ⋅ + yx n−2 + x n−1 )
+ α1 (yn−2 + yn−3 x + ⋅ ⋅ ⋅ + yx n−3 + x n−2 ) + ⋅ ⋅ ⋅ + αn−1 ]
≡ (−1)n (y − x)Φ(y, x) (5.45)

where Φ(y, x) = Φ(x, y) is of degree (n − 1). Convert this scalar polynomial identity to
a matrix identity by letting x = Iλ, y = A 󳨐⇒

Pn (A) − Pn (Iλ) = 0 − Pn (Iλ) = −IPn (λ)


= (−1)n (A − λI)Φ(A, Iλ) (5.46)

But
134 | 5 Solution of linear equations containing a square matrix

(A − λI) adj(A − λI) = Pn (λ)I (5.47)

Comparing (5.46) and (5.47) 󳨐⇒

(−1)n−1 (A − λI)Φ(A, λI) = (A − λI) adj(A − λI)

Since this is an identity for all values of λ, it follows that

adj(A − λI) = (−1)n−1 Φ(A, λI) (5.48)

Now, let x = λj and y = λ in (5.45)󳨐⇒

Pn (λ) − Pn (λj ) = Pn (λ) = (−1)n (λ − λj )Φ(λ, λj )


= (−1)n (λ − λ1 )(λ − λ2 ) . . . (λ − λn )
n
󳨐⇒ Φ(λ, λj ) = ∏ (λ − λi ) (5.49)
i=1,j

Convert this to a matrix identity by letting λ = A, λi = Iλ:


n
󳨐⇒ Φ(A, λj I) = ∏ (A − λi I) (5.50)
i=1,j

From (5.48) and (5.50), it follows that


n
adj(A − λI) = (−1)n−1 ∏ (A − λi I)
i=1,j

Example 5.8 (Illustration of Sylvester’s formula). Consider the 2 × 2 matrix

1 1
A=( )
4 1

whose eigenvalues are

λ1 = −1, λ2 = 3.

The eigenvectors and normalized eigenrows are given by

1 1
x1 = ( ); x2 = ( )
−2 2
1
y∗1 = ( 2
− 41 ) ; y∗2 = ( 1
2
1
4
)
2 1 2 −1
(A − λ1 I) = ( ) ⇒ adj(A − λ1 I) = ( )
4 2 −4 2
−2 1 −2 −1
(A − λ2 I) = ( ) 󳨐⇒ adj(A − λ2 I) = ( )
4 2 −4 −2
5.5 Spectral theorem | 135

1 −1
adj(A − λ1 I)
E1 = =( 2 4
) = x1 y∗1
(λ2 − λ1 ) −1 1
2
1 1
adj(A − λ2 I)
E2 = =( 2 4
) = x2 y∗2
(λ1 − λ2 ) 1 1
2
1
1
2
−1
4
1
2
−1
4 4
+ 41 −1
8
− 81 1
2
−1
4
E21 = ( )( )=( )=( ) = E1
−1 1
2
−1 2
1
− 21 − 21 4
1
+ 41 −1 1
2
E22 = E2
E1 + E2 = I
A = λ1 E1 + λ2 E2
f (A) = f (λ1 )E1 + f (λ2 )E2

For example,
1 −1 1 1
2 4 2 4
A100 = (−1)100 ( 1
) + (3)100 ( 1
)
−1 2
1 2

5.5 Spectral theorem


Before we state the spectral theorem, we review the concept of a projection.

Definition. A square matrix P is called a projection if P2 = P.


Two projections P1 and P2 are called orthogonal if P1 P2 = P2 P1 = 0.

Let A be a square matrix with simple eigenvalue λj . Let xj and y∗j be the corre-
sponding eigenvector and eigenrow. Consider the dyadic product of xj and y∗j defined
by

xj y∗j
Ej = (5.51)
y∗j xj

[Remark: Note that y∗j xj = ⟨xj , yj ⟩ is a scalar and is a normalization constant while
xj y∗j is an n × n matrix of rank one.] We now show that Ej is a projection using the fact
that matrix (or vector) multiplication is associative:

xj y∗j xj y∗j
E2j =
y∗j xj y∗j xj
xj (y∗j xj )y∗j
=
(y∗j xj )(y∗j xj )
xj y∗j
= = Ej (5.52)
y∗j xj
136 | 5 Solution of linear equations containing a square matrix

Let u be any vector in ℝn /ℂn and consider the expansion


n
u = ∑ αj xj (5.53)
j=1

where {x1 , x2 , . . . , xn } are the eigenvectors of A. From the biorthogonal expansion, we


have
y∗j u
αj = (5.54)
y∗j xj

Now,
n
xi y∗i
E i u = ∑ αj ( )x
j=1
y∗i xi j

= αi xi (5.55)

It follows from equations (5.54) and (5.55) that Ei u is the component of u in the space
spanned by the eigenvector xi , i. e., Ei is the projection operator onto the eigenspace
spanned by xi . If i ≠ j, we have

xi y∗i xj yj

Ei Ej = =0
y∗i xi y∗j xj

Similarly,

Ej Ei = 0.

We can now state the spectral theorem.

Theorem (spectral). Let A be a square matrix with distinct eigenvalues λ1 , λ2 , . . . , λn .


Then there exist projections Ej such that:
1. ∑ni=1 Ei = I (resolution of the identity)
2. (a) E2i = Ei
(b) Ei Ej = Ej Ei = 0 (i ≠ j)
3. A = ∑ni=1 λi Ei (spectral resolution of A)
4. f (A) = ∑ni=1 f (λi )Ei , where f is any function defined on the spectrum of A.

Proof. Let x1 , x2 , x3 , . . . , xn be the eigenvectors and y∗1 , y∗2 , . . . , y∗n be the eigenrows of A.
Defining

xi y∗i
Ei =
y∗i xi

We have already proved the relations given by (2-a) and (2-b).


5.5 Spectral theorem | 137

Proof of (3). Consider the matrix

T = (x1 x2 . . . xn )

whose columns are the eigenvectors of A. Then we have

AT = A(x1 x2 . . . xn )
= (Ax1 Ax2 . . . Axn )
= (λ1 x1 λ2 x2 . . . λn xn )
λ1 0 0 . . 0
0 λ2 0 . . 0
= (x1 x2 . . . xn ) ( 0 0 λ3 . . 0 )
. . . . . .
0 0 0 . . λn
= TΛ, (5.56)

where Λ is a diagonal matrix of eigenvalues (called the spectral matrix). Since T is


invertible, equation (5.56) may be written as

A = TΛT−1 (5.57)

We now show that the rows of the matrix T−1 are the normalized eigenrows of A. Let

y∗1 /(y∗1 x1 )
[ y∗ /(y∗ x ) ]
[ 2 2 2 ]
[ ]
S=[ . ]
[ ]
[ . ]
[ yn /(yn xn ) ]
∗ ∗

Then

y∗1 /(y∗1 x1 )
y∗2 /(y∗2 x2 )
ST = ( . ) (x1 x2 . . . xn )
.
y∗n /(y∗n xn )
1 0 . . 0
0 1 . . 0
=( )=I
. . . . .
0 0 . . 1

Since the inverse of a matrix is unique, we have


138 | 5 Solution of linear equations containing a square matrix

S = T−1 (5.58)

Thus,

y∗1 /(y∗1 x1 )
y∗2 /(y∗2 x2 )
A = (λ1 x1 λ2 x2 . . . λn xn ) ( . )
.
y∗n /(y∗n xn )
n
xi y∗i
= ∑ λi
i=1
(y∗i xi )
n
= ∑ λi Ei
i=1

Alternate proof of (3). Define

A1 = A − λ1 E1 (5.59)
A1 x1 = (A − λ1 E1 )x1
=0
A1 xj = λj xj − λ1 E1 xj
= λ j xj ; j = 2, . . . , n

Thus, the matrix A1 has eigenvalues 0, λ2 , λ3 , . . . , λn with eigenvectors x1 , x2 , . . . , xn .


Now, let

A2 = A − λ1 E1 − λ2 E2 (5.60)

A2 x1 = 0, A 2 x2 = 0
A2 xj = λj xj ; j = 3, . . . , n

The matrix A2 has eigenvalues 0, 0, λ3 , . . . , λn and eigenvectors x1 , x2 , . . . , xn . Similarly,


we see that
n
An = A − ∑ λi Ei (5.61)
i=1

has eigenvalues 0, 0, . . . , 0 and eigenvectors x1 , x2 , . . . , xn , i. e.,

An xj = 0; j = 1, 2, 3, . . . , n (5.62)

We now show that equation (5.62) implies that An = 0 (n × n zero matrix), and hence
5.5 Spectral theorem | 139

n
A = ∑ λi Ei
i=1

Now, (5.62) gives

α1T
[ T ]
[ α2 ]
[ ]
[ . ] xj = 0
[ ]
[ ]
[ . ]
T
[ αn ]

where αkT = k-th row of An . Since the eigenvectors are linearly independent, the only
solution to the system of equations

αkT xj = 0, k = 1, 2, 3, . . . , n

is the trivial one, i. e.,

αk = 0 for each k = 1, 2, 3, . . . , n

󳨐⇒

An = 0

Proof of (1). We have shown


n
u = ∑ Ei u
i=1

for any vector u. To show that (1) implies


n
∑ Ei = I,
i=1

we let
n
S = ∑ Ei
i=1

and rewrite (1) as

(S − I)u = 0 (5.63)

Equation (5.63) implies that each row of the matrix S − I is orthogonal to n linearly
independent vectors (u1 , u2 , . . . , un ).
140 | 5 Solution of linear equations containing a square matrix

󳨐⇒ Each row of (S − I) is a zero row.


∴ The result.

Proof of (4). To prove this, we use (2)

n
A = ∑ λj Ej
j=1
n n
󳨐⇒ A2 = (∑ λj Ej )(∑ λi Ei )
j=1 i=1
n
= ∑ λj2 Ej
j=1

Similarly,

n
Ak = ∑ λjk Ej , for k = 0, 1, 2, . . .
j=1

Thus, if f (λ) is any function that may be expressed in the form


f (λ) = ∑ ci λi ,
i=0

we have
n
f (A) = ∑ f (λj )Ej
j=1

If A is invertible, it follows from Cayley–Hamilton theorem that A−1 is a polynomial of


degree (n − 1) in A. Thus, for any arbitrary function f (λ), the function f (A) is at most a
polynomial of degree (n − 1) in A, i. e.,

f (A) = γ0 I + γ1 A + γ2 A2 + ⋅ ⋅ ⋅ + γn−1 An−1


n n n
= γ0 (∑ Ej ) + γ1 (∑ λj Ej ) + ⋅ ⋅ ⋅ + γn−1 (∑ λjn−1 Ej )
j=1 j=1 j=1
n
= (∑ f (λj )Ej )
j=1

Other and more general forms of the spectral theorem may be found in the books on
linear algebra (Halmos [20]; Naylor and Sell [24]; Lipschutz and Lipson [22]).
5.5 Spectral theorem | 141

Example 5.9 (Spectral decomposition). Consider the 2 × 2 matrix

−3 2
A=( )
4 −5

whose eigenvalues are λ1 = −1, λ2 = −7. We have eigenvectors and normalized eigen-
rows:

1 1
x1 = ( ); x2 = ( )
1 −2
2 1 1
y∗1 = ( 3 3
); y∗2 = ( 3
− 31 )
2 1
x1 y∗1 3 3
E1 = =( )
y∗1 x1 2 1
3 3
1
x y∗ 3
− 31
E2 = ∗2 2 = ( )
y2 x2 − 32 2
3
2 1 2 1 2 1
3 3 3 3 3 3
E21 = ( 2 1
)( 2 1
)=( 2 1
) = E1
3 3 3 3 3 3
1
3
− 31 1
3
− 31 1
3
− 31
E22 =( )( )=( ) = E2
− 32 2
3
− 32 2
3
− 32 2
3
2 1 1
3 3 3
− 31 0 0
E1 E2 = ( )( )=( )
2 1
− 32 2 0 0
3 3 3
2 1 1
3 3 3
− 31 1 0
E1 + E2 = ( )+( )=( )
2 1
− 32 2 0 1
3 3 3
2 1 1
3 3 3
− 31
λ1 E1 + λ2 E2 = −1 ( 2 1
) − 7( )
3 3
− 32 2
3
− 32 − 7
3
− 31 + 7
3
=( )
− 32 + 14
3
− 31 − 14
3
−3 2
=( )=A
4 −5
f (A) = f (λ1 )E1 + f (λ2 )E2 .

To show that the same expression is obtained when we use the Cayley–Hamilton the-
orem, we note that

f (A) = α0 I + α1 A,

where α0 and α1 are give by the two linear equations


142 | 5 Solution of linear equations containing a square matrix

f (λ1 ) = α0 − α1 = f (−1)
f (λ2 ) = α0 − 7α1 = f (−7).

f (−1)−f (−7) 7f (−1)−f (−7)


Solving, we get α1 = 6
; α0 = 6
. Thus, we have

7f (−1) − f (−7) 1 0 f (−1) − f (−7) −3 2


f (A) = ( )+ ( )
6 0 1 6 4 −5
7f (−1)−f (−7) 3f (−7)−3f (−1) 2f (−1)−2f (−7)
6
0 6 6
=( 7f (−1)−f (−7)
)+( 4f (−1)−4f (−7) 5f (−7)−5f (−1)
)
0 6 6 6
4f (−1)+2f (−7) 2f (−1)−2f (−7)
6 6
=( 4f (−1)−4f (−7) 2f (−1)+4f (−7)
)
6 6
2 1 1 −1
= f (−1) ( 32 31 ) + f (−7) ( 3 3
)
3 3
− 32 2
3
= f (λ1 )E1 + f (λ2 )E2 .

5.6 Projections operators and vector projections


We have already seen that the projection operators Ej appear in the calculation of func-
tion of matrices and spectral theorem. These also have geometrical interpretation that
is useful in applications. We illustrate the geometrical features briefly in ℝ2 .

5.6.1 Standard basis and projection in ℝ2

Consider the two elementary orthonormal basis vectors e1 = ( 01 ) and e2 = ( 01 ) as


shown in Figure 5.1, and projection matrices

1 0
E1 = e1 eT1 = ( )
0 0

and

0 0
E2 = e2 eT2 = ( ).
0 1

z
If z = ( z21 ) be any vector, then we can write

z = z1 e1 + z2 e2
= E1 z + E2 z,
5.6 Projections operators and vector projections | 143

Figure 5.1: Projection in standard basis in ℝ2 .

where

1 0 z z
E1 z = ( ) ( 1 ) = ( 1 ) = z1 e1
0 0 z2 0

is the projection of z on basis vector e1 and

0 0 z 0
E2 z = ( )( 1 ) = ( ) = z2 e2
0 1 z2 z2

is the projection of z on basis vector e2 (see Figure 5.1).


It can easily be verified that the matrices E1 and E2 are orthogonal projection ma-
trices, i. e.,

E2j = Ej , j = 1, 2
E1 E2 = E2 E1 = 0,

and satisfy the resolution of identity, i. e.,

2
1 0
∑ Ej = I = ( ).
j=1
0 1

5.6.2 Nonorthogonal projections

Consider a two-dimensional vector z = ( −1 2 ) split it into two nonorthogonal indepen-

dent vectors x1 = ( 11 ) and x2 = ( −2 1 ) as shown in Figure 5.2, i. e., z = x + x . Note


1 2
that x2 x1 = −1 ≠ 0 (i. e., not orthogonal). These two vectors also form the basis for ℝ2
T

with yT1 = 31 ( 21 ) and yT2 = 31 ( −1


1 ) forming normalized biorthonormal basis. Note that
144 | 5 Solution of linear equations containing a square matrix

Figure 5.2: Nonorthogonal projections in two-dimensional space.

yT2 x1 = yT1 x2 = 0 and yT1 x1 = yT2 x2 = 1. Here, the nonorthogonal projection matrices
are given by

1 2 1
E1 = x1 yT1 = ( )
3 2 1
1 1 −1
E2 = x2 yT2 = ( )
3 −2 2

It is easily verified that

1 2 1 2 1
E1 z = ( )( ) = ( ) = x1 ,
3 2 1 −1 1

i. e., projection of z onto x1 and

1 1 −1 2 1
E2 z = ( )( )=( ) = x2 ,
3 −2 2 −1 −2

i. e., projection of z onto x2 (see Figure 5.2).


Here, the projection matrices E1 and E2 satisfy:

2/3 1/3 2/3 1/3 2/3 1/3


E21 = ( )( )=( ) = E1
2/3 1/3 2/3 1/3 2/3 1/3
1/3 −1/3 1/3 −1/3 1/3 −1/3
E22 = ( )( )=( ) = E2
−2/3 2/3 −2/3 2/3 −2/3 2/3
2/3 1/3 1/3 −1/3 1 0
E1 + E2 = ( )+( )=( )=I
2/3 1/3 −2/3 2/3 0 1
5.6 Projections operators and vector projections | 145

2/3 1/3 1/3 −1/3 0 0


E1 E2 = ( )( )=( )=0
2/3 1/3 −2/3 2/3 0 0
1/3 −1/3 2/3 1/3 0 0
E2 E1 = ( )( )=( )=0
−2/3 2/3 2/3 1/3 0 0

Hence, for the case of real eigenvalues, the eigenvectors and normalized left eigenvec-
tors define the projections onto the one-dimensional eigenspaces.

5.6.3 Geometric interpretation with real and negative eigenvalues

Consider the initial value problem:

du
= Au, t > 0; u = u0 @ t = 0
dt

with A = ( −3 2
4 −5 ). Eigenvalues of A are λ1 = −1 and λ2 = −7 with corresponding eigen-
vectors

1 1
x1 = ( ) and x2 = ( ),
1 −2

respectively and normalized eigenrows as

yT1 = ( 2/3 1/3 ) and yT2 = ( 1/3 − 1/3 ) ,

respectively. As shown in the previous section,

2/3 1/3
E1 = x1 yT1 = ( )
2/3 1/3

and

1/3 −1/3
E2 = x2 yT2 = ( )
−2/3 2/3

are the projection matrices. Thus,

2/3 1/3 1/3 −1/3


λ1 E1 + λ2 E2 = (−1) ( ) + (−7) ( )
2/3 1/3 −2/3 2/3
−3 2
=( ) = A.
4 −5

The solution of above differential equation can be expressed as


146 | 5 Solution of linear equations containing a square matrix

u(t) = eAt u0 = α1 x1 eλ1 t + α2 x2 eλ2 t

where

u0 = α1 x1 + α2 x2 󳨐⇒ α1 = yT1 u0 and α2 = yT2 u0 .

u
Assuming u0 = ( u20
10
),

2u10 + u20
α1 = yT1 u0 =
3

and
u10 − u20
α2 = yT2 u0 = .
3

Thus, the solution can be expressed as

2u10 + u20 u − u20


u(t) = ( )x1 e−t + ( 10 )x2 e−7t .
3 3

Physical interpretation of eigenvectors


Assuming α2 = 0, i. e.,

1
u10 = u20 = β or u0 = β ( ) = βx1
1

then

u(t) = βx1 e−t .

In other words, when initial condition is in the eigenstate x1 , it always remains in the
same state (see Figure 5.3). Similarly, when initial state is in x2 , i. e.,

u20
α1 = 0 or u10 = − = γ,
2

then

1
u0 = γ ( ) = γx2
−2

and

u(t) = γx2 e−7t ,

i. e., the solution vector always remains in eigenstate x2 .


5.6 Projections operators and vector projections | 147

Figure 5.3: Eigenstates of the initial value problems.

Note that if u0 = ( 0β ), i. e., on u1 -axis, the solution can be expressed as

2 1
u(t) = βx e−t + βx2 e−7t .
3 1 3

Since, e−7t 󳨀→ 0 much faster than e−t (i. e., the component in eigenstate x2 vanishes
faster), the solution for t ≫ 1 simplifies to

2
u(t) = βx e−t , t ≫ 1.
3 1
In other words, the solution approaches the steady state along the x1 direction. This
is also true for any initial condition except when the initial condition is along x2 (see
Figures 5.3 and 5.4).
The trajectory of solution in (u1 , u2 ) plane can be determined for any general initial
condition corresponding to the point (u10 , u20 ) in phase plane. Note that solution in
two dimensional phase plane can be represented as

2u10 + u20 u − u20


u(t) = ( )x1 e−t + ( 10 )x2 e−7t
3 3
or

2u10 + u20 −t u − u20 −7t


u1 (t) = ( )e + ( 10 )e
3 3
2u + u20 −t u − u20 −7t
u2 (t) = ( 10 )e − 2( 10 )e
3 3

parametrically in t, which can be eliminated by solving for e−t and e−7t that leads to
the equation for the trajectory in the form of

g(u1 , u2 , u10 , u20 ) = 0


148 | 5 Solution of linear equations containing a square matrix

Figure 5.4: Trajectory of solution with stable node in the phase-plane for various initial points. The
component of solution along x2 direction (with large eigenvalue) vanishes first, and solution ap-
proaches steady state in x1 direction (smaller eigenvalue).

in the phase-plane. Such trajectories for various initial points are demonstrated in
Figure 5.4. The trivial steady-state here is stable because of negative real eigenvalues
(and is referred to as node in the dynamical systems literature).

5.6.4 Geometrical interpretation with complex eigenvalues with negative real part

Consider a matrix A = ( −1 1
−1 −1 ), whose eigenvalues are λ1 = −1 + i and λ2 = −1 − i with
corresponding eigenvectors

1 1
x1 = ( ) and x2 = ( ),
i −i

and normalized eigenrows

y∗1 = (1/2, −i/2) and y∗2 = (1/2, i/2),

respectively. Thus, the solution of initial value problem:

du
= Au, t>0 and u = u0 = (u10 , u20 )T @ t = 0
dt

can be expressed as

u = c1 x1 eλ1 t + c2 x2 eλ2 t
= y∗1 u0 eλ1 t x1 + y∗2 u0 eλ2 t x2
5.6 Projections operators and vector projections | 149

u10 − iu20 1 u + iu20 1


=( ) ( ) e(−1+i)t + ( 10 )( ) e(−1−i)t
2 i 2 −i
u10 cos t + u20 sin t
= e−t ( )
u20 cos t − u10 sin t
cos t sin t u
= e−t ( ) ( 10 )
− sin t cos t u20

The solution trajectory for this example is shown in Figure 5.5 for various initial con-
ditions. As can be seen from this figure, the solution is oscillatory in time as expected
because of complex eigenvalues but has stable focus since real part of the complex
eigenvalue is negative.

Figure 5.5: Solution trajectory with stable focus in phase-plane with complex eigenvalues leading to
oscillating response.

5.6.5 Geometrical interpretation with one zero eigenvalue

Consider a matrix A = ( −1 1
1 −1 ), whose eigenvalues are λ1 = 0 and λ2 = −2 with corre-
sponding eigenvectors

1 1
x1 = ( ) and x2 = ( ),
1 −1

and normalized eigenrows yT1 = (1/2, 1/2) and yT2 = (1/2, −1/2), respectively. The initial
value problem:

du
= Au, t>0 and u = u0 = (u10 , u20 )T @ t = 0
dt
150 | 5 Solution of linear equations containing a square matrix

represents the mass exchange between two identical tanks in absence of convection
and reaction. The solution for this example can be expressed as

u = c1 x1 eλ1 t + c2 x2 eλ2 t
= y∗1 u0 eλ1 t x1 + y∗2 u0 eλ2 t x2
u10 + u20 1 u − u20 1
=( ) ( ) + ( 10 )( ) e−2t
2 1 2 −1

Note that at steady-state (t 󳨀→ ∞), the solution reduces to

u10 + u20 1 u10 + u20


u(t 󳨀→ ∞) = us = ( ) ( ) = αx1 , α=
2 1 2

u +u
The term yT1 u = u1 +u
2
2
= 10 2 20 is constant and denotes the mass conservation of
species. The solution trajectory for this example is shown in Figure 5.6 for various
initial conditions. The bullet points in this figure represents the equilibrium state (x1 )
corresponding to zero eigenvalue. Thus, for any initial condition, the component of
the solution along x2 direction decreases with time (due to negative eigenvalue λ2 )
and approaches the equilibrium composition at steady-state, which is practically for
all t ≫ 2.

Figure 5.6: Solution trajectory in phase-plane for various initial conditions: the bullet points repre-
senting the equilibrium state (corresponding to zero eigenvalue).

5.6.6 Physical and geometrical interpretation of transient behavior of interacting


tank systems for various initial conditions

Consider the flow system as shown in Figure 5.7 containing three interacting tanks
with volumes V1 = 21 V2 = V3 = VR , and a fixed exchange rates qe . Assuming no net
5.6 Projections operators and vector projections | 151

1
Figure 5.7: Flow system containing three interacting tanks with volumes V1 = V
2 2
= V3 = VR , with a
fixed mass-exchange rates in absence of net convection and reaction.

convection and reaction, and tanks being well mixed, the model equations describing
the species balances in these tanks can be expressed as

dc1
VR = −2qe c1 + qe c2 + qe c3
dt ′
dc
2VR ′1 = qe c1 − 2qe c2 + qe c3
dt
dc1
VR ′ = qe c1 + qe c2 − 2qe c3 ,
dt

with initial conditions

c1 = c10 , c2 = c20 and c3 = c30 @ t ′ = 0.

Nondimensionalizing real time t ′ as

qe ′
t= t
VR

the governing model can be expressed as follows:

dc
= Ac, t>0 and c = c0 = (c10 , c20 , c30 )T @ t = 0
dt
−2 1 1
A = ( 1/2 −1 1/2 ) .
1 1 −2

Note that the matrix A is not symmetric, but has zero row sums, implying the existence
of zero eigenvalue, which also denotes the mass conservation of species.
152 | 5 Solution of linear equations containing a square matrix

Eigensystem
The eigenvalues of matrix A are λ1 = 0, λ2 = −2 and λ3 = −3 corresponding to the
eigenvectors

1 1 1
x1 = ( 1 ) , x2 = ( −1 ) and x3 = ( 0 )
1 1 −1

and normalized eigenrows

1 1 1
yT1 = (1, 2, 1), yT2 = (1, −2, 1) and yT3 = (1, 0, −1).
4 4 2

Interpretation of eigenvectors
1
The eigenvector x1 = ( 1 ) corresponding to the eigenvalue λ1 = 0 indicates the steady
1
state or equilibrium composition cs :

1
dc
@ t 󳨀→ ∞, = 0 󳨐⇒ Acs = 0 󳨐⇒ cs = α1 x1 = α1 ( 1 ) .
dt
1

In order to obtain α1 , initial condition can be utilized. Since yT1 is the eigenrow corre-
sponding to zero eigenvalue λ1 , we have

dc
yT1 = yT1 Ac = 0T c = 0 󳨐⇒ yT1 c = constant
dt
󳨐⇒ yT1 cs = yT1 c0
c + 2c20 + c30
󳨐⇒ α1 = yT1 c0 = 10
4

Thus, the steady-state concentration in each tank is given by

c10 + 2c20 + c30


c1s = c2s = c3s = α1 =
4
󳨐⇒

VR c1s + 2VR c2s + VR c3s = 4VR α1 = VR c10 + 2VR c20 + VR c30 ,

which represents the conservation of mass.


1
The other eigenvectors are transient modes, e. g., x2 = ( −1 ) is the slow transient
1
1
(asymmetric) mode with eigenvalue λ2 = −2 and x3 = ( 0 ) is the fast transient (skew-
−1
symmetric) mode with eigenvalue λ3 = −3.
5.6 Projections operators and vector projections | 153

Interpretation of eigenrows
To see the meaning of eigenrows, we express the solution as

3
yTi c0 3 3
c(t) = ∑ xi eλi t = ∑(yTi c0 )xi eλi t = ∑ αi xi eλi t
i=1 yTi xi i=1 i=1

= α1 x1 + α2 x2 e −2t
+ α3 x3 e −3t
, αi = yTi c0

We note that

yTi c = α1 = yTi c0 .

Thus, the first eigenrow corresponds to the overall mass-conservation and determines
1
the equilibrium/steady-state composition as shown earlier, i. e., cs = α1 ( 1 ), where
1

c10 + 2c20 + c30


α1 = yT1 c0 = .
4

We also note that α2 = yT2 c0 . Thus, if α2 = 0, then

c(t) = α1 x1 + α3 x3 e−3t ,

which implies that yT2 c = 0. In other words, if the initial concentration vector c0 is
such that α2 = 0, i. e.,

c0 = α1 x1 + α3 x3 ,

then concentration c(t) at all times remains a linear combination of x1 and x3 with
c(t = 0) = c0 and c(t 󳨀→ ∞) = α1 x1 = cs . It can be seen by combining the condition of
α2 = 0 with mass balance constraint as follows:

α2 = yT2 c0 = 0 󳨐⇒ c10 − 2c20 + c30 = 0


c + 2c20 + c30
α1 = yT1 c0 󳨐⇒ 10 = α1 .
4

The above two equations lead to

c20 = α1 and c10 + c30 = 2α1

Thus, the initial condition can be expressed in the form of

α1 + α3 1 1
0
c =( α1 ) = α1 ( 1 ) + α3 ( 0 )
α1 − α3 1 −1
= α1 x1 + α3 x3 = cs + α3 x3
154 | 5 Solution of linear equations containing a square matrix

With such initial condition, the solution is given by

c(t) = α1 x1 + α3 x3 e−3t = cs + α3 x3 e−3t ,

which approaches to steady state along a straight line on triangular diagram as shown
0.5
in Figure 5.8 (see the line DEF). For example, for special initial condition c0 = ( 0.25 )
0
corresponding to the point D, we have

1 1
α1 = yT1 c0 = , α2 = yT2 c0 = 0 and α3 = yT3 c0 = ,
4 4

and the solution is given by

1 + e−3t
1 1 1
c(t) = x1 + x3 e−3t = ( 1 ).
4 4 4
1−e −3t

In this case, the trajectory goes from point D (@ t = 0) to point E (@ t 󳨀→ ∞) as shown


in Figure 5.8.

Figure 5.8: Solution trajectory in triangular phase-plane: E is the equilibrium state, AEB is the slow
transient state (x2 ) and DEF is the fast transient state (x3 ).

Similarly, α3 = 0 along with the constraint y1 c0 = α1 corresponds to all initial compo-


sition of the form

α1 + α2 1 1
0
c = ( α1 − α2 ) = α1 ( 1 ) + α2 ( −1 )
α1 + α2 1 1
= α1 x1 + α2 x2 = cs + α2 x2 ,

which leads to the transient solution as


5.6 Projections operators and vector projections | 155

c(t) = α1 x1 + α2 x2 e−2t = cs + α2 x2 e−2t .

Thus, the solution remains a linear combination of x1 and x2 with

c(t = 0) = c0 and c(t 󳨀→ ∞) = α1 x1 = cs ,

and approaches to steady state along the straight line BEA as shown in triangular
0
phase-trajectory in Figure 5.8. For example, for the special initial condition c0 = ( 0.5 )
0
corresponding to the point B, we have

1 −1
α1 = yT1 c0 = , α2 = yT2 c0 = and α3 = yT3 c0 = 0.
4 4

Thus in this case, the solution simplifies to

1 − e−2t
1 1 1
c(t) = x1 − x2 e = ( 1 + e−2t ) ,
−2t
4 4 4
1 − e−2t

and the trajectory goes from point B (@ t = 0) to point E (@ t 󳨀→ ∞) as shown in


Figure 5.8. Similarly, the trajectories can be determined for other initial conditions as
shown in Figure 5.8.

Problems
1. Given the matrix

0.15 −0.01
A=( )
−0.25 0.15

evaluate the following:


(i) eA , (ii) cos A, (iii) sinh A, (iv) ln A, (v) ∑∞ j
j=0 A , (vi) J0 (A)
2. (a) Show that the solution of the inhomogeneous system

du
= Au + f(t), u(t = 0) = u0
dt

is given by

t
At
u = e u0 + ∫ eA(t−τ) f(τ)dτ
0

(b) Determine a similar formula for the solution of the inhomogeneous system

d2 u du
= −Au + f(t), u(t = 0) = u0 , (t = 0) = v0
dt 2 dt
156 | 5 Solution of linear equations containing a square matrix

3. Given the matrix

3 1 1
A=( 2 4 2 )
1 1 3

(a)
Compute the eigenvalues and eigenvectors.
(b)
What matrix in a similarity transform will reduce A to diagonal form.
(c)
Determine the spectral decomposition of A.
(d)
If f (λ) is any function defined on the spectrum of A, develop a formula for
determining f (A).
4. We had three representations for the solution of

du
= Au, u(t = 0) = u0
dt

(a) n y∗j u0
u=∑ exp(λj t)xj
j=1
y∗j xj

(b)
u = eAt u0

(c) n
u = ∑ exp(λj t)Ej u0
j=1

Show that these representations are the same.


5. Consider a cascade of N ideal stirred tank reactors (CSTRs) in which the following
first-order reactions occur:

k2 k4
󳨀󳨀 B →
A→
← 󳨀󳨀 C

k1 k3

where

k k k
k1 = k, k2 = , k3 = , k4 = .
2 2 4

(a) Formulate the species balances for steady-state operation and show that the
concentration vector in the stream leaving tank N is given by


−N
cN = [I + A] c0 ,
N

where c0 is the feed concentration vector, τ is the space time and A is the
matrix of relative rate constants defined by
5.6 Projections operators and vector projections | 157

1 − 21 0
A = ( −1 1 − 41 )
0 − 21 1
4

(b) If the reactor is an ideal plug flow reactor, show that the exit concentration
vector is given by

cN = exp{−kτA}c0 .

(c) Compute the exit concentration vector for cases (a) and (b) above for the fol-
lowing parameter values:

1
N = 10, kτ = 2, c0 = ( 0.2 )
0

6. Consider the problem of diffusion and reaction in an isothermal catalyst particle.


When several first-order reactions occur, the concentration vector in a slab geom-
etry satisfies the equations

d2 c
D = Kc, 0 < x < L
dx2
dc
D = kc (c − c0 ) @ x = 0
dx
dc
= 0@x = L
dx

where D is a positive definite matrix of diffusivities, K is a nonnegative definite


matrix of rate constants, c0 is the vector of ambient concentrations and kc is a
diagonal matrix of mass transfer coefficients.
(a) Cast the equations in dimensionless form and obtain a formal solution.
(b) If the quantity of interest is the observed (or diffusion/mass transfer dis-
guised) rate vector defined by

L
1
robs = ∫ Kc(x) dx,
L
0

show that

robs = K∗ c0 ,

where the diffusion/mass transfer disguised rate constant matrix K∗ is given


by
158 | 5 Solution of linear equations containing a square matrix

D
K∗ = Φ tanh Φ(Bi + Φ tanh Φ)−1 Bi,
L2

where Φ2 = D−1 KL2 ; Bi = D−1 kc L


(c) If we write

K∗ = KH

where H is called the effectiveness matrix, show that when D is diagonal and
K is nonsingular, H is given by H = Φ−1 tanh Φ(I + Bi−1 Φ tanh Φ)−1
k1 k2
(d) Use the result in (b) to calculate H for the reaction networks (i) A 󳨀→ B 󳨀→ C,
k1 k2 k3
(ii) A 󳨀→ B 󳨀→ C, A 󳨀→ D when the diffusivities and mass-transfer coef-
ficients of all the species are identical. Give a physical interpretation of the
diagonal elements of H.
(e) Consider again the special case in which

Lkc
D = dI; Bi = Bim I; Bim =
d

and we write

kL2
K = kA; K∗ = kA∗ ; Φ2 = ϕ2 A; ϕ2 =
d

where A (A∗ ) is the relative rate constant (diffusion disguised relative rate con-
stant) matrix. Show that the result in (b) may be expressed in dimensionless
form as

√A tanh ϕ√A ϕ√A tanh ϕ√A


−1
A∗ = [I + ]
ϕ Bim

Discuss the two limiting cases of negligible internal resistance (ϕ → 0) and


negligible external resistance (Bim → ∞). Calculate the diffusion-disguised
relative rate constant matrix when ϕ = 10 and Bim = ∞ for the following
reaction network:

k2 k4
󳨀󳨀 B →
A→
← 󳨀󳨀 C

k1 k3

where

k k k
k1 = k, k2 = , k3 = , k4 = .
2 2 4

Draw a schematic diagram of the diffusion-disguised reaction network and


compare it with the above true reaction network.
5.6 Projections operators and vector projections | 159

7. The concentration vector in a tubular reactor with axial dispersion satisfies the
equations

1 d2 c dc
− − Dac = 0 0 < ξ < 1
Pe dξ 2 dξ
1 dc
= c − c0 @ ξ = 0
Pe dξ
dc
= 0@ξ = 1

where c0 is the feed concentration vector, Pe is the Peclet number and Da = Kτ is


the Damköhler matrix, K is the rate constant matrix and τ is the space time.
(a) Outline a procedure for solving the above equations using eigenvector expan-
sions. Obtain the formal solution.
(b) Show that the exit concentration vector may be written as

c(ξ = 1) = f (Pe, Da)c0

where the function f is defined by

Pe Pe Pe
−1
f (Pe, Da) = 4 exp( )Y[(I + Y)2 exp( Y) − (I − Y)2 exp(− Y)]
2 2 2

and
1/2
4Da
Y = (I + )
Pe

(c) Use the result in (b) to calculate c(ξ = 1) for the reaction network in problem
#5 and numerical values:

1
Pe = 2.0; kτ = 2; c0 = ( 0.2 ) ;
0
1 − 21 0
Da = kτA; A = ( −1 1 − 41 ) .
0 − 21 1
4
6 Generalized eigenvectors and canonical forms
When an n × n matrix A that is not symmetric has repeated eigenvalues and fewer than
n eigenvectors, it is not possible to find a matrix P that reduces A to a diagonal form.
However, Jordon’s theorem states that any n × n matrix can be reduced to a canoni-
cal form called the Jordan canonical form. This chapter gives a brief introduction to
generalized eigenvectors and theory of Jordan forms.

6.1 Repeated eigenvalues and generalized eigenvectors


Consider the eigenvalue problem

Ax = λx (6.1)

where A is a matrix of constants and the solution x is the eigenvector that depends on
eigenvalue λ. Rewriting equation (6.1) as

Ax(λ) = λx(λ) (6.2)

and differentiating both sides of equation (6.2) w. r. t. λ, leads to

Ax′ (λ) = λx′ (λ) + x(λ) (6.3)

Note that if λ = λ1 is a repeated root of Pn (λ) = 0 and the rank of (A − λI) is n − 1,


then there is only one eigenvector that satisfies the equation (6.2) corresponding to
repeated root λ1 . Thus, in this case, we can define

x(λ) = x1 = regular eigenvector corresponding to λ1 ; (A − λ1 I)x1 = 0


dx(λ)
x′ (λ) = = x2 = generalized eigenvector corresponding to λ1

where

Ax2 = λ1 x2 + x1 (6.4)

or

(A − λ1 I)x2 = x1 or (A − λ1 I)2 x2 = 0. (6.5)

Now, we have two eigenvectors: one regular (x1 ) and one generalized eigenvector (x2 )
of rank 2. This procedure can be generalized if the eigenvalues λ = λ1 has multiplicity
greater than 2 (say r ≥ 2) and the rank of (A − λ1 I) is smaller than (n − r).

https://doi.org/10.1515/9783110739701-006
6.1 Repeated eigenvalues and generalized eigenvectors | 161

Differentiating equation (6.3) again w. r. t. λ leads to the relation

Ax′′ (λ) = λx′′ (λ) + 2x′ (λ). (6.6)

Defining

1 ′′
x3 = x (λ) (6.7)
2!

Equation (6.6) at λ = λ1 can be written as

(A − λ1 I)x3 = x2 or (A − λ1 I)3 x3 = 0, (6.8)

and defines a generalized eigenvector of rank 3. Similarly, differentiating equation


(6.6) again w. r. t. λ and defining

1 ′′′
x4 = x (λ) (6.9)
3!

gives

(A − λ1 I)x4 = x3 or (A − λ1 I)4 x4 = 0, (6.10)

and so forth.

du
6.1.1 Linearly independent solutions of dt
= Au with repeated eigenvalues

Let λ be an eigenvalue of A with eigenvector x. Then we have seen that u(t) = xeλt is a
solution of du
dt
= Au.

Lemma. Suppose that λ1 is a repeated eigenvalue of A (with multiplicity r = 2) with


eigenvector x1 and generalized eigenvector (GEV) x2 such that

(A − λ1 I)x1 = 0
(A − λ1 I)x2 = x1

then

u1 (t) = x1 eλ1 t
u2 (t) = x2 eλ1 t + x1 teλ1 t

du
are the two linearly independent solutions of the vector equation dt
= Au.
162 | 6 Generalized eigenvectors and canonical forms

Proof. It is easily verified that du


dt
1
= Au1 and du
dt
2
= Au2 .
To show that u1 and u2 are linearly independent, let

c1 u1 + c2 u2 = 0
λ1 t λ1 t
󳨐⇒ c1 x1 e + c2 (x2 e + x1 teλ1 t ) = 0

Evaluating at t = 0,

󳨐⇒ c1 x1 + c2 x2 = 0

󳨐⇒ c1 = 0 and c2 = 0 since x1 and x2 are linearly independent.


d
[Remark: If we write u1 (t) = u(λ, t) = x(λ)eλt , then u2 (t) = dλ
u(λ, t) when λ is a
repeated eigenvalue].

Theorem. If A is a symmetric matrix with real elements, then it cannot have any gener-
alized eigenvectors of rank 2 or higher.

Proof. If AT = A 󳨐⇒ eigenvalues are real.


Suppose that x is a GEV of A of rank 2, i. e.,

(A − λ1 I)2 x = 0 (6.11)
(A − λ1 I)x ≠ 0 (6.12)

Multiplying equation (6.11) on the left by xT (or take dot product with x) 󳨐⇒

xT (A − λ1 I)2 x = 0
󳨐⇒ xT (A − λ1 I)T (A − λ1 I)x = 0

Since AT = A and λ is real,

󳨐⇒ yT y = 0, where y = (A − λ1 I)x
󳨐⇒ y = 0 󳨐⇒ (A − λ1 I)x = 0,

which is a contradiction (see equation (6.12)). Thus, all eigenvectors of A are of


rank 1.

6.1.2 Examples of repeated EVs and GEVs

Consider the matrix A = ( −2 1


−1 −4 ). Its eigenvalues can be obtained by solving

|A − λI| = 0

󳨐⇒
6.1 Repeated eigenvalues and generalized eigenvectors | 163

λ2 + 6λ + 9 = 0 󳨐⇒ λ1,2 = −3, −3.

Note that

1 1
(A − λ1 I) = ( ) 󳨐⇒ rank(A − λ1 I) = 1,
−1 −1

i. e., only one eigenvector and one generalized eigenvector exist. In addition,

0 0
(A − λ1 I)2 = ( ).
0 0

Thus,

a
(A − λ1 I)2 x2 = 0 󳨐⇒ x2 = ( ).
b

Since x2 ≠ 0, both a and b cannot be simultaneously 0. Thus,

x1 = regular eigenvector
1 1 a
= (A − λ1 I)x2 = ( )( )
−1 −1 b
1
= (a + b) ( ), (a + b) ≠ 0
−1

1 ).
If we take a = 1 and b = 0, then the GEV x2 = ( 01 ) and regular eigenvector x1 = ( −1
Using these two eigenvectors (one GEV and one regular eigenvector), the modal matrix
can be constructed as

1 1 0 −1
T = (x2 , x1 ) = ( ) 󳨐⇒ T−1 = ( )
−1 0 1 1

󳨐⇒

0 −1 −2 1 1 1
T−1 AT = ( )( )( )
1 1 −1 −4 −1 0
1 4 1 1
=( )( )
−3 −3 −1 0
−3 1
=( ) = J = Jordan form of A
0 −3

du
Also, note that the solution of dt
= Au, u = u0 @ t = 0 is given by u = eAt u0 . But

eAt = TeJt T−1 ,


164 | 6 Generalized eigenvectors and canonical forms

1 Jt −3t 1 t
where J = ( −3
0 −3 ) is the Jordan form of A. Since e = e ( 0 1 ),

eAt = TeJt T−1 ,


1 1 1 t 0 −1
=( ) e−3t ( )( )
−1 0 0 1 1 1
1 t+1 0 −1
= e−3t ( )( )
−1 −t 1 1
t+1 t
= e−3t ( )
−t −t + 1

󳨐⇒

t+1 t u
u = eAt u0 = e−3t ( ) ( 10 )
−t −t + 1 u20

6.2 Jordon canonical forms

Definition. An upper (lower) Jordan block of order m is an m×m matrix with the eigen-
values along the diagonal and unity in the upper (lower) diagonal.
Examples of upper and lower Jordon blocks of order two and three are given be-
low:

λ 1
J(λ) = ( ) (upper Jordon block of order 2)
0 λ
λ 1 0
J(λ) = ( 0 λ 1 ) (upper Jordon block of order 3)
0 0 λ
λ 0
J(λ) = ( ) (lower Jordon block of order 2)
1 λ
λ 0 0
J(λ) = ( 1 λ 0 ) (lower Jordon block of order 3)
0 1 λ

The theory for upper and lower Jordon blocks is identical, the only difference being the
arrangement of the generalized eigenvectors. Here, we present the theory for upper
Jordon blocks.

Jordon’s theorem. Given a square matrix A, there exists a similarity transformation,


i. e., a matrix T such that
6.2 Jordon canonical forms | 165

J(λ1 ) 0 . 0
0 J(λ2 ) . 0
T−1 AT = B, B=( )
0 0 . 0
0 0 . J(λr )

and the number of Jordon blocks is equal to the number of linearly independent eigen-
vectors of A and there may be more than one block with the same eigenvalue along the
diagonal.

The proof of this theorem may be found in standard matrix algebra books (Bron-
son and Costa [9]; Gantmacher [18]).

Example 6.1. Consider a 5 × 5 matrix A with the following characteristic polynomial:

P5 (λ) = (λ1 − λ)3 (λ2 − λ)2 .

Then the following possible Jordon forms may exist depending on the number of
eigenvectors:
1. There are 5 eigenvectors (3 corresponding to λ1 and 2 corresponding to λ2 ). In this
case, A can be diagonalized and

λ1 0 0 0 0
0 λ1 0 0 0
T−1 AT = ( 0 0 λ1 0 0 )
0 0 0 λ2 0
0 0 0 0 λ2

2. There are 4 eigenvectors, 2 corresponding to λ1 , 2 corresponding to λ2 . In this case,


Jordon’s theorem implies that there exists a T, such that

λ1 1 0 0 0
0 λ1 0 0 0
B = T−1 AT = ( 0 0 λ1 0 0 ) (4 Jordon blocks)
0 0 0 λ2 0
0 0 0 0 λ2

3. There are 4 eigenvectors, 3 corresponding to λ1 , 1 corresponding to λ2 :

λ1 0 0 0 0
0 λ1 0 0 0
B = T−1 AT = ( 0 0 λ1 0 0 ) (4 Jordon blocks)
0 0 0 λ2 1
0 0 0 0 λ2
166 | 6 Generalized eigenvectors and canonical forms

4. There are 3 eigenvectors, 2 corresponding to λ1 , 1 corresponding to λ2 :

λ1 1 0 0 0
0 λ1 0 0 0
B = T−1 AT = ( 0 0 λ1 0 0 ) (3 Jordon blocks)
0 0 0 λ2 1
0 0 0 0 λ2

5. There are 3 eigenvectors, 1 corresponding to λ1 , 2 corresponding to λ2 :

λ1 1 0 0 0
0 λ1 1 0 0
B = T−1 AT = ( 0 0 λ1 0 0 ) (3 Jordon blocks)
0 0 0 λ2 0
0 0 0 0 λ2

6. There are 2 eigenvectors, each corresponding to λ1 and λ2 . In this case, the canon-
ical form of A consists of two Jordon blocks and is of the form,

λ1 1 0 0 0
0 λ1 1 0 0
B = T−1 AT = ( 0 0 λ1 0 0 ) (2 Jordon blocks)
0 0 0 λ2 1
0 0 0 0 λ2

Note that Jordon’s theorem tells us that A can be reduced to a canonical form B
but does not tell us how to find the matrix T in the similarity transformation. We
now focus on this aspect.

6.3 Multiple eigenvalues and generalized eigenvectors


In the general case, let the characteristic polynomial of A be of the form

Pn (λ) = (λ1 − λ)m1 (λ2 − λ)m2 . . . (λr − λ)mr (6.13)

where, λ1 , λ2 , . . . , λr are distinct eigenvalues and

m1 + m2 + m3 + ⋅ ⋅ ⋅ + mr = n, (6.14)

mi is called the algebraic multiplicity of the eigenvalue λi . Let the number of linearly
independent eigenvectors corresponding to λi be Mi . We note that if rank(A − λi I) =
n − Mi , then there are Mi independent solutions of the homogenous equations (A −
λi I)x = 0.
6.3 Multiple eigenvalues and generalized eigenvectors | 167

The positive integer Mi is called the geometric multiplicity of the eigenvalue λi . In


general, Mi ≤ mi . If Mi = mi for all i = 1, 2, . . . , r, then A is diagonalizable, i. e., B is a
diagonal matrix as in case (i) of Example (6.1). If Mi < mi for some i, then A can only
be reduced to a Jordon canonical form B.
To see how to find T that reduces A to a Jordon form, consider first the special case
in which A is n × n and has a single eigenvalue λ1 of multiplicity n. Let x1 , x2 , . . . , xn be
the columns of T. Then, by Jordon’s theorem,

T−1 AT = Jn (λ1 ) 󳨐⇒ AT = TJn (λ1 ) (6.15)

󳨐⇒

λ1 1 0 . . 0
0 λ1 1 . . 0
A(x1 x2... xn ) = (x1 x2... xn ) ( 0 0 λ1 . . 0 )
. . . . . 1
0 0 0 . . λ1
󳨐⇒ Ax1 = λ1 x1 or (A − λ1 I)x1 = 0 (6.16)
Ax2 = x1 + λ1 x2 or (A − λ1 I)x2 = x1 (6.17)
Ax3 = x2 + λ1 x3 or (A − λ1 I)x3 = x2 (6.18)
.
.
Axn = xn−1 + λ1 xn or (A − λ1 I)xn = xn−1 (6.19)

These equations define the columns of T. The vector x1 is the regular eigenvector. We
call the vectors x2 , x3 , . . . , xn the generalized eigenvectors. To see the properties of these
vectors, premultiplying equation (6.17) by (A − λ1 I), we get

(A − λ1 I)2 x2 = 0

Similarly, it can be seen that

(A − λ1 I)3 x3 = 0
...
(6.20)
...
(A − λ1 I)n xn = 0

These are the equations for determining the generalized eigenvectors. The vector x2
is called a generalized eigenvector of rank 2, x3 is a generalized eigenvector of rank 3,
etc. Note that once xn is determined, all the others can be determined by simply using
the above equations in reverse order, i. e.,
168 | 6 Generalized eigenvectors and canonical forms

xn−1 = (A − λ1 I)xn
xn−2 = (A − λ1 I)xn−1
.
.
x1 = (A − λ1 I)x2

Definition. A vector xm is called a generalized eigenvector (GEV) of rank m for A cor-


responding to eigenvalue λ1 if

(A − λ1 I)m xm = 0

and

(A − λ1 I)m−1 xm ≠ 0

Chains
A chain generated by a GEV xm of rank m associated with eigenvalue λ1 is a set of
vectors {xm , xm−1 , . . . , x1 } defined recursively as

xj = (A − λ1 I)xj+1 , j = m − 1, m − 2, . . . , 1

The number of vectors in the set is called the length of the chain.
The procedure for finding T may be summarized as follows: Let Pn (λ) given by
equation (6.13) be the characteristic polynomial. Determine the chain of GEVs corre-
sponding to each eigenvalue λj . Arrange these as columns of T, i. e.,

T = (x1,1 x1,2 . . . x1,m1 x2,1 . . . x2,m2 . . . xr,1 . . . xr,mr )

We now focus on the determination of each chain of the GEV.

Procedure for determining the chain of the GEV corresponding to λ1 (multiplicity m1 )


1. Determine first, rank (A − λ1 I) = r1 . If, r1 = n − m1 , then we have m1 eigenvectors
of rank 1 and we are finished. If, r1 > n − m1 , then determine the smallest integer
p1 for which rank(A − λ1 I)p1 = n − m1 .
2. For each integer k between 1 and p1 , inclusive, compute the eigenvalue rank num-
ber Nk as

Nk = rank(A − λ1 I)k−1 − rank(A − λ1 I)k

Each Nk is the number of generalized eigenvectors of rank k that will appear in T.


6.3 Multiple eigenvalues and generalized eigenvectors | 169

3. Determine a GEV of rank p1 by solving the equations

(A − λ1 I)p1 xp1 = 0, (A − λ1 I)p1 −1 xp1 ≠ 0

and construct the chain generated by this vector. Each of these vectors is part of
the canonical basis.
4. Reduce each positive Nk by one. If all Nk are zero, we are finished. If not, find the
highest value of k for which Nk is not zero and determine a GEV of that rank, which
is linearly independent of all previously determined GEV associated with λ1 . De-
termine the chain generated by this vector and include this in the canonical ba-
sis.
5. Repeat step 4 until all GEVs are found.

Example 6.2.

3 2 0 1
0 3 0 0
A=( )
0 0 3 −1
0 0 0 3

Here, the eigenvalue λ = 3 has multiplicity m1 = 4. We have

0 2 0 1
0 0 0 0
(A − 3I) = ( ); rank(A − 3I) = 2
0 0 0 −1
0 0 0 0

Thus, there are two eigenvectors, and hence two generalized eigenvectors. This can
also be confirmed from the following calculation:

0 0 0 0
0 0 0 0
(A − 3I)2 = ( ); rank[(A − 3I)2 ] = 0 ⇒ p1 = 2
0 0 0 0
0 0 0 0

N2 = rank(A − λ1 I) − rank(A − λ1 I)2 = 2 − 0 = 2 and N1 = rank(A − λ1 I)0 − rank(A − λ1 I) =


4 − 2 = 2. Hence, there are two generalized eigenvectors of rank 2 and two of rank 1.
Now, to determine the generalized eigenvectors of rank 2, we solve

(A − λ1 I)2 x2 = 0 ⇒ any x2 satisfies the equations.


2x22 + x24
0
(A − λ1 I)x2 = ( ) ≠ 0 ⇒ 2x22 + x24 ≠ 0 or x24 ≠ 0
−x24
0
170 | 6 Generalized eigenvectors and canonical forms

Hence, we take the two linearly independent generalized eigenvectors of rank 2 as

0 0
0 1
x2 = ( ), y2 = ( )
0 0
1 0

and obtain the eigenvectors of the corresponding chains as

1
0
x1 = ( )
−1
0

and

2
0
y1 = ( )
0
0

Hence,

2 0 1 0
0 1 0 0
T=( )
0 0 −1 0
0 0 0 1
3 1 0 0
0 3 0 0
T−1 AT = ( )
0 0 3 1
0 0 0 3

Example 6.3.

4 2 1 0 0 0
0 4 −1 0 0 0
( 0 0 4 0 0 0 )
) ⇒ P6 (λ) = (4 − λ)5 (7 − λ)
A=(
( 0 0 0 4 2 0 )
0 0 0 0 4 0
( 0 0 0 0 0 7 )

Thus, λ1 = 4, m1 = 5, n − m1 = 1; λ2 = 7, m2 = 1. The eigenvector corresponding to λ2 is


given by
6.3 Multiple eigenvalues and generalized eigenvectors | 171

−3 2 1 0 0 0 x1 0
0 −3 −1 0 0 0 x2 0
(
( 0 0 −3 0 0 0 )(
)( x3 ) (
)=( 0 )
)
( 0 0 0 −3 2 0 )( x4 ) ( 0 )
0 0 0 0 −3 0 x5 0
( 0 0 0 0 0 0 )( x6 ) ( 0 )
0
0
( 0 )
⇒ z1 = ( ) is an eigenvector corresponding to λ2 .
( 0 )
0
( 1 )

To determine the chain of GEV corresponding to λ1 , we note that

0 2 1 0 0 0
0 0 −1 0 0 0
( 0 0 0 0 0 0 )
(A − 4I) = ( ) ⇒ rank(A − 4I) = 4
( 0 0 0 0 2 0 )
0 0 0 0 0 0
( 0 0 0 0 0 3 )
0 0 −2 0 0 1
0 0 0 0 0 −1
( 0 0 0 0 0 0 )
(A − 4I)2 = ( ) ⇒ rank(A − 4I)2 = 2
( 0 0 0 0 0 0 )
0 0 0 0 0 0
( 0 0 0 0 0 9 )
0 0 0 0 0 −2
0 0 0 0 0 0
( 0 0 0 0 0 0 )
(A − 4I)3 = ( ) ⇒ rank(A − 4I)3 = 1
( 0 0 0 0 0 0 )
0 0 0 0 0 0
( 0 0 0 0 0 27 )

Thus, p1 = 3 and N3 = rank(A − 4I)2 − rank(A − 4I)3 = 2 − 1 = 1,

N2 = rank(A − 4I) − rank(A − 4I)2 = 4 − 2 = 2


N1 = rank(A − 4I)0 − rank(A − 4I) = 6 − 4 = 2

The chain corresponding to λ1 will consist of one GEV of rank 3, two GEV of rank 2 and
two GEVs of rank 1. We first determine the GEV of rank 3 by solving
172 | 6 Generalized eigenvectors and canonical forms

(A − 4I)3 x3 = 0; (A − 4I)2 x3 ≠ 0

If xT3 = (x1 x2 x3 x4 x5 x6 ), then the above equations give x6 = 0 while x1 , . . . , x5


are arbitrary. However, for x3 to be GEV of rank 3, we should have (A − 4I)2 x3 ≠ 0 ⇒
x3 ≠ 0. Hence, we take

0
0
( 1 )
x3 = ( )
( 0 )
0
( 0 )

The remaining vectors of the chain generated by x3 are

−1
−1
( 0 )
x2 = (A − 4I)x3 = ( )
( 0 )
0
( 0 )
−2
0
( 0 )
x1 = (A − 4I)x2 = ( )
( 0 )
0
( 0 )

We now reduce each Nk by one, obtaining N3 = 0, N2 = 1, N1 = 1. This implies that we


still need to find one GEV of rank 2 and one GEV of rank 1. If we denote the GEV of
rank 2 by yT2 = (y1 y2 y3 y4 y5 y6 ), it is given by solving

(A − 4I)2 y2 = 0; (A − 4I)y2 ≠ 0

⇒ y23 = y26 = 0 and y21 , y22 , y24 , y25 are arbitrary but must be chosen such that (A −
4I)y2 ≠ 0 and y2 is independent of x3 , x2 and x1 determined above:

y21
y22
( 0 )
(A − 4I) ( ) ≠ 0 ⇒ y22 ≠ 0 or y25 ≠ 0
( y24 )
y25
( 0 )
6.4 Determination of f (A) when A has multiple eigenvalues | 173

Hence, we take

0
0
( 0 )
y2 = ( ),
( 0 )
1
( 0 )

which is independent of x3, x2 and x1 . The remaining vector of this chain is given by

0
0
( 0 )
y1 = (A − 4I)y2 = ( )
( 2 )
0
( 0 )

Now we have found all the GEV of A. If we take

T = (x1 x2 x3 y1 y2 z1 ),

then

4 1 0 0 0 0
0 4 1 0 0 0
( 0 0 4 0 0 0 )
T−1 AT = ( )
( 0 0 0 4 1 0 )
0 0 0 0 4 0
( 0 0 0 0 0 7 )

6.4 Determination of f (A) when A has multiple eigenvalues


If A is a real symmetric matrix, then it can be diagonalized and the procedure for find-
ing f (A) is same as before, i. e.,

f (A) = Tf (Λ)T−1

where Λ is the diagonal matrix containing the eigenvalues. Now, let T be such that

J1 (λ) . 0
T−1 AT = B = ( . . . ) = Jordon canonical form of A.
0 0 Jr (λ)
174 | 6 Generalized eigenvectors and canonical forms

Then A = TBT−1 and

f (J1 ) . 0
f (A) = Tf (Λ)T−1 = T ( . . . ) T−1 .
0 0 f (Jr )

Thus, if we can evaluate f (J) where J is a Jordon block, then we can evaluate f (A).

Lemma. Let Jm (λ) be a Jordon block of order m, i. e.,

λ 1 0 . . 0
0 λ 1 . . 0
( 0 0 λ . . 0 )
Jm (λ) = ( ),
( . . . . 1 . )
. . . . . .
( 0 0 0 . . λ )
f ′ (λ) f ′′ (λ) f [m−1] (λ)
f (λ) 1! 2!
. . . (m−1)!
f ′ (λ) f [m−2] (λ)
( 0 f (λ) 1!
. . . (m−2)! )
f (Jm (λ)) = ( )
. . . . . . .
. . .. . . . .
( 0 0 0 . . . f (λ) )

To illustrate the use of this lemma, we compute exp(Jm (λ)t). Let,

λt t 0 0 . 0
0 λt t 0 . 0
( 0 0 λt t . 0 )
U = Jm (λ).t = (
(
)
)
. . . . . .
. . . . . .
( 0 0 0 0 . λt )

Note that U is not a Jordon block because it does not have ones on the super diagonal.
It is easily verified that U may be written as

U = VJm (λt)V−1

where

t m−1 0 0 . . 0
0 t m−2 0 . . 0
( . . . . . . )
V=(
(
)
)
. . . . . .
. . . . . .
( 0 0 0 . . 1 )
6.5 Application of Jordon canonical form to differential equations | 175

λt 1 0 . . 0
0 λt 1 . . 0
( 0 0 λt 1 . 0 )
Jm (λt) = ( )
. . . . . .
. . . . λt 1
( 0 0 0 . . λt )
t 1−m 0 0 . . 0
0 t 2−m 0 . . 0
( . . . . . . )
V−1 = ( )
. . . . . .
. . . . . .
( 0 0 0 . . 1 )

Hence,

exp(U) = exp(Jm (λ)t) = V exp(Jm (λt))V−1 .

Thus,

t t2 t m−1
1 1! 2!
. . (m−1)!
t t m−2
0 1 1!
. . (m−2)!
( )
( t m−3 )
exp(Jm (λ)t) = exp(λt) ( 0 0 1 . . (m−3)! )
( )
. . . . . .
. . . . . .
( 0 0 0 . . 1 )

6.5 Application of Jordon canonical form to differential equations


Consider the solution of the first-order equations

dx
= Ax, x(t = 0) = x0 (6.21)
dt

Substituting x = Tz, (6.21) transforms into

dz
= T−1 ATz, z(t = 0) = T−1 x0 ≡ z0 (6.22)
dt

If T−1 AT is a diagonal matrix, then (6.22) is a completely uncoupled system of n equa-


tions, i. e.,

dzi
= λi zi ⇒ zi = zi0 exp(λi t)
dt

Since x = Tz, each component of the solution to (6.21) is of the form,


176 | 6 Generalized eigenvectors and canonical forms

xi = c1i exp(λi t) + ⋅ ⋅ ⋅ + cni exp(λn t).

If T−1 AT consists of Jordon blocks, then the equations (6.22) are uncoupled only par-
tially (or equivalently in blocks). In this case, each component of the solution to (6.21)
consists of terms like exp(λi t) as well as terms of the form t exp(λi t), t 2 exp(λi t), . . ., etc.
To illustrate this, we consider a simple case in which the Jordon canonical form is

4 1 0 0 0 0
0 4 1 0 0 0
( 0 0 4 0 0 0 )
T−1 AT = ( )=B
( 0 0 0 4 1 0 )
0 0 0 0 4 0
( 0 0 0 0 0 7 )

Then the equations for zi are

dz1
= 4z1 + z2
dt
dz2
= 4z2 + z3
dt
dz3
= 4z3
dt
dz4
= 4z4 + z5
dt
dz5
= 4z5
dt
dz6
= 7z6
dt
or,

dz
= Bz; z = z0 @ t = 0 (6.23)
dt

The solution of (6.23) is given by

z = exp(Bt)z0 ,

where
t 4t t 2 4t
e4t 1!
e 2!
e 0 0 0
4t t 4t
0 e 1!
e 0 0 0
( 4t )
(
exp(Bt) = ( 0 0 e 0 0 0 )
t 4t
).
( 0 0 0 e4t 1!
e 0 )
0 0 0 0 e4t 0
( 0 0 0 0 0 e7t )
6.5 Application of Jordon canonical form to differential equations | 177

We close this topic by stating a general theorem on the solution of the linear initial
value problem,

du
= Au, u(@ t = 0) = u0 (6.24)
dt

Theorem. Let λ1 , λ2 , . . . , λr be the distinct eigenvalues of A with multiplicities m1 , m2 , . . . ,


mr and

r
∑ mi = n (6.25)
i=1

Let Wi (A) = generalized eigenspace of A corresponding to eigenvalue λi (This is the


vector space spanned by the eigenvectors corresponding to λi ) and

ℝn = W1 (A) ⊕ W2 (A) ⊕ ⋅ ⋅ ⋅ ⊕ Wr (A) (6.26)

Then the solution of the initial value problem, defined by equation (6.24) is given by

r mj −1
tk
u(t) = ∑ [ ∑ (A − λj I)k exp(λj t)]u0,j
j=0 k=0
k!

where
r
u = ∑ u0,j and u0,j ∈ Wj (A).
i=1

Problems
1. Given the matrix A = ( −2 1
−1 4 )
(a) Determine the eigenvalues, generalized eigenvectors and generalized adjoint
eigenvectors.
(b) What is the Jordan canonical form of A?
(c) Determine the matrix exp[At].
2. Given the matrix

8 −2 −2 0
0 6 2 −4
A=( )
−2 0 8 −2
2 −4 0 6

(a) Determine the eigenvalues, generalized eigenvectors and generalized adjoint


eigenvectors.
(b) What is the Jordan canonical form of A?
178 | 6 Generalized eigenvectors and canonical forms

(c) Given the set of differential equations

du
= Au, u(t = 0) = u0 ,
dt

with A as defined as above, determine the solution.


3. Consider the case of n-consecutive first-order reactions occurring in a tubular plug
flow reactor and the special case in which the rate constants for all the reactions
are equal.
(a) Formulate the balance equations and put them in matrix/vector form.
(b) If the feed to the reactor contains only the first species, determine the exit
concentration of any intermediate species.
4. Determine the eigenvalues, left and right eigenvectors and generalized eigenvec-
tors (if any) of the matrix

−7 −1 −3 1
−1 7 1 −3
A=( ).
−3 1 7 −1
1 −3 −1 7
7 Quadratic forms, positive definite matrices and
other applications
Quadratic forms appear in many applications such as the determination of maxima or
minima of functions of several variables, optimization theory, solution of linear and
nonlinear equations by iterative methods, tensor analysis and coordinate transforma-
tions, stability and control theory, definitions of metric or inner products, classifica-
tion of partial differential equations, etc. The first four sections of this chapter give a
brief introduction to this topic, and the rest of the chapter deals with various other
applications.

7.1 Quadratic forms


To illustrate one example where quadratic forms appear, we consider a scalar function
of n-variables f (x1 , x2 , . . . , xn ) and expand it in a Taylor (Maclaurin’s) series around the
origin x = 0 and keep only the constant, linear and quadratic terms. This gives

f (x) = c + bT x + xT Ax, (7.1)

where

c = f (x = 0)
𝜕f
bi = (x = 0)
𝜕xi
1 𝜕2 f
{aij } = {aji } = (x = 0).
2! 𝜕xi 𝜕xj

The real symmetric matrix 2A (of second partial derivatives of f (x1 , x2 , . . . , xn )) is also
called the Hessian matrix. We can remove the constant and linear terms in equation
(7.1) by a translation of the origin. Defining

y=x−α (7.2)

we get

f (x) = c + bT (y + α) + (y + α)T A(y + α)


= (c + bT α + αT Aα) + (2αT A + bT )y + yT Ay.

Suppose that we can choose α such that

1
Aα = − b (7.3)
2
https://doi.org/10.1515/9783110739701-007
180 | 7 Quadratic forms, positive definite matrices and other applications

Then the linear terms vanish. Defining Q(x) = f (x) − (c + bT α + αT Aα), the quadratic
form simplifies to

Q(y) = yT Ay (7.4)

Note that equation (7.3) can be solved for any b only if A is not singular (or if the vector
−b/2 is in the column space of A). For the case of n = 2 with

a b
A=( ), (7.5)
b c

the linear terms can be removed only if b2 − ac ≠ 0, i. e., the quadratic form is not of
parabolic type.
The quadratic form given by equation (7.4) can be put in canonical form by not-
ing that for a real symmetric matrix A, there exists an orthogonal matrix U such that
UT AU = Λ (diagonal). Making the coordinate transformation (which is a rotation)

y = Uz (7.6)

Equation (7.4) reduces to its canonical form

Q = (Uz)T AUz
= zT UT AUz
= zT Λz
= λ1 z12 + λ2 z22 + ⋅ ⋅ ⋅ + λn zn2 .

Example 7.1. Examine the nature of the curve defined by 5x12 − 8x1 x2 + 5x22 = 10.
The quadratic form Q = 5x12 − 8x1 x2 + 5x22 may be written as

5 −4 x
Q = ( x1 x2 ) ( ) ( 1 ) = 10
−4 5 x2

5 −4 ) are λ = 9, λ = 1, x = ( 1/√2
The eigenvalues and eigenvectors of A = ( −4 5 1 2 1 );
−1/√2
x2 = ( 1/√2 ). Thus, making the substitution (rotation) x = Uz, where

1/ 2

1 1
√2 √2
U=( ) (7.7)
− √12 1
√2

the quadratic form takes the canonical form

9z12 + z22 = 10
7.1 Quadratic forms | 181

or equivalently,

z12 z2
+ 2 =1
(10/9) 10

Now, since
1
z1 − √12 x1 (x − x2 )/√2
z=( ) = UT x = ( √2
1 1 )( )=( 1 ),
z2 √2 √2
x2 (x1 + x2 )/√2

in the original x-coordinates, the quadratic form may be written as

2 2
x1 − x2 x + x2
9( ) +( 1 ) = 10.
√2 √2

This represents an ellipse with semiaxis lengths of √10/9 and √10, respectively. Fig-
ure 7.1 shows the two coordinate systems as well as the curve (ellipse) represented by
the quadratic form. In this case, equation (7.7) represents a 7π/4 rotation of the axes
in the positive (or counterclockwise) direction (or π/4 in the clockwise direction).

Figure 7.1: Ellipse represented by 5x12 − 8x1 x2 + 5x22 = 10 with standard and canonical coordinate
systems.

Example 7.2. The quadratic form defined by 5x12 + 8x1 x2 + 5x22 = 10 is also an ellipse
identical to the one in the above example but the major (longer) axis makes an angle
3π/4 with the positive x1 -axis (see Figure 7.2).

Example 7.3. The quadratic form defined by 3x12 − 8x1 x2 − 3x22 = 10 is a hyperbola, as
seen from the following analysis. The eigenvalues and eigenvectors of A = ( −43 −4 ) are
−3
1/ 5 2/ 5
λ1 = −5, λ2 = 5, x1 = ( 2/ ); x2 = ( −1/ ). Making the substitution (rotation) x = Uz,
√ √
√5 √5
where
182 | 7 Quadratic forms, positive definite matrices and other applications

Figure 7.2: Ellipse represented by 5x12 + 8x1 x2 + 5x22 = 10 with standard and canonical coordinate
systems.

1 2
√5 √5
U=( 2
) (7.8)
√5
− √15

reduces it to the canonical form

−5z12 + 5z22 = 10,

or equivalently,

z22 z12
− =1
2 2

Now, since

1 2
z1 √5 √5 x1 (x + 2x2 )/√5
z=( ) = UT x = ( )( )=( 1 ),
z2 2
− √15 x2 (2x1 − x2 )/√5
√5

in the original x-coordinates, the quadratic form may be written as

2 2
2x1 − x2 x + 2x2
( ) −( 1 ) = 2.
√5 √5

Figure 7.3 shows the curve represented by the quadratic form along with the axes and
asymptotes.
Another application of quadratic forms is in obtaining the nature of the extrema
of multivariable functions (i. e., maxima, minima, saddle points, etc.), which we will
discuss in Section 7.4.
7.2 Positive definite matrices | 183

Figure 7.3: Hyperbola represented by 3x12 − 8x1 x2 − 3x22 = 10 with standard and canonical coordinate
systems.

7.2 Positive definite matrices


Consider the quadratic form

Q(x) = xT Ax (7.9)

where is A a real symmetric matrix.

Definition. Q(x) is called positive definite if it takes only positive values for any choice
of x ≠ 0 and is zero only for x = 0, i. e., xT Ax > 0 for all x ≠ 0. Similarly, Q(x) is called
negative definite if xT Ax < 0 for all x ≠ 0.
For example, Q(x) = 5x12 − 8x1 x2 + 5x22 = 9( x1√−x2 2 )2 + ( x1√+x2 2 )2 is positive definite.
The following tests may be used to check for the positive definiteness of a quadratic
form or symmetric (or Hermitian) matrix:
(i) A is positive definite if and only if it can be reduced to an upper triangular form
using only elementary row (or column) operations of type 3 and the diagonal ele-
ments of the resulting matrix (the pivots) are all positive.
(ii) A is positive definite if and only if all its principal minors are positive. A principal
minor of A is the determinant of any submatrix obtained from A by deleting its
last k rows and k columns (k = 0, 1, . . . , n − 1).
(iii) A is positive definite if and only if all its eigenvalues are positive.
184 | 7 Quadratic forms, positive definite matrices and other applications

The proof of the first two statements may be found in the book by Bronson [8]. State-
ment (iii) is proved in the next section.

Example 7.4. Test the matrix

6 2 −2
A=( 2 6 −2 )
−2 −2 10

for positive definiteness using criteria (i) and (ii) above.


(i) In step 1, we perform the row operations R2 → R2 − 31 R1 and R3 → R3 + 31 R1 , which
gives

6 2 −2
16
A1 = ( 0 3
− 43 ) .
−0 − 43 28
3

In step 2, we perform the row operation R3 → R3 + 41 R2 󳨐⇒

6 2 −2
16
A2 = ( 0 3
− 43 ) .
0 0 9

Since all the diagonal elements (pivots) of A2 are positive, A is positive definite.
(ii) The principal minors of A are

d1 = det[6] = 6,
6 2
d2 = det ( ) = 34,
2 6

and

6 2 −2
d3 = det ( 2 6 −2 ) = 288.
−2 −2 10

Since all three principal minors are positive, the matrix is positive definite.

7.3 Rayleigh quotient


Suppose that A is a real symmetric (or a complex Hermitian) matrix and the eigenval-
ues of A (which are real) are arranged such that
7.4 Maxima/minima for a function of several variables | 185

λ1 ≤ λ2 ≤ ⋅ ⋅ ⋅ ≤ λn .

We define the Rayleigh quotient (which defines a mapping from ℝn /ℂn to the field of
real numbers) as

⟨Ax, x⟩
R(x) = .
⟨x, x⟩

Then the Rayleigh principle may be stated as

λ1 ≤ R(x) ≤ λn

i. e., the Rayleigh quotient attains its minimum value (equal to the smallest eigen-
value of A) when x is the eigenvector corresponding to λ1 . Similarly, R(x) attains its
maximum value when x is the eigenvector corresponding to λn . This may be shown as
follows:
Suppose that U is the orthogonal (unitary) matrix that diagonalizes A. Then

U∗ AU = Λ = diag .(λ1 , . . . , λn )

We assume that the columns of U have been ordered so that λ1 ≤ λ2 ≤ ⋅ ⋅ ⋅ ≤ λn . Setting


x = Uy in the Rayleigh quotient, we get

⟨AUy, Uy⟩ ⟨U∗ AUy, y⟩ ⟨Λy, y⟩ λ1 |y1 |2 + λ2 |y2 |2 + ⋅ ⋅ ⋅ + λn |yn |2


R(x) = = = =
⟨Uy, Uy⟩ ⟨y, y⟩ ⟨y, y⟩ |y1 |2 + |y2 |2 + ⋅ ⋅ ⋅ + |yn |2

The upper and lower bounds follow from this expression and the assumption that
λ1 ≤ λ2 ≤ ⋅ ⋅ ⋅ ≤ λn .

7.4 Maxima/minima for a function of several variables


Recall from calculus that for f (x) ∈ ℝ, a necessary condition for a point x0 to be an
extremum is

df 󵄨󵄨󵄨󵄨
󵄨 =0 (7.10)
dx 󵄨󵄨󵄨x0

d2 f
A sufficient condition is |
dx 2 x0
≠ 0. Further, the extremum is a local minimum if

d2 f 󵄨󵄨󵄨󵄨
󵄨 > 0, (7.11)
dx2 󵄨󵄨󵄨x0

while it is a local maximum if


186 | 7 Quadratic forms, positive definite matrices and other applications

d2 f 󵄨󵄨󵄨󵄨
󵄨 < 0. (7.12)
dx2 󵄨󵄨󵄨x0
2
df
The case of dx 2 |x0 = 0 corresponds to neither maximum nor minimum, but a saddle

point (see Figure 7.4).

Figure 7.4: Schematic of extremum points: minimum, maximum and saddle points.

Now consider a scalar function of n variables f (x1 , x2 , . . . , xn ). Expanding this function


in a Taylor series around a point x0 gives
T
f (x) = f (x0 ) + ∇f (x0 ) (x − x0 )
1 T
+ (x − x0 ) A(x − x0 ) + higher order terms (7.13)
2!

where
𝜕f
𝜕x1
𝜕f
∇f (x0 ) = ( = gradient vector of f (x) at x = x0 (7.14)
𝜕x2 )
..
.
𝜕f
( 𝜕xn )x=x0

and

𝜕2 f 󵄨󵄨󵄨󵄨
A = {aij }n×n = { 󵄨 } = Hessian matrix
𝜕xi 𝜕xj 󵄨󵄨󵄨x=x0 n×n
= Symmetric matrix of second partial derivatives of f (x) evaluated at x = x0 (7.15)

From equations (7.13)–(7.15), it can be seen that a necessary condition for f (x) to have
an extremum at x = x0 is
7.4 Maxima/minima for a function of several variables | 187

∇f (x0 ) = 0, (7.16)

while a sufficient condition for x = x0 to be a minimum is that


T
(x − x0 ) A(x − x0 ) > 0 (7.17)

and for a maximum


T
(x − x0 ) A(x − x0 ) < 0 (7.18)

for all x in the neighborhood of x0 . Otherwise, x = x0 is neither a minimum nor a


maximum.
For simplicity, we can assume x0 = 0. If this is not the case, we can shift the origin.
Thus, the sufficient condition reduces to examining the quadratic form:

Q(x) = xT Ax

where AT = A is a real symmetric matrix. If Q(x) > 0 for all x near the origin, then f (x)
has a local minimum, while if Q(x) < 0 for all x near the origin (0), then f (x) has a
local maximum. Otherwise, the extremum is a saddle point.
Let λ1 , λ2 , . . . , λn be the eigenvalues (real) and T be the orthogonal matrix that di-
agonalizes A. Let

x = Tz 󳨐⇒ z = T−1 x = TT x (7.19)

󳨐⇒

Q = zT TT ATz = zT Λz
n
= ∑ λi zi2 (7.20)
i=1

Thus, if all λi > 0 (i = 1, 2, . . . , n), Q only takes positive values, and hence the extremum
is a minimum. If λi < 0 for all i, Q is negative and the extremum is a maximum. If λi = 0
for some i or if eigenvalues of A are both positive and negative, the extremum is neither
a maximum nor a minimum.
It may be shown that a symmetric matrix A has positive eigenvalues (or is positive
definite) if all the principal minors have positive determinants. A principle minor is
obtained by striking out the last k rows and k columns (k = 1, 2, . . . , n − 1) of A. For
example, for 2 × 2 case,

a11 a12
A=( ), a12 = a21
a21 a22

is positive definite if
188 | 7 Quadratic forms, positive definite matrices and other applications

a11 > 0
a11 a22 − a12 a21 = a11 a22 − a212 > 0.

Similarly, for the 3 × 3 case,

a11 a12 a13 a12 = a21


A = ( a21 a22 a23 ) , a13 = a31
a31 a32 a33 a23 = a32

is positive definite if

a11 > 0,
󵄨󵄨 󵄨
󵄨󵄨 a11 a12 󵄨󵄨󵄨
󵄨󵄨 󵄨 > 0,
󵄨󵄨 a21 a22 󵄨󵄨󵄨
󵄨󵄨 a a12 a13 󵄨󵄨󵄨󵄨
󵄨󵄨 11
󵄨󵄨 󵄨
󵄨󵄨 a21 a22 a23 󵄨󵄨󵄨 > 0.
󵄨󵄨 󵄨
󵄨󵄨 a31 a32 a33 󵄨󵄨󵄨

Remark. A is negative definite if (−A) is positive definite. Then, for the 2 × 2 case, A is
negative definite if

a11 < 0,
󵄨󵄨 󵄨
󵄨󵄨 a11 a12 󵄨󵄨󵄨
󵄨󵄨 󵄨 > 0.
󵄨󵄨 a21 a22 󵄨󵄨󵄨

Example 7.5. State the conditions for a function of two variables f (x1 , x2 ) to have a
local maximum.
For f (x1 , x2 ) to have a local maximum at (x10 , x20 ), a necessary condition is that the
first partial derivatives of f (x1 , x2 ) w. r. t. x1 and x2 vanish, i. e.,

𝜕f 𝜕f
(x , x ) = 0; (x , x ) = 0
𝜕x1 10 20 𝜕x2 10 20

The local extremum is a maximum if the quadratic form

1 𝜕2 f 𝜕2 f
Q= [ 2 (x10 , x20 )(x1 − x10 )2 + 2 (x , x )(x − x10 )(x2 − x20 )
2! 𝜕x1 𝜕x1 𝜕x2 10 20 1
𝜕2 f
+ 2 (x10 , x20 )(x2 − x20 )2 ]
𝜕x2

is negative definite. This is the case if the following two conditions are satisfied:

𝜕2 f
(x10 , x20 ) < 0
𝜕x12
2
𝜕2 f 𝜕2 f 𝜕2 f
(x 10 , x 20 ) (x 10 , x 20 ) − ( (x , x )) > 0
𝜕x12 𝜕x22 𝜕x1 𝜕x2 10 20
7.4 Maxima/minima for a function of several variables | 189

Example 7.6. Consider the function f (x, y) = 2x 2 − 2xy + 5y2 − 18y + 23, and examine it
for extremum values.
The extremum points can be obtained by setting ∇f = 0 󳨐⇒

𝜕f
= 4x − 2y = 0
𝜕x
𝜕f
= −2x + 10y − 18 = 0,
𝜕y

which after solving leads to (x0 , y0 ) = (1, 2) as a possible extremum point. We evaluate
the Hessian matrix (matrix of second derivatives),

𝜕2 f 𝜕2 f
𝜕x 2 4 −2
A=(
𝜕x𝜕y
) =( ).
𝜕2 f 𝜕2 f −2 10
𝜕y𝜕x 𝜕y2 (x0 ,y0 )=(1,2)

The eigenvalues of A are given by |A − λI| = λ2 − 14λ + 36 = 0 󳨐⇒ λ1 = 7 − √13 > 0 and


λ2 = 7 + √13 > 0. Since both eigenvalues are positive, the extremum is a minimum.
This can also be seen from the principal minors of A :

d1 = |4| = 4 > 0,
󵄨󵄨 󵄨
󵄨 4 −2 󵄨󵄨󵄨
d2 = 󵄨󵄨󵄨󵄨 󵄨󵄨 = 36 > 0.
󵄨󵄨 −2 10 󵄨󵄨󵄨

We also note that fmin = f (1, 2) = 5.

Example 7.7. Consider the function g(x, y) = 2x 2 − 8xy − y2 + 18y − 9. The extremum
points can be obtained by setting ∇g = 0 󳨐⇒

𝜕g
= 4x − 8y = 0
𝜕x
𝜕g
= −8x − 2y + 18 = 0.
𝜕y

Solving these linear equations gives a possible extremum point (x0 , y0 ) = (2, 1). We
examine the Hessian matrix

𝜕2 g 𝜕2 g
𝜕x 2 4 −8
A=(
𝜕x𝜕y
) =( ).
𝜕2 g 𝜕2 g −8 −2
𝜕y𝜕x 𝜕y2 (x0 ,y0 )=(2,1)

The eigenvalues of A are λ1 = 1 + √73 > 0 and λ2 = 1 − √73 < 0. Thus, the point x0 = 2,
y0 = 1 is neither a maximum nor a minimum. It is a saddle point.
190 | 7 Quadratic forms, positive definite matrices and other applications

7.5 Linear difference equations


In many applications, such as discrete dynamical systems, stage operations, Markov
processes, etc., the governing equations may be expressed in vector-matrix form as

uk+1 = Auk , k ≥ 0; u0 (i. e., initial state) given (7.21)

Here, uk is the system state at time k, which is assumed to be discrete, i. e., taking
values k = 0, 1, . . . ∞ while A is the connectivity, coupling or transition probability
matrix.

Scalar to vector representation


Scalar difference equations of any order can easily be represented in vector-matrix
form given in equation (7.21). For example, let us consider a second-order difference
equation:

un+1 = aun + bun−1 , n = 1, 2, . . . (7.22)

where u0 and u1 are given. Defining two-dimensional vectors

un−1 un
xn = ( ) and xn+1 = ( ), (7.23)
un un+1

the above second-order difference equation can be expressed as

0 1
xn+1 = ( ) xn = Axn with x1 given. (7.24)
b a

Similarly, a third-order difference equation:

un+1 = aun + bun−1 + cun−2 , n = 2, 3, . . . (7.25)

with given u0 , u1 and u2 , can be converted to vector-matrix form:

un−1 0 1 0
xn+1 = Axn where xn+1 = ( un ) and A = ( 0 0 1 ) (7.26)
un+1 c b a

u0
with given x1 = ( u1 ). Thus, higher-order scalar difference equations can be converted
u2
to vector equations.
7.5 Linear difference equations | 191

Formal solution of difference equation vector-matrix form


It follows from equation (7.21) that

u1 = Au0
u2 = Au1 = A2 u0
..
.
uk = Ak u0 , k = 0, 1, 2, . . . (7.27)

Thus, the computation of uk requires the evaluation of Ak , which is given by the spec-
tral theorem as follows:
n
Ak = ∑ λjk Ej (7.28)
j=1

where Ej are projection matrices given in terms of eigenvectors and eigenrows as

xj y∗j
Ej = = xj y∗j (if y∗j are normalized). (7.29)
y∗j xj

Thus, the solution (equation (7.27)) becomes

n
uk = ∑ λjk Ej u0 , k = 0, 1, 2, . . . (7.30)
j=1

We consider some special cases of this solution based on the magnitude of the eigen-
values.
Case 1: |λj | < 1 for all j = 1, 2, . . . , n
In this case, limk→∞ |λj |k → 0, and thus, uk → 0 for k → ∞ and the system
approaches the trivial state u = 0 for k → ∞.
Case 2: λ1 = 1, |λj | < 1 for all j = 2, 3, . . . , n
In this case, limk→∞ uk → E1 u0 = (y∗1 u0 )x1 , and the system approaches a scalar
multiple of the state given by the eigenvector x1 corresponding to λ1 = 1 (eigenvalue
of unity).
Case 3: |λj | > 1 for some j
In this case, limk→∞ uk → ∞ and the solution is not bounded.
Case 4: A pair of complex eigenvalues of unit modulus and all other eigenvalues
inside the unit circle, i. e., λ1,2 = α ± iβ with |λ1 | = |λ2 | = α2 + β2 = 1 while |λj | < 1 for
j = 3, 4, . . . , n. The solution is again bounded and lies on an invariant circle.
It should be pointed out that all these special cases occur in the solution of non-
linear algebraic equations by local linearization, e. g., Newton–Raphson or other iter-
ative methods.
192 | 7 Quadratic forms, positive definite matrices and other applications

Example 7.8 (Two-stage Markov process). Consider the case in which the state vector
u is of the form

ak
uk = ( )
bk

where ak is the fraction of the population in state ak and bk is the fraction of the pop-
ulation in state B at time k. For example, ak can be the fraction of students in a class
having a mobile phone of type-I while bk as the fraction having phone of type-II. Let

A = {pij } = P

where P being the transition/switching probability matrix, i. e., pij is the probability
of switching from state j to i then
n
∑ pij = 1 for j = 1, 2, . . . , n
i=1

2 1
or the columns of P sum to unity for a Markov matrix. For example, for n = 2, P = ( 31 2
1 )
3 2
2
is a Markov matrix where p11 = is the probability of a student in state A staying in
3
state A, p21 = 31 is the probability of student switching from state A to state B, p12 = 21
is the probability of switching from state B to state A, and p22 = 21 is the probability of
students staying in state B (here p11 + p21 = 1, p12 + p22 = 1).
We note that if P is a Markov matrix then λ1 = 1 is an eigenvalue of P with left
eigenvector yT1 = (1 1 . . . 1). This follows from the fact that ∑ni=1 pij = 1 for each j.
Thus, for a Markov process with initial state u0 , we have
n
uk = ∑ λjk Ej u0
j=1

lim uk = E1 u0 if |λj | < 1 for j = 2, 3, . . . , n


k→∞
= (yT1 u0 )x1
= x1 if y∗1 u0 = 1, i. e., sum of all probabilities is unity

In the example above, the eigenvalues are λ1 = 1 and λ2 = 61 corresponding to the


eigenvectors: x1 = ( 3/5 1 T T 2 −3
2/5 ) and x2 = ( −1 ) and eigenrows: y1 = (1 1) and y2 = ( 5 5 ).
Thus, assuming the components of u0 sum to unit (i. e., yT1 u0 = 1)

uk = (yT1 u0 )x1 + (λ2 )k (yT2 u0 )x2


3/5
⇒ u∞ = lim uk = (yT1 u0 )x1 = x1 = ( ).
k→∞ 2/5
7.5 Linear difference equations | 193

The rate of convergence to the steady state or equilibrium solution depends on λ2 .


In this example, since λ2 = 61 , which is much smaller than unity, the rate of conver-
gence is fast as shown by iterations below:
2 1
Taking u0 = ( 0.5 )u0 = ( 0.583 0.597
0.5 ) ⇒ u1 = ( 1 0.417 ). Similarly, u2 = Pu1 = ( 0.403 ), u3 =
3 2
1
3 2
0.5996 ), and so on. The convergence of Markov process is demonstrated in Figure 7.5
( 0.4004
2 1
with P = ( 31 2
1 ) and u0 = ( 0.05
0.95 ), along with eigenvectors and eigenrows of P. It can be
3 2
seen that the steady-state solution u∞ = x1 is achieved in few (less than 4) iterations
to very good accuracy.

2 1
Figure 7.5: Convergence in Markov chains with P = ( 3
1
2
1 ) and u0 = ( 0.05
0.95 ).
3 2

Remark 7.1. For the case of n = 2 or 2 × 2 Markov matrix, the limk→∞ uk can also be
β α
found by solving a simple algebraic equation. Let u0 = ( 1−β ) and u∞ = ( 1−α ). Now
since

2 1
α 3 2 α
u∞ = Pu∞ ⇒( )=( )( )
1−α 1 1 1−α
3 2
2 1 3
⇒α= + (1 − α) ⇒ α =
3 2 5
3
5
⇒ u∞ = ( 2
),
5

which is independent of β, i. e., the initial point (provided the sum of the two compo-
nents of initial condition is unity).
194 | 7 Quadratic forms, positive definite matrices and other applications

Remark 7.2. For n > 2, u∞ can be found by solving the equation

u∞ = Pu∞

along with the constraints that the sum of all u∞ components is unity. Note that
this is proportional to the eigenvector x1 corresponding to the eigenvalue λ1 where
the proportionality constant can be obtained with the constraints stated earlier, i. e.,
yT1 u∞ = 1.

Example 7.9 (Fibonacci equation). Consider the Fibonacci equation

xn+1 = xn + xn−1 , n = 1, 2, . . .
x0 = x1 = 1

which is a second-order linear difference equation. One way to solve this equation is
by assuming the solution of the form

xn = r n ,

which leads to the characteristic equation:

1 ± √5
r 2 = r + 1 ⇒ r1,2 = (r1 = 1.618, r2 = −0.618).
2
Thus, the general solution can be expressed as

xn = c1 r1n + c2 r2n

where c1 and c2 can be solved from initial points by setting for n = 0 and 1, respectively,
i. e.,

n = 0 ⇒ c1 + c2 = 1 and n = 1 ⇒ c1 r1 + c2 r2 = 1
√5 ± 1
⇒ c1,2 =
2√5

Thus, the general solution of Fibonacci equation is given by

1 (1 + √5)n+1 − (1 − √5)n+1
xn = ( ).
√5 2n+1

Vector-matrix form: The Fibonacci equation can be represented in vector-matrix form


by assuming
7.5 Linear difference equations | 195

xn−1
xn = ( )
xn
xn 0 1 x
⇒ xn+1 = ( )=( ) ( n−1 )
xn+1 1 1 xn
1
⇒ xn+1 = Axn , x1 = ( ).
1

Thus,

x2 = Ax1 , x3 = A 2 x2 , ..., xn = An−1 x1 ,

where A = ( 01 11 ), which is a symmetric matrix with real eigenvalues given by

λ2 − λ − 1 = 0,

1±√5
⇒ λ1,2 = 2
, and corresponding to eigenvectors:

1
x1,2 = ( ),
λ1,2

and normalized eigenrows:

1
yT1,2 = ( 1 λ1,2 ) .
2
√1 + λ1,2

Thus, the projection matrices can be obtained by

$$E_{1,2} = x_{1,2}\, y_{1,2}^T = \frac{1}{1+\lambda_{1,2}^2} \begin{pmatrix} 1 & \lambda_{1,2} \\ \lambda_{1,2} & \lambda_{1,2}^2 \end{pmatrix}$$

and

$$A^{n-1} = \lambda_1^{n-1} E_1 + \lambda_2^{n-1} E_2 = \frac{\lambda_1^{n-1}}{1+\lambda_1^2} \begin{pmatrix} 1 & \lambda_1 \\ \lambda_1 & \lambda_1^2 \end{pmatrix} + \frac{\lambda_2^{n-1}}{1+\lambda_2^2} \begin{pmatrix} 1 & \lambda_2 \\ \lambda_2 & \lambda_2^2 \end{pmatrix},$$

so that

$$\binom{x_{n-1}}{x_n} = \mathbf{x}_n = A^{n-1}\mathbf{x}_1 = A^{n-1}\binom{1}{1} = \begin{pmatrix} \dfrac{(1+\lambda_1)\lambda_1^{n-1}}{1+\lambda_1^2} + \dfrac{(1+\lambda_2)\lambda_2^{n-1}}{1+\lambda_2^2} \\[6pt] \dfrac{(1+\lambda_1)\lambda_1^{n}}{1+\lambda_1^2} + \dfrac{(1+\lambda_2)\lambda_2^{n}}{1+\lambda_2^2} \end{pmatrix}$$


$$x_n = \frac{1+\lambda_1}{1+\lambda_1^2}\,\lambda_1^n + \frac{1+\lambda_2}{1+\lambda_2^2}\,\lambda_2^n.$$

But $\lambda_{1,2} = \frac{1\pm\sqrt{5}}{2}$ and $\lambda_{1,2}^2 = \lambda_{1,2} + 1$, so

$$\frac{(1+\lambda_{1,2})}{1+\lambda_{1,2}^2}\,\lambda_{1,2}^n = \frac{\lambda_{1,2}^2}{1+\lambda_{1,2}^2}\,\lambda_{1,2}^n = \frac{\lambda_{1,2}^{n+2}}{1+\lambda_{1,2}^2} = \frac{\left(\frac{1\pm\sqrt{5}}{2}\right)^{n+2}}{1 + \frac{3\pm\sqrt{5}}{2}} = \frac{1}{2^{n+1}}\, \frac{(1\pm\sqrt{5})^{n+2}}{5\pm\sqrt{5}}$$

$$\Rightarrow\; x_n = \frac{1}{2^{n+1}}\, \frac{(1+\sqrt{5})^{n+2}}{5+\sqrt{5}} + \frac{1}{2^{n+1}}\, \frac{(1-\sqrt{5})^{n+2}}{5-\sqrt{5}} = \frac{1}{\sqrt{5}} \left( \frac{(1+\sqrt{5})^{n+1} - (1-\sqrt{5})^{n+1}}{2^{n+1}} \right),$$

which is the same result obtained earlier. Since $|\lambda_2| = 0.618 < 1$ and $\lambda_2^n \to 0$ as $n \to \infty$, $x_n$ may be approximated for large $n$ as

$$x_n \approx \frac{1+\lambda_1}{1+\lambda_1^2}\,\lambda_1^n = 0.7236\,(1.618)^n.$$

For n = 5, the approximation gives x5 = 8.02 while the exact value is 8.
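
Both routes to the solution can be checked in Mathematica (a short sketch; MatrixPower and RSolve are built-in functions):

(* companion matrix of the Fibonacci recurrence *)
A = {{0, 1}, {1, 1}};

(* x_n = A^(n-1).{1, 1}; for n = 5 this gives {x4, x5} = {5, 8} *)
MatrixPower[A, 4] . {1, 1}

(* closed-form solution of the difference equation *)
RSolve[{x[n + 1] == x[n] + x[n - 1], x[0] == 1, x[1] == 1}, x[n], n]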

7.6 Generalized inverse and least square solutions


Consider the system of linear equations:

Au = b (7.31)

where A is a real m × n matrix, u is n × 1 vector and b is m × 1 vector. When m (number


of equations) > n (number of unknowns), the augmented matrix [A b] may have rank
> n, in which case, the system is inconsistent and has no solution. Similarly, when
m < n, even if rank(A) = rank[A b] = m, there are fewer equations than unknowns and
there is no unique solution. In many practical cases, a solution that satisfies equation
(7.31) in the best possible way, often called the “least square solution,” can be obtained
by minimizing the scalar function:

$$f(u) = (Au - b)^T (Au - b) = u^T A^T A u - b^T A u - u^T A^T b + b^T b. \quad (7.32)$$

If we denote the j-th component of $(Au - b)$ as $e_j$, which represents the error in the j-th equation, then

$$f(u) = e^T e = \sum_{j=1}^{m} e_j^2 = \text{sum of squares of residuals.} \quad (7.33)$$

To minimize f, we set

$$\frac{\partial f}{\partial u_k} = 0, \quad k = 1, 2, \ldots, n \;\Rightarrow\; \nabla f = \text{gradient of } f = 0. \quad (7.34)$$

The gradient of f can be obtained from equation (7.32) as

$$\nabla f = 2A^T A u - 2A^T b = 0 \;\Rightarrow\; A^T A u = A^T b \;\Rightarrow\; u = (A^T A)^{-1} A^T b. \quad (7.35)$$

In arriving at equation (7.35), it is assumed that $A^T A$ is invertible.

Definition. $A^\dagger = (A^T A)^{-1} A^T$ is called the generalized (or Moore–Penrose) inverse of A.

Properties of A†
(i) If m = n and A is an invertible square matrix, then A† = A−1 .
(ii) A† A = In = Identity matrix.
(iii) AA† A = A.

The equations

AT Au = AT b (7.36)

are often referred to as the “least squares equations” and can be solved for u if AT A is
invertible. If we let

B = AT A, (7.37)

then

$$B^T = A^T A = B, \quad (7.38)$$

i. e., B is a symmetric matrix, and hence the eigenvalues of B are real and nonnegative. The positive square roots of the eigenvalues of B are called the "singular values" of A.

It may be shown that

$$A = U \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} V^T \quad (7.39)$$

where U and V are orthogonal matrices and D is a diagonal matrix having all the positive singular values of A as its diagonal elements. Equation (7.39) is referred to as the "singular-value decomposition" of A.
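
In Mathematica, the generalized inverse and the singular values are available directly (a minimal sketch; the rectangular matrix below is the one used in Example 7.10):

A = {{1, 3}, {1, -1}, {1, 1}};

(* Moore-Penrose inverse; for full column rank it equals (A^T A)^(-1) A^T *)
PseudoInverse[A] == Inverse[Transpose[A] . A] . Transpose[A]    (* True *)

(* singular values = positive square roots of the eigenvalues of A^T A *)
SingularValueList[A]                    (* {2 Sqrt[3], Sqrt[2]} *)
Sqrt[Eigenvalues[Transpose[A] . A]]     (* the same values *)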

Example 7.10. Determine the least square solution to the equations

x1 + 3x2 = 5
x1 − x2 = 1
x1 + x2 = 0.

Here, we have

$$A = \begin{pmatrix} 1 & 3 \\ 1 & -1 \\ 1 & 1 \end{pmatrix}, \quad b = \begin{pmatrix} 5 \\ 1 \\ 0 \end{pmatrix}$$
$$\Rightarrow\; A^T A = \begin{pmatrix} 3 & 3 \\ 3 & 11 \end{pmatrix}, \quad A^T b = \begin{pmatrix} 6 \\ 14 \end{pmatrix}.$$

Solving the normal equations (7.36) gives

$$\hat{x}_1 = \hat{x}_2 = 1.$$

Note that the least square solution is equivalent to fitting the line y = α + βx through
the data points (3, 5), (−1, 1) and (1, 0). The least squares solution α = 1 and β = 1 gives
the line closest to the data points as shown in Figure 7.6.
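
The least squares solution of this example is a one-liner in Mathematica (a minimal sketch; LeastSquares returns the minimizer of ‖Au − b‖):

A = {{1, 3}, {1, -1}, {1, 1}};  b = {5, 1, 0};

LeastSquares[A, b]                               (* {1, 1}, i.e., x1 = x2 = 1 *)

(* equivalently, solve the normal equations A^T A u = A^T b *)
LinearSolve[Transpose[A] . A, Transpose[A] . b]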

Problems
1. Given the quadratic form $F = 6x_1^2 + 5x_2^2 + 7x_3^2 - 4x_1x_2 + 4x_1x_3$:
(a) Reduce it to its canonical form.
(b) If F = 1, what are the lengths of the semiaxes?
(c) What are the directions of the principal axes with respect to the original axes?
2. Find the values of the parameter λ for which the following quadratic forms are positive definite:
   (a) $2x_1^2 + x_2^2 + 3x_3^2 + 2\lambda x_1 x_2 + 2x_1 x_3$
   (b) $x_1^2 + 4x_2^2 + x_3^2 + 2\lambda x_1 x_2 + 10 x_1 x_3 + 6 x_2 x_3$

Figure 7.6: Geometric demonstration of least square solution.

3. If A is an n × n real symmetric positive definite matrix and x is an n-vector, show that
$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp(-x^T A x)\, dx = \frac{\pi^{n/2}}{(\det A)^{1/2}}.$$

4. Examine the following functions for relative extremum values:
   (a) $F = x^2 y + 2x^2 - 2xy + 3y^2 - 4x + 7y$
   (b) $F = x^3 + 4x^2 + 3y^2 + 5x - 6y$
   (c) $F = x_1^2 + x_2^2$ subject to the constraint $5x_1^2 + 8x_1x_2 + 5x_2^2 = 10$. Give a physical interpretation of the Lagrange multipliers.
5. (Quadratic forms): Consider the following linear second-order partial differential equation (PDE) satisfied by a scalar function u(x, y, z):
$$a\frac{\partial^2 u}{\partial x^2} + b\frac{\partial^2 u}{\partial y^2} + c\frac{\partial^2 u}{\partial z^2} + 2f\frac{\partial^2 u}{\partial x \partial y} + 2g\frac{\partial^2 u}{\partial y \partial z} + 2h\frac{\partial^2 u}{\partial z \partial x} + l\frac{\partial u}{\partial x} + m\frac{\partial u}{\partial y} + n\frac{\partial u}{\partial z} + qu = 0$$
where a, b, c, f, g, h, l, m, n and q are constants. The mathematical properties of the solutions of the PDE are largely determined by the algebraic properties of the quadratic form:
$$Q = ax^2 + by^2 + cz^2 + 2fxy + 2gyz + 2hzx = ( x \;\; y \;\; z ) \begin{pmatrix} a & f & h \\ f & b & g \\ h & g & c \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = x^T A x,$$

or equivalently, the eigenvalues of the symmetric matrix A. If A has a zero eigenvalue, then the quadratic form (and the PDE) is called parabolic; if A has all three eigenvalues strictly positive or all three strictly negative, then the quadratic form is termed elliptic; while if A has three nonzero eigenvalues with the sign of one eigenvalue distinct from the other two, then the quadratic form and the PDE are called hyperbolic.
Based on this definition, classify the following PDEs: (i) uxx +6uxy +uyy +2uyz +uzz =
0, (ii) uxx + uyy + uzz = 0, (iii) uz − uxx − uyy = 0, (iv) uzz − uxx − uyy = 0 [Note: Here,
the subscripts stand for differentiation with respect to the subscripted variable].
6. (a) Consider the case of a linear PDE in two variables
$$a\frac{\partial^2 u}{\partial x^2} + 2b\frac{\partial^2 u}{\partial x \partial y} + c\frac{\partial^2 u}{\partial y^2} + g\frac{\partial u}{\partial x} + h\frac{\partial u}{\partial y} + du = 0;$$
state explicitly (in terms of a, b and c) the conditions under which the PDE is parabolic, elliptic or hyperbolic.
(b) The dispersion of a tracer in a capillary is described by the equation
$$\frac{\partial C}{\partial t} + \frac{\partial C}{\partial z} + P\frac{\partial^2 C}{\partial t \partial z} - \frac{1}{Pe}\frac{\partial^2 C}{\partial z^2} = 0$$

where P and Pe are constants known as the transverse and axial Peclet numbers,
respectively. Determine whether this equation is parabolic, elliptic or hyperbolic.
7. (Scalar differential equation with constant coefficients)
(a) Consider the scalar second-order equation
$$a\frac{d^2 u}{dt^2} + b\frac{du}{dt} + cu = f(t), \quad t > 0; \qquad u(0) = \alpha,\; \alpha \neq 0; \qquad u'(0) = \beta;$$
write the above equations in vector form:
$$\frac{d\mathbf{u}}{dt} = A\mathbf{u} + \mathbf{b}(t), \quad t > 0; \quad \text{and} \quad \mathbf{u}(t = 0) = \mathbf{u}^0,$$
where $\mathbf{u} = \binom{u_1}{u_2}$, and identify the coefficient matrix A and vector b.
(b) Show that the nth order scalar initial value problem can also be expressed in
the same vector form. Identify the coefficient matrix A and vector b.
8. (Solution of inhomogeneous vector IVP)
Consider a general IVP in vector form
$$\frac{du}{dt} = Au + b(t), \quad t > 0; \quad \text{and} \quad u(t = 0) = u^0,$$
where A is a general matrix.
(a) Obtain a formal solution of the above equation using the integrating factor $e^{-At}$.
(b) Show that when $b(t) = b^0$ is a constant vector for t > 0 and A is invertible, the solution simplifies to
$$u = e^{At} u^0 + (e^{At} - I)A^{-1} b^0.$$

9. (Real and complex canonical forms)
Consider the initial value problem:
$$\frac{du}{dt} = Au, \quad t > 0; \quad \text{and} \quad u = u^0 \text{ at } t = 0,$$
where A is a 3 × 3 matrix with one real eigenvalue $\lambda_1$ and a pair of complex eigenvalues $\lambda_2 = \alpha + \iota\beta$, $\lambda_3 = \alpha - \iota\beta$.
(a) Show that the eigenvector $x_1$ has real elements, while $x_2$ and $x_3$ may be expressed as complex conjugates:
$$x_2 = x_{2R} + \iota x_{2I} \quad \text{and} \quad x_3 = x_{2R} - \iota x_{2I},$$
where $x_{2R}$ and $x_{2I}$ are real vectors obtained from the real and imaginary parts of $x_2$.
(b) If we define
$$T = ( x_1 \;\; x_{2R} \;\; x_{2I} ),$$
verify that
i.
$$T^{-1} A T = \hat{\Lambda} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \alpha & \beta \\ 0 & -\beta & \alpha \end{pmatrix} \quad \text{or} \quad A = T \hat{\Lambda} T^{-1};$$
ii.
$$f(A) = T f(\hat{\Lambda}) T^{-1}$$
for any analytical function f;
iii. use the results of (b-ii) to show that
$$e^{At} = T e^{\hat{\Lambda} t} T^{-1} = T \begin{pmatrix} e^{\lambda_1 t} & 0 & 0 \\ 0 & e^{\alpha t}\cos(\beta t) & e^{\alpha t}\sin(\beta t) \\ 0 & -e^{\alpha t}\sin(\beta t) & e^{\alpha t}\cos(\beta t) \end{pmatrix} T^{-1}.$$
[Remark: Here, $\hat{\Lambda}$ is referred to as the real canonical form of A].

10. Find a least squares solution of Ax = b, when
$$A = \begin{pmatrix} -1 & 2 \\ 2 & -3 \\ -1 & 3 \end{pmatrix}, \quad b = \begin{pmatrix} 4 \\ 1 \\ 2 \end{pmatrix}.$$

11. Consider a set of data points $[(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)]$ to which you want to fit a linear model $y = \alpha_0 + \alpha_1 x$.
(a) Show that this is equivalent to solving the system of equations $A\alpha = y$, where
$$A = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_N \end{pmatrix}, \quad \alpha = \binom{\alpha_0}{\alpha_1}, \quad y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}.$$
(b) Show that the least squares solution is given by solving the normal equations:
$$(A^T A)\alpha = A^T y.$$

12. (Application of matrix methods to linear difference equations): Linear difference equations of the following type appear in many applications:
(a) un+1 = aun + bun−1 ; n ≥ 1; u0 and u1 given
(b) un+2 = aun+1 + bun + cun−1 ; n ≥ 1; u0 , u1 and u2 given
Formulate the above equations in vector-matrix form xk+1 = Axk and identify
the matrix that appears.
(c) Solve the Fibonacci equation un+1 = un +un−1 ; n ≥ 1; u0 = u1 = 1 and determine
a formula for un when n is large.
13. [Discrete dynamical systems and Markov matrices]: The manufacturer of a prod-
uct (P) currently controls 20 % of the market in a particular country, while the
rival brand (Q) controls the rest of the market. Data from previous year show that
75 % of P’s customers remained loyal, while 25 % switched to the rival brand. In
addition, 50 % of the competition’s customers did not switch to P during the year
while the other 50 % did. (a) Representing the market share of each brand as a
vector (whose components are fractions), formulate the problem of determining
the market share next year, given the market share this year, as a matrix-vector
problem and (b) Use the spectral theorem (or any other method) to determine the
market share of the two products in the long run.
14. Let F(x, y, z) be a single-valued and twice differentiable function of the variables x, y and z. State the conditions (in terms of partial derivatives) under which a point (x0, y0, z0) can be a local maximum or minimum.
15. (a) The same similarity transformation diagonalizes both matrices A and B. Show
that A and B must commute.
(b) Two Hermitian matrices A and B have the same eigenvalues. Show that A and
B are related by a unitary similarity transformation.
Part II: Abstract vector space concepts
Introduction
As stated in the introduction to the book, most of the theories of linear differential
equations and other linear operators are generalizations to an infinite number of di-
mensions of the properties of matrices and vectors. For example, we have seen that
the solution of the set of coupled linear first-order differential equations

$$\frac{du}{dt} = Au, \quad (1)$$

with the initial condition

$$u(t = 0) = u^0, \quad (2)$$

is given by

$$u = \sum_{j=1}^{n} \frac{y_j^* u^0}{y_j^* x_j}\, x_j e^{\lambda_j t}, \quad (3)$$

where A is a constant coefficient n × n matrix, λj , xj and y∗j are the eigenvalues, eigen-
vectors and eigenrows of A, respectively. If A is a real symmetric (self-adjoint) matrix,
we have shown that yj = xj and the eigenvectors may be normalized to form an or-
thonormal basis. In this case, the solution given by Eq. (3) simplifies to
$$u = \sum_{j=1}^{n} \langle u^0, x_j \rangle x_j e^{\lambda_j t}, \quad (4)$$

where ⟨u0 , xj ⟩ = xTj u0 is the scalar(dot) product. Now, consider the linear partial dif-
ferential equation

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial \xi^2}; \quad 0 < \xi < 1, \; t > 0 \quad (5)$$

with boundary conditions

u(0, t) = u(1, t) = 0 (6)

and initial condition

u(ξ , 0) = u0 (ξ ). (7)

The solution of Eqs. (5)–(7) may be written as



$$u(\xi, t) = \sum_{j=1}^{\infty} \langle u^0(\xi), x_j(\xi) \rangle\, x_j(\xi)\, e^{\lambda_j t} \quad (8)$$

where

$$\langle u^0(\xi), x_j(\xi) \rangle = \int_0^1 u^0(\xi)\, x_j(\xi)\, d\xi, \qquad x_j(\xi) = \sqrt{2}\sin j\pi\xi. \quad (9)$$

Here, $\lambda_j = -j^2\pi^2$ $(j = 1, 2, \ldots)$ are the eigenvalues and $x_j(\xi) = \sqrt{2}\sin j\pi\xi$ are the normalized eigenfunctions of the linear differential operator

$$Lv = \frac{\partial^2 v}{\partial \xi^2}; \qquad v(0) = v(1) = 0. \quad (10)$$

The striking similarity between the two solutions is due to the fact that they are both
linear equations and contain linear operators A and L which are symmetric or self-
adjoint. Thus, it is useful to study the general properties of such linear operators.
In what follows, we consider some abstract vector space concepts that are useful
in solving linear equations. The advantage of the abstract formalism is the unified
treatment of various cases that arise in applications.
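
As a concrete illustration of Eqs. (5)–(9), the truncated eigenfunction expansion is easy to evaluate in Mathematica (a minimal sketch; the initial condition u0(ξ) = ξ(1 − ξ) and the truncation at 10 terms are arbitrary choices for this example):

xj[j_, ξ_] := Sqrt[2] Sin[j Pi ξ];   (* normalized eigenfunctions *)
λ[j_] := -j^2 Pi^2;                  (* eigenvalues *)
u0[ξ_] := ξ (1 - ξ);

(* Fourier coefficients <u0, xj> of Eq. (9) *)
c[j_] = Integrate[u0[ξ] xj[j, ξ], {ξ, 0, 1}, Assumptions -> Element[j, Integers]];

(* truncated series solution, Eq. (8) *)
u[ξ_, t_] := Sum[c[j] xj[j, ξ] Exp[λ[j] t], {j, 1, 10}];
u[0.3, 0.05]   (* numerical value of the truncated series *)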
8 Vector space over a field
8.1 Definition of a field
A field F is a collection of objects called scalars (or numbers) such that two binary op-
erations called addition and multiplication are defined with the following properties:
If α ∈ F, β ∈ F, then
(i) α + β ∈ F (addition)
(ii) α.β ∈ F (multiplication)

Further, the following axiomatic laws of addition and multiplication must hold:
1. α + β = β + α and α.β = β.α (Commutativity)
2. (α + β) + γ = α + (β + γ) and (α.β).γ = α.(β.γ) (Associativity)
3. α.(β + γ) = α.β + α.γ (Distributivity)
4. There are two distinct elements, denoted 0 and 1, respectively, with the properties

α + 0 = α, α.1 = α ∀α ∈ F

The element 0 is called the identity element for addition while 1 is called the iden-
tity element for multiplication.
5. For every α ∈ F, ∃ an element x in F ∋ α + x = 0.
x is called the additive inverse of α and is denoted by (−α).
6. For every α ∈ F(α ≠ 0), ∃ an element x in F ∋

α.x = 1.

x is called the multiplicative inverse of α and is denoted as α−1 .

The operations of subtraction and division are merely extensions of addition and mul-
tiplication, respectively. For example, b + (−a) is denoted as b − a and b.a−1 is denoted
as b/a(a ≠ 0).

Examples.
1. The set of rational numbers p/q (p and q integers, q ≠ 0) forms a field.
2. The set of real numbers (positive, negative and zero) forms a field (denoted by ℝ).
3. The set of complex numbers forms a field (denoted by ℂ).
4. The set of integers does not form a field since no multiplicative inverse exists.
5. The set of quaternions forms a skew field (division ring): every nonzero element has a multiplicative inverse, but multiplication is not commutative. Quaternions are hypercomplex numbers of the form a + ib + jc + kd, where a, b, c and d are real and ij = k, jk = i, ki = j, i² = j² = k² = −1 (so that ij = k while ji = −k).

In almost all of our applications, the field is either ℝ or ℂ.


8.2 Definition of an abstract vector or linear space:


An abstract vector or linear space consists of the following:
1. a field F;
2. a set V of objects called vectors;
3. a rule (or operation) called vector addition, which associates with each pair of
vectors u, v in V a vector u + v ∈ V, (u + v is called the sum of u and v) in such a
way that
(a) u + v = v + u (commutative)
(b) (u + v) + w = u + (v + w), u, v, w ∈ V (associative)
(c) there is a unique vector 0 in V (called zero vector) such that

u + 0 = u ∀u ∈ V

(d) For each u in V, there is a unique vector −u in V such that u + (−u) = 0.


4. A rule (or operation) called scalar multiplication, which associates with each
scalar α ∈ F and u ∈ V, a vector αu, called the product of α and u, in such a way
that
(a) 1.u = u ∀u in V;
(b) 0.u = 0.
(1 is multiplicative identity in F and 0 is additive identity in F while 0 is addi-
tive identity in V.)
(c) (αβ)u = α(βu);
(d) α(u + v) = αu + αv
(e) (α + β)u = αu + βu

Note that, as the definition states, a vector space is a composite object consisting of a
field, a set of vectors and two operations with the above special properties.

Examples of vector spaces


1. The space of n-tuples
Let F be ℝ or ℂ and u = (α1 α2 . . . αn ), v = (β1 β2 . . . βn ). Define vector addition
by

u + v = (α1 + β1 , . . . , αn + βn )

and scalar multiplication (γ ∈ F) by

γu = (γα1 , γα2 , . . . , γαn )

Then it can be shown that V has all the above properties. Hence, V is a vector
space and is denoted by ℝn /ℂn .

2. The space of all m × n matrices over the field F


Let V = {set of all m × n matrices over F}. For A ∈ V, B ∈ V, define

(A + B)ij = Aij + Bij (vector addition)

γ∈F

(γA)ij = γAij (scalar multiplication)

This finite-dimensional vector space is denoted by ℝm×n or ℂm×n .


3. Let F = ℝ and V = {continuous functions f (x) defined in the interval [a, b]} =
C[a, b]. Define

(f + g)(x) = f (x) + g(x) ∈ V


(γf )(x) = γf (x) ∈ V

This is a vector space (of infinite dimension).


4. Let F be a field (ℝ or ℂ) and V the set of all polynomials p(x) of the form

$$p(x) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_N x^N, \quad \alpha_j \in F.$$

Then V is a vector space if we define

(p1 + p2 )(x) = p1 (x) + p2 (x)


(γp1 )(x) = γp1 (x), γ∈F

5. The set of all solutions to the system Ax = 0, where A is an m × n matrix and x is


n × 1 vector is a vector space.
6. The set of all solutions to
$$Ly = 0,$$
where L is the linear n-th order differential operator
$$Ly \equiv a_0(x)\frac{d^n y}{dx^n} + a_1(x)\frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_n(x)y,$$
is a vector space.

8.2.1 Subspaces

Definition. Let V be a vector space over the field F. A subspace of V is a subset W of V,


which is itself a vector space over F with the operations of vector addition and scalar
multiplication.

Examples. Let V be a vector space over a field F. Then:



1. The subset containing only the zero vector, {0}, is a subspace. This is called the zero subspace of V.
2. Let V be the set of n-tuples defined over F and

x = (α1 α2 . . . αn ) ∈ V.

Then the set of n-tuples with α1 = 0 is a vector space, which is a subspace of V.


3. Let V be the set of n×n matrices over F and W be the set of symmetric n×n matrices
over F. Then W ⊂ V
4. Let V be the set of all Euclidean vectors with n elements, i. e., V = ℝn and W be
the set of all solutions to

Ax = 0,

where A is m × n and x is n × 1. Then W is a subspace of V.


5. V = set of all continuous functions defined in (a, b), W = space of all polynomials
functions. Then W ⊂V.

Definition. Let V be a vector space over F and S be a set of vectors in V. Then the
subspace spanned by S is defined to be the intersection W of all subspaces of V, which
contain S.

8.2.2 Bases and dimension

Let V be a vector space over F. A subset S = {x1 , x2 , . . . , xn } of V is said to be linearly


dependent if there exists a set of scalars αi (i = 1, 2, . . . , n) ∈ F, not all zero, such that

α1 x1 + α2 x2 + ⋅ ⋅ ⋅ + αn xn = 0.

If no such set of scalars exists, i. e., if αi = 0 for all i, then the set S is said to be linearly
independent.

Definition. Let V be a vector space over F. A basis for V is a linearly independent set
of vectors in V, which spans V.

Definition. The dimension of a vector space is the largest number of linearly independent vectors in that space. If there is no largest number, then we say that the vector space is infinite-dimensional.

Example. The vector space of continuous functions over the unit interval C[0, 1] is of
infinite dimension.

8.2.3 Coordinates

Let V be a finite-dimensional vector space over the field F. Let (x1 , x2 , . . . , xn ) be a basis
for V. Let z be any vector in V. Then we have

z = α1 x1 + α2 x2 + ⋅ ⋅ ⋅ + αn xn (8.1)

This representation is unique. We call (α1 , α2 , . . . , αn ) the coordinates of z w. r. t. the


basis (x1 , x2 , . . . , xn ). Now, let (y1 , y2 , . . . , yn ) be another basis for V and

z = β1 y1 + β2 y2 + ⋅ ⋅ ⋅ + βn yn (8.2)

(β1 , β2 , . . . , βn ) are coordinates of z w. r. t. the y-basis. To find the relationship between


the coordinates, we expand the y-basis in terms of x-basis

$$y_1 = p_{11}x_1 + p_{21}x_2 + \cdots + p_{n1}x_n = [\, x_1 \;\; x_2 \;\; \cdots \;\; x_n \,] \begin{pmatrix} p_{11} \\ p_{21} \\ \vdots \\ p_{n1} \end{pmatrix}$$
$$\vdots$$
$$y_n = p_{1n}x_1 + p_{2n}x_2 + \cdots + p_{nn}x_n$$
$$z = \beta_1(p_{11}x_1 + p_{21}x_2 + \cdots + p_{n1}x_n) + \beta_2(p_{12}x_1 + p_{22}x_2 + \cdots + p_{n2}x_n) + \cdots + \beta_n(p_{1n}x_1 + p_{2n}x_2 + \cdots + p_{nn}x_n)$$
$$= x_1(p_{11}\beta_1 + p_{12}\beta_2 + \cdots + p_{1n}\beta_n) + x_2(p_{21}\beta_1 + p_{22}\beta_2 + \cdots + p_{2n}\beta_n) + \cdots + x_n(p_{n1}\beta_1 + p_{n2}\beta_2 + \cdots + p_{nn}\beta_n) \quad (8.3)$$

Comparing (8.1) and (8.3), we get

$$\alpha_1 = p_{11}\beta_1 + p_{12}\beta_2 + \cdots + p_{1n}\beta_n$$
$$\vdots$$
$$\alpha_n = p_{n1}\beta_1 + p_{n2}\beta_2 + \cdots + p_{nn}\beta_n,$$

i. e., $\alpha = P\beta \;\Rightarrow\; \beta = P^{-1}\alpha$ (since P is nonsingular).

The matrix P is called the change of basis matrix [from x to y basis].

Theorem. Let V be a n-dimensional vector space over the field F, let B and B′ be two
ordered bases of V. Then there is a unique, necessarily invertible, n × n matrix P with
entries in F such that

1.

[z]B = P[z]B′

where zB = coordinate vector w. r. t. basis B and zB′ = coordinate vector w. r. t. ba-


sis B′
2. [z]B′ = P−1 [z]B for every vector z in V.

Example. Consider ℝ² with the standard basis $e_1 = \binom{1}{0}$ and $e_2 = \binom{0}{1}$. A second basis is $y_1 = \binom{2}{1}$ and $y_2 = \binom{5}{2}$. The coordinates of the vector $z = \binom{-1}{1}$ in the standard basis, $\alpha = \binom{-1}{1}$, and in the second (y) basis, $\beta = \binom{7}{-3}$, are related through the matrices $P = \begin{pmatrix} 2 & 5 \\ 1 & 2 \end{pmatrix}$ and $P^{-1} = \begin{pmatrix} -2 & 5 \\ 1 & -2 \end{pmatrix}$, i. e., $\alpha = P\beta$.

An important point to note is that once a set of basis vectors is selected, coordi-
nates can be defined and all algebraic operations in the abstract vector space (of finite
dimension) can be reduced to operations on matrices and n-tuples.
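
For the example above, the change of basis is a one-line matrix computation (a minimal sketch in Mathematica):

y1 = {2, 1};  y2 = {5, 2};
P = Transpose[{y1, y2}];      (* columns of P are the new basis vectors *)

alpha = {-1, 1};              (* coordinates of z in the standard basis *)
beta = Inverse[P] . alpha     (* {7, -3}: coordinates in the y-basis *)

beta[[1]] y1 + beta[[2]] y2   (* recovers z = {-1, 1} *)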

Problems
1. Consider the vector space of polynomials of degree ≤ 3. Are the following four a linearly independent set?
$$1 - 2t + t^2 - 3t^3, \quad -2 + t - 4t^2 + 5t^3, \quad -1 - 4t - 5t^2 + t^3, \quad 3 + t - 4t^2 + 3t^3$$

2. Consider the vector space of 3 × 3 symmetric matrices. What is its dimension? Find
a basis.
3. Consider the vector space of complex numbers over the real field. What is its di-
mension? Consider the vector space of complex numbers over the complex num-
ber field. What is its dimension?
4. Given the vector space ℝ4 , suppose xT = (a, b, c, d) where a, b, c, d ∈ ℝ. Consider
the subset with a + c = 0, b = 3d. Is this subset a vector space?
5. Consider a linear homogeneous n-th order differential equation with constant co-
efficients. Consider the space of polynomials of degree ≤ N where N is arbitrary
but fixed. By operating in the coordinate space show that no finite polynomial can
ever be a solution.
6. Let V be the vector space generated by the polynomials
$$p_1 = x^3 - 2x^2 + 4x + 1, \quad p_2 = x^3 + 6x - 5, \quad p_3 = 2x^3 - 3x^2 + 9x - 1, \quad p_4 = 2x^3 - 5x^2 + 7x + 5.$$

Find a basis and dimension of V.



7. Let y1 , y2 , . . . yn be independent vectors in a vector space V. Let w ∈ V be given by

w = α1 y1 + α2 y2 + ⋅ ⋅ ⋅ + αn yn

where αi are constants (scalars in the field). Show that the representation of w is
unique.
8. Let U be the vector space of all 2 × 3 matrices over field ℝ.
(a) Determine a basis for U.
(b) Determine if the following matrices in U are dependent or independent:

$$A = \begin{pmatrix} 1 & 2 & -3 \\ 4 & 0 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 3 & -4 \\ 6 & 5 & 4 \end{pmatrix}, \quad C = \begin{pmatrix} 3 & 8 & -11 \\ 16 & 10 & 9 \end{pmatrix}.$$
9 Linear transformations
9.1 Definition of a linear transformation
Recall from calculus the definition of a function. A function consists of the following:
1. a set X called the domain of the function
2. a set Y called the codomain of the function
3. a rule (or correspondence) f , which associates with each element x of X a single
element f (x) of Y.

We write f : X → Y to indicate the function. Figure 9.1 shows the key features of a
function schematically.

Figure 9.1: Schematic diagram illustrating the domain and codomain of a function or a linear trans-
formation.

The function f : X → Y is said to be one-to-one or injective if different elements of X have distinct images, i. e.,

$$x \neq x' \;\Rightarrow\; f(x) \neq f(x'), \quad \text{or equivalently,} \quad f(x) = f(x') \;\Rightarrow\; x = x'.$$

f : X → Y is onto or surjective if every y ∈ Y is the image of some x ∈ X; that is, the range of f (or image of f) = {f(x) : x ∈ X} ⊂ Y consists of all elements y of Y such that y = f(x) for some x ∈ X, and f is onto when the range of f is all of Y. If f is both one-to-one and onto, i. e., the inverse function f⁻¹ : Y → X exists, then we say f is a bijection.
Definition (Linear transformation). Let V and W be vector spaces over the field F. A lin-
ear transformation from V into W is a function T from V into W such that

T(αu + v) = αT(u) + T(v) = αTu + Tv

for all u and v in V and α in F.


Examples.
1. Let V be any vector space. Then the identity transformation I defined by

Iu = u,

is a linear transformation from V into V.


2. Let V be the vector space of n × 1 (column) matrices defined over F and let A be a
fixed m × n matrix over F. Then the function T on V defined by

T(u) = Au

is a linear transformation from V into the space W of m × 1 matrices over F, i. e.,


T : V → W or ℝn → ℝm is a linear transformation.
3. Let F be a field and V be the space of polynomial functions, f , given by

$$f(x) = c_0 + c_1 x + \cdots + c_k x^k.$$

Let $(Df)(x) = c_1 + 2c_2 x + \cdots + k c_k x^{k-1}$. Then D is a linear transformation from V into V—the differentiation transformation.
4. Let V be the vector space of polynomials in t defined over the field ℝ and T be
integration operation from 0 to 1, i. e.,

T:V→ℝ

For any f ∈ V, define

$$Tf = \int_0^1 f(t)\, dt.$$

This is a linear transformation.


5. Let V be the space of all m × n matrices defined over a field F, let P be a fixed m × m
matrix over F and let Q be a fixed n × n matrix over F. Define T : V → V by

T(A) = PAQ.

Then T is a linear transformation from V into V.

Lemma 9.1. If T is a linear transformation, then

T(0v ) = 0w

Proof. Since T(u + v) = T(u) + T(v), let u = 0v and v = 0v



$$T(0_v + 0_v) = T(0_v) + T(0_v) \;\Rightarrow\; T(0_v) = T(0_v) + T(0_v) \;\Rightarrow\; T(0_v) = 0_w.$$

Lemma 9.2. If T : V → W is a linear transformation, then

T(α1 u1 + α2 u2 + ⋅ ⋅ ⋅ + αn un ) = α1 T(u1 ) + α2 T(u2 ) + ⋅ ⋅ ⋅ + αn T(un )

Definition. If V is a vector space over the field F, a linear operator on V is a linear


transformation from V into V.

Examples.
1. Let V = ℝ2 and define T(x1 , x2 ) = (x2 , x1 ), then T is a linear operator on V
2. Let V = space of 2 × 2 matrices defined on ℝ. For A ∈ V, define T(A) = AB − BA
where B is a fixed 2 × 2 matrix, e. g.,

1 2
B=( )
3 4

Then T is a linear operator.

9.2 Matrix representation of a linear transformation


Let V and W be vector spaces over F. If V and W are finite-dimensional, then any linear
transformation from V into W has a matrix representation. Let e1 , e2 , . . . , en be a basis
for V and f1 , f2 , . . . , fm be a set of basis vectors for W. Consider

$$Te_1 = a_{11}f_1 + a_{21}f_2 + \cdots + a_{m1}f_m$$
$$Te_2 = a_{12}f_1 + a_{22}f_2 + \cdots + a_{m2}f_m$$
$$\vdots$$
$$Te_n = a_{1n}f_1 + a_{2n}f_2 + \cdots + a_{mn}f_m,$$

aij ∈ F. These equations contain the essential information about T, i. e., they tell us
how T transforms each of the basis vectors (e1 , e2 , . . . , en ). Now, let u ∈ V be an arbi-
trary vector. Then

u = α1 e1 + ⋅ ⋅ ⋅ + αn en

where $\alpha_i \in F$. We call the n × 1 column vector

$$\alpha = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix}$$

the coordinate vector of u w. r. t. the e-basis. Now, let

$$Tu = w \in W, \qquad w = \beta_1 f_1 + \cdots + \beta_m f_m;$$
$$\beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_m \end{pmatrix}, \quad \beta_j \in F,$$

is the coordinate vector of the image of u w. r. t. the basis (f1 , f2 , . . . , fm ). Consider

$$Tu = \alpha_1 Te_1 + \cdots + \alpha_n Te_n$$
$$= \alpha_1[a_{11}f_1 + a_{21}f_2 + \cdots + a_{m1}f_m] + \cdots + \alpha_n[a_{1n}f_1 + a_{2n}f_2 + \cdots + a_{mn}f_m]$$
$$= f_1[a_{11}\alpha_1 + a_{12}\alpha_2 + \cdots + a_{1n}\alpha_n] + \cdots + f_m[a_{m1}\alpha_1 + a_{m2}\alpha_2 + \cdots + a_{mn}\alpha_n]$$
$$\Rightarrow\; \beta_1 = a_{11}\alpha_1 + \cdots + a_{1n}\alpha_n, \quad \ldots, \quad \beta_m = a_{m1}\alpha_1 + \cdots + a_{mn}\alpha_n,$$

or $\beta = A\alpha$, where $A = \{a_{ij}\}$ is called the matrix representation of the linear transformation T with respect to the bases e (in V) and f (in W). Thus, if u ∈ V, Tu ∈ W and once we choose a basis for V and W, then the linear transformation T : V → W (in abstract spaces) is equivalent to the transformation T : ℝⁿ → ℝᵐ,

$$\begin{pmatrix} \beta_1 \\ \vdots \\ \beta_m \end{pmatrix} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix},$$

defined by T(u) = Aα = β, where α is the coordinate vector of u and β is the coordinate


vector of Tu.

Examples.
1. Let V = ℝ² and consider the linear operator T(u₁, u₂) = (u₂, u₁). Choose e₁ = (1, 0), e₂ = (0, 1) as a basis for V. Then
$$Te_1 = (0, 1) = 0.(1, 0) + 1.(0, 1), \qquad Te_2 = (1, 0) = 1.(1, 0) + 0.(0, 1)$$
$$\Rightarrow\; A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

2. V = ℝ², T(u₁, u₂) = (u₁ − u₂, 2u₁ + 4u₂). In this case, it is easily seen that
$$A = \begin{pmatrix} 1 & -1 \\ 2 & 4 \end{pmatrix}$$
w. r. t. the standard basis. Another basis,
$$e_1 = \binom{1}{-1}, \qquad e_2 = \binom{1}{-2},$$
gives
$$A = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}.$$

3. Let V = space of all polynomials of degree ≤ 3. Consider the linear operator T : V → V represented by the differentiation operator d/dt. Take (1, t, t², t³) as a basis for V. Now,
$$\frac{d}{dt}(1) = 0 = 0.1 + 0.t + 0.t^2 + 0.t^3$$
$$\frac{d}{dt}(t) = 1 = 1.1 + 0.t + 0.t^2 + 0.t^3$$
$$\frac{d}{dt}(t^2) = 2t = 0.1 + 2.t + 0.t^2 + 0.t^3$$
$$\frac{d}{dt}(t^3) = 3t^2 = 0.1 + 0.t + 3.t^2 + 0.t^3$$


$$\Rightarrow\; A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
Let
$$f = \alpha_0 + \alpha_1 t + \alpha_2 t^2 + \alpha_3 t^3.$$
Then
$$\frac{df}{dt} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ 2\alpha_2 \\ 3\alpha_3 \\ 0 \end{pmatrix} = \alpha_1 . 1 + 2\alpha_2 . t + 3\alpha_3 . t^2 + 0 . t^3.$$
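This matrix can be generated mechanically in Mathematica (a sketch; CoefficientList extracts the coordinates of a polynomial with respect to the basis (1, t, t², t³)):

basis = {1, t, t^2, t^3};

(* j-th column = coordinate vector of d/dt applied to the j-th basis element *)
A = Transpose[Table[PadRight[CoefficientList[D[p, t], t], 4], {p, basis}]]
(* {{0,1,0,0}, {0,0,2,0}, {0,0,0,3}, {0,0,0,0}} *)

A . {a0, a1, a2, a3}   (* {a1, 2 a2, 3 a3, 0}, as in the text *)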

4. V as in example 3 above, W = ℝ, T = integration from 0 to 1:
$$\int_0^1 1\, dt = 1, \qquad \int_0^1 t\, dt = \frac{t^2}{2}\Big|_0^1 = \frac{1}{2}, \qquad \int_0^1 t^2\, dt = \frac{1}{3}, \qquad \int_0^1 t^3\, dt = \frac{1}{4}.$$
Thus, T : V → ℝ has the matrix representation
$$A = \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} \end{bmatrix}.$$
If
$$f = \alpha_0 + \alpha_1 t + \alpha_2 t^2 + \alpha_3 t^3 \in V,$$
then
$$Tf = \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} \end{bmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} = \alpha_0 + \frac{\alpha_1}{2} + \frac{\alpha_2}{3} + \frac{\alpha_3}{4}.$$

5. Let V be the vector space of all 2 × 2 real matrices and T be a linear operator on V defined by
$$T(A) = A\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}A.$$
Determine the matrix representation of T.
We choose as basis for V: $e_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $e_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $e_3 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, $e_4 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$.

$$Te_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = 0.e_1 + 1.e_2 - 1.e_3 + 0.e_4$$
$$Te_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = 1.e_1 + 0.e_2 + 0.e_3 - 1.e_4$$
$$Te_3 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} - \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} = -1.e_1 + 0.e_2 + 0.e_3 + 1.e_4$$
$$Te_4 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = 0.e_1 - 1.e_2 + 1.e_3 + 0.e_4$$
$$\Rightarrow\; [T]_{\{e_i\}} = \begin{pmatrix} 0 & 1 & -1 & 0 \\ 1 & 0 & 0 & -1 \\ -1 & 0 & 0 & 1 \\ 0 & -1 & 1 & 0 \end{pmatrix}.$$

As shown in the next section, it may be noted that

range of T: {e2 − e3 , e1 − e4 }, ker T = {e2 + e3 , e1 + e4 }
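
The 4 × 4 matrix representation can be assembled mechanically (a minimal sketch in Mathematica; Flatten gives the coordinates of a 2 × 2 matrix with respect to the basis e1, . . . , e4 in the order used above):

B = {{0, 1}, {1, 0}};
T[X_] := X . B - B . X;

basisV = {{{1, 0}, {0, 0}}, {{0, 1}, {0, 0}},
          {{0, 0}, {1, 0}}, {{0, 0}, {0, 1}}};

(* columns are the coordinate vectors of T[e_j] *)
Transpose[Flatten /@ (T /@ basisV)]
(* {{0,1,-1,0}, {1,0,0,-1}, {-1,0,0,1}, {0,-1,1,0}} *)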

9.2.1 Change of basis

Theorem. Let V be a finite-dimensional vector space over the field F, and let B1 =
(e1 , e2 , . . . , en ), B2 = (u1 , u2 , . . . , un ) be two sets of ordered bases for V. Suppose that T is
a linear operator on V. If P is the n×n transition matrix, which expresses the coordinates
of each vector in V relative to B1 in terms of its coordinates relative to B2 , then

A2 = P−1 A1 P

where Ai is the matrix representation of T w. r. t. the ordered bases Bi (i = 1, 2).

Sketch of the proof. A₁ is the matrix representation of T w. r. t. B₁. Express the new basis in terms of the old one:

$$u_1 = p_{11}e_1 + p_{21}e_2 + p_{31}e_3 + \cdots + p_{n1}e_n$$
$$\vdots$$
$$u_n = p_{1n}e_1 + p_{2n}e_2 + p_{3n}e_3 + \cdots + p_{nn}e_n.$$

If x ∈ V, then

$$x = \alpha_1 e_1 + \cdots + \alpha_n e_n = \alpha_1' u_1 + \cdots + \alpha_n' u_n \;\Rightarrow\; \alpha = P\alpha',$$
$$[Tx] = A_1\alpha = y = \beta_1 e_1 + \cdots + \beta_n e_n \;\Rightarrow\; \beta = A_1\alpha.$$

Now,

$$y = \beta_1' u_1 + \cdots + \beta_n' u_n \;\Rightarrow\; \beta = P\beta'$$
$$\Rightarrow\; P\beta' = A_1 P\alpha' \;\Rightarrow\; \beta' = P^{-1}A_1 P\alpha' \;\Rightarrow\; \beta' = A_2\alpha'$$
$$\therefore\; A_2 = P^{-1}A_1 P \quad (A_2 \text{ and } A_1 \text{ are similar matrices}).$$



9.2.2 Kernel and range of a linear transformation

Definition. Let V and W be vector spaces over the field F and T : V → W be a linear
transformation. Then,

range of T = Image of T = {w ∈ W : w = T(u) for some u ∈ V}


kernel of T = null space of T = {u ∈ V : Tu = 0w }

Theorem. Let T : V → W be a linear mapping, then


1. range T is a subspace of W
2. the kernel of T is a subspace of V

Proof.
1. Let $R_T$ denote the range of T. Let $w_1 \in R_T$ and $w_2 \in R_T$. Then ∃ $u_1, u_2 \in V$ such that
$$Tu_1 = w_1, \quad Tu_2 = w_2.$$
Now consider
$$T(\alpha u_1 + u_2) = \alpha T(u_1) + T(u_2) \quad \text{(since T is linear)} = \alpha w_1 + w_2.$$
Thus $\alpha w_1 + w_2 \in R_T$, and therefore $R_T$ is a subspace of W.
2. Let $N_T$ denote the null space of T. If $u_1, u_2 \in N_T$ and $\alpha \in F$,
$$T(\alpha u_1 + u_2) = \alpha Tu_1 + Tu_2 = 0 \;\Rightarrow\; \alpha u_1 + u_2 \in N_T.$$
∴ $N_T$ is a subspace.

Definition. Let V and W be finite-dimensional vector spaces defined over a field F.


Then

Rank of T = dim(range of T)
Nullity of T = dim(kernel of T) = dim(null space of T)

Theorem. Let T : V → W be a linear transformation from V into W. Suppose that V is


finite-dimensional. Then

Rank(T) + Nullity(T) = dim(V)



Proof. Let u1 , . . . , uk be a basis for NT (null space of T). Then there are vectors
(uk+1 , . . . , un ) in V such that (u1 , . . . , un ) is a basis for V. We shall prove that {Tuk+1 , . . . ,
Tun } is a basis for the range of T. The vectors {Tu1 , . . . , Tun } certainly span the range
of T and since Tuj = 0, j = 1, . . . , k, we see that {Tuk+1 , . . . , Tun } span the range of T.
To see that these vectors are linearly independent, suppose we have scalars αi such
that
$$\sum_{i=k+1}^{n} \alpha_i(Tu_i) = 0 \;\Rightarrow\; T\left(\sum_{i=k+1}^{n} \alpha_i u_i\right) = 0,$$

or the vector $y = \sum_{i=k+1}^{n} \alpha_i u_i$ is in the null space of T. Since $\{u_1, u_2, \ldots, u_k\}$ is a basis for $N_T$, we can express

$$y = \sum_{i=1}^{k} \beta_i u_i.$$

Thus,

$$\sum_{i=1}^{k} \beta_i u_i - \sum_{i=k+1}^{n} \alpha_i u_i = 0.$$

Since the vectors $(u_1, \ldots, u_n)$ are linearly independent ⟹
$$\alpha_{k+1} = \cdots = \alpha_n = 0, \quad \beta_1 = \cdots = \beta_k = 0.$$

Thus, if k is the nullity of T, the fact that $\{Tu_{k+1}, \ldots, Tu_n\}$ form a basis for the range of T implies that
$$\text{rank } T = \dim(\text{range of } T) = n - k.$$
∴ The result.

9.2.3 Relation to linear equations

Ax = b, x ∈ ℝn , b ∈ ℝm

The matrix A may be viewed as the linear transformation A : ℝn → ℝm and the solu-
tion to Ax = b is the preimage of b ∈ℝm . Furthermore, the solution of the associated
homogeneous equation Ax = 0 may be viewed as the kernel of the linear mapping
A :ℝn → ℝm . Figure 9.2 illustrates the kernel and range of a linear transformation
schematically.

Figure 9.2: Schematic diagram illustrating the kernel and range of a linear transformation. The dot
represents the zero vector.

From the above theorem, we have

$$\dim(\ker A) = \dim(\mathbb{R}^n) - \dim(\text{Image of } A) = n - \text{rank}(A).$$

Thus, we have the following theorem.

Theorem. The dimension of the solution space of the linear equations

Ax = 0

is n − r, where r is the rank of A and n is the number of unknowns.
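
This count is easy to check numerically (a minimal sketch in Mathematica; the 2 × 4 matrix below is a hypothetical example, not taken from the text):

A = {{1, 2, 0, 1}, {2, 4, 1, 0}};

MatrixRank[A]          (* r = 2 *)
Length[NullSpace[A]]   (* n - r = 4 - 2 = 2 independent solutions of Ax = 0 *)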

9.2.4 Isomorphism

If V and W are vector spaces over the field F, any one-to-one linear transformation
T of V onto W (i. e., any bijective linear transformation of V into W) is called an iso-
morphism of V into W. If there exists an isomorphism of V into W, we say that V is
isomorphic to W.

Theorem. Every n-dimensional space over F is isomorphic to the space Fn

Proof. Let V be the n-dimensional space over F and let B = {e1 , e2 , . . . , en } be a basis
for V. Define a function T : V → Fn as follows:
If x ∈ V, let Tx be the n-tuple of coordinates (α1 , . . . , αn ) of x relative to the basis B.
Now it is easily verified that T is linear, one-to-one and maps V onto Fn .
∴ The result.
For many purposes one often regards isomorphic vector spaces as being “the
same,” although the vectors and operations in the spaces may be quite different.

9.2.5 Inverse of a linear transformation

Definition. Let V and W be vector spaces over a field F and T : V → W be a linear


transformation. T is said to be singular if ∃ a nonzero vector x ∈ V ∋ Tx = 0w . If the
kernel of T consists of the zero vector in V, then we say that T is nonsingular.

Theorem. T is nonsingular ⟹ T is one-to-one and onto.

Proof. To show that T is one-to-one, let

$$Tx = Ty \;\Rightarrow\; T(x - y) = 0_w \quad \text{(since T is linear)} \;\Rightarrow\; x - y = 0_v \quad \text{(since T is nonsingular)}$$

⟹ x = y. Thus Tx = Ty ⟹ x = y, i. e., T is one-to-one.
To show that T is onto, it is sufficient if we show that if {e1 , . . . , en } is a basis for V
then {Te1 , . . . , Ten } is a basis for W. We claim that Te1 , . . . , Ten are an independent set
of vectors; for suppose

α1 Te1 + ⋅ ⋅ ⋅ + αn Ten = 0w

⟹

T(α1 e1 + ⋅ ⋅ ⋅ + αn en ) = 0w since T is linear

α1 e1 + ⋅ ⋅ ⋅ + αn en = 0v since T is nonsingular

⇒ α1 = ⋅ ⋅ ⋅ = αn = 0 since e1 , . . . , en are independent. Thus, {Te1 , . . . , Ten } are indepen-


dent, which implies that T is onto.

Theorem. T : V → W is an isomorphism if and only if it is non-singular.

Definition. Let V be a vector space defined over field F and T : V → W is a linear


transformation. We say that T is invertible if ∃ a transformation T−1 such that

TT−1 = Iw = identity on W
T−1 T = Iv = identity on V

Theorem. Let V and W be finite-dimensional vector spaces over the field F such that dim
V = dim W. If T is a linear transformation from V into W, the following are equivalent:
1. T is invertible
2. T is nonsingular
3. The range of T is W
4. If {e1 , . . . , en } is a basis for V, then {Te1 , . . . , Ten } is a basis for W.

Theorem. If T is nonsingular, then the inverse function T−1 is a linear transformation


from W onto V.
Proof. T is nonsingular ⇒ T is one-to-one and onto. There is a uniquely determined
inverse function T−1 from W to V such that T−1 T is identity on V and TT−1 is identity
on W. To prove that T−1 is linear, let y1 , y2 ∈ W, c ∈ F. Then ∃ unique xi ∈ V such that

Tx1 = y1 or x1 = T−1 y1
Tx2 = y2 or x2 = T−1 y2

Now,

T(cx1 + x2 ) = cTx1 + Tx2


= cy1 + y2

Thus,

T−1 (cy1 + y2 ) = cx1 + x2


= cT−1 y1 + T−1 y2 ,

which implies that T−1 is linear.

Application
Consider the solution of Ax = b where A is n × n, x, b ∈ℝn . The above theorems can be
used to prove the following theorem.
Theorem (Linear algebraic equations).
1. If the homogeneous system Ax = 0 (A is n × n and x is n × 1) has only the trivial
solution, then the inhomogeneous system has a unique solution for any b ∈ℝn .
2. If Ax = 0 has nonzero solution, then ∃b ∈ℝn for which Ax = b has no solution.
Furthermore, if a solution exists for some b, it is not unique.

Example 9.1. Consider the linear equations

$$u_1 - 2u_2 = b_1$$
$$2u_1 - 4u_2 = b_2,$$

or $Au = b$ with $A = \begin{pmatrix} 1 & -2 \\ 2 & -4 \end{pmatrix}$ and $b = \binom{b_1}{b_2}$.

In this case, the homogeneous equations Au = 0 have the solution

$$u_h = c\, x_1; \quad x_1 = \binom{2}{1},$$

where c is any arbitrary constant. The inhomogeneous equations are inconsistent if $b_2 \neq 2b_1$. When $b_2 = 2b_1 = -6$, the general solution of Au = b may be expressed as

$$u = c_1 x_1 + x_2; \quad x_2 = \binom{1}{2},$$

where c1 is an arbitrary constant. Figure 9.3 shows the solution space of the homoge-
neous and (consistent) inhomogeneous system.

Figure 9.3: The eigenvectors of A and the solution spaces of the homogeneous equation Au = 0 and
inhomogeneous equation Au = b (when they are consistent).

Remark. The vectors x1 and x2 are the eigenvectors of the matrix A, with x1 corre-
sponding to the zero eigenvalue. The solution space of Au = b is a translation of that
of Au = 0 by x2 . Such a space is called an affine space.

Further discussion and proofs of theorems on linear transformations may be


found in the books by Halmos [20], Naylor and Sell [24] and Lipschutz and Lipson
[22].

Problems
1. Determine which of the transformations are linear
(a) T : ℝ2 → ℝ2 defined by T(x, y) = (x + y, x)
(b) T : ℝ2 → ℝ3 defined by T(x, y) = (x + 1, 2y, x + y)

(c) T : C[0, 1] → C[0, 1] defined by
$$g(t) = \int_0^1 K(t, s)f(s)\, ds,$$

where g(t) and f (t) are elements in C[0, 1] and K(t, s) is continuous in [0, 1] ×
[0, 1].
2. Let T be a linear operator on ℝ3 defined by

T(x1 , x2 , x3 ) = (3x1 + x2 + x3 , 2x1 + 4x2 + 2x3 , x1 + x2 + 3x3 )

(a) What is the matrix of T in the standard basis for ℝ3 ?


(b) What is the matrix of T in the ordered basis u1 = (1, −1, 0), u2 = (1, 0, −1) and
u3 = (1, 2, 1)?
(c) Show that T is invertible and give a rule for T−1 like the one which defines T.
(d) Determine the spectral resolution of T and a formula for computing f (T)
where f is any function defined on the spectrum of T.
3. Let T : ℝ4 → ℝ3 be the linear transformation defined by

T(x, y, z, w) = (x − y + z + w, x + 2z − w, x + y + 3z − 3w).

Find a basis and the dimension of (a) the range of T, (b) the kernel of T.
4. Consider the linear operator T on ℝ³ defined by

$$T(x, y, z) = (2x, 4x - y, 2x + 3y - z).$$

(a) Show that T is invertible


(b) Determine formulas for T−1 , T2 and T−2 .
5. Let E be a linear operator on V for which E2 = E (such an operator is called a
projection). Let U be the image of E and W be the kernel. Show that
(a) if u ∈ U, then E(u) = u
(b) if E ≠ I, then E is singular
(c) V = U ⊕ W.
10 Normed and inner product vector spaces
The problem so far is that the abstract vector space has only algebraic structure. In
order for these concepts to be useful, we must add some geometrical structure by in-
troducing the concepts of length of a vector, distance and angle between vectors and
orthogonality. This is done by introducing the concepts of norm and scalar or dot or
inner product.

10.1 Definition of normed linear spaces


A norm on a vector space V is a rule which for every x ∈ V, specifies a real number ‖x‖,
called the norm or length of x such that:
1. ‖x‖ ≥ 0, ‖x‖ = 0 iff x = 0 (positivity)
2. ‖αx‖ = |α|‖x‖, α ∈ ℝ/ℂ (homogeneity)
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality)

In a normed linear space, in addition to length, we can also measure distance between
vectors by defining a distance function

d(x, y) = ‖x − y‖.

Examples.
1. Let V = ℝⁿ and for x ∈ V, define
$$\|x\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}, \quad p \geq 1.$$
It can be shown that this definition satisfies all three rules of a norm:
(a) For p = 1, $\|x\|_1 = \sum_{i=1}^{n} |x_i|$.
(b) For p = 2, $\|x\|_2 = (\sum_{i=1}^{n} |x_i|^2)^{1/2}$, which is the standard Euclidean norm.
(c) For p = ∞, $\|x\|_\infty = \max_{1 \le i \le n} |x_i|$, which is also referred to as the supremum norm.
2. Let V = C[a, b], the space of continuous functions defined on [a, b]. For f(x) ∈ V, define
$$\|f(x)\|_1 = \int_a^b |f(x)|\, dx,$$
$$\|f(x)\|_2 = \left(\int_a^b |f(x)|^2\, dx\right)^{1/2},$$
$$\|f(x)\|_\infty = \sup_{a \le x \le b} |f(x)|.$$
Again, it may be shown that these three definitions satisfy the three rules of a norm. [Note that the vector space C[a, b] is infinite-dimensional.]
If f(x), g(x) ∈ V, the distance functions corresponding to the above norms are given by
$$d_1(f, g) = \int_a^b |f(x) - g(x)|\, dx,$$
$$d_2(f, g) = \left(\int_a^b |f(x) - g(x)|^2\, dx\right)^{1/2},$$
$$d_\infty(f, g) = \sup_{a \le x \le b} |f(x) - g(x)|,$$

which assert the following observations:


(a) If $d_\infty(f, g) = 0$ ⟹ f and g are equal at all x ∈ [a, b], i. e., pointwise or uniform convergence.
(b) If $d_2(f, g) = 0$ ⟹ f and g may not be equal at all x ∈ [a, b], but convergence may be attained in the mean square norm.
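
The p-norms of example 1 are built into Mathematica (a minimal sketch with an arbitrary vector):

x = {3, -4, 1};

Norm[x, 1]          (* 8 *)
Norm[x]             (* Euclidean norm: Sqrt[26] *)
Norm[x, Infinity]   (* 4 *)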

Limits of sequences and convergence in vector spaces


If we consider the sequence

$$S_n = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} = \sum_{k=0}^{n} \frac{1}{k!},$$

then $S_n$ is a rational number for any finite value of n. However,

$$\lim_{n \to \infty} S_n = e \quad \text{(an irrational number)}.$$

Similarly, the functions

$$f_n(x) = \exp(-nx), \quad 0 \le x \le 1; \qquad g_n(x) = \tanh(nx), \quad -1 \le x \le 1,$$

are continuous, i. e., $f_n \in C[0, 1]$ and $g_n \in C[-1, 1]$ for any finite n, but for $n \to \infty$,

$$f_\infty(x) = \begin{cases} 1, & x = 0 \\ 0, & x \neq 0 \end{cases} \;\notin C[0, 1], \qquad g_\infty(x) = \begin{cases} -1, & x < 0 \\ 0, & x = 0 \\ 1, & x > 0 \end{cases} \;\notin C[-1, 1].$$

Thus, the limits of sequences of continuous functions may not be continuous and the
space C[a, b] may not be complete depending on how d(f , g) is defined. Similarly, in
infinite-dimensional vector spaces, the functions (or vectors) may have an uncountable number of discontinuities, as illustrated by the example below.

Dirichlet function, Riemann and Lebesgue integration

Consider the function defined on the closed unit interval [0, 1]:

$$f_D(x) = \begin{cases} 0, & x \text{ is rational} \\ 1, & x \text{ is irrational.} \end{cases}$$

The Riemann integral $\int_0^1 f_D(x)\, dx$ does not exist. However, since the set of rational numbers has zero measure in [0, 1], the Lebesgue integral exists. Further, there is no difference between $f_D(x)$ and the function $\hat{f}(x) = 1 \;\forall x \in [0, 1]$ in the Lebesgue theory of integration. The Lebesgue integral of $f_D(x)$ is

$$\int_0^1 f_D(x)\, dx = 1 \quad \text{(Lebesgue integral)}; \qquad \hat{f}(x) \in C[0, 1] \text{ and } f_D(x) \in \mathcal{L}_2[0, 1].$$

The Fourier coefficients of $f_D$ and $\hat{f}$ are identical and $d_2(\hat{f}, f_D) = 0$. We return to this example in Chapter 21 when we discuss the theory of convergence in function spaces.

10.2 Inner product vector spaces


Definition. Let a V be a vector space defined over a field F(ℝ or ℂ). Suppose that to
each pair of vectors, u, v ∈ V, we assign a scalar ⟨u, v⟩ ∈ F. This function is called an
inner product if it satisfies the following axioms:
1. Linearity in the first component

⟨αu + βw, v⟩ = α⟨u, v⟩ + β⟨w, v⟩; for all u, v, w ∈ V and α, β ∈ F

2. Hermitian symmetry

⟨u, v⟩ = ⟨v, u⟩; for all u, v ∈ V,

the bar denoting complex conjugation.


3. Positive definiteness

⟨u, u⟩ ≥ 0 and ⟨u, u⟩ = 0 iff u = 0

An abstract vector space on which an inner product is defined is called an inner


product space.
232 | 10 Normed and inner product vector spaces

Remarks.
1. If F = ℝ, then the bar denoting complex conjugation is superfluous.
2. Inner product is a generalization to an abstract vector space of the dot (scalar)
product in two and three dimensions.

Examples.
1. Let F = ℝ and V = ℝⁿ. For u, v ∈ V, define
$$\langle u, v \rangle = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = u \cdot v \quad \text{(standard dot product)}.$$
This is an inner product as it satisfies all the axioms. Note that
$$\langle u, u \rangle = u_1^2 + u_2^2 + \cdots + u_n^2.$$
The norm or length of a vector w. r. t. this inner product is
$$\|u\| = +\sqrt{\langle u, u \rangle} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}.$$
The distance between two points (vectors) is defined as d(u, v) = ‖u − v‖. These are the standard dot product and distance function in the Euclidean space ℝⁿ.
2. Let F = ℂ and V = ℂⁿ. For u, v ∈ V, define
$$\langle u, v \rangle = u_1\bar{v}_1 + u_2\bar{v}_2 + \cdots + u_n\bar{v}_n.$$
Then
(a)
$$\langle \alpha u + \beta w, v \rangle = (\alpha u_1 + \beta w_1)\bar{v}_1 + \cdots + (\alpha u_n + \beta w_n)\bar{v}_n = \alpha\langle u, v \rangle + \beta\langle w, v \rangle;$$
(b)
$$\overline{\langle v, u \rangle} = \overline{v_1\bar{u}_1 + \cdots + v_n\bar{u}_n} = \bar{v}_1 u_1 + \cdots + \bar{v}_n u_n = \langle u, v \rangle;$$
(c)
$$\langle u, u \rangle = u_1\bar{u}_1 + \cdots + u_n\bar{u}_n = |u_1|^2 + \cdots + |u_n|^2 > 0 \quad (u \neq 0).$$
The space ℂⁿ with this dot/inner product is referred to as a "Hilbert space."


3. Let F = ℝ and V = ℝⁿ. Let G be any fixed n × n symmetric positive definite matrix. Define
$$\langle u, v \rangle = v^T G u.$$
Then
(a)
$$\langle \alpha u + \beta w, v \rangle = v^T G(\alpha u + \beta w) = \alpha v^T G u + \beta v^T G w = \alpha\langle u, v \rangle + \beta\langle w, v \rangle;$$
(b)
$$\langle u, v \rangle = v^T G u = (v^T G u)^T \;\text{(since it is a scalar)}\; = u^T G^T v = u^T G v \;(\text{since } G^T = G)\; = \langle v, u \rangle;$$
(c)
$$\langle u, u \rangle = u^T G u > 0 \quad (u \neq 0),$$
since G is positive definite. G is called the matrix of the inner product (or metric of the inner product) space. For the standard (Euclidean) inner product in Example 1,
$$G = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} = I.$$

4. Let F = ℝ and V = the space of all continuous real-valued functions on the interval a ≤ t ≤ b. For f(t), g(t) ∈ V, define
$$\langle f, g \rangle = \int_a^b f(t)g(t)\, dt.$$
This satisfies the axioms of an inner product. Note that the space V in this example is infinite-dimensional. This space is denoted by C[a, b]:
$$\|f\| = 0 \;\Rightarrow\; \int_a^b f(t)^2\, dt = 0.$$

If f (t) is continuous, the only way the integral can be zero is f (t) ≡ 0, a ≤ t ≤ b.
This inner product is very useful in applications. Very often, we are interested in
solving nonlinear equations of the form:

N(y) = 0

Since an exact solution is not possible, y(t) is approximated by f (t). The closeness
of this approximation to the exact solution can be found only if an inner product
is defined on the space.
There exist a variety of inner products on the space C[a, b]. For example,
$$\langle f, g \rangle = \int_a^b \rho(t)f(t)g(t)\, dt; \quad \rho(t) > 0 \text{ in } (a, b),$$
is also an inner product. One chooses an inner product that is convenient in a given application.
5. V = space of continuous complex-valued functions on [a, b]. For f(t), g(t) ∈ V, define
$$\langle f, g \rangle = \int_a^b f(t)\,\overline{g(t)}\, dt.$$
It can be shown that this satisfies all the axioms of an inner product.
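
Such inner products are straightforward to implement (a minimal sketch in Mathematica on C[0, 1], using the real inner product of example 4 with two arbitrary functions):

ip[f_, g_] := Integrate[f[t] g[t], {t, 0, 1}];

f[t_] := t;  g[t_] := t^2;

ip[f, g]         (* <f, g> = 1/4 *)
Sqrt[ip[f, f]]   (* ||f|| = 1/Sqrt[3] *)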

Definition. Let V be an inner product space. The length of a vector u ∈ V (also called
the norm of u denoted by ‖u‖) is defined by ‖u‖ = √⟨u, u⟩.

Theorem (Schwarz inequality). Let V be an inner product space and u, v ∈ V. Then
$$|\langle u, v \rangle|^2 \leq \langle u, u \rangle \cdot \langle v, v \rangle.$$

Proof. If v = 0, both sides of the inequality are zero and it is satisfied. Assume v ≠ 0.
Then, by property (3) of inner product,

⟨w, w⟩ ≥ 0, w∈V

Let $w = u - \alpha v$ ⟹

$$\langle u - \alpha v, u - \alpha v \rangle \geq 0$$
$$\Rightarrow\; \langle u, u \rangle - \bar{\alpha}\langle u, v \rangle - \alpha\langle v, u \rangle + \alpha\bar{\alpha}\langle v, v \rangle \geq 0$$
$$\Rightarrow\; \langle u, u \rangle + \alpha\bar{\alpha}\langle v, v \rangle \geq \bar{\alpha}\langle u, v \rangle + \alpha\langle v, u \rangle.$$

Let $\alpha = \frac{\langle u, v \rangle}{\langle v, v \rangle}$ ⟹

$$\langle u, u \rangle + \frac{|\langle u, v \rangle|^2}{\langle v, v \rangle} \geq 2\,\frac{|\langle u, v \rangle|^2}{\langle v, v \rangle}$$
$$\Rightarrow\; \langle u, u \rangle \geq \frac{|\langle u, v \rangle|^2}{\langle v, v \rangle} \;\Rightarrow\; \langle u, u \rangle \cdot \langle v, v \rangle \geq |\langle u, v \rangle|^2.$$

∴ The result.

Definition. The angle between two vectors u, v in an inner product vector space V is
defined by

$$\cos\theta = \frac{|\langle u, v \rangle|}{\|u\| \cdot \|v\|}.$$

Definition. Let V be an inner product space over a field F. Two vectors u, v ∈ V are
said to be orthogonal w. r. t. its inner product if ⟨u, v⟩ = 0.

A vector is said to be normalized if it has unit length, i. e., ‖u‖ = 1.

Remark. If V is an inner product space, we can define (1) distances between vectors
(2) lengths of vectors (3) angles between vectors, i. e., an inner product space has a
geometrical structure.

A vector space in which only distances are defined is called a metric space.
A vector space in which lengths are defined is called a normed linear space. The
schematic diagram of Figure 10.1 shows the relationship between these spaces.

Theorem. Let V be a finite dimensional inner product vector space and {u1 , u2 , . . . , un }
be a set of orthogonal vectors. Then, this set is linearly independent provided it does not
include the zero vector.

Proof. Let v ∈ V be any vector in the subspace spanned by {u₁, u₂, . . . , uₙ}:

$$v = \sum_{i=1}^{n} \alpha_i u_i \;\Rightarrow\; \langle v, u_j \rangle = \left\langle \sum_{i=1}^{n} \alpha_i u_i,\, u_j \right\rangle = \alpha_j \langle u_j, u_j \rangle.$$

If $\langle u_j, u_j \rangle \neq 0$ ⟹

$$\alpha_j = \frac{\langle v, u_j \rangle}{\|u_j\|^2} \;\Rightarrow\; v = \sum_{i=1}^{n} \frac{\langle v, u_i \rangle}{\|u_i\|^2}\, u_i.$$

Now suppose that v = 0 ⟹ $\alpha_j = 0$, j = 1, 2, . . . , n ⟹ the set {u₁, u₂, . . . , uₙ} is linearly independent.

Theorem. Every finite-dimensional inner product space has an orthonormal basis.

Proof. Let {u₁, u₂, . . . , uₙ} be a basis for V. From this basis, we show below (the Gram–Schmidt procedure) how to obtain an orthogonal basis {v₁, v₂, . . . , vₙ}. When each of the vectors in this orthogonal basis is normalized to have unit length,

$$e_i = \frac{v_i}{\|v_i\|}, \quad i = 1, 2, \ldots, n,$$

we obtain an orthonormal basis {e₁, e₂, . . . , eₙ}.

10.2.1 Gram–Schmidt orthogonalization procedure

Given a set of linearly independent vectors (u1 , u2 , . . . , un ), we can construct an or-


thogonal or an orthonormal basis from this set by using the following Gram–Schmidt
procedure:

Let

$$v_1 = u_1, \qquad v_2 = u_2 - \frac{\langle u_2, v_1 \rangle}{\|v_1\|^2}\, v_1.$$

Then

$$\langle v_2, v_1 \rangle = \langle u_2, v_1 \rangle - \frac{\langle u_2, v_1 \rangle}{\|v_1\|^2}\,\langle v_1, v_1 \rangle = 0,$$

i. e., $v_2$ is orthogonal to $v_1$. Let

$$v_3 = u_3 - \sum_{i=1}^{2} \frac{\langle u_3, v_i \rangle}{\|v_i\|^2}\, v_i.$$

Then

$$\langle v_3, v_1 \rangle = \langle u_3, v_1 \rangle - \langle u_3, v_1 \rangle = 0, \qquad \langle v_3, v_2 \rangle = \langle u_3, v_2 \rangle - \langle u_3, v_2 \rangle = 0,$$

i. e., $v_3$ is orthogonal to both $v_1$ and $v_2$. At the k-th step, let

$$v_k = u_k - \sum_{i=1}^{k-1} \frac{\langle u_k, v_i \rangle}{\|v_i\|^2}\, v_i.$$

Then $v_k$ is orthogonal to $v_1, v_2, \ldots, v_{k-1}$. If $v_k = 0$ ⟹

$$u_k = \sum_{i=1}^{k-1} \frac{\langle u_k, v_i \rangle}{\|v_i\|^2}\, v_i,$$

i. e., $u_k$ is a linear combination of $v_1, v_2, \ldots, v_{k-1}$, or equivalently, $u_k$ is a linear combination of $(u_1, u_2, \ldots, u_{k-1})$. But this cannot be true since $(u_1, u_2, \ldots, u_n)$ is linearly independent. Therefore, the above procedure cannot fail and we obtain an orthogonal set.
If {e₁, e₂, . . . , eₙ} is an orthonormal basis and x ∈ V is an arbitrary vector,

$$x = \sum_{i=1}^{n} \alpha_i e_i \;\Rightarrow\; \langle x, e_j \rangle = \left\langle \sum_{i=1}^{n} \alpha_i e_i,\, e_j \right\rangle = \alpha_j \;\Rightarrow\; x = \sum_{i=1}^{n} \langle x, e_i \rangle\, e_i.$$
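
The Gram–Schmidt procedure is implemented in Mathematica by Orthogonalize (a minimal sketch with three arbitrary vectors in ℝ³):

u = {{1, 1, 0}, {1, 0, 1}, {0, 1, 1}};

e = Orthogonalize[u];            (* orthonormal basis from Gram-Schmidt *)
Simplify[Outer[Dot, e, e, 1]]    (* identity matrix: <ei, ej> = δij *)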

Remark. The above procedure also indicates to us how to define the inner product
so that {u1 , u2 , . . . , un } is an orthonormal basis. Thus, we may formulate a theorem as
follows.

Theorem. Let V be a finite dimensional vector space over a field F. Let {u1 , u2 , . . . , un } be
a basis for V. Then, ∃ an inner product on V such that {u1 , u2 , . . . , un } is an orthonormal
basis.

Proof. Let x, y ∈ V be any two vectors. Expand x and y in terms of the basis {uᵢ}:

$$x = \sum_{i=1}^{n} \alpha_i u_i, \qquad y = \sum_{i=1}^{n} \beta_i u_i.$$

Define

$$\langle x, y \rangle = \sum_{i=1}^{n} \alpha_i \beta_i.$$

Now it is obvious that {u₁, u₂, . . . , uₙ} is an orthonormal basis w. r. t. this inner product.


Suppose that {e1 , e2 , . . . , en } is also a basis for V. Let α′ be the coordinate vector of x
w. r. t. the e-basis. Then we have

α = Pα′ , β = Pβ′

where P = transition matrix. Now,

⟨x, y⟩ = αT β = α′T PT Pβ′

This is the inner product in the e-basis that makes {ui } an orthonormal set. Note that
PT P is a positive definite matrix.

10.3 Linear functionals and adjoints


Definition. Let V be a vector space defined over a field F. A linear functional on V is a
linear transformation from V into F (the field) f : V → F

Examples.
1. Let V = C[a, b] = the space of continuous real-valued functions over the field ℝ. For g(t) ∈ V, define
$$f(g(t)) = \int_a^b g(t)\, dt.$$
This linear functional maps the space V into the real line.
2. Let F = ℝ and V = ℝn and let {e1 , e2 , . . . , en } be a basis for V. Let f : ℝn → ℝ be a lin-
ear functional on V and f (ej ) = aj . Then the matrix of f in the basis {e1 , e2 , . . . , en }
is a row vector [a1 , a2 , . . . , an ]. If x ∈ V is any vector and

$$x = \sum_{j=1}^{n} \beta_j e_j.$$
Then
$$f(x) = f\left(\sum_{j=1}^{n} \beta_j e_j\right) = \sum_{j=1}^{n} \beta_j f(e_j) \quad \text{(since f is linear)} = \sum_{j=1}^{n} \beta_j a_j = \langle \beta, a \rangle.$$
This appears like the standard inner product of x with a fixed vector in V, i. e., $f(x) = \langle \beta, a \rangle$.

Any linear functional f on a finite-dimensional inner product space is “inner product


with a fixed vector in that space,” i. e., f has the form f (x) = ⟨x, y⟩ for some fixed y ∈ V.
This result may be used to prove the existence of the “adjoint” of a linear operator T
on V, this being a linear operator T∗ such that

⟨Tx, y⟩ = ⟨x, T∗ y⟩ for all x, y ∈ V

Through the use of an orthonormal basis, this adjoint operation on linear operators
is identified with the operation of forming the conjugate transpose of a matrix. These
ideas are illustrated below.

Theorem. Given a linear functional f on a finite dimensional inner product space V, ∃


in V a unique vector y ∋ f (x) = ⟨x, y⟩ for all x ∈ V.

Proof. Let {e1 , e2 , . . . , en } be an orthonormal basis for V.



Let

$$y = \sum_{j=1}^{n} \overline{f(e_j)}\, e_j.$$

We shall show that this y is the y of the theorem. Let $\hat{f}$ be the linear functional on V defined by

$$\hat{f}(x) = \langle x, y \rangle = \left\langle x,\, \sum_{j=1}^{n} \overline{f(e_j)}\, e_j \right\rangle \quad \text{for all } x \in V.$$

Then

$$\hat{f}(e_k) = \left\langle e_k,\, \sum_{j=1}^{n} \overline{f(e_j)}\, e_j \right\rangle = \overline{\left\langle \sum_{j=1}^{n} \overline{f(e_j)}\, e_j,\, e_k \right\rangle} = \overline{\overline{f(e_k)}} = f(e_k).$$

Thus,

$$\hat{f}(e_k) = f(e_k) \quad \text{for } k = 1, 2, \ldots, n.$$

Since $\hat{f}$ and f agree on each basis vector, we have $f = \hat{f}$, and hence the theorem is proved. Now suppose that there are two such vectors, say y and z. Then

$$f(x) = \langle x, y \rangle = \langle x, z \rangle \;\Rightarrow\; \langle x, y \rangle - \langle x, z \rangle = 0 \quad \text{for all } x \;\Rightarrow\; \langle x, y - z \rangle = 0.$$

Take $x = y - z$ ⟹

$$\langle y - z, y - z \rangle = 0 \;\Rightarrow\; y - z = 0 \quad \text{or} \quad y = z.$$

Thus, such a vector is unique.



Theorem. For any linear operator T on a finite-dimensional inner product space V, ∃ a


unique linear operator T∗ in V such that

⟨Tx, y⟩ = ⟨x, T∗ y⟩ for all x, y ∈ V

Proof.
1. T∗ exists: Let y ∈ V be a vector. We shall define T∗ y to prove its existence. Now,

f (x) = ⟨Tx, y⟩ ∈ F(field), x, y ∈ V

is a linear functional on V. From the previous theorem, ∃ a unique ŷ in V such that $f(x) = \langle x, \hat{y} \rangle$ for all x in V ⟹ $\langle Tx, y \rangle = \langle x, \hat{y} \rangle$. Since ŷ is uniquely determined by y, we define T∗ as the rule that associates ŷ with each y, i. e.,

$$\hat{y} = T^* y.$$
ŷ = T ∗ y

[Note: T∗ is called the adjoint operator, so that ⟨Tx, y⟩ = ⟨x, T∗ y⟩ for x, y ∈ V].
2. T∗ is a linear operator: Consider

$$\langle x, T^*(\alpha z + \beta w) \rangle = \langle Tx, \alpha z + \beta w \rangle \quad \text{(from the above definition of } T^*\text{)}$$
$$= \overline{\langle \alpha z + \beta w, Tx \rangle} = \overline{\alpha \langle z, Tx \rangle + \beta \langle w, Tx \rangle} = \bar{\alpha}\,\overline{\langle z, Tx \rangle} + \bar{\beta}\,\overline{\langle w, Tx \rangle}$$
$$= \bar{\alpha}\langle Tx, z \rangle + \bar{\beta}\langle Tx, w \rangle = \bar{\alpha}\langle x, T^*z \rangle + \bar{\beta}\langle x, T^*w \rangle = \langle x, \alpha T^*z \rangle + \langle x, \beta T^*w \rangle$$
$$= \langle x, \alpha T^*z + \beta T^*w \rangle.$$

Since this holds for all x ∈ V, $T^*(\alpha z + \beta w) = \alpha T^*z + \beta T^*w$, i. e., T∗ is a linear operator.

Theorem. Let V be a finite-dimensional inner product space with an orthonormal basis {e1, e2, . . . , en}. Let T be a linear operator on V. Then
(a) the matrix of T with respect to the above basis is A = {αij} where

αij = ⟨Tej, ei⟩

(b) the matrix of T∗ with respect to the same basis is A∗ (the conjugate transpose of A).

Proof. Let

$$Te_j = \sum_{i=1}^{n}\alpha_{ij}e_i \tag{10.1}$$

i. e., the expansion of Tej in terms of the basis vectors gives the j-th column of A, and by orthonormality αij = ⟨Tej, ei⟩. Similarly, if T∗ej = ∑ₖ βkj ek, the expansion of T∗ej in terms of the basis gives the j-th column of the matrix B = {βij} of T∗, with βij = ⟨T∗ej, ei⟩. Then

$$\beta_{ij} = \langle T^{*}e_j, e_i\rangle = \overline{\langle e_i, T^{*}e_j\rangle} = \overline{\langle Te_i, e_j\rangle} = \overline{\alpha_{ji}},$$

so that B = A∗. ∴ The result.
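In a computation, the adjoint relation w. r. t. the standard inner product can be verified directly. A minimal Mathematica sketch (ip, a, x, y are illustrative local names, not from the text):

ip[x_, y_] := Conjugate[y].x;          (* ⟨x, y⟩, conjugate-linear in the second slot *)
a = {{1 + I, 2}, {0, 3 - I}};          (* an arbitrary complex matrix *)
x = {1, I}; y = {2 - I, 1};
ip[a.x, y] == ip[x, ConjugateTranspose[a].y]   (* -> True *)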

Remark. If the basis of V is not orthonormal, the relationship between the matrix of
T and T∗ is more complicated than given in the theorem above.

Definition. T is called a self-adjoint operator if T∗ = T. If the field is ℝ, then self-adjointness means that the matrix of T in an orthonormal basis is symmetric. If F = ℂ, then self-adjointness means that the matrix of T is Hermitian, i. e., it is equal to its conjugate transpose.

Remark. Since T∗ is defined by

⟨Tx, y⟩ = ⟨x, T∗ y⟩

if T = T∗ (self-adjoint) ⇒

⟨Tx, y⟩ = ⟨x, Ty⟩

Thus, self-adjointness or the symmetry property of a linear operator or its matrix rep-
resentation very much depends on the definition of inner product (or equivalently,
adjointness depends on the definition of inner product).

Definition. A linear operator T is called normal if it commutes with its adjoint, i. e., TT∗ = T∗T.

Normal operators are a generalization of symmetric operators on real inner product spaces to complex inner product spaces.

Characteristic values
Let T : V → V be a linear operator over a field F. A scalar λ ∈ F is called a characteristic value or eigenvalue of T if Tx = λx for some nonzero x ∈ V. The requirement λ ∈ F is crucial: in some cases, because of the limitation on the field, there are no characteristic values.
If we choose an orthonormal basis for V, the eigenvalues of T and T∗ (adjoint) as well
as the corresponding eigenvectors can be found from the matrix representations. Sup-
pose that {e1 , e2 , . . . , en } is an orthonormal basis for V, and A = [T]e = n×n matrix with
aij ∈ F. Then the eigenvalues of T are given by the algebraic equations

(A − λI)x = 0 (10.2)

and of T∗ by

(A∗ − ηI)y = 0 (10.3)

Since A∗ is the conjugate transpose of A, equations (10.2) and (10.3) ⇒

λ are the roots of det(A − λI) = 0,
η are the roots of det(A∗ − ηI) = 0.

Now,

$$\det(A^{*} - \eta I) = \det\bigl(\overline{A - \bar{\eta}I}\bigr)^{T} = \overline{\det(A - \bar{\eta}I)}$$

so det(A∗ − ηI) = 0 ⟺ det(A − $\bar{\eta}$I) = 0. Thus, if λ is a characteristic value of T, η = $\bar{\lambda}$ is a characteristic value of T∗.
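A quick numerical check of this relation (a minimal Mathematica sketch; the matrix a is an arbitrary example, and the names are local choices):

a = N[{{0, 1 + I}, {-2, 3}}];
Sort[Eigenvalues[ConjugateTranspose[a]]] - Sort[Conjugate[Eigenvalues[a]]] // Chop
(* -> {0, 0}: the eigenvalues of A* are the conjugates of those of A *)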


Consider the eigenvalue problems:

Tx = λx or Ax = λx
T∗y = ηy or A∗y = ηy = $\bar{\lambda}$y.

The following theorem on the nature of the eigenvalues may be stated.

Theorem. Let T : V → V be a linear operator and λ be an eigenvalue of T. Then
1. (a) If T is self-adjoint, i. e., T∗ = T, then λ is real;
   (b) the eigenvectors corresponding to different eigenvalues are orthogonal.
2. If T∗ = T−1 (i. e., T is unitary), then |λ| = 1. [The eigenvalues are located on the unit circle in the complex plane.]
3. If T∗ = −T (T is skew-adjoint), then λ is purely imaginary.
4. If T = S∗S with S nonsingular, then λ is real and positive.

Proof.
1. (a) Let x be an eigenvector, x ≠ 0. Then

$$\lambda\langle x, x\rangle = \langle \lambda x, x\rangle = \langle Tx, x\rangle = \langle x, T^{*}x\rangle = \langle x, Tx\rangle = \langle x, \lambda x\rangle = \bar{\lambda}\langle x, x\rangle$$

Since ⟨x, x⟩ ≠ 0 ⇒ λ = $\bar{\lambda}$, i. e., λ is real.


(b) Let λi, λj be two distinct eigenvalues and xi, xj be the corresponding eigenvectors. Then Txi = λixi ⇒

$$\langle Tx_i, x_j\rangle = \lambda_i\langle x_i, x_j\rangle$$

and also

$$\langle Tx_i, x_j\rangle = \langle x_i, T^{*}x_j\rangle = \langle x_i, Tx_j\rangle = \langle x_i, \lambda_j x_j\rangle = \lambda_j\langle x_i, x_j\rangle \quad (\text{since } \lambda_j \text{ is real})$$

⇒ (λi − λj)⟨xi, xj⟩ = 0, and as λi ≠ λj ⇒ ⟨xi, xj⟩ = 0.
2.

$$\lambda\bar{\lambda}\langle x, x\rangle = \langle \lambda x, \lambda x\rangle = \langle Tx, Tx\rangle = \langle x, T^{*}Tx\rangle = \langle x, x\rangle$$

Since ⟨x, x⟩ ≠ 0 ⇒ λ$\bar{\lambda}$ = |λ|² = 1 ⇒ |λ| = 1. Thus, the eigenvalues of a unitary operator lie on the unit circle in the complex plane.
3.

$$\lambda\langle x, x\rangle = \langle \lambda x, x\rangle = \langle Tx, x\rangle = \langle x, T^{*}x\rangle = \langle x, -Tx\rangle = \langle x, -\lambda x\rangle = -\bar{\lambda}\langle x, x\rangle$$

⇒ λ = −$\bar{\lambda}$ ⇒ λ is purely imaginary. Thus, the eigenvalues of a skew-adjoint operator lie on the imaginary axis.
4.

$$\lambda\langle x, x\rangle = \langle \lambda x, x\rangle = \langle Tx, x\rangle = \langle S^{*}Sx, x\rangle = \langle Sx, Sx\rangle$$

But ⟨x, x⟩ > 0, and ⟨Sx, Sx⟩ > 0 since S is nonsingular and x ≠ 0 imply Sx ≠ 0.
⇒ λ = ⟨Sx, Sx⟩/⟨x, x⟩ is real and positive.
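These eigenvalue locations are easy to spot-check numerically. A minimal Mathematica sketch (the matrices below are arbitrary examples chosen here, not from the text):

herm = {{2, 1 - I}, {1 + I, 3}};
Eigenvalues[herm]                         (* self-adjoint: real eigenvalues 4 and 1 *)
Abs[Eigenvalues[RotationMatrix[Pi/3]]]    (* unitary: {1, 1} *)
Eigenvalues[{{0, 2}, {-2, 0}}]            (* skew-adjoint: {2 I, -2 I} *)
s = {{1, 2}, {0, 1}};
Eigenvalues[ConjugateTranspose[s].s]      (* T = S*S: 3 ± 2 Sqrt[2], both positive *)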

Theorem. Let V be a finite-dimensional inner product space defined over F and let T be a self-adjoint linear operator. Then T has n linearly independent eigenvectors (n = dim V).

Proof. It is sufficient to prove the theorem for a Hermitian matrix A. We need only show that a Hermitian matrix does not have any generalized eigenvectors of rank 2; this implies that there are no generalized eigenvectors of rank ≥ 2, for if there were a GEV of rank k (k ≥ 3), it would generate a chain of GEVs of ranks k, k − 1, . . . , 2, 1. Thus, it is sufficient to rule out the case k = 2. Assume x is a GEV of rank 2 with eigenvalue λ ⇒

(A − λI)2 x = 0, (A − λI)x ≠ 0

We already proved that λ is real ⇒ A − λI is also Hermitian,


$$0 = \langle x, 0\rangle = \langle x, (A-\lambda I)^{2}x\rangle = \langle (A-\lambda I)^{*}x, (A-\lambda I)x\rangle = \langle (A-\lambda I)x, (A-\lambda I)x\rangle$$

⇒ (A − λI)x = 0 from property (iii) of the inner product, contradicting (A − λI)x ≠ 0. ∴ x cannot be a GEV of rank 2.

Corollary. If T is self-adjoint, then ∃ an orthonormal basis w. r. t. which the matrix of T is diagonal, i. e., T can be diagonalized.

Proof. It follows from the previous two theorems. Let λ1, λ2, . . . , λr be the distinct eigenvalues. If r = n, then there are n mutually orthogonal eigenvectors (by part 1(b) above). If r < n, there are repeated eigenvalues. Suppose λi is an eigenvalue of multiplicity mi. We showed that there cannot be any GEV of rank ≥ 2 ⇒ there are mi linearly independent eigenvectors corresponding to λi. Apply the Gram–Schmidt procedure to make these orthogonal; then they are not only orthogonal to each other but also to the eigenvectors of the other eigenvalues.
∴ The result.
If x is an arbitrary vector, then

$$x = \sum_{i=1}^{n}\alpha_i x_i$$

where {x1, x2, . . . , xn} is an orthonormal set of eigenvectors and αi = ⟨x, xi⟩. Define Ej = xjx∗j (note that E∗j = Ej, i. e., Ej is self-adjoint). Then

$$E_j x = \sum_{i=1}^{n}\alpha_i x_j x_j^{*}x_i = \alpha_j x_j \;\Rightarrow\; E_i E_j x = 0 \quad (i \neq j)$$

$$E_j^{2}x = \sum_{i=1}^{n}\alpha_i x_j x_j^{*}x_j x_j^{*}x_i = \alpha_j x_j x_j^{*}x_j = \alpha_j x_j = E_j x$$

$$x = \sum_{i=1}^{n}E_i x \;\Rightarrow\; E_1 + E_2 + \cdots + E_n = I \tag{10.4}$$


$$Tx = \sum_{i=1}^{n}\alpha_i Tx_i = \sum_{i=1}^{n}\alpha_i\lambda_i x_i = \sum_{i=1}^{n}\lambda_i E_i x$$

$$\lambda_1 E_1 + \lambda_2 E_2 + \cdots + \lambda_n E_n = T \tag{10.5}$$

Grouping repeated eigenvalues, (10.4) and (10.5) become E1 + E2 + ⋅ ⋅ ⋅ + Er = I and λ1E1 + ⋅ ⋅ ⋅ + λrEr = T.
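This resolution is easy to verify numerically. The following Mathematica sketch constructs the projections Ej = xjx∗j for an arbitrary Hermitian example and checks (10.4), (10.5) and the function-of-an-operator form Q(T) = ∑ Q(λj)Ej with Q = exp (all names are local choices):

a = N[{{2, 1 - I}, {1 + I, 3}}];
{vals, vecs} = Eigensystem[a];
vecs = Normalize /@ vecs;                            (* orthonormal eigenvectors *)
e = Table[Outer[Times, vecs[[j]], Conjugate[vecs[[j]]]], {j, 2}];  (* Ej = xj xj* *)
Chop[Total[e] - IdentityMatrix[2]]                   (* (10.4): zero matrix *)
Chop[vals[[1]] e[[1]] + vals[[2]] e[[2]] - a]        (* (10.5): zero matrix *)
Chop[MatrixExp[a] - (Exp[vals[[1]]] e[[1]] + Exp[vals[[2]]] e[[2]])]  (* Q = exp *)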

Theorem. Let V be a finite-dimensional vector space and {w1, w2, . . . , wn} be a basis for V. Let Wi = space spanned by wi = {x : x = αwi, α ∈ F}. Then

V = W1 ⊕ W2 ⊕ ⋅ ⋅ ⋅ ⊕ Wn.

This is known as a direct sum decomposition of V.

Invariant subspaces
Let V be a finite-dimensional vector space and T be a linear operator on V. Let W be
a subspace of V. W is called an invariant subspace (w. r. t. T) if T maps W into itself,
i. e., x ∈ W ⇒ Tx ∈ W. A schematic diagram of such invariant subspaces is shown in
Figure 10.2 for V = W1 ⊕ W2 .

Figure 10.2: Schematic of invariant subspaces.

Example. If T is any linear operator on V, then V is invariant under T, as is the zero subspace. The range of T and the null space of T are also invariant under T.

If we can write

V = W1 ⊕ W2 ⊕ ⋅ ⋅ ⋅ ⊕ Wk,

where each Wi (i = 1, 2, . . . , k) is invariant under T, then the decomposition is called an invariant direct-sum decomposition. In order to state the major result of finite-dimensional linear algebra (i. e., the spectral theorem), we need to introduce three additional concepts:
(a) direct sum decomposition
(b) invariant subspaces
(c) projections and orthogonal projections

Direct sum decomposition
Let V be a finite-dimensional vector space and W1, W2, . . . , Wk be subspaces of V. Let x ∈ V be any vector. Suppose that we can expand x as

$$x = \sum_{i=1}^{k}w_i, \qquad w_i \in W_i,$$

where the components wi are uniquely determined. Then we say that V is a direct sum of W1, . . . , Wk and write

V = W1 ⊕ W2 ⊕ ⋅ ⋅ ⋅ ⊕ Wk

Theorem. dim V = dim W1 + dim W2 + ⋅ ⋅ ⋅ + dim Wk .

Example.
1. Let V = ℝ2, W1 = space spanned by e1 = (1, 0), W2 = space spanned by e2 = (0, 1). If x ∈ V, then

x = α1e1 + α2e2

and for any given x, α1 and α2 are uniquely determined. ∴ V = W1 ⊕ W2.
2. Let V = ℝ3, W1 = span{(1, 0, 0), (0, 1, 0)}, W2 = span{(0, 0, 1)}. Then we can write

V = W1 ⊕ W2

A schematic diagram of the spaces W1 and W2 is shown in Figure 10.3. Here, W1 is the (x, y) plane of dimension 2 and W2 is the z-axis of dimension 1.

Figure 10.3: Schematic illustrating direct sum decomposition.

Projections
Let E : V → V be a linear operator on a finite-dimensional vector space V. Then E is
called a projection if E2 x = Ex for all x in V.

Theorem. Let E be a projection. Let R = range of E = {y : y = Ex} and N = null space of E = {x : Ex = 0}. Then V = R ⊕ N.

Proof. If x ∈ V, then

x = x + Ex − Ex = (x − Ex) + Ex

E(x − Ex) = Ex − E²x = 0

⇒ x − Ex ∈ N. Also, Ex ∈ R. Thus, we can write

x = y + z, y ∈ R, z ∈ N

We will now show that R ∩ N = {0}, i. e., R and N are disjoint. Suppose x ∈ R ∩ N. Then

x = Ey for some y ∈ V ⇒ Ex = E(Ey) = Ey = x

But x is also in N ⇒ Ex = 0 ⇒ x = 0.
∴ V = R ⊕ N.
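As a concrete illustration (a small sketch with an arbitrarily chosen matrix), the following oblique projection on ℝ² satisfies E² = E and splits any x into Ex ∈ R and x − Ex ∈ N:

e = {{1, 0}, {1, 0}};        (* E^2 = E, so E is a projection (not self-adjoint) *)
e.e == e                     (* -> True *)
x = {3, 5};
{e.x, x - e.x}               (* -> {{3, 3}, {0, 2}}: range and null-space parts *)
e.(x - e.x)                  (* -> {0, 0}, i.e., x - Ex ∈ N *)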
Definition. E1 and E2 are called orthogonal projections on the vector space V if E1E2x = E2E1x = 0 for all x ∈ V.
In general, let {E1, E2, . . . , Ek} be orthogonal projections on a finite-dimensional inner product space V. Let

Ejx = xj, j = 1, 2, . . . , k

Then

$$E_j^{2}x = x_j, \;\ldots,\; E_j^{n}x = x_j, \qquad n = 3, 4, \ldots$$

Theorem (Spectral). Let T be a normal (symmetric) operator on a complex (real) finite-dimensional inner product space V. Then there exist orthogonal projections E1, E2, . . . , Er on V and scalars λ1, λ2, . . . , λr such that:
(i) E1 + E2 + ⋅ ⋅ ⋅ + Er = I
(ii) λ1E1 + λ2E2 + ⋅ ⋅ ⋅ + λrEr = T
(iii) EiEj = 0, i ≠ j
(iv) Q(T) = ∑ri=1 Q(λi)Ei, where Q is any function defined on the spectrum of T.



This form of the spectral theorem is a generalization of that stated in Part I. A proof
of this may be found in the book by Halmos [20].
Finally, the following diagonalization theorems can be stated.

Theorem 1. Let T be a self-adjoint operator on a real finite-dimensional inner product space V. Then ∃ an orthonormal basis of V consisting of eigenvectors of T, i. e., T can be represented by a diagonal matrix relative to the orthonormal basis.

Theorem 2. Let T be an orthogonal operator on a real finite-dimensional inner product space V. Then there is an orthonormal basis w. r. t. which T has the following block-diagonal form:

$$\begin{pmatrix} I_p & & & & \\ & -I_q & & & \\ & & R(\theta_1) & & \\ & & & \ddots & \\ & & & & R(\theta_m) \end{pmatrix}, \qquad R(\theta_i) = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}$$

i. e., a block-diagonal matrix whose diagonal consists of entries +1, entries −1 and 2 × 2 rotation blocks R(θ1), . . . , R(θm).

Theorem 3. Let T be a normal operator on a complex finite-dimensional inner product space V. Then ∃ an orthonormal basis of V consisting of eigenvectors of T, i. e., T can be represented by a diagonal matrix w. r. t. an orthonormal basis.

Theorem 4. Let T be an arbitrary operator on a complex finite-dimensional inner product space V. Then T can be represented by a triangular matrix w. r. t. an orthonormal basis of V.

The proof of these theorems may be found in the books by Halmos [20], Naylor
and Sell [24] and Lipschutz and Lipson [22].

Problems
1. Given the vector space V of polynomials of degree at most N defined over the interval (a, b), equip V with the inner product
$$\langle f, g\rangle = \int_{a}^{b}\rho(t)f(t)g(t)\,dt$$

where ρ(t) > 0 for a < t < b. Indicate how one may determine an orthogonal basis
set by applying the Gram–Schmidt process to the basis {1, t, t 2 , . . . , t N }. Find the
first three members for the following cases:
(a) ρ(t) = 1, a = 0, b = 1 (Legendre polynomials on the unit interval)
(b) ρ(t) = 1, a = −1, b = 1 (classical Legendre polynomials)
(c) ρ(t) = exp(−t), a = 0, b = ∞ (Laguerre polynomials)
(d) ρ(t) = exp(−t 2 ), a = −∞, b = ∞ (Hermite polynomials)
1
(e) ρ(t) = [t(1 − t)]− 2 , a = 0, b = 1 (Chebyshev polynomials on the unit interval)
2. Consider the space ℂn of n-tuples of complex numbers. Let W be a nonsingular
n × n matrix. For u, v ∈ℂn , define

(u, v)W = v∗ W∗ Wu

where the superscript ∗ denotes complex conjugate transpose.


(a) Show that this satisfies the requirements of an inner product
(b) Prove the Cauchy–Schwarz inequality for this inner product
(c) Let T be a linear operator in ℂn and A and B be the matrices of T and its adjoint
T∗ with respect to some orthonormal basis. Show that W∗ WB = A∗ W∗ W
3. Let V be a finite-dimensional inner product vector space, and let E be an idem-
potent linear operator on V, i. e., E2 = E. Prove that E is self-adjoint if and only if
EE∗ = E∗ E.
4. Let V be a finite-dimensional complex inner product space and T be a linear op-
erator on V. Prove that T is self-adjoint if and only if ⟨Tu, u⟩ is real for every u
in V.
5. Let V be a finite-dimensional inner product space and T be a linear operator on V.
Show that the range of T∗ is the orthogonal complement of the null space of T.
6. Let V be a finite-dimensional inner product space and T be a self-adjoint operator
on V. Show that if the eigenvalues of T are arranged so that λ1 ≤ λ2 ≤ ⋅ ⋅ ⋅ ≤ λn , then

$$\lambda_1 \leq \frac{(Tu, u)}{(u, u)} \leq \lambda_n$$

Here, (u, v) is the inner product on V.


7. Let V be the vector space of 2 × 2 matrices over ℂ and let

$$M = \begin{pmatrix} 1 & -1 \\ -2 & 2 \end{pmatrix}.$$

Define a linear operator T on V by

T(A) = MA − AM for A ∈ V.

(a) Find a basis and the dimension of the kernel and image of T.
(b) Show that ⟨A, B⟩ = tr(B∗ A), where tr stands for the trace (sum of diagonal
elements) satisfies the requirements of an inner product.
(c) Find the adjoint operator.
(d) Determine the eigenvalues and eigenvectors of T.
8. In the numerical solution of transport and reaction problems in a tube in which
the flow is laminar, we need a set of polynomial trial functions (to approximate
the unknown solution) on the unit interval 0 < r < 1 such that each function
vanishes at r = 1 while its derivative vanishes at r = 0. The functions should also
be orthogonal w. r. t. the weight function

ρ(r) = 4r(1 − r 2 )

and normalized to have unit length (w. r. t. this weight function).


(a) Determine the first four of these functions and plot them.
(b) Determine the first four coefficients in the expansion of unity in terms of these
normalized eigenfunctions.
(c) Verify that the first two coefficients contain 97 % of the energy while the first four contain 99.55 % (sum of squares).
9. Consider the space ℝ2 . Let Dp (p = 1, 2, ∞) be the set of all vectors (points in ℝ2 )
having unit length w. r. t. the p-norm. Make a plot of D1 , D2 and D∞ .
10. Suppose that u1 , u2 , . . . , ur form an orthogonal set of nonzero vectors in a vector
space V. Let w be any vector in V and define

w′ = w − (c1 u1 + c2 u2 + ⋅ ⋅ ⋅ + cr ur )

where

$$c_j = \frac{\langle w, u_j\rangle}{\langle u_j, u_j\rangle}; \qquad j = 1, 2, \ldots, r$$

Show that w′ is orthogonal to each of the vectors u1, u2, . . . , ur. [Remark: The coefficient cj is called the Fourier coefficient and represents the component of w along uj.]
11 Applications of finite-dimensional linear algebra
This chapter is an introduction to some applications of abstract vector space concepts
to problems of interest in transport phenomena, separations and kinetics.

11.1 Weighted dot/inner product in ℝn


Let V be a finite-dimensional inner product space over a field F (Hilbert space) and
{e1 , e2 , . . . , en } be an orthonormal basis for V. Let T : V → V be a linear operator and
[T]e = A = matrix of T in the {ei } basis. When V = ℝn , the standard inner product is
defined by
$$\langle x, y\rangle = \sum_{i=1}^{n}x_i y_i = y^{T}x; \qquad x, y \in V \tag{11.1}$$

Now suppose that T is not self-adjoint with respect to the standard inner product. Then
the question is if we can make T self-adjoint w. r. t. the inner product
$$\langle x, y\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n}g_{ij}x_j y_i = y^{T}Gx \tag{11.2}$$

where G = {gij} is a symmetric positive definite matrix. For T to be self-adjoint, we should have

⟨Tx, y⟩ = ⟨x, Ty⟩

or equivalently,

⟨Ax, y⟩ = ⟨x, Ay⟩ ⇒ yT GAx = yT AT Gx (11.3)

Since this must be true for all x and y in V,


GA = AT G
= AT GT (since G = GT )

GA = (GA)T (11.4)

Thus, the matrix A is symmetric (or T is self-adjoint) with respect to the inner product
(11.2) if the matrix (GA) is symmetric with respect to the standard inner product (11.1).
Now consider the special case in which G is a diagonal matrix


$$G = \begin{pmatrix} g_1 & 0 & \cdots & 0 \\ 0 & g_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & g_n \end{pmatrix}, \qquad g_i > 0, \; i = 1, 2, \ldots, n$$

Then

(GA) = {gi aij } ⇒ (GA) = (GA)T ⇒ aij gi = aji gj .

G is positive definite ⇒ gi > 0 for all i. Thus, if we can choose gi such that gi > 0 and

aij gi = aji gj

then the operator T : V → V (or equivalently the matrix A) is symmetric (self-adjoint) w. r. t. the weighted inner product

$$\langle x, y\rangle = \sum_{i=1}^{n}g_i x_i y_i \tag{11.5}$$

Note that we can do so only if aij and aji are of the same sign. The above generaliza-
tion of the inner product to make a nonsymmetric matrix into a symmetric matrix has
many applications in chemical engineering. We illustrate here the use of weighted dot
product with examples.

Example 11.1. Let

1 2
A=( )
3 2

We note that A is not symmetric w. r. t. the usual inner product. Define

$$\langle x, y\rangle = y^{T}Gx = y^{T}\begin{pmatrix} g_1 & 0 \\ 0 & g_2 \end{pmatrix}x; \qquad g_1, g_2 > 0$$

$$GA = \begin{pmatrix} g_1 & 0 \\ 0 & g_2 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix} = \begin{pmatrix} g_1 & 2g_1 \\ 3g_2 & 2g_2 \end{pmatrix}$$

GA is symmetric if

2g1 = 3g2

Take g1 = 1 ⇒ g2 = 2/3, so

$$\langle x, y\rangle = x_1y_1 + \frac{2}{3}x_2y_2;$$

with respect to this inner product, A is symmetric. Note that this definition satisfies all the rules of inner product.
Eigenvalues and vectors of A:

$$(1-\lambda)(2-\lambda) - 6 = 0 \;\Rightarrow\; \lambda^{2} - 3\lambda - 4 = 0 \;\Rightarrow\; (\lambda-4)(\lambda+1) = 0 \;\Rightarrow\; \lambda = 4, -1$$

$$\lambda_1 = -1 \;\Rightarrow\; \begin{pmatrix} 2 & 2 \\ 3 & 3 \end{pmatrix}\begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = 0 \;\Rightarrow\; x_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$

$$\lambda_2 = 4 \;\Rightarrow\; \begin{pmatrix} -3 & 2 \\ 3 & -2 \end{pmatrix}\begin{pmatrix} x_{21} \\ x_{22} \end{pmatrix} = 0 \;\Rightarrow\; x_2 = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$$

It can be seen that ⟨x1, x2⟩ = (1)(2) + (2/3)(−1)(3) = 2 − 2 = 0. Thus, x1 and x2 are orthogonal w. r. t.
the new inner product.
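The computations in Example 11.1 can be checked in Mathematica (a minimal sketch; wip and the other names are local choices):

a = {{1, 2}, {3, 2}};
g = DiagonalMatrix[{1, 2/3}];
SymmetricMatrixQ[g.a]                 (* -> True: GA = (GA)^T *)
{vals, vecs} = Eigensystem[a];        (* eigenvalues 4 and -1 *)
wip[x_, y_] := y.g.x;                 (* the weighted inner product y^T G x *)
wip[vecs[[1]], vecs[[2]]]             (* -> 0: eigenvectors orthogonal *)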
In fact, it may be shown that any real or complex n × n matrix A that has real
eigenvalues and a complete set of eigenvectors is self-adjoint (symmetric) with respect
to some inner product.
Suppose that A has real eigenvalues and a complete set of eigenvectors. Then, ∃ a
nonsingular matrix T such that

A = TΛT−1 (11.6)

where

$$\Lambda = \text{spectral matrix} = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}$$

(11.6) ⇒

$$A^{T} = (T\Lambda T^{-1})^{T} = (T^{-1})^{T}\Lambda T^{T} \tag{11.7}$$

⇒ If T diagonalizes A in a similarity transform, (T−1 )T or (TT )−1 diagonalizes AT in a


similarity transform. For A to be symmetric w. r. t the inner product,

⟨x, y⟩ = yT Gx (11.8)

we should have

GA = (GA)T = AT G (11.9)

GAG−1 = AT (11.10)

(11.6) and (11.10) ⇒

GTΛT−1 G−1 = AT (11.11)

(11.11) ⇒

GTΛ(GT)−1 = AT

Comparing (11.7) and (11.11) ⇒

$$(T^{-1})^{T} = GT \;\Rightarrow\; G = (T^{-1})^{T}T^{-1} = (T^{T})^{-1}T^{-1} = (TT^{T})^{-1}.$$

Thus, if T is the modal matrix of A and we define

$$\langle x, y\rangle = y^{T}(TT^{T})^{-1}x, \tag{11.12}$$

then A is symmetric w. r. t. this inner product. Since each column of T is determined only up to a constant, there are n arbitrary positive constants in (11.12).
Alternatively, if W is a nonsingular real matrix, then WT W is a symmetric positive
definite matrix. We can define a weighted inner product

⟨x, y⟩W = ⟨Wx, Wy⟩ = yT WT Wx (11.13)

For the special case in which W is a diagonal matrix, equation (11.13) reduces to equa-
tion (11.5).

Example 11.2.

1 2
A=( )
3 2

$$T = \begin{pmatrix} c_1 & 2c_2 \\ -c_1 & 3c_2 \end{pmatrix}, \qquad T^{T} = \begin{pmatrix} c_1 & -c_1 \\ 2c_2 & 3c_2 \end{pmatrix}$$

$$TT^{T} = \begin{pmatrix} c_1 & 2c_2 \\ -c_1 & 3c_2 \end{pmatrix}\begin{pmatrix} c_1 & -c_1 \\ 2c_2 & 3c_2 \end{pmatrix} = \begin{pmatrix} c_1^2+4c_2^2 & -c_1^2+6c_2^2 \\ -c_1^2+6c_2^2 & c_1^2+9c_2^2 \end{pmatrix}$$

$$(TT^{T})^{-1} = \frac{1}{25c_1^2c_2^2}\begin{pmatrix} c_1^2+9c_2^2 & c_1^2-6c_2^2 \\ c_1^2-6c_2^2 & c_1^2+4c_2^2 \end{pmatrix}$$

Take c1 = c2 = c ⇒

$$G = \frac{1}{25c^4}\begin{pmatrix} 10c^2 & -5c^2 \\ -5c^2 & 5c^2 \end{pmatrix} = \frac{2}{5c^2}\begin{pmatrix} 1 & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}$$

Take c² = 2/5 ⇒

$$G = \begin{pmatrix} 1 & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}$$

Then A is self-adjoint w. r. t the inner product

$$\langle x, y\rangle = y^{T}\begin{pmatrix} 1 & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}x = (y_1\ y_2)\begin{pmatrix} 1 & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1y_1 - \frac{1}{2}x_1y_2 - \frac{1}{2}x_2y_1 + \frac{1}{2}x_2y_2$$

Take

$$x = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \;\Rightarrow\; \langle x, x\rangle = x_1^2 - x_1x_2 + \frac{1}{2}x_2^2 = 1$$

$$\langle y, x\rangle = 0 \;\Rightarrow\; y_1 - \frac{1}{2}y_2 = 0 \;\Rightarrow\; y_2 = 2y_1$$

Take y2 = 2 ⇒ y1 = 1, and ⟨y, y⟩ = 1 − 2 + 2 = 1.
∴ e1 = (1, 0), e2 = (1, 2) is an orthonormal basis for ℝ2. In this inner product space, e1 and e2 are orthonormal, i. e., they have unit length and are orthogonal to each other. Clearly, the geometry of this space is quite different from what it is with the standard inner product (see Figure 11.1).

Figure 11.1: Schematic diagram of orthonormal basis vectors w. r. t. weighted inner product defined
by equation (11.12).

11.2 Application of weighted inner product to interacting tank systems
The general form of the model for interacting tank systems (or discretized transient
diffusion model) can be expressed as

$$C\frac{du}{dt} = Qu, \qquad u = u^{0} \;@\; t = 0, \tag{11.14}$$

where Q is a symmetric exchange matrix and C is a diagonal capacitance matrix with all positive diagonal elements, i. e.,

$$C = \begin{pmatrix} \alpha_1 & & \\ & \ddots & \\ & & \alpha_n \end{pmatrix}, \qquad \alpha_i > 0. \tag{11.15}$$

The model equation (11.14) can also be rewritten as

$$\frac{du}{dt} = C^{-1}Qu = Au, \qquad u = u^{0} \;@\; t = 0. \tag{11.16}$$

Here, the matrix A = C−1 Q is not symmetric w. r. t. the usual inner product. However,
if we define the weighted inner product
$$\langle u, v\rangle = v^{T}Cu = \sum_{i=1}^{n}\alpha_i u_i v_i, \quad \text{for all real vectors } u \text{ and } v \tag{11.17}$$

then A is symmetric. This can be verified as follows:



⟨Au, v⟩ = vT CAu = vT CC−1 Qu


= vT Qu

and

⟨u, Av⟩ = (Av)T Cu = vT AT Cu


T
= vT (C−1 Q) Cu
T
= vT QT (C−1 ) Cu
= vT QC−1 Cu (∵ CT = C and QT = Q)
= vT Qu

⟹

⟨Au, v⟩ = ⟨u, Av⟩. (11.18)

Thus, the solution of equations (11.14) or (11.16) can be obtained by taking the inner
product (defined in equation (11.17)) with the eigenvectors xj of A:

d
⟨u, xj ⟩ = ⟨Au, xj ⟩ = ⟨u, Axj ⟩
dt
= λj ⟨u, xj ⟩ (∵ A is symmetric, λj are real)

⟹

$$\langle u, x_j\rangle = \langle u^{0}, x_j\rangle e^{\lambda_j t} \tag{11.19}$$

⟹

$$u(t) = \sum_{j=1}^{n}\frac{\langle u, x_j\rangle}{\langle x_j, x_j\rangle}x_j = \sum_{j=1}^{n}\frac{\langle u^{0}, x_j\rangle}{\langle x_j, x_j\rangle}e^{\lambda_j t}x_j. \tag{11.20}$$

Example 11.3 (Interacting two tank system). As an example, consider the two inter-
acting tank system shown in Figure 11.2.
The model describing this system is given by

dc1
VR1 = −qe c1 + qe c2
dt
dc
VR2 2 = qe c1 − qe c2
dt

with initial condition

c1 = c10 and c2 = c20 @ t = 0,



Figure 11.2: Interacting two tank system.

where VR1 and VR2 are the volumes of the tanks and qe is the exchange flow rate. The
model can be written in matrix-vector form:

$$\frac{dc}{dt} = Ac, \qquad c(t=0) = c_0 = \begin{pmatrix} c_{10} \\ c_{20} \end{pmatrix}, \tag{11.21}$$

$$A = \begin{pmatrix} -\alpha & \alpha \\ \beta & -\beta \end{pmatrix}; \qquad \alpha = \frac{q_e}{V_{R1}}; \quad \beta = \frac{q_e}{V_{R2}}. \tag{11.22}$$

The matrix A is not symmetric w. r. t. the usual inner product unless α = β or VR1 = VR2 .

Standard solution
The eigenvalues of matrix A are

λ1 = 0 and λ2 = −(α + β) (11.23)

with corresponding eigenvectors

$$x_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}; \qquad x_2 = \begin{pmatrix} -\alpha \\ \beta \end{pmatrix} \tag{11.24}$$

and left eigenvectors

yT1 = ( β α ); yT2 = ( −1 1 ). (11.25)

Thus, the solution can be expressed using the standard inner product as

$$c(t) = \sum_{j=1}^{n}\frac{y_j^{T}c_0}{y_j^{T}x_j}e^{\lambda_j t}x_j \tag{11.26}$$

$$= \frac{\beta c_{10}+\alpha c_{20}}{\beta+\alpha}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{c_{20}-c_{10}}{\beta+\alpha}e^{-(\alpha+\beta)t}\begin{pmatrix} -\alpha \\ \beta \end{pmatrix}. \tag{11.27}$$

Solution using capacitance weighted inner product
Suppose that we define

$$\langle u, v\rangle = v^{T}Gu, \qquad G = \begin{pmatrix} g_1 & 0 \\ 0 & g_2 \end{pmatrix} \text{ with } g_i > 0 \tag{11.28}$$

$$= g_1u_1v_1 + g_2u_2v_2. \tag{11.29}$$

Then A is symmetric if

GA = (GA)T .

But

$$GA = \begin{pmatrix} g_1 & 0 \\ 0 & g_2 \end{pmatrix}\begin{pmatrix} -\alpha & \alpha \\ \beta & -\beta \end{pmatrix} = \begin{pmatrix} -\alpha g_1 & \alpha g_1 \\ \beta g_2 & -\beta g_2 \end{pmatrix}$$

⟹ GA is symmetric (in the standard sense) if

αg1 = βg2.

Thus, if we take

g1 = β and g2 = α (11.30)

then A is symmetric with weighted inner product:

$$G = \begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix} \quad\text{and}\quad \langle u, v\rangle = v^{T}Gu = \beta u_1v_1 + \alpha u_2v_2. \tag{11.31}$$

Note that

$$\langle x_1, x_2\rangle = \Bigl\langle \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} -\alpha \\ \beta \end{pmatrix}\Bigr\rangle = \beta(1)(-\alpha) + \alpha(1)(\beta) = 0.$$

Similarly,

⟨c0 , x1 ⟩ = βc10 + αc20


⟨x1 , x1 ⟩ = β + α
⟨c0 , x2 ⟩ = αβ(c20 − c10 )
⟨x2 , x2 ⟩ = αβ(α + β)

Thus, the solution can be expressed as

$$c(t) = \sum_{j=1}^{2}\frac{\langle c_0, x_j\rangle}{\langle x_j, x_j\rangle}e^{\lambda_j t}x_j = \frac{\langle c_0, x_1\rangle}{\langle x_1, x_1\rangle}x_1 + \frac{\langle c_0, x_2\rangle}{\langle x_2, x_2\rangle}e^{-(\alpha+\beta)t}x_2$$

$$= \frac{\beta c_{10}+\alpha c_{20}}{\beta+\alpha}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{c_{20}-c_{10}}{\beta+\alpha}e^{-(\alpha+\beta)t}\begin{pmatrix} -\alpha \\ \beta \end{pmatrix} \tag{11.32}$$

Equation (11.32) is the same solution as equation (11.27), but we do not need to compute the eigenrows. Note that the vectors x1 = (1, 1)T and x2 = (−α, β)T are orthogonal w. r. t. the inner product defined in equation (11.31). The solution for α = 1 and β = 3 is shown in Figure 11.3 for the initial condition c0 = (1, 0)T.

Figure 11.3: Solution diagram of interacting two tank system for α = 1, β = 3, c10 = 1 and c20 = 0.

It can be seen from the plots that at steady state (or after a long time, t → ∞), the solution lies in the space spanned by the eigenvector x1 corresponding to the zero eigenvalue (as expected).

11.3 Application of weighted inner product to monomolecular kinetics
Monomolecular kinetics (Wei–Prater scheme)
Consider the reaction scheme shown in Figure 11.4 between three species A1 , A2 , A3 ,
where kij = rate constant for the formation of species Ai from Aj .

Figure 11.4: Reaction scheme between the species A1 , A2 and A3 .

The conservation equations for a batch reactor are

$$\frac{d[A_1]}{dt} = -(k_{21}+k_{31})[A_1] + k_{12}[A_2] + k_{13}[A_3]$$
$$\frac{d[A_2]}{dt} = k_{21}[A_1] - (k_{12}+k_{32})[A_2] + k_{23}[A_3]$$
$$\frac{d[A_3]}{dt} = k_{31}[A_1] + k_{32}[A_2] - (k_{13}+k_{23})[A_3]$$

with [Ai] = [Ai0] (the initial concentration of Ai) @ t = 0.

Let

$$x_i = \frac{[A_i]}{\sum_{i=1}^{3}[A_i]} = \text{mole fraction of } A_i;$$

then we have

$$\frac{dx}{dt} = Kx$$

where

$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \qquad K = \begin{pmatrix} -(k_{21}+k_{31}) & k_{12} & k_{13} \\ k_{21} & -(k_{12}+k_{32}) & k_{23} \\ k_{31} & k_{32} & -(k_{13}+k_{23}) \end{pmatrix}$$

Generalizing this to n-species (A1 , A2 , . . . , An ), we have

$$\frac{dx}{dt} = Kx, \qquad x(t=0) = x_0 \tag{11.33}$$

where

$$K = \begin{pmatrix} k_{11} & k_{12} & \cdots & k_{1n} \\ k_{21} & k_{22} & \cdots & k_{2n} \\ \vdots & & & \vdots \\ k_{n1} & k_{n2} & \cdots & k_{nn} \end{pmatrix}$$

and

$$k_{ii} = -\sum_{j=1,\, j\neq i}^{n}k_{ji}.$$

Note that each column of K sums to zero. Thus, rank of K < n, and there is a nonzero
solution of Kx = 0. This is the equilibrium solution, denoted by x∗ . We assume that
rank K = n − 1 so that there is no other equilibrium point. Clearly, the matrix K is
nonsymmetric with respect to the usual inner product. A feature crucial to the follow-
ing analysis is the principle of microscopic reversibility or principle of detailed balanc-
ing, which states that at equilibrium the rate of every reaction and its reverse must be
equal, i. e.,

kij [A∗j ] = kji [A∗i ] (11.34)

where [A∗j] is the equilibrium concentration of species Aj. Thus, we have (in terms of mole fractions)

$$k_{ij}x_j^{*} = k_{ji}x_i^{*} \;\Rightarrow\; \frac{k_{ij}}{x_i^{*}} = \frac{k_{ji}}{x_j^{*}}.$$

We define the weighted inner product by

$$\langle u, v\rangle = \sum_{j=1}^{n}\frac{u_j v_j}{x_j^{*}}, \tag{11.35}$$

and show that the matrix K is symmetric w. r. t. this inner product. The i-th element of the vector Ku is given by

$$(Ku)_i = \sum_{j=1}^{n}k_{ij}u_j$$

Thus,

$$\langle Ku, v\rangle = \sum_{i=1}^{n}\Bigl(\sum_{j=1}^{n}k_{ij}u_j\Bigr)\frac{v_i}{x_i^{*}} = \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{k_{ij}u_jv_i}{x_i^{*}}$$

but kij/xi∗ = kji/xj∗, so

$$\langle Ku, v\rangle = \sum_{j=1}^{n}\sum_{i=1}^{n}\frac{k_{ji}u_jv_i}{x_j^{*}}$$

Since the indices i and j are dummy, we interchange i ↔ j without changing the sum:

$$\langle Ku, v\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{k_{ij}u_iv_j}{x_i^{*}} = \sum_{i=1}^{n}\frac{(\sum_{j=1}^{n}k_{ij}v_j)u_i}{x_i^{*}} = \langle u, Kv\rangle$$

Therefore, K is self-adjoint w. r. t this inner product, which implies that all its eigenval-
ues are real and it has a set of n-eigenvectors. Let z1 (= x∗ ), z2 , z3 , . . . , zn be the eigenvec-
tors and λ1 (= 0), λ2 , . . . , λn be the eigenvalues. Now, let us show that K is nonpositive,
i. e., all the nonzero eigenvalues are strictly negative. We consider the quadratic form

$$\langle Ku, u\rangle = \sum_{i=1}^{n}\frac{u_i}{x_i^{*}}\sum_{j=1}^{n}k_{ij}u_j = \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{k_{ij}u_iu_j}{x_i^{*}} = \sum_{i=1}^{n}\Bigl(\sum_{j\neq i}\frac{k_{ij}u_iu_j}{x_i^{*}} + \frac{k_{ii}u_i^{2}}{x_i^{*}}\Bigr) = \sum_{i\neq j}\frac{k_{ij}u_iu_j}{x_i^{*}} - \sum_{i\neq j}\frac{k_{ji}u_i^{2}}{x_i^{*}}$$

Interchanging the indices, adding the two forms and using kij/xi∗ = kji/xj∗ to combine the cross terms gives

$$2\langle Ku, u\rangle = -\sum_{i\neq j}\Bigl(\sqrt{\frac{k_{ij}}{x_j^{*}}}\,u_j - \sqrt{\frac{k_{ji}}{x_i^{*}}}\,u_i\Bigr)^{2} \leq 0 \tag{11.36}$$

Thus, the nonzero eigenvalues are strictly negative.


The solution of

$$\frac{dx}{dt} = Kx, \qquad x(t=0) = x_0$$

is

$$x = \sum_{j=1}^{n}\langle x_0, z_j\rangle e^{\lambda_j t}z_j$$

where the zj are an orthonormal set of eigenvectors and the inner product is defined by equation (11.35). To obtain this form of the solution, take the inner product with zj ⟹
$$\frac{d}{dt}\langle x, z_j\rangle = \langle Kx, z_j\rangle = \langle x, Kz_j\rangle = \langle x, \lambda_j z_j\rangle = \lambda_j\langle x, z_j\rangle$$

$$\langle x, z_j\rangle = \langle x_0, z_j\rangle e^{\lambda_j t}$$

$$x = \sum_{j=1}^{n}\langle x_0, z_j\rangle e^{\lambda_j t}z_j = \langle x_0, x^{*}\rangle x^{*} + \sum_{j=2}^{n}\langle x_0, z_j\rangle e^{\lambda_j t}z_j. \tag{11.37}$$

Using the fact that ⟨x0, x∗⟩ = ∑j x0j = 1 (the mole fractions sum to unity), equation (11.37) simplifies to

$$x = x^{*} + \sum_{j=2}^{n}\langle x_0, z_j\rangle e^{\lambda_j t}z_j. \tag{11.38}$$

The first term is the equilibrium solution, while the nonzero eigenvalues (λj , j =
2, . . . , n) determine the time scales associated with the transient process.
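The self-adjointness of K under the inner product (11.35) hinges on detailed balance. A minimal Mathematica sketch for a three-species example (the rate constants below are arbitrary values chosen here so that the detailed-balance/Wegscheider condition k21k32k13 = k12k23k31 holds; all names are local):

k12 = 1; k21 = 2; k23 = 1; k32 = 2; k13 = 1; k31 = 4;  (* k21 k32 k13 == k12 k23 k31 *)
k = {{-(k21 + k31), k12, k13},
     {k21, -(k12 + k32), k23},
     {k31, k32, -(k13 + k23)}};
xeq = #/Total[#] &[First[NullSpace[k]]]   (* equilibrium mole fractions: {1/7, 2/7, 4/7} *)
g = DiagonalMatrix[1/xeq];                (* weight matrix of the inner product (11.35) *)
SymmetricMatrixQ[g.k]                     (* -> True: K is self-adjoint w.r.t. (11.35) *)
Eigenvalues[N[k]]                         (* one zero eigenvalue, the others negative *)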
The Wei–Prater scheme is an experimental method for determining the eigenval-
ues and eigenvectors. From these experimental values, we can determine the rate con-
stant matrix K.

Let kj = (k1j, . . . , knj)T = j-th column of K. Since kj ∈ ℝn, expand it in terms of the eigenvectors as

$$k_j = \sum_{r=1}^{n}\langle k_j, z_r\rangle z_r.$$

Now consider

$$\langle k_j, z_r\rangle = \sum_{i=1}^{n}\frac{k_{ij}z_{ir}}{x_i^{*}} = \sum_{i=1}^{n}\frac{k_{ji}}{x_j^{*}}z_{ir} = \frac{1}{x_j^{*}}\sum_{i=1}^{n}k_{ji}z_{ir}$$

Since Kzr = λrzr ⇒ ∑i kji zir = λr zjr, we get

$$\langle k_j, z_r\rangle = \frac{1}{x_j^{*}}\lambda_r z_{jr}$$

$$k_j = \sum_{r=1}^{n}\frac{\lambda_r z_{jr}}{x_j^{*}}z_r = \frac{1}{x_j^{*}}\sum_{r=2}^{n}\lambda_r z_{jr}z_r \quad (\text{since } \lambda_1 = 0)$$

$$k_{ij} = \frac{1}{x_j^{*}}\sum_{r=2}^{n}\lambda_r z_{jr}z_{ir} \tag{11.39}$$

All quantities on RHS are experimentally determinable. Thus, we can determine all
the rate constants from the eigenvectors and eigenvalues. An example in which this
procedure was used by Wei and Prater [31] is the catalytic isomerization of butenes.
The reaction network along with the relative values of the rate constants is shown in
Figure 11.5.

Figure 11.5: Reaction network for monomolecular kinetics for catalytic isomerization of butenes.

In this case, the eigenvalues and orthonormal set of eigenvectors are given by

$$\lambda_1 = 0, \; z_1 = x^{*} = \begin{pmatrix} 0.1436 \\ 0.3213 \\ 0.5351 \end{pmatrix}; \qquad \lambda_2 = -9.2602, \; z_2 = \begin{pmatrix} 0.1903 \\ 0.3050 \\ -0.4953 \end{pmatrix}; \qquad \lambda_3 = -19.418, \; z_3 = \begin{pmatrix} 0.2946 \\ -0.3536 \\ 0.0590 \end{pmatrix}.$$

The experimentally observed reaction paths (which include straight line reaction
paths) are shown in Figure 11.6.

Figure 11.6: Experimentally observed reaction paths for catalytic isomerization of butenes, obtained
from Wei and Prater [31].

Other applications of weighted inner product to stage operations, kinetics and weight-
ed least squares are given in the exercises. Additional applications may also be found
in the book by Ramkrishna and Amundson [25].

Problems
1. Consider the vector space of 2-tuples of real numbers over the real field, i. e., ℝ2 .
(a) Find an inner product on ℝ2 w. r. t. which the matrix $A = \begin{pmatrix} -1 & 1 \\ 4 & -4 \end{pmatrix}$ is self-adjoint (symmetric).
(b) Determine the eigenvalues and eigenvectors of A and verify that the eigenvec-
tors are orthogonal w. r. t. the inner product defined in (a).
(c) Determine the normalized eigenvectors and show a schematic plot of these
eigenvectors.
(d) Use the above results to obtain the solution of the initial value problem

$$\frac{du}{dt} = Au; \qquad u(t=0) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

2. Given the matrix

−γ α 0
A=( β −γ α ),
0 β −γ

where α, β and γ are positive and α ≠ β,


(a) how would you define the inner product in ℝ3 so that A is symmetric (self-
adjoint)?
(b) What additional condition need to be satisfied by α, β and γ so that A is neg-
ative definite?
3. The models used to describe the transient behavior of plate, gas absorbers, extrac-
tion units and rectifying and stripping and sections of a distillation column are
essentially the same. As an illustration, consider a staged absorber containing N
stages in which a heavy (liquid) phase and a light (gas) phase pass countercurrent
to each other. The contacting may be assumed to be uniform so that equilibrium
is attained in each stage and the equilibrium relationship is linear (y = Kx).
(a) Show that the dynamic model of the system is of the form

dxj
= αxj−1 − (α + β)xj + βxj+1 j = 1, 2, 3, . . . , N
dt

where α = L/h, β = GK/h, L is the liquid (heavy-phase) flow rate, h is the holdup (of
the heavy phase), G, is the gas (light-phase) flow rate and xj is the composition
of the transferable component in the liquid stream leaving stage j. State any
other assumptions involved.

(b) Assuming that the compositions x0(t) and xN+1(t) = yN+1/K of the entering streams are known, show that the model may be written in the form

dx
= Ax + b(t)
dt

Identify the vectors x, b(t) and the matrix A.


(c) The matrix A is not self-adjoint with respect to the usual inner product in ℝN .
However, show that if we define inner product as

$$\langle x, y\rangle = \sum_{i=1}^{N}\Bigl(\frac{\beta}{\alpha}\Bigr)^{i-1}x_iy_i$$

then A is self-adjoint with respect to this inner product.


(d) Using the above inner product, show that

$$\langle Ax, x\rangle = -\sum_{i=0}^{N}\alpha\Bigl(\frac{\beta}{\alpha}\Bigr)^{i-1}\Bigl[x_i - \frac{\beta}{\alpha}x_{i+1}\Bigr]^{2} < 0$$
α α i+1

Thus, A is negative definite and all its eigenvalues are strictly negative.
(e) Compute the steady-state values for y1 and x3 when N = 3 (three stage process)
and other parameters are given as follows:

L = 5, G = 3, h = 1, K = 1, x0 = 0, y4 = 0.5

Compute and plot the transient response when there is a step change in y4
from 0.5 to 0.7.
4. Consider a well-stirred batch reactor in which the following consecutive reversible
first-order reactions occur:

A1 ⇌ A2 ⇌ ⋅ ⋅ ⋅ ⇌ An

(a) Denote the concentration vector by c = (c1, c2 , . . . , cn ) and identify the matrix
K, which yields the batch reactor equation

dc
= Kc.
dt

(b) Find the inner product on ℝn with respect to which K becomes self-adjoint.
Is K negative or nonpositive? Obtain the solution to the differential equation
above subject to the initial condition c(0) = c0 .
(c) Solve the transient continuous stirred tank reactor equation

$$\frac{dc}{dt} = \frac{1}{\tau}(c_f - c) + Kc, \qquad c(0) = c_0$$

where cf is the feed concentration vector and τ is the holding time.


5. Difference equations of the following type arise in the steady-state analysis of
many stage operations:

ck+1 = Ack + bk ; c0 = α,

where A is a constant n × n matrix and ck , bk and α are n × 1 vectors.


(a) Obtain a formal solution of the above set of difference equations.
(b) Show that the linear second-order difference equation of the form

ck+1 + ack + bck−1 = bk; c0 = α1, c1 = α2

can be written in the matrix form given in (a). Obtain an explicit solution to
this equation.
6. Consider an extraction process in which G kg/s of a light phase containing Y0 kg
solute A per kg of carrier is fed to a cascade containing N ideal stages and con-
tacted with L kg/s of solute-free heavy phase containing XN+1 kg solute A per kg of
carrier.
(a) Formulate the relevant equations.
(b) If the concentration of A in the exit light and heavy streams are YN and X1 ,
respectively, show that the concentration of A in the light phase leaving stage
i is given by

$$Y_i = \frac{Y_0 - KX_1}{1 - \frac{KG}{L}}\Bigl(\frac{KG}{L}\Bigr)^{i} + \frac{KX_{N+1} - \frac{KG}{L}Y_N}{1 - \frac{KG}{L}}$$

where K is the equilibrium constant defined by YN = KXN .


7. A cascade of N stirred tank reactors is arranged to operate isothermally in series.
Each reactor has a volume of V m3 and is well stirred so that the composition of
the reactor effluent is the same as the tank contents. If initially the tanks contain
pure solvent only, and at a time designated t0 , q m3 /h of a reactant A of concen-
tration C0 kg moles/m3 are fed to the first tank, estimate the time required for the
concentration of A leaving the N-th tank to be CN kg moles/m3 . The reaction can
be represented stoichiometrically by the equation

A ⇌ B → C

and all the reactions are first order. There is no B or C in the feed but it may be
assumed that the feed contains a catalyst that initiates the reaction as soon as the
feed enters the first reactor.

8. Consider the three interacting tanks arranged in series as shown in Figure 11.7
below. The volume of tanks VR1 , VR2 and VR3 may not be identical. Similarly, the
exchange flow rate q1 between the tanks 1 and 2 may not be same as q2 between
tanks 2 and 3.

Figure 11.7: Schematic diagram of interacting tanks.

(a) Considering the transient process, formulate the model for concentrations ci
in each tank in the form:
dc
VR = Qc
dt

and show that (i) VR is the diagonal positive definite matrix, and (ii) Q is a
symmetric matrix with zero row and column sum (i. e., has a zero eigenvalue).
(b) Define a matrix A = VR−1Q and show that (i) A is not symmetric w. r. t. the usual inner product

$$\langle x, y\rangle = y^{T}x = \sum_{i=1}^{3}x_iy_i$$

(ii) A is symmetric w. r. t. weighted inner product

⟨x, y⟩ = yT VR x

9. (Weighted least squares)


Consider a set of data points:

[(x1 , y1 ), (x2 , y2 ), . . . , (xN , yN )]



and suppose that you want to fit a linear model

y = α0 + α1 x.

Suppose that the data point j, is given a weight wj (> 0). Formulate the weighted
least squares problem and determine the normal equations to be solved for α0
and α1 .
Part III: Linear ordinary differential equations – initial value problems, complex variables and Laplace transform
12 The linear initial value problem
In earlier chapters, we have discussed the solution of linear initial value problems in
which a square matrix of constant coefficients appeared. In this chapter, we consider
the case of more general form of the linear initial value problem and discuss the rele-
vant theory along with applications.

12.1 The vector initial value problem


Suppose that A(t) is an n×n matrix whose entries {aij (t)} are continuous functions of t.
Then the most general linear initial value problem is defined by the equations

$$\frac{du}{dt} = A(t)u + b(t), \qquad 0 < t < a \tag{12.1}$$
$$u(t=0) = u^{0} \tag{12.2}$$

where

$$u = \begin{pmatrix} u_1(t) \\ u_2(t) \\ \vdots \\ u_n(t) \end{pmatrix}$$

is an n-tuple of real (or complex) valued functions ui ∈ C 1 [0, a], a is a positive constant
and u0 is a constant vector determining the initial state of the system. The forcing
vector b(t) is also an n-tuple of real (or complex) valued function of t. The following
fundamental existence and uniqueness theorem may be stated for the initial value
problem defined by equations (12.1) and (12.2).
Theorem. Consider the IVP defined by equations (12.1) and (12.2) and suppose that

aij (t) ∈ C[0, a], bj (t) ∈ C[0, a], i, j = 1, 2, . . . , n

and let u0 be any vector in ℝn and t be a point in [0, a]. Then there exists one and only one
solution u(t), 0 < t < a, satisfying equations (12.1) and (12.2). This solution is a continu-
ous function of the initial conditions u0 and if aij (t) depend continuously on a parameter,
so does the solution. [For proof of this theorem, see Coddington and Levinson [13].]
We now outline a method for obtaining the solution of the initial value problem
defined by equations (12.1) and (12.2). As discussed previously, since equation (12.1)
is linear, the principle of superposition may be used and the general solution may be
written as

u(t) = uh (t) + up (t), (12.3)


where uh (t) is the solution to the homogeneous initial value problem

$$\frac{du_h}{dt} = A(t)u_h \tag{12.4}$$
$$u_h(t=0) = u^{0} \tag{12.5}$$

and up(t) is a particular solution of the inhomogeneous system

$$\frac{du_p}{dt} = A(t)u_p + b(t) \tag{12.6}$$
$$u_p(t=0) = 0. \tag{12.7}$$

We now consider the homogeneous IVP defined by equation (12.4). The following
properties may be easily established:
1. The set of all solutions to equation (12.4) form a vector space.
2. There are n linearly independent solutions and every solution of equation (12.4)
is expressible as a linear combination of these n solutions.
3. Let {α1 , α2 , . . . , αn } be any set of n linearly independent constant vectors in ℝn /ℂn
and uj (t) be a solution of (12.4) satisfying the initial condition uj (0) = αj . Then the
set {u1 (t), u2 (t), . . . , un (t)} is linearly independent and forms a basis for the solu-
tion space.

Definition. A set of n linearly independent solutions of equation (12.4) is called a fundamental set of solutions. The matrix U(t) = [u1(t) u2(t) . . . un(t)] whose columns form a fundamental set is called a fundamental matrix.

If U(t) is a fundamental matrix for equation (12.4), then every solution is of the
form

uh (t) = U(t)c (12.8)

where c is some constant vector.


Lemma.
(a) The fundamental matrix satisfies the matrix differential equation:

$$\frac{dU}{dt} = A(t)U(t) \tag{12.9}$$

(b)

$$\det U(t) = \det U(\tau)\exp\Bigl\{\int_{\tau}^{t}\operatorname{tr}A(s)\,ds\Bigr\} \tag{12.10}$$

and U(t) is nonsingular at every point in [0, a]. [Here, tr A stands for the trace of the matrix A.]

(c) If B is any constant nonsingular matrix, then V(t) = U(t)B is also a fundamental
matrix and every fundamental matrix may be written in this form.

Note that the specific solution of equations (12.4) and (12.5) is given by

uh (t) = U(t)U(0)−1 u0 . (12.11)

We now consider the inhomogeneous equation (12.6) with homogeneous initial


condition (equation (12.7)).
Theorem. If U(t) is a fundamental matrix of the homogeneous system, a particular solution of equations (12.6) and (12.7) is given by

$$u_p(t) = U(t)\int_{0}^{t}U(s)^{-1}b(s)\,ds \tag{12.12}$$

Proof. We use the variation of parameters technique. Writing

$$u_p(t) = U(t)h(t) \tag{12.13}$$

$$u_p' = U'(t)h(t) + U(t)h'(t) = A(t)U(t)h(t) + U(t)h'(t)$$

Comparing with equation (12.6), i. e., up′ = A(t)up + b(t) = A(t)U(t)h(t) + b(t), gives

$$U(t)h'(t) = b(t) \;\Rightarrow\; h'(t) = U(t)^{-1}b(t)$$

Integrating and using the initial condition h(0) = 0 gives

$$h(t) = \int_{0}^{t}U(s)^{-1}b(s)\,ds$$

and up(t) is given by equation (12.12).


Thus, the general solution of equation (12.1) is given by

$$u(t) = U(t)c + U(t)\int_{0}^{t}U(s)^{-1}b(s)\,ds \tag{12.14}$$

where c is a constant vector. The unique solution of equations (12.1) and (12.2) is given by

$$u(t) = U(t)U(0)^{-1}u^{0} + U(t)\int_{0}^{t}U(s)^{-1}b(s)\,ds. \tag{12.15}$$

We will be using equation (12.15) in many applications.

12.2 The n-th order initial value problem


Consider the n-th order linear differential operator defined by

$$Lu = p_0(t)\frac{d^{n}u}{dt^{n}} + p_1(t)\frac{d^{n-1}u}{dt^{n-1}} + \cdots + p_{n-1}(t)\frac{du}{dt} + p_n(t)u, \qquad 0 < t < a, \tag{12.16}$$
dt dt dt

where pi (t), i = 0, 1, . . . , n are real (or complex) valued functions of a real variable t.
We assume that L is a regular differential operator, i. e., p0(t) ≠ 0 for 0 ≤ t ≤ a and
pi (t) ∈ C[0, a]. Suppose that u(t) ∈ C n [0, a], the class of n-times differentiable func-
tions defined on the interval [0, a]. Then the most general form of a linear n-th order
initial value problem is given by

$$Lu = f(t), \qquad 0 < t < a \tag{12.17}$$
$$u(0) = \alpha_0,\; u'(0) = \alpha_1,\; \ldots,\; u^{[n-1]}(0) = \alpha_{n-1}. \tag{12.18}$$

Defining

$$u_1(t) = u(t), \quad u_2(t) = \frac{du_1}{dt} = \frac{du}{dt}, \quad \ldots, \quad u_n(t) = \frac{du_{n-1}}{dt} = \frac{d^{n-1}u}{dt^{n-1}}, \tag{12.19}$$

equations (12.17) and (12.18) may be written in the vector form of equations (12.1) and (12.2) with u0 = α and

$$u = \begin{pmatrix} u(t) \\ u'(t) \\ \vdots \\ u^{[n-1]}(t) \end{pmatrix}, \qquad A(t) = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -\frac{p_n(t)}{p_0(t)} & -\frac{p_{n-1}(t)}{p_0(t)} & \cdots & \cdots & -\frac{p_1(t)}{p_0(t)} \end{pmatrix}$$

$$b(t) = \frac{f(t)}{p_0(t)}\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} = \frac{f(t)}{p_0(t)}\,e_n \tag{12.20}$$

Thus, the n-th order IVP is a special case of the vector initial value problem. In fact,
every linear IVP (e. g., coupled higher-order scalar equations) may be written in the
form given by equation (12.1).
The solutions of Lu = 0 are precisely those vectors (functions) whose image under
L is zero, i. e., the kernel of L. It is easily shown that ker L is of dimension n and a basis
for ker L consists of n-linearly independent solutions.

Definition. Any vector whose components form a basis for ker L is called a fundamen-
tal vector for Lu = 0. If ψT = [ψ1 (t) ψ2 (t) . . . ψn (t)] is a fundamental vector, the
general solution of Lu = 0 is of the form u = ψT c, where c is a constant vector.

The fundamental matrix of the companion vector equation is called the Wronskian matrix of equation (12.17):

$$K(\psi(t)) = \begin{pmatrix} \psi_1(t) & \cdots & \psi_n(t) \\ \psi_1'(t) & \cdots & \psi_n'(t) \\ \vdots & & \vdots \\ \psi_1^{[n-1]}(t) & \cdots & \psi_n^{[n-1]}(t) \end{pmatrix}$$

Similarly, the Wronskian vector of a solution ψ(t) is defined by

$$k(\psi(t)) = \begin{pmatrix} \psi(t) \\ \psi'(t) \\ \vdots \\ \psi^{[n-1]}(t) \end{pmatrix}.$$

Theorem. Given a linear n-th order homogeneous equation,

$$Lu = \sum_{j=0}^{n}p_j(t)\frac{d^{n-j}u}{dt^{n-j}} = 0, \tag{12.21}$$

let

$$u_1 = k(\psi_1(t)) = \begin{pmatrix} \psi_1 \\ \frac{d\psi_1}{dt} \\ \vdots \\ \frac{d^{n-1}\psi_1}{dt^{n-1}} \end{pmatrix}, \quad \ldots, \quad u_n = k(\psi_n(t)) = \begin{pmatrix} \psi_n \\ \frac{d\psi_n}{dt} \\ \vdots \\ \frac{d^{n-1}\psi_n}{dt^{n-1}} \end{pmatrix}$$

be a set of solutions to the associated vector equation, i. e., ψ1, ψ2, . . . , ψn are solutions of equation (12.21). Then a necessary and sufficient condition that these solutions be linearly independent is that the Wronskian (defined as the determinant of the Wronskian matrix) be nonzero.

Proof. Suppose that ψ1, ψ2, . . . , ψn are linearly independent, i. e., the only solution of

c1ψ1(t) + c2ψ2(t) + ⋅ ⋅ ⋅ + cnψn(t) = 0

is c1 = c2 = ⋅ ⋅ ⋅ = cn = 0. Differentiating both sides i times (i = 1, 2, . . . , n − 1), we get

$$c_1\psi_1^{(i)}(t) + c_2\psi_2^{(i)}(t) + \cdots + c_n\psi_n^{(i)}(t) = 0, \qquad i = 1, \ldots, n-1$$

Since the only solution to the above system of n equations is the trivial one, we have

$$W(t) = \begin{vmatrix} \psi_1(t) & \psi_2(t) & \cdots & \psi_n(t) \\ \psi_1'(t) & \psi_2'(t) & \cdots & \psi_n'(t) \\ \vdots & & & \vdots \\ \psi_1^{[n-1]}(t) & \psi_2^{[n-1]}(t) & \cdots & \psi_n^{[n-1]}(t) \end{vmatrix} \neq 0$$

Conversely, if W(t) = 0, then there is a nontrivial solution and ψ1, ψ2, . . . , ψn are linearly dependent.

Theorem. The Wronskian W(t) is either identically zero or is never zero, i. e., it never
changes sign.

Proof. By definition,

$$W(t) = \begin{vmatrix} \psi_1 & \psi_2 & \cdots & \psi_n \\ \psi_1' & \psi_2' & \cdots & \psi_n' \\ \vdots & & & \vdots \\ \psi_1^{[n-1]} & \psi_2^{[n-1]} & \cdots & \psi_n^{[n-1]} \end{vmatrix}$$

Using the rule for differentiation of a determinant (differentiate one row at a time and add the resulting determinants), every term except the last contains a repeated row and therefore vanishes:

$$\frac{dW(t)}{dt} = \begin{vmatrix} \psi_1 & \psi_2 & \cdots & \psi_n \\ \psi_1' & \psi_2' & \cdots & \psi_n' \\ \vdots & & & \vdots \\ \psi_1^{[n-2]} & \psi_2^{[n-2]} & \cdots & \psi_n^{[n-2]} \\ \psi_1^{[n]} & \psi_2^{[n]} & \cdots & \psi_n^{[n]} \end{vmatrix}$$

Now, Lψj = 0 ⇒

$$\psi_j^{[n]} = -\frac{p_1}{p_0}\psi_j^{[n-1]} - \frac{p_2}{p_0}\psi_j^{[n-2]} - \cdots - \frac{p_{n-1}}{p_0}\psi_j' - \frac{p_n}{p_0}\psi_j$$

Substitute this in the last row and perform the following (n − 1) row operations:
(i) multiply row 1 by pn/p0 and add to the last row;
(ii) multiply row 2 by pn−1/p0 and add to the last row;
. . .
(n − 1) multiply row (n − 1) by p2/p0 and add to the last row.

Only the term −(p1/p0)ψj[n−1] survives in the last row, so that

$$\frac{dW}{dt} = -\frac{p_1}{p_0}W(t) \;\Rightarrow\; W(t) = W(t_0)\exp\Bigl\{-\int_{t_0}^{t}\frac{p_1(s)}{p_0(s)}\,ds\Bigr\}$$

If W(t0) = 0 ⇒ W(t) ≡ 0, and if W(t0) ≠ 0, then W(t) never vanishes.
∴ The result.

12.2.1 The n-th order inhomogeneous equation

Consider the inhomogeneous equation

Lu = f (t), (12.22)

or its companion matrix equation

$$\frac{du}{dt} = A(t)u + b(t). \tag{12.23}$$
Theorem. Let L be regular and f(t) ∈ C[t0, b], t0 < b < ∞. Then the solution of the IVP

Lu = f(t), k(u(t0)) = α

is given by

$$u(t) = [\psi(t)]^{T}c + [\psi(t)]^{T}\int_{t_0}^{t}K(\psi(s))^{-1}e_n\,\frac{f(s)}{p_0(s)}\,ds \tag{12.24}$$

where

0
ψ1 (t)
0
ψ2 (t)
( . )
en = (
(
),
) ψ(t) = ( . )
.
.
0
ψn (t)
( 1 )

is a fundamental vector of the homogeneous system and K(ψ(t)) is the Wronskian matrix.
The constant vector c is determined from the algebraic equations

T
α0 = [ψ(t0 )] c
T
α1 = [ψ′ (t0 )] c
.
.
T
αn−1 = [ψ[n−1] (t0 )] c

or

α = K[ψ(t0 )]c or c = K[ψ(t0 )] α.


−1

Proof. The proof of equation (12.24) follows directly from that for the vector equation du/dt = A(t)u + b(t), whose solution is given by

$$u(t) = U(t)c + U(t)\int_{t_0}^{t}U(s)^{-1}b(s)\,ds.$$

Now, let

$$b(s) = \frac{f(s)}{p_0(s)}\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} = e_n\,\frac{f(s)}{p_0(s)}$$

and use the fact that the fundamental matrix for the scalar problem is the Wronskian matrix, i. e., U(s) = K(ψ(s)), and take only the first component of the solution u(t).



A special case of equation (12.24) that is of interest in many applications is n = 2, for which it may be written as

$$u(t) = c_1\psi_1(t) + c_2\psi_2(t) + \int_{t_0}^{t}\frac{\psi_1(s)\psi_2(t) - \psi_1(t)\psi_2(s)}{W(s)}\,\frac{f(s)}{p_0(s)}\,ds \tag{12.25}$$

where W(s) = ψ1(s)ψ2′(s) − ψ1′(s)ψ2(s) is the Wronskian.

Example 12.1. Consider the second-order equation

u′′ + u = 2 sin t.

Two linearly independent solutions of the homogeneous equation are given by

$$\psi_1 = \sin t, \quad \psi_2 = \cos t, \qquad K(\psi) = \begin{pmatrix} \sin t & \cos t \\ \cos t & -\sin t \end{pmatrix}, \qquad W(t) = -1 \neq 0$$

From equation (12.25), a particular solution is

$$u_p(t) = \int_{0}^{t}(\sin t\cos s - \sin s\cos t)\,2\sin s\,ds = \sin t - t\cos t$$

Thus, the general solution is given by

u(t) = c1 sin t + c2 cos t + sin t − t cos t.
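As a quick consistency check, Mathematica's built-in solver reproduces this result (a minimal sketch):

DSolve[u''[t] + u[t] == 2 Sin[t], u[t], t]
(* the output contains the particular part Sin[t] - t Cos[t] together with the
   homogeneous combination C[1] Cos[t] + C[2] Sin[t]; the constants may absorb Sin[t] *)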

12.3 Linear IVPs with constant coefficients


In this section, we consider some special cases of the general IVP.
(a) The first special case is of the vector IVP where the matrix A(t) is a constant matrix
and the forcing vector b(t) is also a constant, i. e.,

$$\frac{du}{dt} = Au + b \tag{12.26}$$
$$u = u^{0} \;@\; t = 0 \tag{12.27}$$

In this case, U = eAt is a fundamental matrix. Thus, uh = eAt c and a particular


solution is given by

$$u_p = U(t)\int_{0}^{t}U(s)^{-1}b(s)\,ds = e^{At}\int_{0}^{t}e^{-As}b\,ds = e^{At}\bigl[-A^{-1}e^{-At}b + A^{-1}b\bigr] \quad (\text{if } A \text{ is invertible})$$

Simplifying, we get

$$u_p(t) = -A^{-1}b + A^{-1}e^{At}b \tag{12.28}$$

The general solution is given by

$$u = e^{At}c + A^{-1}e^{At}b - A^{-1}b$$

Applying the initial condition u = u0 @ t = 0 and noting that up(0) = 0 ⇒ c = u0, so

$$u = e^{At}u^{0} + A^{-1}e^{At}b - A^{-1}b \tag{12.29}$$

Another method: The solution to equation (12.26) may also be obtained by determining the steady-state solution and subtracting it to obtain a homogeneous equation. Let z = u − us, us = −A−1b ⇒

$$\frac{dz}{dt} = Az, \qquad z = z^{0} = u^{0} - u_s \;@\; t = 0$$

$$z = e^{At}z^{0} \;\Rightarrow\; u = -A^{-1}b + e^{At}(u^{0} + A^{-1}b) = e^{At}u^{0} + e^{At}A^{-1}b - A^{-1}b$$

Thus, for the case of constant coefficients and constant forcing vector, the solution
is given by

u(t) = eAt [u0 + A−1 b] − A−1 b. (12.30)

[Remark: Since A−1 and eAt commute, the solution given by equations (12.29) and
(12.30) are identical].
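Equation (12.30) translates directly into Mathematica (a minimal sketch; the matrix a, forcing b and initial state u0 below are arbitrary illustrative values, not from the text):

a = {{0, 1}, {-2, -3}}; b = {1, 0}; u0 = {0, 0};
u[t_] = MatrixExp[a t].(u0 + Inverse[a].b) - Inverse[a].b;   (* eq. (12.30) *)
Simplify[D[u[t], t] - (a.u[t] + b)]    (* -> {0, 0}: the ODE is satisfied *)
u[0]                                    (* -> {0, 0} = u0 *)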

(b) The second special case is that of the scalar n-th order IVP with constant coefficients:

$$p_0u^{[n]} + p_1u^{[n-1]} + \cdots + p_{n-1}u' + p_nu = f(t)$$
$$u(0) = \alpha_0,\; u'(0) = \alpha_1,\; \ldots,\; u^{[n-1]}(0) = \alpha_{n-1}.$$

Let u = eλt ⇒ the characteristic equation is given by

$$p_0\lambda^{n} + p_1\lambda^{n-1} + \cdots + p_{n-1}\lambda + p_n = 0$$

Denote the roots of the characteristic equation by λ1, λ2, . . . , λn, and for simplicity assume that they are all distinct. Then

$$\psi_j(t) = e^{\lambda_j t}, \qquad j = 1, 2, \ldots, n$$

are linearly independent solutions. Therefore,

$$u_h = \sum_{j=1}^{n}c_je^{\lambda_j t}, \qquad u_p(t) = \psi(t)^{T}\int_{0}^{t}[K(\psi(s))]^{-1}e_n\,\frac{f(s)}{p_0(s)}\,ds$$
0

where ψ(t)T = [ψ1(t) . . . ψn(t)] and

$$K(\psi(t)) = \begin{pmatrix} e^{\lambda_1 t} & e^{\lambda_2 t} & \cdots & e^{\lambda_n t} \\ \lambda_1e^{\lambda_1 t} & \lambda_2e^{\lambda_2 t} & \cdots & \lambda_ne^{\lambda_n t} \\ \vdots & & & \vdots \\ \lambda_1^{n-1}e^{\lambda_1 t} & \lambda_2^{n-1}e^{\lambda_2 t} & \cdots & \lambda_n^{n-1}e^{\lambda_n t} \end{pmatrix}$$

Evaluation of K(ψ(t)) at t = 0 leads to the Vandermonde matrix, whose inverse can be expressed analytically for the case of distinct eigenvalues.

Many examples illustrating the application of the above theory to first- and second-
order scalar as well as vector equations are given as problems below and in Chapter 14.

Problems
1. The vibration of the spring-mass system (with identical springs and masses) may
be described by the equations

$$\frac{m}{k}\frac{d^{2}u_1}{dt^{2}} = -2u_1 + u_2, \qquad \frac{m}{k}\frac{d^{2}u_2}{dt^{2}} = u_1 - 2u_2$$

where u1 and u2 are the displacements of the masses from their equilibrium po-
sitions. (a) Cast the above equations in dimensionless form, (b) Determine the
natural frequencies of vibration and the modes (eigenvectors) of vibration and (c)
What will be the form of the equations if damping is included? Write the equations
in vector/matrix form.
2. Show that the solution of the inhomogeneous system

$$\frac{du}{dt} = Au + f(t), \qquad u(t=0) = u^{0}$$

is given by

$$u = e^{At}u^{0} + \int_{0}^{t}e^{A(t-t')}f(t')\,dt'.$$
3. Determine a formula similar to that given in problem (2) for the solution of the
inhomogeneous system

$$\frac{d^{2}u}{dt^{2}} = -Au + f(t), \qquad u(t=0) = u^{0}, \quad \frac{du}{dt}(t=0) = v^{0}.$$

4. Consider the flow system shown in Figure 12.1. Assume that each tank is well
mixed and species A enters tank 1 at a concentration of cin (t) and leaves at c1 (t).
Assume further that VR1 = 1 m3 , VR2 = 32 m3 and q1 = q2 = 2 m3 / min. (a) Formulate
the differential equations describing the transient behavior of the system and put
them in vector/matrix form. (b) Determine the response of the system (i. e., how
the exit concentration c1 (t) varies with time) for a unit step input (cin (t) = 1 for
t > 0 and 0 for t < 0). Assume that no A is present initially in either tank.

Figure 12.1: Schematic diagram of interacting tanks.

5. Solve problem (4) by converting the model to a scalar initial value problem for
c2 (t).
6. Consider the linear system

$$\frac{du}{dt} = \begin{pmatrix} 0 & 1 \\ -2 & 3 \end{pmatrix}u + \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$

Determine (a) a fundamental matrix for the homogeneous system, (b) a particular solution to the inhomogeneous system, (c) the general form of the solution and (d) the solution to the initial value problem u(0) = (2, 1)T.
7. (a) Determine a fundamental matrix for the linear system

$$\frac{du}{dt} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}u.$$

(b) Determine the vector differential equations for which the following are fundamental matrices:

$$\text{(i)}\; U(t) = \begin{pmatrix} e^{t} & te^{t} \\ e^{t} & (1+t)e^{t} \end{pmatrix} \qquad \text{(ii)}\; U(t) = \begin{pmatrix} 1 & t & t^{2} \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}$$

(c) Verify that ψ1(t) = t³, ψ2(t) = t⁻² are linearly independent solutions of

Lu = u′′ − 6t⁻²u = 0.

Find a particular solution to the inhomogeneous equation Lu = t ln(t).


8. (a) Derive a formula for the solution of the initial value problem

Lu = f (t); k(u(0)) = b,

where L is an n-th order linear differential operator with constant coefficients, k is


the Wronskian vector and b is a constant vector.
(b) Use the above formula to solve the initial value problem for forced oscillation
of a second-order system

$$u'' + cu' + \alpha^{2}u = a\cos\omega t, \qquad k(u(0)) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

Here, c, α and a are positive constants. Plot the response for the overdamped
(c = 6, α2 = 5, ω = 1), critically damped (c = 2, α2 = 1, ω = 1) and underdamped
cases (c = α2 = 2, ω = 1). Also, plot the amplitude of the asymptotic response as a
function of the forcing frequency for a fixed α2 (say α2 = 4) and varying values of
c(≥ 0).
13 Linear systems with periodic coefficients
The theory of linear differential equations with periodic coefficients is encountered
in applications such as the analysis of transport phenomena with spatially periodic
transport properties, in the development of time averaged models of periodically
forced systems, in the determination of the stability of periodic solutions of nonlinear
systems, and so forth. In this chapter, we outline the theory briefly.

13.1 Scalar equation with a periodic coefficient


Consider the scalar initial value problem

$$\frac{du}{dt} = a(t)u, \qquad u = u_0 \;@\; t = 0 \tag{13.1}$$

with

$$a(t+T) = a(t).$$

Integrating equation (13.1), we get

$$\ln u = \int_{0}^{t}a(s)\,ds + c;$$

at t = 0, u = u0 ⇒ c = ln u0 ⇒ the solution is given by

$$\ln\Bigl(\frac{u}{u_0}\Bigr) = \int_{0}^{t}a(s)\,ds. \tag{13.2}$$

To investigate the nature of this solution when a(s) is periodic, we consider the different time intervals:
(i) 0 < t ≤ T

$$\frac{u(t)}{u_0} = \exp\Bigl\{\int_{0}^{t}a(s)\,ds\Bigr\} \tag{13.3}$$

(ii) T < t < 2T


u(t)/u0 = exp{∫_0^T a(s) ds + ∫_T^t a(s) ds}
    = exp{∫_0^T a(s) ds} · exp{∫_T^t a(s) ds}

Let

m = exp{∫_0^T a(s) ds} = exp{∫_{(k−1)T}^{kT} a(s) ds}, k = 1, 2, . . . .

Then we get

u(t)/u0 = m exp{∫_T^t a(s) ds}.

We now make the substitution ŝ = s − T in the integral inside the exponent to obtain

u(t)/u0 = m exp{∫_0^{t−T} a(ŝ + T) dŝ},

but a(ŝ + T) = a(ŝ) ⇒

u(t)/u0 = m exp{∫_0^{t−T} a(ŝ) dŝ}; T < t < 2T (13.4)

(iii) If 2T < t < 3T, following the above procedure, it is easily seen that

u(t)/u0 = m² exp{∫_0^{t−2T} a(s) ds}; 2T < t < 3T, etc.

Thus, we can compute u(t) for all t if we know u(t) for 0 < t < T.

Let

m = e^{ρT} = exp{∫_0^T a(s) ds}; ρ = (1/T) ∫_0^T a(s) ds.

Claim: The solution to equation (13.1) may be expressed as



u(t) = e^{ρt} p(t)

where

p(t) = e^{−ρt} exp{∫_0^t a(s) ds} u0, p(t + T) = p(t). (13.5)

Indeed,

p(t + T) = e^{−ρ(t+T)} exp{∫_0^{t+T} a(s) ds} u0
    = e^{−ρt} · e^{−ρT} · exp{∫_0^T a(s) ds} · exp{∫_T^{t+T} a(s) ds} u0
    = e^{−ρt} · 1 · exp{∫_0^t a(s) ds} u0   (since e^{−ρT} m = 1 and ∫_T^{t+T} a(s) ds = ∫_0^t a(s) ds by periodicity)
    = p(t)

Thus, when a(t) is periodic, the solution of equation (13.1) is of the form

u(t) = eρt p(t)

where p(t) is a periodic function. The solution is itself periodic iff ρ = 0 or m = 1, i. e.,

exp{∫_0^T a(s) ds} = 1

or

∫_0^T a(s) ds = 0.

Example 13.1. We consider the linear initial value problem

du/dt = (cos t)u; u = u0 @ t = 0.

Here,

a(s) = cos s, T = 2π; ∫_0^{2π} cos s ds = sin s|_0^{2π} = 0 ⇒ m = 1 or ρ = 0.

p(t) = exp{∫_0^t cos s ds} u0
    = exp{sin s|_0^t} u0
    = exp{sin t} u0

Thus, the solution is given by

u(t) = p(t) = exp{sin t} u0.

A plot of this solution is shown in Figure 13.1 for u0 = 1.

Figure 13.1: Plot of the periodic solution of the scalar linear equation with ρ = 0.
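The result of Example 13.1 is easy to cross-check in Mathematica (a sketch, assuming u0 = 1); DSolve should reproduce u(t) = e^{sin t}, which is manifestly 2π-periodic:

(* Sketch: symbolic check of Example 13.1 and a plot over two periods. *)
DSolve[{u'[t] == Cos[t] u[t], u[0] == 1}, u[t], t]
(* expected: {{u[t] -> E^Sin[t]}} *)
Plot[E^Sin[t], {t, 0, 4 Pi}]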

13.2 Vector equation with periodic coefficient matrix


We now consider the vector equation

du/dt = A(t)u (13.6)
A(t + T) = A(t) (13.7)

Here, A is an n × n matrix of complex valued continuous functions of the real variable t.


T is the period of A(t). The following theorem may be stated for the system defined by
equations (13.6) and (13.7).

Theorem. If U(t) is a fundamental matrix of equations (13.6) and (13.7), then so is V(t), where

V(t) = U(t + T).

Corresponding to every such U(t), ∃ a nonsingular matrix P(t), which is periodic with
period T and a constant n × n matrix C such that

U(t) = P(t)e^{tC}

Proof. Since

dU/dt = A(t)U ⇒ V′(t) = U′(t + T)
    = A(t + T)U(t + T)
    = A(t)V(t),
det V(t) = det{U(t + T)} ≠ 0

∴ V(t) is a fundamental matrix.


∴ ∃ a constant nonsingular matrix M so that

U(t + T) = U(t)M.

Since M is nonsingular, ∃ a constant matrix C such that

M = e^{TC} (i. e., ln M = TC or C = (1/T) ln M)

U(t + T) = U(t)e^{TC}
    = U(t)e^{−tC} e^{(t+T)C}
    = P(t)e^{(t+T)C},

where the periodic matrix P(t) is defined by

P(t) = U(t)e^{−tC}
P(t + T) = U(t + T)e^{−(t+T)C}
    = P(t)e^{(t+T)C} · e^{−(t+T)C}
    = P(t)

∴ The result.

The significance of the above theorem is that the determination of the fundamen-
tal matrix U(t) over a finite interval of length T (e. g., 0 ≤ t ≤ T) leads at once to the
determination of U(t) over (−∞, ∞). This follows from the periodicity property.

U(t + T) = P(t)e^{(t+T)C}
    = P(t)e^{tC} · e^{TC}
    = U(t)e^{TC} = U(t)M



U(t + 2T) = U(t)e^{2TC}
⋮
U(t + nT) = U(t)e^{nTC} = U(t)M^n, n a positive or negative integer

Remark. If we choose the initial condition

U(0) = U0

Then

U(T) = U0 e^{TC} = U0 M

⇒

C = (1/T) ln[U0^{−1} U(T)]

For the special case,

U(0) = I,

C = (1/T) ln U(T)

Thus, C can be determined from U(T). Each column of U(T) may be determined by
integrating the IVP

du/dt = A(t)u; u(0) = e_j.

Definition. M is called the monodromy matrix. The eigenvalues of M are called the
characteristic multipliers of the periodic initial value problem. The matrix C is called
the Floquet matrix and the eigenvalues of C are called the characteristic exponents or
Floquet exponents.
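A minimal numerical sketch of this recipe (the 2 × 2 periodic matrix amat below is an assumed illustration, not an example from the text): integrate u′ = A(t)u over one period once per unit vector e_j, assemble M = U(T), and read off the multipliers and exponents.

(* Sketch: monodromy matrix M = U(T) built column by column, U(0) = I. *)
T = 2 Pi;
amat[t_] := {{0, 1}, {-(1 + Cos[t]/10), 0}};   (* assumed periodic A(t) *)
col[j_] := Module[{u}, First[u[T] /. NDSolve[{u'[t] == amat[t].u[t],
    u[0] == IdentityMatrix[2][[j]]}, u, {t, 0, T}]]];
m = Transpose[{col[1], col[2]}];     (* monodromy matrix *)
multipliers = Eigenvalues[m];        (* characteristic multipliers *)
exponents = Log[multipliers]/T;      (* Floquet exponents, principal log *)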

Remark. The characteristic multipliers μ1, μ2, . . . , μn are uniquely defined. The characteristic exponents ρ1, ρ2, . . . , ρn are given by

Tρ_j = ln μ_j.

If we take the principal value of the logarithm, then ρ_j is also uniquely defined.

Lemma 13.1. M is not unique but any two are related by a similarity transform.

Proof. Let Ui (t) be a fundamental matrix (i = 1, 2)

U1 (t + T) = U1 (t)M1
U2 (t + T) = U2 (t)M2

But U2 (t) = U1 (t)D, D is nonsingular:

M2 = U2^{−1}(t) U2(t + T)
   = D^{−1} U1^{−1}(t) U1(t + T) D
   = D^{−1} M1 D

Therefore, the eigenvalues of M1 and M2 are the same.

In order to see the explicit form of the solution of the periodic system (equations
(13.6) and (13.7)), we consider the simple case in which matrix C is diagonalizable, i. e.,
∃ a nonsingular constant matrix S such that

C = S ρ S^{−1}

where

ρ = diag(ρ1, . . . , ρn),

e^{tC} = S diag(e^{ρ1 t}, . . . , e^{ρn t}) S^{−1},

U(t) = P(t) S diag(e^{ρ1 t}, . . . , e^{ρn t}) S^{−1}.

From this, it is clear that U(t) consists of columns u1 (t), . . . , un (t), which are of the form

ui(t) = pi(t) e^{ρi t}

where pi (t + T) = pi (t) is a periodic n-vector.


⇒ The general form of the solution of equations (13.6) and (13.7) is

u = p1(t)e^{ρ1 t} + ⋅ ⋅ ⋅ + pn(t)e^{ρn t}.

Lemma 13.2. A complex number ρ is a characteristic exponent of equations (13.6) if and only if there is a nontrivial solution of equations (13.6) of the form e^{ρt} p(t), where p(t + T) = p(t).

Proof. This follows from above discussion. Note that from this lemma it follows that
u(t) is a periodic solution iff ∃ a characteristic exponent of C, which is zero, i. e., ρi = 0
for some i or equivalently μi = +1 for some i.

Lemma 13.3. If μ1, μ2, . . . , μn are the characteristic multipliers of the periodic system (equations (13.6) and (13.7)), then

∏_{i=1}^n μi = exp{∫_0^T trace A(s) ds}

Proof. For the general linear system, we have seen that

det U(t) = det U(τ) exp{∫_τ^t trace A(s) ds}

Let τ = 0, t = T:

det U(T) = det U(0) exp{∫_0^T trace A(s) ds}

For the periodic system, we can take

U(0) = I
U(T) = M

det M = exp{∫_0^T trace A(s) ds}

but

det M = μ1 μ2 ⋯ μn = ∏_{i=1}^n μi

∴ The result.

Thus, if (n − 1) of the characteristic multipliers are known, we can determine the remaining one using the above relation. For example, for the special case of a planar system (n = 2), we have

μ1 μ2 = exp{∫_0^T [a11(s) + a22(s)] ds}

This result is useful in determining the stability of bifurcating periodic solutions of a planar system. In this case, one Floquet multiplier is unity and the magnitude of the second multiplier determines the stability of the bifurcating periodic solution.

Problems
1. Consider the linear system with periodic coefficients:

u′ = A(t)u; A(t) = A(t + T) (13.8)

(a) Show that if U(t) is a fundamental matrix, it may be written as

U(t) = P(t)e^{tC}, P(t) = P(t + T),

where P(t) is an n × n periodic matrix and C is a constant matrix.

(b) Determine the asymptotic form of the solution to the initial value problem if
C has (i) simple eigenvalues in the left half of the complex plane, (ii) a zero
eigenvalue and all other in the left half-plane and (iii) a pair of purely imagi-
nary eigenvalues and all others in the left half-plane.
(c) If μ1, μ2, . . . , μn are the characteristic multipliers of the periodic system (13.8), show that

∏_{i=1}^n μi = exp{∫_0^T trace A(s) ds}

2. If μ1, μ2, . . . , μn are the characteristic multipliers of the periodic system (13.8):
(a) Discuss the nature of the solution when μi = −1 for some i;
(b) Discuss the nature of the solution when a pair of μi are on the unit circle.
3. Consider the linear system

du1/dt = (− sin 2t)u1 + [(cos 2t) − 1]u2
du2/dt = (1 + cos 2t)u1 + (sin 2t)u2

(a) Verify that

U(t) = ( e^t(cos t − sin t)  e^{−t}(cos t + sin t) ; e^t(cos t + sin t)  e^{−t}(− cos t + sin t) )

is a fundamental matrix.
(b) Obtain the corresponding monodromy matrix and determine the characteris-
tic multipliers and characteristic exponents.
4. Consider the nonlinear system

du1/dt = u1 − u2 − u1³ − u1 u2²
du2/dt = u1 + u2 − u1² u2 − u2³ (13.9)

(a) Verify that u1^0 = cos t, u2^0 = sin t is a (periodic) solution.
(b) Show that the linearization of equation (13.9) around (u1^0, u2^0) gives the following linear system with periodic coefficients:

dz1/dt = (−1 − cos 2t)z1 − [1 + sin 2t]z2
dz2/dt = (1 − sin 2t)z1 + (cos 2t − 1)z2 (13.10)

(c) Verify that

U(t) = ( − sin t  e^{−2t} cos t ; cos t  e^{−2t} sin t )

is a fundamental matrix.
(d) Determine the monodromy matrix and the Floquet multipliers of the periodic system defined by equation (13.10).
5. Consider an ideal mixing tank with constant fluid density and tank volume but
periodically varying flow rate. With appropriate notation,
(a) Show that the concentration of a solute satisfies the scalar equation,

dc/dt = [(1 + ε sin ωt)/τ][cin(t) − c], t > 0; c = c0 @ t = 0

(b) Obtain the solution of the above equation for c0 = 0, and cin = H(t) = Heavi-
side’s unit step function.
(c) Obtain the solution for cin (t) = 0 and c0 = 1, and show a plot of the solution
for ε = 0 and ε > 0.
14 Analytic solutions, adjoints and integrating
factors
We have already shown that the solution to linear scalar and vector differential equa-
tions with constant coefficients can be expressed analytically in terms of the eigenval-
ues. This chapter deals with other cases in which it is possible to express the solutions
in explicit form.

14.1 Analytic solutions


First consider the general nth order linear differential equation

p0(t) d^n u/dt^n + p1(t) d^{n−1}u/dt^{n−1} + ⋅ ⋅ ⋅ + pn−1(t) du/dt + pn(t)u = f(t)
or

Lu = f (t) (14.1)

Consider only the homogeneous equation

Lu = 0 (14.2)

Under certain special conditions, we can obtain the solution of equation (14.2) analyt-
ically. We discuss some of these special cases here.
(i) Scale Invariance in t

Suppose that equation (14.2) is invariant under the scaling t → at, i. e., the transformed equation is again Lu = 0. Then the equation is called equidimensional or Euler's equation. To illustrate, consider the case of n = 2:
of n = 2:

p0(t) d²u/dt² + p1(t) du/dt + p2(t)u = 0 (14.3)

and the special case of

p0(t) = α0 t², p1(t) = α1 t, p2(t) = α2

gives

α0 t² u″ + α1 t u′ + α2 u = 0.

Let t′ = at ⇒ d/dt = a · d/dt′. Then



α0 (t′/a)² a² d²u/dt′² + α1 (t′/a) · a du/dt′ + α2 u = 0

α0 t′² d²u/dt′² + α1 t′ du/dt′ + α2 u = 0.

Thus, the equation is invariant to the scaling t → at. Scale invariant equations can be
converted to constant coefficient equations by the transformation

t = e^x or x = ln t (14.4)

d/dt = (1/t) d/dx ⇒ t d/dt = d/dx

d²/dt² = (1/t²) d²/dx² − (1/t²) d/dx ⇒ t² d²/dt² = d²/dx² − d/dx

Thus, using the transformation given by equation (14.4), equation (14.3) reduces to

α0 d²u/dx² + (α1 − α0) du/dx + α2 u = 0

The solution of this equation is of the form

u(x) = c1 e^{λ1 x} + c2 e^{λ2 x}

where λ1, λ2 are roots of α0 λ² + (α1 − α0)λ + α2 = 0, so that

u(t) = c1 t^{λ1} + c2 t^{λ2}.

If λ1 = λ2 ⇒ u(x) = c1 e^{λ1 x} + c2 x e^{λ1 x}, and

u(t) = c1 t^{λ1} + c2 (ln t) t^{λ1}.

The characteristic equation may also be obtained by the substitution



u = t^λ

u′ = du/dt = λ t^{λ−1}
u″ = λ(λ − 1) t^{λ−2}

α0 λ(λ − 1) + α1 λ + α2 = 0
α0 λ² + (α1 − α0)λ + α2 = 0

[Remark: If we write u(λ) = t^λ, then ∂u/∂λ = (ln t) t^λ. Hence, when λ is a double root, the two linearly independent solutions are t^λ and (ln t) t^λ.]

Example 14.1. Consider the equation


u″ + u/(4t²) = 0

(λ − 1/2)² = 0 ⇒ λ = 1/2, 1/2

u(t) = c1 √t + c2 √t ln t

Thus, for scale invariant equations in t, the linearly independent solutions are of the form t^λ or t^λ ln t, with λ determined by the characteristic equation.
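Example 14.1 can be cross-checked directly (a sketch; DSolve handles Euler equations and should return the √t, √t ln t pair corresponding to the double root λ = 1/2):

(* Sketch: DSolve on the Euler equation of Example 14.1. *)
DSolve[u''[t] + u[t]/(4 t^2) == 0, u[t], t]
(* expected, up to constants: u[t] -> Sqrt[t] (C[1] + C[2] Log[t]) *)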
(ii) Scale invariant equations in u

An ODE (linear or nonlinear) is called scale invariant w. r. t. u if the transformation u → au (a ≠ 0) leaves the equation invariant. All linear equations Lu = 0 are invariant w. r. t. u. If an equation is scale invariant w. r. t. u, we can reduce its order by one. To illustrate, consider the second-order equation

p0(t) d²u/dt² + p1(t) du/dt + p2(t)u = 0, (14.5)

which is scale invariant w. r. t. u. Define

u(t) = e^{w(t)} (14.6)



u′ = e^w w′ = u w′
u″ = e^w (w′)² + u w″

p0(t) u[(w′)² + w″] + p1(t) u w′ + p2(t) u = 0

p0[(w′)² + w″] + p1 w′ + p2 = 0 (14.7)

Let w′ = y

p0(t) dy/dt + p0(t) y² + p1(t) y + p2(t) = 0 (14.8)

This is a first-order (but nonlinear) equation in y.


(iii) Scale invariant equations in t and u

An ODE (linear or nonlinear) is called scale invariant in t and u if ∃ a transformation of the form t → at, u → a^m u (for some m and a ≠ 0) that leaves the equation invariant.
It can be shown that all scale invariant equations can be transformed to equidimensional (Euler) equations in t.

Examples.
(1) (u′)² + u u″ + t = 0 is scale invariant under t → at and u → a^{3/2} u.
(2) du/dt = F(u/t) is scale invariant under t → at and u → au.

First-order equations
(a) The linear equation

du/dt + p(t)u = q(t) (14.9)

can be solved exactly in terms of quadratures:

u exp{∫ p(t′) dt′} = ∫ q(t) exp{∫ p(t′) dt′} dt + c

(b) The Bernoulli’s equation

du/dt + p(t)u = q(t)u^m (14.10)

can be solved by the substitution



y = [u(t)]^{1−m}

to obtain a linear equation in y.


(c) Riccati equation

du/dt = a(t)u² + b(t)u + c(t) (14.11)

a(t) = 0 ⇒ linear equation
c(t) = 0 ⇒ Bernoulli's equation

When a ≠ 0, let

u = −w′(t)/[a(t)w(t)],

which gives

w″(t) − [a′(t)/a(t) + b(t)] w′(t) + a(t)c(t) w = 0

This is a linear second-order equation.


Another special feature of the Riccati equation is that it can be solved exactly if we can find a (special) solution by inspection. Let

u = u1(t) + v(t) (14.12)

where u1 is a special solution. Substitution of equation (14.12) in equation (14.11) gives

v′(t) = [b(t) + 2a(t)u1] v(t) + a(t) v² (14.13)

This is a Bernoulli equation for v (with m = 2).


(d) Separable and exact equations

a(u, t) du/dt + b(u, t) = 0

or

a(u, t) du + b(u, t) dt = 0 (14.14)

If a(u, t) = a1 (u)a2 (t) and b(u, t) = b1 (u)b2 (t), then equation (14.14) is separable
and can be solved by quadratures. If

∂a/∂t = ∂b/∂u,
𝜕t 𝜕u
then we have an exact equation that can be solved analytically.
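The closed forms in (a)–(c) are easy to sanity-check in Mathematica; in the sketch below the particular p, q and the Bernoulli/Riccati right-hand sides are assumed illustrations, not equations from the text:

(* Sketch: a linear, a Bernoulli (m = 2) and a Riccati example. *)
DSolve[u'[t] + u[t] == Sin[t], u[t], t]       (* case (a), eq. (14.9) *)
DSolve[u'[t] - u[t] == -u[t]^2, u[t], t]      (* case (b), eq. (14.10) *)
DSolve[u'[t] == (u[t] - 1)^2, u[t], t]        (* case (c); u1 = 1 by inspection *)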

A listing of all ODEs (linear and nonlinear) that have analytic solutions can be found
in the book by E. Kamke [21].

14.2 Adjoints and integrating factors


The concept of adjoint plays an important role in the theory of differential equations.
We discuss it here in the context of initial value problems and revisit it again in Chap-
ter 18.

14.2.1 First-order equation

Consider the first-order homogeneous equation

Lu ≡ p0(t) du/dt + p1(t)u = 0 (14.15)

Multiply both sides of equation (14.15) by v(t),

vLu ≡ v(t)p0(t) du/dt + v(t)p1(t)u = 0 (14.16)

Now using the product rule:

d/dt (v p0 u) = v p0 du/dt + u d/dt (v p0),

we can write

vLu ≡ d/dt [p0(t) u v] − u d/dt (p0(t)v) + p1(t) u v = 0
⇒ vLu ≡ (p0 u v)′ + u[−(p0 v)′ + p1 v] = 0 (14.17)

Suppose that v(t) satisfies the equation

− (p0 v)′ + p1 v = 0 (14.18)

Then the LHS or RHS of equation (14.17) is an exact derivative. The function v(t) is
called the integrating factor of equation (14.15). To find the integrating factor of equa-
tion (14.15), we end up with equation (14.18) and to find the integrating factor of equa-
tion (14.18) we end up with equation (14.15). Hence, Lagrange called equation (14.18),
the adjoint equation.
Thus, the adjoint operator of equation (14.15) is defined by

L*v = −(p0 v)′ + p1 v
   = −p0(t) dv/dt + [p1(t) − p0′(t)] v (14.19)

Also, we have

vLu − uL*v = d/dt [p0(t) u v] (14.20)

Equation (14.20) is known as the Lagrange identity.


Now we suppose that v(t) satisfies the adjoint differential equation:

L*v = −p0(t) dv/dt + [p1(t) − p0′(t)] v = 0. (14.21)

Then equation (14.20) becomes

vLu = d/dt [v(t)p0(t)u(t)]. (14.22)

Thus, the solution of the adjoint equation gives an integrating factor to equation
(14.15). Now suppose that u(t) satisfies Lu = 0. Then equation (14.20) gives

uL*v = −d/dt [p0(t) u v] (14.23)

Thus, u(t) is an integrating factor for the adjoint equation. It is also seen that (L∗ )∗ = L,
i. e., the adjoint of the adjoint equation is the original equation.
If Lu = 0 and L∗ v = 0, equation (14.20) gives

d/dt (v p0 u) = 0. (14.24)
or,

v(t)p0 (t)u(t) = c0 (constant) (14.25)

Thus, if we have a solution to Lu = 0, we can determine the solution to L*v = 0 and vice versa. This idea extends to higher-order differential equations as shown below.

14.2.2 Second-order equation

Consider the second order differential operator

Lu = p0(t) d²u/dt² + p1(t) du/dt + p2(t)u.

Multiplying by v and integrating by parts gives

vLu = v p0 u″ + v p1 u′ + v p2 u
   = (v p0 u′)′ − u′(p0 v)′ + (v p1 u)′ − u(p1 v)′ + v p2 u
   = (v p0 u′)′ − [u(p0 v)′]′ + u(p0 v)″ + (v p1 u)′ − u(p1 v)′ + v p2 u
   = u[(p0 v)″ − (p1 v)′ + p2 v] + (v p0 u′)′ − [u(p0 v)′]′ + (v p1 u)′
   = uL*v + d/dt [v p0 u′ − u(p0 v)′ + v p1 u]
where

L*v = (p0 v)″ − (p1 v)′ + p2 v

and

vLu − uL*v = d/dt [π(u, v)],

where the concomitant π(u, v) is defined by

π(u, v) = v p0 u′ − u(p0 v)′ + p1 u v
   = p0 u′ v − p0 u v′ − p0′ u v + p1 u v
   = p0 u′ v − p0 u v′ + (p1 − p0′) u v
   = [v  v′] ( (p1 − p0′)  p0 ; −p0  0 ) ( u ; u′ )
   = k^T(v) · P · k(u)

where

k(u) = ( u ; u′ ) = Wronskian vector,

and the 2 × 2 matrix P = ( (p1 − p0′)  p0 ; −p0  0 ) is called the concomitant matrix.
The bilinear form π(u, v) is called the bilinear concomitant.
Suppose that v(t) satisfies L∗ v = 0 and u(t) satisfies Lu = 0, then

π(u, v) = constant

⇒ (p0 u′ − p0′ u + p1 u) v − p0 u v′ = constant
⇒ p0 (u′ v − u v′) + (p1 − p0′) u v = c0. (14.26)

Thus, the solutions of Lu = 0 and L*v = 0 are related by equation (14.26).
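These identities are mechanical enough that a symbolic check is worthwhile; the following sketch verifies the Lagrange identity vLu − uL*v = dπ/dt for arbitrary p0, p1, p2:

(* Sketch: symbolic verification of the second-order Lagrange identity. *)
Lu   = p0[t] u''[t] + p1[t] u'[t] + p2[t] u[t];
Lsv  = D[p0[t] v[t], {t, 2}] - D[p1[t] v[t], t] + p2[t] v[t];
conc = v[t] p0[t] u'[t] - u[t] D[p0[t] v[t], t] + p1[t] u[t] v[t];
Simplify[v[t] Lu - u[t] Lsv - D[conc, t]]   (* returns 0 *)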



14.3 Relationship between solutions of Lu = 0 and L∗ v = 0


When v(t) satisfies the adjoint equation L*v = 0 and u(t) satisfies Lu = 0, we have

vLu = d/dt [π(u, v)] = 0

⇒

π(u, v) = constant (14.27)

For the case of n = 2, equation (14.27) takes the form

p0 [u′ (t)v(t) − u(t)v′ (t)] + [p1 (t) − p′0 (t)]u(t)v(t) = c0 (14.28)

Thus, if ψ1 and ψ2 are two linearly independent solutions of L*v = 0, we have

( (p1 − p0′)ψ1 − p0 ψ1′   p0 ψ1 ; (p1 − p0′)ψ2 − p0 ψ2′   p0 ψ2 ) ( u ; u′ ) = ( c01 ; c02 ) (14.29)

Solving equation (14.29) gives

u(t) = [ψ2(t)c01 − ψ1(t)c02] / [p0(t)W(t)] (14.30)

where

W(t) = | ψ1  ψ2 ; ψ1′  ψ2′ | = ψ1 ψ2′ − ψ1′ ψ2 ≠ 0

Equation (14.30) gives the general solution to Lu = 0. In the general case, we have n
relations

π(u, ψi ) = c0i , i = 1, 2, . . . n

Since π is linear in u, u′ , u′′ , . . . , u[n−1] (t), we can eliminate (or solve) for these variables.
The same reasoning also applies to the vector equation.

14.4 Vector initial value problem


Consider the vector form of initial value problem:

du/dt = A(t)u (14.31)

Define linear operator L as



Lu = du/dt − A(t)u; (14.32)

then

v^T Lu = v(t)^T du/dt − v(t)^T A(t)u
    = d/dt (v^T u) − (dv^T/dt) u − v(t)^T A(t)u (14.33)
is an exact derivative if

(dv^T/dt) u + v^T A(t)u = 0

or

dv^T/dt = −v^T A(t)

or

dv/dt = −A(t)^T v (14.34)

Thus, the adjoint operator L∗ can be defined as

L*v = −dv/dt − A(t)^T v (14.35)

which rewrites equation (14.33) as

v^T Lu = d/dt (v^T u) + (L*v)^T u

⇒

v^T Lu − (L*v)^T u = d/dt (v^T u) (14.36)

Integrating equation (14.36) from t = 0 to t = a gives

⟨Lu, v⟩ − ⟨u, L∗ v⟩ = vT (a)u(a) − vT (0)u(0) (14.37)

Thus, if u(0) = α, the Lagrange condition ⟨Lu, v⟩ = ⟨u, L*v⟩ can be satisfied when we have v^T(a)u(a) = v^T(0)α. If we choose v(a) = α, then α^T u(a) = v^T(0)α = α^T v(0) ⇒ v(0) = u(a), i. e., the final condition of the original IVP is the initial condition of the adjoint problem.
Thus, if we integrate the IVP,

du/dt = A(t)u, 0 < t ≤ a, with u(t = 0) = α, (14.38)

to obtain u(a), it is equivalent to integrating the adjoint IVP,

dv/dt = −A^T(t)v, 0 ≤ t < a, with v(t = a) = α (14.39)

in the backward direction to get v(0). Thus, the forward integration of equation (14.38) and the backward integration of equation (14.39) are coupled. Further, if the final condition of the original IVP is u(a) = β, the initial condition of the adjoint problem is v(0) = u(a) = β. In other words, the adjoint IVP can be integrated in the forward direction with the initial condition v(0) = β = u(a) to get v(a) = α = u(0). Thus, the adjoint problem may be used to determine what initial condition on equation (14.31) may lead to a given final state at t = a. This observation is useful in many applications.
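A small numerical sketch of this duality (the matrix, horizon and vectors below are assumed illustrations): integrating the adjoint backward from t = a preserves v^T u, as required by equation (14.36).

(* Sketch: v(t)^T u(t) is constant along the forward/adjoint pair. *)
tf = 3;
amat[t_] := {{0, 1}, {-1, -Sin[t]/5}};
usol = NDSolveValue[{u'[t] == amat[t].u[t], u[0] == {1, 0}}, u, {t, 0, tf}];
vsol = NDSolveValue[{v'[t] == -Transpose[amat[t]].v[t], v[tf] == {1, 2}},
   v, {t, 0, tf}];
{vsol[0].usol[0], vsol[tf].usol[tf]}   (* the two values agree to tolerance *)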
Let u1 , u2 , . . . , un be n linearly independent solutions of

Lu ≡ du/dt − A(t)u = 0

and v(t) is a solution of the adjoint equation

L*v = dv/dt + A^T(t)v = 0,

then we have

d/dt (v^T u) = 0 ⇒ v^T u = constant

⇒ v^T ui = ci, i = 1, 2, . . . , n

⇒ v^T U = c^T (14.40)

where U(t) is a fundamental matrix. Equation (14.40) can also be rewritten as

U^T v = c ⇒ v = [U^T]^{−1} c (14.41)

Thus, we can find solutions of the adjoint equation if we know the solution of Lu = 0.
It can also be shown that

VT U = B = a constant nonsingular matrix

We return to the concept of adjoint again when we deal with boundary value problems
in Chapter 18.

Problems
1. (Linear first-order equation): Find the general solution (or a particular solution if
the initial condition is given) of the following first-order differential equations:
(a) dy/dx + y cos x = (1/2) sin 2x, (b) (1 − x²) dy/dx + 2xy = x/√(1 − x²); y(0) = 0, (c) dy/dx − y tan x = e^x sec x and (d) (1 + x³) dy/dx + 2xy = 4x²
2. (Bernoulli's equation): Find the general solution of the following first-order differential equations: (a) x dy/dx + y = y² ln x, (b) dy/dx + y/x = e^x y², (c) (1/y²) dy/dx + 1/(xy) = 1 and (d) dy/dx = x²y³ − xy
3. (Transient behavior of a mixing tank): A tank initially holds 80 gal of a brine solu-
tion containing 0.125 lb of salt per gallon. At t = 0, another brine solution contain-
ing 1 lb of salt per gallon is poured into the tank at a rate of 4 gal/min, while the
well-stirred mixture leaves the tank at a rate of 8 gal/min. (a) Formulate a model
for describing the transient behavior of the tank and (b) Find the amount of salt
in the tank when the tank contains exactly 40 gal of the solution.
4. (Compound interest modeling): A depositor currently has $6,000 and plans to in-
vest it in an account that accrues interest continuously. What is the required in-
terest rate if the depositor needs to have $10,000 in 4 years? Formulate the model,
solve it and use the solution to calculate the required rate of interest.
5. (Application of Newton’s second law for a free falling body): A body weighing m kg
is dropped from a height H with no initial velocity. As it falls, the body encoun-
ters a force due to air resistance, which may be assumed to be proportional to its
velocity. If the limiting/terminal velocity of this body is v0 m/s, determine (a) an
expression for the velocity of the body at any time t and (b) an expression for the
position of the body at any time t.
6. (Cooling of a pie): A hot pie that was cooked at 325 ∘ F is taken directly from an oven
and placed outdoors in the shade to cool on a day when the air temperature in the
shade is 85 ∘ F. After 5 minutes in the shade, the temperature of the pie had been
reduced to 250 ∘ F. Determine (a) the temperature of the pie after 20 minutes and
(b) the time at which the pie cools to a temperature of 90 ∘ F.
7. (Population growth): The population of a certain state is known to grow at a rate
proportional to the number of people presently living in the state. If after 10 years
the population has trebled and if after 20 years the population is 200,000, find the
number of people initially living in the state.
8. (First-order process model): The process model for a first-order system is given by

τ dy/dt + y = f(t); y(0) = 0

Here, τ > 0 is the first-order time constant. (a) Determine and plot the response of
the system for a unit step input and a unit impulse input and (b) Determine and
plot the response of the system (amplitude and phase lag) when f (t) = A sin ωt.

9. (Second-order reversible reaction in a batch reactor): Consider the second-order reversible reaction 2A ⇌ B + C occurring in a constant density batch reactor:
(a) Formulate the differential equation for the concentration of A and cast it in
dimensionless form and (b) Solve the equation in (a) for the initial condition cor-
responding to only A present at t = 0. Discuss the special cases of the solution
corresponding to the equilibrium constant being infinity and a value of four.
10. (Terminal velocity of a particle): Consider the motion of a small particle falling
through a fluid (such as a dust particle falling in air or very small particle of size
smaller than 20 µm falling in water). Assume that the initial velocity of the parti-
cle is zero: (a) With appropriate notation, show that Newton’s second law may be
written as

(π/6) dp³ ρs dv/dt = (π/6) dp³ ρs g − (π/6) dp³ ρf g − 3πμ dp v

Explain the meaning of each term in the above equation and (b) Solve the above
model with the initial condition specified and show that the velocity of the particle
at any time may be expressed as

v(t) = v∞ [1 − exp(−18μt/(ρs dp²))]

where v∞ is the terminal velocity (for t → ∞). (c) Determine an expression for v∞ .
11. (Bernoulli's equation): (a) Find the general solution of the following first-order differential equations:
(i) x dy/dx + y = y² ln x (ii) dy/dx + y/x = e^x y²
(b) Consider a population balance model in which the birth rate is proportional to the square of the population size while the death rate varies linearly: (i) With appropriate assumptions, show that the evolution equation is of the form

dN/dt = bN² − aN

where a and b are positive constants and N(t) is the population size at time t, (ii) Solve the above equation and plot the solution for the two cases of N(t = 0) < a/b and N(t = 0) > a/b.
12. (Consecutive first-order reactions in a batch reactor): Consider a well-stirred batch
reactor in which the consecutive reactions A → B → C occur. Assume that the
density of the reaction mixture (and the volume of the reactor) remains constant
and at time zero the reactor is charged with a solution containing only reactant
A at a concentration of C0 . Further assume that the rate of the first reaction is
given by r1 = −rA = k1 CA and that of the second reaction by r2 = rC = k2 CB :
(a) Formulate the differential equations (and the initial conditions) describing the
concentrations of all the species as a function of time, (b) Solve the equations
14.4 Vector initial value problem | 315

in (a) and determine the concentrations and (c) Determine the time at which the
concentration of species B is maximum.
13. The process model for a second order system is given by

τ² d²y/dt² + 2γτ dy/dt + y = f(t); y(0) = 0; y′(0) = 0

Here, τ > 0 is the system time constant, while γ > 0 is called the damping con-
stant. (a) Determine and plot the response of the system for a unit step input and
a unit impulse input and (b) Determine and plot the response of the system (am-
plitude and phase lag) when f (t) = A sin ωt for three cases: γ = 0.1, 1, 2.
14. The transient response of a U-tube manometer to changes in pressure is described
by the initial value problem

d²h/dt² + (24μ/(ρd²)) dh/dt + (3g/(2L)) h = (3/4) Δp/(ρL); t > 0, h(0) = 0; h′(0) = 0

where h is the deviation of the manometer liquid level from the equilibrium po-
sition, t is time, g is the gravitational acceleration, d is the tube diameter, ρ and
μ are the density and viscosity of the manometer fluid, L is the total length of the
fluid column and Δp is the change in pressure (a) Cast the above model in dimen-
sionless form (b) Determine the critical tube radius above which the response is
oscillatory (c) Determine and plot the transient response of the manometer for
a step change in the pressure when the tube diameter is twice the critical value
determined in (b).
15. The motion of a periodically driven pendulum (for small amplitudes), the clas-
sic mechanical spring-mass-dashpot oscillator and the RLC (resistor-inductor-
capacitor) electric circuit may be described by the second-order IVP:

d²u/dt² + 2γ du/dt + ω0² u = f(t); u(0) = 0; u′(0) = 0

where γ is the damping constant and ω0 is the natural frequency of oscillation in the absence of damping. (a) For each case, formulate the model, cast it into dimensionless form and identify/relate the constants γ and ω0 to the physical quantities
sionless form and identify/relate the constants γ and ω0 to the physical quantities
that appear in the model (b) Determine and plot the solution when f (t) = a cos ωt
with special attention to cases in which ω is close to ω0 .
16. Find a general solution of the following ODEs: (i) y′′ +3y′ +2y = 0, (ii) y′′ +2y′ +y = 0
and (iii) y′′ + 4y = 0 (iv) y′′ + 2y′ + 5y = 0
17. Determine whether or not the following functions are linearly independent:
(i) e−x , e−2x , e−4x , (ii) x 2 , x 2 log x, (iii) log x, log(x 3 ), (iv) sin x, sin 2x and (v) e−x ,
cos x, 0

18. Consider the second-order process model described by the following IVP:

τ² d²y/dt² + 2γτ dy/dt + y = f(t); y(0) = 0; y′(0) = 0

Determine and plot the response of the system for a unit step input when τ = 1
and three values of the damping constant: γ = 0.1, 1, 2
19. The motion of a periodically driven pendulum (for small amplitudes) may be de-
scribed by the second-order IVP:

d²u/dt² + 2γ du/dt + ω0² u = f(t); u(0) = 0; u′(0) = 0

where γ is the damping constant and ω0 is the natural frequency of oscillation in the absence of damping. Consider the case of ω0 = 1 and γ = 0.1. Determine and plot the amplitude of the solution when f(t) = cos ωt and ω is varied in the range
plot the amplitude of the solution when f (t) = cos ωt and ω is varied in the range
(0.1, 2).
20. Consider the U-tube manometer described by the following IVP:

d²h/dt² + (24μ/(ρd²)) dh/dt + (3g/(2L)) h = (3/4) Δp/(ρL); t > 0, h(0) = 0; h′(0) = 0

If the viscous damping term can be neglected, determine the period of oscillation
(or the natural frequency) for the case where the total length of the liquid col-
umn is 2 meters (b) Examine the eigenvalues of the homogeneous equation with
the damping term and derive a formula for the critical diameter of the tube be-
low which the system does not oscillate. Calculate this value when the manome-
ter fluid is water at 20 °C with density ≈ 1.0 g/cm³ and viscosity ≈ 1 centipoise (= 0.01 g·cm⁻¹s⁻¹).
21. Find a general solution of the inhomogeneous equation y′′ + 3y′ + 2y = f (x) for
the following cases: (i) f (x) = 1, (ii) f (x) = x 2 , (iii) f (x) = e−4x , (iv) f (x) = e−x and
(v) cos x
22. Find a general solution of the inhomogeneous equation y′′ + 2y′ + y = f (x) for the
following cases: (i) f (x) = 1, (ii) f (x) = x 2 , (iii) f (x) = e−4x , (iv) f (x) = e−x and
(v) sin x
23. Find a general solution of the inhomogeneous equation y′′ + 4y = f (x) for the
following cases: (i) f (x) = 1, (ii) f (x) = x 2 , (iii) f (x) = e−4x , (iv) f (x) = e−x and
(v) sin 2x
24. Find a general solution of the inhomogeneous equation y′′ + 2y′ + 5y = f (x) for the
following cases: (i) f (x) = 1, (ii) f (x) = x 2 , (iii) f (x) = e−x sin 2x, (iv) f (x) = e−x and
(v) sin 2x
25. Use the method of variation of parameter to obtain a general formula for the so-
lution of the equation in problem (21) and then solve the specific cases.

26. Find a general solution of the following ODEs: (i) yiv + 2y′′ + y = 0, (ii) yiv + 4y′′ = 0
and (iii) y′′′ + 4y′′ + 13y′ = 0
27. Determine the solution of the initial value problem:

t³ d³y/dt³ + t dy/dt − y = t²; y(1) = 1; y′(1) = 3; y″(1) = 14

28. Let L be a linear differential operator defined by Lu = u′ − A(t)u, where A is an n × n matrix with components in C[0, a]. The equation Lu = 0 is equivalent to the linear system

u′ = A(t)u (14.42)

(a) Show that the expression v^T Lu is an exact derivative if v satisfies the adjoint equation

v′ = −A(t)^T v (14.43)

What is the adjoint operator?


(b) Show that

v^T Lu − (L*v)^T u = d/dt (v^T u)

This is known as Lagrange's identity and by integrating it we get the so-called Green's formula

⟨Lu, v⟩ − ⟨u, L*v⟩ = v(a)^T u(a) − v(0)^T u(0)

(c) If U is a fundamental matrix for equation (14.42) and V for equation (14.43), show that U^T V = C, where C is a nonsingular constant matrix.
(d) Discuss the relationship between scalar and vector adjoints for the case n = 2.
15 Introduction to the theory of functions of a
complex variable
The theory of functions of a complex variable is helpful in determining and under-
standing the solutions (and their properties) of linear equations. Specifically, it is use-
ful for (i) inverting the Laplace transformation, (ii) evaluation and inversion of Fourier
transforms, (iii) series solutions of ordinary differential equations, (iv) solution of lin-
ear partial differential equations (conformal mapping and solution of Laplace’s equa-
tion in the plane) and (v) evaluation of certain definite integrals. This chapter pro-
vides an introduction to the theory of functions of a complex variable with examples
selected from applications.

15.1 Complex valued functions


15.1.1 Algebraic operations with complex numbers

We have already seen that the set of all complex numbers forms a field, which we shall
denote by ℂ. The symbol z = x + iy, which can stand for any complex number in the
set ℂ is called a complex variable. We shall use the notation Re{z} = real part of z = x
and Im{z} = imaginary part of z = y. The complex conjugate of z is a complex number
x − iy and will be denoted by z̄ or z*.
If z1 = x1 + iy1 ∈ ℂ and z2 = x2 + iy2 ∈ ℂ, then the usual algebraic operations
(addition/subtraction and multiplication/division) are defined by

z1 ± z2 = (x1 ± x2 ) + i(y1 ± y2 )
z1 z2 = (x1 + iy1 )(x2 + iy2 )
= (x1 x2 − y1 y2 ) + i(x1 y2 + x2 y1 )
z1 x1 + iy1 (x1 + iy1 )(x2 − iy2 )
= =
z2 x2 + iy2 x22 + y22
x1 x2 + y1 y2 (x y − x1 y2 )
= + i 2 21 (for z2 ≠ 0)
x22 + y22 x2 + y22

The modulus or absolute value of z = x + iy is defined as |z| = √x 2 + y2 .

15.1.2 Polar form of complex numbers

Since a complex number z = x + iy can be identified with the ordered pair (x, y), it
can be represented as a point in the x − y plane, called the complex plane or Argand
diagram. This is shown in Figure 15.1.


Figure 15.1: Representation of a complex number in the plane (Argand diagram).

Referring to this figure, we have

x = r cos θ; y = r sin θ; r = √(x² + y²),

where r is the modulus and the angle θ with the positive x-axis (in counterclockwise
direction) is called the argument. In polar form, the complex number is written as

z = x + iy = r(cos θ + i sin θ) = re^{iθ}.

The last equality follows from the Euler's formula

e^{iθ} = cos θ + i sin θ. (15.1)

The polar form of the complex number is convenient for some operations such as multiplication or division. For example, if we denote zk = rk e^{iθk} (k = 1, 2, . . . , n), then

z1 z2 = r1 e^{iθ1} r2 e^{iθ2} = r1 r2 e^{i(θ1 + θ2)}.

Applying this to the n-th power of a complex number, we get De Moivre's theorem

z^n = (re^{iθ})^n = r^n (cos θ + i sin θ)^n = r^n (cos nθ + i sin nθ) (15.2)

or

cos nθ + i sin nθ = (cos θ + i sin θ)^n. (15.3)

By expanding the RHS of equation (15.3) and equating the real and imaginary parts,
we can express cos nθ or sin nθ (for n = 1, 2, 3, . . .) in terms of cos θ and sin θ.

15.1.3 Roots of complex numbers

If n is a positive integer, then from De Moivre's theorem and the relation e^{2kπi} = 1 for k = 0, 1, 2, . . . , we can write

z^{1/n} = (re^{iθ})^{1/n} = r^{1/n} (e^{iθ+2kπi})^{1/n} = r^{1/n} e^{i(θ+2kπ)/n}, k = 0, 1, . . . , n − 1 (15.4)

It follows from equation (15.4) that there are n distinct values of z^{1/n}, located on a circle of radius r^{1/n} and with arguments differing by 2π/n. For example, for n = 4 and r = 1, we get the fourth roots of unity: 1, −1, i and −i, located on a circle of unit radius as shown in Figure 15.2.

Figure 15.2: Fourth roots of unity on the unit circle.

Similarly, the nth roots of unity (z = 1) can be expressed as 1^{1/n} = 1, ω, ω², . . . , ω^{n−1}, where ω = e^{2πi/n}. For example, for n = 4, ω = e^{πi/2} = i.
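Equation (15.4) is easy to tabulate; the short sketch below lists the fourth roots of unity:

(* Sketch: the four values of 1^(1/4) from equation (15.4). *)
ComplexExpand[Table[Exp[2 k Pi I/4], {k, 0, 3}]]
(* => {1, I, -1, -I} *)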

15.1.4 Complex-valued functions

If to each value of the complex number z in a set, we assign another complex number
w such that w = f (z), where f is a function, then w is called a function of a complex
variable. The function can be single-valued or multivalued. In general, we write

w = f (z) = f (x + iy)
= u(x, y) + iv(x, y) (15.5)

where u(x, y) and v(x, y) are the real and imaginary part of f (z). Unless otherwise spec-
ified, we assume f (z) is single-valued.

Example 15.1. We consider some elementary functions and determine their real and
imaginary parts.
1. Some elementary functions of a complex variable:
(a) w = z² = (x + iy)² = (x² − y²) + i(2xy) ≜ u(x, y) + iv(x, y).
(b) w = e^z = e^{x+iy} = e^x (cos y + i sin y) = e^x cos y + ie^x sin y ≜ u(x, y) + iv(x, y). The function e^z is periodic with complex period 2πi, since e^z = e^{z+2πi}.
(c) w = sin z = sin(x + iy) = sin x cos iy + cos x sin iy = sin x cosh y + i cos x sinh y ≜ u(x, y) + iv(x, y). The function sin z is periodic with real period 2π.
2. Some functions in polar coordinates:
(a) w = z² = (re^{iθ})² = r² e^{2iθ} = r² cos 2θ + ir² sin 2θ ≜ u(r, θ) + iv(r, θ).
(b) w = ln z. Writing z = re^{iθ} = re^{iθ+2kπi}, k = 0, ±1, ±2, . . . , we get w = ln(re^{iθ+2kπi}) = ln r + i(θ + 2kπ) ≜ u(r, θ) + iv(r, θ), k = 0, ±1, ±2, . . . . The function ln z is infinitely many valued. The principal or main branch is given by ln r + iθ and is denoted by Ln z.

15.2 Limits, continuity and differentiation


15.2.1 Limits

Let f (z) be a function defined in some neighborhood of a point z0 , with the possible
exception of z0 itself. We say that the limit of f (z) as z approaches z0 is w0 and write

lim_{z→z0} f(z) = w0 (15.6)

if for any ϵ > 0, there exists a δ > 0 such that

|f(z) − w0| < ϵ (15.7)

whenever 0 < |z − z0| < δ.

15.2.2 Continuity

Let f (z) be a function defined in a neighborhood of z0 . Then we say that f (z) is contin-
uous at z0 if (i) limz→z0 f (z) = w0 exists, (ii) f (z0 ) is defined and (iii) w0 = f (z0 ).
A function f (z) is said to be continuous in a region if it is continuous at all points
of the region.

15.2.3 Derivative

Let f(z) be a complex-valued function defined in a neighborhood of z0. Then the derivative of f(z) at z0 is defined by

f′(z0) = lim_{Δz→0} [f(z0 + Δz) − f(z0)]/Δz (15.8)

provided the limit exists and is independent of the manner in which Δz → 0.

Definition. A complex-valued function f(z) is said to be analytic (holomorphic or regular) at a point z0 if there exists a neighborhood |z − z0| < δ at all points of which f′(z) exists. The function f(z) is analytic in an open region ℛ if it is analytic at all points of ℛ.

Remarks.
(i) Analyticity is a property defined over open sets, while differentiability could conceivably hold at one point only. For example, for the function

f(z) = |z²|,

the derivative exists at z0 = 0 but does not exist at any other point.
(ii) As in the case of a function of a real variable, differentiability implies continuity but the converse is not true. For example, the function f(z) = |z²| is continuous everywhere but is differentiable only at z0 = 0. Similarly, the function f(z) = z̄ is continuous everywhere but is not differentiable at any point.

15.2.4 The Cauchy–Riemann equations

The property of analyticity of a function of the complex variable z dictates a relationship between the derivatives of its real and imaginary parts. If we write

f (z) = u(x, y) + iv(x, y) (15.9)

and if f(z) is differentiable at z0, then f′(z0) is independent of how Δz approaches zero. If Δz = Δx, then from the definition of the derivative, it follows that

f′(z0) = ∂u/∂x (x0, y0) + i ∂v/∂x (x0, y0) (15.10)

However, if Δz = iΔy, then we have

f′(z0) = (1/i) ∂u/∂y (x0, y0) + ∂v/∂y (x0, y0)
    = ∂v/∂y (x0, y0) − i ∂u/∂y (x0, y0) (15.11)

Thus, if f ′ (z0 ) exists, a necessary condition from equation (15.10) and equation (15.11)
is
15.2 Limits, continuity and differentiation | 323

∂u/∂x = ∂v/∂y, ∂u/∂y = −∂v/∂x @ (x0, y0) (15.12)

Equations (15.12) are referred to as the Cauchy–Riemann (C–R) equations. It may be shown that a necessary and sufficient condition for f(z) to be analytic in a region ℛ is that the C–R equations hold and the first partial derivatives of u(x, y) and v(x, y) are continuous. In polar coordinates, the C–R equations become

∂u/∂r = (1/r) ∂v/∂θ, ∂v/∂r = −(1/r) ∂u/∂θ (15.13)
Definition. A real-valued function ϕ(x, y) is said to be harmonic in a region ℛ if all its
second partial derivatives are continuous and

∂²ϕ/∂x² + ∂²ϕ/∂y² = 0 (Laplace's eq.) (15.14)

at each point of ℛ.

Theorem (Analytic functions). If f(z) = u + iv is analytic in a region ℛ, then the functions u(x, y) and v(x, y) are harmonic in ℛ.

The proof of this theorem follows from the C–R equations. u and v are called the
conjugate harmonic functions.
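As a concrete check (a sketch using f(z) = z² from Example 15.1), both the C–R equations (15.12) and the harmonicity of u and v can be verified symbolically:

(* Sketch: C-R equations and Laplacians for u = x^2 - y^2, v = 2 x y. *)
u = x^2 - y^2; v = 2 x y;
{D[u, x] == D[v, y], D[u, y] == -D[v, x]}          (* => {True, True} *)
{D[u, {x, 2}] + D[u, {y, 2}], D[v, {x, 2}] + D[v, {y, 2}]}  (* => {0, 0} *)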

15.2.5 Some elementary functions of a complex variable

Here, we give some examples of some elementary functions of a complex variable:


1. Polynomial functions
Let

w = Pn(z) = a0 + a1 z + a2 z² + ⋅ ⋅ ⋅ + an z^n,


2. Algebraic functions
These functions are defined by solutions to equations of the form:

P0(z) w^n + P1(z) w^{n−1} + ⋅ ⋅ ⋅ + Pn(z) = 0.

For example,

w1(z) = z^{1/2}
w2(z) = z^{1/2} + z^{1/3}

are algebraic functions. [w1(z) is a two-valued function while w2(z) is six-valued.]

3. Rational algebraic functions
These functions are of the form:

w(z) = Pn(z)/Pm(z).

For example, for n = m = 1, we obtain the bilinear transformation

w(z) = (az + b)/(cz + d); a, b, c, d ∈ ℂ.

4. Exponential functions

w = e^z = e^x cos y + ie^x sin y.

This function is periodic with period 2πi, i. e.,

w(z) = w(z + 2πi)

5. Logarithmic functions

w = ln z = ln(re^{iθ}) = ln(re^{iθ+2kπi}), k = 0, 1, 2, . . .
  = ln r + i(θ + 2kπ), k = 0, 1, 2, . . .

As stated earlier, this function (which is the inverse of the exponential function) is infinite-valued. The primary branch corresponding to k = 0 is denoted by

Ln z = ln r + iθ.

More generally, the function f(z)^{g(z)} is defined by

f(z)^{g(z)} = e^{g(z) ln f(z)}

and may be infinite-valued. For example,

z^α = e^{α ln z}

is multi-valued for α = 1/n (n = 2, 3, . . .) and infinite-valued for general (and complex values of) α.
6. Trigonometric functions

sin z = (e^{iz} − e^{−iz})/(2i), cos z = (e^{iz} + e^{−iz})/2, tan z = sin z/cos z

are trigonometric functions and periodic with period 2π. The inverse function such as

sin^{−1} z = (1/i) ln(iz + √(1 − z²))

is infinite-valued.
7. Hyperbolic functions

sinh z = (e^z − e^{−z})/2, cosh z = (e^z + e^{−z})/2, tanh z = sinh z/cosh z

are periodic with period 2πi. The inverse function such as

sinh^{−1} z = ln(z + √(z² + 1))

is infinite-valued.

15.2.6 Zeros and singular points of complex-valued functions

Definition (Zeros). If f(z) is analytic at z = a and f(a) = f′(a) = f″(a) = ⋅ ⋅ ⋅ = f^{[n−1]}(a) = 0 but f^{[n]}(a) ≠ 0, then we say that f(z) has a zero at z = a of order n.
Definition (Singular point). The point z = z0 is a singular point of f (z) if it is not an-
alytic at z0 . It is called an isolated singular point if there is no other singular point in
the neighborhood, i. e., ∃δ > 0 ∋ there is no other singular point in 0 < |z − z0 | < δ.
Singular points may be classified as follows:

(a) Poles
If z0 is a singular point of f(z) such that

lim_{z→z0} (z − z0)^n f(z) = A ≠ 0,

then z0 is called a pole of order n. For n = 1, it is called a simple pole.

Example 15.2.
(i) f1(z) = e^z/(z − 1) has a simple pole at z = 1.
(ii) f2(z) = 1/sin z has simple poles at z = ±kπ, k = 0, 1, 2, . . .
(iii) f3(z) = 1/z³ has a pole of order 3 at z = 0.

(b) Branch point
If f(z) is a multivalued function centered at z = z0 or, equivalently, f(z) takes multiple values on the circle |z − z0| = ε > 0, then z0 is a branch point.

Example 15.3.
(i) f1(z) = √(z + 1) has a branch point at z = −1.
(ii) f2(z) = ln z has a branch point at z = 0.
(iii) f3(z) = cos √z has no branch points and is analytic for all z.

(c) Removable singularity
z = z0 is called a removable singularity of f(z) if

lim_{z→z0} f(z) exists.

For example, f(z) = sin √z/√z has a removable singularity at z = 0.

(d) Essential singularity
If z = z0 is a singularity of f(z) that is not a pole, branch point or removable singularity, then z0 is an essential singularity. For example, the function e^{1/z} has an essential singularity at z = 0.

(e) Singularity at infinity
Let

g(z) = f(1/z).

If z = 0 is a singularity of g(z), then z = ∞ is a singularity of f(z). For example, e^z has a singularity at z = ∞.

Definition. If f(z) is analytic for all z (with |z| < ∞), it is called an entire function.

For example, sin z, cos √z, e^z and J0(z) are entire functions.

15.3 Complex integration, Cauchy’s theorem and integral


formulas
Let f(z) be continuous at all points of a curve C (Figure 15.3), which is assumed to be of finite length (also called rectifiable). Let

Sn = ∑_{k=1}^n f(ξk)(zk − zk−1) = ∑_{k=1}^n f(ξk) Δzk; Δzk = zk − zk−1 (15.15)

Define the complex line integral

∫_a^b f(z) dz = ∫_C f(z) dz = lim_{n→∞, max Δzk→0} Sn (15.16)

If this definite integral exists, f (z) is said to be integrable along curve C.



Figure 15.3: Schematic diagram illustrating integration in z-plane along a curve C.

From the above definition, it is seen that the complex line integral of f(z) = u(x, y) + iv(x, y) can be expressed in terms of two real line integrals:

∫_C f(z) dz = ∫_C (u + iv)(dx + i dy)
    = ∫_C u(x, y) dx − v(x, y) dy + i ∫_C v(x, y) dx + u(x, y) dy (15.17)

Further, many properties of the real definite/line integrals also apply to complex inte-
grals.

15.3.1 Simply and multiply connected domains

A region ℛ is called simply connected if any simple curve, which lies in ℛ can be
shrunk to a point without leaving ℛ (equivalently, the region ℛ has no holes). In Fig-
ure 15.4, ℛ1 and ℛ2 are simply-connected while ℛ3 and ℛ4 are not. The regions ℛ3
and ℛ4 are multiply-connected (or they have one or more holes).

15.3.2 Contour integrals and traversal of a closed path

Let ℛ be a region in the complex plane and C be its boundary. The boundary is said to
be traversed in the positive direction if an observer moving on C has the region to the
left. We use the notation

∮_C f(z) dz (15.18)

to denote the integration of f(z) along the closed curve C.



Figure 15.4: Schematic diagram illustrating simply and multiply connected domains.

Example 15.4. Evaluate ∮_C z³ dz, where C is the circle |z| = 1.

Along C, we have z = e^{iθ} and dz = ie^{iθ} dθ. Thus,

∮_C z³ dz = ∫_0^{2π} e^{i3θ} ie^{iθ} dθ = i ∫_0^{2π} e^{i4θ} dθ = i ∫_0^{2π} (cos 4θ + i sin 4θ) dθ = 0
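The same parametrization can be handed to Mathematica directly (a sketch of Example 15.4):

(* Sketch: contour integral of z^3 around |z| = 1 via z = E^(I t). *)
Integrate[Exp[I t]^3 I Exp[I t], {t, 0, 2 Pi}]   (* => 0 *)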

15.3.3 Cauchy’s theorem

Suppose that f (z) is analytic in a region ℛ and on its boundary C. Then

∮ f (z) dz = 0. (15.19)
C

This fundamental theorem may be shown to be valid for both simply and multiply
connected domains. It was proved by Cauchy with the further assumption of f ′ (z) to
be continuous. However, Goursat removed the restriction and for this reason, it is also
referred to as the Cauchy–Goursat theorem. Cauchy’s proof utilizes Green’s theorem in
the plane.

Green’s theorem. Let P(x, y) and Q(x, y) be continuous and have continuous partial
derivatives in a region ℛ and on its boundary C. Then

∮_C P dx + Q dy = ∬_ℛ (∂Q/∂x − ∂P/∂y) dx dy (15.20)

This theorem is valid for simply as well as multiply connected domains.

If f ′ (z) is continuous, the Cauchy–Riemann equations are valid and

∮_C f(z) dz = ∫_C u(x, y) dx − v(x, y) dy + i ∫_C v(x, y) dx + u(x, y) dy
    = ∬_ℛ (−∂v/∂x − ∂u/∂y) dx dy + i ∬_ℛ (∂u/∂x − ∂v/∂y) dx dy
    = 0 (15.21)

The consequences listed below follow from Cauchy's theorem:
1. If f(z) is analytic in a simply connected region ℛ, the integral ∫_a^b f(z) dz is independent of the path in ℛ joining any two points a and b in ℛ.
2. For a and z in ℛ, F(z) = ∫_a^z f(z′) dz′ is analytic and F′(z) = f(z).
3. If C1 and C2 are any closed curves in ℛ,

∮_{C1} f(z) dz = ∮_{C2} f(z) dz,

where C1 and C2 are traversed in the positive sense (see Figure 15.5).

Figure 15.5: Schematic diagram illustrating positive traversal and Cauchy’s theorem.

Example 15.5. Evaluate ∮_C dz/(z − a), where C is any simple closed curve and z = a is (i) outside of C and (ii) inside C.
(i) If z = a is outside of C, by Cauchy's theorem ∮_C dz/(z − a) = 0.
(ii) If z = a is inside C, let C2 be a circle of radius ϵ centered at z = a. Then

I1 = ∮_C dz/(z − a) = ∮_{C2} dz/(z − a)

But on C2,

z − a = ϵe^{iθ}; dz = iϵe^{iθ} dθ

Thus,

I1 = ∮_{C2} dz/(z − a) = ∫_0^{2π} iϵe^{iθ} dθ/(ϵe^{iθ}) = 2πi

Example 15.6. Evaluate In = ∮_C dz/(z − a)^n, n = 2, 3, 4, . . . , where C is any simple closed curve.
Using the same procedure as above, it is easily seen that In = 0 for both cases, i. e.,
when z = a is inside or outside of C.

15.3.4 Cauchy’s integral formulas

Let f (z) be analytic inside and on a simple closed curve C and let a be any point in-
side C. Then

f(a) = (1/2πi) ∮_C f(z)/(z − a) dz (15.22)

f^{[n]}(a) = (n!/2πi) ∮_C f(z)/(z − a)^{n+1} dz, n = 1, 2, 3, . . . (15.23)

These formulas follow from Cauchy’s theorem and are remarkable as they imply that
if f (z) is known on a simple closed curve C, then the values of the function and all its
derivatives can be found at all points inside C. An extended form of equation (15.22)
is also useful for developing an inversion formula for the Laplace transform. It is also
useful for solving Laplace’s equation in two spatial dimensions with Dirichlet bound-
ary conditions.
The following are some consequences of Cauchy’s integral formulas:
1. Every polynomial of degree n,

Pn(z) ≡ a0 + a1 z + ⋅ ⋅ ⋅ + an z^n = 0,

with n ≥ 1 and an ≠ 0, has exactly n roots, counting multiplicity [also known as the fundamental theorem of algebra].

2. If f(z) is analytic inside and on a circle C with center at a and radius r, then f(a) is the mean value of f(z) on C, i. e.,

f(a) = (1/2π) ∫_0^{2π} f(a + re^{iθ}) dθ

[This is also known as Gauss' mean value theorem.]


3. If f (z) is analytic inside and on a simple closed curve C, except for a finite number
of poles inside C, then

1 f ′ (z)
∮ dz = N − P
2πi f (z)
C

where N and P are, respectively, the number of zeros and poles of f (z) inside C.
4. Suppose that f (z) and g(z) are analytic inside and on a simple closed curve C and
suppose that |g(z)| < |f (z)| on C, then f (z) + g(z) and f (z) have the same number
of zeros inside C. [This is also known as Rouche’s theorem.]

For a proof of these and many other related theorems, we refer to the book by Spiegel
[29].

Example 15.7 (Poisson’s integral formula for a circle). Let f (z) be analytic inside and
on a circle C defined by |z| = R and let z = reiθ be any point inside C. We have by
Cauchy’s integral formula

f(z) = f(re^{iθ}) = (1/2πi) ∮_C f(w)/(w − z) dw. (15.24)

The inverse of the point z with respect to the circle C lies outside of C and is given by R²/z̄. By Cauchy's theorem,

0 = (1/2πi) ∮_C f(w)/(w − R²/z̄) dw. (15.25)

Subtraction of equation (15.25) from equation (15.24) gives

f(z) = (1/2πi) ∮_C f(w) (z − R²/z̄)/[(w − z)(w − R²/z̄)] dw (15.26)

Let z = re^{iθ} and w = Re^{iϕ} in equation (15.26) to obtain

f(re^{iθ}) = (1/2π) ∫_0^{2π} (R² − r²) f(Re^{iϕ})/[R² − 2Rr cos(θ − ϕ) + r²] dϕ (15.27)

Writing

f(re^{iθ}) = u(r, θ) + iv(r, θ) and f(Re^{iϕ}) = u(R, ϕ) + iv(R, ϕ)

and separating the real and imaginary parts of equation (15.27) gives

u(r, θ) = (1/2π) ∫_0^{2π} (R² − r²) u(R, ϕ)/[R² − 2Rr cos(θ − ϕ) + r²] dϕ (15.28)

and

v(r, θ) = (1/2π) ∫_0^{2π} (R² − r²) v(R, ϕ)/[R² − 2Rr cos(θ − ϕ) + r²] dϕ (15.29)

We note that the Poisson's integral formula given by equation (15.28) is the solution of the Laplace's equation

∇²u = (1/r) ∂/∂r (r ∂u/∂r) + (1/r²) ∂²u/∂θ² = 0

in a circle with boundary value u(R, ϕ) specified [Dirichlet problem]. A similar formula
may be obtained for the solution of ∇²u = 0 in the upper half-plane (y > 0) with u(x, 0)
specified.
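A quick numerical sanity check of equation (15.28) (a sketch; the boundary data u(1, ϕ) = cos 2ϕ and the sample point are assumed choices, corresponding to the harmonic function u = r² cos 2θ on the unit disk):

(* Sketch: Poisson's formula (15.28) with R = 1, at r = 1/2, theta = Pi/3. *)
{r, th} = {0.5, Pi/3};
NIntegrate[(1 - r^2) Cos[2 p]/(2 Pi (1 - 2 r Cos[th - p] + r^2)), {p, 0, 2 Pi}]
r^2 Cos[2 th]   (* both evaluate to -0.125 *)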

15.4 Infinite series: Taylor’s and Laurent’s series


Let u1(z), u2(z), . . . , un(z), . . . be a sequence of functions of z, single valued in some region of the z-plane. We call U(z) the limit of the sequence {un(z)} as n → ∞ and write

lim_{n→∞} un(z) = U(z)

if given ϵ > 0, we can find a number N (depending in general on both ϵ and z) such that

|un(z) − U(z)| < ϵ for all n > N

In such case, we say that the sequence converges to U(z). If a sequence {un (z)} con-
verges for all z in a region ℛ, we call ℛ the region of convergence of the sequence.
A sequence which is not convergent is called divergent.

Example 15.8. Consider the sequence {un(z) = (2n − 1)/n + i(n + 2)/n} = {1 + 3i, 3/2 + 2i, 5/3 + i(5/3), . . .}. We claim that

lim_{n→∞} un(z) = 2 + i

To verify this, we note that

|un(z) − 2 − i| = |(2n − 1)/n + i(n + 2)/n − 2 − i|
    = |−1/n + i(2/n)|
    = √5/n

Then √5/n < ε ⇒ n > √5/ε. Thus, if ε = 0.01, we may take N = 223 and all terms after the 223rd lie inside a circle of radius 0.01 centered at 2 + i.

15.4.1 Taylor’s series

Taylor's theorem. Let f(z) be analytic in a region ℛ and let z = a be any point in ℛ. Then there exists precisely one power series with center at z = a representing f(z), and it is given by

f(z) = ∑_{n=0}^∞ [f^{(n)}(a)/n!] (z − a)^n
  = f(a) + [f′(a)/1!](z − a) + [f″(a)/2!](z − a)² + ⋅ ⋅ ⋅

The above representation is referred to as Taylor’s series expansion of f (z) and is valid
in the largest open disk with center at z = a in ℛ, i. e., the radius of convergence of the
above Taylor’s series is equal to the distance of the point z = a to the nearest singularity
of f (z), or to the boundary as shown below in Figure 15.6 schematically.

Figure 15.6: Schematic diagram illustrating the region of convergence of Taylor series.

15.4.2 Practical methods of obtaining power series

Let f(z) be analytic in a region ℛ and let z = a be any point in ℛ. Then we have seen that f(z) has a power series representation

f(z) = ∑_{n=0}^∞ [f^{(n)}(a)/n!] (z − a)^n

that converges uniformly in the disk |z − a| < b. The power series may be obtained either by evaluating the derivatives of f(z) or by other methods.

Example 15.9.
(a) f(z) = 1/(1 + z) = 1 − z + z² − z³ + ⋅ ⋅ ⋅ + (−1)^n z^n + ⋅ ⋅ ⋅ (|z| < 1)
(b) f(z) = ln(1 + z) = z − z²/2 + z³/3 − ⋅ ⋅ ⋅ + (−1)^{n−1} z^n/n + ⋅ ⋅ ⋅ (|z| < 1)
(c) f(z) = tan^{−1} z = z − z³/3 + z⁵/5 − ⋅ ⋅ ⋅ + (−1)^n z^{2n+1}/(2n + 1) + ⋅ ⋅ ⋅ (|z| < 1)

In cases (b) and (c), the series represent the principal value of the function.
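These expansions can be generated directly (a sketch):

(* Sketch: the power series of Example 15.9 via Series. *)
Series[1/(1 + z), {z, 0, 4}]
Series[Log[1 + z], {z, 0, 4}]
Series[ArcTan[z], {z, 0, 5}]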

15.4.3 Laurent series

In many applications, it is necessary to expand a function around points where f (z) is


singular. In such cases, Taylor’s theorem cannot be applied and a new type of series,
called the Laurent series, is necessary.
1
Example 15.10. Consider the function f (z) = z 2 (1−z)
. Using the result,

1
= 1 + z + z 2 + z 3 + ⋅ ⋅ ⋅ ; |z| < 1,
1−z

we obtain

1 1
f (z) = + + 1 + z + z2 + z3 + ⋅ ⋅ ⋅ ; 0 < |z| ≤ γ < 1
z2 z

which is valid in the punctured disk 0 < |z| ≤ γ (i. e., all points of the disk |z| ≤
γ excluding the center). This is the Laurent series of the function f (z). The first part
containing the reciprocal powers of z is called the principal part while the rest of the
series containing the constant and positive power of z is called the analytic part of f (z).

Theorem (Laurent). If f (z) is analytic and single-valued on two concentric circles 𝒞1 and
𝒞2 with center at z = a and in the annulus between them, then f (z) can be represented
by the (Laurent) series

f (z) = ∑ an (z − a)n
n=−∞

where
1 f (w)
an = ∮ dw
2πi (w − a)n+1
𝒞

and 𝒞 is a closed curve in the annulus that encircles 𝒞2 (Figure 15.7). The series converges
and represents f (z) in the open annulus obtained from the given annulus by continuously
increasing 𝒞1 and decreasing 𝒞2 until each of the two circles reaches a point where f (z)
is singular.

15.5 The residue theorem and integration by the method of


residues
If f (z) is analytic in a neighborhood of a point z = a, then by Cauchy’s integral theo-
rem, we can write

∫ f (z) dz = 0 (15.30)
𝒞

for any contour 𝒞 in that neighborhood. If, however, f (z) has a pole or an isolated
essential singularity at z = a, and z = a lies in the interior of 𝒞 , then the integral
(equation (15.30)) will, in general, be different from zero. In this case, we may represent
f (z) by a Laurent series
∞ ∞
bn
f (z) = ∑ an (z − a)n + ∑ n
, (15.31)
n=0 n=1 (z − a)

which converges in the annulus 0 < |z − a| < R. Here, R is the distance from z = a to
the nearest singularity of f (z) and

1 f (w)
an = ∮ dw (15.32)
2πi (w − a)n+1
𝒞2
336 | 15 Introduction to the theory of functions of a complex variable

1
bn = ∮(w − a)n−1 f (w) dw (15.33)
2πi
𝒞1

where 𝒞1 and 𝒞2 are the contours enclosing the point z = a as shown in Figure 15.7.

Figure 15.7: Schematic diagram illustrating Laurant’s theorem.

From equations (15.32)–(15.33), we get

1
b1 = ∮ f (w) dw
2πi
𝒞1

󳨐⇒ ∮ f (z) dz = 2πib1 (15.34)


𝒞1

The coefficient b1 in the expansion (equation (15.31)) is called the residue of f (z) at
z = a.
Formally, we may obtain equation (15.34) from equation (15.31) by integrating term
by term as follows:

∞ ∞
bn
∮ f (z) dz = ∮[ ∑ an (z − a)n + ∑ ] dz
n=0 n=1 (z − a)n
𝒞1 𝒞1
∞ ∞
dz
= ∑ an ∮(z − a)n dz + ∑ bn ∮
n=0 n=1 (z − a)n
𝒞1 𝒞1

dz
= 0 + ∑ bn ∮
n=1 (z − a)n
𝒞1
15.5 The residue theorem and integration by the method of residues | 337

But on contour 𝒞1 , we can write

z − a = εeiθ and dz = iεeiθ dθ, with |ε| < R

󳨐⇒
2π 2π
dz iεeiθ dθ
∮ = ∫ = iε1−n ∫ ei(1−n)θ dθ
(z − a)n εn einθ
𝒞1 0 0

0, n ≠ 1
={
2πi, n=1

Thus,

dz
∮ f (z) dz = ∑ bn ∮ = 2πib1
n=1 (z − a)n
𝒞1 𝒞1

= 2πi. Res f (z)|z=a (15.35)

Since b1 is the only term that contributes to the integral, it is called the residue.
Note that the Laurent expansion may be obtained by various methods without
using the formula (equations (15.32)–(15.33)). Hence, we may determine the residue
by one of these methods and then use the formula (equation (15.35)) for evaluating
the contour integral.

Example 15.11.
1. Evaluate

∮ ez dz, where 𝒞 : |z| = 1


𝒞

Since ez is analytic function inside 𝒞 , we have

∮ ez dz = 0
𝒞

2. Evaluate
1
∮ z 2 e z dz, where 𝒞 : |z| = 5
𝒞

1 1
Note that the point z = 0 is an essential singularity of e z . The expansion of e z can
be expressed as
1 1 1 1 1 1 1 1
ez = 1 + + + + ...
z 2! z 2 3! z 3 4! z 4
338 | 15 Introduction to the theory of functions of a complex variable

󳨐⇒
1 1 1 1 1 1
z2e z = z2 + z + + + ...
2! 3! z 4! z 2
󳨐⇒
1 1
b1 = Res(z 2 e z )󵄨󵄨󵄨z=0 =
󵄨
3!

Thus,
1 πi
∮ z 2 e z dz = 2πib1 =
3
𝒞

3. Evaluate

sin z
∮ dz, where 𝒞 : |z| = 1
z3
𝒞

sin z
The expansion of z3
can be expressed as

sin z 1 z3 z5
= (z − + − ⋅ ⋅ ⋅)
z3 z3 3! 5!
1 1 z2
= 2 − + − ⋅⋅⋅
z 3! 5!
sin z
Thus, z3
has a second-order pole at z = 0 with residue as

sin z 󵄨󵄨󵄨󵄨
b1 = Res( )󵄨 =0
z 3 󵄨󵄨󵄨z=0

Thus,

sin z
∮ dz = 2πib1 = 0.
z3
𝒞

15.5.1 Other methods for evaluating residues

(a) Simple pole


Suppose that f (z) has a simple pole at z = a. Then near z = a, f (z) has a representation

p(z)
f (z) =
q(z)

where p(z) and q(z) are analytic at z = a,


15.5 The residue theorem and integration by the method of residues | 339

(z − a) ′′ (z − a)2 ′′′
q(z) = (z − a)[q′ (a) + q (a) + q (a) + ⋅ ⋅ ⋅],
2! 3!

and

q′ (a) ≠ 0, p(a) ≠ 0,

i. e., q(z) has a simple zero at z = a. Thus, the residue of f (z) is given by

(z − a)p(z)
Res f (z)|z=a = lim
z→a q(z)
p(a)
= ′ (15.36)
q (a)

Example 15.12.
4−3z
(i) Consider f (z) = z(z−1)
, which has simple poles at z = 0 and z = 1. Thus,

4 − 3z
Res f (z)|z=0 = lim = −4
z→0 (z − 1)
4 − 3z
Res f (z)|z=1 = lim =1
z→0 z
sin z
(ii) Consider f (z) = tan z = cos z
, which has simple poles at

π
ak = (2k − 1) , k = 0, ±1, ±2, . . .
2

with residue given by

(z − ak ) sin z (z − ak )
Res f (z)|z=ak = lim = sin ak lim
z→ak cos z z→ak cos z
1 sin ak
= sin ak lim = =1
z→ak sin z sin ak

(b) Poles of higher order


If f (z) has a pole of order m(> 1) at z = a, then the Laurent series expansion is of the
form:

b1 b2 bm
f (z) = ∑ an (z − a)n + + + ⋅⋅⋅ +
n=0
z − a (z − a) 2 (z − a)m

󳨐⇒

(z − a)m f (z) = bm + bm−1 (z − a) + ⋅ ⋅ ⋅ + b1 (z − a)m−1 + ∑ an (z − a)n+m
n=0

󳨐⇒
340 | 15 Introduction to the theory of functions of a complex variable

dm−1
[(z − a)m f (z)]󵄨󵄨󵄨z=a = (m − 1)!b1
󵄨
dz m−1

Thus,

1 dm−1
Res f (z)|z=a = b1 = lim { m−1 [(z − a)m f (z)]} (15.37)
(m − 1)! z→a dz

ez
Example 15.13. Consider f (z) = (z−1)3
, which has pole of order 3 at z = 1. Thus,

1 d2 e
Res f (z)|z=1 = lim lim{ 2 [ez ]} =
z→1 2! z→1 dz 2

Alternatively, we can rewrite the function f (z) by transforming z = 1 + u, which


simplifies f (z) as

e1+u e u2 u3 u4
f = = 3 (1 + u + + + + ⋅ ⋅ ⋅)
u 3 u 2! 3! 4!
e e e 1 u
= 3 + 2 + + + + ⋅⋅⋅
u u 2!u 3! 4!
󳨐⇒
e e
b1 = Res f (z)|z=1 = Res f (u)|u=0 = =
2! 2

15.5.2 Residue theorem

Let f (z) be a single-valued function, which is analytic inside a simple closed path 𝒞
and on 𝒞 , except for finitely many singular points at α1 , α2 , . . . αm inside 𝒞 . Then

m
∮ f (z) dz = 2πi ∑ Res f (z)|z=αj
𝒞 j=1

Proof. Consider the schematic of a simply connected contour 𝒞 and positively oriented
circles 𝒞k (interior to 𝒞 and centered at z = αk ) as shown in Figure 15.8.
It follows from Cauchy’s theorem that

∮ f (z) dz = ∮ f (z) dz + ∮ f (z) dz + ⋅ ⋅ ⋅ + ∮ f (z) dz


𝒞 𝒞1 𝒞2 𝒞m

= 2πi Res f (z)|z=α1 + 2πi Res f (z)|z=α2 + ⋅ ⋅ ⋅ + 2πi Res f (z)|z=αm


m
= 2πi ∑ Res f (z)|z=αj (15.38)
j=1
15.5 The residue theorem and integration by the method of residues | 341

Figure 15.8: Simply connected contour 𝒞 containing positively oriented circles 𝒞k centered at z = αk .

This important theorem has various applications in connection with complex and real
integrals that appears in the inversion of Laplace and Fourier transforms. This will be
illustrated in later chapters.

Problems
1. (Complex numbers and functions):
(a) Evaluate the following:
(i) sin−1 2, (ii) cos(1 + i), (iii) ii , (iv) ln(−4) and (v) sinh−1 z
(b) Use the definition of elementary functions to prove the following:
sin 2x sinh 2y
(i) sin iz = i sinh z, (ii) cos iz = cosh z and (iii) tan z = cos 2x+cosh 2y
+i cos 2x+cosh 2y
2. (Real and complex roots of nonlinear equations): Determine all the roots (real and
complex) of the following equations:
(a) z sin z = γ cos z (γ = 2)
kp e−zD
(b) 1 + 1+τz
= 0 (kp = 1, τ = 1, D = 1)

Remark. Equation (a) appears in the solution of unsteady state heat/mass trans-
form problems while (b) appears in the stability analysis of a closed loop control
system with delay.

3. Determine which of the following functions are analytic:


(a) z 3 + z
(b) arg z
1
(c) 1−z
z
(d) z+2 .
1 1
4. Determine the six branches of the function f (z) = z 2 + z 3 in terms of polar coor-
dinates.
5. Show that the following functions are harmonic and find the analytic function
f (z) = u + iv:
(a) u = ln(x 2 + y2 ) and
(b) v = − sin x sinh y
(c) u = xex cos y − yex sin y
6. Verify Cauchy’s theorem for the function f (z) = 3z 2 + iz − 4 if C is the unit circle.
342 | 15 Introduction to the theory of functions of a complex variable

7. Show that

1 ezt
∮ 2 dz = sin t, t>0
2πi z + 1
C

where C is the circle |z| = 3.


8. Find the steady-state temperature T at each point inside the unit disk if the tem-
perature of the rim is at the levels shown in Figure 15.9. What is the temperature
at the center?

Figure 15.9: Schematic of a disc subjected to a temperature distribution at the rim.

9. (Singularity and convergence)


(a) Locate and name the singularities of the following functions:
2
(i) (z1+z 1
3 −1) sin( 1+z ), (ii) sech z, (iii) tan (z +2z +2) and (iv) exp(tan z1 ) (v) zsinh
−1 2 z√a

sinh √z
(b) Determine the region of convergence for:
e2πinz (z+i)n n! n
(i) ∑∞n=1 (n+1)3/2 , (ii) ∑n=1 (n+1)(n+2) and (iii) ∑n=1 (n+1)n+1 z
∞ ∞

(c) Expand each of the following in a Laurent series about z = 0, naming the type
of singularity in each case:
z2 4
(i) 1−cos z
, (ii) ez 3 , (iii) z 2 e−z , (iv) coshz z and (v) z sinh √z
−1

z
(d) State whether each of the following functions are entire, meromorphic or nei-
ther:
(i) z 2 e−z , (ii) cot 2z, (iii) 1−cos z
z
, (iv) z sin z1 and (v) sin√z√z (vi) z + z1
10. Show that

sin ax 1 a 1
∫ dx = coth −
e2πx − 1 4 2 2a
0
15.5 The residue theorem and integration by the method of residues | 343

aiz
e
Hint: Use e2πz −1
around a rectangle with vertices at 0, R, R + i, i and let R → ∞.
11. Nyquist stability criterion: Let f (z) = Pn (z) + Qm (z)e−zD , where D > 0; Pn (z) and
Qm (z) are relative prime polynomials, n > m and f (z) has no zeros on the imagi-
nary axis while Pn (z) has N zeros in the right half-plane. Prove that for the function
f (z) to have no zeros in the right half-plane, it is necessary and sufficient that the
point

Qm (z) −zD
w=− e
Pn (z)

wind around the point w = 1, N times in the positive direction while the point z
traverses the entire imaginary axis upwards.
(a) Apply the theorem to the following function to determine the domain of the
real numbers kp and τ for which all the zeros of the function lie in the left
half-plane:

f (z) = kp e−zD + τz + 1

(b) Extend the results in (a) to second- and higher-order systems.


12. Show that

x b−1

π
∫ = (0 < b < 1).
1 + x sin bπ
0
16 Series solutions and special functions
In this chapter, we illustrate ordinary, regular and irregular singular points of a first-
and second-order differential equations and method of obtaining series solutions. We
also introduce various special functions that arise in applications.

16.1 Series solution of a first-order ODE


To illustrate the types of solutions, we consider a linear first-order equation:

dw
+ p(z)w = 0 (16.1)
dz

and four special cases of the coefficient function p(z) as discussed below.

Case 1: p(z) = 1
The exact solution is w(z) = c exp(−z), where c is a constant.
z = 0 is an ordinary point and the solution is an entire function.

1
Case 2: p(z) = 2z
c
The exact solution is w(z) = √z .
z = 0 is a regular singular point and the solution has a branch point at z = 0.

Case 3: p(z) = 3z
The exact solution is w(z) = zc3 .
z = 0 is a regular singular point and the solution has a pole of order 3 at z = 0.

1
Case 4: p(z) = z2
The exact solution is w(z) = c exp( z1 ).
z = 0 is an irregular singular point and the solution has an essential singularity at
z = 0.
More generally, when p(z) = kz n−1 with k ≠ 0, the exact solution is given by

c exp(− nk z n ), n ≠ 0
w(z) = { (16.2)
cz −k , n=0

where c is a constant. Thus, when n < 0 or when n is not an integer, z = 0 is an


irregular singular point and the solution has an essential singularity at z = 0. When n
is a natural number, z = 0 is an ordinary point and the solution is an entire function.
But when n = 0, (i) for all noninteger k, z = 0 is a regular singular point and the

https://doi.org/10.1515/9783110739701-017
16.2 Ordinary and regular singular points | 345

solution has a branch point at z = 0; (ii) for any positive integer k, z = 0 is a regular
singular point and the solution has a pole of order k and (iii) for any negative integer
k, z = 0 is an ordinary point and the solution is an entire function.

16.2 Ordinary and regular singular points


Consider the homogeneous linear ODE of order n,

w[n] + pn−1 (z)w[n−1] + ⋅ ⋅ ⋅ + p1 (z)w′ (z) + p0 (z)w(z) = 0 (16.3)

where

di w
w[i] =
dz i

Definition. z0 (≠ ∞) is called an ordinary point of equation (16.3) if pi (z), i = 0, 1, . . . ,


n − 1 are analytic at z0 .

Examples. (a)
w′′ (z) + z 2 w′ (z) + (z − 3)w = 0 (16.4)

All points are ordinary points


(b)

1
w′′ (z) + w′ (z) + ez w = 0 (16.5)
(1 + z 2 )

All points except z = ±i are ordinary points.

Theorem. Suppose that z0 is an ordinary point of (16.3). Then (16.3) has n linearly in-
dependent solutions that are analytic at z0 . Each solution may be expanded in a Taylor
series

w(z) = ∑ an (z − z0 )n (16.6)
n=0

and the radius of convergence of this series is at least as large as the distance to the
nearest singularity of the coefficient functions pi (z).

Example 16.1. Consider the second-order equation

w′ (z)
w′′ (z) + + ez w = 0, (16.7)
1 + z2

and note that z0 = 0 is an ordinary point of equation (16.7) To determine solutions of


the form,
346 | 16 Series solutions and special functions


w(z) = ∑ an z n (16.8)
n=0

We substitute equation (16.8) into equation (16.7) to obtain

∞ ∞ ∞
zn ∞
(1 + z 2 ) ∑ n(n − 1)an z n−2 + ∑ nan z n−1 + (1 + z 2 )( ∑ )( ∑ an z n ) = 0
n=0 n=0 n=0
n! n=0

We solve for an by equating the coefficients of various powers of z to zero.


z 0 : 2a2 + a1 + a0 = 0
z 1 : 6a3 + 2a2 + a1 + a0 = 0
z 2 : 12a4 + 3a3 + 3a2 + a1 + 32 a0 = 0,

etc.

z2 z2 z4
w(z) = a0 (1 − + ⋅ ⋅ ⋅) + a1 (z − + + ⋅ ⋅ ⋅)
2 2 24
= a0 w0 (z) + a1 w1 (z)

Here, w0 (z) and w1 (z) are the two linearly independent solutions.

Definition (Regular singular point, r. s. p.). The point z0 is a regular singular point of
equation (16.3) if not all pi (z), i = 0, 1, 2, . . . , (n−1) are analytic at z0 but if (z −z0 )n p0 (z),
(z − z0 )n−1 p1 (z), . . . , (z − z0 )pn−1 (z), are analytic at z0.

Example 16.2. Consider the second-order equation

z 2 w′′ + zw′ − w = 0
w′ w
w′′ + − 2 =0
z z
1 1
p0 (z) = − , p1 (z) =
z2 z

z0 = 0 is a regular singular point.

Theorem. Suppose that z0 is regular singular point of (16.3). Then the n linearly inde-
pendent solutions of (16.3) have one of the following forms:

w1 (z) = (z − z0 )α1 f1 (z)


w2 (z) = (z − z0 )αi fi (z), αi = ̸ α1
w3 (z) = (z − z0 ) f1 (z) ln(z − z0 ) + (z − z0 )β g1 (z)
α1

k
i
w4 (z) = (z − z0 )γ ∑ [ln(z − z0 )] fi (z), for some k = 1, 2, . . . , n − 1
i=0
16.2 Ordinary and regular singular points | 347

where fi (z), gi (z) are analytic at z0 and have a Taylor series, which converges in a disk
extending to the nearest singularity of pi (z). αi , β and γ are called indicial exponents.
A solution of equation (16.3) is either analytic at z0 or if it is not analytic, the singularity
must be either a pole or an algebraic or logarithmic branch point. The indicial exponents
can be obtained by solving a polynomial of degree n whose coefficients depend on pi (z).

Frobenius method
To determine the solution(s) of the form w1 (z), we write

w1 (z) = (z − z0 )α1 ∑ ai (z − z0 )i
i=0

pk (z) = ∑ pki (z − z0 )i , k = 0, 1, . . . , n − 1
i=0

To find other forms of solutions, expand fi (z) and gi (z) in a Taylor series around z0 . We
now consider the special case n = 2 (second-order differential equation). Let α1 and α2
be the two indicial exponents. Then the different forms of the solutions are

w1 (z) = (z − z0 )α1 f1 (z) (16.9)


α2
w2 (z) = (z − z0 ) f2 (z) (16.10)

or

w2 (z) = (z − z0 )α2 g1 (z) + (z − z0 )α1 ln(z − z0 )g2 (z) (16.11)

The indicial equation is given by

α2 + [p10 − 1]α + p00 = 0

where p10 = limz→0 (z − z0 )p1 (z0 ), p00 = limz→0 (z − z0 )2 p0 (z0 )

Case 1: α1 ≠ α2 ; α1 − α2 ≠ integer
Two Frobenius solutions of the form given in equation (16.9) and equation (16.10)

Case 2: α1 − α2 = 0, 1, 2, . . .
Either two Frobenius solutions or one Frobenius solution and one with logarithm

Example 16.3.
zw′′ + w′ + zw = 0

(This equation is a special case of z 2 w′′ +zw′ +(z 2 −n2 )w = 0, which is Bessel’s equation
of order n)
348 | 16 Series solutions and special functions

w′
w′′ + +w =0
z

z = 0 is regular singular point.

1
p0 (z) = 1, p1 (z) =
z

Indicial equation

α2 + [p10 − 1]α + p00 = 0


1
p10 = lim z. =1
z→0 z

p00 = lim z 2 .1 = 0
z→0

α2 = 0

α = 0, 0

Substituting

w = ∑ an z n
n=0

gives

zw′′ + w′ + zw = ∑ {an n2 + an−2 }z n−1
n=0


an−2
an = −
n2
a2n+1 = 0
a0 a2 a4
a2 = − , a4 = − , a6 = − , etc.
22 42 62


(−1)k z 2k
w0 (z) = a0 ∑ 2k 2
= a0 J0 (z)
k=0 2 (k!)

where
16.2 Ordinary and regular singular points | 349

∞ (−1)k ( z2 )2k
J0 (z) = ∑ = Bessel function of order zero
k=0
(k!)2

I0 (z) = J0 (iz) = Bessel function of second kind of order zero (also known as modified
2 π
Bessel function). For |z| ≫ 1, J0 (z) = √ πz cos(z − 4
). A graph of J0 (z) is shown in
Figure 16.1.

Figure 16.1: Zeroth-order Bessel function of the first kind.

To determine the second solution, we write

w1 (z) = g1 (z) + (ln z)g2 (z),

where g1 and g2 are analytic functions:


g1 (z) = ∑ an z n
n=0

g2 (z) = ∑ bn z n
n=0
g2
w1′ = g1′ + (ln z)g2′ +
z
2g2′ g2
w1 = g1 + g2 ln z +
′′ ′′ ′′
− 2
z z

Substitute these in the differential equation and compare coefficients. After some al-
gebra, we get the second solution

w1 (z) = cY0 (z), c = constant


350 | 16 Series solutions and special functions

where

2 z
Y0 (z) = {ln( ) + γ}J0 (z)
π 2
2 z2 z4 1 z6 1 1
+ { 2 − 2 2 (1 + ) + 2 2 2 (1 + + ) − ⋅ ⋅ ⋅}
π 2 2 .4 2 2 .4 .6 2 3

and

1 1 1
γ = lim {(1 + + + ⋅ ⋅ ⋅ + ) − ln n} = 0.5772 (Euler’s constant)
n→∞ 2 3 n

2
For |z| ≫ 1, Y0 (z) ≈ √ πz sin(z − π4 ). A graph of Y0 (z) for real z is shown below (see
Figure 16.2).

Figure 16.2: Zeroth-order Bessel function of the second kind.

16.3 Series solutions of second-order ODEs


Consider a second-order homogeneous ODE

d2 u du
p0 (z) + p1 (z) + p2 (z)u = 0 (16.12)
dz 2 dz

Let

p1 (z) p2 (z)
p(z) = , q(z) =
p0 (z) p0 (z)

and write equation (16.12) as


16.3 Series solutions of second-order ODEs | 351

d2 u du
+ p(z) + q(z)u = 0 (16.13)
dz 2 dz

Let ψ1 (z) and ψ2 (z) be the two linearly independent solutions of equation (16.13), and

󵄨󵄨 󵄨󵄨
󵄨 ψ (z) ψ2 (z) 󵄨󵄨
W(z) = 󵄨󵄨󵄨󵄨 1′ 󵄨󵄨 = ψ1 ψ′2 − ψ′1 ψ2 = Wronskian
󵄨󵄨 ψ1 (z) ψ′2 (z) 󵄨󵄨
󵄨

Now,

󵄨 󵄨 󵄨 󵄨󵄨
dW 󵄨󵄨󵄨 ψ1 (z) ψ2 (z) 󵄨󵄨󵄨 󵄨󵄨󵄨 ψ1 ψ2 󵄨󵄨
= 󵄨󵄨󵄨 ′′ 󵄨󵄨 = 󵄨󵄨 󵄨󵄨
dz 󵄨󵄨 ψ1 (z) ψ2 (z) 󵄨󵄨󵄨 󵄨󵄨󵄨 −pψ′1 − qψ1
′′
−pψ′3 − qψ2 󵄨󵄨
󵄨
󵄨󵄨 󵄨 󵄨󵄨 󵄨󵄨
󵄨 ψ1 ψ2 󵄨󵄨󵄨 󵄨󵄨 ψ1 ψ2 󵄨󵄨
= 󵄨󵄨󵄨󵄨 󵄨
󵄨 = −p(z) 󵄨󵄨 ′
󵄨󵄨 ψ ψ′
󵄨󵄨
󵄨󵄨 −pψ′1 −pψ′3 󵄨󵄨󵄨 󵄨 1 3
󵄨󵄨
󵄨

󳨐⇒

dW
= −p(z)W(z)
dz

or
z

W(z) = W(z0 ) exp[− ∫ p(t ′ ) dt ′ ]


z0

󳨐⇒
z
ψ1 ψ′2 − ψ′1 ψ2 W(z) W(z0 )
= = exp[− ∫ p(t ′ ) dt ′ ]
ψ21 ψ21 ψ21
z0

󳨐⇒
z
d ψ2 W(z) W(z0 )
( )= = 2 exp[− ∫ p(t ′ ) dt ′ ]
dz ψ1 ψ21 ψ1 (z)
z0

󳨐⇒

z t′
1
ψ2 (z) = ψ1 (z) ∫ 2 ′ exp[− ∫ p(y) dy] dt ′ (16.14)
ψ1 (t )
z0 z0

up to a constant. Thus, if one solution of equation (16.13) is known, a second linearly


independent solution can be determined using equation (16.14). This result is often
used in defining special functions.
352 | 16 Series solutions and special functions

Example 16.4. Consider the second-order ODE

u
u′′ + =0
4z 2

1 1
Here, p(z) = 0 and q(z) = 4z 2
. It can be seen that ψ1 (z) = √z is a solution, as ψ′1 = 2√z
,
1 2 ′′
ψ′′
1 = − 4z√z 󳨐⇒ 4z ψ1 + ψ1 = 0. Thus, from equation (16.7), we get

z
1 ′
ψ2 = √z ∫ dt = √z(ln z − ln z0 ) = √z ln z + c1 ψ1
t′
z0

Thus, we can take ψ2 = √z ln z as the second linearly independent solution.

Example 16.5 (Legendre’s equation). Consider the second-order Legendre’s equation

(1 − z 2 )w′′ − 2zw′ + n(n + 1)w = 0.

2z
Here, p(z) = − 1−z 2 󳨐⇒

z
1
exp[− ∫ p(t) dt ′ ] =
1 − z2

For n = 0, w = P0 (z) = 1 is a solution (zeroth-order Legendre function of the first kind).


The second solution is given by

1 1 1+z
Q0 (z) = 1. ∫ dz = ln( )
(1 − z 2 ).12 2 1−z

which is zeroth-order Legendre function of the second kind.


For n = 1, the solution is same as in the above example where first-order Legendre
function of first kind is P1 (z) = ψ1 (z) = z and first-order Legendre function of second
kind is Q1 (z) = ψ2 (z) = z2 ln( 1+z
1−z
) − 1.
For n = 2, the solution is given by second-order Legendre function of first and
second kind as given by

1
P2 (z) = (3z 2 − 1)
2
3z 2 − 1 1+z 3z
Q2 (z) = ln( )−
4 1−z 2

Similarly, higher-order Legendre functions of both kinds can be obtained.


16.4 Special functions defined by second-order ODEs | 353

16.4 Special functions defined by second-order ODEs

In this section, we summarize some second-order equations and their solutions in


terms of special functions. These appear in many of our applications.

16.4.1 Airy equation

The Airy equation is given by

w′′ = zw

The two linearly independent solutions are expressed in powers of z (as z = 0 is an


ordinary point) and are denoted by Ai(z) and Bi(z). These are defined by

n+1
1 ∞ Γ( 3 ) 1/3 n 2(n + 1)π
Ai(z) = ∑ (3 z) sin[ ]
π32/3 n=0 n! 3
n+1
1 ∞ Γ( 3 ) 1/3 n 󵄨󵄨󵄨󵄨 2(n + 1)π 󵄨󵄨󵄨󵄨
Bi(z) = ∑ (3 z) 󵄨󵄨sin[ ]󵄨󵄨
π3 n=0 n!
1/6 󵄨󵄨 3 󵄨󵄨

where Γ is the Gamma function. For a plot of the Airy Ai(z) and Bairy Bi(z) functions,
see Figure 16.3. Note that Ai(0) = 2/3 1 2 = 0.3550 and Bi(0) = 1/6 1 2 = 0.6149.
3 Γ( 3 ) 3 Γ( 3 )

Figure 16.3: Airy functions Ai(z) and Bi(z).


354 | 16 Series solutions and special functions

16.4.2 Bessel equation

The Bessel equation is given by

z 2 w′′ + zw′ + (z 2 − ν2 )w = 0

where ν may not have to be an integer. Note that z = 0 is a regular singular point. The
two linearly independent solutions are denoted by Jν (z) and Yν (z) and referred to as
Bessel functions of first and second kind, respectively. These are defined by

2m+ν

(−1)m z
Jν (z) = ∑ ( ) , ∀ν (including nonintegers)
m=0
m!Γ(m + ν + 1) 2
Jν (z) cos(νπ) − J−ν (z)
Yν (z) = , ν is not an integer
sin(νπ)

For ν = 0, the Bessel equation simplifies to

zw′′ + w′ + zw = 0

with the two linearly independent solutions J0 (z) and Y0 (z).


For ν = n (integer), the Bessel function of second kind is given by

2k−n
2 z 1 n−1 (n − k − 1)! z
Yn (z) = [ln( ) + γ]Jn (z) − ∑ ( )
π 2 π k=0 k! 2
2k+n
1 n−1 [ϕ(k) + ϕ(n + k)] z
− ∑ (−1)κ ( )
π k=0 (n + k)!k! 2

where γ = Euler’s constant and

1 1 1
ϕ(p) = 1 + + + ⋅⋅⋅ + ; ϕ(0) = 0.
2 3 p

A plot of the Bessel functions is shown in Figure 16.4.

16.4.3 Modified Bessel equation

The modified Bessel equation is given by

z 2 w′′ + zw′ − (z 2 + ν2 )w = 0

where ν is a real constant. When ν is not an integer, the two linearly independent
solutions are denoted by Iν (z) and Kν (z), where
16.4 Special functions defined by second-order ODEs | 355

Figure 16.4: Bessel functions Jn (z) and Yn (z) for n = 0, 1, 2, 3, 4.

2m+ν

1 z
Iν (z) = i−ν Jν (iz) = ∑ ( ) ∀ν (including nonintegers)
m=0
m!Γ(m + ν + 1) 2
π I−ν (z) − Iν (z)
Kν (z) = , ν is not an integer
2 sin(νπ)

For ν = 0, the modified Bessel equation simplifies to

zw′′ + w′ − zw = 0

with the two linearly independent solutions I0 (z) and K0 (z), where

z2 z4 z2
I0 (z) = 1 + + + + ⋅⋅⋅
22 22 .42 22 .42 .62
z z2 z4 1 z6 1 1
K0 (z) = −[ln( ) + γ]I0 (z) + 2 + 2 2 (1 + ) + 2 2 2 (1 + + ) + ⋅ ⋅ ⋅
2 2 2 .4 2 2 .4 .6 2 3

For ν = ± 21 , I 1 (z) = √ πz
2 2
sinh z and I− 1 (z) = √ πz cosh z. A plot of modified Bessel
2 2
functions for real z is shown in Figure 16.5. Note that z = 0 is a regular singular point
of modified Bessel equation.
356 | 16 Series solutions and special functions

Figure 16.5: Modified Bessel function In (z) and Kn (z) for n = 0, 1, 2, 3, 4.

16.4.4 Spherical Bessel equation

The spherical Bessel equation is given by

z 2 w′′ + 2zw′ + [z 2 − n(n + 1)]w = 0; n = 0, 1, 2, . . .

For n = 0, the spherical Bessel equation simplifies to

zw′′ + w′ + zw = 0

with the two linearly independent solutions j0 (z) and y0 (z), where

sin z cos z
j0 (z) = ; and y0 (z) = −
z z

Note that the negative sign in y0 (z) is used as convention so that these functions are
similar to J0 (z) and Y0 (z).
For n > 0, the two linearly independent solutions of the spherical Bessel equation
are
16.4 Special functions defined by second-order ODEs | 357

π Jn+ 21 (z)
jn (z) = (−1)n+1 √
2 √z
Y
π n+ 2
1 (z)
π J−(n+ 21 ) (z)
yn (z) = √ = (−1)n+1 √
2 √z 2 √z

For example, for ν = 1, j1 (z) = sin


z2
z cos z
− z and y1 (z) = − cos
z2
z sin z
− z . A plot of the spherical
Bessel functions is shown in Figure 16.6. Note that z = 0 is a regular singular point of
the spherical Bessel equation.

Figure 16.6: Spherical Bessel function jn (z) and yn (z) for n = 0, 1, 2, 3, 4.

16.4.5 Legendre equation

The Legendre equation is given by

(1 − z 2 )w′′ − 2zw′ + n(n + 1)w = 0; n = 0, 1, 2, . . .

Note that z = ±1 is a regular singular point. The two linearly independent solutions
are denoted by Pn (z) and Qn (z) and are called Legendre functions of first kind and of
second kind, respectively, and are related as

(n + 1)Pn+1 (z) = (2n + 1)zPn (z) − nPn−1 (z)


(n + 1)Qn+1 (z) = (2n + 1)zQn (z) − nQn−1 (z), n≥1
358 | 16 Series solutions and special functions

where

1 1+z
P0 (z) = 1; Q0 (z) = ln( )
2 1−z
z 1+z
P1 (z) = z; Q1 (z) = ln( )−1
2 1−z

These functions are plotted in Figure 16.7.

Figure 16.7: Legendre polynomial Pn (z) and Legendre’s function of second kind Qn (z) for n =
0, 1, 2, 3, 4.

16.4.6 Associated Legendre equation

The Associated Legendre equation is given by

m2
(1 − z 2 )w′′ − 2zw′ + [n(n + 1) − ]w = 0,
1 − z2

where m and n are nonnegative integers. Note that z = ±1 is a regular singular point.
The two linearly independent solutions are denoted by Pnm (z) and Qm
n (z) and are called
16.4 Special functions defined by second-order ODEs | 359

the associated Legendre functions of first kind and of second kind, respectively. For
m = 0, Pn0 reduces to Legendre polynomials. Figure 16.8 shows the plot of Pn1 and Q1n
for n = 1, 2, 3, 4, 5.

Figure 16.8: Associated Legendre polynomial Pn1 (z) and Legendre’s function of second kind Q1n (z) for
n = 1, 2, 3, 4, 5.

16.4.7 Hermite’s equation

The Hermite’s equation is given by

w′′ − 2zw′ + 2nw = 0; n = 0, 1, 2, . . . ,

which has an irregular singular point at infinity. The bounded solutions are the Her-
mite polynomials Hn (z) that are given as

H0 (z) = 1; H1 (z) = 2z; H2 (z) = 4z 2 − 2


dn
Hn (z) = (−1)n exp(z 2 ) exp(−z 2 )
dz n
360 | 16 Series solutions and special functions

16.4.8 Laguerre’s equation

The Laguerre’s differential equation is given by

w′′ + (1 − z)w′ + nw = 0; n = 0, 1, 2, . . .

where the bounded solutions are the Laguerre polynomials Ln (z) that are given as

L0 (z) = 1; L1 (z) = 1 − z; L2 (z) = z 2 − 4z + 2


dn n
Ln (z) = exp(z) [z exp(−z)]
dz n

16.4.9 Chebyshev’s equation

The Chebyshev’s differential equation is given by

(1 − z 2 )w′′ − zw′ + n2 w = 0; n = 0, 1, 2, . . .

where the two linearly independent solutions are Tn (z) and √1 − z 2 Un with Tn (z) and
Un (z) being Chebyshev’s polynomials of degree n of first and second kind, respectively.
These polynomials are given by following recurring relations:

T0 (z) = 1; T1 (z) = z; Tn+1 (z) = 2zTn (z) − Tn−1 (z)


U0 (z) = 1; U1 (z) = 2z; Un+1 (z) = 2zUn (z) − Un−1 (z).

For additional discussion on series solutions and special functions, we refer to the
book by Bender and Orszag [7].

Problems
1. Determine the Taylor series expansion about the point z = 0 of the solution to the
following initial value problems:
(a) w′′ = (z − 1)w; w(0) = 1, w′ (0) = 0
(b) w′′ = z 3 w; w(0) = 1, w′ (0) = 0, w′′ (0) = 0
2. Determine the series expansions of all solutions of the following differential equa-
tions (about the point z = 0) and identify the functions that appear:
(a) zw′′ + w = 0
(b) w′′ − z 2 w = 0
3. Determine the linearly independent (series) solutions of the following differential
equations and identify the functions that appear:
(a) zw′′ + w′ − zw = 0
(b) w′′ + λz 2 w = 0 λ is a positive constant.
(c) (1 − z 2 )w′′ − 2zw′′ + n(n + 1)w = 0; n = 0, 1, 2, . . . .
17 Laplace transforms
The Laplace Transform is a special case of general linear integral transformation of a
function of variable t and parameter s. The general transformation with kernel K(t, s)
is of the form

T{f (t)} = F(s) = ∫ K(t, s)f (t) dt (17.1)


a

where F(s) is called the image or transform of f (t). Integral transforms of the above
form have been studied by Laplace (1749–1827) and Cauchy (1789–1857), and hence
the name. When a = 0, b → ∞ and K(t, s) = e−st , the general integral transform
becomes the Laplace transform.
The Laplace transform technique is useful for solving (i) linear differential equa-
tions (initial value problems and boundary value problems), (ii) linear difference
equations, (iii) linear integral equations, (iv) linear ordinary differential equations
with time delay, (v) linear integrodifferential equations and (vi) linear partial differ-
ential equations that arise in many applications. We review first the theory of Laplace
transform and then illustrate its usefulness with some chemical engineering applica-
tions. Further applications are given in the last section.

17.1 Definition of Laplace transform


Definition. Let f (t) be a real or complex valued function of a real variable t satisfying
the following conditions:
(i) f (t) ≡ 0, t < 0
(ii) f (t) has at most a finite number of discontinuities in 0 ≤ t ≤ λ < ∞ for any finite λ,
i. e., f (t) is sectionally continuous
(iii) f (t) has a bounded order of growth for t → ∞, i. e., ∃ positive constants M and γ
such that for all t > 0

󵄨󵄨 󵄨 γt
󵄨󵄨f (t)󵄨󵄨󵄨 ≤ Me (17.2)

(This last condition implies that the function f (t) does not grow faster than an
2
exponential function. Thus, for functions such as et , the Laplace transform does
not exist. There are also other classes of functions such as t1n ; n ≥ 1, which are
unbounded at t = 0 and for which the Laplace transform does not exist.)

The Laplace transform associates with f (t) a function F(s) of the complex variable
s = x + iy defined by the integral

https://doi.org/10.1515/9783110739701-018
362 | 17 Laplace transforms

F(s) = ∫ e−st f (t) dt = ℓ{f (t)} (17.3)


0

Theorem 17.1. If f (t) satisfies conditions (i)–(iii) above, the Laplace transform exists for
Re s > γ.

Proof. From the definition, we have

󵄨󵄨 ∞ 󵄨󵄨
󵄨󵄨 󵄨󵄨 󵄨󵄨󵄨 −st 󵄨󵄨
󵄨󵄨F(s)󵄨󵄨 = 󵄨󵄨 ∫ e f (t) dt 󵄨󵄨󵄨
󵄨󵄨 󵄨󵄨
󵄨0 󵄨

≤ M ∫ 󵄨󵄨󵄨e−st 󵄨󵄨󵄨eγt dt
󵄨 󵄨
0

Now,

󵄨󵄨 −st 󵄨󵄨 󵄨󵄨 −(x+iy)t 󵄨󵄨
󵄨󵄨e 󵄨󵄨 = 󵄨󵄨e 󵄨󵄨 = e
−xt



󵄨󵄨 󵄨 M
󵄨󵄨F(s)󵄨󵄨󵄨 ≤ M ∫ e dt = x>γ
(γ−x)t
,
(x − γ)
0

Thus, if Re s = x > x0 > γ,

󵄨󵄨 󵄨 M
󵄨󵄨F(s)󵄨󵄨󵄨 ≤ (17.4)
(x0 − γ)

and the integral converges uniformly in the domain Re s > x0 > γ.

Theorem 17.2. The Laplace transform F(s) of f (t) is an analytic function of the complex
variable s in the domain Re s > γ

Proof. From the definition,

F(s) = ∫ e−(x+iy)t f (t) dt


0
∞ ∞

= ∫e −xt
(cos yt)f (t) dt − i ∫ e−xt (sin yt)f (t) dt
0 0

≜ u(x, y) + iv(x, y) (17.5)

Now,
17.2 Properties of Laplace transform | 363

Figure 17.1: Schematic diagram illustrating the region x > γ in which F (s) is analytic.


𝜕u
= − ∫ te−xt cos ytf (t) dt (17.6)
𝜕x
0

and

󵄨󵄨 𝜕u 󵄨󵄨 ∞ M
󵄨󵄨 󵄨󵄨 γt
󵄨󵄨 󵄨󵄨 ≤ ∫ e tMe dt = (17.7)
−xt
.
󵄨󵄨 𝜕x 󵄨󵄨 (x − γ)2
0

Thus, 𝜕u
𝜕x
exists and it is continuous in the domain x > γ. Also, from the definition,


𝜕v 𝜕u
= − ∫ te−xt cos ytf (t) dt = . (17.8)
𝜕y 𝜕x
0

Similarly, it can be shown that

𝜕u 𝜕v
=− .
𝜕y 𝜕x

Thus, the Cauchy–Riemann equations are satisfied and the first partial derivatives are
continuous ⇒ F(s) is analytic for Re s > γ. This theorem also implies that any singu-
larities of F(s) must lie to the left of the line Re s = γ. Figure 17.1 shows a schematic
diagram illustrating the region in Laplace domain where F(s) is analytic.

17.2 Properties of Laplace transform


The following properties of Laplace transform, which can be established from the def-
inition, are useful in the solution of linear equations:
364 | 17 Laplace transforms

1. Linearity
If ℓ{f1 (t)} = F1 (s) and ℓ{f2 (t)} = F2 (s), then

ℓ{c1 f1 (t) + c2 f2 (t)} = c1 F1 (s) + c2 F2 (s), (17.9)

where c1 and c2 are real or complex constants.


2. Shifting or Translation
(a) If ℓ{f (t)} = F(s) and a is any complex constant, then

ℓ{eat f (t)} = F(s − a) (17.10)

(b) If ℓ{f (t)} = F(s) and

f (t − a), t≥a
g(t) = {
0, 0 < t < a,

where a is a real constant (a > 0),

ℓ{g(t)} = e−as F(s) (17.11)

3. Scaling property
If ℓ{f (t)} = F(s)

1 s
ℓ{f (at)} = F( ), where a ≠ 0 is any complex number. (17.12)
a a

4. Transforms of derivatives
If ℓ{f (t)} = F(s) and f (t) has continuous derivatives,

ℓ{f ′ (t)} = sF(s) − f (0).

This formula can be established from the definition using integration by parts.
Repeated application of the above formula gives

ℓ{f ′′ (t)} = s2 F(s) − sf (0) − f ′ (0), (17.13)


3 2
ℓ{f (t)} = s F(s) − s f (0) − sf (0) − f (0),
′′′ ′ ′′
and so forth. (17.14)

If f (t) has discontinuity at t = a(a > 0),

ℓ{f ′ (t)} = sF(s) − f (0) − e−as [f (a+ ) − f (a− )] (17.15)

5. Transforms of integrals
If ℓ{f (t)} = F(s) = ∫0 e−st f (t) dt, then

17.2 Properties of Laplace transform | 365

t
F(s)
ℓ{∫ f (t ′ ) dt ′ } = , (17.16)
s
0

6. Differentiation and integration of transforms

dn F(s)
ℓ{(−1)n t n f (t)} = (17.17)
dsn

f (t)
ℓ{ } = ∫ F(s′ ) ds′ (17.18)
t
s

7. The transform of a convolution


The convolution of two functions f1 (t) and f2 (t) is defined by

ϕ(t) = f1 (t) ∗ f2 (t)


t t

= ∫ f1 (t )f2 (t − t ) dt = ∫ f1 (t − t ′ )f2 (t ′ ) dt ′
′ ′ ′
(17.19)
0 0

If ℓ{fi (t)} = Fi (s), i = 1, 2, it may be shown that

ℓ{ϕ(t)} = F1 (s)F2 (s) (17.20)

8. Periodic functions
If f (t + T) = f (t), T > 0

ℓ{f (t)} = ∫ e−st f (t) dt/[1 − e−sT ] (17.21)


0

9. Initial and final value theorems


If the indicated limits exist, it may be shown that

lim f (t) = lim sF(s) (Initial value theorem) (17.22)


t→0 s→∞
lim f (t) = lim sF(s) (Final value theorem) (17.23)
t→∞ s→0

10. Moment theorem:


Expanding the exponential in the definition,

F(s) = ∫ e−st f (t) dt


0

s2 t 2 s3 t 3

= ∫ [1 − st + − + ⋅ ⋅ ⋅]f (t) dt
2! 3!
0
366 | 17 Laplace transforms

and defining the jth moment of f (t) by


Mj = ∫ t j f (t) dt; j = 0, 1, 2, . . . , (17.24)


0

it follows that

dj F(s) 󵄨󵄨󵄨󵄨
Mj = (−1)j 󵄨 . (17.25)
dsj 󵄨󵄨󵄨s=0

Equation (17.25) shows that the jth moment of f (t) can be obtained from F(s) by
expanding it in power of s.

17.2.1 Examples of Laplace transform

Example 17.1. Consider the exponential function f (t) = eat , t > 0 (a is real or com-
plex). We have

1
ℓ{e } = ∫ eat e−st dt =
at
(17.26)
s−a
0

From equation (17.26), we can obtain the Laplace transform of many elementary func-
tions:

(set a = 0) (differentiation w. r. t. a)
1 1
ℓ{1} = ℓ{teat } =
s (s − a)2
1
ℓ{t} = 2 .
s
. .
. .
. .
n n! n!
ℓ{t } = n+1 ℓ{t n eat } =
s (s − a)n+1
(set a = α + iβ) (set a = iw)
1
⇒ ℓ{eiwt } =
s − iw
1
ℓ{eαt (cos βt + i sin βt)} ℓ{e−iwt } =
s + iw
.
1 s
= ⇒ ℓ{cos wt} = 2 .
s − α − iβ s + w2
s−α w
⇒ ℓ{eαt cos βt} = ℓ{sin wt} = 2
(s − α)2 + β2 s + w2
17.2 Properties of Laplace transform | 367

Figure 17.2: Schematic diagram of unit step input at t = 0.

Example 17.2 (Unit step function (Heaviside’s function)). The Heaviside’s function,
also known as the unit-step function is shown in Figure 17.2.

1, t>0
H(t) = U(t) = {
0, t<0
1
ℓ{U(t)} = (17.27)
s
Similarly, if the unit step is located at t = α (α > 0), then we denote the function by
U(t − α) and

1
ℓ{U(t − α)} = e−αs (17.28)
s
Example 17.3. Consider the function
1
, 0<t<ε
f (t) = { ε
0, t>ε

The Laplace transform is given by


ε
1 1 − e−εs
F(s) = ∫ e−st dt =
ε εs
0

Taking the limit ε → 0 (but keeping the area under the curve constant), we get the
so-called unit impulse function (also known as the Dirac delta function):

δ(t) = Dirac delta function (Unit impulse function)


= lim f (t, ε)
ε→0
ℓ{δ(t)} = 1 (17.29)

Figure 17.3 shows an approximation to unit impulse for small values of ε. Similarly, for
the unit impulse function at time t = t0 , denoted δ(t − t0 ), we have

ℓ{δ(t − t0 )} = e−st0 (17.30)


368 | 17 Laplace transforms

Figure 17.3: Approximation to unit impulse (Dirac delta) input at t = 0.

Remark. The Dirac delta function may be approached by using many “test functions.”
The above illustration used one-sided test function. A test function using the two-
sided or symmetric around the origin is the Gaussian function

1 x2
f (x, ε) = exp(− ).
2πε 2ε

In the limit of ε → 0, this function approaches δ(x).

Example 17.4. Laplace transform of Bessel functions



(−1)m t 2m
f (t) = J0 (t) = ∑ 2m 2
m=0 2 (m!)
(−1)m (2m)! 1

1
ℓ{J0 (t)} = ∑ 2m 2 2m+1
=
m=0 2 (m!) s 2
√s + 1
1
ℓ{J0 (at)} =
√s2 + a2
1
ℓ{I0 (at)} =
√s2 − a2

Example 17.5. Laplace Transform of Error function

t
2 2
erf t = ∫ e−u du = Error Function
√π
0
7
2 1 t 3/2 t 5/2 t2
erf √t = {t 2 − + − + ⋅ ⋅ ⋅}
√π 3(1!) 5(2!) 7(3!)
2 Γ(3/2) Γ(5/2) 1
ℓ{erf √t} = { − + ⋅ ⋅ ⋅} =
√π s3/2 3s5/2 s √s + 1

ℓ{t } = ∫ e−st t α dt; ⇐ st = u


α

e−u uα

Γ(α + 1)
=∫ du =
sα+1 sα+1
0
17.3 Inversion of Laplace transform | 369

where Γ is the Gamma (factorial) function defined by

Γ(α + 1) = ∫ uα e−u du = α!
0

Γ(α + 1) = αT(α)
∞ ∞
1 1
2 1 2
Γ( ) = ∫ u− 2 e−u du, u = v ⇒ Γ( ) = 2 ∫ e−v dv = √π.
2 2
0 0

Example 17.6 (Laplace transform directly from the D. E.). Consider the Bessel equa-
tion of order zero:

tJ0′′ + J0′ + tJ0 = 0,

with initial conditions:

J0 (0) = 1, J0′ (0) = 0.

Let

ℓ{J0 (t)} = F(s)

d 2 dF
− {s F(s) − s} + sF − 1 − =0
ds ds
dF sF c
⇒ =− ⇒F= ,
ds 1 + s2 √1 + s 2
and lim sF = c = 1
s→∞

1
ℓ{J0 (t)} = .
√1 + s2

17.3 Inversion of Laplace transform


We have shown that if f (t) is sectionally continuous and is of exponential order, i. e.,

󵄨󵄨 󵄨 γ t
󵄨󵄨f (t)󵄨󵄨󵄨 < Me 0 (17.31)

then the Laplace transform of f (t),


370 | 17 Laplace transforms

F(s) = ∫ e−st f (t) dt


0

is analytical in the region Re s > γ0 . Assume that F(s) is real for s real. Suppose that s
is any point on the real axis and γ > γ0 as shown in Figure 17.4.

Figure 17.4: Schematic diagram of the contours for Cauchy’s integral formula.

If C is the vertical and Γ is the curved contour shown, Cauchy’s integral formula gives

1 F(z) F(z)
F(s) = [∫ dz + ∫ dz]
2πi (z − s) (z − s)
C Γ
γ−iR sin θ1
1 F(z) 1 F(z)
= ∫ dz + ∫ dz (17.32)
2πi (z − s) 2πι (z − s)
γ+iR sin θ1 Γ

where Γ = part of semicircular arc with radius R as shown in Figure 17.4. Consider the
integral

1 F(z)
J= ∫ dz (17.33)
2πi (z − s)
Γ

on Γ, z − s = Reiθ ⇒ dz = i Reiθ dθ.



π−θ1
1 󵄨 󵄨
|J| ≤ ( ). ∫ 󵄨󵄨󵄨F(z)󵄨󵄨󵄨dθ. (17.34)

π+θ1
17.3 Inversion of Laplace transform | 371

If we assume that for sufficiently large R,

󵄨󵄨 󵄨 M
󵄨󵄨F(z)󵄨󵄨󵄨 ≤ k (17.35)
R

1 M Mθ1
|J| ≤ . 2θ = . (17.36)
2π Rk 1 πRk
Thus, limR→∞ |J| = 0 if k > 0.

γ+i∞
1 F(z)
F(s) = ∫ dz (17.37)
2πi (s − z)
γ−i∞

where s is real. The above formula is an extended form of Cauchy’s integral formula.

17.3.1 Bromwich’s complex inversion formula

Now, if ℓ{f (t)} = F(s) ⇒ ℓ−1 {F(s)} = f (t), where ℓ−1 is the inverse operator. Applying
ℓ−1 on both sides of equation (17.37) ⇒

γ+i∞
1 −1 F(z)
f (t) = ℓ ∫ dz (17.38)
2πi (s − z)
γ−i∞

ℓ−1 is w. r. t. s while integration is w. r. t. z. Thus, we can take ℓ−1 inside the integral

γ+i∞
1 1
f (t) = ∫ F(z)ℓ−1 ( ) dz (17.39)
2πi s−z
γ−i∞


γ+i∞
1
f (t) = ∫ ezt F(z) dz. (17.40)
2πi
γ−i∞

M
Theorem. Suppose that F(s) is analytic in some right half-plane Re s > γ, |F(s)| < sk
for
some k > 0 and F(s) is real for s real. Then the integral

γ+i∞
1
f (t) = ∫ est F(s) ds (17.41)
2πi
γ−i∞
372 | 17 Laplace transforms

is independent of γ and converges to a real valued function f (t) whose Laplace transform
is F(s).

A proof of this theorem may be found in the literature (Churchill [12]).


Laplace transform pair

∞ γ+i∞
1
F(s) = ∫ e −st
f (t) dt, f (t) = ∫ est F(s) ds (17.42)
2πi
0 γ−i∞

In Part V, we compare it with the Fourier transform pair:

∞ ∞
1
F(α) = ∫ e −iαt
f (t) dt, f (t) = ∫ eiαt F(α) dα. (17.43)

−∞ −∞

Computing the Bromwich’s integral

Case A: only singularities of F (s) for Re s < γ are poles


Choose R large enough so that the contour ABCEFA shown in Figure 17.5 encloses all
the poles F(s).

Figure 17.5: Schematic diagram illustrating the contour enclosing all the poles for Bromwich’s inte-
gral formula.

By the residue theorem,

1
∮ est F(s) ds = ∑ Residues of est F(s) at poles of F(s)
2πi
17.3 Inversion of Laplace transform | 373

γ+i∞
1 1
LHS = ∫ est F(s) ds + ∫ est F(s) ds
2πi 2πi
γ−i∞ Γ

If F(s) is such that on Γ, |F(s)| < RMk , k > 0, then it can be shown that for R → ∞, the
second integral goes to zero and

γ+iR
1
LHS = ∫ est F(s) ds
2πi
γ−iR

f (t) = ∑ Residues of est F(s) at poles of F(s)

Case B: F (s) has a branch point (say at s = 0)


As an example, we consider the function

e−a√s
F(s) = (a > 0),
s
which has a branch point at s = 0. The appropriate contour to consider is shown in
Figure 17.6.

Figure 17.6: Schematic diagram illustrating the contour enclosing the branch point at s = 0 for
Bromwich’s integral formula.

Thus,
γ+i∞
e−a√s 1 e−st−a√s
ℓ−1 { }= ∫ { } ds.
s 2πi s
γ−i∞
374 | 17 Laplace transforms

The function f (t) can be obtained by evaluating the integral along the contour shown
in Figure 17.6 and taking the limits ε → 0, R → ∞.

Case C: F (s) has poles as well as one or more branch points


(i) If poles are not on branch lines, modify the contour C as in case B
(ii) If poles are on branch lines, then modify the contour C to go around the poles that
are on branch line.

Case D: (numerical inversion)


It follows from equation (17.41) that the evaluation of f (t) for a fixed t requires the eval-
uation of a complex line integral along the line Re s = γ. This can be done numerically
in various ways such as using (i) Fourier transform (or fast Fourier transform) meth-
ods, (ii) Gaver functionals, (iii) Laguerre functions or (iv) deformation of the Bromwich
contour. We refer to the articles by Abate and Valko [1] and Defreitas and Kane [17] for
further details. Examples illustrating numerical inversion are given in Sections 17.5,
17.6 and Chapter 29.

17.4 Solution of linear differential equations by Laplace transform


In this section, we illustrate the solution of various linear differential equations (in
one independent variable) using the Laplace transform technique.

17.4.1 Initial value problems with constant coefficients

Example 17.7 (Homogeneous equation). Solve

d2 u du
a0 + a1 + a2 u = 0 (17.44)
dt 2 dt
u(0) = d0 (17.45)
u (0) = d1

(17.46)

where d0 and d1 are constants. Let


U(s) = ℓ{u(t)} = ∫ e−st u(t) dt (17.47)


0
du
ℓ{ } = sU(s) − u(0) (17.48)
dt
d2 u
ℓ{ } = s2 U(s) − su(0) − u′ (0) (17.49)
dt 2
17.4 Solution of linear differential equations by Laplace transform | 375

Taking the Laplace transform of equations (17.44)–(17.46) gives

a0 [s2 U(s) − sd0 − d1 ] + a1 [sU(s) − d0 ] + a2 U(s) = 0

a0 (sd0 + d1 ) + a1 d0
U(s) = (17.50)
a0 s2 + a1 s + a2

Now, consider the n-th order equation

dn u dn−1 u du
a0 + a 1 + ⋅ ⋅ ⋅ + an−1 + an u = 0 (17.51)
dt n dt n−1 dt

with initial conditions

u(0) = d0 , u′ (0) = d1 , . . . , u[n−1] (0) = dn−1 (17.52)

Then, taking LT of equation (17.51) gives

Qm (s)
U(s) = , m≤n−1 (17.53)
Pn (s)

where Pn and Qm are polynomials in s of degree n and m, respectively. To deter-


mine u(t), let s1 , s2 , . . . , sn denote the zeros of the polynomial Pn (s) and assume
(for simplicity) these are simple and Qm (si ) ≠ 0. Then

u(t) = ℓ−1 {U(s)}


n
Qm (s)
= ∑ Residue est @ s = si (17.54)
i=1
Pn (s)


n
Qm (si )
u(t) = ∑ esi t (17.55)
i=1
Pn′ (si )

Equation (17.55) is called the Heaviside’s expansion formula. An elementary derivation


of this equation is given below.

Example 17.8 (Inhomogeneous equation). Now consider the inhomogeneous equa-


tion:

dn u dn−1 u du
a0 n
+ a1 n−1 + ⋅ ⋅ ⋅ + an−1 + an u = g(t) (17.56)
dt dt dt

with initial conditions


376 | 17 Laplace transforms

u(0) = d0 , u′ (0) = d1 , ..., u[n−1] (0) = dn−1 (17.57)

Let ℓ{u(t)} = U(s) and ℓ{g(t) = g(s)}.


̂ Taking L.T of equations (17.56)–(17.57) gives

Pn (s)U(s) − Qm (s) = g(s)


̂

Qm (s) g(s)
̂
U(s) = + (17.58)
Pn (s) Pn (s)
Qm (s) g(s)
̂
u(t) = ℓ−1 { } + ℓ−1 { }
Pn (s) Pn (s)
= u1 (t) + u2 (t), (17.59)

where

u1 (t) = solution to the homogeneous equation (17.60)


u2 (t) = particular solution (17.61)

u1 (t) is obtained by setting g(t) = 0, while u2 (t) is obtained by setting di = 0 (i =


0, 1, 2, . . . , n − 1).

g(s)
̂
u2 (t) = ℓ−1 { } (17.62)
Pn (s)

Let
n
1 e si t
ℓ−1 { }=∑ ′ ≡ G(t) (17.63)
Pn (s) P (s )
i=1 n i

Using the convolution theorem, we get

u2 (t) = ∫ G(t ′ )g(t − t ′ ) dt ′ (17.64)


0
t

= ∫ G(t − t ′ )g(t ′ ) dt ′ (17.65)


0

The function G(t − t ′ ) is also called the Green’s function for the initial value problem.
We note that when g(t) = δ(t − t0 ), equation (17.65) reduces to u2 (t) = G(t − t0 ). Thus,
G(t − t0 ) is the response of the system with homogeneous (zero) initial condition for a
unit impulse at t = t0 .
17.4 Solution of linear differential equations by Laplace transform | 377

17.4.2 Elementary derivation of Heaviside’s formula

Consider the scalar nth order IVP

Lu = f (t), t>0 (17.66)


k(u(0)) = α (17.67)

where L has constant coefficients. Taking Laplace transform and solving for the trans-
form, we get

Qm (s) F(s)
U(s) = + (17.68)
Pn (s) Pn (s)

where Qm (s) is a polynomial of degree m(< n) depending only on the initial condi-
tions while Pn (s) is a polynomial depending on the coefficients of L. Equation (17.68)
is written as

U(s) = Uh (s) + Up (s) (17.69)

or

u(t) = uh (t) + up (t) (17.70)

Homogeneous part
We now derive the Heaviside’s formula for uh (t).
Let
Qm (s)
U(s) = (17.71)
Pn (s)

Since m ≤ (n − 1), we can express equation (17.71) in partial fractions

Qm (s) A1 A2 An
= + + ⋅⋅⋅ + (17.72)
Pn (s) (s − λ1 ) (s − λ2 ) (s − λn )

[Assuming λi , i = 1, 2, . . . , n is a simple root of Pn (s) = 0.] Multiply equation (17.72) by


(s − λi ) and let s → λi 󳨐⇒

(s − λi )
Ai = lim Qm (s)
s→λi Pn (s)
Q (λi )
= m (17.73)
Pn′ (λi )
378 | 17 Laplace transforms

Taking inverse Laplace transform of equation (17.72) gives


n
uh (t) = ∑ Aj eλj t
j=1
n Qm (λj )
= ∑ e λj t , (17.74)
j=1
Pn′ (λj )

which is the Heaviside’s formula.

Heaviside’s formula for repeated roots


Consider again

Qm (s)
U(s) = (17.75)
Pn (s)

Pole of order 2
Suppose that

Pn (s) = (s − a)2 Rn−2 (s), where Rn−2 (a) ≠ 0 (17.76)

Since s = a is a pole of order 2, we can write


n αj
α1 α2
U(s) = + + ∑ (17.77)
s − a (s − a)2
j=3
(s − sj )

󳨐⇒
n αj
(s − a)2 U(s) = α1 (s − a) + α2 + (s − a)2 ∑
j=3
(s − sj )

󳨐⇒

α2 = lim[(s − a)2 U(s)] (17.78)


s→a
d
α1 = lim [(s − a)2 U(s)] (17.79)
s→ads
Qm (sj )
αj = ′ , j≥3 (17.80)
Pn (sj )

and
n
u(t) = α1 eat + α2 teat + ∑ αj esj t (17.81)
j=3
17.4 Solution of linear differential equations by Laplace transform | 379

Pole of order 3
let s = a is a pole of order 3, we can write
n αj
α1 α2 α3
U(s) = + + +∑ (17.82)
s − a (s − a) 2 (s − a) 3
j=4
(s − sj )

where

α3 = lim[(s − a)3 U(s)] (17.83)


s→a
d
α2 = lim [(s − a)3 U(s)] (17.84)
ds
s→a

1 d2
α1 = lim 2 [(s − a)2 U(s)] (17.85)
2! s→a ds
Qm (sj )
αj = ′ , j≥4 (17.86)
Pn (sj )

and

t 2 at n
u(t) = α1 eat + α2 teat + α3 e + ∑ α j e sj t (17.87)
2! j=4

and so on for higher-order poles.


Similar results may also be derived using the residue theorem. For example, con-
sider the case with pole of order 2. In this case, we can write

Qm (s)
U(s) = , where Rn−2 (a) ≠ 0 (17.88)
(s − a)2 Rn−2 (s)

Thus, the residue theorem leads to


n (s − sj )Qm (s) st
d (s − a)2 Qm (s) st
u(t) = lim [ e ] + ∑ lim [ e ]
s→a ds Pn (s) s→s j Pn (s)
j=3

󳨐⇒
n Q (s )
d Qm (s) st m j sj t
u(t) = lim [ e ]+∑ ′ e
s→a ds R (s) P (s )
n−2 j=3 n j

d Qm (s) Q (a) at n Qm (sj ) sj t


= (lim [ ])eat + t m e +∑ ′ e
s→a ds R Rn−2 (a) P (s )
n−2 (s) j=3 n j
n
= α1 eat + α2 teat + ∑ αj esj t (17.89)
j=3
380 | 17 Laplace transforms

where

d Qm (s) d
α1 = (lim [ ]) = (lim [(s − a)2 U(s)]) (17.90)
s→a ds Rn−2 (s) s→a ds

Qm (a)
α2 = = lim[(s − a)2 U(s)] (17.91)
Rn−2 (a) s→a
Qm (sj )
αj = , j≥3 (17.92)
Pn′ (sj )

As can be expected, the results from residue theorem (equations (17.89)–(17.92)) are
identical to those obtained from Heaviside’s formula (equations (17.78)–(17.81)).

Inhomogeneous part
We now derive the Heaviside’s formula for up (t).
Let
F(s) F(s)
Up (s) = = (17.93)
Pn (s) Pn (s)

Writing

Pn (s) = α0 (s − λ1 )(s − λ2 ) . . . (s − λn ) (17.94)

we get
n
1 e λj t
ℓ−1 [ ]=∑ ′ ≡ G(t) (17.95)
Pn (s) P (λ )
j=1 n j

Thus, by convolution,

1
up (t) = ℓ−1 [ .F(s)]
Pn (s)
t

= ∫ G(t − t ′ )f (t ′ ) dt ′ (17.96)
0

= ∫ G(t ′ )f (t − t ′ ) dt ′ . (17.97)
0

Here, G(t) is the Green’s function of the IVP and is defined by equation (17.63).
17.4 Solution of linear differential equations by Laplace transform | 381

17.4.3 Two-point boundary value problems

Consider solving

d2 u du
+ a1 + a0 u = f (x), a < x < b (17.98)
dx 2 dx
u(a) = α1 , u(b) = α2 (17.99)

We may assume a = 0. (If a ≠ 0, we can define x ′ = x − a, so that the new domain is 0,


b − a). To solve equations (17.98)–(17.99), we replace it by the modified problem

u′′ + a1 u′ + a0 u = f (x), 0<x<b (17.100)


u(0) = α1 , u (0) = β1 ,

(17.101)

where the unknown constant β1 is to be determined.


Taking LT. of equations (17.100)–(17.101) gives

sα1 + β1 + a1 α1 f ̂(s)
U(s) = 2
+ 2 (17.102)
s + a1 s + a0 s + a1 s + a0

(s1 α1 + β1 + α1 a1 ) (s α + β1 + α1 a1 )
u(x) = es1 x + e s2 x 2 1
(s1 − s2 ) (s2 − s1 )
x
e s1 x − e s2 x
′ ′

+∫ f (x − x ′ ) dx′ (17.103)
(s1 − s2 )
0

Determine β1 so that u(b) = α2 . When α1 = α2 = 0, we get a linear equation for deter-


mining β1 as follows:

b
e s1 b − e s2 b
β1 ( ) + ∫ G(x, x ′ )f (x ′ ) dx ′ = 0.
s1 − s2
0

Also, in this case, the solution of the BVP with homogeneous BCs may be expressed
as
x
e s1 x − e s2 x
u(x) = β1 ( ) + ∫ G(x, x ′ )f (x′ ) dx ′ (17.104)
s1 − s2
0

where

es1 (x−x ) − es2 (x−x )


′ ′

G(x, x ′ ) = . (17.105)
(s1 − s2 )
382 | 17 Laplace transforms

Equations (17.104)–(17.105) can be simplified further to determine the Green’s function


for the BVP. This is discussed in more detail and generality in Part IV.

17.4.4 Linear ODEs with variable coefficients:

Example 17.9. Solve

d2 u du
t − − t.u = 0
dt 2 dt
u(0) = 0
U(s) = ℓ{u(t)}
ℓ{u (t)} = s2 U(s) − 0 − u′ (0),
′′
ℓ(u′ (t)) = sU(s)
d 2
ℓ{tu′′ (t)} = − {s U(s) − u′ (0)}
ds

d 2 dU
− {s U(s)} − sU(s) + =0
ds ds
dU
(s2 − 1) + 3sU = 0
ds

c
U(s) =
(s2 − 1)3/2
u(t) = c ℓ−1 {(s2 − 1) } = c t I1 (t),
−3/2

where I1 is the modified Bessel function of order one. Note that

1
−3/2
(s2 − 1) = s−3 (1 −
−3/2
)
s2
3 5
3 1 (− )(− 2 ) 1
= s−3 [1 + . 2 + 2 + ⋅ ⋅ ⋅]
2 s 2! s4
1 3 3.5 3.5.7
= 3 + 5 + + + ⋅⋅⋅
s 2.s 2
2!.2 s 7 3!23 s9
Taking the inverse transform term-by-term gives

u(t) = tI1 (t),

where

(t/2)2m+ν
Iν (t) = ∑
m=0
m!Γ(m + v + 1)
17.4 Solution of linear differential equations by Laplace transform | 383

Other Bessel equations


(i)

d2 u du
t + + tu = 0, u(0) = 1
dt 2 dt

c
U(s) = , c = u(0) = lim s.U(s) = 1
√1 + s 2 s→∞

u(t) = J0 (t)

(ii) d2 u du
t + − tu = 0
dt 2 dt
1
u(0) = 1 ⇒ U(s) = , u(t) = I0 (t)
√s2 − 1

17.4.5 Simultaneous ODEs with constant coefficients

Example 17.10. Consider the coupled initial value problem

ẍ − x + ẏ − y = 0 (17.106)
−2ẋ − 2x + ÿ − y = e −t
(17.107)
x(0) = 0, y(0) = 1 (17.108)
x(0)
̇ = −1, y(0)
̇ =1 (17.109)

Taking L.T. gives

s2 X(s) + 1 − X(s) + sY(s) − 1 − Y(s) = 0 (17.110)


1
−2[sX(s)] − 2X(s) + s2 Y(s) − s − 1 − Y(s) = (17.111)
s+1

(s2 + 2s + 2)
X(s) = − (17.112)
(s + 1)2 (s2 + 1)
s2 + 2s + 2
Y(s) = (17.113)
(s2 + 1)(s + 1)

Using Heaviside’s formula gives

1 1
X(t) = cos t − sin t − (1 + t)e−t (17.114)
2 2
384 | 17 Laplace transforms

1 3 1
Y(t) = cos t + sin t + e−t . (17.115)
2 4 2

Example 17.11 (Autonomous linear systems). Consider the vector initial value prob-
lem

du
= Au, u(@ t = 0) = u0 , (u is a n-vector, A is a n × n matrix) (17.116)
dt

ℓ{u} = û ⇒

̂ − u0 = Au(s)
su(s) ̂ = (sI − A)−1 u0
̂ ⇒ u(s) (17.117)

Complex inversion formula gives

γ+i∞
1
u(t) = ∫ est (sI − A)−1 u0 ds (17.118)
2πi
γ−i∞
adj(sI − A) adj(sI − A)
(sI − A)−1 = = (17.119)
det(sI − A) Pn (s)

Let s1 , s2 , . . . , sn be roots of Pn (s) = 0 ⇒

n
u(t) = ∑ Residue est (sI − A)−1 u0 |s=si
i=1
n
adj(si I − A) 0
= ∑ e si t u
i=1
Pn′ (si )
n
= ∑ esi t Ei u0 = eAt u0 . (17.120)
i=1

This example shows the connection between the spectral theorem and residue theo-
rem.

17.5 Solution of linear differential/partial differential equations


by Laplace transform

In this section, we illustrate the application of the Laplace transform method to solve
linear differential equations in two independent variables.
17.5 Solution of linear differential/partial differential equations by Laplace transform | 385

17.5.1 Heat transfer in a finite slab

Consider the heat/diffusion equation

𝜕θ 𝜕2 θ
= 2; 0 < ξ < 1, t > 0 (17.121)
𝜕t 𝜕ξ
θ(ξ , 0) = 0 IC (17.122)
θ(0, t) = 0; θ(1, t) = 1 BCs (17.123)

Let θ(ξ
̄ , s) = ℓ{θ(ξ , t)} ⇒

d2 θ̄
− sθ̄ = 0
dξ 2
θ(0,
̄ s) = 0 BC1

̄ s) = 1 BC2
θ(1,
s

θ̄ = c1 cosh √sξ + c2 sinh √sξ

1
BC1⇒ c1 = 0, BC2 ⇒ c2 = s. sinh √s

sinh √sξ
θ̄ = (17.124)
s sinh √s
γ+i∞
1
θ(ξ , t) = ∫ est .θ(ξ
̄ , s) ds (17.125)
2πi
γ−i∞

s = 0 is a simple pole (not a branch point) and simple poles at sinh √s = 0 ⇒ √s =


±nπi, sn = −n2 π 2 , n = 1, 2, 3, . . . . Now,

sinh √sξ 󵄨󵄨󵄨󵄨


Residue est 󵄨 =ξ
s sinh √s 󵄨󵄨󵄨s=0
sinh √sξ 󵄨󵄨󵄨󵄨 2 2 2
Residue est 󵄨󵄨 = (−1)n e−n π t sin(nπξ ).
s sinh √s 󵄨s=sn
󵄨 nπ

Thus,

2 ∞ (−1)n sin(nπξ ) −n2 π 2 t


θ(ξ , t) = ξ + ∑ e (17.126)
π n=1 n

Note that for t → ∞, θ(ξ , t) → ξ , i. e. the steady-state profile is linear. The results are
shown below in Figure 17.7 using 1000 terms in the summation in Mathematica® .
386 | 17 Laplace transforms

Figure 17.7: Temperature profile at various times in a slab from one-dimensional heat conduction
model given in equations (17.121)–(17.123).

17.5.2 TAP reactor model

A temporal analysis of products (TAP) reactor is a packed-bed of catalyst particles op-


erated under vacuum conditions. In a typical experiment, a pulse of molecules is in-
jected at the inlet to the TAP reactor and the product composition at the exit is mea-
sured as a function of time. The response curves are analyzed to determine the kinetic
constants and species diffusivities (in the Knudsen regime). A schematic of the TAP
reactor is shown in Figure 17.8.

Figure 17.8: Temporal profile of flux for TAP reactor (exact solution in blue and short time solution in
red curves).

The mathematical model of a TAP (temporal analysis of products) reactor in dimen-


sionless form is given by

𝜕c 𝜕2 c
= ; 0 < z < 1, t>0 (17.127)
𝜕t 𝜕z 2
17.5 Solution of linear differential/partial differential equations by Laplace transform | 387

with BC:

𝜕c
− (0, t) = δ(t); c(1, t) = 0 (17.128)
𝜕z

and IC:

c(z, 0) = 0 (17.129)

The quantity of interest is the exit flux J defined by

𝜕c
J(t) = − (1, t) (17.130)
𝜕z

The Laplace transform method can be used to solve for dimensionless concentration
c and flux J.
Let

ĉ(z, s) = ℓ[c(z, t)] = LT of c(z, t) (17.131)

Taking LT of governing equation and boundary conditions (equations (17.127)–(17.129))


gives

d2 ĉ dĉ
sĉ = ; − (z = 0, s) = 1; ĉ(z = 0, s) = 0 (17.132)
dz 2 dz
󳨐⇒

sinh[√s(1 − z)]
ĉ(z, s) = (17.133)
√s cosh[√s]

󳨐⇒ LT of exit flux is

̂J(s) = − dĉ (z = 1, s) = 1
(17.134)
dz cosh[√s]

Temporal moments of flux J(t)


From the definition of LT of flux,

̂J(s) = ∫ e−st J(t) dt
0

s2 t 2

= ∫ (1 − st + . . .)J(t) dt
2!
0
s2
= M0 − sM1 + M − ⋅⋅⋅ (17.135)
2! 2
388 | 17 Laplace transforms

where M0 , M1 and M2 are the zeroth, first and second moments of J(t). Thus,

M0 = ̂J|s=0 = 1 (17.136)
d̂J 󵄨󵄨󵄨 sinh √s 󵄨󵄨
󵄨󵄨 1
M1 = − 󵄨󵄨󵄨 = 󵄨󵄨 = (17.137)
dz 󵄨s=0 2√s(cosh √s) 󵄨s=0 2
󵄨 2 󵄨
d2̂J 󵄨󵄨󵄨 5
M2 = − 2 󵄨󵄨󵄨 = (17.138)
dz 󵄨s=0 12
󵄨

Thus, the dimensionless variance (second central moment) of the response is given by

M2 2
σ2 = −1= . (17.139)
2
M1 3

The moments can also be found by using the expansion

1 1 s s2 5
= =1− + + ⋅⋅⋅
cosh √s 1 + s
+ s2
+ ⋅⋅⋅ + sn
+ ⋅⋅⋅ 2 2! 12
2! 4! (2n)!

Laplace inverse of flux and solution in time domain


Equation (17.134) suggests that ̂J has poles at

cosh √s = 0 󳨐⇒ e2√s + 1 = 0

or

π2
sk = −(2k − 1)2 ; k = 1, 2, 3, . . . (17.140)
4

All these poles are simple zeros. Thus, using Heaviside’s formula or residue theorem,
we get

2√sk
J(t) = ∑ esk t
k=1
sinh √sk

π2
= ∑ (−1)k−1 (2k − 1)π exp[−(2k − 1)2 t] (17.141)
k=1
4
π2 9π 2 25π 2
= π[e− 4 t − 3e− 4
t
+ 5e− 4
t
− ⋅ ⋅ ⋅] (17.142)

We note that for large s (or small t),

̂J(s) = 2e−√s , large s


1 − 4t1
󳨐⇒ J(t) = e , small t (17.143)
√πt 3
17.5 Solution of linear differential/partial differential equations by Laplace transform | 389

The general solution from equation (17.141) with 100 terms in the summation and the
short time solution from equation (17.143) are shown in Figure 17.8 in blue and red
curves, respectively.
It can be seen from this figure that the flux attains a maximum when

dJ 󵄨󵄨󵄨󵄨 1
󵄨 = 0 󳨐⇒ t ∗ = and Jmax = 1.85 (17.144)
dt 󵄨󵄨󵄨t=t ∗ 6

We can also show from initial and final value theorem that

lim J(t) = lim ŝJ(s) = 0 (17.145)


t→0 s→∞

lim J(t) = lim ŝJ(s) = 0. (17.146)


t→∞ s→0

17.5.3 Dispersion of tracers in unidirectional flow

Tracer tests are used to determine the flow maldistributions, leaks and performance
of several types of process equipment such as reactors, separation/adsorption and
distillation columns. In a typical tracer test, a pulse of tracer is injected at the inlet
to the device and the exit concentration of the tracer is recorded. The response of the
equipment for a unit impulse input is known as the “residence time distribution” or
RTD curve and is denoted by E(t). The exit response to the unit step input is referred
to as the “breakthrough curve” or cumulative RTD curve, and is denoted by F(t).
Figure 17.9 shows schematic RTD curves corresponding to unit-step and unit-pulse
inputs to a tubular reactor. Here, E(t) = RTD curve = response to a unit impulse input
and F(t) = response to a unit step input, so that

E(t) = dF/dt or F(t) = ∫₀ᵗ E(t′) dt′.

Axial dispersion model


The axial dispersion model describes the dispersion of a tracer in a tube. If the tube
is empty (i. e., no packing), the dispersion is due to the combined effect of velocity
profile and molecular diffusion. If the tube is packed, the dispersion may be due to
the combined effect of molecular diffusion, interstitial mixing and intraparticle diffu-
sion. The simplest model that describes the tracer dispersion in both cases is the axial
dispersion model. The model for a tube of length L is defined by

∂c/∂t′ + ⟨u⟩ ∂c/∂x = D ∂²c/∂x², 0 < x < L (17.147)

Figure 17.9: Schematic of RTD curves corresponding to unit-pulse and unit-step input in a tubular
reactor.

with BCs:

⟨u⟩C₀(t′) = ⟨u⟩c − D ∂c/∂x at x = 0
∂c/∂x = 0 at x = L          } Danckwerts BCs, (17.148)

and IC:

c(x, 0) = f (x) (17.149)

To cast the model in dimensionless form, define

t = t′⟨u⟩/L; z = x/L; Pe = ⟨u⟩L/D; g(z) = f(Lz); c₀(t) = C₀(Lt/⟨u⟩) (17.150)

Then the differential equation and the boundary and initial conditions may be written
as

∂c/∂t + ∂c/∂z = (1/Pe) ∂²c/∂z², 0 < z < 1, t > 0
BCs: c₀(t) = c − (1/Pe) ∂c/∂z at z = 0
     ∂c/∂z = 0 at z = 1                          (17.151)
IC: c(z, 0) = g(z)

The solution of this more general model will be considered in part V using the Fourier
transform method. Here, we consider the special case in which

g(z) = 0 and c0 (t) = δ(t) (17.152)

Let

ĉ(z, s) = ℓ{c(z, t)} (17.153)

(1/Pe) d²ĉ/dz² − dĉ/dz − sĉ = 0, 0 < z < 1
BCs: 1 = ĉ − (1/Pe) dĉ/dz at z = 0
     dĉ/dz = 0 at z = 1                          (17.154)

ĉ = α₁e^{λ₁z} + α₂e^{λ₂z} (17.155)

where

λ²/Pe − λ − s = 0
λ₁,₂ = (Pe/2)[1 ± √(1 + 4s/Pe)] (17.156)

BCs ⟹

1 = α₁ + α₂ − (1/Pe)[α₁λ₁ + α₂λ₂]
0 = α₁λ₁e^{λ₁} + α₂λ₂e^{λ₂}

Solving gives

α₁ = Pe λ₂e^{λ₂}/(λ₂²e^{λ₂} − λ₁²e^{λ₁}), α₂ = −Pe λ₁e^{λ₁}/(λ₂²e^{λ₂} − λ₁²e^{λ₁}) (17.157)


ĉ(z, s) = Pe[λ₂e^{λ₂+λ₁z} − λ₁e^{λ₁+λ₂z}]/(λ₂²e^{λ₂} − λ₁²e^{λ₁}) (17.158)

where

λ₁ = (Pe/2)[1 + √(1 + 4s/Pe)] (17.159)
λ₂ = (Pe/2)[1 − √(1 + 4s/Pe)] (17.160)

Since

E(t) = c(1, t), (17.161)

Ê(s) = ĉ(1, s) = Laplace transform of the RTD curve (17.162)

and, noting that λ₁ + λ₂ = Pe,

Ê(s) = Pe e^{Pe}[λ₂ − λ₁]/(λ₂²e^{λ₂} − λ₁²e^{λ₁}) (17.163)

Write

λ₁ = (Pe/2)(1 + q); q = √(1 + 4s/Pe) (17.164)
λ₂ = (Pe/2)(1 − q) ⟹ λ₂ − λ₁ = −Pe q (17.165)

Ê(s) = 4q e^{Pe/2}/[(1 + q)²e^{Pe q/2} − (1 − q)²e^{−Pe q/2}] = 4q e^{Pe/2}/H(q) (17.166)

where

H(q) = (1 + q)²e^{Pe q/2} − (1 − q)²e^{−Pe q/2} (17.167)

and H(−q) = −H(q). Thus, H(q) is an odd function and may be written as

H(q) = q h(q²) (17.168)

⟹ Ê(s) contains only even powers of q, and hence q = 0, or equivalently s = −Pe/4,
is not a branch point. The poles of Ê(s) are given by

H(q) = 0 (q = 0 is not a branch point since Ê(s) = 4e^{Pe/2}/h(q²) for q → 0)

⟹ (1 − q)²/(1 + q)² = e^{Pe q}, i.e.,

(1 + q² − 2q)/(1 + q² + 2q) = e^{Pe q}

⟹ 2q/(1 + q²) = (1 − e^{Pe q})/(1 + e^{Pe q}) = tanh(−Pe q/2) (17.169)

Let

q = iQ (17.170)

⟹

2Q/(1 − Q²) + tan(Pe Q/2) = 0 (17.171)

⟹ The poles are determined by the equation

tan Λ = Λ Pe/(Λ² − Pe²/4) (17.172)

where

Λ = Pe Q/2 = −i(Pe/2)q = −i(Pe/2)√(1 + 4s/Pe) (17.173)
Λ² = −(Pe²/4)(1 + 4s/Pe) = −Pe²/4 − s Pe (17.174)

Denote the roots of the characteristic equation (17.172) as Λ₁, Λ₂, . . . . Each of these
roots corresponds to a pole of Ê(s), and the poles are located on the negative real axis:

s = s_j = −(1/Pe)(Λ_j² + Pe²/4), j = 1, 2, . . . (Λ_j ≠ 0) (17.175)

These roots are shown in Figure 17.10 for Pe = 5.


A detailed discussion of the nature of the roots of equations (17.172) or (17.173)–
(17.174) is given in Chapter 23.
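As a minimal Mathematica sketch (our own illustration, for Pe = 5), equation (17.172) can be multiplied through by cos Λ (Λ² − Pe²/4) to remove the poles of tan Λ, and the real roots on a bounded interval found directly:

pe = 5;
(* Pole-free form of eq. (17.172): Sin[L](L^2 - pe^2/4) == L pe Cos[L], with L = Λ *)
lams = L /. NSolve[{Sin[L] (L^2 - pe^2/4) == L pe Cos[L], 0 < L < 32}, L];
poles = -(lams^2 + pe^2/4)/pe   (* pole locations s_j, eq. (17.175) *)

NSolve with an interval constraint requires a reasonably recent Mathematica version; FindRoot with one starting guess per interval ((j − 1)π, jπ) works equally well.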

Figure 17.10: Schematic diagram showing the location of the roots of characteristic equation in
Laplace domain.

RTD curve
The RTD curve can be obtained from the residue theorem as

E(t) = (1/2πi) ∫_{γ−i∞}^{γ+i∞} e^{st}Ê(s) ds = ∑_j Residue[e^{st}Ê(s)]|_{s=s_j} (17.176)

Let

Residue[e^{st}Ê(s)]|_{s=s_j} = R_j = e^{s_j t} lim_{s→s_j}(s − s_j)Ê(s). (17.177)

Evaluating the limit and simplifying gives

R_j = e^{s_j t} A_j (17.178)

Thus, we obtain the solution as an infinite sum

E(t) = ∑_{j=1}^∞ e^{s_j t} A_j (17.179)

where

s_j = −(1/Pe)(Λ_j² + Pe²/4) (17.180)

and

A_j = 2 Pe(1 + 4s_j/Pe) e^{Pe/2}/H′(q_j) (17.181)

⟹

E(t) = e^{Pe/2} ∑_{j=1}^∞ [(−1)^{j+1} 8Λ_j²/(4Λ_j² + 4 Pe + Pe²)] e^{s_j t} (17.182)

A plot of the RTD curve is shown in Figure 17.11 for three different values of the Peclet
number.

Figure 17.11: RTD curves for Pe = 0.5, 2.0 and 5.0.

Similarly, the results from numerical inversion are shown in Figure 17.12 for Pe = 5.0,
along with the solution from the residue theorem (equation (17.182)) with 10 terms in
the summation.

Figure 17.12: RTD curves for Pe = 5.0 from residue theorem (solid line) and numerical inversion of
Laplace transform (marker points).
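Reusing the roots lams computed above, a sketch of the truncated residue series (17.182) for Pe = 5 (again our own illustration, with 10 terms as in the figure):

sj = -(lams^2 + pe^2/4)/pe;   (* eq. (17.175) *)
rtd[t_] := Exp[pe/2] Sum[(-1)^(j + 1) 8 lams[[j]]^2/(4 lams[[j]]^2 + 4 pe + pe^2)*
    Exp[sj[[j]] t], {j, 1, 10}];   (* eq. (17.182) *)
Plot[rtd[t], {t, 0.01, 3}, AxesLabel -> {"t", "E(t)"}]   (* cf. Figure 17.12 *)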

Alternate method of analysis of the RTD curve

Moments
Recalling the property (equations (17.24)–(17.25)) of the Laplace transform (discussed in an
earlier section), the nth moment of a function f(t) can be obtained from its Laplace trans-
form F(s) = ℓ[f(t)] as follows:

Mₙ = (−1)ⁿ dⁿF(s)/dsⁿ|_{s=0} (17.183)

Central moments

m₂ = σ² = ∫₀^∞ (t − t̄)² f(t) dt = M₂ − M₁² (17.184)
m₃ = ∫₀^∞ (t − t̄)³ f(t) dt = M₃ + 2M₁³ − 3M₁M₂ (17.185)

Solution of axial dispersion model

Consider the solution of the axial dispersion model as given in equation (17.166):

Ê(s) = 4q e^{Pe/2}/[(1 + q)²e^{Pe q/2} − (1 − q)²e^{−Pe q/2}] (17.186)

where q is given from equation (17.164) as

q = √(1 + 4s/Pe) = 1 + 2s/Pe − 2s²/Pe² + ⋯

and

exp(Pe q/2) = e^{Pe/2} e^{s − s²/Pe + ⋯}

Thus,

Ê(s) = 4(1 + 2s/Pe − 2s²/Pe² + ⋯)e^{Pe/2} / [(4 + 8s/Pe − 4s²/Pe²)e^{Pe/2}e^{s−s²/Pe+⋯} − (4s²/Pe²)e^{−Pe/2}e^{−s+s²/Pe+⋯}]
    = 1 − s + (1/2 + 1/Pe − 1/Pe² + e^{−Pe}/Pe²)s² + O(s³) (17.187)

But

Ê(s) = ℓ[E(t)] = ∫₀^∞ e^{−st}E(t) dt

⟹ from equation (17.183)

Ê(s) = ∑_{n=0}^∞ [dⁿÊ/dsⁿ|_{s=0}] sⁿ/n! = ∑_{n=0}^∞ (−1)ⁿ Mₙ sⁿ/n!
     = 1 − M₁s + (M₂/2!)s² − ⋯ (17.188)

Comparing (17.187) and (17.188), we get

M₀ (zeroth moment) = 1 (17.189)
M₁ (first moment) = 1 (17.190)
M₂ (second moment) = 1 + 2/Pe − 2/Pe² + 2e^{−Pe}/Pe² (17.191)

⟹ from equation (17.184)

σ² = M₂ − M₁² = 2/Pe − (2/Pe²)(1 − e^{−Pe}) (17.192)

Note that for Pe ≫ 1, σ² ≈ 2/Pe. The higher-order moments can be obtained similarly by
Taylor series expansion (using Mathematica®) as

M₃ = 1 + 6/Pe + 6/Pe² + 18e^{−Pe}/Pe² − 24/Pe³ + 24e^{−Pe}/Pe³ (17.193)
M₄ = 1 + 12/Pe + 48/Pe² + 108e^{−Pe}/Pe² + 360e^{−Pe}/Pe³ − 336/Pe⁴ + 312e^{−Pe}/Pe⁴ + 24e^{−2Pe}/Pe⁴. (17.194)

The third central moment is given by

m₃ = M₃ + 2M₁³ − 3M₁M₂ = 12/Pe² + 12e^{−Pe}/Pe² − 24/Pe³ + 24e^{−Pe}/Pe³ (17.195)

For Pe ≫ 1,

m₃ ≈ 12/Pe² (17.196)

which means that the RTD curve is positively skewed (i. e., longer tail on right side).
This can be observed in the plots shown in Figures 17.11 and 17.12.
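The moment expressions above can be generated in a few lines of Mathematica; the sketch below (our own, with Pe kept symbolic) implements the expansion route of equations (17.186)–(17.188):

Clear[pe, s];
q = Sqrt[1 + 4 s/pe];
ehat = 4 q Exp[pe/2]/((1 + q)^2 Exp[pe q/2] - (1 - q)^2 Exp[-pe q/2]); (* eq. (17.186) *)
(* n-th moment: Mn = (-1)^n n! (coefficient of s^n in ehat), cf. eq. (17.188) *)
mom[n_] := FullSimplify[(-1)^n n! SeriesCoefficient[ehat, {s, 0, n}]];
{mom[0], mom[1], mom[2], mom[3]} (* reproduces eqs. (17.189)-(17.191) and (17.193) *)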

Figure 17.13: Schematic diagram of a packed-bed.

17.5.4 Unsteady-state operation of a packed-bed

A number of chemical engineering processes utilize the unsteady-state operation of a
packed bed. These include a heat regenerator, an adsorption column, an ion-exchange
column and a chromatographic column. Typically, the equipment consists of a column
filled with a loosely packed solid as shown schematically in Figure 17.13. A fluid is
passed through the column, and the fluid and solid exchange heat or mass or both.
Since the solid remains fixed in place, the temperature and/or composition of the solid
changes with time, and the operation of the column is an unsteady-state process. We
consider here the heat transfer problem. A closely related problem of chromatography
is discussed in Section 17.6.

Formulation of the heat transfer problem


Consider a packed bed in which heat is transformed from the fluid phase to the solid
phase. Let

ρs, Cps = solid density, heat capacity
ρf, Cpf = fluid density, heat capacity
ε = fractional volume of the bed available for flow (void fraction)
h = overall heat transfer coefficient between solid and fluid
av = specific surface area (area of transfer per unit volume of bed)
u0 = interstitial fluid velocity (= superficial velocity/ε)
Ts = solid temperature
Tf = fluid temperature
Tr = reference temperature
Ac = area of cross-section of the bed
L = length of bed
x = distance from the bed entrance

Energy balance for the fluid phase

Ac ε u0 ρf Cpf (Tf − Tr)|x − Ac ε u0 ρf Cpf (Tf − Tr)|x+Δx − Ac Δx av h(Tf − Ts)
  (inflow)                     (outflow)                   (transfer to solid phase)
  = (∂/∂t)[Ac Δx ε ρf Cpf (Tf − Tr)] (17.197)
    (accumulation)

Assuming constant (and average) physical properties and taking the limit Δx → 0,
we get

ε ρf Cpf ∂Tf/∂t = −h av (Tf − Ts) − ε u0 ρf Cpf ∂Tf/∂x (17.198)

Energy balance for the solid phase

h av Ac Δx (Tf − Ts) = (∂/∂t)[Ac Δx (1 − ε) ρs Cps (Ts − Tr)]

⟹

(1 − ε) ρs Cps ∂Ts/∂t = h av (Tf − Ts) (17.199)

Assume that at time t = 0, the fluid and solid in the bed are at T = T0 and for
t > 0 the fluid enters the bed at a temperature T = Tin , i. e., the initial and boundary
conditions can be expressed as

Ts (x, 0) = T0 ; Tf (x, 0) = T0 ; Tf (0, t) = Tin (t) (17.200)



Defining dimensionless variables

z = x/L; τ = t u0/L; θf = (Tf − T0)/T0; θs = (Ts − T0)/T0, (17.201)

equations (17.198) and (17.199) may be written as

∂θs/∂τ = [h av L/(u0 (1 − ε) ρs Cps)](θf − θs) (17.202)
∂θf/∂τ = −∂θf/∂z − [h av L/(u0 ε ρf Cpf)](θf − θs) (17.203)

with initial and boundary conditions (equation (17.200)) simplifying to

θs(z, 0) = θf(z, 0) = 0 (17.204)
θf(0, τ) = θin(τ) = (Tin(t) − T0)/T0. (17.205)

The dimensionless equations (17.202) and (17.203) contain two dimensionless groups,
namely

ph = u0 ε ρf Cpf/(h av L) = th/tc = local (or transverse) heat Peclet number; (17.206)
αh = (1 − ε) ρs Cps/(ε ρf Cpf) = heat capacitance ratio of solid to fluid in the bed. (17.207)

Here, tc (= L/u0) is the convection time while th (= ε ρf Cpf/(h av)) is the heat exchange
time between solid and fluid. The model in dimensionless form can be expressed as

αh ph ∂θs/∂τ = θf − θs (17.208)
ph (∂θf/∂τ + ∂θf/∂z) = −(θf − θs) (17.209)

with ICs

θs (z, 0) = θf (z, 0) = 0 (17.210)

and BC

θf (0, τ) = θin (τ). (17.211)

This model ignores heat conduction in the solid and fluid phases. It is the simplest
nontrivial model for unsteady-state heat transfer in a packed bed. As shown later, the
same model appears in many other packed-bed operations such as chromatography
and mass transfer operations. We consider here only the special case of a unit-step
input, i.e., θin(τ) = H(τ) = Heaviside function.
Let

Θs(z, s) = ℓ{θs(z, τ)} = ∫₀^∞ e^{−sτ} θs(z, τ) dτ and Θf(z, s) = ℓ{θf(z, τ)}

Taking Laplace transforms on both sides, equations (17.208)–(17.211) ⟹

αh ph s Θs = Θf − Θs (17.212)
ph (s Θf + dΘf/dz) = −(Θf − Θs) (17.213)
Θf(0, s) = 1/s (17.214)
Equation (17.212) ⟹

Θs = Θf/(1 + αh ph s) (17.215)

Substituting equation (17.215) in equation (17.213), we get

dΘf/dz + Θf [1 + αh/(1 + αh ph s)] s = 0 (17.216)

Θf = (1/s) exp[−s(1 + αh/(1 + αh ph s))z]
   = (1/s) exp[−sz] exp[−αh s z/(1 + αh ph s)]
   = (1/s) exp[−sz] exp[−(z/ph)(1 − 1/(1 + αh ph s))]

⟹

Θf = exp[−z/ph] (exp[−sz]/s) exp[ (z/(αh ph²))/(s + 1/(αh ph)) ] (17.217)

Equation (17.215) ⟹

Θs = (1/(αh ph)) exp[−z/ph] (exp[−sz]/s) [1/(s + 1/(αh ph))] exp[ (z/(αh ph²))/(s + 1/(αh ph)) ] (17.218)

Equations (17.217) and (17.218) are the Laplace transforms of the fluid and solid
temperatures. By inverting them, we obtain the temperatures in the time domain.
Using the formulas

ℓ⁻¹{1/s} = 1
⟹
ℓ⁻¹{e^{−sz}/s} = H(τ − z) = { 1, τ > z; 0, τ < z } (17.219)

and

ℓ{f (t)} = F(s), then ℓ{e−at f (t)} = F(s + a) (Shift theorem), (17.220)

we have

ℓ⁻¹[(1/(s + β)) exp(λ/(s + β))] = e^{−βτ} ℓ⁻¹[(1/s) e^{λ/s}] (17.221)

Since

(1/s) exp(λ/s) = (1/s) ∑_{n=0}^∞ λⁿ/(n! sⁿ) = ∑_{n=0}^∞ λⁿ/(n! sⁿ⁺¹)

and

ℓ⁻¹{1/sⁿ⁺¹} = τⁿ/n!

⟹

ℓ⁻¹{(1/s) exp(λ/s)} = ∑_{n=0}^∞ (λτ)ⁿ/(n!)² = ∑_{n=0}^∞ (2√(λτ))²ⁿ/(2²ⁿ(n!)²) = I₀(2√(λτ)) (17.222)

where I₀ is the modified Bessel function of order zero, which is defined by

I₀(ξ) = ∑_{k=0}^∞ (ξ/2)²ᵏ/(k!)² = ∑_{k=0}^∞ ξ²ᵏ/(2²ᵏ(k!)²).

Thus, by setting β = 1/(αh ph) and λ = z/(αh ph²), equations (17.221) and (17.222) ⟹

ℓ⁻¹[ (1/(s + 1/(αh ph))) exp( (z/(αh ph²))/(s + 1/(αh ph)) ) ]
  = exp(−τ/(αh ph)) ℓ⁻¹[(1/s) exp( z/(αh ph² s) )]
  = exp(−τ/(αh ph)) I₀(2√(zτ/(αh ph²))) (17.223)

Thus, using the convolution theorem (17.19)–(17.20) and equations (17.219) and (17.223),
we can express the inverse Laplace transform of equation (17.218) as

θs(z, τ) = ℓ⁻¹[Θs(z, s)]
        = (1/(αh ph)) exp[−z/ph] ∫₀^τ H(τ − t − z) exp(−t/(αh ph)) I₀(2√(zt/(αh ph²))) dt (17.224)

Since

H(τ − t − z) = { 1, 0 < t < τ − z; 0, t > τ − z }

equation (17.224) simplifies to

θs(z, τ) = { (1/(αh ph)) exp[−z/ph] ∫₀^{τ−z} exp(−t/(αh ph)) I₀(2√(zt/(αh ph²))) dt, τ > z
           { 0, τ < z (17.225)

Now we can obtain θf (z, τ) by taking inverse Laplace transform of equation (17.217).
Alternatively, we can use equations (17.208) and (17.225) to obtain θf (z, τ), which is
given as follows:

θf = θs + αh ph ∂θs/∂τ
   = { (1/(αh ph)) exp[−z/ph] ∫₀^{τ−z} exp(−t/(αh ph)) I₀(2√(zt/(αh ph²))) dt
     {   + exp[−z/ph] exp((z − τ)/(αh ph)) I₀(2√(z(τ − z)/(αh ph²))), τ > z (17.226)
     { 0, τ < z

From equation (17.225), we obtain

θs(0, τ) = 1 − e^{−τ/(αh ph)}

Figure 17.14 below shows the breakthrough curve, i.e., the fluid temperature at the exit
θf(1, τ), for αh = 10 and ph = 0.01, 0.05 and 0.1. The 3D and density plots of the solid-
phase temperature θs are also shown in this figure (on the right) for αh = 10 and
ph = 0.1.
Note that for small values of ph, the temperature front moves with a speed of 1/(1 + αh),
or the breakthrough time is τ ≈ 1 + αh, and the spread (dispersion) is symmetric. In the
limit ph → 0, the breakthrough curve is a step function with a jump at τ = 1 + αh.

Figure 17.14: Breakthrough curves θf (1, τ) for αh = 10 and different Peclet numbers ph between 0.01
to 0.1 (left); and 3D and density plots of dimensionless temperatures of solid θs (z, τ) for αh = 10 and
ph = 0.1 (right).
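The breakthrough curve is easy to evaluate numerically from equation (17.226); the following Mathematica sketch (our own, for αh = 10 and ph = 0.1 as in the figure) combines NIntegrate with the built-in Bessel function:

ah = 10; ph = 0.1; z = 1;   (* exit of the bed *)
thetaf[tau_?NumericQ] := If[tau <= z, 0,
   Exp[-z/ph] (NIntegrate[Exp[-t/(ah ph)] BesselI[0, 2 Sqrt[z t/(ah ph^2)]],
        {t, 0, tau - z}]/(ah ph)
      + Exp[(z - tau)/(ah ph)] BesselI[0, 2 Sqrt[z (tau - z)/(ah ph^2)]])];
Plot[thetaf[tau], {tau, 0, 30}, PlotRange -> {0, 1},
 AxesLabel -> {"τ", "θf(1,τ)"}]   (* cf. Figure 17.14: front near τ ≈ 1 + αh *)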

17.6 Control system with delayed feedback


In this section, we illustrate the solution of linear initial value problems with time
delay using the Laplace transform technique.

17.6.1 PI control with delayed feedback

Consider a single input-single output (SISO) first-order system with PI control. The
dynamics of such control system can be described by the linear equation:

τ dx/dt + x = u + f(t) (17.227)
u = −kP [x(t − τD) + kI ∫₀ᵗ x(t′) dt′] (17.228)
x(t) = 0, −τD < t ≤ 0 (17.229)

where x and u are the state and the control variables, τ is the process time constant,
τD is the delay time, kP and kI are proportional and integral gains, and f (t) is the dis-
turbance function. Let X(s) and F(s) be the Laplace transform of x(t) and f (t), i. e.,

ℒ[x(t)] = X(s); ℒ[f(t)] = F(s) (17.230)

⟹

ℒ[∫₀ᵗ x(t′) dt′] = X(s)/s; (17.231)
ℒ[dx/dt] = sX(s) − x(0) = sX(s); (17.232)

and

ℒ[x(t − τD)] = ∫₀^∞ exp(−st′) x(t′ − τD) dt′
            = ∫_{−τD}^∞ exp(−s(t₁ + τD)) x(t₁) dt₁ (taking t₁ = t′ − τD)
            = exp(−sτD)[∫_{−τD}^0 exp(−st₁) x(t₁) dt₁ + ∫₀^∞ exp(−st₁) x(t₁) dt₁]
            = exp(−sτD)[0 + X(s)]

⟹

ℒ[x(t − τD)] = exp(−sτD) X(s) (17.233)

Thus, the Laplace transform of the control variable u may be obtained as

ℒ[u] = U(s) = −kP [exp(−sτD) + kI/s] X(s) (17.234)

Thus, taking the Laplace transform of equations (17.227)–(17.229) on both sides, we get

(sτ + 1)X(s) = U(s) + ℒ[f(t)] = −kP[exp(−sτD) + kI/s]X(s) + F(s)

⟹

X(s) = F(s)/[(sτ + 1) + kP exp(−sτD) + kP kI/s] (17.235)

For a unit step disturbance, i.e.,

f(t) = H(t) = { 1, t ≥ 0; 0, t < 0 } or F(s) = 1/s,

the response to the unit step disturbance can be expressed in Laplace domain as

X(s) = 1/(τs² + [1 + kP exp(−sτD)]s + kP kI). (17.236)

We consider the inversion of equation (17.236) for various cases.

Case 1: PI control with no delay


Consider the PI control system with no delay (i. e., τD = 0 and kI ≠ 0). In this case, the
response to a unit-step disturbance can be simplified in Laplace domain from equation
(17.236) as

X(s) = 1/(τs² + (1 + kP)s + kP kI) (17.237)

which can be further simplified as

X(s) = 1/(τ(s − s₁)(s − s₂)) (17.238)

where s1 and s2 are the roots of the denominator function in equation (17.237), which
can be expressed as

s₁,₂ = [−(1 + kP) ± √((1 + kP)² − 4τkP kI)]/(2τ) (17.239)

The response function X(s) can be simplified further from equation (17.238) as

X(s) = [1/(τ(s₁ − s₂))][1/(s − s₁) − 1/(s − s₂)] (17.240)

󳨐⇒

x(t) = (1/τ)[(e^{s₁t} − e^{s₂t})/(s₁ − s₂)] (17.241)

Note from equation (17.239) that, since kP > 0, kI > 0 and τ > 0 for a physical system
with PI control, the real parts of s₁ and s₂ are always negative, i.e.,

Re(s₁) < 0 and Re(s₂) < 0;

therefore, x(t) goes to zero as t → ∞, i.e.,

lim_{t→∞} x(t) = 0.

In other words, such a control system leads to no offset. In addition, when s₁ and s₂
are real (i.e., (1 + kP)² ≥ 4τkP kI), x(t) goes through a maximum at t = ln(s₁/s₂)/(s₂ − s₁).
But when s₁ and s₂ are complex, x(t) goes to zero in an oscillatory manner. For example,
when s₁ and s₂ are complex, i.e., s₁,₂ = −a ± ib, the response to a unit step is given from
equation (17.241) as

x(t) = (exp(−at)/(bτ)) [(exp(ibt) − exp(−ibt))/(2i)]
     = (exp(−at)/(bτ)) sin(bt) (17.242)
Further, when kI = 0 (i.e., proportional control only), the roots are s₁ = 0 and
s₂ = −(1 + kP)/τ, and in this case the response to a unit disturbance can be expressed as

x(t) = [1 − exp(−(1 + kP)t/τ)]/(1 + kP). (17.243)

Similarly, when (1 + kP)² = 4τkP kI, the system has a repeated root, i.e., s₁ = s₂ =
−(1 + kP)/(2τ). In this case, the response function simplifies from equation (17.241) to

x(t) = lim_{s₁→s₂} (1/τ)[(e^{s₁t} − e^{s₂t})/(s₁ − s₂)]
     = (t/τ) e^{s₂t}
     = (t/τ) exp[−(1 + kP)t/(2τ)] (17.244)

Figure 17.15 shows the response of PI control systems for the cases discussed below:
1. τ = 1, kI = 0, kP = 1. In this case, the roots are s₁,₂ = 0, −2, and equation (17.243)
gives

x(t) = [1 − exp(−2t)]/2

2. τ = 1, kI = 1, kP = 1. In this case, the roots are given from equation (17.239) as
s₁,₂ = −1, −1 ⟹ from equation (17.244),

x(t) = t exp(−t)

Figure 17.15: Response to a unit step disturbance for a PI control system.

3. τ = 1, kI = 1, kP = 10. In this case, the roots are given from equation (17.239) as

s₁,₂ = (−11 ± √(121 − 40))/2 = (−11 ± 9)/2 = −1, −10

⟹ from equation (17.241),

x(t) = [exp(−t) − exp(−10t)]/9

4. τ = 1, kI = 10, kP = 1. In this case, the roots are given from equation (17.239) as

s₁,₂ = (−2 ± √(4 − 40))/2 = (−2 ± 6i)/2 = −1 + 3i, −1 − 3i

⟹ from equation (17.242),

x(t) = (exp(−t)/3) sin(3t)
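For Case 1 (no delay), the rational transform (17.237) can be inverted symbolically, which gives a quick check of the four examples above; a minimal Mathematica sketch (ours):

resp[tau_, kP_, kI_] := InverseLaplaceTransform[1/(tau s^2 + (1 + kP) s + kP kI), s, t];
resp[1, 1, 1]    (* t E^-t, example 2 *)
resp[1, 10, 1]   (* (E^-t - E^(-10 t))/9, example 3 *)
resp[1, 1, 10]   (* E^-t Sin[3 t]/3, example 4 *)
Plot[Evaluate[{resp[1, 1, 0], resp[1, 1, 1], resp[1, 10, 1], resp[1, 1, 10]}],
 {t, 0, 6}]      (* cf. Figure 17.15 *)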

Case 2: proportional control with delay


Consider the proportional control system with delay (i.e., kI = 0 and τD ≠ 0). In this
case, the response to a unit-step disturbance can be simplified in the Laplace domain from
equation (17.236) as

X(s) = 1/(s[τs + 1 + kP exp(−sτD)]); (17.245)

Let the function in denominator be considered as sG(s). Then we have

1
X(s) = (17.246)
sG(s)

G(s) = τs + 1 + kP exp(−sτD) (17.247)
G′(s) = τ − kP τD exp(−sτD) (17.248)
G″(s) = kP τD² exp(−sτD) > 0 (17.249)

Thus, G″(s) is always positive (for real s), and hence G cannot have roots repeated
more than twice. The twice-repeated roots can be obtained by solving G(s) = G′(s) = 0
simultaneously, which leads to

τ/τD = kP exp(−sj τD) = −(1 + τ sj) (17.250)

This results in

sj = −(1 + α)/τD; α = τD/τ; 1/kP = α exp(1 + α) and G″(sj) = τD τ (17.251)

Thus, if kP, τ and τD satisfy the constraint given in equation (17.251), then sj = −(1 + α)/τD
is a repeated root of G(s), or a second-order pole of X(s); otherwise, each sj given by

G(sj) = τ sj + 1 + kP exp(−sj τD) = 0

is a simple pole. Also note that since α exp(1 + α) is an increasing function of α, the
relation 1/kP = α exp(1 + α) in equation (17.251) has a unique solution in α for a given kP.
Thus, for a given set of kP, τ and τD that satisfy the relation given in equation (17.251),
only one repeated root exists.
Note that s = 0 is a simple pole of X(s) and the residue at s₀ = 0 is given by

Res_{s=0} exp(st)X(s) = lim_{s→0} 1/G(s) = 1/(1 + kP) (17.252)

If s* is the twice-repeated root and sᵢ are the other nonzero simple roots of G(s), the
response to a unit disturbance can be obtained from the residue theorem as

x(t) = ∑_s Res[exp(st)X(s)]
     = Res_{s=0}[exp(st)X(s)] + ∑_{i=1} Res_{s=sᵢ}[exp(st)X(s)] + Res_{s=s*}[exp(st)X(s)]
     = 1/(1 + kP) + ∑_{i=1} exp(sᵢt)/(sᵢ G′(sᵢ)) + 2 exp(s*t)/(s* G″(s*))

⟹

x(t) = 1/(1 + kP) + ∑_{i=1}^∞ exp(sᵢt)/(sᵢ[τ − kP τD exp(−sᵢτD)]) + 2 exp(s*t)/(s* τD τ) (17.253)

Stability analysis
Consider again Case 2 (i.e., proportional control with delay). The control system is sta-
ble when all the roots of the denominator of X(s) lie in the left half-plane (i.e., Re(sj) < 0).
We can determine the criteria at which these roots cross the imaginary axis and pass
from the left to the right half-plane. For this, we substitute s = jω, j = √−1 (i.e., roots
lying on the imaginary axis) in the expression for G(s) given in equation (17.247), which
leads to

G(jω) = jωτ + 1 + kP exp(−jωτD)
      = jωτ + 1 + kP[cos(ωτD) − j sin(ωτD)]
      = [1 + kP cos(ωτD)] + j[ωτ − kP sin(ωτD)] = 0

⟹

cos(ωτD) = −1/kP (17.254)
and sin(ωτD) = ωτ/kP (17.255)
The relations (17.254)–(17.255) cannot be satisfied if kP < 1. In other words, when kP <
1 the roots never cross the imaginary axis and always lie in the left half-plane, i.e., the
control system is stable. When kP > 1, equation (17.254) leads to

ωτD = cos⁻¹(−1/kP) = π − sin⁻¹(√(kP² − 1)/kP). (17.256)

Similarly, equations (17.254) and (17.255) give

ωτ = √(kP² − 1) (17.257)

Thus, ω can be eliminated from equations (17.256) and (17.257), which leads to

τD/τ = cos⁻¹(−1/kP)/√(kP² − 1) (17.258)

Thus, we summarize the analysis by the following two points:
1. If 0 ≤ kP ≤ 1, then all the roots of G(s) = 0 are in the left half-plane and the control
system is stable.
2. If kP > 1, then the locus separating the stable and unstable regions is given by
equation (17.258). This is shown schematically in Figure 17.16.
The above results can be extended to systems described by second-, third- and
higher-order transfer functions. To illustrate the above two points, we consider
17.6 Control system with delayed feedback | 411

Figure 17.16: Stability region of linear control system with delayed proportional control.

three examples with kP = 2 (⟹ cos⁻¹(−1/kP)/√(kP² − 1) = 2π/(3√3) = 1.2092) and τ = 1:
(i) τD = 1.21, (ii) τD = 0.605 < 1.21 and (iii) τD = 2.418 > 1.21. The responses to a unit-step
disturbance corresponding to these cases are shown in Figure 17.17.
corresponding to these cases are shown in Figure 17.17.

Figure 17.17: Response to unit step function for proportional control with delay for stable, unstable
and oscillatory region.

Determination of the poles of the transfer function in the complex plane


Express the root of G(s) as

s = a + ib (17.259)

⟹ from equation (17.247),

G(s) = 0 ⟹ τ(a + ib) + 1 + kP exp(−τD a) exp(−iτD b) = 0

⟹

1 + τa + kP exp(−τD a) cos(τD b) = 0
and τb − kP exp(−τD a) sin(τD b) = 0

⟹

a = (1/τD) ln[kP sin(τD b)/(τb)] (17.260)

and

1 + (τ/τD) ln[kP sin(τD b)/(τb)] + τb cot(τD b) = 0 (17.261)

Thus, for a given set of τ, τD and kP, equations (17.260)–(17.261) can be solved numerically
for real values of b and a, and hence the roots. Using these roots, the response
curve can be obtained from equation (17.253) with sj = aj ± ibj.
Example 1: τ = τD = kP = 1.
In this case, the response function is given from equations (17.246)–(17.249) as

X(s) = 1/(sG(s)); G(s) = 1 + s + exp(−s)

The roots s = a + ib can be obtained by solving equations (17.260)–(17.261):

1 + ln[sin(b)/b] + b cot(b) = 0 and a = ln[sin(b)/b] (17.262)

Table 17.1 lists the first few roots of G(s) in this example. These roots are also shown in
Figure 17.18.
Using these roots, the response function can be expressed from equation (17.253) as

x(t) = 1/2 + ∑_{i=1}^∞ ( exp[(aᵢ + ibᵢ)t]/((aᵢ + ibᵢ)[τ − kP τD exp(−(aᵢ + ibᵢ)τD)])
     + exp[(aᵢ − ibᵢ)t]/((aᵢ − ibᵢ)[τ − kP τD exp(−(aᵢ − ibᵢ)τD)]) ). (17.263)

Using the first 16 conjugate root pairs (listed in Table 17.1), the expression given in equa-
tion (17.263) is plotted in Figure 17.19 as red solid lines, along with the numerical solution
based on the method of steps (green dashed line).

Table 17.1: First few roots for proportional control with delayed feedback (τ = τD = kp = 1).

j aj bj sj = aj + ibj
1 −0.605021 ±1.78819 −0.605021 ± 1.78819i
2 −2.05283 ±7.71841 −2.05283 ± 7.71841i
3 −2.64736 ±14.0202 −2.64736 ± 14.0202i
4 −3.01658 ±20.3214 −3.01658 ± 20.3214i
5 −3.28526 ±26.6179 −3.28526 ± 26.6179i
6 −3.49668 ±32.911 −3.49668 ± 32.911i
7 −3.67104 ±39.2019 −3.67104 ± 39.2019i
8 −3.81944 ±45.4912 −3.81944 ± 45.4912i
9 −3.94861 ±51.7794 −3.94861 ± 51.7794i
10 −4.06298 ±58.0668 −4.06298 ± 58.0668i
11 −4.1656 ±64.3535 −4.1656 ± 64.3535i
12 −4.25866 ±70.6397 −4.25866 ± 70.6397i
13 −4.34378 ±76.9256 −4.34378 ± 76.9256i
14 −4.42223 ±83.2111 −4.42223 ± 83.2111i
15 −4.49496 ±89.4964 −4.49496 ± 89.4964i
16 −4.56276 ±95.7814 −4.56276 ± 95.7814i

Figure 17.18: First few roots of G(s) = 0 for proportional control with delayed feedback (τ = τD =
kp = 1).

Both match very well, as expected. Note that since Re(sj) < 0 for all j,
lim_{t→∞} x(t) = 1/(1 + kP) = 1/2.
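A Mathematica sketch reproducing Table 17.1 and the series (17.263) (our own illustration; the starting guesses b ≈ 2πj + 1.5 are read off Figure 17.18 and may need tuning):

(* Roots of eq. (17.262) for τ = τD = kP = 1 *)
bs = Table[b /. FindRoot[1 + Log[Sin[b]/b] + b Cot[b] == 0, {b, 2 Pi j + 1.5}],
   {j, 0, 15}];
as = Log[Sin[bs]/bs];            (* a_j from eq. (17.262) *)
sj = as + I bs;                  (* s_j = a_j + i b_j, upper half-plane roots *)
(* eq. (17.263): each conjugate pair contributes twice the real part *)
xresp[t_] := 1/2 + Total[2 Re[Exp[sj t]/(sj (1 - Exp[-sj]))]];
Plot[xresp[t], {t, 0, 15}, AxesLabel -> {"t", "x(t)"}]   (* cf. Figure 17.19 *)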

Numerical inversion of the Laplace transform


In addition to the residue theorem and other numerical methods, equation (17.245)
can also be solved by inverting the Laplace transform numerically. For example, Fig-
ure 17.20 shows a comparison of the results from numerical inversion and the residue
theorem for τ = kP = 1 with a small delay of τD = 1 (top plot) as well as a larger delay
of τD = 10 (bottom plot).
It can be seen that the numerical inversion agrees well with the solution from the
residue theorem. However, the numerical inversion sometimes fails at certain times

Figure 17.19: Response to unit step disturbance equipped with proportional control with delayed
feedback for τ = τD = kp = 1.

Figure 17.20: Response to unit-step disturbance equipped with proportional control with delayed
feedback for τ = kp = 1 with τD = 1 (top plot) and τD = 10 (bottom plot) and from residue theorem
(solid lines) and numerical inversion of Laplace transform (marker points).

where the curve is sharp or the slope is large, especially for larger delays [see the bottom
plot (τD = 10) in Figure 17.20 near time t = 10 or 20].

Problems
1. Use the complex inversion formula to evaluate the inverse Laplace transform of
the following functions:
(i) 1/((s + 1)(s² + 1)) (ii) 1/(s⁴ + 4) (iii) s/(s² + 1)⁴

2. Use the complex inversion formula to evaluate the inverse Laplace transform of
the following functions:
(i) e^{−√s} (ii) 1/(s√(s + 1)) (iii) 1/(s² cosh√s) (iv) cosh(x√s)/(s cosh(a√s)), (0 < x < a)
3. Solve the following linear initial value problems using the Laplace transformation
method:
(i)

d⁴u/dt⁴ − 2 d²u/dt² + u = 0
u(0) = 1, u′(0) = 0, u″(0) = 1, u‴(0) = 0

(ii)

Lu = f(t), u^{[i]}(0) = bᵢ, i = 0, 1, . . . , n − 1

where L is a linear differential operator with constant coefficients. Determine the
asymptotic frequency response of the system for homogeneous initial conditions,
i.e., the form of the solution for t → ∞ when f(t) = A sin ωt and bᵢ = 0 (i = 0, 1, . . . , n − 1).
(iii)

t² d²u/dt² + t du/dt + (t² − 1)u = 0, u(1) = 2, u(t) bounded for all t

4. Solve the following linear integral, integrodifferential and delay (difference) equa-
tions using the Laplace transformation:
(i)

∫₀ᵗ u(t′)u(t − t′) dt′ = 2u(t) + t³/6 − 2t

(ii)

u″(t) + au′(t) = ∫₀ᵗ g(t − t′)u(t′) dt′ + bu(t) + f(t)
u(0) = c₀, u′(0) = c₁

(iii)

u′(t) = au(t) + bu(t − D) + α sin ωt; u(0) = 0

(iv)

u(t) + au(t − 1) + bu(t − 2) = f(t)

where

u(t) = 0 for t < 0 and f(t) = { e^{−t}, t > 0; 0, t < 0 }

5. The dynamic model for a cascade of N perfectly stirred tank reactors (CSTRs) is
given by

(τ/N) dc₁/dt = c₀(t) − c₁(t)
(τ/N) dcᵢ/dt = cᵢ₋₁(t) − cᵢ(t), i = 2, 3, . . . , N

where τ is the mean residence (space) time in the cascade, N is the number of
tanks and cᵢ₋₁(t) and cᵢ(t) are the concentrations of a tracer in the stream enter-
ing and leaving tank i, respectively. The response of the system to a unit impulse
defines the residence time distribution (RTD) function, denoted as E(t), i.e.,

E(t) = c_N(t)

for c₀(t) = δ(t) and the initial conditions cᵢ(t = 0) = 0, i = 1, 2, . . . , N.
(a) Use the Laplace transform technique to show that Ê(s) = 1/(1 + sτ/N)^N, where
Ê(s) is the Laplace transform of E(t). Determine the function E(t).
(b) If the i-th moment of E(t) is defined as

Mᵢ = ∫₀^∞ tⁱ E(t) dt,

use the function Ê(s) to determine the first three moments. If the i-th central
moment is defined by

mᵢ = ∫₀^∞ (t − M₁)ⁱ E(t) dt; i ≥ 2

show that

m₂ = M₂ − M₁² = τ²/N
m₃ = M₃ − 3M₁M₂ + 2M₁³ = 2τ³/N²
6. The dynamic model for a recycle reactor (tubular plug flow reactor with recycle)
is given by

∂c/∂t + ((1 + R)/τ) ∂c/∂z = 0, 0 < z < 1, t > 0
c(0, t) = (R/(1 + R)) c(1, t) + cin(t)/(1 + R)
c(z, 0) = 0, 0 < z < 1,

where τ is the space time and R is the recycle ratio.
(a) Use the Laplace transform technique to show that the RTD function (E(t) =
c(1, t)), which is the response of the system for a unit pulse input (cin(t) = δ(t)),
is given by

E(t) = ∑_{i=0}^∞ [Rⁱ/(1 + R)^{1+i}] δ(t − (i + 1)τ/(1 + R))

Plot a schematic diagram of E(t) for varying values of R.
(b) If the moments of the RTD function are defined by

Mᵢ = ∫₀^∞ tⁱ E(t) dt, mᵢ = ∫₀^∞ (t − M₁)ⁱ E(t) dt, i = 1, 2, . . .

show that

M₁ = τ, M₂ = ((1 + 2R)/(1 + R)) τ²
m₁ = 0, m₂ = (R/(1 + R)) τ²
7. The dynamics of a single input-single output (SISO) first-order system with PI con-
trol is described by the linear equations

τ dx/dt + x = u + f(t)
u = −kP [x(t − τD) + kI ∫₀ᵗ x(t′) dt′],

where x and u are state and control variables, τ is the process time constant, τD
is the delay time, kP and kI are proportional and integral gains and f(t) is the dis-
turbance function. Use the Laplace transform method to determine the response
of the system for a unit step disturbance for the following cases:

      τ    τD    kI          kP
(a)   1    0     0           0, 1, 10
(b)   1    0     1/2, 1, 2   1
(c)   1    1     0           1
(d)   1    0.5   1           1

8. Solve the following initial-boundary value problem using the Laplace transforma-
tion and make a schematic plot of the solution at any fixed position (x ≠ 0) as a
function of time:

∂²u/∂x² = ∂u/∂t; 0 < x < 1, t > 0
IC: u(x, 0) = 0; BCs: u(0, t) = δ(t), u(1, t) = 0

9. Consider the flow system shown in Figure 17.21. Assume that each tank is well
mixed and species A enters tank 1 at a concentration of cin (t) and leaves at c1 (t).
Assume further that VR1 = 1 m3 , VR2 = 32 m3 and q1 = q2 = 2 m3 / min.

Figure 17.21: Schematic diagram of interacting tanks.

(a) Formulate the differential equations describing the system.
(b) Determine the response of the system (i.e., how the exit concentration cᵢ
varies with time) for a unit impulse input cin(t) = δ(t). Assume that no A is
present initially in either tank.
(c) Show a schematic diagram of the response.
(d) Determine the first two moments, and hence the mean and variance of the
response curve.
10. The transient behavior of some biological systems with delayed feedback [as well
as other control systems with delay] is described by the linear equation

du/dt = u(t) − βu(t − τ) + f(t), t > 0;
u(t) = 0, −τ ≤ t ≤ 0.

Here, f(t) is the external input disturbance, τ > 0 is the delay time and β > 1 is the
strength of the delayed feedback.
(a) Determine the steady-state response for a unit-step input.
(b) Write the general form of the transient response for a unit-step input (no need
to compute it).
(c) Can the system go unstable, i.e., can any poles/eigenvalues cross the imaginary
axis? If so, determine the smallest value of τ for which the system becomes
unstable.
(d) Show a schematic diagram of the transient response for β = 2 and τ = 0.60
[Hint: π/(3√3) = 0.6046].
11. [Frequency response]
Consider the inhomogeneous n-th order scalar differential equation

Lu = α₀ dⁿu/dtⁿ + α₁ dⁿ⁻¹u/dtⁿ⁻¹ + ⋯ + αₙ₋₁ du/dt + αₙu = f(t), t > 0
I.Cs: u^{[k]}(t = 0) = 0, k = 0, 1, . . . , (n − 1).

(a) Show that the solution in the Laplace domain can be expressed as

U(s) = ℓ{u(t)} = F(s)/Pₙ(s)

where F(s) = ℓ{f(t)} and

Pₙ(λ) = α₀λⁿ + α₁λⁿ⁻¹ + ⋯ + αₙ₋₁λ + αₙ

(b) Consider the frequency response, i.e., when

f(t) = A sin ωt ⟹ F(s) = Aω/(s² + ω²),

and assume that λ₁, λ₂, . . . , λₙ are n distinct roots of Pₙ(λ) = 0; then the general
form of the solution can be expressed as

u(t) = A[∑_{j=1}^{n} ω e^{λⱼt}/((ω² + λⱼ²) P′ₙ(λⱼ)) + A_R sin(ωt − ϕ)]

where

A_R = amplitude ratio = 1/|Pₙ(iω)|
ϕ = phase lag = arg Pₙ(iω) = −arg (1/Pₙ(iω))

[Remark: When Re(λⱼ) < 0, the steady-state frequency response simplifies to

u(t) = A·A_R sin(ωt − ϕ),

which can be obtained by taking the system transfer function 1/Pₙ(s) and replac-
ing s by iω to get the complex number 1/Pₙ(iω). If we write 1/Pₙ(iω) = A_R e^{−iϕ},
we obtain A_R and ϕ.]
Part IV: Linear ordinary differential equations – boundary value problems
18 Two-point boundary value problems
In this and the next chapter, we discuss the solution of linear differential equations
with prescribed end or boundary conditions. Specifically, we discuss the problem of
determining a function u(x) satisfying an n-th order differential equation in the inde-
pendent variable x (a < x < b) and n boundary conditions involving the function and
its first (n − 1) derivatives at the end points x = a and x = b.

18.1 The adjoint differential operator


Before we study the properties of linear boundary value problems, we revisit the con-
cept of an adjoint to a linear differential operator. As shown in Chapter 14, the concept
of an adjoint plays a very important role in the theory of differential equations and was
first introduced by Lagrange in connection with the problem of finding the integrating
factors. Consider the first-order linear differential operator:

Lu = p₀(x) du/dx + p₁(x)u (18.1)

In general, the RHS of equation (18.1) is not an exact derivative. We would like to find
a function v such that vLu is an exact derivative. Multiplying equation (18.1) by v(x),
we have

vLu = vp₀(x) du/dx + vp₁(x)u
    = d/dx[vp₀u] − u(vp₀)′ + uvp₁
    = d/dx[vp₀u] + [−(p₀v)′ + p₁v]u (18.2)

Let us define

L*v = −(p₀v)′ + p₁v = −p₀v′ + (p₁ − p₀′)v

where L* is also a linear differential operator. Equation (18.2) may now be written as
where L∗ is also a linear differential operator. Equation (18.2) may now be written as

vLu − uL*v = d/dx[vp₀u] (18.3)
Now suppose that v satisfies the homogeneous equation

L∗ v = 0 (18.4)

Then equation (18.3) reduces to


vLu = d/dx[vp₀u], (18.5)

which is an exact derivative. Equation (18.4) is called the adjoint equation to Lu = 0
and equation (18.3) is called the Lagrange identity. We note that the adjoint equation
L*v = 0 may be written as

v′ = [(p₁ − p₀′)/p₀] v

Integrating once, we get

ln v = ∫(−p₀′/p₀ + p₁/p₀) dx = −ln p₀(x) + ∫ [p₁(x)/p₀(x)] dx

Thus, the integrating factor to equation (18.1) is given by

v = [1/p₀(x)] exp{∫ [p₁(x)/p₀(x)] dx}. (18.6)
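As a concrete illustration (our own example, not from the text), take p₀ = x and p₁ = 2; equation (18.6) then gives v = x, and a short Mathematica check confirms that v·Lu is an exact derivative:

p0 = x; p1 = 2;
v = Exp[Integrate[p1/p0, x]]/p0                           (* eq. (18.6): gives v = x *)
Simplify[v (p0 D[u[x], x] + p1 u[x]) - D[v p0 u[x], x]]   (* 0, so v Lu = (v p0 u)' *)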

Now, if we consider the homogeneous equations Lu = 0 and L∗ v = 0, we have from


equation (18.3),

d
[vp0 u] = 0.
dx

Integrating, we obtain

v(x)p0 (x)u(x) = constant. (18.7)

Thus, if we know either u(x) or v(x), we can determine the other, or the solutions to
the equations Lu = 0 and L∗ v = 0 are intimately related.
Now consider the second-order operator

Lu = p₀(x) d²u/dx² + p₁(x) du/dx + p₂(x)u (18.8)

Following the same procedure, we obtain

vLu = vp₀ d²u/dx² + vp₁(x) du/dx + vp₂(x)u
    = vp₀u″ + vp₁u′ + vp₂u
    = (vp₀u′)′ − u′(vp₀)′ + (vp₁u)′ − u(vp₁)′ + vp₂u
    = (vp₀u′)′ − [u(vp₀)′]′ + u(p₀v)″ + (vp₁u)′ − u(vp₁)′ + vp₂u
    = u[(p₀v)″ − (p₁v)′ + p₂v] + d/dx[vp₀u′ − u(vp₀)′ + vp₁u]

⟹

vLu − uL*v = d/dx[vp₀u′ − u(vp₀)′ + vp₁u] = d/dx[π(u, v)] (18.9)
Thus, if v satisfies the adjoint equation, i.e., L*v = 0, where

L*v = (p₀v)″ − (p₁v)′ + p₂v
    = p₀″v + 2p₀′v′ + p₀v″ − (p₁′v + p₁v′) + p₂v
    = p₀v″ + (2p₀′ − p₁)v′ + (p₀″ − p₁′ + p₂)v, (18.10)

vLu is an exact derivative of

π(u, v) = vp₀u′ − uvp₀′ − uv′p₀ + vp₁u
        = vu′p₀ − v′up₀ + (p₁ − p₀′)vu
        = [v  v′] ( p₁ − p₀′  p₀ ; −p₀  0 ) (u ; u′)
        = kᵀ(v(x)) P(x) k(u(x)) (18.11)

where k is the Wronskian vector. The function π(u, v) is called the bilinear concomitant
and P(x) is called the concomitant matrix. Thus,

vLu − uL*v = d/dx[kᵀ(v)P(x)k(u)] (18.12)
Equation (18.12) is again the Lagrange identity in terms of the Wronskian vectors and
concomitant matrix. Note that if two solutions of the adjoint equation L∗ v = 0 are
known, then we have

π(u, v1 ) = v1 (x)p0 (x)u′ (x) − u(x)(v1 (x)p0 (x)) + v1 (x)p1 (x)u(x) = c1 (18.13)
′ ′
π(u, v2 ) = v2 (x)p0 (x)u (x) − u(x)(v2 (x)p0 (x)) + v2 (x)p1 (x)u(x) = c2 (18.14)

These two linear equations can be solved for u(x) and u′ (x) to determine the two lin-
early independent solutions of the equation Lu = 0. Similarly, if two solutions u1 (x)
and u2 (x) of Lu = 0 are known, we can determine the solutions of L∗ v = 0. Thus,
the solutions of the two homogeneous equations Lu = 0 and L∗ v = 0 are closely re-
lated.
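The identity (18.9)–(18.11) is easy to confirm symbolically; a minimal Mathematica sketch (ours), with arbitrary coefficient functions p₀, p₁, p₂:

Clear[p0, p1, p2];
Lu = p0[x] u''[x] + p1[x] u'[x] + p2[x] u[x];
Lstarv = D[p0[x] v[x], {x, 2}] - D[p1[x] v[x], x] + p2[x] v[x];       (* eq. (18.10) *)
bilin = v[x] p0[x] u'[x] - u[x] D[v[x] p0[x], x] + v[x] p1[x] u[x];   (* eq. (18.11) *)
Simplify[v[x] Lu - u[x] Lstarv - D[bilin, x]]   (* returns 0: the Lagrange identity *)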

18.1.1 The Lagrange identity for an n-th order linear differential operator

Now, consider an n-th order linear differential operator:

Lu ≡ p₀(x) dⁿu/dxⁿ + p₁(x) dⁿ⁻¹u/dxⁿ⁻¹ + ⋯ + pₙ₋₁(x) du/dx + pₙ(x)u
   = ∑_{j=0}^{n} pₙ₋ⱼ(x) u^{[j]} (18.15)

where

u^{[j]} = dʲu/dxʲ (18.16)

It may be shown that the adjoint equation is given by

L*v = ∑_{j=0}^{n} (−1)ʲ [pₙ₋ⱼ(x)v]^{[j]} (18.17)

and the Lagrange identity is given by

vLu − uL*v = d/dx[kᵀ(v)Pk(u)] (18.18)

where

k(u) = (u(x), u′(x), u″(x), . . . , u^{[n−1]}(x))ᵀ (18.19)

is the Wronskian vector of u(x) and the elements of the concomitant matrix P are de-
fined by

p*ᵢⱼ(x) = { ∑_{h=i}^{n−j+1} (−1)^{h−1} C(h−1, i−1) pₙ₋ₕ₋ⱼ₊₁^{[h−i]}(x), i ≤ n − j + 1
          { 0, i > n − j + 1 (18.20)

where C(h−1, i−1) is the binomial coefficient. The bilinear concomitant for the n-th
order case may be expressed as

π(u, v) = ∑_{i=1}^{n} ∑_{l=1}^{n−i+1} p*ₗᵢ v^{[l−1]} u^{[i−1]}. (18.21)

It is clear that π(u, v) defined by equation (18.21) is a bilinear form, and hence the name
bilinear concomitant. For example, for n = 3, we get

P = ( p₀″ − p₁′ + p₂   p₁ − p₀′   p₀
      2p₀′ − p₁        −p₀        0
      p₀               0          0 ) (18.22)

Using the Lagrange identity, we can prove the following theorems.

Theorem 18.1. The operators L and L∗ are adjoint to each other, i. e., L∗∗ y = Ly (the
adjoint relationship is a reciprocal one).

Theorem 18.2. The concomitant matrix P is nonsingular and its determinant is given by
det P(x) = {p0 (x)}n .

Proof. Observe that for an n × n matrix

( a₁₁ a₁₂ . . a₁ₙ
  a₂₁ a₂₂ . . a₂ₙ
  .           .
  aₙ₁ aₙ₂ . . aₙₙ )

the indices on the antidiagonal sum to n + 1, i.e., if i + j = n + 1, then aᵢⱼ is an antidiagonal
element. [Remark: antidiagonal elements are those on the line connecting a₁ₙ and aₙ₁.]
Since p*ᵢⱼ = 0 for i > n − j + 1 ⟹ P is triangular with respect to the antidiagonal, with

p*ₙ₋ⱼ₊₁,ⱼ = (−1)ⁿ⁻ʲ p₀(x)

⟹ det P = {p₀(x)}ⁿ

Definition. If L∗ = L, then we say that the differential operator L is formally self-


adjoint.

Theorem 18.3. A necessary and sufficient condition for L to be formally self-adjoint is


that

P = −PT

i. e., P is a skew-symmetric matrix.

Theorem 18.4. If u and v are fundamental vectors for Lu = 0 and L∗ v = 0, respectively,


then

KT (v)PK(u) = C

where C is a nonsingular constant matrix. Further, we can choose v(x) and u(x) such
that C is the identity matrix (here K is the Wronskian matrix).
428 | 18 Two-point boundary value problems

Example 18.1. Self-adjoint form of a second-order operator.

Lu = p₀(x)u″ + p₁(x)u′ + p₂(x)u
L*v = p₀(x)v″ + (2p₀′ − p₁)v′ + (p₀″ − p₁′ + p₂)v

L = L* ⟹ p₀″ − p₁′ = 0 and p₁ = 2p₀′ − p₁ ⟹ p₁ = p₀′

Thus,

Lu = p₀(x)u″ + p₀′(x)u′ + p₂(x)u = (p₀(x)u′)′ + p₂(x)u = d/dx(p₀(x) du/dx) + p₂(x)u

is formally self-adjoint.
For algebraic details and proofs of the theorems stated above, we refer to the book
by R. H. Cole [14].

18.2 Two-point boundary value problems

Let

L = p₀(x) dⁿ/dxⁿ + p₁(x) dⁿ⁻¹/dxⁿ⁻¹ + ⋯ + pₙ₋₁(x) d/dx + pₙ(x) (18.23)

be an n-th order differential operator and u(x) ∈ Cⁿ[a, b], p₀(x) ≠ 0 in [a, b]. Consider
the problem of solving

Lu = −f(x), a < x < b (18.24)

subject to the boundary conditions

α₁₁u(a) + ⋯ + α₁ₙu^{[n−1]}(a) + β₁₁u(b) + ⋯ + β₁ₙu^{[n−1]}(b) = d₁
α₂₁u(a) + ⋯ + α₂ₙu^{[n−1]}(a) + β₂₁u(b) + ⋯ + β₂ₙu^{[n−1]}(b) = d₂
⋮
αₘ₁u(a) + ⋯ + αₘₙu^{[n−1]}(a) + βₘ₁u(b) + ⋯ + βₘₙu^{[n−1]}(b) = dₘ (18.25)

Let

k(u(x)) = (u(x), u′(x), . . . , u^{[n−1]}(x))ᵀ be the Wronskian vector of u(x)

and define the coefficient matrices

Wₐ = ( α₁₁ α₁₂ . . α₁ₙ              W_b = ( β₁₁ β₁₂ . . β₁ₙ
       α₂₁ α₂₂ . . α₂ₙ                      β₂₁ β₂₂ . . β₂ₙ
       .                                    .
       αₘ₁ αₘ₂ . . αₘₙ )                    βₘ₁ βₘ₂ . . βₘₙ )

Then the boundary conditions (18.25) may be written as

Wₐ k(u(a)) + W_b k(u(b)) = d (18.26)

Let W = (Wₐ  W_b) be the m × 2n matrix formed from the two blocks and write equation (18.26) as

W ( k(u(a)) ; k(u(b)) ) = d (18.27)

In practice and most of our applications, m = n, but, for the present we do not impose
this restriction. We assume that rank W = m, i. e., the boundary conditions are inde-
pendent. The n-th order two-point BVP is defined by equations (18.24) and (18.27). Its
solution requires determining a function u(x) satisfying the differential equation as
well as the end conditions.
We can use the principle of superposition to write the solution of equations (18.24)
and (18.27) as

u(x) = u1 (x) + u2 (x), (18.28)

where u1 (x) satisfies the inhomogeneous equation with homogeneous boundary con-
ditions:

Lu1 (x) = −f (x) (18.29)


k(u1 (a))
W( )=0 (18.30)
k(u1 (b))

while u2 (x) satisfies the homogeneous equation with inhomogeneous BCs:

Lu2 (x) = 0 (18.31)


430 | 18 Two-point boundary value problems

k(u2 (a))
W( ) = d. (18.32)
k(u2 (b))

As in the case of linear algebraic equations, whether equations (18.29)–(18.30) have no
solution, a unique solution or an infinite number of solutions depends on the prop-
erties of the homogeneous problem. These properties are discussed now.

The two-point homogeneous BVP


Consider the two-point homogeneous BVP defined by

Lu = 0, (18.33)
W ( k(u(a)) ; k(u(b)) ) = 0. (18.34)

Theorem 18.5. The solutions of the two-point homogeneous boundary value problem
defined by equations (18.33)–(18.34) form a vector space. Let C r [a, b] be the vector space
of all functions that are r-times differentiable. We can think of L as a linear operator,
i. e.,

L : C r [a, b] 󳨀→ C[a, b]

Then the solutions of Lu = 0 form the kernel of L, i. e., these are elements whose image
under L is the zero image.

Figure 18.1: Schematic of domain, codomain and transformation of kernel of an operator L.

Figure 18.1 shows schematically the domain, codomain and transformation of kernel
of the operator L. We have already shown that the kernel of L is a vector space V (which
is a subspace of C r [a, b]) of dimension n. Now let ψ1 (x) be any element in C r [a, b].
Define a linear transformation (mapping) from C r [a, b] to ℝ2n by the relation

Figure 18.2: Schematic of domain and codomain of the boundary operator ℬ.

ℬ(ψ₁(x)) = (ψ₁(a), ψ₁′(a), . . . , ψ₁^{[n−1]}(a), ψ₁(b), . . . , ψ₁^{[n−1]}(b))ᵀ = boundary vector (18.35)

where ℬ is a linear mapping (see Figure 18.2). This mapping is obviously not one-to-
one, but we can define inverse images of sets in ℝ²ⁿ in the familiar fashion. If S is any set
of elements, then we use ℬ⁻¹(S) to represent the set of all elements in V whose images
are in S.

Lemma. If S is a subspace of R2n , then ℬ−1 (S) is a subspace of C r [a, b].

Proof. Let ψ1 , ψ2 ∈ B−1 (S) and α1 , α2 be any constants. Then the boundary vector of
α1 ψ1 + α2 ψ2 is given by

ℬ(α1 ψ1 + α2 ψ2 ) = α1 ℬψ1 + α2 ℬψ2 (18.36)

Since S is a subspace, this vector must be in S,

󳨐⇒ α1 ψ1 + α2 ψ2 is in ℬ−1 (S).

∴ The result.

Proof of Theorem 18.5. Equation (18.34) requires the boundary vector to be orthogonal
to the rows of W. Thus, it defines a subspace S of R2n . Since ℬ is linear, from the above
lemma, the inverse image of this subspace is a subspace of C r [a, b].

Definition. The dimension of the solution space of equations (18.33)–(18.34) is called


the index of compatibility of the BVP. The BVP is said to be incompatible if its index of
compatibility is zero. (In this case, u ≡ 0 is the only solution to the BVP).
432 | 18 Two-point boundary value problems

Definition. Let ψ₁(x), . . . , ψₙ(x) be a fundamental set of solutions to Lu = 0 and let

K(ψ(x)) = Wronskian matrix of ψ, where ψ = (ψ₁, ψ₂, . . . , ψₙ)ᵀ (18.37)

Define

D = Wₐ K(ψ(a)) + W_b K(ψ(b)) (18.38)

The matrix D is called the characteristic matrix of the BVP. It plays an important role
in determining the properties of the BVP.

Theorem. If the BVP defined by (18.33)–(18.34) has a characteristic matrix of rank r,


then its index of compatibility is n − r.

Proof. If ψ is a fundamental vector, then

u = ψT c (18.39)

is a solution. From equation (18.39), we evaluate the Wronskian matrix

K(u(x)) = K(ψ(x))c (18.40)

Thus, equation (18.34) 󳨐⇒

Dc = 0 (18.41)

If the rank of D is r, then equation (18.41) has (n − r) linearly independent solutions. These
determine precisely (n − r) linearly independent solutions of the BVP.

Example 18.2. Find the index of compatibility of the system:

u′′′ = 0, 0<x<1
u(0) = 0, u(1) = u′ (0), u′ (1) = u′′ (0)

Solution:

ψ1 (x) = 1
ψ2 (x) = x
ψ3 (x) = x 2

are linearly independent solutions of the homogeneous system. We have

1 x x2
K(ψ(x)) = ( 0 1 2x )
0 0 2

1 0 0 1 1 1
K(ψ(0)) = ( 0 1 0 ), K(ψ(1)) = ( 0 1 2 )
0 0 2 0 0 2
1 0 0 0 0 0
W0 = ( 0 −1 0 ), W1 = ( 1 0 0 )
0 0 −1 0 1 0

1 0 0 1 0 0 0 0 0 1 1 1
D=( 0 −1 0 )( 0 1 0 )+( 1 0 0 )( 0 1 2 )
0 0 −1 0 0 2 0 1 0 0 0 2
1 0 0 0 0 0
=( 0 −1 0 )+( 1 1 1 )
0 0 −2 0 1 2
1 0 0
=( 1 0 1 )
0 1 0
rank D = 3

∴ index of compatibility = 0. Thus, the only solution to the homogeneous problem is


the trivial one. [This can also be verified by direct integration and application of the
boundary conditions.]
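A quick Mathematica check of this computation (our own sketch):

psi = {1, x, x^2};                       (* fundamental set *)
K[x0_] := Table[D[psi[[j]], {x, k}], {k, 0, 2}, {j, 1, 3}] /. x -> x0;
W0 = {{1, 0, 0}, {0, -1, 0}, {0, 0, -1}};
W1 = {{0, 0, 0}, {1, 0, 0}, {0, 1, 0}};
Dmat = W0.K[0] + W1.K[1];                (* characteristic matrix, eq. (18.38) *)
MatrixRank[Dmat]                          (* 3 => index of compatibility = 3 - 3 = 0 *)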

Example 18.3. Find the index of compatibility of the system:

u′′ − 3u′ + 2u = 0, 0<x<1



u(0) − u (0) = 0, −u′ (1) + u(1) = 0

We note that the two linearly independent solutions of the homogeneous equation are

ψ1 (x) = ex , ψ2 (x) = e2x



K(ψ(x)) = ( eˣ  e²ˣ ; eˣ  2e²ˣ )
K(ψ(0)) = ( 1  1 ; 1  2 ),  K(ψ(1)) = ( e  e² ; e  2e² )

D = ( 1 −1 ; 0 0 )( 1 1 ; 1 2 ) + ( 0 0 ; 1 −1 )( e  e² ; e  2e² )
  = ( 0 −1 ; 0 0 ) + ( 0 0 ; 0 −e² )
  = ( 0 −1 ; 0 −e² )

Dc = 0 ⟹ ( 0 −1 ; 0 −e² )(c₁ ; c₂) = 0 ⟹ c₂ = 0, c₁ arbitrary (take c₁ = 1)

rank D = 1

∴ index of compatibility is one, i. e., the solution space has dimension one.

ψ1 (x) = ex is a basis for the solution space.

18.3 The adjoint boundary value problem


Recall the standard inner product of two real-valued functions of a real variable x, u(x)
and v(x) in Cʳ[a, b]:

⟨u, v⟩ = ∫ₐᵇ u(x)v(x) dx (18.42)

If the functions are complex-valued, the inner product is defined by

⟨u, v⟩ = ∫ₐᵇ u(x) v̄(x) dx (18.43)

where the overbar denotes the complex conjugate.


Green’s formula: Consider the n-th order homogeneous equation
n
Lu ≡ p0 (x)u[n] + p1 (x)u[n−1] + ⋅ ⋅ ⋅ + pn (x)u = ∑ pn−j u[j] = 0 (18.44)
j=0

We have shown that the adjoint operator is defined by


18.3 The adjoint boundary value problem | 435

n
L∗ v = ∑ (−1)j [pn−j (x)v]
[j]
=0 (18.45)
j=0

and the Lagrange identity may be expressed as

d d T
vLu − uL∗ v = [π(u, v)] = {k (v(x))P(x)k(u(x))} (18.46)
dx dx
Integrating equation (18.46) from x = a to x = b, we get

∫ₐᵇ (vLu − uL*v) dx = ⟨Lu, v⟩ − ⟨u, L*v⟩
  = kᵀ(v(b))P(b)k(u(b)) − kᵀ(v(a))P(a)k(u(a))
  = [kᵀ(v(a))  kᵀ(v(b))] ( −P(a)  0 ; 0  P(b) ) ( k(u(a)) ; k(u(b)) ) (18.47)

Equation (18.47) is called the Green’s formula. Now, consider the homogeneous two-
point BVP,

Lu = 0 (18.48)
Wa k(u(a)) + Wb k(u(b)) = 0 (18.49)

We want to impose boundary conditions on the function v of the adjoint problem such
that when these conditions and the adjoint equation

L∗ v = 0 (18.50)

are satisfied, the right-hand side of Green’s formula is zero, i. e., we want to find a set
of boundary conditions (called the adjoint BCs) such that

⟨Lu, v⟩ = ⟨u, L∗ v⟩.

Write Green’s formula as

[kᵀ(v(a))  kᵀ(v(b))] ( −P(a)  0 ; 0  P(b) ) ( k(u(a)) ; k(u(b)) ) = 0 (18.51)
    (1 × 2n)            (2n × 2n)               (2n × 1)

The BCs on u may be written as

[Wₐ  W_b] ( k(u(a)) ; k(u(b)) ) = 0 ⟹ W ( k(u(a)) ; k(u(b)) ) = 0 (18.52)

i. e., the boundary vector is orthogonal to the rows of W. Thus, we can satisfy (18.51)
if we require the vector
[kᵀ(v(a))  kᵀ(v(b))] ( −P(a)  0 ; 0  P(b) )

belong to the row space of W, i.e.,

[kᵀ(v(a))  kᵀ(v(b))] ( −P(a)  0 ; 0  P(b) ) = aᵀW (18.53)
                                             = (a₁  a₂  ⋯  aₙ)( w₁ᵀ ; w₂ᵀ ; ⋯ ; wₙᵀ ) (18.54)

where a is any vector in ℝⁿ. These are called the adjoint boundary conditions. Taking
the transpose of equation (18.54), we get

Wᵀa = ( −Pᵀ(a)  0 ; 0  Pᵀ(b) ) ( k(v(a)) ; k(v(b)) ) (18.55)

Equation (18.55) defines a set of 2n relations in k(v(a)) and k(v(b)). However, these
relations contain n unknown constants a. By eliminating these constants, we obtain
a set of n relations in terms of k(v(a)) and k(v(b)). These give the adjoint boundary
conditions. Equation (18.55) may be written as

− PT (a)k(v(a)) = WTa a and PT (b)k(v(b)) = WTb a (18.56)

This is another form of the adjoint BCs.

Example 18.4. Consider the BVP:

Lu ≡ u′′ − 3u′ + 2u = 0
u(0) − u′ (0) = 0, u(1) − u′ (1) = 0

We have already determined the adjoint homogeneous equation

L∗ v = v′′ + 3v′ + 2v = 0.

To find the adjoint BCs, use equation (18.55). The BCs on u may be written as

( 1 −1 0 0 ; 0 0 1 −1 ) (u(0), u′(0), u(1), u′(1))ᵀ = ( 0 ; 0 )

so that

W = ( 1 −1 0 0 ; 0 0 1 −1 )

and

P = ( −3  1 ; −1  0 )

Equation (18.55) then gives

( 3  1  0  0 ; −1  0  0  0 ; 0  0  −3  −1 ; 0  0  1  0 ) (v(0), v′(0), v(1), v′(1))ᵀ
  = ( 1  0 ; −1  0 ; 0  1 ; 0  −1 ) ( a₁ ; a₂ ) = ( a₁ ; −a₁ ; a₂ ; −a₂ )

⟹

( 3v(0) + v′(0) ; −v(0) ; −3v(1) − v′(1) ; v(1) ) = ( a₁ ; −a₁ ; a₂ ; −a₂ )

Adding rows (1) and (2), and rows (3) and (4),
⟹

( 2v(0) + v′(0) ; −v(0) ; −2v(1) − v′(1) ; v(1) ) = ( 0 ; −a₁ ; 0 ; −a₂ )

Thus, the adjoint boundary conditions are

2v(0) + v′(0) = 0
−2v(1) − v′(1) = 0

The boundary value problems

u″ − 3u′ + 2u = 0                    v″ + 3v′ + 2v = 0
u(0) − u′(0) = 0                     2v(0) + v′(0) = 0
u(1) − u′(1) = 0                     2v(1) + v′(1) = 0

are adjoint to each other. It may be verified that ϕ1 (x) = e−2x is a basis for the solution
space of the adjoint problem.

Remark. While the above general formalism is useful for higher order BVPs, for the
case of second- and fourth- order BVPs where the concomitant is a simple expression,
we can determine the adjoint BCs by simply using the relation

π(u(b), v(b)) − π(u(a), v(a)) = 0.

For example, in the above example,

π(u, v) = u′v − uv′ − 3uv,

and equating it at the two end points leads to

u′(1)v(1) − u(1)v′(1) − 3u(1)v(1) − u′(0)v(0) + u(0)v′(0) + 3u(0)v(0) = 0.

Now, using the BCs on u(x), this simplifies to

−u(1)[v′(1) + 2v(1)] + u(0)[v′(0) + 2v(0)] = 0.

The adjoint BCs are obtained by setting the quantities in the brackets to zero.
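This bookkeeping is easily verified symbolically; in the following Mathematica sketch (ours), the BCs are imposed as replacement rules and the concomitant vanishes at both ends:

pifun[x_] = u'[x] v[x] - u[x] v'[x] - 3 u[x] v[x];   (* π(u, v) for this example *)
bcRules = {u'[0] -> u[0], u'[1] -> u[1], v'[0] -> -2 v[0], v'[1] -> -2 v[1]};
Simplify[{pifun[0], pifun[1]} /. bcRules]   (* {0, 0}: ⟨Lu, v⟩ = ⟨u, L*v⟩ *)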

Example 18.5. Consider the fourth-order differential equation

Lu = d⁴u/dx⁴ = 0
u(0) = u″(0) = 0, u(1) = u″(1) = 0

Following the above procedure, it may be verified that this BVP is self-adjoint, i. e., the
adjoint operator and BCs are same.

18.3.1 Adjoint BCs and conditions for self-adjointness of the BVP

If the adjoint BCs may be written in the form

k(v(a))
Q[ ] = 0, (18.57)
k(v(b))

then the two systems

Lu = 0 (18.58)
k(u(a))
W[ ]=0 (18.59)
k(u(b))
18.3 The adjoint boundary value problem | 439

and

L∗ v = 0 (18.60)
k(v(a))
Q[ ]=0 (18.61)
k(v(b))

are adjoint to each other. We have seen that the set of solutions to equations (18.58)–
(18.59) forms the vector space, which is a subspace of C n [a, b]. The set of solutions to
equations (18.60)–(18.61) also form a subspace of C n [a, b]. If these two subspaces are
identical, then we say that the BVP is self-adjoint. This is so if L = L∗ and Q = W or
Q = CW, where C is a nonsingular matrix. In terms of the coefficient and concomitant
matrices, the condition for self-adjointness of the BVP may be expressed as
Q ( P⁻¹(a)  0 ; 0  −P⁻¹(b) )ᵀ Wᵀ = 0. (18.62)

Further, it may be shown that the index of compatibility of the adjoint system is
the same as that of the original system. If we start with the Lagrange identity

vLu − uL*v = d/dx[kᵀ(v(x))P(x)k(u(x))]

and assume that u satisfies Lu = 0 and v satisfies L*v = 0

⟹

d/dx[kᵀ(v(x))P(x)k(u(x))] = 0 ⟹ kᵀ(v(x))P(x)k(u(x)) = constant (18.63)
Let

u(x) = uT c

where u(x) is a fundamental vector. Similarly, let

v(x) = vT d

󳨐⇒

kT (vT d)P(x)k(uT c) = constant


440 | 18 Two-point boundary value problems

or

dT KT (v(x))P(x)K(u(x))c = constant

Since d and c are arbitrary vectors in ℝn 󳨐⇒

KT (v(x))P(x)K(u(x)) = C (constant matrix)

We can choose u such that C = I,


KT (v(x))P(x)K(u(x)) = I (18.64)

Now, to determine the index of compatibility of the adjoint system, we use the adjoint
BCs,

−kᵀ(v(a))P(a) = aᵀWₐ
kᵀ(v(b))P(b) = aᵀW_b

If v is a fundamental vector, then

v = cᵀv for some c ∈ ℝⁿ

⟹

−cᵀKᵀ(v(a))P(a) = aᵀWₐ (18.65)
cᵀKᵀ(v(b))P(b) = aᵀW_b (18.66)

Multiplying equation (18.65) by K(u(a)), multiplying equation (18.66) by K(u(b)) and
using equation (18.64)

⟹

−cᵀ = aᵀWₐK(u(a))
cᵀ = aᵀW_bK(u(b))

which, after adding, leads to

0 = aᵀD

or

Dᵀa = 0 (18.67)

If rank D = r, then there are (n − r) linearly independent solutions aⱼ of these equations.
From each of these solutions, we get a solution to the adjoint problem

vⱼ = −aⱼᵀ Wₐ K(u(a)) v(x) (18.68)

Thus, the index of compatibility of the adjoint system is also (n − r).

Problems
1. (a) Show that the differential operator

Lu = −(1/w(x))[(p(x)u′)′ + q(x)u], a < x < b

is formally self-adjoint with respect to the inner product

⟨u, v⟩ = ∫ₐᵇ w(x)u(x)v(x) dx

(b) Show that any formally self-adjoint operator of order 2m with respect to
the usual inner product may be written in the form

Lu = dᵐ/dxᵐ{q₀(x)u^{[m]}} + dᵐ⁻¹/dxᵐ⁻¹{q₁(x)u^{[m−1]}} + ⋯ + qₘ(x)u
2. (a) Find the index of compatibility and the solution space of each of the following
boundary value problems:
(i) u‴ = 0, 0 < x < 1; u(0) = 0, u′(1) = 0, u′(0) − 2u(1) = 0 and (ii) u″ − 3u′ +
2u = 0, 0 < x < 1; u(0) − u(1) = 0, u′(0) − u′(1) = 0
(b) Determine the values of the parameter λ for which the following boundary
value problems are compatible:
(i) u″ + λ²u = 0, 0 < x < π; u(0) = u(π) = 0
(ii) u^{[4]} − λ⁴u = 0, −1 < x < 1; u(−1) = u(1) = u″(−1) = u″(1) = 0
3. Given the fourth-order operator Lu = d⁴u/dx⁴ and boundary conditions as follows:
(a) u(0) = 0, u″(0) = 0, u(1) = 0, u″(1) = 0
(b) u(0) = 0, u′(0) = 0, u(1) = 0, u′(1) = 0
(c) u(0) = 0, u′(0) = 0, u″(1) = 0, u‴(1) = 0
(d) u(0) = 0, u‴(0) = 0, u(1) = 0, u′(1) = 0
(e) u(0) = 0, u‴(0) = 0, u(1) = 0, u″(1) = 0
determine the adjoint boundary conditions.
4. The following boundary value problem is known as the Orr–Sommerfeld equation
and arises in the stability analysis of parallel shear flows:
442 | 18 Two-point boundary value problems

u[4] − 2k 2 u′′ + k 4 u − ik Re{[1 − x 2 − c][u′′ − k 2 u] + 2u} = 0,


0 < x < 1, i = √−1
′ ′′′ ′
u (0) = u (0) = u(1) = u (1) = 0

Here, k is the wave number (k > 0), Re is the Reynolds number (Re > 0) and c is a
complex number (which is the dimensionless wave speed). Show that the adjoint
system is given by

v[4] − 2k 2 v′′ + k 4 v + ik Re{[1 − x 2 − c][v′′ − k 2 v] − 4xv′ } = 0,


v′ (0) = v′′′ (0) = v(1) = v′ (1) = 0; c = complex conjugate of c

5. Let V = C[a, b], the vector space of complex valued continuous functions defined
over the real interval [a, b] and T be the linear operator on V defined by

Tu(x) = ∫ K(s, x)u(s) ds


a

where u ∈ C[a, b] and K(s, x) is continuous in [a, b] × [a, b]. Determine the adjoint
operator with respect to the usual inner product.
6. (a) Given the linear operator

𝜕2 u 𝜕2 u
Lu = + + λu, 0 < x < a, 0<y<b
𝜕x 2 𝜕y2

and boundary conditions

𝜕u
u(0, y) = 0, (a, y) = 0,
𝜕x
𝜕u
u(x, 0) = 0, (x, b) + αu(x, b) = 0
𝜕y

Determine the adjoint system.


7. Given the linear operator

𝜕2 u 𝜕u
Lu = − , 0 < x < 1, t>0
𝜕x 2 𝜕t

and boundary and initial conditions

𝜕u
− αu = 0, @ x = 0; u = 0, @x = 1
𝜕x
u = 0, @t = 0

Determine the adjoint system.


18.3 The adjoint boundary value problem | 443

8. Consider the linear partial differential equation


n n n
𝜕2 u 𝜕u
Lu = ∑ ∑ aij + ∑ bi + cu
i=1 j=1
𝜕xi 𝜕xj i=1 𝜕xi

where the coefficients aij , bi and c are real analytic functions of real variables
x1 , . . . , xn and where the matrix of elements aij is symmetric. (a) Determine the
adjoint operator and the form of the Lagrange identity (b) Use the divergence the-
orem to obtain a formula analogous to Green’s formula.
9. The steady-state conversion (u) in a radial flow reactor (see Figure 18.3) is given
by the boundary value problem

1 1 d du r du
(r ) − 0 − Da u = − Da, r0 < r < 1
Pe r dr dr r dr
du 1 du
(1) = 0; (r ) − u(r0 ) = 0
dr Pe dr o

where Pe is the Peclet number, Da is the Damköhler number and r0 is the dimen-
sionless inner radius.

Figure 18.3: Schematic diagram illustrating a radial flow reactor.

(a) Determine the adjoint homogeneous problem w. r. t. the inner product

⟨u, v⟩ = ∫ ruv dr
r0

(b) Give a physical interpretation of the adjoint problem


(c) Write down the BVP that describes the conversion v(r) when the flow direction
is reversed.
(d) Use the Lagrange identity and the above results to show that u(1) = v(r0 ), i. e.,
the exit conversion is independent of the flow direction.
444 | 18 Two-point boundary value problems

10. Let V be the vector space of complex valued functions u(x) defined on the interval
(0, a) satisfying the periodicity condition u(0) = u(a). Let L be a linear operator on
V defined by

du
Lu(x) = −i ; i = √−1
dx

Show that this operator is self-adjoint with respect to the usual inner product on V,
i. e.,
a

⟨u, v⟩ = ∫ u(x)v(x) dx.


0
19 The nonhomogeneous BVP and Green’s function
19.1 Introduction to Green’s function
Consider the nonhomogeneous BVP

Lu = −f (x), a<x<b (19.1)

with homogeneous BCs

Wa k(u(a)) + Wb k(u(b)) = 0. (19.2)

Suppose that we can express the solution of equation (19.1) and (19.2) as

u(x) = ∫ G(x, ξ )f (ξ ) dξ (19.3)


a

where G(x, ξ ) is called the Green’s function of the linear operator L with homogeneous
BCs

Wa k(u(a)) + Wb k(u(b)) = 0. (19.4)

Remark. Some authors define the Green’s function with a positive sign on the RHS of
equation (19.1). However, we shall use the definition above to be consistent with the en-
gineering literature. With this notation, positive (negative) values of f (x) correspond
to source (sink).

We first discuss an example in which the Green’s function is obtained by direct


integration. We then obtain a formula for the Green’s function of a second-order self-
adjoint BVP. Lastly, we discuss the case of inhomogeneous n-th order BVP.

Example 19.1. Consider the second-order BVP defined by

d2 u
= −f (x), 0 < x < 1
dx2
u′ (0) = 0, u(1) = 0.

Integrating once from 0 to x and using the first boundary condition, we get

x
du
= − ∫ f (η) dη
dx
0

Integrating again,

https://doi.org/10.1515/9783110739701-020
446 | 19 The nonhomogeneous BVP and Green’s function

x ξ

u(x) = u(0) − ∫[∫ f (η) dη] dξ ,


0 0
1 ξ

󳨐⇒ u(1) = u(0) − ∫ ∫ f (η) dη dξ


0 0

󳨐⇒

1 ξ x ξ

u(x) − u(1) = ∫[∫ f (η) dη] dξ − ∫[∫ f (η) dη] dξ


0 0 0 0

u(1) = 0 󳨐⇒

1 ξ 0 ξ

u(x) = ∫[∫ f (η) dη] dξ + ∫[∫ f (η) dη] dξ ,


0 0 x 0

which is simplified to

1 ξ

u(x) = ∫[∫ f (η) dη] dξ


x 0
1 x 1 ξ

= ∫ ∫ f (η) dη dξ + ∫ ∫ f (η) dη dξ
x 0 x x

To evaluate this double integral, we change the order of integration. Figure 19.1 shows
the schematic of the domain and change in order of variables in the second double
integration.

Figure 19.1: Schematic of the change in variables within double integration.


19.2 Green’s function for second-order self-adjoint TPBVP | 447

Thus,

x 1 1 ξ =1

u(x) = ∫ ∫ f (η) dξ dη + ∫ ∫ f (η) dξ dη


0 x η=x ξ =η
x 1

= (1 − x) ∫ f (η) dη + ∫(1 − η)f (η) dη


0 x
1

= ∫ G(x, η)f (η) dη


0

where

(1 − x), 0≤η≤x
G(x, η) = {
(1 − η), x<η≤1

is the Green’s function.

19.2 Green’s function for second-order self-adjoint TPBVP


We now generalize the above procedure to the more general second-order BVP,

d du
Lu ≡ (p(x) ) − q(x)u = −f (x), a < x < b, (19.5)
dx dx

with one of the following three types of BCs:

α1 u(a) + α2 u′ (a) = 0
} Robin/mixed/radiation boundary conditions (19.6)
β1 u(b) + β2 u′ (b) = 0
u(a) = 0
} Dirichlet boundary conditions (19.7)
u(b) = 0
u(a)p(a) = u(b)p(b)
} Periodic boundary conditions (19.8)
u′ (a) = u′ (b)

It may be easily verified that the TPBVP defined by equation (19.5) with one of the
above set of BCs (19.6)–(19.8) is self-adjoint. We want to write the solution of equation
(19.5) with Robin/Dirichlet/Periodic BCs in the form

u(x) = ∫ G(x, s)f (s) ds (19.9)


a
448 | 19 The nonhomogeneous BVP and Green’s function

where G(x, s) is the Green’s function. The method for developing Green’s functions is
based on the method of variation of parameters. Suppose that u1 (x) and u2 (x) are two
linearly independent solutions of the homogeneous equation

Lu = (pu′ ) − qu = 0

Assume that u1 (x) satisfies the boundary conditions at x = a and u2 (x) satisfies the
boundary condition at x = b, i. e.,

α1 u1 (a) + α2 u′1 (a) = 0


β1 u2 (b) + β2 u′2 (b) = 0

Express the solution to equations (19.5) and (19.6)–(19.8) as

u = c1 u1 (x) + c2 u2 (x) (19.10)

󳨐⇒

u′ = c1 u′1 + c2 u′2 + c1′ u1 + c2′ u2 (19.11)

Choose c1 and c2 such that

c1′ u1 + c2′ u2 = 0 (19.12)

󳨐⇒

u′ = c1 u′1 + c2 u′2

󳨐⇒

pu′ = c1 pu′1 + c2 pu′2

󳨐⇒
′ ′
(pu′ ) = c1 (pu′1 ) + c2 (pu′2 ) + c1′ pu′1 + c2 pu′2

and

qu = qc1 u1 + qc2 u2

From the above two equations, we get


′ ′
−f (x) = c1 [(pu′1 ) − qu1 ] + c2 [(pu′2 ) − qu2 ] + c1′ pu′1 + c2′ pu′2

󳨐⇒
19.2 Green’s function for second-order self-adjoint TPBVP | 449

f
c1′ u′1 + c2′ u′2 = − (19.13)
p

Equations (19.12) and (19.13) 󳨐⇒

u1 u2 c1′ 0 f
[ ][ ]=[ ] = −e2
u′1 u′2 c2′ − pf p

󳨐⇒

f
K(u(x))c′ = −e2
p
f
c′ = −K−1 (u(x))e2
p
󳨐⇒
x
f (s)
c = c0 − ∫ K−1 (u(s))e2 ds (19.14)
p(s)
a

Substituting equation (19.14) in equation (19.10) gives


x
c10 f (s)
u(x) = [ u1 (x) u2 (x) ] [ ] − [ u1 (x) u2 (x) ] ∫ K−1 (u(s))e2 ds
c20 p(s)
a
x
′ c10 f (s)
u (x) = [ u′1 (x) u′2 (x) ][ ]−[ u′1 (x) u′2 (x) ] ∫ K−1 (u(s))e2 ds
c20 p(s)
a

f (x)
− [ u1 (x) u2 (x) ] K−1 (u(x))e2 (19.15)
p(x)

The last term in equations (19.15) is identically zero as can be seen from the following
analysis:

u1 u2
K(u(x)) = [ ]
u′1 u′2

det K ≡ W(x) = u1 u′2 − u2 u′1

u′2 −u2 1
K−1 (u(x)) = [ ]
−u′1 u1 W

u′2 −u2 1
uT K−1 (u(x)) = [ u1 u2 ] [ ]
−u′1 u1 W
450 | 19 The nonhomogeneous BVP and Green’s function

1
= [ u1 u′2 − u2 u′1 −u1 u2 + u1 u2 ]
W
=[ 1 0 ]

f (x) 0 f (x)
uT K−1 (u)e2 =[ 1 0 ][ ]
p(x) 1 p(x)
=0

󳨐⇒
x
c10 f (s)
u′ (x) = [ u′1 u′2 ] [ ] − [ u′1 u′2 ] ∫ K−1 (u(s))e2 ds (19.16)
c20 p(s)
a

Thus,

c10
u(a) = [ u1 (a) u2 (a) ] [ ]
c20
c10
u′ (a) = [ u′1 (a) u′2 (a) ] [ ] and
c20
α1 u(a) + α2 u′ (a) = α1 [c10 u1 (a) + c20 u2 (a)] + α2 [c10 u′1 (a) + c20 u′2 (a)]
= c10 [α1 u1 (a) + α2 u′1 (a)] + c20 [α1 u2 (a) + α2 u′2 (a)]
= c20 [α1 u2 (a) + α2 u′2 (a)] = 0

󳨐⇒

c20 = 0
b
f (s)
u(b) = uT (b)c0 − uT (b) ∫ K−1 (u(s))e2 ds
p(s)
a
b
f (s)
u (b) = u (b)c0 − u (b) ∫ K−1 (u(s))e2
′ ′T ′T
ds
p(s)
a

β1 u(b) + β2 u′ (b) = 0 󳨐⇒

b
f (s)
β1 c10 u1 (b) + β2 c10 u′1 (b) = [ β1 u1 (b) + β2 u′1 (b) β1 u2 (b) + β2 u′2 (b) ] ∫ K−1 e2 ds
p(s)
a

󳨐⇒
19.2 Green’s function for second-order self-adjoint TPBVP | 451

b
f
c10 = [ 1 0 ] ∫ K−1 (u(s))e2 ds
p
a


b x
f (s) f
u(x) = [u1 (x) 0] ∫ K−1 (u(s))e2 ds − [ u1 u2 ] ∫ K−1 (u(s))e2 ds
p(s) p
a a
x b x x

= [u1 0]{∫{.} ds + ∫{.} ds} − [u1 0] ∫{.} ds − [ 0 u2 ] ∫{.} ds


a x a a
b x

= [u1 0] ∫{.} ds − [ 0 u2 ] ∫{.} ds


x a

Now,

1 u′ (s) −u2 (s) 0


K−1 (u(s))e2 = [ 2′ ][ ]
W(s) −u1 (s) u1 (s) 1

1 −u2 (s)
= [ ]
W(s) u1 (s)


b x
−u (x)u2 (s)f (s) −u (x)u1 (s)f (s)
u(x) = ∫ 1 ds + ∫ 2 ds
p(s)W(s) p(s)W(s)
x a
b

= ∫ G(x, s)f (s) ds


a

where
−u2 (x)u1 (s)
{ p(s)W(s)
, a≤s≤x
G(x, s) = { (19.17)
−u1 (x)u2 (s)
{ p(s)W(s)
, x≤s≤b

is the Green’s function. To simplify the expression for the Green’s function further, we
make the following observation:

d
{p(x)W(x)} = 0
dx
To prove,
452 | 19 The nonhomogeneous BVP and Green’s function

d
LHS = {p(x)(u1 u′2 − u2 u′1 )}
dx
′ ′
= u1 (pu′2 ) + u′1 pu′2 − u2 (pu′1 ) − pu′1 u′2
= qu1 u2 − u2 [qu1 ] = 0

Therefore, p(x)W(x) = constant. We may choose u1 and u2 such that the constant is
equal to minus one (−1). Then

u1 (s)u2 (x), a<s<x


G(x, s) = { (19.18)
u1 (x)u2 (s), x < s < b.

Figure 19.2 shows an interpretation of the Green’s function considered as a function of


x (with s fixed) or function of s (with x fixed).

Figure 19.2: Geometric interpretation of Green’s function.

Example 19.2. Consider

d2 u
= −f (x), u′ (0) = 0, u(1) = 0
dx2

u1 (x) = 1, u2 (x) = 1 − x, p(x) = 1, q(x) = 0


󵄨󵄨 󵄨
󵄨 1 1 − x 󵄨󵄨󵄨
W = 󵄨󵄨󵄨󵄨 󵄨󵄨 = −1
󵄨󵄨 0 −1 󵄨󵄨󵄨

1 − x, a<s<x
G(x, s) = {
1 − s, x<s<1

Example 19.3. Consider

d2 u
= −f (x)
dx2
u(0) = 0, u(1) = 0
19.2 Green’s function for second-order self-adjoint TPBVP | 453

u1 (x) = x, u2 (x) = 1 − x, p(x) = 1, q(x) = 0


󵄨󵄨 󵄨
󵄨 x 1 − x 󵄨󵄨󵄨
W = 󵄨󵄨󵄨󵄨 󵄨󵄨 = −x − 1 + x = −1
󵄨󵄨 1 −1 󵄨󵄨󵄨

Therefore,

s(1 − x), 0<s<x


G(x, s) = {
x(1 − s), x<s<1
x 1

u(x) = ∫ s(1 − x)f (s) ds + ∫ x(1 − s)f (s) ds


0 x
x 1

= (1 − x) ∫ sf (s) ds + x ∫(1 − s)f (s) ds


0 x

Example 19.4. Consider

d du n2
Lu ≡ (x ) − u = −f (x), n is a positive integer.
dx dx x
u′ (0) = 0, u(1) = 0

This operator arises in the solution of inhomogeneous problems in cylindrical ge-


ometry. We note that the homogeneous equation is a Euler equation and

u1 (x) = x n , u2 (x) = x −n − x n

are linearly independent solutions satisfying the boundary conditions at the ends:

󵄨 xn x −n − x n
󵄨󵄨 󵄨󵄨
󵄨󵄨 −2n
W(x) = 󵄨󵄨󵄨󵄨 󵄨󵄨 =
󵄨󵄨 nx n−1 −nx −n−1 − nx n−1 󵄨󵄨
󵄨 x

−2n
p(x)W(x) = x( ) = −2n
x

sn (x −n −x n )
{ 2n
, 0<s<x
G(x, s) = { n −n n
x (s −s )
{ 2n
, x<s<1
s n
[( x ) − (sx)n ]/2n, 0 < s < x
={
[( xs )n − (sx)n ]/2n, x < s < 1
454 | 19 The nonhomogeneous BVP and Green’s function

19.3 Properties of the Green’s function for the second-order


self-adjoint BVP
We now examine the properties of the Green’s function for the second-order self-
adjoint problem:

d du
Lu ≡ (p(x) ) − q(x)u = −f (x), a<x<b (19.19)
dx dx
u1 (s)u2 (x)
{ − p(s)W(s) , a<s<x
G(x, s) = { u (x)u (s) (19.20)
1 2
{ − p(s)W(s) , x<s<b

Properties of G(x, s):


1. G(x, s) is a continuous function in [a, b] × [a, b], even at x = s.
2. G(x, s) is symmetric, i. e.,

G(x, s) = G(s, x) (19.21)

3. G(x, s) satisfies the differential equation

Lu = 0

except perhaps at x = s, i. e.,

d dG
(p(x) ) − q(x)G = 0 (19.22)
dx dx

This is a simple and straightforward calculation.


4. The derivative of G(x, s) has a jump discontinuity at x = s. Treating G(x, s) as a
function of x,

− u1 (x)u
C
2 (s)
, a<x<s
G(x, s) = {
− u2 (x)u
C
1 (s)
, s<x<b
𝜕G 󵄨󵄨󵄨 −u1 (s)u′2 (x) 󵄨󵄨󵄨󵄨
󵄨󵄨 = 󵄨󵄨 (x approaching s from the RHS)
𝜕x 󵄨󵄨x=s+ C 󵄨󵄨x=s
−u1 (s)u′2 (s)
=
C
𝜕G 󵄨󵄨󵄨 −u′1 (x)u2 (s) 󵄨󵄨󵄨󵄨
󵄨󵄨 = 󵄨󵄨 (x approaching s from the LHS)
𝜕x 󵄨󵄨x=s− C 󵄨󵄨x=s
−u′1 (s)u2 (s)
=
C

19.3 Properties of the Green’s function for the second-order self-adjoint BVP | 455

𝜕G 󵄨󵄨󵄨 𝜕G 󵄨󵄨󵄨
󵄨󵄨
󵄨 − 󵄨󵄨 = Jump at x = s
𝜕x 󵄨x=s + 𝜕x 󵄨󵄨x=s−
u′ u − u1 u′2 −W(s)
= 1 2 =
C p(s)W(s)
𝜕G + 𝜕G − −1
(s , s) − (s , s) = (19.23)
𝜕x 𝜕x p(s)

Similarly, treating G(x, s) as a function of s,

𝜕G 󵄨󵄨󵄨 −u1 (x)u′2 (s) 󵄨󵄨󵄨󵄨 −u1 (x)u′2 (x)


󵄨󵄨 = 󵄨󵄨 =
󵄨
𝜕s 󵄨s=x + C 󵄨󵄨s=x C

and

𝜕G 󵄨󵄨󵄨 −u′1 (s)u2 (x) 󵄨󵄨󵄨󵄨 −u′1 (x)u2 (x)


󵄨󵄨 = 󵄨 =
𝜕s 󵄨󵄨s=x− C 󵄨󵄨
󵄨s=x C
󵄨
𝜕G 󵄨󵄨 𝜕G 󵄨󵄨 󵄨 u 1 u2 − u1 u2
′ ′
−W(x)
Jump = 󵄨󵄨 − 󵄨󵄨 = =
𝜕s 󵄨󵄨s=x+ 𝜕s 󵄨󵄨s=x− C W(x)p(x)
𝜕G + 𝜕G − 1
⇒ (x , x) − (x , x) = − (19.24)
𝜕s 𝜕s p(x)

5. G(x, s) satisfies the boundary conditions in both variables.

Proof. Consider G(x, s) as a function of x


BC1

−u1 (a)u2 (s) −u′ (a)u2 (s)


α1 G(a, s) + α2 G′ (a, s) = α1 + α2 1
C C
u2 (s) ′
=− [α1 u1 (a) + α2 u1 (a)] = 0 (19.25)
C

BC2

−u1 (s)u2 (b) −u (s)u′2 (b)


β1 G(b, s) + β2 G′ (b, s) = β1 + β2 1
C C
−u1 (s) ′
= [β1 u2 (b) + β2 u2 (b)] = 0 (19.26)
C

Consider G(x, s) as a function of s


BC1

−u1 (a)u2 (x) −u′ (a)u2 (x)


α1 G(x, a) + α2 G′ (x, a) = α1 [ ] + α2 [ 1 ]
C C
−u2 (x)
= [α1 u1 (a) + α2 u′1 (a)] = 0 (19.27)
C

BC2
456 | 19 The nonhomogeneous BVP and Green’s function

−u1 (x)u2 (b) −u (x)u′2 (b)


β1 G(x, b) + β2 G′ (x, b) = β1 [ ] + β2 [ 1 ]
C C
−u1 (x)
= [β1 u2 (b) + β2 u′2 (b)] = 0 (19.28)
C
b
6. u(x) = ∫a G(x, s)f (s) ds is indeed the solution to Lu = −f (x) with the homoge-
neous boundary conditions.

α1 u(a) + α2 u′ (a) = 0
β1 u(b) + β2 u′ (b) = 0

Proof.

u(x) = ∫ G(x, s)f (s) ds (19.29)


a
x b

= ∫ G(x, s)f (s) ds + ∫ G(x, s)f (s) ds


a x

󳨐⇒
x b
𝜕G(x, s) 𝜕G(x, s)
u′ (x) = ∫ f (s) ds + ∫ f (s) ds
𝜕x 𝜕x
a x

+ G(x, x )f (x ) − G(x, x + )f (x + )
− −
(19.30)

Since G and f are continuous, the last two terms cancel and we get

x b
𝜕G(x, s) 𝜕G(x, s)
u′ (x) = ∫ f (s) ds + ∫ f (s) ds (19.31)
𝜕x 𝜕x
a x

Now, using equations (19.29) and (19.31), we get

u(a) = ∫ G(a, s)f (s) ds


a
b
𝜕G
u′ (a) = ∫ (a, s)f (s) ds
𝜕x
a

󳨐⇒
b
′ 𝜕G
α1 u(a) + α2 u (a) = ∫[α1 G(a, s) + α2 (a, s)]f (s) ds = 0
𝜕x
a
19.3 Properties of the Green’s function for the second-order self-adjoint BVP | 457

since G satisfies the BC at x = a. Similarly,

u(b) = ∫ G(b, s)f (s) ds


a
b
𝜕G
u′ (b) = ∫ (b, s)f (s) ds
𝜕x
a
b
′ 𝜕G
β1 u(b) + β2 u (b) = ∫[β1 G(b, s) + β2 (b, s)]f (s) ds = 0
𝜕x
a

since G satisfies the BC at x = b. Thus, the solution given by equation (19.29) satisfies
the BCs.
Now, equation (19.31) 󳨐⇒

x b
′ ′ 𝜕 𝜕G 𝜕G
(pu ) = [p(x) ∫ (x, s)f (s) ds + p(x) ∫ (x, s)f (s) ds]
𝜕x 𝜕x 𝜕x
a x
x b
′ ′ 𝜕G 𝜕G
= ∫(pG′ ) f (s) ds + ∫(pG′ ) f (s) ds + p(x)f (x)( (x, x − ) − (x, x + ))
𝜕x 𝜕x
a x

󳨐⇒

x b
′ ′ ′ ′ ′
(pu ) + qu = ∫[(pG ) + qG]f (s) ds + ∫[(pG′ ) + qG]f (s) ds
a x
𝜕G 𝜕G
+ p(x)f (x)[ (x, x− ) − (x, x+ )]
𝜕x 𝜕x
−1
= 0 + 0 + p(x)f (x)[ ]
p(x)
= − f (x)

Thus, G(x, s) is the solution of Lu = −f (x).


(7) G(x, s) is a unique function.

Proof. Suppose that there are two, say G1 (x, s) and G2 (x, s) and form

Ḡ = G1 − G2

(i) Ḡ satisfies the BCs


(ii) Ḡ satisfies the homogeneous equation even at x = s because the jump disap-
pears.
458 | 19 The nonhomogeneous BVP and Green’s function

Thus, Ḡ is a regular solution of the homogeneous system which by assumption


is incompatible. Thus, Ḡ ≡ 0 is the only solution 󳨐⇒ G1 = G2 or G(x, s) is a unique
function.
(8) The solution
b

u(x) = ∫ G(x, s)f (s) ds


a

is unique. The proof is similar to (7). Suppose that there are two solutions û 1 (x) and
û 2 (x). Then,

b b

û 1 (x) − û 2 (x) = ∫ G(x, s)f (s) ds − ∫ G(x, s)f (s) ds ≡ 0,


a a

since G(x, s) is continuous.


󳨐⇒

û 1 (x) = û 2 (x).

19.4 Green’s function for the n-th order TPBVP


Consider n-th order TPBVP:

Lu = −f (x), a<x<b (19.32)


Wa k(u(a)) + Wb k(u(b)) = 0 (19.33)

Let

ψ(x)T = [ ψ1 (x) ψ2 (x) . . ψn (x) ]

be a fundamental vector for Lu = 0. Then the general solution of equations (19.32)–


(19.33) is given by
x
T T −1 −f (s)
u(x) = ψ(x) c + ψ(x) ∫ K(ψ(s)) ⋅ en [ ] ds (19.34)
p0 (s)
a

Now,

ψ(x)T = [ψ1 (x) ψ2 (x) ⋅⋅⋅ ψn (x)]


ψ1 ψ2 . ψn
[ ]
ψ′2 ψ′2 ψ′n
= eT1 [ T
[ . ]
. ] = e1 K(ψ(x)) (19.35)
[ ]
[ ψ[n−1]
1 ψ1[n−2] . ψn[n−1] ]
19.4 Green’s function for the n-th order TPBVP | 459


x
−f (s)
u(x) = ψ(x)T c + eT1 K(ψ(x)) ∫ K(ψ(s))
−1
⋅ en [ ] ds (19.36)
p0 (s)
a

= uh + up
uh = ψ(x)T c = c1 ψ1 + c2 ψ2 + ⋅ ⋅ ⋅ + cn ψn
u′h = c1 ψ′1 + c1 ψ′2 + ⋅ ⋅ ⋅ + cn ψ′n
.
.
u[n−1]
h
= c1 ψ[n−1]
1 + c2 ψ2[n−1] + ⋅ ⋅ ⋅ + cn ψn[n−1]

k(uh (x)) = K(ψ(x))c (19.37)


x
−f (s)
up (x) = eT1 K(ψ(x)) ∫ K(ψ(s))
−1
⋅ en [ ] ds
p0 (s)
a
x
−1 −f (s)
up (x) = [ψ1 (x) ψ2 (x) ⋅⋅⋅ ψn (x)] ∫ K(ψ(s)) ⋅ en [ ] ds
p0 (s)
a
x
−1 −f (s)
u′p = [ψ′1 ψ′2 . . . ψ′n ] ∫ K(ψ(s)) ⋅ en [ ] ds
p0 (s)
a
−1 −f (x)
+ [ψ1 (x) ψ2 (x) ⋅⋅⋅ ψn (x)]K(ψ(x)) ⋅ en [ ]
p0 (x)
−f (x)
Second term = eT1 K(ψ(x))K(ψ(x))
−1
⋅ en
p0 (x)
−f (x)
= eT1 .en =0
p0 (x)

x
−1 −f (s)
u′p = [ψ′1 ψ′2 ... ′
ψn ] ∫ K(ψ(s)) ⋅ en ds
p0 (s)
a
x
−f (s)
= eT2 K(ψ(x)) ∫ K(ψ(s))
−1
⋅ en [ ] ds
p0 (s)
a

Similarly,
x
T −1 −f (s)
u′′
p = e3 K(ψ(x)) ∫ K(ψ(s)) ⋅ en [ ] ds
p0 (s)
a
460 | 19 The nonhomogeneous BVP and Green’s function

.
.
.
x
T −1 −f (s)
up[n−1] = en K(ψ(x)) ∫ K(ψ(s)) ⋅ en [ ] ds
p0 (s)
a

󳨐⇒
x
−1 −f (s)
k(up (x)) = K(ψ(x)) ∫ K(ψ(s)) ⋅ en [ ] ds (19.38)
p0 (s)
a

Substitute equation (19.38) in the boundary conditions, equation (19.33): 󳨐⇒

b
−1 −f (s)
Wa K(ψ(a))c + Wb [K(ψ(b))c + K(ψ(b)) ∫ K(ψ(s)) ⋅ en [ ] ds] = 0
p0 (s)
a

󳨐⇒

b
−1 −f (s)
Dc = −Wb K(ψ(b)) ∫ K(ψ(s)) ⋅ en ds
p0 (s)
a

Assuming D is not singular 󳨐⇒

b
−1 −1 −f (s)
c = −D Wb K(ψ(b)) ∫ K(ψ(s)) ⋅ en [ ] ds
p0 (s)
a

󳨐⇒

b
−f (s)
u(x) = − ψ(x)T D−1 Wb K(ψ(b)) ∫ K(ψ(s))
−1
⋅ en [ ] ds
p0 (s)
a
x
−f (s)
+ ψ(x)T ∫ K(ψ(s))
−1
⋅ en [ ] ds (19.39)
p0 (s)
a

Insert the identity

D−1 [Wa K(ψ(a)) + Wb K(ψ(a))] = I (19.40)

in the second term of equation (19.39) before the integral sign. Also, split the first term
of equation (19.39) into two terms.
19.4 Green’s function for the n-th order TPBVP | 461

󳨐⇒
x b
T −1 −1 −f (s)
u(x) = − ψ(x) D Wb K(ψ(b))(∫ + ∫)K(ψ(s)) ⋅ en [ ] ds
p0 (s)
a x
x
−f (s)
+ ψ(x)T ∫ K(ψ(s))
−1
⋅ en [ ] ds
p0 (s)
a

󳨐⇒
x
−f (s)
u(x) = − ψ(x)T D−1 Wb K(ψ(b)) ∫ K(ψ(s))
−1
⋅ en [ ] ds
p0 (s)
a
b
T −1 −1 −f (s)
− ψ(x) D Wb K(ψ(b)) ∫ K(ψ(s)) ⋅ en [ ] ds
p0 (s)
x
x
−f (s)
+ ψ(x)T D−1 Wa K(ψ(a)) ∫ K(ψ(s))
−1
⋅ en [ ] ds
p0 (s)
a
x
−f (s)
+ ψ(x)T D−1 Wb K(ψ(b)) ∫ K(ψ(s))
−1
⋅ en [ ] ds
p0 (s)
a

The first and last terms cancel to give

u(x) = ∫ G(x, s)f (s) ds


a

where
T e
{ −e1 K(ψ(x))D Wa K(ψ(a))K (ψ(s)) p0 (s) , a<s<x
−1 −1 n

G(x, s) = { T en
(19.41)
e K(ψ(x))D−1 Wb K(ψ(b))K−1 (ψ(s)) p (s) , x<s<b
{ 1 0

Example 19.5. Consider the second-order operator

d2
; u(0) = 0, u(1) = 0
dx 2
󳨐⇒

1 0 0 0
W=( )
0 0 1 0
u1 (x) = x and u2 (x) = 1 − x
x 1−x
K(u(x)) = ( )
1 −1
462 | 19 The nonhomogeneous BVP and Green’s function

󳨐⇒

1 1−x
K−1 (u(x)) = ( )
1 −x
0 1 1 0
K(u(0)) = ( ) and K(u(1)) = ( )
1 −1 1 −1
1 0 0 1 0 0 1 0
D=( )( )+( )( )
0 0 1 −1 1 0 1 −1
0 1 0 0
=( )+( )
0 0 1 0
0 1
=( ),
1 0

󳨐⇒

0 1
D−1 = ( )
1 0
0 1 1 0 0 1 1 1−s 0
{
{ −[x (1 − x)] ( )( )( )( )( )
{
{
{ 1 0 0 0 1 −1 1 −s 1
{
{
{
{ 0<s<x
G={
{ 0 1 0 0 1 0 1 1−s 0
{
{
{ [x (1 − x)] ( )( )( )( )( ),
{
{
{ 1 0 1 0 1 −1 1 −s 1
{
{ x<s<1

󳨐⇒

s(1 − x), 0<s<x


G(x, s) = {
x(1 − s), x<s<1

Example 19.6. Green’s function for the second-order operator with mixed BCs

u′′ = −f (x)
u(0) + u(1) = 0, u′ (0) + u′ (1) = 0

󳨐⇒

1 0 1 0
Wa = ( ), Wb = ( )
0 1 0 1
uT (x) = [1 x]
1 x
K(u(x)) = ( )
0 1

󳨐⇒
19.4 Green’s function for the n-th order TPBVP | 463

1 −s
K−1 (u(s)) = ( )
0 1
D = Wa K(u(0)) + Wb K(u(1))
1 0 1 0 1 0 1 1
=( )( )+( )( )
0 1 0 1 0 1 0 1
2 1
=( )
0 2
1 −1
2 4
D−1 = ( 1
)
0 2

1 −1
1 0 1 0 1 −s 0
{
{
{ −[1 x] ( 2 4
)( )( )( )( ), 0<s<x
{
{
{ 0 1 0 1 0 1 0 1 1
2
G(x, s) = { 1 −1
{
{ 1 0 1 1 1 −s 0
{ [1 x] ( 2 4
{
{ )( )( )( )( ), x<s<1
{ 0 1 0 1 0 1 0 1 1
2
1
4
− 21 (x − s), 0<s<x
={ 1 1
4
− 2
(s − x), x<s<1

The Green’s function is symmetric. It is easily verified that the given problem is self-
adjoint even though the BCs are mixed.

Example 19.7. Green’s function for the third-order operator

u′′′ = −f (x)
u(0) = 0, u(1) = 0, u′ (0) − u′ (1) = 0
1 0 0 0 0 0
Wa = ( 0 0 0 ), Wb = ( 1 0 0 )
0 1 0 0 −1 0
uT (x) = (1 x x 2 )
1 x x2
K(u(x)) = ( 0 1 2x )
0 0 2

󳨐⇒

s2
1 −s 2
K−1 (u(s)) = ( 0 1 −s )
1
0 0 2
464 | 19 The nonhomogeneous BVP and Green’s function

1 0 0 1 0 0 0 0 0 1 1 1
D=( 0 0 0 )( 0 1 0 )+( 1 0 0 )( 0 1 2 )
0 1 0 0 0 2 0 −1 0 0 0 2
1 0 0 0 0 0
=( 0 0 0 )+( 1 1 1 )
0 1 0 0 −1 −2
1 0 0
=( 1 1 1 )
0 0 −2
1 0 0
D−1 = ( −1 1 1/2 )
0 0 − 21

{ 1 0 0 1 0 0
{ 2
{
{
{ −(1 x x ) ⋅ ( −1 1 1/2 ) ⋅ ( 0 0 0 )
{
{
{
{
{ 0 0 − 21 0 1 0
{
{
{ s2
{
{
{
{ 1 0 0 1 −s 2
0
{ 0<s<x
{
{
{ ⋅ ( 0 1 0 )⋅( 0 1 −s ) ⋅ ( 0 ) ,
{
{
{
{ 0 0 2 0 0 1 1
2
G(x, s) = {
{
{
{ 1 0 0 0 0 0
{ 2
{
{
{
{ (1 x x ) ⋅ ( −1 1 1/2 ) ⋅ ( 1 0 0 )
− 21
{
{
{
{ 0 0 0 −1 0
{
{
{
{ s2
{
{
{ 1 1 1 1 −s 2
0
{
{
{
{ ⋅( 0 1 2 )⋅( 0 1 −s ) ⋅ ( 0 ) , x<s<1
{
{ 0 0 2 0 0 1 1
2
s
2
(x − s)(1 − x), 0<s<x
={ x
2
(s − x)(s − 1), x<s<1

This Green’s function is not symmetric.


The formula for Green’s function given in equation (19.41) can be made less cum-
bersome by representing it in terms of the solutions of the adjoint equation. Let v(x)
be a fundamental vector for the adjoint equation:

L∗ v = 0

and suppose that we choose u(x) and v(x) such that

KT (v(x))P(x)K(u(x)) = I

󳨐⇒
19.4 Green’s function for the n-th order TPBVP | 465

K−1 (u(s)) = KT (v(s))P(s)

󳨐⇒

K−1 (u(s))en = KT (v(s))P(s)en

. . p0
Recalling the form of P = ( . −p0 0 ), we note that
p0 0 0

Pen = last column of P


= p0 (s)e1

K−1 (u(s))en = KT (v(s))e1 p0 (s)


= v(s)p0 (s)

Thus,

−uT (x)D−1 Wa K(u(a))v(s), a<s<x


G(x, s) = { (19.42)
uT (x)D−1 Wb K(u(b))v(s), x<s<b

This formula reveals the symmetric nature of the Green’s function in terms of u(x) and
v(s).

Theorem 19.1. Regarded as a function of x with s fixed, the Green’s function has the
following properties:
1. Together with its first (n − 1) derivatives it is continuous in [a, s) and (s, b]. At the
point x = s, G and its first n − 2 derivatives have removable discontinuities while
(n − 1)st derivative has an upward jump of − p 1(s) .
0
2. G satisfies the differential equation except at x = s. It satisfies the boundary condi-
tions.
3. G is the only function with properties (1) and (2).

Theorem 19.2. Regarded as a function of s with x fixed. Green’s function has the follow-
ing properties:
1. Together with its first (n − 1) derivatives, it is continuous on [a, x) and (x, b]. At the
point s = x, G and its first (n − 2) derivatives have removable discontinuities while
n−1
the (n − 1)st derivative has a jump of (−1)
p0 (x)
2. G satisfies the adjoint differential equation (L∗ v = 0) except at s = x. It satisfies the
adjoint BCs.
466 | 19 The nonhomogeneous BVP and Green’s function

Theorem 19.3. The solution of the adjoint BVP

L∗ v(s) = −f (s)
−kT (v(a))P(a) = aT Wa , kT (v(b))P(b) = aT Wb

is given by

v(s) = ∫ G(x, s)f (x) dx. (19.43)


a

For proofs of these theorems, we refer to the book by R. H. Cole [14].

19.4.1 Physical interpretation of the Green’s function

Consider the BVP:

Lu = −f (19.44)
k(u(a))
W( )=0 (19.45)
k(u(b))

and let the symbol ℒ stand for the linear differential operator −L and the boundary
conditions (19.45). Then we may write (19.44)–(19.45) as

ℒu = f (19.46)

We represented the solution to (19.44)–(19.45) as

u(x) = ∫ G(x, s)f (s) ds (19.47)


a

Let 𝔾 stand for the operator defined by

𝔾f = ∫ G(x, s)f (s) ds (19.48)


a

Then equation (19.47) becomes

u = 𝔾f (19.49)

Comparing equations (19.46) and (19.49), it is clear that


19.4 Green’s function for the n-th order TPBVP | 467

𝔾 = ℒ−1 ,

which is also shown schematically in Figure 19.3.

Figure 19.3: Schematic demonstration of the operator relationship 𝔾 = ℒ−1 .

It seems apparent from the formula

u(x) = ∫ G(x, s)f (s) ds (19.50)


a

that G(x, s) must be response at position x caused by a unit input at position s. Suppose
that this is the case and there is a distribution of inputs f (s). Then the response at
x caused by an input at s of f (s) ds must be G(x, s)f (s) ds as schematically shown in
b
Figure 19.4. Then ∫a G(x, s)f (s) ds is the total response at x.

Figure 19.4: Schematic of the distributed function f (s) and interpretation of Green’s function.
468 | 19 The nonhomogeneous BVP and Green’s function

Moreover, G(x, s) is the solution of

LG = −δ(x − s) (19.51)
k(G(a, s))
W( )=0 (19.52)
k(G(b, s))

The two examples given below illustrate this point clearly.

Example 19.8 (The deflection of a tightly stretched elastic string). Consider the de-
flection of a tightly stretched elastic string that is fixed at the end points as shown in
Figure 19.5.

Figure 19.5: Deflection of tightly stretched elastic string under a load distribution (top) and a point
load (bottom).

The amplitude of deflection y(x) can be described by

d2 y
T = −F(x); y(0) = 0, y(L) = 0 (19.53)
dx 2

where T is the tension in the spring and F(x) is the force distribution. In dimensionless
form,

d2 y
= −f (z); 0 < z < 1 (19.54)
dz 2
y(0) = y(1) = 0 (19.55)

In this case, the Green’s function

z(1 − s), 0<s<z


G(z, s) = { (19.56)
s(1 − z), z<s<1
19.4 Green’s function for the n-th order TPBVP | 469

represents the deflection of the string at position z due to a unit force acting at position
s and deflection can be expressed (same as in equation (19.50)) as

y(z) = ∫ G(z, s)f (s) ds (19.57)


0

Example 19.9 (Green’s function for the one-dimensional diffusion-convection opera-


tor). Consider the one-dimensional diffusion-convection operator:

1 d2 u du
− = −δ(x − s), 0<x<1 (19.58)
Pe dx2 dx
1 du
− u = 0@x = 0 (19.59)
Pe dx
du
= 0@x = 1 (19.60)
dx

0 < x < s:

1 d2 u du
− =0
Pe dx 2 dx
󳨐⇒

1 du
− u = Constant
Pe dx

BC 󳨐⇒ Constant = 0

du
= Pe ⋅u
dx
󳨐⇒

u = c1 ePe x (19.61)

s < x < 1:

1 d2 u du
− =0
Pe dx 2 dx
󳨐⇒

1 du
− u = Constant = c2
Pe dx
󳨐⇒

u′ − Pe u = c2 . Pe
470 | 19 The nonhomogeneous BVP and Green’s function

ue− Pe x = c2 . Pe ∫ e− Pe x dx

= −c2 .e− Pe x + c3

󳨐⇒

u = c3 ePe x − c2
du
= c3 Pe ePe x , u′ (1) = 0 󳨐⇒ c3 = 0
dx
󳨐⇒

u(x) = −c2 for s < x < 1

c1 ePe x , 0<x<s
u(x) = { (19.62)
−c2 , s<x<1
c1 Pe ePe x , 0<x<s
u′ (x) = { (19.63)
0, s<x<1

It follows from equation (19.58)

1 du
− u = −H(x − s)
Pe dx
󳨐⇒

c2 = −1

Continuity of u(x) at x = s gives

c1 ePe s = 1 󳨐⇒ c1 = e− Pe s

e− Pe s ePe x , 0<x<s
u(x, s) = G(x, s) = { (19.64)
1, s<x<1

∴ The Green’s function for the 1D diffusion-convection operator is given by

e− Pe(s−x) , 0<x<s
G(x, s) = { (19.65)
1, s<x<1

and is shown in Figure 19.6 for Pe = 4.


19.5 Solution of TPBVP with inhomogeneous boundary conditions | 471

Figure 19.6: Green’s function for the one-dimensional diffusion-convection operator.

19.5 Solution of TPBVP with inhomogeneous boundary conditions


We now consider the problem of solving the two-point BVP:

k(u(a))
Lu = −f (x); W( )=d (19.66)
k(u(b))

As stated in the previous chapter, using the principle of superposition, the solution of
(19.66) may be expressed as

u = u1 + u2 (19.67)

where u1 (x) and u2 (x) are the solutions of

k(u1 (a))
Lu1 = −f (x); W( )=0 (19.68)
k(u1 (b))

and

k(u2 (a))
Lu2 = 0; W( )=d (19.69)
k(u2 (b))

We have already seen that (19.68) has a unique solution if the homogeneous problem
is incompatible, i. e., G(x, s) exists and is unique. Thus, we only need to solve (19.69)
to get the complete solution to equation (19.66). We now consider equation (19.69) and
present two methods for solving this BVP.

Method 1
Consider the BVP

k(u(a))
Lu = 0; W( )=d (19.70)
k(u(b))
472 | 19 The nonhomogeneous BVP and Green’s function

Let u(x)T = [u1 (x) u2 (x) . . un (x)] be a fundamental vector. Then any solution to
equation (19.70) must be of the form

u(x) = u(x)T c = c1 u1 (x) + c2 u2 (x) + ⋅ ⋅ ⋅ + cn un (x) (19.71)

In order to determine c, substitute equation (19.71) into the BCs:


󳨐⇒

Wa k(u(a)T c) + Wb k(u(b)T c) = d

󳨐⇒

[Wa K(u(a)) + Wb K(u(b))]c = d

󳨐⇒

Dc = d

Since the homogeneous problem is incompatible, D is invertible and

c = D−1 d. (19.72)

Example 19.10.

d2 u
=0
dx2
u(0) = d1 , u(1) = d2
u1 (x) = x, u2 (x) = (1 − x)
x 1−x
K(u(x)) = ( )
1 −1
1 0 0 1 0 0 1 0
D=( )( )+( )( )
0 0 1 −1 1 0 1 −1
0 1 0 0 0 1
=( )+( )=( )
0 0 1 0 1 0
0 1
D−1 = ( )
1 0

0 1 d
u(x) = [ u1 u2 ] ( )( 1 )
1 0 d2
d2
= [ u1 (x) u2 (x) ] ( )
d1
= d2 x + d1 (1 − x)
19.5 Solution of TPBVP with inhomogeneous boundary conditions | 473

Example 19.11.

u′′ = 0
u(0) + u(1) = d1 , u′ (0) + u′ (1) = d2
u(x)T = [ 1 x ]
2 1
D=( )
0 2

󳨐⇒
1
− 41
D−1 = ( 2
1 )
0 2


d1 d2

D−1 d = ( 2
d2
4 )
2
d1 d2
− 2d1 − d2 d2
u(x) = [1 x] [ 2
d2
4 ]= + x
2
4 2

Method 2: (in terms of Green’s function)


Consider the Lagrange identity

d T
v(s)Lu(s) − u(s)L∗ v(s) = [k (v(s))P(s)k(u(s)))] (19.73)
ds

Let

u(s) = u(s)T c (19.74)

where u(s) is a fundamental vector of (−Lu) = 0. Now, let v(s) = G(x, s) and substitute
in equation (19.73) 󳨐⇒

d T
− u(s)L∗ G(x, s) = [k (G(x, s))P(s)k(u(s)))] (19.75)
ds

We know that

L∗ G(x, s) = 0, a≤s<x

and

L∗ G(x, s) = 0, x<s≤b
474 | 19 The nonhomogeneous BVP and Green’s function

Thus, integrate (19.75) from s = a to s = x − and s = x + to s = b. The LHS is identically


zero. 󳨐⇒

kT (G(x, x − ))P(x − )k(u(x − )) − kT (G(x, a))P(a)k(u(a)) = 0 (19.76)


T T + + +
k (G(x, b))P(b)k(u(b)) − k (G(x, x ))P(x )k(u(x )) = 0 (19.77)

P(s) is continuous 󳨐⇒

P(x − ) = P(x + ) = P(x)

Since the (n − 1) derivatives of u(x) are continuous 󳨐⇒

k(u(x + )) = k(u(x − )) = k(u(x)) = K(u(x))c

Thus, adding equations (19.76) and (19.77) give

[kT (G(x, x − )) − kT (G(x, x+ ))]P(x)k(u(x)) = kT (G(x, a))P(a)k(u(a))


− kT (G(x, b))P(b)k(u(a)) (19.78)

Since G(x, s) and its first (n − 2) derivatives are continuous at s = x and (n − 1) the
n−1
derivative has a jump of (−1)
p (x)
, the LHS of equation (19.78) may be simplified to
0

(−1)n−1 T
LHS = e P(x)k(u(x))
p0 (x) n
. . . p0 (x) u(x)
n−1
(−1) [
[ . −p0 (x) 0 ] [ u′ (x)
][
]
]
= [ 0 . . 0 1 ][ ][ ]
p0 (x) [ . . 0 0 ][ . ]
n−1 n−1
[ (−1) p0 (x) 0 0 0 ][ u (x) ]
u(x)
(−1)n−1 [ u′ (x) ]
[ (−1)n−1 p0 (x)
[ ]
= 0 . . 0 ][ ]
p0 (x) [ . ]
[ un−1 (x) ]
= u(x)

Similarly, the RHS of equation (19.78) simplifies to

k(u(a))
RHS = [ kT (G(x, a))P(a) −kT (G(x, b))P(b) ] [ ]
k(u(b))
P(a) 0 k(u(a))
= [ kT (G(x, a)) kT (G(x, b)) ] [ ][ ]
0 −P(b) k(u(b))


19.5 Solution of TPBVP with inhomogeneous boundary conditions | 475

P(a) 0 k(u(a))
u(x) = [ kT (G(x, a)) kT G(x, b) ] [ ][ ] (19.79)
0 −P(b) k(u(b))

But we have shown that G(x, s) satisfies the homogeneous BCs at s = a and s = b. Thus,
the row vector

P(a) 0
[ kT (G(x, a)) kT (G(x, b)) ] [ ] = h(x)T W (19.80)
0 −P(b)

must belong to the row space of W, and hence may be written as shown in equation
(19.80). Substituting equation (19.80) in equation (19.79), we get

k(u(a))
u(x) = h(x)T W [ ]
k(u(b))
= h(x)T d (19.81)

which is the solution of equation (19.70). The vector function h(x) is determined from
equation (19.80), or equivalently from

kT (G(x, a))P(a) = h(x)T Wa


kT (G(x, b))P(b) = −h(x)T Wb (19.82)

where n of these equations determine h(x) and the remaining n determine the adjoint
BCs in terms of the Green’s function.

Example 19.12.

d2 u
= −f (x)
dx2
u(0) = d1 , u(1) = d2 .

2
d
For the operator − dx 2 , u(0) = 0, u(1) = 0, we have

s(1 − x), 0≤s<x


G(x, s) = {
x(1 − s), x<s≤1

and

0 −1
P(x) = ( )
1 0
p0 (x) = −1

∴ Equation (19.82) becomes


476 | 19 The nonhomogeneous BVP and Green’s function

𝜕G 0 −1 1 0
[G(x, 0) (x, 0)] [ ] = [ h1 (x) h2 (x) ] [ ]
𝜕s 1 0 0 0
𝜕G 0 −1 0 0
[G(x, 1) (x, 1)] [ ] = − [ h1 (x) h2 (x) ] [ ]
𝜕s 1 0 1 0

󳨐⇒

G(x, 0) = 0, G(x, 1) = 0
𝜕G
h1 (x) = 1 − x, h2 (x) = − (x, 1) = x
𝜕s

1

u(x) = ∫ G(x, s)f (s) ds + d1 (1 − x) + d2 x.


0

Problems
1. The deflection of a simply supported beam is described by the fourth-order bound-
ary value problem

d4 u
EI = F(x), 0 < x < L; u(0) = u(L) = u′′ (0) = u′′ (L) = 0
dx 4

where EI = flexural rigidity of the beam and F(x) is the intensity of the distributed
load (force per unit length) (a) Determine the Green’s function of this system by
evaluating the deflection curve for a unit load at x = s. (b) Evaluate the deflection
curve for a triangularly distributed load, i. e.,
wx
F(x) =
L

(c) Determine the maximum deflection for the load in (b)


2. Show that the Green’s function for the operator

d du
Lu = (r ), a < r < b; u′ (a) = 0, u(b) = 0
dr dr

is given by

ln(b/r) a<s<r
G(r, s) = {
ln(b/s) r<s<b

Heat is generated in a thin annular disk with insulated faces. The conductivity is
k in the radial direction. The internal circumference of the disk is insulated while
the external circumferential area is held at temperature zero. The heat generation
19.5 Solution of TPBVP with inhomogeneous boundary conditions | 477

rate per unit volume is given by α(b−r)


(b−a)
. Determine the temperature distribution in
the disk using the Green’s function.
3. (a) Derive a formula for the solution of the n-th order boundary value problem

Lu = −f (x), a<x<b
Wa k(u(a)) + Wb k(u(b)) = d

where the corresponding homogeneous problem is incompatible. (b) Determine a


formula for the Green’s matrix of the vector boundary value problem

du
− A(x)u = f(x), a<x<b
dx
Wa u(a) + Wb u(b) = 0

4. For each of the following problems, find an equivalent integral equation:


(a) (x 2 u′ )′ = −λx2 u, a < x < b; u′ (a) = 0, u(b) = 0
(b) u′′ = −f (u, x); u′ (0) = 0, u(1) = 0 (Note: this is a nonlinear equation)
4
(c) ddxu4 = λu, u(0) = u(1) = u′′ (1) = u′′ (1) = 0.
20 Eigenvalue problems for differential operators
20.1 Definition of eigenvalue problems
Let

Ly = p0 (x)y[n] + p1 (x)y[n−1] + ⋅ ⋅ ⋅ + pn (x)y (20.1)

be a regular linear differential operator, i. e., p0 (x) ≠ 0 in [a, b] and pj (x) ∈ C n−j [a, b].
Consider the homogeneous boundary value problem (BVP)

k(y(a))
− Ly = λy; W( )=0 (20.2)
k(y(b))

Definition. A real or complex number λ for which the BVP defined by equation (20.2) is
compatible is called an eigenvalue and any corresponding nontrivial solution is called
an eigenfunction. The set of all eigenvalues is called the spectrum of the BVP.
Remarks.
(1) For consistency with the finite-dimensional case, we should have defined the
eigenvalue in equation (20.2) with a positive sign. However, the eigenvalues of
most of the differential operators we encounter in our applications are negative
and we are following the literature notation here.
(2) Since the BVP given by equation (20.2) is homogeneous, the eigenfunctions asso-
ciated with an eigenvalue form a subspace of C n [a, b].

The adjoint BVP is defined by

k(v(a))
− L∗ v = λv;
̄ Q( )=0 (20.3)
k(v(b))

Theorem. If λ is an eigenvalue of equation (20.2), then λ̄ is an eigenvalue of the adjoint


BVP defined by equation (20.3).
Proof. Suppose that y(x) is the eigenfunction corresponding to the eigenvalue λ. Let
v(x) be any nonzero function in the domain of L∗ . Then, by definition of the adjoint
operator,

⟨Ly, v⟩ = ⟨y, L∗ v⟩,

where the inner product is defined by

⟨y, v⟩ = ∫ y(x)v(x) dx,


a

we obtain

https://doi.org/10.1515/9783110739701-021
20.1 Definition of eigenvalue problems | 479

⟨−λy, v⟩ = ⟨y, L∗ v⟩.

Using the properties of the inner product,

−λ⟨y, v⟩ = ⟨y, L∗ v⟩
󳨐⇒ ⟨y, −λv⟩
̄ = ⟨y, L∗ v⟩
󳨐⇒ ⟨y, L∗ v + λv⟩
̄ =0

Thus, we have two possibilities: (i) L∗ v + λv̄ = 0, which implies that λ̄ is an eigenvalue
of L∗ with v(x) being the eigenfunction or (ii) the function L∗ v + λv̄ is orthogonal to
y(x), i. e.,

∫(L∗ v + λv)y(x)
̄ dx = 0 (20.4)
a

In case (i), we have already proved the result. Now, consider case (ii) and suppose that

L∗ v + λv̄ = −f (x) (20.5)

Then equation (20.4) 󳨐⇒

⟨f , y⟩ = 0 (20.6)

If −(L∗ + λ)v
̄ represented every continuous function f (x), we have a contradiction since
equation (20.6) 󳨐⇒ y(x) = 0. Therefore, the other alternative must hold, i. e., L∗ v +
λv̄ = 0 for some nonzero v(x).
The above conclusion may also be reached using the following reasoning. Sup-
pose that the operator (L∗ + λ)̄ is invertible, i. e.,

(L∗ + λ)v
̄ = 0 󳨐⇒ v = 0,

then the range of (L∗ + λ)̄ consists of every continuous function. Hence, by choosing
different v(x) we can obtain different f for which ⟨f , y⟩ = 0. Again, we have y(x) = 0,
which is a contradiction. ∴ (L∗ + λ)̄ must be singular and there exists a nonzero v(x)
such that

L∗ v = −λv̄

∴ λ̄ is an eigenvalue of the adjoint system.

Theorem. The eigenfunctions of the BVP defined by equation (20.2) and the adjoint BVP,
(equation (20.3)) corresponding to distinct eigenvalues are orthogonal, i. e.,

⟨ym (x), vn (x)⟩ = 0, m ≠ n


480 | 20 Eigenvalue problems for differential operators

Proof. To prove the biorthogonality property, let

Lym = −λm ym and L∗ vn = −λ̄n vn


󳨐⇒ ⟨Lym , vn ⟩ = ⟨ym , L∗ vn ⟩
󳨐⇒ ⟨−λ y , v ⟩ = ⟨y , −λ̄ v ⟩
m m n m n n
󳨐⇒ −λm ⟨ym , vn ⟩ = −λn ⟨ym , vn ⟩
󳨐⇒ −(λm − λn )⟨ym , vn ⟩ = 0

Since λm ≠ λn 󳨐⇒

⟨ym , vn ⟩ = 0

Corollary. If the BVP is self-adjoint, the eigenfunctions corresponding to distinct eigen-


values are orthogonal. Moreover, we can choose the eigenfunctions such that

1, m=n
⟨ym , yn ⟩ = δmn = {
0, m ≠ n

As in the finite-dimensional case, we call such set of eigenfunctions an orthonormal set.

20.2 Determination of the eigenvalues


Consider the BVP

k(u(a))
Ly = −λy; W( )=0 (20.7)
k(u(b))

Let y1 (x, λ), y2 (x, λ), . . . , yn (x, λ) be a set of linearly independent solutions of equation
(20.7). Then any solution is of the form

y(x) = y(x, λ)T c


= y1 (x, λ)c1 + y2 (x, λ)c2 + ⋅ ⋅ ⋅ + yn (x, λ)cn (20.8)

where c is determined from

D(λ)c = 0 (20.9)

where

D(λ) = Wa K(y(a, λ)) + Wb K(y(b, λ)) (20.10)

is the characteristic matrix. If rank D(λ) = n, then c = 0 and the only solution to the
BVP is the trivial one. Thus, to get a nontrivial solution, we require that
20.2 Determination of the eigenvalues | 481

󵄨 󵄨
h(λ) ≡ det D(λ) = 󵄨󵄨󵄨D(λ)󵄨󵄨󵄨 = 0 (20.11)

Equation (20.11), which determines the eigenvalues of the BVP, is called the charac-
teristic equation. The zeros of the scalar function h(λ) give the eigenvalues. The corre-
sponding eigenfunctions can be obtained by solving equation (20.9) for c and substi-
tuting in equation (20.8).

20.2.1 Relationship between the n-th order eigenvalue problem and the vector
eigenvalue problem

Consider the scalar n-th order eigenvalue problem

k(u(a))
Ly = −λy; W( )=0 (20.12)
k(u(b))

where linear operator L is given by

Ly = p0 y[n] + p1 y[n−1] + ⋅ ⋅ ⋅ + pn y (20.13)

Define

y1 (x) = y(x)
y2 (x) = y′ (x) = y1′
y3 (x) = y′′ (x) = y2′
(20.14)
.
.
yn (x) = y[n−1] x = yn−1

Then equation (20.12) may be written as

dy1
= y2
dx
dy2
= y3
dx
.
.
dyn−1
= yn
dx
dyn p p p λ
= − 1 yn − 2 yn−1 − ⋅ ⋅ ⋅ − n y1 − y1
dx p0 p0 p0 p0

or defining
482 | 20 Eigenvalue problems for differential operators

y1
y2
y=( . ) (20.15)
.
yn
0 1 0 . 0 0 . . . 0
0 0 1 . 0 0 . . . 0
dy (
= . . ) y+λ ( . ) y (20.16)
dx
0 0 0 1 0 .
p pn−1
− n − . . − pp1 − p1 0 . . 0
( p0 p0 0 ) ( 0 )
󳨐⇒

dy
= A(x)y + λB(x)y
dx
󳨐⇒

dy
= [A(x) + λB(x)]y (20.17)
dx
Wa y(a) + Wb y(b) = 0, (20.18)

where the elements of the n × n matrices A(x) and B(x) are continuous functions. Obvi-
ously, this is a more general eigenvalue problem than that defined by equation (20.12).

Theorem. Let A and B be continuous complex valued n × n matrices defined on the real
x-interval [a, b]. Let ξ be any point in (a, b) and w be any constant vector in C n . Consider
the solution of equations (20.17) and (20.18) that passes through the point (ξ , w) and
denote it by y(x, λ, ξ , w). Then:
1. The solution y(x, λ, ξ , w) exists for all x in (a, b) and is continuous in x, λ and w for
x in (a, b) and |w| + |λ| < ∞, for each fixed x in (a, b).
2. It is analytic in w and λ for |w| + |λ| < ∞.

A proof of this theorem may be found in the book by Coddington and Levinson [13]. An-
other way of expressing this result is that if we generate a fundamental matrix of equa-
tion (20.17) by using the n conditions:

y = wj ej , ξ ∈ [a, b] j = 1, 2, 3, . . . , n (20.19)

Then Y(x, λ, ξ , w) is an entire function of λ. Thus, the characteristic matrix

D(λ) = Wa Y(a, λ, ξ , w) + Wb Y(b, λ, ξ , w) (20.20)

has elements that are all entire functions


20.3 Properties of the characteristic equation | 483

󳨐⇒ h(λ) = det D(λ) is an entire function of λ. Thus, to study the nature (real or complex
and distribution) of the eigenvalues, we need to study the zeros of entire functions. In
the special case in which A(x) and B(x) are constant matrices, we get

Y(x) = e(A+λB)x , (20.21)

which is an entire function of λ. In this case,

D(λ) = Wa e(A+λB)a + Wb e(A+λB)b (20.22)

and

h(λ) = det D(λ) (20.23)

It is clear that h(λ) is an entire function of λ.

Remark. We can generate h(λ) for the n-th order equation

Ly = −λy,

which is an entire function by choosing the linearly independent solutions according


to

k(y(ξ )) = ej , a ≤ ξ ≤ b, j = 1, 2, .., n

Other ways of generation of h(λ) may not make it an entire function of λ.

20.3 Properties of the characteristic equation


Theorem. Let h(λ) be an entire function. Then:
1. The zeros of h(λ) are discrete (isolated).
2. There cannot be an infinite number of zeros of h(λ) in any closed region of the com-
plex {λ} plane, i. e., there is no cluster point of the zeros except at infinity.
3. If h(λ) is real for λ real, the zeros must occur in complex conjugate pairs.

Proof. Let us write z for λ.


1. Let ℛ be any closed region in the complex plane and z = a be a zero of h(z). Since
h(z) is analytic in ℛ, we may write

h(z) = ∑ an (z − a)n , a0 = 0
n=0

This Taylor series expansion converges everywhere in ℛ. If a0 = a1 = ⋅ ⋅ ⋅ =


ak−1 = 0, then z = a is a zero of multiplicity k.
484 | 20 Eigenvalue problems for differential operators

󳨐⇒

h(z) = (z − a)k g(z),

where g(z) is an analytical function with g(a) ≠ 0. If k is not finite, then h(z) ≡ 0.
If k is finite, then ∃ a region |z − a| < δ such that g(z) has no zero, or equivalently
h(z) does not vanish. ∴ The zeros are isolated
2. Suppose that z1 , z2 , . . . , zn are the zeros in ℛ such that

lim z = c, i. e.,
n→∞ n

c is a cluster point. Then

lim h(zn ) = h( lim zn ) = h(c) = 0


n→∞ n→∞

Hence, c is a zero of h(z), which is not isolated contradicting (i) 󳨐⇒ No clustering


of the zeros. Now if there are an infinite number of zeros in ℛ then we can extract a
convergent sequence {zn } → a, whose limit is in the domain. This contradicts the
hypothesis. Hence, either h(z) ≡ 0 or h(z) has only a finite number of zeros in ℛ.
3. Let

h(z) = a0 + a1 z + a2 z 2 + a3 z 3 + ⋅ ⋅ ⋅ ,

which converges for all z and is real for z real. Suppose that z = b is a real number

h(b) = a0 + a1 b + a2 b2 + a3 b3 + ⋅ ⋅ ⋅

If b = 0, h is real 󳨐⇒ a0 is real,

h′ (z) = a1 + 2a2 z + ⋅ ⋅ ⋅

h′ (0) is real 󳨐⇒ a1 is real

∴ {ai } are real. Now suppose

h(α) = 0

󳨐⇒

0 = a0 + a1 α + a2 α2 + ⋅ ⋅ ⋅ = h(α)
0 = a0 + a1 ᾱ + a2 ᾱ 2 + ⋅ ⋅ ⋅ = h(α)̄
20.3 Properties of the characteristic equation | 485

∴ ᾱ is also a zero of h(z). If h(z) is real for z real and h(z) is an entire function, then
its zeros must occur in conjugate pairs.

Further properties of the characteristic equation


The characteristic equation is given by

h(λ) = det D(λ)


󵄨󵄨 d (λ) d (λ) . d1n (λ) 󵄨󵄨󵄨󵄨
󵄨󵄨 11 12
󵄨󵄨 󵄨󵄨
󵄨󵄨󵄨 . 󵄨󵄨󵄨
= 󵄨󵄨 󵄨󵄨 (20.24)
󵄨󵄨 . 󵄨󵄨
󵄨󵄨 󵄨󵄨
󵄨󵄨 d (λ) d (λ) . dnn (λ) 󵄨󵄨󵄨
󵄨 n1 n2

󳨐⇒
󵄨󵄨 d′ d12

. d1n
′ 󵄨󵄨 󵄨󵄨 d d12 . d1n 󵄨󵄨
󵄨󵄨 11 󵄨󵄨 󵄨󵄨 11 󵄨󵄨
󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨󵄨
󵄨󵄨 . . 󵄨󵄨 󵄨 󵄨󵄨
h′ (λ) = 󵄨󵄨󵄨 󵄨󵄨 + ⋅ ⋅ ⋅ + 󵄨󵄨󵄨 . . 󵄨󵄨 (20.25)
󵄨󵄨 . . 󵄨󵄨󵄨 󵄨󵄨 .
󵄨󵄨 . 󵄨󵄨
󵄨󵄨
󵄨󵄨 󵄨󵄨 󵄨󵄨 ′ 󵄨󵄨
󵄨󵄨 d dn2 dnn 󵄨󵄨 dn1 dn2 dnn
. 󵄨󵄨 ′
. ′ 󵄨󵄨
󵄨 n1 󵄨

If we expand each determinant by elements of the differentiated rows, we obtain a


linear combination of determinants of order (n − 1), which are minors of D. If rank of
D is (n − 2), then all these determinants vanish
󳨐⇒

h′ (λ) = 0

Thus, if the eigenvalues are to be simple, the rank of D(λ) must be (n − 1). However, the
converse is not true, i. e., h(λ) = h′ (λ) = 0 does not imply rank D = n − 2. To illustrate,
consider

λ 1
D=( )
0 λ

which gives

h = λ2 , h′ = 2λ

Thus,

h = h′ = 0 @ λ = 0

but D(0) has rank one. We now return to equations (20.24) and (20.25). By Laplace’s
expansion, h′′ (λ) is a linear combination of determinants of order (n − 2) and (n − 1)
which are minors of D. Thus, if rank D = (n − 3),
486 | 20 Eigenvalue problems for differential operators

󳨐⇒

h(λ) = h′ (λ) = h′′ (λ) = 0

Thus, the multiplicity of eigenvalue = 3. Again, the converse is not true, i. e., h(λ), h′ (λ)
and h′′ (λ) can be zero even if rank D = (n − 1). Thus, the multiplicity of the eigenvalue
is at least k if rank D = n − k.

Definition. The eigenvalue λi is said to have algebraic multiplicity k if

h(λi ) = 0
dh
(λ ) = 0
dλ i
. (20.26)
.
k−1
d h dk h
(λi ) = 0 but (λ ) ≠ 0
dλk−1 dλk i

The eigenvalue λi is said to have geometric multiplicity k if rank D(λi ) = n − k. If both


these multiplicities are equal and k = 1, the eigenvalue is said to be simple, k = 2, and
the eigenvalue is said to be double, etc.

Example 20.1. Consider the second-order eigenvalue problem

d2 y
= −λy; y(0) = 0; y(1) = 0. (20.27)
dx 2

We take

sin √λx
y1 = cos √λx, y2 = ,
√λ

which satisfy

k(y1 (0)) = e1
k(y2 (0)) = e2

󳨐⇒

cos √λx sin √λx/√λ


K(y(x, λ)) = ( )
− λ sin √λx
√ cos √λx
D(λ) = Wa K(y(0)) + Wb K(y(1))
1 0 1 0 0 0 cos √λ sin √λ/√λ
=( )( )+( )( )
0 0 0 1 1 0 −√λ sin √λ cos √λ
20.3 Properties of the characteristic equation | 487

1 0 0 0
=( )+( )
0 0 cos √λ sin √λ/√λ
1 0
=( ) (20.28)
cos √λ sin λ/√λ

sin √λ
h(λ) = = 0 󳨐⇒ √λ = nπ, n = ±1, ±2, . . .
√λ

The eigenvalues are given by

λn = n2 π 2 , n = 1, 2, . . .
1 0
D(λ) = ( sin √λ ),
cos √λ √λ

D(λn )c = 0 󳨐⇒ c1 = 0
∴ yn (x) = c1 y1 + c2 y2 = c2 sin nπx

Note that n = ±k gives the same eigenfunction and eigenvalue. Choose c2 such that

⟨yn , yn ⟩ = ∫ yn2 dx = 1 󳨐⇒ c2 = √2
0

∴ Normalized eigenfunctions are given by

yn (x) = √2 sin nπx (20.29)

Thus, there are infinite number of eigenvalues and the corresponding eigenfunctions.
In addition,

sin √λ
h(λ) = det D(λ) =
√λ
dh sin nπ cos nπ (−1)n
(λn ) = − 3 3 + = ≠ 0 (20.30)
dλ 2n π 2nπ 2nπ

Thus, the eigenvalues are simple. The eigenspace corresponding to each eigenvalue
has dimension one.

Example 20.2. Consider the eigenvalue problem

y′′ = −λy; y(0) = 0, y(1) − 2y′ (1) = 0 (20.31)

Since the operator is the same, we can use the same Wronskian matrix as that in the
previous example (but the boundary conditions are different). We have
488 | 20 Eigenvalue problems for differential operators

1 0 0 0
Wa = ( ), Wb = ( )
0 0 1 −2

1 0 1 0 0 0 cos √λ sin √λ/√λ


D(λ) = ( )( )+( )( )
0 0 0 1 1 −2 −√λ sin √λ cos √λ
1 0 0 0
=( )+( sin √λ )
0 0 cos λ + 2√λ sin √λ
√ − 2 cos √λ
√λ
1 0
=( sin √λ ) (20.32)
cos λ + 2√λ sin √λ

√λ
− 2 cos √λ

sin √λ
h(λ) = − 2 cos √λ = 0 󳨐⇒ tan √λ = 2√λ (20.33)
√λ

This is the characteristic equation for determining the eigenvalues. This is a transcen-
dental equation and has an infinite number of roots as shown in Figure 20.1, where the
two curves (LHS and RHS of equation (20.33)) intersect at infinite number of points.

Figure 20.1: The roots of characteristic equation (intersection of the two curves).

This leads to infinite number of eigenvalues

±√λ1 , ±√λ2 , . . .

or

λ = λ1 , λ2 , λ3 , . . .
D(λn )c = 0 󳨐⇒ c1 = 0


20.3 Properties of the characteristic equation | 489

yn (x) = c2 sin √λn x


1 1
1 − cos 2√λn x
⟨yn , yn ⟩ = 1 󳨐⇒ 2
c2 ∫ sin2 √λn x dx = 1 󳨐⇒ 2
c2 ∫ dx = 1
2
0 0

󳨐⇒
1
2 sin 2√λn x 󵄨󵄨󵄨󵄨 sin 2√λn
2
= 1 − 󵄨 =1−
c2 2√λn 󵄨0
󵄨
󵄨 2√λn
2 sin √λn cos √λn
=1−
2√λn
= 1 − 2 cos2 √λn (using Eq. (20.33))

= − cos 2√λn

󳨐⇒

2 2
c22 = or c22 =
− cos 2√λn 1 − 2 cos2 √λn

󳨐⇒

2
c2 = √
1 − 2 cos2 √λn

√2 sin √λn x
yn (x) = , n = 1, 2, . . . (20.34)
√1 − 2 cos2 √λn

are normalized eigenfunctions. In addition, it can be shown that h′ (λn ) ≠ 0, i. e., the
eigenvalues are simple.

Example 20.3. Consider the eigenvalue problem

y′′ = −λy; y(0) = y(1); y′ (0) = y′ (1) (20.35)

Again, since the operator is the same, we can use the same Wronskian matrix as that
in Example 20.1. We have

1 0 −1 0
Wa = ( ), Wb = ( )
0 1 0 −1
1 0 1 0 −1 0 cos √λ sin √λ/√λ
D(λ) = ( )( )+( )( )
0 1 0 1 0 −1 −√λ sin √λ cos √λ
490 | 20 Eigenvalue problems for differential operators

1 0 − cos √λ − sin √λ/√λ


=( )+( )
0 1 √λ sin √λ − cos √λ
1 − cos √λ − sin √λ/√λ
=( ) (20.36)
√λ sin √λ 1 − cos √λ

h(λ) = sin2 √λ + (1 − cos √λ)2


= 2[1 − cos √λ] = 0 (20.37)

The eigenvalues are given by

√λ = 2nπ

λn = 4n2 π 2 , n = 0, 1, 2, . . .

λ = 0 gives a nontrivial solution, y0 (x) = 1. For n ≥ 1, we have

0 0
D(λn ) = ( ), rank = 0
0 0

Thus, we have two eigenfunctions

yn1 (x) = sin 2nπx and yn2 (x) = cos 2nπx, n = 1, 2, 3, . . .

We note that each eigenvalue except λ0 = 0 is double:

dh sin √λ
=
dλ √λ
dh 󵄨󵄨󵄨 d2 h 󵄨󵄨󵄨󵄨 1
󵄨󵄨 = 0, 󵄨󵄨 = ≠ 0
dλ 󵄨λ=λn
󵄨 dλ 󵄨λ=λn 2λn
2 󵄨

Thus, the eigenspace corresponding to each eigenvalue λn is two-dimensional. To


summarize, we have
Eigenvalues:

λ0 = 0 (20.38)
2 2
λn = 4n π , n = 1, 2, 3, . . . (20.39)

Eigenfunctions:

y0 (x) = 1 (20.40)
yn1 (x) = sin 2nπx (20.41)
yn2 (x) = cos 2nπx (20.42)
20.3 Properties of the characteristic equation | 491

Example 20.4. Consider the eigenvalue problem

y′′ = −λy; y(0) = 0; y′ (0) = αy(1); α real (20.43)

󳨐⇒

1 0 0 0
Wa = ( ), Wb = ( )
0 1 −α 0
sin √λ
1 0 1 0 0 0 cos √λ
D(λ) = ( )( )+( )( √λ )
0 1 0 1 −α 0 − λ sin √λ
√ cos √λ
1 0 0 0
=( )+( )
−α sin√
√λ
0 1 −α cos √λ
λ
1 0
=( ) (20.44)
1 − α sin√
√λ
−α cos √λ
λ

󵄨 󵄨 α sin √λ
h(λ) = 󵄨󵄨󵄨D(λ)󵄨󵄨󵄨 = 1 − (20.45)
√λ

Characteristic equation:

sin √λ 1
=
√λ α

Note: For the special case of α = 1, λ = 0 is an eigenvalue with eigenfunction

y0 (x) = x

Different cases

(A) α = 1 (B) α > 1 (C) 0<α<1


(D) α=∞ (E) α<0 (20.46)

Case A: α = 1
Let √λ = a + ib 󳨐⇒

sin(a + ib) = a + ib
sin a cosh b + i cos a sinh b = a + ib
a = sin a cosh b (20.47)
b = cos a sinh b (20.48)
a
Equation (20.47) 󳨐⇒ b = cosh−1 ( ) (20.49)
sin a
492 | 20 Eigenvalue problems for differential operators

a a
Equation (20.48) 󳨐⇒ cosh−1 ( ) − cos a sinh{cosh−1 ( )} = 0 (20.50)
sin a sin a

Solve for a and get b from equation (20.49).

Asymptotic values
b
For b ≫ 1, sinh b
→ 0. Therefore, equation (20.48) 󳨐⇒ cos a = 0:

π
a ≈ (2n + 1) , n = 0, 1, 2, . . . 󳨐⇒ sin a = (−1)n
2
a
For n odd, the equation cosh b = sin a
cannot be satisfied. Thus,

π
am = (2m + 1) , m = 0, 2, 4, 6, . . .
2
π
= (4m + 1) , m = 0, 1, 2, 3, . . . (20.51)
2
π
bm = cosh−1 {(4m + 1) }
2
π √ π2
= ln{(4m + 1) + (4m + 1)2 − 1} (20.52)
2 4

√λm = am + ibm ⇒ λm = a2m − b2m ± i2am bm . (20.53)

Exact eigenvalues:

λ0 = 0
λ1 = 48.56 ± i41.5
λ2 = 181.97 ± i93.2
π2 2
λm ≈ (4m + 1)2 − [ln(4m + 1)π]
4
± i(4m + 1)π[ln(4m + 1)π], m ≥ 3 (20.54)

Case D: α = ∞
In this case,

sin √λ
h(λ) = = 0 󳨐⇒ √λ = nπ, n = 1, 2, . . . (20.55)
√λ
󳨐⇒ λn = n2 π 2 ; n = 1, 2, 3, . . . , ∞ (20.56)

No complex eigenvalues for α = ∞


20.3 Properties of the characteristic equation | 493

General α (Cases B, C and E)

√λ = a + ib 󳨐⇒ α sin(a + ib) = a + ib

󳨐⇒

a = α sin a cosh b (20.57)


b = α cos a sinh b (20.58)
a tan a
=
b tanh b

Equation (20.57) 󳨐⇒

a
b = cosh−1 { } (20.59)
α sin a

Substituting equation (20.59) in equation (20.58) 󳨐⇒

a a
cosh−1 { } − α cos a sinh cosh−1 { }=0 (20.60)
α sin a α sin a

Asymptotic values
For b ≫ 1

√λm = am + ibm
π π
= (4m + 1) + i cosh−1 {(4m + 1) } (20.61)
2 2α

Summary
(i) For α = ∞, only real eigenvalues

λ n = n2 π 2

(ii) For 1 < α < ∞, a finite number of real eigenvalues, infinite number of complex
eigenvalues
(iii) For α = 1, one real eigenvalue all others complex eigenvalues
(iv) For 0 < α < 1, all eigenvalues are complex and move to ∞ as α → 0.

Problems
1. Given the eigenvalue problem,

y′′ = −λy;
y(0) = y′ (1)
494 | 20 Eigenvalue problems for differential operators

y(1) = y′ (0)

(a) Find the eigenvalues and determine multiplicity.


(b) Find the eigenfunctions.
(c) Find the eigenvalues and eigenfunctions of the adjoint system.
(d) Verify the biorthogonality property.
2. Given the eigenvalue problem,

d4 y
= λy, 0 < x < 1
dx4
y′′′ (0) + h1 y(0) = 0; y′′′ (1) − h3 y(1) = 0
y′′ (0) − h2 y′ (0) = 0; y′′ (1) + h4 y′ (1) = 0

(a) Show that the eigenvalues are real.


(b) If hj > 0 for j = 1, 2, 3, 4, show that the eigenvalues are all nonnegative by
integrating

1
d4 y
∫y dx
dx4
0

by parts twice.
(c) Show that the eigenfunctions corresponding to different eigenvalues are or-
thogonal.
3. Consider the eigenvalue problem

d2 u
= −λu; 0 < x < 1
dx2
u(0) − u(1) = 0, u′ (0) − u′ (1) = 0

Determine the eigenvalues, eigenfunctions and adjoint eigenfunctions.


4. The buckling of a long column with pinned ends (see Figure 20.2) is described by
the eigenvalue problem:

d2 y P
=− y
dx 2 EI
y(0) = y(L) = 0
y = displacement
P = applied load
EI = flexural rigidity of the column

What is the critical (smallest) load at which buckling occurs? [Note: the resulting
formula is known as the Euler’s column formula.]
20.3 Properties of the characteristic equation | 495

Figure 20.2: Buckling of a column with pinned endpoints.

5. The following sixth-order eigenvalue problem arises in the stability analysis of


a fluid layer confined between two parallel plates kept at different temperatures
(Rayleigh–Benard convection):

ϕ[4] − 2k 2 ϕ′′ + k 4 ϕ − Ra k 2 ψ = λ(ϕ′′ − k 2 ϕ)


ψ′′ − k 2 ψ + ϕ = λ Pr ψ, 0<x<1
ϕ(0) = ϕ (0) = ψ(0) = 0

ϕ(1) = ϕ′ (1) = ψ(1) = 0

Here, k is the wave number, Pr is the Prandtl number and Ra is the Rayleigh num-
ber. (a) Show that the eigenvalues are real (b) Compute the smallest value of Ra for
which an eigenvalue can be zero [this value of Ra defines the critical value beyond
which the conduction state is not stable and leads to convective patterns].
6. Consider the EVP

v′′ = −λv, 0<x<1


Pe
v′ (1) + v(1) = 0
2
Pe
v′ (0) − v(0) + Pe RePe /2 v(1) = 0,
2
where Pe > 0 is the Peclet number and R > 0 is the recycle ratio. Determine the
nature of the eigenvalues for different choices of Pe and R.
21 Sturm–Liouville theory and eigenfunction
expansions
We have seen in Part I that the eigenvectors of a symmetric matrix form an orthonor-
mal set, which can be used to expand any arbitrary vector in terms of the eigenvectors.
We have also used such expansions to solve linear equations in which the symmetric
matrix appeared. This chapter is a generalization of the same to infinite dimensions.
While the general approach remains the same, as we have seen in Chapter 10, going
from finite- to infinite-dimensional vector spaces can lead to some subtle differences
in the manner in which the sequences converge, and hence the interpretation of the
expansions. While the general terminology of “Fourier series” is used for these expan-
sions, it should be pointed out that it took scientists and mathematicians over 200
years to clarify the various concepts related eigenfunction expansions. While we are
mainly concerned with the use of these expansions to solve linear differential equa-
tions, this topic has many other applications.

21.1 Sturm–Liouville theory


Consider the self-adjoint second-order boundary value problem

d dy
− (p(x) ) + q(x)y = λρ(x)y, a<x<b (21.1)
dx dx

with one of the following three types of boundary conditions:


unmixed BCs

α1 y(a) + α2 y′ (a) = 0, (α12 + α22 > 0) (21.2)


β1 y(b) + β2 y (b) = 0

(β12 + β22 > 0) (21.3)

periodic BCs

p(a)y(a) = p(b)y(b) (21.4)


y′ (a) = y′ (b) (21.5)

Dirichlet BCs

y(a) = 0 (21.6)
y(b) = 0 (21.7)

The BVP defined by equation (21.1) and either BCs given in equation (21.2)–(21.3),
(21.4)–(21.5) or (21.6)–(21.7) is called a regular Sturm–Liouville problem if p(x) ≠ 0 in
[a, b], −∞ < a < b < ∞ (interval [a, b] is finite). ρ(x) is called the density or weight

https://doi.org/10.1515/9783110739701-022
21.1 Sturm–Liouville theory | 497

function. It is assumed that ρ(x) > 0 for a < x < b. We discuss here the S − −L theory
for Dirichlet BCs. The extension of the theory to other cases is straightforward. (When
p(x) vanishes in [a, b] or when the interval (a, b) is infinite the BVP is called irregular.
These cases will be considered in Chapters 24 and 25.)
Remark. The equation

y′′ + f (x)y′ + g(x)y = −λh(x)y (21.8)

with either of the three sets of BCs may be written in a self-adjoint form using the
transformation
x

p(x) = exp{∫ f (x) dx} (21.9)


a

󳨐⇒

(p(x)y′ ) = p(x)y′′ + p(x)y′ f (x)


Multiplying both sides of equation (21.8) by the positive function p(x) gives

(p(x)y′ ) + p(x)g(x)y = −λp(x)h(x)y, (21.10)


which is same as equation (21.1) if we define

q(x) = −p(x)g(x) (21.11)


ρ(x) = p(x)h(x). (21.12)

Theorem. Consider the self-adjoint BVP

(p(x)y′ ) − q(x)y = −λρ(x)y, a < x < b (21.13)


y(a) = 0, y(b) = 0 (21.14)

(i) the eigenvalues are real


(ii) the eigenvalues are positive if q(x) ≥ 0 and p(x) > 0 in [a, b]
(iii) the eigenfunctions corresponding to distinct eigenvalues are orthogonal with respect
to the weight function ρ(x)
(iv) the eigenvalues are isolated. There are an infinite number of them with no cluster
point, i. e., λn 󳨀→ ∞ for n → ∞.

Proof. (i) We have already proved this for an n-th order self-adjoint BVP.
(ii) Let λn be an eigenvalue with yn (x) as the corresponding eigenfunction:

(p(x)yn′ (x)) − q(x)yn = −λn ρ(x)yn (21.15)


multiply by yn both sides and integrate.


498 | 21 Sturm–Liouville theory and eigenfunction expansions

󳨐⇒
b b b
d
∫ yn (p(x)yn′ ) dx − ∫ q(x)yn2 dx = −λn ∫ ρ(x)yn2 dx
dx
a a a

󳨐⇒
b b b
󵄨b 2
pyn′ yn 󵄨󵄨󵄨a − ∫ p(x)(yn′ ) dx − ∫ qyn2 dx = −λn ∫ ρ(x)yn2 dx
a a a

The first term vanishes since

yn (a) = yn (b) = 0

󳨐⇒
b b
∫a p(x)(yn′ )2 dx + ∫a q(x)yn2 dx
λn = b
(21.16)
∫a ρ(x)yn2 dx

[Remark: The RHS of equation (21.16) is the Rayleigh quotient encountered in Sec-
tion 7.3.] Thus, λn > 0, if p(x) > 0, q(x) ≥ 0
(iii) To prove (iii), let λn and λm be eigenvalue (λn ≠ λm ) with eigenfunctions yn (x) and
ym (x), respectively. Then,

(pyn′ ) − qyn = −λn ρyn (21.17)


′ ′
(pym ) − qym = −λm ρym (21.18)

Multiply equation (21.17) by ym and equation (21.18) by yn , subtract and integrate by


parts
󳨐⇒
b b

∫[(pyn′ ) ym ) yn ] dx − ∫ q(yn ym − ym yn ) dx
′ ′ ′
− (pym
a a
b

= (λm − λn ) ∫ ρyn ym dx
a

󳨐⇒
b b b
b
(λm − λn ) ∫ ρyn ym dx = [pyn′ ym − pym

yn ]a − ∫ pyn ym dx + ∫ pym
′ ′
yn dx
′ ′

a a a
= p(b)yn′ (b)ym (b) − p(b)ym

(b)yn (b) − p(a)yn′ (a)ym (a)
21.1 Sturm–Liouville theory | 499

+ p(a)ym

(a)yn (a)
=0

󳨐⇒

(λm − λn ) ∫ ρyn ym dx = 0 (21.19)


a

Since λm ≠ λn 󳨐⇒

∫ ρ(x)yn (x)ym (x) dx = 0


a

(iv) We have already shown that the eigenvalues are the zeros of an entire function.
Thus, the zeros are isolated. To prove that there are an infinite number of them, we
refer to the book by Coddington and Levinson [13].
We showed that the BVP (equations (21.13)–(21.14)) may be written as an integral
equation:

y(x) = ∫ λρ(s)y(s)G(x, s) ds
a

or

b
1
y(x) = ∫ G(x, s)ρ(s)y(s) ds (21.20)
λ
a

where G(x, s) is the Green’s function. Thus, if we define an integral operator 𝒢 by

𝒢 y = ∫ G(x, s)ρ(s)y(s) ds (21.21)


a

Then the EVP is equivalent to

1
𝒢 y = μy, (μ = ) (21.22)
λ

i. e., the eigenvalues of 𝒢 are reciprocals of those of equations (21.13)–(21.14) and the
eigenfunctions are the same.
𝒢 : C[a, b] → C[a, b] is a linear operator that is bounded and continuous w. r. t. the
norm induced by the inner product
500 | 21 Sturm–Liouville theory and eigenfunction expansions

⟨u, v⟩ = ∫ u(x)v(x)ρ(x) dx
a

Note also that

⟨𝒢 y, u⟩ = ⟨y, 𝒢 u⟩

i. e., 𝒢 is self-adjoint

b b

⟨𝒢 y, u⟩ = ∫[∫ G(x, s)ρ(s)y(s) ds].u(x)ρ(x) dx


a a
b b

= ∫[∫ G(x, s)ρ(x)u(x) dx].ρ(s)y(s) ds = ⟨y, 𝒢 u⟩,


a a

since G(x, s) = G(s, x). These properties may be used to prove spectral theorem for the
operator 𝒢 . Again, we refer to the book of Coddington and Levinson [13] for further
details.

Asymptotic distribution of eigenvalues for Sturm–Liouville systems


Consider again the S–L eigenvalue problem

d dy
− (p(x) ) + q(x)y = λρ(x)y(x) a<x<b (21.23)
dx dx
α1 y(a) + α2 y′ (a) = 0 (α12 + α22 ≠ 0) (21.24)
β1 y(b) + β2 y (b) = 0

(β12 + β22 ≠ 0) (21.25)

Recall that this is a regular S–L problem if (i) −∞ < a < b < ∞, (ii) p(x) ∈ C 1 (a, b) and
q(x), ρ(x) ∈ C 0 (a, b) (iii) ∃p0 > 0 and ρ0 > 0 such that p(x) ≥ p0 and ρ(x) ≥ ρ0 in [a, b].
For this case, the following asymptotic result on the eigenvalues may be obtained (for
details, we refer to the books by Courant and Hilbert [15] and Morse and Feshback
[23]).

Asymptotic results
For n → ∞
nπ C
λn1/2 ≈ + + O(n−2 ) (21.26)
b ρ(x)
{∫a | p(x) |1/2 dx} n

where C depends only on the BCs,

nπ(x − a) nπ(x − a)
yn (x) ≈ [1 + O(n−2 )] cos{ } + [1 + O(n−2 )] sin{ } (21.27)
b−a b−a
21.1 Sturm–Liouville theory | 501

when α2 ≠ 0 and

∫ ρ(x)yn2 (x) dx = 1 (21.28)


a

In particular, ∃ a constant c > 0 ∋


󵄨󵄨 󵄨
󵄨󵄨yn (x)󵄨󵄨󵄨 ≤ c for all a ≤ x ≤ b (21.29)

We provide below some examples of S–L eigenvalue problems.


Example 21.1. Consider the Graetz–Nusselt EVP that arises in heat/mass transfer
analysis in laminar flow between parallel plates

3
y′′ = −λρ(x)y; ρ(x) = (1 − x 2 ), 0<x<1
2
y (0) = 0,

y(1) = 0

Here, p(x) = 1.0, q = 0 and ρ(x) = 32 (1 − x 2 ) is the velocity profile. The first few eigen-
values are listed in Table 21.1.

Table 21.1: First few eigenvalues of Gratz–Nusselt EVP.

n λn
1 1.88517
2 21.4315
3 62.3166
4 124.537
5 208.091

We note that the Graetz eigenfunctions satisfy the orthogonality relation based on
weighted inner product
1

∫ ρ(x)yn (x)ym (x) dx = 0, m ≠ n.


0

A plot of these functions with the normalization condition yn (0) = 1 is shown in Fig-
ure 21.1.

Example 21.2. Consider the Airy EVP

y′′ = −λxy, 0 < x < 1


y(0) = y(1) = 0.

Here, p = 1, q = 0 and ρ = x. The first few eigenvalues are listed in Table 21.2.
502 | 21 Sturm–Liouville theory and eigenfunction expansions

Figure 21.1: First five Graetz–Nusselt eigenfunctions yn (x) satisfying yn (0) = 1.

Table 21.2: First few eigenvalues of Airy EVP.

n λn

1 18.9563
2 81.8866
3 189.221
4 340.967
5 537.126

The eigenfunctions satisfy the orthogonality condition

∫ xyn (x)ym (x) dx = 0, m ≠ n.


0

Figure 21.2 shows first five of these eigenfunctions with constraint yn′ (0) = 1.

Figure 21.2: First five Airy eigenfunctions yn (x) satisfying yn′ (0) = 1.
21.2 Eigenfunction expansions | 503

21.2 Eigenfunction expansions


We have shown that the self-adjoint BVP

(p(x)y′ ) − q(x)y = −λρ(x)y, a<x<b (21.30)


y(a) = 0, y(b) = 0 (21.31)

has an infinite number of eigenvalues {λi } and an orthonormal set of eigenfunctions


{yi (x)}

⟨yi , yj ⟩ = ∫ ρ(x)yi (x)yj (x) dx


a

1 if i = j
= δij = { (21.32)
0 otherwise

We now consider the expansion of any arbitrary function f (x) in terms of the eigen-
functions {yi (x)}. We write


f (x) = ∑ ci yi (x) (21.33)
i=1

In order to determine ci , multiply both sides of equation (21.33) by ρ(x)yj (x) and inte-
grate from a to b 󳨐⇒

b b ∞
∫ f (x)yj (x)ρ(x) dx = ∫ ρ(x)yj (x)(∑ ci yi (x)) dx
a a i=1

Assuming that the summation and integration can be interchanged (this is so if the
series in (21.33) converges uniformly), we get

b ∞ b

∫ f (x)yj (x)ρ(x) dx = (∑ ci ∫ ρ(x)yi (x)yj (x) dx) dx


a i=1 a

= cj [using equation (21.32)]

Thus,

cj = ∫ f (x)yj (x)ρ(x) dx = ⟨f , yi ⟩ (21.34)


a

and
504 | 21 Sturm–Liouville theory and eigenfunction expansions


f (x) = ∑⟨f , yi ⟩yi (21.35)
i=1

Definition. Let {yi (x)} be an infinite system of orthonormal functions relative to the
weight function ρ(x) on an interval [a, b]. If f (x) is any function for which the integrals
in equation (21.34) exist, the infinite series in equation (21.33) is called the eigenfunc-
tion expansion or Fourier series of f (x) relative to the system {yi (x), ρ(x)}. The coeffi-
cients ci are called Fourier coefficients of f (x) relative to {yi (x)}.

Remarks.
1. Historically, the term Fourier series is used when the orthonormal functions {yi (x)}
are the sine functions, cosine functions, or sine and cosine functions. Each of
these functions are generated from the following self-adjoint problems:
(S)

y′′ = −λy, 0<x<1


y(0) = y(1) = 0 (Sines)

with eigenvalues and eigenfunctions,

λ n = n2 π 2 , yn (x) = √2 sin[nπx], n = 1, 2, . . .

(C)

y′′ = −λy, 0<x<1


y (0) = 0,

y (1) = 0

(Cosines)

with eigenvalues and eigenfunctions,

λ0 = 0, y1 (x) = 1, λ n = n2 π 2 , yn (x) = √2 cos[nπx], n = 1, 2, . . .

(P)

y′′ = −λy, 0<x<1


y(0) = y(1), y′ (0) = y′ (1) (Sines and Cosines)

with eigenvalues and eigenfunctions,

λ0 = 0, y0 (x) = 1, λn = n2 π 2 , yns (x) = √2 sin[nπx],


ync (x) = √2 cos[nπx], n = 1, 2, . . .

2. If the eigenfunctions are not normalized, then the expansion (equation (21.35)) is
of the form
21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 505


⟨f , yi ⟩
f (x) = ∑ ci yi (x), ci = . (21.36)
i=1
⟨yi , yi ⟩

3. When the functions {yi (x)} are not sines or cosines, the expansion (21.33) is called
eigenfunction expansion, or generalized Fourier series.

21.3 Convergence in function spaces and introduction to Banach


and Hilbert spaces
This section is a brief introduction to orthogonal expansions in infinite-dimensional
vector spaces and convergence of such expansions.

21.3.1 Cauchy sequence

Definition. A sequence {Sn } contained in a normed linear space is called a Cauchy


sequence if given ε > 0, ∃ an N such that

‖Sn − Sm ‖ < ε, ∀n, m > N.

From this definition, it is seen that any convergent sequence is necessarily a Cauchy
sequence but the converse is not true.

Definition. A normed linear space V is said to be complete if every Cauchy sequence


in the space converges to some point in V.

Example 21.3.
(i) Let V = set of rational numbers in [0, 1] and for x, y ∈ V, define the metric/distance
function d(x, y) = |x − y|. This space is not complete. The reason being sequence
n n
such as Sn = ( n+1 ) , converge but not to a point in V. In this case, limn→∞ Sn is e1 ,
which is irrational.
(ii) Let V = ℝ and d(x, y) = |x − y|. This space is complete.
(iii) Let V = C[a, b], and

󵄨 󵄨
d∞ (f , g) = sup 󵄨󵄨󵄨f (x) − g(x)󵄨󵄨󵄨.
a≤x≤b

This space is complete w. r. t. the supremum norm as the convergence is uniform.


(iv) let V = C[a, b], and

b
2
d2 (f , g) = ∫[f (x) − g(x)] dx.
a
506 | 21 Sturm–Liouville theory and eigenfunction expansions

This space is not complete. For example, sequences such as {e−nx }, {tanh[nx]} ∈ C[a, b]
for finite n, but for n → ∞, the limiting functions are not in C[a, b].
Metric spaces may be completed by appending the missing element to the space.
Thus, to the rational numbers, we add all the limits of convergent sequences, i. e., we
add all the irrationals, to get the real number system, which is complete.
Similarly, to the space C[a, b] with the metric d2 , we add the limits of convergent
sequences, we get the space ℒ2 [a, b], the space of all Lebesque square integrable func-
tions on [a, b].

21.3.2 Riemann and Lebesque integration

Recall the definition of the Riemann integral for a continuous or piecewise continu-
ous function f (x) over an interval [a, b]. We divide the interval [a, b] into n subinter-
vals and form the upper and lower Riemann sums. If these sums converge to the same
value when n → ∞ and the largest size of the subinterval goes to zero, then the Rie-
mann integral exists. Now, consider the Dirichlet function of Section 10.1, for which
the Riemann integral does not exist. However, in the Lebesque theory of integration,
we ignore sets of measure zero (also referred to as null sets) in the integration process.
Thus, the Lebesque integral exists for the Dirichlet function. The set of all Lebesque in-
tegrable functions is denoted by ℒ[a, b], while the set of Riemann integrable function
is defined by R[a, b]. It is clear that C[a, b] ⊂ R[a, b] ⊂ ℒ[a, b].

21.3.3 Banach and Hilbert spaces

Definition. A normed linear space that is complete is called a Banach space.

Example 21.4.
(i) {C[a, b], d∞ } is a Banach space
(ii) {C[a, b], d2 } is not a Banach space
(iii) {ℝn , dp , 1 ≤ p < ∞} is a Banach space
(iv) {ℂn , dp , 1 ≤ p < ∞} is a Banach space

Definition. An inner-product space that is complete is called a Hilbert space.

Every Hilbert space is a Banach space but the converse is not true.

Example 21.5.
(i) {ℝn , ⟨u, v⟩ = ∑nj=1 uj vj } is a Hilbert space.
(ii) {ℝn , ⟨u, v⟩ = vT Gu, G is positive definite} is a Hilbert space.
(iii) {ℂn , ⟨u, v⟩ = ∑nj=1 uj vj } is a Hilbert space.
b
(iv) {C[a, b], ⟨f , g⟩ = ∫a f (x)g(x) dx} is not a Hilbert space.
21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 507

b
(v) {ℒ2 [a, b], ⟨f , g⟩ = ∫a ρ(x)f (x)g(x) dx, ρ(x) > 0} is a Hilbert space.
b
(vi) {ℒ2C [a, b], ⟨f , g⟩ = ∫a f (x)g(x) dx} is a Hilbert space of complex valued functions of
a real variable.

Note that in the Hilbert space of example (v) above, the eigenfunction expansion

f (x) = ∑ ⟨f , yn (x)⟩yn (x)
n=1

converges in the mean square sense w. r. t. the norm ‖ ⋅ ‖2 , i. e.,

b 2
N
lim ∫{f (x) − ∑ ⟨f , yn ⟩yn } ρ(x) dx = 0
N→∞
a n=1

and when we write f (x) ≗ g(x), the two functions may disagree on a set of points
having measure zero.

Definition. Let V be a normed linear space and U be a subset of V. Then we say that
U is a dense subset of V if given x ∈ V, ∃x0 ∈ U such that

‖x − x0 ‖ < ε for every ε > 0.

Example 21.6.
(i) The set of rational numbers is dense in the real line. Every irrational number can
be approximated as closely as required by a rational number.
(ii) The set of polynomials is dense in C[a, b] with the metric
󵄨 󵄨
d∞ (f , g) = sup 󵄨󵄨󵄨f (x) − g(x)󵄨󵄨󵄨.
a≤x≤b

This is the well-known Weirstrass theorem.


Similarly, C[a, b] is dense in R[a, b] and R[a, b] is dense in ℒ[a, b]. Thus, C[a, b] is
dense in ℒ[a, b].

21.3.4 Convergence theorems for eigenfunction expansions

Theorem 21.1. Let f (x) be defined and continuous with two continuous derivatives on
[a, b] and f (x) satisfies the same boundary conditions as the eigenfunctions {ρ(x); ϕn (x),
n = 1, 2, . . .}, then the eigenfunction expansion

f (x) = ∑ cn ϕn (x) (21.37)
n=1

converges uniformly to f (x) on [a, b].


508 | 21 Sturm–Liouville theory and eigenfunction expansions

Theorem 21.2. Let f (x) be piecewise smooth on [a, b]. Then, for each x in [a, b], the
eigenfunction expansion converges, and

f (x + ) + f (x − ) ∞
= ∑ cn ϕn (x), a<x<b (21.38)
2 n=1

Mean square convergence


A sequence of functions
n
Sn (x) = ∑ cj ϕj (x)
j=1

is said to converge to f (x) in mean square relative to ρ(x) on [a, b] provided

b
2
lim ∫[f (x) − Sn (x)] ρ(x) dx = 0
n→∞
a

21.3.5 Fourier series (eigenfunction expansions) and Parseval’s theorem

Let {ρ(x); ϕn (x), n = 1, 2, . . .} be the orthonormal set of eigenfunctions of a S–L problem.


Let f (x) ∈ ℒ2 [a, b] and

f (x) = ∑ cn ϕn (x) (21.39)
n=1
b

cn = ⟨f , ϕn ⟩ = ∫ f (x)ϕn (x)ρ(x) dx. (21.40)


a

Then
b n
‖f ‖2 = ∫ ρ(x)f (x)2 dx = ∑ cn2 . (21.41)
a n=1

Equation (21.41) is a generalization of the orthogonal expansion from finite to infinite


dimensions, and is known as Parseval’s relation.
If the function f (x) is such that ‖f ‖ = 1, then Parseval’s relation may be used to
determine the number of terms needed in equation (21.41) so that

n
(‖f ‖2 − ∑ cj2 ) < ε (21.42)
j=1

where ε > 0 is the desired accuracy.


21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 509

21.3.6 Example of Fourier series (eigenfunction expansions)

Example 21.7. Consider the function f (x) = x(1 − x) and EVP

u′′ = −λu, 0<x<1


u(0) = u(1) = 0

λn = n2 π 2 , un (x) = √2 sin nπx, n = 1, 2, . . .

Thus, the function f (x) can be expanded as


f (x) = ∑ cj uj (x)
j=1

where

cj = ⟨f , uj ⟩ = ∫ f (ξ )√2 sin(jπξ ) dξ
0
1

= ∫ x(1 − x)√2 sin jπx dx


0

2√2 0 if j is even
= (1 − cos jπ) = { 4√2
j3 π 3 j3 π 3
if j is odd

Thus, the Fourier series expansion of x(1 − x) can be expressed as

8 sin 3πx sin 5πx


x(1 − x) = [sin πx + + + ⋅ ⋅ ⋅]
π3 27 125
8 ∞ sin(2k + 1)πx
= 3 ∑
π k=0 (2k + 1)3

8 N−1 sin(2k + 1)πx


= ∑ + RN (x) (21.43)
π 3 k=0 (2k + 1)3

where RN is the remainder term

8 ∞ sin(2k + 1)πx
RN (x) = ∑ .
π 3 k=N (2k + 1)3

Note that
510 | 21 Sturm–Liouville theory and eigenfunction expansions

󵄨󵄨 󵄨
󵄨 󵄨󵄨 8 sin(2k + 1)πx 󵄨󵄨󵄨󵄨

󵄨󵄨
󵄨󵄨RN (x)󵄨󵄨󵄨 = 󵄨󵄨󵄨 3 ∑ 3
󵄨󵄨
󵄨󵄨 π
󵄨 k=N (2k + 1)
󵄨󵄨
󵄨
8 ∞ 1
≤ 3 ∑
π k=N (2k + 1)3
󵄨∞ 󵄨
8 󵄨󵄨󵄨󵄨 dN 󵄨󵄨󵄨󵄨
< 󵄨 ∫ 󵄨
π 3 󵄨󵄨󵄨󵄨 (2N + 1)3 󵄨󵄨󵄨󵄨
N−1
󵄨󵄨 󵄨 8 1
∴ 󵄨󵄨RN (x)󵄨󵄨󵄨 < 3
π 4(2N − 1)2

For N = 3 󳨐⇒

󵄨󵄨 󵄨
󵄨󵄨RN (x)󵄨󵄨󵄨 < 0.00258

Thus,

8 sin 3πx sin 5πx


S3 (x) = [sin πx + + ]
π 3 27 125

approximates f (x) = x(1 − x) within an accuracy of 0.0026 for all x. Figure 21.3 shows
plots of exact function f (x) along with Fourier series expansion using first few terms.

Figure 21.3: Plot of function f (x) = x(1 − x) and its representation with Fourier series expansion using
only first term and first two terms.

In this case, the function f (x) is twice differentiable and satisfies the same BCs as the
eigenfunctions. Thus, the convergence is uniform.

Example 21.8. Consider the function


π
1, 0≤x≤ 2
f (x) = { π
0, 2
< x ≤ π.
21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 511

Let us expand this function in terms of the eigenfunctions of the S–L problem

y′′ = −λy, 0≤x≤π


y (0) = 0;

y (π) = 0.

The eigenvalues are

0, n=0
λn = {
n2 , n = 1, 2, 3, . . .

and the normalized eigenfunctions are


1
{ √π , n=0
yn (x) = {
2
{ √ π cos nx, n = 1, 2, 3, . . . .

Thus, the Fourier series expansion of f (x) can be expressed as



f (x) = ∑ cj yj (x); cj = ⟨f , yj ⟩
j=0

󳨐⇒
π
1 1 π √π
c0 = ⟨y0 , f ⟩ = ∫ f (x) dx = ⋅ =
√π √π 2 2
0

and
π/2 jπ
2 2 sin( 2 )
cj = ⟨yj , f ⟩ = √ ∫ cos jx dx = √
π π j
0

Thus,

√π
c0 =
2
2
c2k = 0 and c2k+1 = (−1)k √ , k = 0, 1, 2, 3
(2k + 1)2 π

󳨐⇒

√π 1 ∞
2 (−1)k 2
f (x) = ⋅ +∑√ ⋅ √ cos(2k + 1)x
2 √π k=0 π (2k + 1) π
1 2 ∞ (−1)k
= + ∑ cos(2k + 1)x. (21.44)
2 π k=0 (2k + 1)
512 | 21 Sturm–Liouville theory and eigenfunction expansions

Figure 21.4 shows the plot of this function and representation from Fourier series ex-
pansion with one, three, five and hundred terms. Note the Gibb’s phenomena of over
and undershoot at the point of discontinuity. In this example, the function f (x) is not
differentiable at x = π2 . Hence, the eigenfunction expansion converges in the mean
square norm.

Figure 21.4: Representation with Fourier series expansion using only first 1, 3, 5 and 100 terms,
demonstrating the Gibb’s phenomena of over and undershoot at the point of discontinuities.

Parseval’s relation gives

∞ b

∑ ck2 = ∫ f (x)2 ρ(x) dx


k=0 a

󳨐⇒
π/2
π ∞ 2 1 π
+∑ ⋅ = ∫ dx =
4 k=0 π (2k + 1)2 2
0

󳨐⇒

2 ∞ 1 π
∑ =
π k=0 (2k + 1)2 4

or

1 π2
∑ = (21.45)
k=0
(2k + 1) 2 8

If we set x = 0 on both sides of equation (21.44), we get

1 2 ∞ (−1)k
1= + ∑
2 π k=0 (2k + 1)
21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 513

󳨐⇒

π ∞
(−1)k 1 1 1 1
= ∑ = 1 − + − + − ⋅⋅⋅ (21.46)
4 k=0 (2k + 1) 3 5 7 9

Many such relations (equations (21.45) and (21.46)) can be derived using eigenfunction
expansions.

Example 21.9 (Eigenfunction expansion in two variables). Consider the function


f (x) = x(1 − x)y(1 − y) and EVP

𝜕2 u 𝜕2 u
+ = −λu, 0 < x < 1, 0<y<1
𝜕x 2 𝜕y2
u(0, y) = u(1, y) = u(x, 0) = u(x, 1) = 0

λn = (n2 + m2 )π 2 , unm (x) = 2 sin nπx sin mπy, n = 1, 2, . . . , m = 1, 2, . . .

Thus, the function f (x) can be expanded as


∞ ∞
f (x, y) = ∑ ∑ cij uij (x, y)
i=1 j=1

where
1

cij = ⟨f , uij ⟩ = ∫ f (x ′ , y′ )2 sin(iπx ′ ) sin(jπy′ ) dx′ dy′


0
1 1

= (∫ x ′ (1 − x′ )√2 sin iπx ′ dx′ ).(∫ y′ (1 − y′ )√2 sin jπy′ dy′ )


0 0

2√2 2 √2 0 if i or j is even
= [ 3 3 (1 − cos iπ)].[ 3 3 (1 − cos jπ)] = { 32
i π j π i3 j3 π 6
if i and j are odd

Thus, the Fourier series expansion of x(1 − x)y(1 − y) can be expressed as

x(1 − x)y(1 − y)
64 ∞ ∞ sin(2k + 1)πx sin(2l + 1)πy
= ∑∑
π 6 k=0 l=0 (2k + 1)3 (2l + 1)3

64 ∞ sin(2k + 1)πx ∞
sin(2l + 1)πy
= [ ∑ 3
][ ∑ ]
6
π k=0 (2k + 1) l=0
(2l + 1)3
64 sin 3πx sin 5πx sin 3πy sin 5πy
= [sin πx + + + ⋅ ⋅ ⋅][sin πy + + + ⋅ ⋅ ⋅] (21.47)
π6 27 125 27 125
514 | 21 Sturm–Liouville theory and eigenfunction expansions

Figure 21.5 shows the plots of exact function f (x, y) along with 2D Fourier series ex-
pansion using first few terms. The convergence here is uniform.

Figure 21.5: Plot of function f (x, y) = x(1 − x)y(1 − y) and its representation with Fourier series
expansion using 1 × 1 and 2 × 2 terms.

21.3.7 Fourier series (eigenfunction expansion) of the Green’s function

Consider the S–L problem

(p(x)y′ ) − q(x)y = −λρ(x)y, a<x<b (21.48)


y(a) = 0, y(b) = 0 (21.49)

Let {λi }, {yi (x)}, i = 1, 2, 3, . . . be the eigenvalues and normalized eigenfunctions, respec-
tively. Let G(x, s) be the Green’s function. We have seen that equations (21.48)–(21.49)
may be written as

y(x) = λ ∫ G(x, s)ρ(s)y(s) ds (21.50)


a

Thus,
21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 515

b
yi (x)
= ∫ G(x, s)ρ(s)yi (s) ds (21.51)
λi
a

Now, consider the expansion of G(x, s) (considered as a function of s) in terms of the


eigenfunctions {yi (s)}


G(x, s) = ∑ ci yi (s)
i=1


b
yi (x)
ci = ⟨G(x, s), yi (s)⟩ = ∫ ρ(s)G(x, s)yi (s) ds =
λi
a

Thus, we obtain Mercer’s expansion



yi (x)yi (s)
G(x, s) = ∑ (21.52)
i=1
λi

Now consider parseval’s equation

∞ b

∑ ci2 = ∫ ρ(s)f (s)2 ds


i=1 a

Let f (s) = G(x, s)


󳨐⇒
b

yi2 (x)
∑ = ∫ G(x, s)2 ρ(s) ds
i=1 λi2
a

Multiply both sides by ρ(x) and integrate from a to b and use the relation:

∫ ρ(x)yi2 (x) dx = 1
a

󳨐⇒
b b

1
∑ = ∫[∫ G(x, s)2 ρ(s) ds]ρ(x) dx (21.53)
i=1 λi2
a a

Since G(x, s) is continuous, integral on the RHS of equation (21.53) is finite. Therefore,
1
∑∞i=1 λ2 converges:
i
516 | 21 Sturm–Liouville theory and eigenfunction expansions

󳨐⇒
1
→0 for i → ∞ 󳨐⇒ λi2 → ∞ as i → ∞
λi2

Example 21.10.

y′′ = −λy, 0<x<1


y(0) = y(1) = 0
s(1 − x) 0<s<x
G(x, s) = {
x(1 − s) x<s<1


1 1

∫[∫ G(x, s)2 ρ(s) ds]ρ(x) dx


0 0
1 1

= ∫[∫ G(x, s)2 ds] dx


0 0
1 x 1

= ∫[∫ s (1 − x) ds + ∫ x 2 (1 − s)2 ds] dx


2 2

0 0 x
1
x 3 (1 − x)2 x 2 (1 − x)3
= ∫[ + ] dx
3 3
0
1
1
= ∫ x 2 (1 − x)2 dx
3
0
1
=
90
Eigenvalues

λ n = n2 π 2 , n = 1, 2, 3, . . .

󳨐⇒
1 1

1
∑ 2 = ∫ ∫ G(x, s)2 ds dx
λ
i=1 i 0 0

1 1
∑ 4 4 =
n=1 n π 90

1 π4
󳨐⇒ ∑ = .
n=1 n
4 90
21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 517

Problems
1. Consider the eigenvalue problem

y′′ = −λy, 0<x<1


y (0) = 0,

y (1) + Bi y(1) = 0 (Bi = Biot number),

which arises in solving the unsteady-state heat (mass) diffusion problem in a flat
plate.
(a) Show that the eigenvalue problem is self-adjoint.
(b) Determine the first eigenvalue as a function of the Biot number, i. e., compute
λ1 for Bi = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0. Iden-
tify the two asymptotes.
(c) Determine the eigenvalues and eigenfunctions for the two limiting cases of no
external resistance (Bi → ∞) and no internal resistance (Bi → 0).
2. Consider the eigenvalue problem

1 ′′
y − y′ = −λy, 0<x<1
Pe
1 ′
y (0) − y(0) = 0, y′ (1) = 0 (Pe = Peclet number),
Pe
which arises in solving unsteady-state diffusion–convection–reaction problems
with Danckwert’s boundary conditions
(a) Show that the substitutions

x Pe Pe2
y = w exp( ), Λ = λ Pe −
2 4

transform the eigenvalue problem to a self-adjoint form

Pe Pe
w′′ = −Λw; w′ (0) − w(0) = 0, w′ (1) + w(1) = 0
2 2
(b) Determine the first eigenvalue as a function of the Peclet number, i. e., com-
pute λ1 for Pe = 0.01, 0.1, 1, 10, 100. Identify the two asymptotes.
(c) Determine and sketch the first eigenfunction for different values of the Peclet
number.
3. Consider the Schrodinger equation in one spatial dimension

h2 d 2 ψ
− + V(x)ψ(x) = Eψ(x)
8π 2 m dx2

where ψ(x) is the wave function. E is the total energy of the particle (electron) and
V(x) is the potential energy of the particle at position x. For a square well potential,
V(x) is zero for 0 < x < L and is infinity outside this region. The appropriate
boundary conditions for the wave function are
518 | 21 Sturm–Liouville theory and eigenfunction expansions

ψ(0) = ψ(L) = 0

(a) Determine the permitted energy levels for the particle. What is the separation
between neighboring quantum levels?
(b) Sketch the first five wave functions.
4. The following fourth-order eigenvalue problem arises in the stability analysis of a
fluid filled porous medium confined between two parallel plates kept at different
temperatures (Lapwood convection):

ϕ′′ − k 2 ϕ + ψ − λϕ = 0
ψ′′ − k 2 ψ + Ra k 2 ϕ = 0
ϕ(0) = ϕ(1) = 0
ψ(0) = ψ(1) = 0

Here, k is the wave number and Ra is the Rayleigh number. Show that the eigen-
values are real.
5. Consider the eigenvalue problem

y′′ + λy = 0, 0 < x < π; y′ (0) = 0, y′ (π) = 0

(a) Determine the eigenvalues and corresponding orthonormal set of eigenfunc-


tions.
(b) Suppose that f (x) has continuous first and second derivatives in [0, π] and sat-
isfies the boundary conditions. Show that the eigenfunction expansion con-
verges to f (x) uniformly on [0, π].
(c) Determine the expansion of the function f (x) = x 2 (π − x)2 .
6. Eigenfunctions satisfying four boundary conditions appear frequently in the so-
lution of many transport and reaction problems.
(a) Find an orthonormal set of eigenfunctions, which vanish along with their first
derivative at the end points of the interval [−1, 1]. Sketch the first three eigen-
functions.
(b) Determine formulas for the expansion of an arbitrary function f (x) in terms
of the above eigenfunctions.
7. Consider the eigenvalue problem

y′′ = −λxy, y(0) = 0, y(1) = 0

(a) Compute an approximate first eigenvalue using the Rayleigh quotient.


(b) Apply the Rayleigh method to the inverse operator G to show that

⟨y, y⟩
λ1 = Min ,
⟨Gy, y⟩
21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 519

and hence compute a better approximation to λ1 .


(c) This equation may be transformed into a Bessel’s equation and eigenvalues
obtained exactly. Show that the eigenvalues are the roots of the equation

2
J1/3 ( √λ) = 0
3

Find the eigenvalue from appropriate tables (Abromowitz and Stegun [2]).
8. (a) Let C 1 [a, b] be the space of continuously differentiable functions on [a, b] with
the norm

󵄩󵄩 󵄩 󵄨 󵄨 󵄨 ′ 󵄨
󵄩󵄩f (x)󵄩󵄩󵄩 = Sup{󵄨󵄨󵄨f (x)󵄨󵄨󵄨 + 󵄨󵄨󵄨f (x)󵄨󵄨󵄨}; a≤x≤b

Is the normed space complete (Banach space)?


(b) Repeat (a) for the norm

b
󵄩󵄩 󵄩 󵄨 󵄨2 󵄨 󵄨2
󵄩󵄩f (x)󵄩󵄩󵄩 = √∫(󵄨󵄨󵄨f (x)󵄨󵄨󵄨 + 󵄨󵄨󵄨f ′ (x)󵄨󵄨󵄨 ) dx
a

(c) Consider the following inner product in C 1 [a, b]:

⟨f , g⟩ = ∫[f (x)g(x) + f ′ (x)g ′ (x)] dx


a

Identify the missing elements that need to be added to C 1 [a, b] to make it a Hilbert
space (the resulting Hilbert space is called a Sobolev space).
22 Introduction to the solution of linear integral
equations
22.1 Introduction
An integral equation (IE) is an equation in which the unknown function appears under
one or more integrals. IEs appear in applications that deal with particulate processes
or population balances. In addition, as we have seen in previous chapters, the solution
of many initial and boundary value problems may be expressed in terms of integral
equations. This chapter is a brief introduction to the theory of linear integral equations
in one dependent and one independent variable.
The general form of a linear IE is

h(x)u(x) = f (x) + λ ∫ K(x, s)u(s) ds, (22.1)


a

where u(x) is the unknown function; h(x), f (x) and K(x, s) are known functions and
λ(≠ 0) is a real or complex parameter. If f (x) = 0, the IE is called homogeneous. If the
upper limit of integration is fixed (e. g., x = b), then it is called a Fredholm equation.
Otherwise, it is called a Volterra equation. If the unknown function u(x) appears only
under the integral sign (e. g., h(x) = 0), it is called the integral equation of the first
kind. If it appears both inside and outside of the integral sign, it is called the integral
equation of the second kind.

Examples.
1.

f (x) = ∫ K(x, s)u(s) ds (22.2)


a

is a Fredholm equation of the first kind.


2.

u(x) = f (x) + λ ∫ K(x, s)u(s) ds (22.3)


a

is a Fredholm equation of the second kind.


3.
x

f (x) = ∫ K(x, s)u(s) ds (22.4)


a

https://doi.org/10.1515/9783110739701-023
22.2 Transformation of an IVP into an IE of Volterra type | 521

is a Volterra equation of the first kind.


4.
x

u(x) = f (x) + λ ∫ K(x, s)u(s) ds (22.5)


a

is a Volterra equation of the second kind.

In these equations, K(x, s) is called the kernel. When one or both limits of integration
become infinity or when the kernel becomes infinity within the range of integration,
the IE is called a singular integral equation.
For example, the Laplace transform defined by


̂ (s) = ∫ e−st u(t) dt
u (22.6)
0

is a singular IE of the first kind with kernel K(t, s) = e−st . The Fourier transform defined
by


̂f (α) = ∫ e−iαx f (x) dx (22.7)
−∞

is also a singular integral of the first kind with kernel K(x, α) = e−iαx . The other types
of kernels that are of interest are
(i) Separable kernel: K(x, s) = ∑nj=1 aj (x)bj (s).
(ii) Symmetric kernel: K(x, s) = K(s, x).
(iii) Convolution kernel: K(x, s) = h(x − s).

In the next two sections, we review and outline the procedure for converting initial
and boundary value problems into integral equations.

22.2 Transformation of an IVP into an IE of Volterra type


Consider the second-order IVP

d2 u du
+ a1 (t) + a2 (t)u = g(t), t > 0 (22.8)
dt 2 dt
u(0) = α0 , u′ (0) = α1 (22.9)

and assume that the functions ai (t) are continuous. Define


522 | 22 Introduction to the solution of linear integral equations

d2 u
= h(t) (22.10)
dt 2

and integrate once to obtain

t
du
= α1 + ∫ h(s) ds (22.11)
dt
0

Integrating again gives

t t′

u(t) = α0 + α1 t + ∫ ∫ h(s) ds dt ′ . (22.12)


0 0

Changing the order of integration in the double integral (in equation (22.12)) and sim-
plifying gives

u(t) = α0 + α1 t + ∫(t − s)h(s) ds (22.13)


0

dj u
Now, multiplying dt j
by an−j (t) and summing from j = 0 to 2 with a0 (t) = 1 gives

h(t) = f (t) + ∫ K(t, s)h(s) ds (22.14)


0

where

f (t) = g(t) − α1 a1 (t) − (α0 + α1 t)a2 (t)


K(t, s) = −a1 (t) − (t − s)a2 (t).

Once h(t) is known, we can determine u(t) from equation (22.13). Thus, the IVP is re-
duced to a Volterra integral equation of the second kind.
The above procedure can be extended to the n-th order IVP. In this case, equation
(22.13) becomes

t
t n−1 (t − s)n−1
u(t) = α0 + α1 t + ⋅ ⋅ ⋅ + αn−1 +∫ h(s) ds (22.15)
(n − 1)! (n − 1)!
0

and equation (22.14) remains the same with

t n−1
f (t) = g(t) − αn−1 a1 (t) − ⋅ ⋅ ⋅ − [α0 + α1 t + ⋅ ⋅ ⋅ + αn−1 ]a (t) (22.16)
(n − 1)! n
22.3 Transformation of TPBVP into an IE of Fredholm type | 523

n
(t − s)j−1
K(t, s) = − ∑ a (t) (22.17)
j=1
(j − 1)! j

We note that for the special case of homogeneous initial conditions (αj = 0), f (t) =
g(t) and the kernel depends only on the coefficient functions aj (t).

22.3 Transformation of TPBVP into an IE of Fredholm type


We have already shown (in Chapter 19) that the n-th order TPBVP

Lu = −f (x), a<x<b (22.18)


Wa k(u(a)) + Wb k(u(b)) = 0 (22.19)

may be transformed to an integral equation of the form

u(x) = ∫ G(x, s)f (s) ds (22.20)


a

where G(x, s) is the Green’s function. We note that equation (22.20) is valid when f (x) is
replaced by a more general (and possibly nonlinear) source term of the form h(x, u(x)),
in which case equation (22.20) becomes a nonlinear IE of the form

u(x) = ∫ G(x, s)h(s, u(s)) ds (22.21)


a

Thus, two-point BVPs can be transformed into Fredholm integral equations with the
kernel being the Green’s function. We have also seen that for the special case in which
the homogeneous two-point BVP is self-adjoint, the Green’s function (kernel) is sym-
metric. The kernel can also be made symmetric for the more general case in which the
weight function in the inner product is not unity. For example, the Sturm–Liouville
eigenvalue problem

d du
(p(x) ) − q(x)u(x) = −λρ(x)u(x), a<x<b (22.22)
dx dx
u(a) = 0, u(b) = 0 (22.23)

can be converted to an integral equation

u(x) = λ ∫ G(x, s)ρ(s)u(s) ds. (22.24)


a

However, the kernel G(x, s)ρ(s) is not symmetric when ρ(s) is not unity. By defining
524 | 22 Introduction to the solution of linear integral equations

ϕ(x) = √ρ(s)u(x), (22.25)

Equation (22.24) may be written as a homogeneous Fredholm equation with symmetric


kernel:
b

ϕ(x) = λ ∫ K(x, s)ϕ(s) ds (22.26)


a

where

K(x, s) = G(x, s)√ρ(x)√ρ(s). (22.27)

This is possible since the density function ρ(x) is strictly positive in (a, b).

22.4 Solution of Fredholm integral equations with separable


kernels
Consider the Fredholm IE of the second kind as given by

u(x) = f (x) + λ ∫ K(x, s)u(s) ds (22.28)


a

We consider the case of separable kernel as the solution procedure for this case may
be related to that of linear algebraic equations. Expressing the kernel as

N
K(x, s) = ∑ ai (x)bi (s). (22.29)
i=1

Without loss of generality, we assume that the functions ai (x) and bi (s) are linearly
independent. If they are not, we can combine the terms and reduce the number of
terms in the summation.

22.4.1 Homogeneous equation

We first consider the homogeneous Fredholm IE, i. e., equation (22.28) with f (x) = 0,
with separable kernel (equation (22.29)), which can be expressed as

b N
u(x) = λ ∫(∑ ai (x)bi (s))u(s) ds. (22.30)
a i=1

Interchanging the summation and integration gives


22.4 Solution of Fredholm integral equations with separable kernels | 525

N b

u(x) = λ ∑ ai (x)∫bi (s)u(s) ds. (22.31)


i=1 a

Let
b

ci = ∫ bi (s)u(s) ds (22.32)
a

then equation (22.31) implies

N
u(x) = λ ∑ cj aj (x). (22.33)
j=1

To determine ci , we substitute equation (22.33) in equation (22.32), which leads to

b N
ci = ∫ bi (s)λ ∑ cj aj (s) ds
a j=1

N b

= λ ∑(∫ bi (s)aj (s) ds)cj


j=1 a
N
⇒ ci = λ ∑ Aij cj (22.34)
j=1

where
b

Aij = ∫ aj (s)bi (s) ds, i = 1, 2, . . . , N and j = 1, 2, . . . , N (22.35)


a

Thus, equation (22.34) and equation (22.35) lead to

c = λAc

or the homogeneous linear algebraic equations

(I − λA)c = 0. (22.36)

Let

D(λ) = |I − λA| = det(I − λA). (22.37)

If D(λ) ≠ 0, then the only solution to equation (22.36) is the trivial one, i. e., c = 0,
which implies u(x) ≡ 0 is the only solution to the homogeneous equation (22.30). The
526 | 22 Introduction to the solution of linear integral equations

λ-values for which D(λ) = 0 are called the eigenvalues of the kernel. There are at most
n of them. The nontrivial solution c corresponding to an eigenvalue gives a nontrivial
u(x) = λ∑Nj=1 cj aj (x). These are the eigenfunctions of the kernel.

22.4.2 Inhomogeneous equation

Now, consider the inhomogeneous Fredholm IE (equation (22.28)) with separable ker-
nel (equation (22.29)), which can be expressed as

b N
u(x) = f (x) + λ ∫[∑ ai (x)bi (s)]u(s) ds (22.38)
a i=1


N
u(x) = f (x) + λ ∑ ai (x)ci (22.39)
i=1

where
b

ci = ∫ bi (s)u(s) ds
a
b N
= ∫ bi (s)[f (s) + λ ∑ aj (s)cj ] ds
a j=1

c = f + λAc (22.40)

where
b

fi = ∫ bi (s)f (s) ds, i = 1, 2, . . . , N


a

Equation (22.40) can be expressed as

(I − λA)c = f. (22.41)

If D(λ) ≠ 0, then equation (22.40) can be inverted as

c = (I − λA)−1 f


22.4 Solution of Fredholm integral equations with separable kernels | 527

1 N
ci = ∑ D (λ)fj (22.42)
D(λ) j=1 ij

where Dij (λ) = (i, j)th element of the classical adjoint of (I−λA), i. e., matrix of cofactors.
Substituting equation (22.42) into equation (22.39) gives

N
ai (x) N
u(x) = f (x) + λ ∑ (∑D (λ)fj )
i=1
D(λ) j=1 ij

N N b
a (x)
= f (x) + λ ∑ ∑ i Dij (λ) ∫ bj (s)f (s) ds
i=1 j=1
D(λ)
a


b

u(x) = f (x) + λ ∫ Γ(x, s, λ)f (s) ds (22.43)


a

where
N N ai (x)Dij (λ)bj (s)
Γ(x, s, λ) = ∑ ∑ (22.44)
i=1 j=1
D(λ)

is called the resolvent kernel. Thus, when the kernel is separable and D(λ) ≠ 0, the
solution of the Fredholm equation of the second kind is given by equations (22.43)
and (22.44). Further, it can be shown that the solution in this case is unique.

Example 22.1. Consider the equation

u(x) = f (x) + λ ∫(x + s)u(s) ds.


0

Here,

K(x, s) = (x + s)
= x.1 + 1.s
= a1 (x)b1 (s) + a2 (x)b2 (s)

Thus,

Aij = ∫ aj (s)bi (s) ds


a

leads to
528 | 22 Introduction to the solution of linear integral equations

1 1
1
A11 = ∫ s.1 ds = , A12 = ∫ 1.1 ds = 1,
2
0 0
1 1
1 1
A21 = ∫ s.s ds = , A22 = ∫ 1.s ds = ,
3 2
0 0


1 λ
2
1 1− 2
−λ
A=( 1 1
) ⇒ (I − λA) = ( )
3 2 − λ3 1− λ
2
λ2
⇒ D(λ) = 1 − λ −
12

λ
1 1− 2
λ f1
c= ( λ λ
)( )
1−λ− λ2 1− f2
12 3 2

c1 1 12λf2 + (12 − 6λ)f1


⇒( )= ( )
c2 12 − 12λ − λ2 4λf1 + (12 − 6λ)f2

where
1 1

f1 = ∫ f (s) ds and f2 = ∫ sf (s) ds


0 0

Thus, from equation (22.39),

u(x) = f (x) + λ(xc1 + c2 )


1

= f (x) + λ ∫ Γ(x, s, λ)f (s) ds


0

where
6(λ − 2)(x + s) − 12λxs − 4λ
Γ(x, s, λ) = .
λ2 + 12λ − 12

We note that D(λ) = 0 ⇒ λ1,2 = −6 ± 4√3. For these values of λ, the homogeneous
equation has nontrivial solutions. Hence, the above solution is valid only if λ ≠ λ1 , λ2 .
To determine the nontrivial solutions (or eigenfunctions) of the homogeneous
equation for λ = λ1 , λ2 , we have

(I − λ1 A)c = 0 ⇒ c1 = 1, c2 = √3
(I − λ2 A)c = 0 ⇒ c1 = 1, c 2 = − √3
22.4 Solution of Fredholm integral equations with separable kernels | 529


2
u1 (x) = λ1 ∑ cj aj (x) = λ1 (√3x + 1)
j=1

u2 (x) = λ2 (−√3x + 1)

are eigenfunctions (or any multiple of these).

Example 22.2. Consider the equation

u(x) = f (x) + λ ∫ (xs + x 2 s2 )u(s) ds.


−1

Here,

K(x, s) = xs + x 2 s2
= a1 (x)b1 (s) + a2 (x)b2 (s)


2
3
0
A=( 2 )
0 5
f1 f2
⇒ c1 = 2
, c2 =
1− 3
λ 1 − 52 λ

where
1 1

f1 = ∫ sf (s) ds and f2 = ∫ s2 f (s) ds


−1 −1


1

u(x) = f (x) + λ ∫ Γ(x, s, λ)f (s) ds


−1

where

sx s2 x 2
Γ(x, s, λ) = 2λ
+ 2λ
1− 3
1− 5

is the resolvent kernel. Thus, the solution is unique for any λ except for λ = 3/2 or
λ = 5/2. For λ = 3/2, u1 (x) = x or for λ = 5/2, u2 (x) = x 2 is a solution to the homogeneous
equation.
530 | 22 Introduction to the solution of linear integral equations

22.5 Solution procedure for Volterra integral equations of the


second kind
Recall that a Volterra integral equation (VIE) of the second kind is given by

u(t) = f (t) + λ ∫ K(t, s)u(s) ds, (22.45)


a

where K(t, s) is the kernel of the equation and f (t) is a continuous function and λ is a
parameter. There exist various methods for solving equation (22.45). We discuss here
two of these.

22.5.1 Method of successive approximation

Let u0 (t) be the initial guess, then VIE of the second kind (equation (22.45)) gives the
sequence u0 (t), u1 (t), . . . , un (t) that are expressed by

u1 (t) = f (t) + λ ∫ K(t, s)u0 (s) ds (22.46)


0
t

u2 (t) = f (t) + λ ∫ K(t, s)u1 (s) ds (22.47)


0

and so on. Thus, they satisfy the following recurrence relation:

un (t) = f (t) + λ ∫ K(t, s)un−1 (s) ds, n = 1, 2, . . . (22.48)


0

Let

u(t) = lim un (t)


n→∞

when it exists. In the so-called Picard’s method, u0 (t) = f (t), and the recurrence rela-
tion leads to

u0 (t) = f (t) (22.49)


t

u1 (t) = f (t) + λ ∫ K(t, s)f (s) ds (22.50)


0
22.5 Solution procedure for Volterra integral equations of the second kind | 531

t s

u2 (t) = f (t) + λ ∫ K(t, s)[f (s) + λ ∫ K(s, s′ )f (s′ ) ds′ ] ds


0 0
t t s

= f (t) + λ ∫ K(t, s)f (s) ds + λ2 ∫ ∫ K(t, s)K(s, s′ )f (s′ ) ds′ ds (22.51)


0 0 0

t s
2
u2 − u1 = λ ∫ ∫ K(t, s)K(s, s′ )f (s′ ) ds′ ds
0 0

which, after interchanging the order of integration and simplifying further, gives

t t

u2 − u1 = λ2 ∫ f (s′ )[∫ K(t, s)K(s, s′ ) ds] ds′


0 s′
t

= λ2 ∫ K2 (t, s′ )f (s′ ) ds′


0

where
t

K2 (t, s′ ) = ∫ K(t, s)K(s, s′ ) ds


s′

Similarly,

t t
3 ′ ′ ′
u3 − u2 = λ ∫ K3 (t, s )f (s ) ds ; ′
K3 (t, s ) = ∫ K(t, s)K2 (s, s′ ) ds
0 s′

and so on. Continuing sequentially, we get

t
n
un − un−1 = λ ∫ Kn (t, s′ )f (s′ ) ds′ ; (22.52)
0

with

Kn (t, s ) = ∫ K(t, s)Kn−1 (s, s′ ) ds,



and K1 (t, s) = K(t, s) (22.53)
s′

or
532 | 22 Introduction to the solution of linear integral equations

n t
i−1
un (t) = f (t) + λ ∑ λ ∫ Ki (t, s′ )f (s′ ) ds′ (22.54)
i=1 0

that leads to the Neumann series solution as


t

u(t) = lim un = f (t) + λ ∫ Γ(t, s′ , λ)f (s′ ) ds′ (22.55)


n→∞
0

where Γ(t, s′ , λ) is the resolvent kernel given by



Γ(t, s′ , λ) = ∑ λi−1 Ki (t, s′ ) (22.56)
i=1

with iterated kernels as defined in (22.53).


Example 22.3.
t

u(t) = f (t) + λ ∫ et−s u(s) ds.


0

Here,

K(t, s′ ) = et−s

t t

K2 (t, s′ ) = ∫ K(t, s)K(s, s′ ) ds = ∫ et−s es−s ds


s′ s′
t

= ∫ et−s ds = (t − s′ )et−s
′ ′

s′
t t

K3 (t, s ) = ∫ K(t, s)K2 (s, s ) ds = ∫ et−s (s − s′ )es−s ds



′ ′

s′ s′
t t
t−s′ t−s′ (t − s′ )2
∫(s − s′ ) ds = et−s


= ∫e (s − s ) ds = e
2!
s′ s′

Thus, the resolvent kernel is given by

(t − s′ )2
Γ(t, s′ , λ) = et−s + λ(t − s′ )et−s + λ2 et−s
′ ′ ′
+ ⋅⋅⋅
2!
λ2 (t − s′ )2
= et−s [1 + λ(t − s′ ) +

+ ⋅ ⋅ ⋅]
2!
= et−s eλ(t−s ) = exp[(1 + λ)(t − s′ )]
′ ′

The solution is given by


22.5 Solution procedure for Volterra integral equations of the second kind | 533

t

u(t) = f (t) + λ ∫ e(1+λ)(t−s ) f (s′ ) ds′ .
0

Theorem. Consider the VIE of the second kind

u(t) = f (t) + λ ∫ K(t, s)u(s) ds


a

where a is a constant and assume that:


1. K(t, s) is continuous in the rectangle R, for which a ≤ t ≤ b, a ≤ s ≤ b, |K(t, s)| ≤ M
in R and K(t, s) ≠ 0.
2. f (t) ≠ 0 is real and continuous in the interval I : a ≤ t ≤ b.
3. λ is a constant, then the integral equation has one and only one continuous solution
u(t) in I, and this solution is given by the absolutely and uniquely convergent series

∞ t

u(t) = f (t) + λ ∑ λn−1 ∫ Kn (t, s′ )f (s′ ) ds′ .


n=1 0

Remarks. (a) The above series is called the Neumann series. (b) When the kernel is
bounded, the Neumann series converges since repeated integration leads to terms of
′ i
the form (t−si! ) for the iterated kernel.

22.5.2 Adomian decomposition method

For VIE (equation (22.45)), the solution can be expressed in the form of

u(t) = ∑ un (t) (22.57)
n=0

where
t

u0 (t) = f (t), and un (t) = λ ∫ K(t, s)un−1 (s) ds ∀n ≥ 1. (22.58)


0

In this Adomian decomposition method, the kernel remains the same but we add
higher-order terms in λ and f (t) to the solution.

Example 22.4. Solve

u(t) = t + ∫(s − t)u(s) ds


0
534 | 22 Introduction to the solution of linear integral equations

Equation (22.58) with λ = 1 gives

u0 (t) = t
t
t3
u1 (t) = ∫(s − t)s ds = −
3!
0
t
s3 t5
u2 (t) = − ∫(s − t) ds =
3! 5!
0

and so on. The solution is given by

t t3 t5
u(t) = − + − ⋅ ⋅ ⋅ = sin t
1! 3! 5!

22.6 Solution procedure for Volterra integral equations of the first


kind
Consider the nonhomogeneous VIE of the first kind as

λ ∫ K(t, s)u(s) ds = f (t) (22.59)


0

There are two ways to convert equation (22.59) into a VIE of the second kind:
(i) by differentiation of equation (22.59)
(ii) by integrating equation (22.59) by parts.

22.6.1 Differentiation approach

Differentiating equation (22.59) gives

t
𝜕K
λK(t, t)u(t) + λ ∫ (t, s)u(s) ds = f ′ (t) (22.60)
𝜕t
0

If K(t, t) ≠ 0 and λ ≠ 0, we can write

t
𝜕K 1 f ′ (t)
u(t) + ∫ (t, s) u(s) ds = .
𝜕t K(t, t) λK(t, t)
0

Thus, defining
22.6 Solution procedure for Volterra integral equations of the first kind | 535

−1 𝜕K(t, s) f ′ (t)
K ∗ (t, s) = ; and f ∗ (t) = (22.61)
λK(t, t) 𝜕t λK(t, t)

we get

t

u(t) = f (t) + λ ∫ K ∗ (t, s)u(s) ds (22.62)
0

which is a VIE of the second kind and can be solved by methods discussed previously.

22.6.2 Integration approach

Defining

ϕ(t) = ∫ u(s) ds (22.63)


0

and integrating equation (22.59) by parts, we get

t
𝜕K(t, s)
f (t) = λ[K(t, t)ϕ(t) − ∫ ϕ(s) ds]
𝜕s
0

Assuming K(t, t) ≠ 0, we can rearrange the above equation as

t 𝜕K(t,s)
f (t)
ϕ(t) = + ∫ 𝜕s ϕ(s) ds
λK(t, t) K(t, t)
0

Thus, defining

𝜕K(t,s)
̂f (t) = f (t)
; and K(t,
̂ s) = 𝜕s
(22.64)
λK(t, t) K(t, t)

we get

ϕ(t) = ̂f (t) + ∫ K(t,


̂ s)ϕ(s) ds, (22.65)
0

which is a VIE of the second kind. Once ϕ(t) is known, u(t) can be obtained from the
relation u(t) = ϕ′ (t). Note that this method does not require the function f (t) to be
differentiable.
536 | 22 Introduction to the solution of linear integral equations

22.7 Volterra integral equations with convolution kernel


Consider a VIE of the second kind:
t

u(t) = f (t) + λ ∫ K(t, t ′ )u(t ′ ) dt ′ (22.66)


0

where

K(t, t ′ ) = g(t − t ′ ) (22.67)


t

u(t) = f (t) + λ ∫ g(t − t ′ )u(t ′ ) dt ′ . (22.68)


0

This equation can be solved by the Laplace transform method using the convolution
property of LT. Let

−st
ℒ[u(t)] = ∫ e u(t) dt = u
̂ (s) = L. T. of u(t). (22.69)
0


t
′ ′ ′
ℒ[∫ g(t − t )u(t ) dt ] = ℒ[g(t) ∗ u(t)] = ĝ (s)u
̂ (s).
0

Then taking LT, equation (22.68) gives

u
̂ (s) = ̂f (s) + λĝ (s)u
̂ (s)
̂f (s)
⇒u
̂ (s) = (22.70)
1 − λĝ (s)

Let

−1 1
ℒ [ ] = G(t) (22.71)
1 − λĝ (s)

t

u(t) = ∫ G(t − t ′ )f (t ′ ) dt ′ (22.72)


0

If we can expand
22.7 Volterra integral equations with convolution kernel | 537

1
= 1 + λĝ (s) + λ2 ĝ (s)2 + ⋅ ⋅ ⋅
1 − λĝ (s)

then

1
−1
ℒ [ ] = G(t) = δ(t) + λg1 (t) + λ2 g2 (t) + ⋅ ⋅ ⋅ ; (22.73)
1 − λĝ (s)
gi (t) = ℒ−1 [ĝ (s)i ], i = 1, 2, . . . (22.74)

u(t) = ∫ G(t − t ′ )f (t ′ ) dt ′
0
t

= ∫[δ(t − t ′ ) + λg1 (t − t ′ ) + λ2 g2 (t − t ′ )]f (t ′ ) dt ′


0
t

= f (t) + λ ∫ Γ(t, t ′ , λ)f (t ′ ) dt ′


0

where

Γ(t, t ′ , λ) = g1 (t − t ′ ) + λg2 (t − t ′ ) + ⋅ ⋅ ⋅

is the resolvent kernel.

Example 22.5. Solve Abel’s equation:

t t
u(s) u(t ′ )
f (t) = ∫ ds = ∫ dt ′ .
√t − s √t − t ′
0 0

The Laplace transform gives

̂f (s) = u π 1 π
̂ (s).√ , (∵ ℒ[ ]=√ )
s √t s

√ŝf (s) s π ̂
u
̂ (s) = = √ f (s)
√π π s
t
1 d −1 π ̂ 1 d f (t ′ )
⇒ u(t) = ℒ [√ f (s)] = [∫ dt ′ ]
π dt s π dt √t − t ′
0
538 | 22 Introduction to the solution of linear integral equations

22.8 Fredholm integral equations of the second kind


The Neumann series and the Adomian decomposition method can also be used to
solve Fredholm integral equations (FIE) as illustrated by the example below.

Diffusion–reaction problem
Consider a diffusion–reaction problem given by

d2 c
= ϕ2 R(c), 0 < x < 1 (22.75)
dx 2
c′ (0) = 0, and c(1) = 1 (22.76)

where c(x) is concentration, R(c) is rate of reaction and ϕ is the Thiele modulus. We
can express equations (22.75)–(22.76) as an integral equation by integrating twice and
changing the order of integration, which leads to the integral equation as

c(x) = 1 − ϕ2 ∫ K(x, s)R(c(s)) ds (22.77)


0

where

1 − x, 0<s<x
K(x, s) = { (22.78)
1 − s, x<s<1

For linear kinetics, i. e., R(c) = c, we can rewrite equation (22.77) as

c(x) = 1 − ϕ2 ∫ K(x, s)c(s) ds (22.79)


0

which is a FIE of the second kind.

22.8.1 Solution by successive substitution

FIE (equation (22.79)) can be written with equation (22.78) as

x 1

c(x) = 1 − ϕ2 ∫(1 − x)c(s) ds − ϕ2 ∫(1 − s)c(s) ds (22.80)


0 x

Thus, writing the recurrence relation as


22.8 Fredholm integral equations of the second kind | 539

x 1
2 2
cj (x) = 1 − ϕ ∫(1 − x)cj−1 (s) ds − ϕ ∫(1 − s)cj−1 (s) ds (22.81)
0 x

with initial guess of c0 (x) = 1, and defining the effectiveness factor as

η = ∫ c(s) ds (22.82)
0

we get

c0 (x) = 1 ⇒ η0 = 1
ϕ2 ϕ2
c1 (x) = 1 − (1 − x 2 ) ⇒ η1 = 1 −
2 3
2 4
ϕ ϕ
c2 (x) = 1 − (1 − x 2 ) + (1 − x 2 )(5 − x 2 )
2 24
ϕ2 2
⇒ η2 = 1 − + ϕ4
3 15

and so on.
The higher-order terms can be obtained following the sequence, where it can be
shown that the solutions for concentration profile and effectiveness factor are the Tay-
lor series expansion (in ϕ2 ) of the functions

cosh ϕx tanh ϕ
c = c∞ = ; and η = η∞ = . (22.83)
cosh ϕ ϕ

In this case, the solution converges for all values of ϕ2 , though the convergence may
be slow for ϕ2 > 1.

22.8.2 Solution by Adomian decomposition method

In this approach, we consider


∞ ∞
c0 (x) = 1 and c(x) = ∑ ci (x) ⇒ η = ∑ ηi (22.84)
i=0 i=0

Thus, substituting equation (22.84) in FIE (equation (22.79)), we get

(c0 + c1 + c2 + ⋅ ⋅ ⋅) = 1 − ϕ2 ∫ K(x, s)[c0 + c1 + c2 + ⋅ ⋅ ⋅] ds. (22.85)


0


540 | 22 Introduction to the solution of linear integral equations

1
2
c1 (t) = −ϕ ∫ K(x, s)c0 (s) ds (22.86)
0
1

c2 (t) = −ϕ2 ∫ K(x, s)c1 (s) ds (22.87)


0
1
2
c3 (t) = −ϕ ∫ K(x, s)c2 (s) ds, and so on. (22.88)
0

Thus, using the symbolic manipulation program Mathematica® , we can obtain

c0 (x) = 1 ⇒ η0 = 1 (22.89)
ϕ2 ϕ2
c1 (x) = − (1 − x 2 ) ⇒ η1 = − (22.90)
2 3
ϕ4 2ϕ4
c2 (x) = (5 − 6x 2 + x 4 ) ⇒ η2 = (22.91)
24 15
6
ϕ 17ϕ6
c3 (x) = − (61 − 75x2 + 15x 4 − x 6 ) ⇒ η3 = − , and so on (22.92)
720 315
Solving above recurrence relation sequentially, we can obtain the solution to any or-
der.

22.9 Fredholm integral equations with symmetric kernels


Consider a homogeneous Fredholm integral equation (i. e., f (x) = 0) given by

u(x) = λ ∫ K(x, s)u(s) ds (22.93)


a

with symmetric kernel, i. e.,

K(x, s) = K(s, x) (22.94)

The eigenvalues and eigenfunctions of the kernel are defined by

ϕn (x) = λn ∫ K(x, s)ϕn (s) ds, ϕn (x) ≠ 0 (22.95)


a

Theorem.
1. The eigenvalues of a symmetric kernel are real.
2. The eigenfunctions of a symmetric kernel corresponding to distinct eigenvalues are
orthogonal.
22.9 Fredholm integral equations with symmetric kernels | 541

3. The multiplicity of an eigenvalue is finite if kernel is symmetric and square inte-


grable.
4. If λj is repeated p times, there are p linearly independent eigenfunctions correspond-
ing to λj and these can be made orthogonal to each other.
5. The sequence of eigenfunctions of a symmetric kernel K(x, s) can be made orthonor-
mal, i. e.,

b
1, i=j
∫ ϕi (x)ϕj (x) dx = δij = {
0, i ≠ j.
a

6. The eigenvalues of a symmetric nonseparable integrable kernel form an infinite se-


quence with no finite limit point [if the kernel is separable, then eigenvalues form a
finite sequence]. If we include each eigenvalue in the sequence a number of times
equal to its multiplicity, then

∞ b b
1
∑ 2
≤ ∫ ∫ K(x, s)2 dx ds
λ
n=1 n a a

Equivalently, λ1 → 0 for n → ∞. When the equality sign holds, the kernel is said to
n
be closed.
7. The set of eigenvalues of the second iterated kernel

K2 (x, s) = ∫ K(x, s′ )K(s′ , s) ds′


a

are square of eigenvalues of the given kernel K(x, s).


8. If λ1 is the smallest eigenvalue of the kernel, then

1 󵄨 󵄨
= max󵄨󵄨󵄨⟨Kϕ, ϕ⟩󵄨󵄨󵄨, ‖ϕ‖ = 1
|λ1 |

and the maximum on the RHS is attained when ϕ(x) is an eigenfunction of the sym-
metric ℒ2 -kernel corresponding to the smallest eigenvalue.

Proofs of the various statements in the above theorem may be found in the book
by Courant and Hilbert [15].

Mercer’s theorem. If the kernel K(x, s) is symmetric and square integrable on the square
{(x, s) : a ≤ x ≤ b, a ≤ s ≤ b}, continuous and has only positive eigenvalues or at most a
finite number of negative eigenvalues, then the series

ϕn (x)ϕn (s)

n=1 λn
542 | 22 Introduction to the solution of linear integral equations

converges absolutely and uniformly, and


ϕn (x)ϕn (s)
K(x, s) = ∑ .
n=1 λn

22.10 Adjoint operator and Fredholm alternative


If we define an integral operator T : C[a, b] → C[a, b] by

Tu(x) = ∫ K(x, s)u(s) ds, (22.96)


a

it is easily seen that the adjoint operator with respect to the usual inner product

⟨u(x), v(x)⟩ = ∫ u(x)v(x) dx


a

is given by

b

T v(x) = ∫ K(s, x)v(s) ds. (22.97)
a

From this result, it follows that the adjoint homogeneous equation to

u(x) = λ ∫ K(x, s)u(s) ds (22.98)


a

is given by

v(x) = λ ∫ K(s, x)v(s) ds (22.99)


a

We state now an important theorem related to the existence and uniqueness of


solutions to Fredholm integral equations. A proof of this theorem may be found in the
book by Courant and Hilbert [15].

Theorem (Fredholm alternative). Either the integral equation

u(x) = f (x) + λ ∫ K(x, s)u(s) ds (22.100)


a
22.11 Solution of FIE of the second kind with symmetric kernels | 543

with fixed λ has one and only one solution u(x) for arbitrary ℒ2 -functions f (x) and K(x, s),
in particular, the solution u(x) ≡ 0 for f (x) = 0, or the homogeneous equation

u(x) = λ ∫ K(x, s)u(s) ds


a

possesses a finite number of r linearly independent solutions uhi (x), i = 1, 2, . . . , r. In the


first case, the adjoint homogeneous equation

v(x) = f (x) + λ ∫ K(s, x)v(s) ds


a

also has a unique solution. In the second case, the adjoint homogeneous equation

v(x) = λ ∫ K(s, x)v(s) ds


a

also has r linearly independent solutions vhi (x), i = 1, 2, . . . , r. The inhomogeneous equa-
tion has a solution if and only if the function f (x) satisfies

⟨f , vhi ⟩ = ∫ f (x)vhi (x) dx = 0, i = 1, 2, . . . , r.


a

In this case, the solution to equation (22.100) is determined only up to an additive linear
combination ∑ri=1 ci uhi (x); it may be determined uniquely by the additional requirements

⟨u, uhi ⟩ = 0, i = 1, 2, . . . , r.

[Remark: Compare this theorem with the version for algebraic equations discussed in
Section 4.5.2.]

22.11 Solution of FIE of the second kind with symmetric kernels


Consider the FIE of the second kind as
b

u(x) = f (x) + λ ∫ K(x, s)u(s) ds (22.101)


a

with symmetric kernel, i. e.,

K(x, s) = K(s, x) (22.102)


544 | 22 Introduction to the solution of linear integral equations

Here, the resolvent kernel is given by


ϕn (x)ϕn (s)
Γ(x, s, λ) = ∑ (22.103)
n=1 (λn − λ)

and

b

an ϕn (x)
u(x) = f (x) + λ ∑ ; an = ∫ ϕn (s)f (s) ds (22.104)
n=1 (λn − λ) a

where λn and ϕn are eigenvalues and eigenfunctions of the kernel K(x, s) as defined in
equation (22.95).
A solution exists and is unique if λ ≠ λn (n = 1, 2, . . .). If λ = λn for some n, a solution
may exist but it is not unique.
Let

∞ b

f (x) = ∑ an ϕn (x) ⇐⇒ an = ∫ ϕn (s)f (s) ds (22.105)


n=1 a

Write

u(x) = ∑ bn ϕn (x) (22.106)
n=1

This can be done since u(x) ∈ ℒ2 and eigenfunctions form a basis for ℒ2 . To deter-
mine bn , substitute equation (22.106) into equation (22.101) to get

∞ ∞ b ∞
∑ bn ϕn (x) = ∑ an ϕn (x) + λ ∫ K(x, s) ∑ bn ϕn (s) ds
n=1 n=1 a n=1

Multiply both sides by ϕj (x) and integrate and use the orthogonal property of the
eigenfunctions, which leads to

aj λj
bj = (22.107)
λj − λ


∞ aj λj
u(x) = ∑ ϕj (x) (22.108)
j=1
λj − λ

This solution exists only if λ ≠ λj , j = 1, 2, . . . .


22.11 Solution of FIE of the second kind with symmetric kernels | 545

Equation (22.108) implies

b
∞ λj ϕj (x)
u(x) = ∑ ∫ ϕj (s)f (s) ds
j=1
λj − λ
a
b ∞
λj ϕj (x)ϕj (s)
= ∫∑ f (s) ds
λj − λ
a j=1
b ∞
λ
= ∫ ∑(1 + )ϕ (x)ϕj (s)f (s) ds
λj − λ j
a j=1
b b ∞
∞ ϕj (x)ϕj (s)
= ∑ ϕj (x) ∫ ϕj (s)f (s) ds + λ ∫ ∑ f (s) ds (22.109)
j=1
λj − λ
a a j=1


b

u(x) = f (x) + λ ∫ Γ(x, s, λ)f (s) ds (22.110)


a

where
∞ ϕj (x)ϕj (s)
Γ(x, s, λ) = ∑ (22.111)
j=1
λj − λ

is the resolvent kernel.


When λ = λj for some j, then bj is indeterminate and equation (22.101) is consistent
iff aj = 0, i. e., f (x) is orthogonal to ϕj (x). In this case, the solution is not unique and
is of the form:

an λn
u(x) = cj ϕj (x) + ∑ ϕn (x) (22.112)
n=1 λn − λ
n=j̸

where cj is any arbitrary constant.

Example 22.6. The Fredholm equation

u(x) = x + 4π 2 ∫ K(x, s)u(s) ds


0

with

s(1 − x), 0≤s≤x


K(x, s) = {
x(1 − s), x≤s≤1
546 | 22 Introduction to the solution of linear integral equations

is not solvable since λ = 4π 2 is an eigenvalue of the kernel with eigenfunction


√2 sin(2πx) and

1
−1
∫ x.√2 sin(2πx) dx = ≠ 0.
√2π
0

Example 22.7. The Fredholm equation

u(x) = sin(3πx) + 4π 2 ∫ K(x, s)u(s) ds


0

with kernel K(x, s) same as in Example 22.6, is solvable with but the solution is not
unique. It may be shown that

9
u= sin(3πx) + c2 sin(2πx)
5

is a solution for any c2 .

Problems
1. Apply the IE method and the Adomian decomposition method to solve vector form
of one-dimensional diffusion–reaction model with linear kinetics.
2. Consider the Fredholm integral equation of the first kind

f (x) = ∫ K(x, s)u(s) ds (1)


a

with a degenerate kernel of the form

N
K(x, s) = ∑ ai (x)bi (s)
i=1

where {ai (x), i = 1, 2 . . . , N} and {bi (s), i = 1, 2, . . . , N} are linearly independent sets.
(a) Reason that the equation does not have solution unless the function f (x) can be
expressed as a linear combination of ai (x), (b) Reason that when equation (1) has
a solution, it is not unique, i. e., there could be infinitely many solutions, (c) Con-
sider equation (1) with a continuous kernel but not separable and continuous
f (x). Is the solution also continuous? Comment on the possible types of solutions,
(d) How do the results in (b) and (c) change if the kernel is also symmetric?
3. (a) Determine the eigenvalues, eigenfunctions and the resolvent kernel for the
Fredholm equation
22.11 Solution of FIE of the second kind with symmetric kernels | 547

u(x) = f (x) + λ ∫ [xs + x 2 s2 ]u(s) ds


−1

(b) Determine the solution.


4. Consider the axial dispersion model

1 d2 c dc
− − Da R(c) = 0; 0 < x < 1
Pe dx 2 dx
1 dc
= c − 1@x = 0
Pe dx
dc
= 0@x = 1
dx

where R(c) is the dimensionless reaction rate and Pe and Da are the Peclet and
Damkohler numbers, respectively. (a) Convert the boundary value problem into a
Fredholm integral equation and (b) Solve the equation in (a) for the case of linear
kinetics using the Neumann series method and determine the exit concentration
as a function of Da up to quadratic terms.
5. Consider the Volterra integral equation for human population N(t) at time t:

N(t) = N0 f (t) + k ∫ f (t − τ)N(τ)dτ


0

where f (t) is the survival function and k is a constant describing the rate of popu-
lation variation per capita [or the birth rate is k times N(t)] (a) Solve the equation
assuming a survival function of the form

t
f (t) = exp[− ]
T

where T is the average life span of a person (b) Use the result in (a) to show that
the population increases exponentially if kT > 1 and decreases exponentially if
kT < 1.
|
Part V: Fourier transforms and solution of boundary
and initial-boundary value problems
23 Finite Fourier transforms
Finite Fourier Transform (FFT) and its various extensions is the most important tool
available for scientists and engineers to solve many practical problems. The concepts
of FFT appear in the analysis of time series, spatial profiles, length and time scales,
data analysis and compression, development of numerical algorithms and so forth. In
this chapter, we discuss mainly one application of FFT, namely, the solution of linear
boundary and initial–boundary value problems (partial differential equations).

23.1 Definition and general properties


Let {λj , wj (x), ρ(x)}, j = 1, 2, 3, . . . be the eigenvalues and the corresponding eigenfunc-
tions of a self-adjoint eigenvalue problem in the interval [a, b]. Assume that the eigen-
functions are normalized so that
b

∫ ρ(x)wi (x)wj (x) dx = δij (23.1)


a

The eigenfunctions {wj (x)} form a basis for ℒ2 [a, b], the Hilbert space of Lebesque
square integrable (real valued) functions defined on the interval [a, b]. If f (x) ∈
ℒ2 [a, b], we have

∫ ρ(x)f (x)2 dx < ∞ (23.2)


a

and we can write



f (x) = ∑ ci wi (x) (23.3)
i=1

where
b

cj = ⟨f , wj ⟩ = ∫ ρ(x)f (x)wj (x) dx (23.4)


a

and

∞ b

∑ ci2 = ∫ ρ(x)f (x)2 dx = ‖f ‖2 (Parseval’s relation) (23.5)


j=1 a

and the integrals are all defined in the Lebesque sense. The expansion given by equa-
tion (23.3) converges in ℒ2 [a, b], i. e., the LHS and RHS of equation (23.3) are equal al-

https://doi.org/10.1515/9783110739701-024
552 | 23 Finite Fourier transforms

most everywhere (except perhaps on a set of measure zero in [a, b]). Equations (23.3)
and (23.4) define the finite Fourier transform, i. e., given any f (x) ∈ ℒ2 [a, b] we define
the Finite Fourier Transform (FFT) of f (x) to be the infinite sequence of constants {ci },
which give the coordinates of f (x) in the Hilbert space ℒ2 [a, b]. We write

ℱ {f (x)} = FFT of f (x)


= ⟨f , wi ⟩
= ci (23.6)

The inverse transform uses the coordinates {ci } to reconstruct the function (vector)
f (x). Thus,

ℱ {ci } = ∑ ci wi (x) = f (x) (23.7)
−1

i=1

Thus,

ℱℱ
−1
= ℱ −1 ℱ = identity. (23.8)

The finite Fourier transform may be used to simplify and solve many linear differential
equations in which the spatial self-adjoint operator 𝕃 (whose eigenvalues and eigen-
functions are λi and wi (x), respectively) appears. We outline below the general proce-
dure and illustrate with several examples.

23.1.1 Example 1 (solution of Poisson’s equation)

Consider the general form of Poisson’s equation:

𝕃u = f (23.9)

where 𝕃 = 𝕃∗ is a symmetric or self-adjoint operator. Let λn be eigenvalues of 𝕃 with


normalized eigenfunctions wn , i. e., 𝕃wn = λn wn . Then FFT of equation (23.9) gives

⟨𝕃u, wn ⟩ = ⟨f , wn ⟩ 󳨐⇒ ⟨u, 𝕃wn ⟩ = ⟨f , wn ⟩


󳨐⇒ λn ⟨u, wn ⟩ = ⟨f , wn ⟩
1
󳨐⇒ ⟨u, wn ⟩ = ⟨f , wn ⟩ if λn ≠ 0 (23.10)
λn
󳨐⇒
∞ ∞
1
u = ∑ ⟨u, wn ⟩wn = ∑ ⟨f , wn ⟩wn (23.11)
n=1 λ
n=1 n

is the formal solution.


23.1 Definition and general properties | 553

Remarks. (a) If 𝕃 is an operator in two or three spatial dimensions, the sum may be
a double or triple sum, (b) If the BCs are inhomogeneous, the solution will have ad-
ditional terms and (c) In most of our applications, 𝕃 is usually an elliptic differential
operator such as the Laplacian operator, i. e., 𝕃 = −∇2 .

23.1.2 Example 2 (solution of heat/diffusion equation)

Consider the heat/diffusion equation:

𝜕u
= −𝕃u, t > 0; u = f @t = 0 (23.12)
𝜕t

with a self-adjoint operator 𝕃. Taking FFT 󳨐⇒

𝜕
⟨u, wn ⟩ = −λn ⟨u, wn ⟩, t > 0; (23.13)
𝜕t
⟨u, wn ⟩ = ⟨f , wn ⟩ @ t = 0 (23.14)
∞ ∞
u = ∑ ⟨u, wn ⟩wn = ∑ exp(−λn t)⟨f , wn ⟩wn (23.15)
n=1 n=1

is the formal solution. In addition, when the operator 𝕃 is a 2/3D spatial operator, the
sum may be a double or triple sum.

23.1.3 Example 3 (solution of the wave equation)

Consider the wave equation:

𝜕2 u
= −𝕃u, t > 0; (23.16)
𝜕t 2
u = f @ t = 0 (initial position) (23.17)
𝜕u
= g@t = 0 (initial velocity) (23.18)
𝜕t

with a self-adjoint operator 𝕃. Taking FFT, we get

𝜕2
⟨u, wn ⟩ = −λn ⟨u, wn ⟩, t > 0;
𝜕t 2
⟨u, wn ⟩ = ⟨f , wn ⟩ @ t = 0
𝜕
⟨u, wn ⟩ = ⟨g, wn ⟩ @ t = 0
𝜕t
󳨐⇒
554 | 23 Finite Fourier transforms

1
⟨u, wn ⟩ = ⟨f , wn ⟩ cos[√λn t] + ⟨g, wn ⟩ sin[√λn t] (23.19)
√λn

󳨐⇒
∞ ∞
1
u = ∑ ⟨f , wn ⟩ cos[√λn t]wn + ∑ ⟨g, wn ⟩ sin[√λn t]wn (23.20)
n=1 n=1 √λ n

is the formal solution.


We now illustrate the application of FFT with specific examples in the rest of this chap-
ter.

23.2 Application of FFT for BVPs in 1D


23.2.1 Example 1 (Poisson’s equation in 1-D)

Consider the boundary value problem

d2 u
= −f (x), 0 < x < 1 (23.21)
dx 2
u(0) = 0, u(1) = 0 (23.22)

To solve, we consider the operator 𝕃 defined by the self-adjoint eigenvalue problem

d2 w
𝕃w = − = λw, 0<x<1 (23.23)
dx2
w(0) = w(1) = 0 (23.24)

We have the eigenvalues and normalized eigenfunctions

λn = n2 π 2 , wn (x) = √2 sin nπx (23.25)

Taking FFT of equations (23.21)–(23.22), i. e., multiplying by wj (x) and integrating from
0 and 1, gives

d2 u
⟨ , w ⟩ = ⟨−f , wj ⟩
dx 2 j

du dwj x=1 d 2 wj
( wj − u ) + ⟨u, ⟩ = ⟨−f , wj ⟩
dx dx x=0 dx 2

The first term (or the concomitant) vanishes as both u(x) and wj (x) satisfy the homo-
geneous boundary conditions. Thus, we have
23.2 Application of FFT for BVPs in 1D | 555

d 2 wj
⟨u, ⟩ = ⟨−f , wj ⟩
dx 2

−λj ⟨u, wj ⟩ = −⟨f , wj ⟩

󳨐⇒

⟨f , wj ⟩
⟨u, wj ⟩ = . (23.26)
λj

Taking the inverse transform gives



u = ∑⟨u, wj ⟩wj
j=1
∞ ⟨f , wj ⟩
=∑ wj
j=1
λj
1
∞ √2 sin jπx
=∑ ∫ f (ξ )√2 sin jπξ dξ (23.27)
j=1
j2 π 2
0

We consider some special cases of this solution.

Special case 1: f (x) = 1


For f (x) = 1,

1
2√ 2
, j odd
∫ f (ξ )√2 sin jπξ dξ = { jπ
0, j even
0

󳨐⇒ Equation (23.27) can be simplified as

4 ∞ sin[(2k − 1)πx]
u(x) = ∑ . (23.28)
π 3 k=1 (2k − 1)3

For f (x) = 1, equations (23.21)–(23.22) can also be solved by integrating twice, which
leads to another form of the solution
1
u(x) = x(1 − x). (23.29)
2

In other words, equation (23.28) is the Fourier series expansion of equation (23.29).
The exact solution (equation (23.29)) and Fourier series expansion (equation (23.28))
with only two terms are plotted in Figure 23.1.
556 | 23 Finite Fourier transforms

Figure 23.1: Solution of Poisson’s equation with f (x) = 1: exact solution compared with the two
terms of the Fourier series solution.

It can be seen from this figure that the Fourier series solution with only two terms
represents the solution accurately in this example. The maximum value predicted by
Fourier series solution with only two terms is 0.124 as compared to 0.125 predicted by
the exact solution.
Consider again the solution given by equation (23.27)

1 1
∞ √2 sin jπx
u=∑ ∫ f (ξ )√2 sin jπξ dξ = ∫ G(x, ξ )f (ξ ) dξ ,
j=1
j2 π 2
0 0

where
∞ w (x)w (ξ )

2 sin jπx sin jπξ j j
G(x, ξ ) = ∑ = ∑ (23.30)
j=1
j2 π 2 j=1
λ j

is the Green’s function of 𝕃. The Green’s function may also be written as

(1 − x)ξ , 0<ξ <x


G(x, ξ ) = { (23.31)
x(1 − ξ ), x < ξ < 1.

Special case 2: f (x) = δ(x − 21 )


For f (x) = δ(x− 21 ), a unit point source at the center of the domain, the direct integration
method gives the solution of equations (23.21)–(23.22) as follows:

x 1
2
, 0≤x≤ 2
u(x) = { 1
(23.32)
(1−x)
2
, 2
≤ x ≤ 1.

Alternatively, the solution (equation (23.27)) obtained from the eigenfunction expan-
sion method simplifies as follows:
23.2 Application of FFT for BVPs in 1D | 557

√2 sin jπx
√2 sin jπ

u=∑
j=1
j2 π 2 2

2 ∞ sin[(2k − 1)πx]
= ∑ (−1)k−1 . (23.33)
π 2 k=1 (2k − 1)2

Again, both expressions are equivalent. The exact solution and two terms FFT solution
are plotted in Figure 23.2.

Figure 23.2: Solution of Poisson’s equation with f (x) = δ(x − 21 ): exact solution compared with the
two terms Fourier series solution.

It can be seen from Figure 23.2 again that only two terms can predict the solution with
good accuracy. The largest error occurs in this case at maxima (x = 21 ), where exact
value is uexact = 41 . This error can be reduced further by including more number of
terms in the FFT solution as shown in Figure 23.3.

Remarks.
(1) If the interval is (0, a), the eigenvalues are modified to

n2 π 2
λn = (23.34)
a2

and the normalized eigenfunctions to

2 nπx
wn (x) = √ sin( ) (23.35)
a a

and the solution is given by

nπx a
2a ∞ sin( a ) nπξ
u(x) = 2 ∑ ∫ f (ξ ) sin( ) dξ (23.36)
π n=1 n2 a
0
558 | 23 Finite Fourier transforms

Figure 23.3: Demonstration of the convergence of Fourier series expansion with the solution of Pois-
son’s equation with f (x) = δ(x − 21 ).

(2) The solution of the inhomogeneous problem

d2 u
= −f (x), 0 < x < 1 (23.37)
dx 2
u(0) = α1 , u(1) = α2 (23.38)

may be obtained by writing

u(x) = u1 (x) + u2 (x) (23.39)

where

d 2 u1
= −f (x) (23.40)
dx2
u1 (0) = 0, u1 (1) = 0 (23.41)

and u2 satisfies the homogeneous equation with nonhomogeneous boundary con-


ditions, i. e.,

d 2 u2
=0 (23.42)
dx2
u1 (0) = α1 , u2 (1) = α2 (23.43)

Solving equations (23.42) and (23.43) gives

u2 = (α2 − α1 )x + α1 (23.44)

while u1 (x) is given by equation (23.27).


23.2 Application of FFT for BVPs in 1D | 559

23.2.2 Example 2: higher-order boundary value problems (coupled equations) in 1D

Consider the vector equation

d2 u
− Au = −f(x); u(0) = 0 = u(1) (23.45)
dx2

where u and f(x) are vectors with n-components and A is a constant n × n matrix. For
n = 2, let aij {i, j = 1, 2} be real constants and consider the BVP in two variables as
defined by

d 2 u1
dx 2
− a11 u1 − a12 u2 = −f1 (x) }
2 }, 0<x<1 (23.46)
d u2
dx 2
− a21 u1 − a22 u2 = −f2 (x) }

with homogeneous Dirichlet boundary conditions

uj (0) = uj (1) = 0; j = 1, 2. (23.47)

Let

cim = ⟨ui , wm ⟩ = FFT of ui (23.48)

Taking FFT, we get

−m2 π 2 c1m − a11 c1m − a12 c2m = −f1m


−m2 π 2 c2m − a21 c1m − a22 c2m = −f2m

󳨐⇒

a11 + m2 π 2 a12 c f
( ) ( 1m ) = ( 1m )
a21 a22 + m2 π 2 c2m f2m
Am cm = fm , Am = A + m2 π 2 I. (23.49)

We can solve these linear algebraic equations by the biorthogonal expansion

2
y∗km fm 1
cm = ∑ xkm (23.50)
y∗ x λ
k=1 km km km

where λkm , xkm and y∗km are eigenvalues, eigenvectors and eigenrows of Am . Taking the
inverse Fourier transform of equation (23.50), we get
560 | 23 Finite Fourier transforms


u(x) = ∑ cm wm (x)
m=1
2

y∗km fm 1
= ∑ ∑ √2 sin mπx x (23.51)
m=1 k=1 y∗km xkm λkm km

where
1
f1 (ξ ) √
fm = ∫ [ ] 2 sin mπξ dξ . (23.52)
f2 (ξ )
0

Various special cases of this solution may be examined as in the previous example.

23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two


independent variables)
23.3.1 Example 3: heat/diffusion equation in a finite domain

Consider the heat equation in dimensionless form

𝜕u 𝜕2 u
= 2; 0 < x < 1, t>0 (23.53)
𝜕t 𝜕x
with homogeneous boundary conditions

u(0, t) = 0, u(1, t) = 0 (23.54)

and the initial condition

u(x, 0) = f (x). (23.55)

Once again, the relevant spatial operator is

d2 w
𝕃w : − , w(0) = 0, w(1) = 0, (23.56)
dx2
which has eigenvalues

λ n = n2 π 2 (23.57)

and normalized eigenfunctions:

wn (x) = √2 sin nπx (23.58)

Taking inner product of equations (23.53)–(23.55) with wn (x) gives


23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent variables) | 561

d 𝜕2 u
⟨u, wn ⟩ = ⟨ 2 , wn ⟩
dt 𝜕x
x=1
𝜕2 w n 𝜕u dwn
= ⟨u, ⟩ + ( w − u )
𝜕x 2 𝜕x n dx x=0
= ⟨u, −λn wn ⟩
= −λn ⟨u, wn ⟩

(Remark: The concomitant vanishes again as both u(x, t) and wn (x) satisfy the homoge-
neous boundary conditions.) Integrating this equation using initial condition, we get

⟨u, wn ⟩ = ⟨f , wn ⟩e−λn t (23.59)

Thus,

u(x, t) = ∑ ⟨f , wn ⟩e−λn t wn (x) (23.60)
n=1

is the solution. Substituting for λn and wn gives

∞ 1
2 2
u(x, t) = ∑ (√2 sin nπx)e−n π t ∫ f (ξ )√2 sin nπξ dξ
n=1 0

󳨐⇒

∞ 1
−n2 π 2 t
u(x, t) = 2 ∑ e sin nπx ∫ f (ξ ) sin nπξ dξ (23.61)
n=1 0

We now examine some special cases of solution given by equation (23.61).

Special case 1: f (ξ) = 1 ⇒

1 1
cos nπξ 󵄨󵄨󵄨󵄨
∫ f (ξ ) sin nπξ dξ = − 󵄨󵄨
nπ 󵄨󵄨0
0
1 − cos nπ
=

0 if n = 2k
={ 2
(2k−1)π
if n = 2k − 1, k = 1, 2, . . .
2 2
4 ∞ e−(2k−1) π t sin(2k − 1)πx
⇒ u(x, t) = ∑ (23.62)
π k=1 (2k − 1)

1
The maximum value of u (umax ) occurs at x = 2
and is given by
562 | 23 Finite Fourier transforms

2 2
1 4 ∞ (−1)k−1 e−(2k−1) π t
umax = u( , t) = ∑
2 π k=1 (2k − 1)
4 −π 2 t 1 −9π 2 t 1 −25π 2 t 4 2
= [e − e + e − ⋅ ⋅ ⋅] ≈ e−π t for t → ∞ (23.63)
π 3 5 π

The spatial average value of u (⟨u⟩) is given by

1 2 2
4 ∞ e−(2k−1) π t 1
⟨u⟩ = ∫ u dx = ∑ [− cos(2k − 1)πx]0
π 2 k=1 (2k − 1)2
0
2 2
8 ∞ e−(2k−1) π t 8 2 1 2 1 2
= ∑ = 2 [e−π t + e−9π t + e−25π t + ⋅ ⋅ ⋅] (23.64)
π 2 k=1 (2k − 1)2 π 9 25

The profiles for different times are shown in Figure 23.4. From equation (23.64), it may
be observed that for t > 0.0016, one term is sufficient to estimate the average value of
u(x, t) to within 5 %.

Figure 23.4: Dimensionless temperature distribution and its evolution with time with uniform initial
source.

Special case 2: f (x) = sin[mπx], m = 1, 2, . . . 󳨐⇒

1
1, n=m
2 ∫ f (ξ ) sin nπξ dξ = δnm = {
0, n ≠ m
0

Equation (23.61) 󳨐⇒
∞ 2 2 2 2
u(x, t) = ∑ e−n π t sin nπxδnm = e−m π t sin[mπx] (23.65)
n=1

Thus, when the initial condition corresponds to a mode represented by an eigenfunc-


tion, the final solution remains in the same mode with amplitude decaying exponen-
tially in time [with reciprocal of the eigenvalue as the decay constant]. The solutions
corresponding to m = 1 and m = 2 are shown in Figure 23.5.
23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent variables) | 563

Figure 23.5: Solution of heat equation with initial conditions given by f (x) = sin[mπx], m = 1, 2.

Special case 3: f (x) = δ(x − s) 󳨐⇒

∞ 2 2
u(x, t) = 2 ∑ e−n π t sin nπx sin nπs (23.66)
n=1

1
Take s = 2
(mid-point of the interval)
󳨐⇒

nπ 0, n even
sin ={
2 ±1, n odd

Take n = 2k + 1, k = 0, 1, 2, . . .
󳨐⇒

(2k + 1)π π
sin{ } = sin(kπ + )
2 2
= (−1)k


∞ 2 2
u(x, t) = 2 ∑ (−1)k e−(2k+1) π t sin[(2k + 1)πx]. (23.67)
k=0

The spatial average value of u(x, t) is given by


564 | 23 Finite Fourier transforms

4 ∞ (−1)k −(2k+1)2 π 2 t
⟨u⟩ = ∑ e . (23.68)
π k=0 (2k + 1)

The dimensionless flux is given by

∞ 2 2
𝜕u
= 2π ∑ (−1)k e−(2k+1) π t [cos(2k + 1)πx] ⋅ (2k + 1)π. (23.69)
𝜕x k=0

Evaluating the dimensionless flux at the left boundary, we have

𝜕u 󵄨󵄨󵄨󵄨 ∞ 2 2
󵄨󵄨 = 2π ∑ (−1)k e−(2k+1) π t ⋅ (2k + 1)π
𝜕x 󵄨x=0󵄨
k=0
∞ 2 2
= 2 ∑ (−1)k (2k + 1)e−(2k+1) π t

k=0
2 2 2
= 2[e−π t − 3e−9π t + 5e−25π t − ⋅ ⋅ ⋅] (23.70)

The peak value of the concentration/temperature is at the center and is given by

1 ∞ 2 2
u( , t) = 2 ∑ e−(2k+1) π t
2 k=0
2 2 2
= 2[e−π t + e−9π t + e−25π t + ⋅ ⋅ ⋅] (23.71)

This series converges for all t > 0. However, the convergence may be very slow for t
values close to zero. Some schematic profiles are shown in Figure 23.6.

Figure 23.6: Dimensionless temperature distribution and its evolution with time with point initial
source.
23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent variables) | 565

23.3.2 Example 4: Green’s function for the heat/diffusion equation in a finite


domain

Consider the heat/diffusion equation with a source term

𝜕u 𝜕2 u
− = f (x, t), 0 < x < 1, t>0 (23.72)
𝜕t 𝜕x 2
IC:

u(x, 0) = 0 (23.73)

BCs:

u(0, t) = u(1, t) = 0 (23.74)

Let λn = n2 π 2 and wn (x) = √2 sin nπx be the eigenvalues and normalized eigenfunc-
2
tions of the operator − ddxw2 with w(0) = w(1) = 0. Let

un = ⟨u, wn ⟩
fn = ⟨f , wn ⟩

Taking FFT of equations (23.72)–(23.74) gives

dun
+ n2 π 2 un = fn (t) (23.75)
dt
un = 0 @ t = 0 (23.76)

Integrating equations (23.75)–(23.76), we get

t
n2 π 2 t 2 2 ′
un e = ∫ en π t fn (t ′ ) dt ′ + cn
0

Equation (23.76) gives cn = 0. Thus,

t
2 2
un = ∫ en π (t ′ −t)
fn (t ′ ) dt ′
0

󳨐⇒

∞ t 1
2 2
u = ∑ √2 sin nπx ∫ en π (t ′ −t)
[∫ √2f (s, t ′ ) sin nπs ds] dt ′
n=1 0 0

󳨐⇒
566 | 23 Finite Fourier transforms

∞ t 1
2 2
u(x, t) = 2 ∑ sin nπx ∫ ∫ en π (t ′ −t)
f (s, t ′ ) sin nπs ds dt ′ (23.77)
n=1 0 0
t 1

= ∫ ∫ G(x, s, t, t ′ )f (s, t ′ ) ds dt ′ (23.78)


0 0

where
∞ 2 2
G(x, s, t, t ′ ) = 2 ∑ en π (t ′ −t)
⋅ sin nπx sin nπs (23.79)
n=1

is Green’s function. We note that if

f (x, t) = δ(x − ξ )δ(t − τ)

then

u(x, t) = G(x, ξ , t, τ).

Thus, G(x, ξ , t, τ) is the temperature (or concentration) at position x and time t due to
a unit source at position ξ at time τ(t > τ). We now consider some special cases of the
solution given above.
(i) f (x, t) = g1 (x)δ(t)
For this case, the solution is given by

∞ 1
2 2
u(x, t) = ∑ e−n π t (2 sin nπx) ∫ g1 (s) sin nπs ds (23.80)
n=1 0

This solution is identical to the solution obtained when we take the initial condi-
tion to be u(x, 0) = g1 (x)
(ii) f (x, t) = g2 (t)δ(x − x0 )

This corresponds to a point source at x = x0 whose magnitude g2 (t) varies with time.
For this case, the solution simplifies to

∞ t
2 2
u(x, t) = 2 ∑ (sin nπx)(sin nπx0 ) ∫ en π (t ′ −t)
g2 (t ′ ) dt ′ . (23.81)
n=1 0

23.3.3 Example 5: heat/diffusion equation in the finite domain with time


dependent boundary condition

Consider

𝜕u 𝜕2 u
= 2 0 < x < 1, t>0 (23.82)
𝜕t 𝜕x
23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent variables) | 567

IC:

u(x, 0) = 0 (23.83)

BCs:

u(0, t) = f (t) u(1, t) = 0 (23.84)

Taking FFT of equations (23.82)–(23.83), we get


1
dun 𝜕2 u
= ∫ 2 ⋅ wn (x) dx
dt 𝜕x
0
1 x=1
d 2 wn 𝜕u
= ∫u dx + [ ⋅ wn − uwn′ (x)]
dx 2 𝜕x x=0
0
𝜕u
= − n2 π 2 un + (1, t)wn (1) − u(1, t)wn′ (1)
𝜕x
𝜕u
− (0, t)wn (0) + u(0, t)wn′ (0)
𝜕x
󳨐⇒
dun
= −n2 π 2 un + f (t)wn′ (0)
dt
with

wn = √2 sin nπx, wn′ = √2nπ cos nπx 󳨐⇒ wn′ (0) = √2nπ (23.85)


dun
= −n2 π 2 un + √2nπf (t) (23.86)
dt
un = 0 @ t = 0 (23.87)

󳨐⇒
t
2 2
un = ∫ e−n π (t−t ′ ) √
2nπf (t ′ ) dt ′ (23.88)
0

󳨐⇒

∞ t
2 2
u(x, t) = ∑ √2 sin nπx ∫ e−n π (t−t ′ ) √
2nπf (t ′ ) dt ′
n=1 0
t ∞
2 2
= ∫ ∑ (2nπ sin nπx)e−n π (t−t ′ )
f (t ′ ) dt ′ (23.89)
0 n=1
568 | 23 Finite Fourier transforms

Special case:

1, t>0
f (t) = H(t) = {
0, t<0
t 2 2 2 2 ′
t 󵄨󵄨t
2 2 e−n π t ⋅ en π 1 −n2 π 2 t
∫ e−n π (t−t ′ ) 󵄨󵄨
dt ′ = 󵄨󵄨 = 2 2 [1 − e ] (23.90)
n2 π 2 󵄨󵄨0 n π
0

󳨐⇒

2 sin nπx 2 2
u(x, t) = ∑ [1 − e−n π t ]
n=1 nπ

󳨐⇒

2 sin nπx
u(x, ∞) = ∑ =1−x
n=1 nπ

󳨐⇒

t 2 sin nπx
∞ 2 2
u(x, t) = 1 − x − ∑ e−n π (23.91)
n=1 nπ
2 2
2 ∞ e−n π t sin nπx
= us (x) − ∑ (23.92)
π n=1 n

where us (x) is the steady-state profile. The profiles at various times are shown in Fig-
ure 23.7.

Figure 23.7: Dimensionless temperature distribution and steady-state profile for sudden rise of
temperature to unity at the left boundary.
23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent variables) | 569

23.3.4 Example 6: heat/diffusion equation in a finite domain with general initial


and boundary conditions

Consider

𝜕u 𝜕2 u
= 2, 0<x<1 (23.93)
𝜕t 𝜕x
BCs:

u(0, t) = f1 (t), u(1, t) = f2 (t) (23.94)

IC:

u(x, 0) = g(x) (23.95)

We use the principle of superposition and write

u = u1 + u2 + u3 (23.96)

where

𝜕u1 𝜕2 u1 BCs: u1 (0, t) = f1 (t), u1 (1, t) = 0


= , { (23.97)
𝜕t 𝜕x 2 I. C.: u1 (x, 0) = 0
𝜕u2 𝜕2 u2 BCs: u2 (0, t) = 0, u2 (1, t) = f2 (t)
= , { (23.98)
𝜕t 𝜕x 2 I. C.: u2 (x, 0) = 0
𝜕u3 𝜕2 u3 BCs: u3 (0, t) = 0, u3 (1, t) = 0
= , { (23.99)
𝜕t 𝜕x 2 I. C.: u3 (x, 0) = g(x)

Each of these problems has been solved before.

23.3.5 Example 7 (wave equation)

Consider the wave equation

𝜕2 u 2
2𝜕 u
= c , 0 < x < 1, t>0 (23.100)
𝜕t 2 𝜕x2
BCs:

u(0, t) = 0, u(1, t) = 0 (fixed ends) (23.101)

ICs:

u(x, 0) = f (x) (initial displacement) (23.102)


570 | 23 Finite Fourier transforms

𝜕u
(x, 0) = g(x) (initial velocity) (23.103)
𝜕t

For the operator,

d2 w
, 0 < x < 1, w(0) = 0, w(1) = 0 (23.104)
dx2

the eigenvalues are

λn = −n2 π 2 , n = 1, 2, . . . (23.105)

while the normalized eigenfunctions are

wn (x) = √2 sin nπx (23.106)

Taking inner product of equations (23.100)–(23.103) with wn (x) gives

d2
⟨u, wn ⟩ = −c2 n2 π 2 ⟨u, wn ⟩ ⇒ ⟨u, wn ⟩ = c1n cos nπct + c2n sin nπct (23.107)
dt 2
IC1:

c1n = ⟨f , wn ⟩ (23.108)

IC2:

c2n ⋅ nπc = ⟨g, wn ⟩ (23.109)


⟨g, wn ⟩
⟨u, wn ⟩ = ⟨f , wn ⟩ cos nπct + sin nπct (23.110)
nπc

Taking inverse FFT gives

1 1

sin nπct
u(x, t) = ∑ √2 sin nπx[cos nπct ∫ f (ξ )√2 sin nπξ dξ + ∫ g(ξ )√2 sin nπξ dξ ]
n=1 nπc
0 0


1 1

sin nπct
u(x, t) = 2 ∑ sin nπx[cos nπct ∫ f (ξ ) sin nπξ dξ + ∫ g(ξ ) sin nπξ dξ ] (23.111)
n=1 nπc
0 0

Consider the special case in which

g(ξ ) = 0 (zero initial velocity) (23.112)


23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent variables) | 571

f (ξ ) = α sin jπξ (initial displacement is in the shape of j-th eigenfunction).


(23.113)

The solution simplifies to

1
u(x, t) = 2 ⋅ sin jπx cos jπct ⋅ α
2
= (cos jπct)(α sin jπx) (23.114)

This is a pure mode of vibration, which is periodic in time with a period

2π 2
T= = (23.115)
jπc jc

and

jc
cyclic frequency = (23.116)
2

The solution profile for specific case of c = 1 and j = 10 is shown in Figure 23.8.

Figure 23.8: Solution of wave equation for c = 1 and initial displacement f (x) = sin[10πx].

[Remark: The cyclic frequency of mode j = 1 is called the fundamental frequency while
j ≥ 2 are referred to as overtones or harmonics.]

23.3.6 Example 8 (Poisson’s equation in 2-D)

Consider the equation

𝜕2 u 𝜕2 u
+ = −f (x, y); 0 < x < a; 0<y<b (23.117)
𝜕x2 𝜕y2
u = 0 on 𝜕Ω (23.118)
572 | 23 Finite Fourier transforms

This is the Poisson equation in a rectangle (Ω = (x, y) : 0 < x < a, 0 < y < b)
and represents the temperature (at steady-state) due to a source f (x, y). Equations
(23.117)–(23.118) contain two operators:

d2
𝕃1 : − , w(0) = 0, w(a) = 0
dx2

n2 π 2
eigenvalues λn = a2
eigenfunctions wn (x) = √ a2 sin( nπx
a
)

d2
𝕃2 : − , w(0) = 0, w(b) = 0
dy2

m2 π 2
eigenvalues λm = b2
eigenfunctions wm (y) = √ b2 sin( mπy
b
)
Consider the eigenvalue problem

𝜕2 w 𝜕2 w
+ 2 = −λw (23.119)
𝜕x 2 𝜕y
w(0, y) = 0, w(a, y) = 0 (23.120)
w(x, 0) = 0, w(x, b) = 0 (23.121)

To determine the eigenvalues and eigenfunction of (23.119)–(23.121), take inner prod-


uct with wn (x) 󳨐⇒

d2
− λn ⟨w(x, y), wn (x)⟩ + ⟨w(x, y), wn (x)⟩ = −λ⟨w(x, y), wn (x)⟩
dy2

Take the inner product again w. r. t. wm (y)


󳨐⇒

− λn ⟨⟨w, wn ⟩, wm ⟩ − λm ⟨⟨w, wn ⟩, wm ⟩ = −λ⟨⟨w, wn ⟩, wm ⟩ (23.122)

If wnm is an eigenfunction and λnm is an eigenvalue of (23.119)–(23.121), then equation


(23.122) 󳨐⇒

n2 m 2
λnm = λn + λm = π 2 ( + ) (23.123)
a2 b2
wnm (x, y) = wn (x)wm (y)
4 nπx nπy
=√ sin( ) sin( ) (23.124)
ab a b

These are the eigenvalues and eigenfunctions of the Laplacian operator


23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent variables) | 573

𝜕2 w 𝜕2 w w(0, y) = 0 w(a, y) = 0
𝕃w : −( + 2 ), { (23.125)
𝜕x 2 𝜕y w(x, 0) = 0 w(x, b) = 0

in the domain Ω. Now, to solve equations (23.117)–(23.118) take inner product with wnm ,
󳨐⇒

⟨𝕃u, wnm ⟩ = −⟨f , wnm ⟩


⟨u, 𝕃∗ wnm ⟩ = −⟨f , wnm ⟩; 𝕃∗ = 𝕃
−λnm ⟨u, wnm ⟩ = ⟨f , wnm ⟩

󳨐⇒
⟨f , wnm ⟩
⟨u, wnm ⟩ =
λnm
󳨐⇒
∞ ∞
u = ∑ ∑ ⟨u, wnm ⟩wnm
n=1 m=1
∞ ∞
1 2 nπx mπy
=∑∑ sin( ) sin( )⟨f , wnm ⟩
2 n
n=1 m=1 π ( a2
2
+ m2
) √ab a b
b2
a b
2 nπξ mπη
⟨f , wnm ⟩ = ∫ ∫ f (ξ , η) sin( ) sin( ) dξdη
√ab a b
0 0

󳨐⇒

nπx mπy a b
4 ∞ ∞ sin( a ) sin( b
) nπξ mπη
u(x, y) = ∑∑ ∫ ∫ f (ξ , η) sin( ) sin( ) dξdη.
abπ 2 n=1 m=1 2 2
( n2 + m2 ) a b
a b 0 0
(23.126)

We now consider some special cases of this solution.

Special case 1: f (x, y) = 1


For special case of f (x, y) = 1, the integration term in equation (23.126) simplifies to

a b
nπξ mπη
I = ∫ ∫ f (ξ , η) sin( ) sin( ) dξdη
a b
0 0
a b
nπξ mπη
= ∫ sin( ) dξ ∫ sin( )dη
a b
0 0
4ab
nmπ 2
, n and m odd
={
0, if n or m even

󳨐⇒ from equation (23.126)


574 | 23 Finite Fourier transforms

16 ∞ ∞ sin[ (2i−1)πx
a
] sin[ (2j−1)πy
b
]
u(x, y) = 4
∑ ∑ (23.127)
π i=1 j=1 ( 2 + 2 )(2i − 1)(2j − 1)
(2i−1) 2 (2j−1) 2

a b

which further implies the solution for a = 1 and b = 1 (unit square) as

16 ∞ ∞ sin[(2i − 1)πx] sin[(2j − 1)πy]


u(x, y) = ∑∑ (23.128)
π 4 i=1 j=1 ((2i − 1)2 + (2j − 1)2 )(2i − 1)(2j − 1)

The solution (equation (23.128)) is plotted in Figure 23.9.

Figure 23.9: Solution of 2D Poisson’s equation in a square domain with f (x, y) = 1.

The top diagram shows a 3D plot of the solution in xy domain with 50×50 = 2500 terms
in the summation, while the bottom diagram corresponds to u(x, y = 21 ) versus x using
the Fourier series expansion with 2 × 2 = 4 terms (green dashed line) and 50 × 50 terms
(blue solid lines). It can be seen from the bottom plot that the Fourier series solution
with only 2 × 2 terms is sufficient to predict the solution with good accuracy in this
case. The maximum value umax = u( 21 , 21 ) is 0.07219 using the 2 × 2 terms as compared
to 0.07367 using the 50 × 50 terms in summation.

Special case 2: f (x, y) = δ(x − 2a )δ(y − 2b )


Another special case of interest is a point source at the center of the domain, i. e.,

a b
f (x, y) = δ(x − )δ(y − ).
2 2
23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent variables) | 575

The solution (equation (23.126)) can be simplified for this case as follows:

4 ∞ ∞ k+j sin[ ] sin[ (2j−1)πy


(2k−1)πx
a b
]
u(x, y) = 2
∑ ∑ (−1) , (23.129)
abπ n=1 m=1 [( 2k−1 2
) +( 2j−1 2
)]
a b

which can further be simplified for the case of a = 1 and b = 1 (unit square) as

4 ∞ ∞ sin[(2k − 1)πx] sin[(2j − 1)πy]


u(x, y) = ∑ ∑ (−1)k+j . (23.130)
π 2 n=1 m=1 [(2k − 1)2 + (2j − 1)2 ]

A plot of this solution is shown in Figure 23.10.

Figure 23.10: Solution of 2D Poisson’s equation in a square domain with f (x, y) = δ(x − 21 )δ(y − 21 ).

The left diagram shows a 3D plot of the solution and the top-right diagram shows a
contour plot of solution in xy domain with 50 × 50 = 2500 terms in the summation,
while the bottom-right diagram corresponds to u(x, y = 21 ) versus x using the Fourier
series expansion with 2 × 2 = 4 terms (green dashed line) and 50 × 50 terms (blue solid
lines). It can be seen from the bottom plot that the Fourier series solution with only
2 × 2 terms is sufficient to predict the solution in most of the domain. However, the
maximum error occurs at the center (i. e., x = 21 ), where the maximum value umax =
576 | 23 Finite Fourier transforms

u( 21 , 21 ) is 0.3062 using the 2 × 2 terms as compared to 0.81593 using the 50 × 50 terms


in summation.

23.4 Additional applications of FFT in rectangular coordinates


23.4.1 Example 9 (diffusion and reaction in a catalyst cube)

The dimensionless model describing the problem of diffusion–reaction in a catalyst


particle that is in the form of a cube may be expressed as

𝜕2 Y 𝜕2 Y 𝜕2 Y
+ + − Λ2 Y =0 (23.131)
𝜕ξ 2 𝜕η2 𝜕ρ2
𝜕Y
= 0 @ ξ = 0; Y = 1@ξ = 1 (23.132)
𝜕ξ
𝜕Y
= 0 @ η = 0; Y = 1@η = 1 (23.133)
𝜕η
𝜕Y
= 0 @ ρ = 0; Y = 1@ρ = 1 (23.134)
𝜕ρ

where Λ is a parameter known as the Thiele modulus [Remark: The above model ap-
V
plies to 81 th of a cube. If we define the normalized Thiele modulus as Φ2 = Dk ( S p )2 , then
p
2
Φ2 = Λ9 . Here, Vp is the volume of the particle and Sp is the external surface area.]
Consider the operator

𝜕2
𝕃1 : − , Y ′ (0) = 0, Y(1) = 0 (23.135)
𝜕ξ 2

Eigenvalues

π2
λn = (2n − 1)2 (23.136)
4

Eigenfunctions (normalized)

√2 cos{ (2n − 1)πξ } (23.137)


2

Define w = 1 − Y. Then equations (23.131)–(23.134)󳨐⇒

𝜕2 w 𝜕2 w 𝜕2 w
+ 2 + 2 + Λ2 − Λ2 w = 0 (23.138)
𝜕ξ 2 𝜕η 𝜕ρ
𝜕w
= 0 @ ξ = 0; w = 0@ξ = 1 (23.139)
𝜕ξ
23.4 Additional applications of FFT in rectangular coordinates | 577

𝜕w
= 0 @ η = 0; w = 0@η = 1 (23.140)
𝜕η
𝜕w
= 0 @ ρ = 0; w = 0@ρ = 1 (23.141)
𝜕ρ

Eigenvalue problem

𝜕2 w 𝜕2 w 𝜕2 w
+ 2 + 2 = −λw (23.142)
𝜕ξ 2 𝜕η 𝜕ρ

with BCs given by equation (23.139), (23.140), (23.141).


Eigenvalues

π2
λnml = [(2n − 1)2 + (2m − 1)2 + (2l − 1)2 ] (23.143)
4

Eigenfunctions (normalized)

(2n − 1)πξ (2m − 1)πη (2l − 1)πρ


wnm1 = 2√2 cos{ } cos{ } cos{ } (23.144)
2 2 2

Inner product of equations (23.138)–(23.141) with wnml 󳨐⇒

−λnml ⟨w, wnml ⟩ − Λ2 ⟨w, wnml ⟩ + Λ2 ⟨1, wnml ⟩ = 0

󳨐⇒

Λ2 ⟨1, wnml ⟩
⟨w, wnml ⟩ =
⟨Λ2 + λnml ⟩
1
(2n − 1)πξ
⟨1, wnml ⟩ = (∫ √2 cos dξ )
2
0
1 1
(2m − 1)πη (2l − 1)πρ
× (∫ √2 cos dη)(∫ √2 cos dρ)
2 2
0 0
n−1 m−1
2√2(−1) 2√2(−1) 2√2(−1)l−1
= ⋅ ⋅
(2n − 1)π (2m − 1)π (2l − 1)π
3 n+m+l−3
(2√2) (−1)
=
(2n − 1)(2m − 1)(2l − 1)π 3


2 n+m+l−3 πξ πη πρ
64 ∞ ∞ ∞ Λ (−1) cos[(2n − 1) 2 ] cos[(2m − 1) 2 ] cos[(2l − 1) 2
]
w(ξ , η, ρ) = ∑ ∑ ∑
π 3 n=1 m=1 l=1 (2n − 1)(2m − 1)(2l − 1)[Λ2 + λnml ]
(23.145)
578 | 23 Finite Fourier transforms

The effectiveness factor

1 1 1 1 1 1
̂ = ∫ ∫ ∫ Y dξ dη dρ = 1 − ∫ ∫ ∫ w dξ dη dρ
η (23.146)
0 0 0 0 0 0

󳨐⇒

512Λ2 ∞ ∞ ∞ 1
η
̂ =1− ∑∑∑ (23.147)
π n=1 m=1 l=1 (2n − 1) (2m − 1) (2l − 1)2 [Λ2 + λnml ]
6 2 2

Remarks.
1. Solution of the same problem in the square geometry (two-dimensional case for
2
which Φ2 = Λ4 ) gives

64Λ2 ∞ ∞ 1
η
̂ =1− ∑∑ (23.148)
π 4 n=1 m=1 (2n − 1)2 (2m − 1)2 [Λ2 + π 2 (2n − 1)2 + (2m − 1)2 ]
4

while for the slab geometry (one-dimensional case for which Φ2 = Λ2 ),

8Λ2 ∞ 1 tanh Λ
η
̂ =1− ∑ = (23.149)
2
π n=1 (2n − 1) [Λ +
2 2 π2
(2n − 1)2 ] Λ
4

A plot of the effectiveness factor for 1D, 2D and 3D solutions is shown in Fig-
ure 23.11.

Figure 23.11: Effectiveness factor for 1D, 2D and 3D diffusion–reaction problems in rectangular coor-
dinates.

2. The above formulae may be used to obtain the small and large Φ asymptotes for
all three cases. These asymptotes can also be visualized from Figure 23.11.
23.4 Additional applications of FFT in rectangular coordinates | 579

23.4.2 Example 10 (axial dispersion model)

Consider the 1D transient diffusion–convection model with a position and time de-
pendent source term:

𝜕c 𝜕c 𝜕2 c
+ ⟨u⟩ = D 2 + S(x, t), 0 < x < L, t>0 (23.150)
𝜕t 𝜕x 𝜕x

with inlet (Dankwerts) BCs

𝜕c
−D = ⟨u⟩[c0 (t) − c], @ x = 0, (23.151)
𝜕x

exit condition
𝜕c
= 0, @x = L
𝜕x

and initial condition (IC)

c = ci (x), 0 < x < L, t = 0. (23.152)

Let c∗ = a reference concentration. Define

x ⟨u⟩t
z= , τ= (23.153)
L L
c(x, t) ⟨u⟩L
C= , Pe = (23.154)
c ∗ D
󳨐⇒

𝜕C ⟨u⟩ ⟨u⟩c∗ 𝜕C Dc∗ 𝜕2 C L


c∗ ⋅ + = 2 + S(Lz, τ)
𝜕τ L L 𝜕z L 𝜕z 2 ⟨u⟩
󳨐⇒

𝜕C 𝜕C 1 𝜕2 C 1 L L
+ = + ∗ S(Lz, τ)
𝜕τ 𝜕z Pe 𝜕z 2 c ⟨u⟩ ⟨u⟩

Let

L Lτ
s(z, τ) = S(Lz, ) = dimensionless source term (23.155)
⟨u⟩c∗ ⟨u⟩

𝜕C 𝜕C 1 𝜕2 C
+ = + s(z, τ) (23.156)
𝜕τ 𝜕z Pe 𝜕z 2

BC1 ⇒
580 | 23 Finite Fourier transforms


D 𝜕C c0 ( ⟨u⟩ )
− = ⟨u⟩[ − C]
L 𝜕z c ∗

1 𝜕C 1 Lτ
− = ∗ c0 ( )−C
Pe 𝜕z c ⟨u⟩

1 𝜕C
− + C = ĉ0 (τ) @ z = 0 (23.157)
Pe 𝜕z
where

c0 ( ⟨u⟩ )
ĉ0 (τ) = (23.158)
c∗
BC2⇒
𝜕C
= 0@z = 1 (23.159)
𝜕z
IC⇒
ci (Lz)
C= = ĉi (z) @ τ = 0 (23.160)
c∗
Thus, the dimensionless form of the model is

𝜕C 𝜕C 1 𝜕2 C
+ = + s(z, τ) (23.161)
𝜕τ 𝜕z Pe 𝜕z 2
1 𝜕C
− C = −ĉ0 (τ) @ z = 0 (23.162)
Pe 𝜕z
𝜕C
= 0@z = 1 (23.163)
𝜕z
C = ĉi (z) @ τ = 0 (23.164)

The dimensionless group Pe is the Peclet number, which is the ratio of diffusion to
convection time scales. The problem defined by equations (23.161)–(23.164) cannot be
solved by the Laplace transform method except for some special cases of the source
function s(z, t) and initial conditions ĉi (z). We obtain a formal solution for the general
case using FFT, and examine various special cases.
The spatial operator appearing in equation (23.161) is not formally self-adjoint. To
put in a self-adjoint form, we define

Pe z
C = w(z, τ). exp[ ] (23.165)
2
23.4 Additional applications of FFT in rectangular coordinates | 581

𝜕C 𝜕w Pe z
= exp[ ]
𝜕τ 𝜕τ 2
𝜕C 𝜕w Pe z Pe Pe z
= exp[ ]+ w exp[ ]
𝜕z 𝜕z 2 2 2
𝜕2 C 𝜕2 w Pe z 𝜕w Pe z Pe2 Pe z
= exp[ ] + Pe exp[ ]+ w exp[ ]
𝜕z 2 𝜕z 2 2 𝜕z 2 4 2

⇒ Substituting in equation (23.161)

𝜕w 𝜕w Pe 1 𝜕2 w 𝜕w Pe2 Pe z
+ + w= [ 2 + Pe + w] + s(z, τ)e− 2
𝜕τ 𝜕z 2 Pe 𝜕z 𝜕z 4

⇒ the model becomes

𝜕w 1 𝜕2 w Pe
= − w + s∗ (z, τ) (23.166)
𝜕τ Pe 𝜕z 2 4
1 𝜕w w
− = −ĉ0 (τ) @ z = 0 (23.167)
Pe 𝜕z 2
1 𝜕w w
+ = 0@z = 1 (23.168)
Pe 𝜕z 2
Pe z
w = ci (z) exp(− )@τ = 0
2
Δ
= ŵ i (z) (23.169)

Consider the self-adjoint EVP

d2 ψ
= −Λ2 ψ, 0<z<1 (23.170)
dz 2
Pe Pe
ψ′ (0) − ψ(0) = 0, ψ′ (1) + ψ(1) = 0 (23.171)
2 2

Let Λ2n be the eigenvalues and ψn (z) be the eigenfunctions (normalized). Taking inner
product of equation (23.166) with ψn (z) gives

d 1 𝜕2 w Pe
⟨w, ψn ⟩ = ⟨ 2 , ψn ⟩ − ⟨w, ψn ⟩ + ⟨s∗ , ψn ⟩ (23.172)
dτ Pe 𝜕z 4

Now,

𝜕2 w 󵄨󵄨 𝜕w 󵄨1
󵄨󵄨 ′ 󵄨󵄨󵄨 2
⟨ , ψ ⟩ = 󵄨 ψ − wψ n 󵄨󵄨󵄨 + ⟨w, −Λn ψn ⟩
𝜕z 2 n 󵄨󵄨 𝜕z n
󵄨 󵄨0
𝜕w 𝜕w
= (1, τ)ψn (1) − w(1)ψ′n (1) − (0, τ)ψn (0) + w(0, τ)ψ′n (0) − Λ2n ⟨w, ψn ⟩
𝜕z 𝜕z
582 | 23 Finite Fourier transforms

Pe w(1, τ) − Pe ψn (1) Pe w(0, τ)


= − ψn (1) − w(1)[ ] − ψn (0)[ − Pe ĉ0 (τ)]
2 2 2
Pe
+ w(0, τ) ⋅ ψ − Λ2n ⟨w, ψn ⟩
2 n
= Pe ψn (0)ĉ0 (τ) − Λ2n ⟨w, ψn ⟩

Thus,

d Λ2 Pe
⟨w, ψn ⟩ = ψn (0)ĉ0 (τ) − n ⟨w, ψn ⟩ − ⟨w, ψn ⟩ + ⟨s∗ , ψn ⟩
dτ Pe 4

󳨐⇒

d Λ2 Pe
⟨w, ψn ⟩ + ( n + )⟨w, ψn ⟩ = ψn (0)ĉ0 (τ) + ⟨s∗ , ψn ⟩ (23.173)
dτ Pe 4

IC

⟨w, ψn ⟩ = ⟨ŵ i (z), ψn ⟩ @ τ = 0 (23.174)

Let

Λ2n Pe
μn = + (23.175)
Pe 4

τ τ

⟨w, ψn ⟩eμn τ = ∫ eμn τ ψn (0)c(τ


̂ ′ ) dτ′ + ∫ eμn τ ⟨s∗ (z, τ′ ), ψn ⟩ dτ′ + constant
′ ′

0 0

τ = 0 ⇒ constant = ⟨ŵ i (z), ψn ⟩

⟨w, ψn ⟩ = ⟨ŵ i (z), ψn ⟩e−μn τ + ∫ e−μn (τ−τ ) ψn (0)ĉ0 (τ′ ) dτ′


0
τ

+ ∫ e−μn (τ−τ ) ⟨s∗ (z, τ′ ), ψn ⟩ dτ′ (23.176)
0



w(z, τ) = ∑ ⟨w, ψn ⟩ψn (z) (23.177)
n=1
23.4 Additional applications of FFT in rectangular coordinates | 583

1
e−μn τ ∫0 ŵ i (z ′ )ψn (z ′ ) dz ′
Pe z
∞ [ ]
[ τ ′ ]
C(z, τ) = e 2 ∑ ψn (z) [ + ∫0 e−μn (τ−τ ) ψn (0)ĉ0 (τ′ ) dτ′ ] (23.178)
n=1
[ ]
τ ′ 1
[ + ∫0 e−μn (τ−τ ) (∫0 s∗ (z ′ , τ′ )ψn (z ′ ) dz ′ ) dτ′ ]

Equation (23.178) gives the general solution to the axial dispersion model. We note that
the first term is due to the initial condition, the second term is due to the inlet condition
and the third term is due to the source term. To evaluate this solution, we need to
determine the eigenvalues and normalized eigenfunctions and their dependence on
the Peclet number.

Eigenvalue problem

d2 ψ
= −Λ2 ψ, 0 < z < 1 (23.179)
dz 2
Pe Pe
ψ′ (0) = ψ(0), ψ′ (1) = − ψ(1) (23.180)
2 2
ψ = c1 sin Λz + c2 cos Λz
ψ′ = c1 Λ cos Λz − c2 Λ sin Λz

Pe
ψ′ (0) = c1 Λ, ψ(0) = c2 , BC1 󳨐⇒ c1 Λ = c
2 2

BC2 ⇒

Pe
c1 Λ cos Λ − c2 Λ sin Λ + [c sin Λ + c2 cos Λ] = 0
2 1

Pe Pe Pe Pe
c2 [ cos Λ − Λ sin Λ + ⋅ sin Λ + cos Λ] = 0
2 2 2Λ 2

c2 ≠ 0 ⇒

Pe2
Pe cos Λ + ( − Λ) sin Λ = 0

󳨐⇒

Λ Pe
cot Λ = − (Characteristic equation), (23.181)
Pe 4Λ

and the eigenfunctions:


584 | 23 Finite Fourier transforms

Pe sin Λn z
ψn (z) = c2 [ + cos Λn z]. (23.182)
2 Λn

[Note that equation (23.181) is identical to equation (17.172).]

Normalized eigenfunctions
The eigenfunction can be normalized and the constant c2 can be obtained by set-
ting

∫ ψn (z)2 dz = 1
0


1 1 1
1 Pe2 sin2 Λn z Pe
= ∫ cos2 Λn z dz + ∫ dz + ∫ cos Λn z sin Λn z dz
2
c2 4 Λ2n Λn
0 0 0
2
1 sin 2Λn Pe 1 sin 2Λn Pe cos 2Λn − 1
=( + )+ ( − )− ⋅
2 4Λn 4Λ2n 2 4Λn Λn 4Λn
sin Λn cos Λn Pe2 Pe sin2 Λn Pe2 1
= (1 − ) + ⋅ + +
2Λn 4Λ2n Λn 2Λn 8Λ2n 2

󳨐⇒ from equation (23.181) as

1 1 Pe2 Pe
= + + (sin2 Λn + cos2 Λn )
c2 2 8Λ2n 2Λ2n
2

1 Pe2 Pe
= + +
2 8Λ2n 2Λ2n

8Λ2n
󳨐⇒ c2 = √ (23.183)
Pe2 +4 Pe +4Λ2n

Determination of the roots of the characteristic equation


The characteristic equation (23.181), which is same as equation (17.172), can further be
simplified (by solving for Pe in terms of Λ) into more convenient form as

Λ
Pe tan[ Λ2 ]
= { 2Λ (23.184)
4 − 2 cot[ Λ2 ],

which can be used to determine the eigenvalue Λn for a given Peclet number Pe as
shown in Figure 23.12.
23.4 Additional applications of FFT in rectangular coordinates | 585

Figure 23.12: Roots of characteristic equation and determination of eigenvalues.

It can be seen from this figure that for any Pe, the n-th root Λn lies in the interval [(n −
1)π, nπ], i. e.,

(n − 1)π ≤ Λn ≤ nπ, n = 1, 2, 3, . . . for any Pe .

For large Pe (i. e., Pe ≫ 1), it can be shown that



Λn ≈ 4
, n = 1, 2, . . .
(1 + Pe
)

while for small Pe (i. e., Pe ≪ 1), we have

Λ1 ≈ √Pe and
Pe
Λn ≈ (n − 1)π + + ⋅⋅⋅, n = 2, 3, . . .
(n − 1)π

The numerical values of the first six roots of the characteristic equation (23.181), Λn are
tabulated for some Pe-values in Table 23.1.

Table 23.1: First six roots Λn of characteristic equations for some Pe values.

Pe Λ1 Λ2 Λ3 Λ4 Λ5 Λ6

0.01 0.09996 3.14477 6.28478 9.42584 12.5672 15.7086


0.1 0.31492 3.1731 6.29906 9.43538 12.5743 15.7143
1.0 0.96019 3.43101 6.4382 9.52962 12.6454 15.7714
2.0 1.30654 3.67319 6.58462 9.63168 12.7232 15.8341
5.0 1.86151 4.21275 6.97179 9.9186 12.9478 16.0176
10.0 2.28445 4.76129 7.46368 10.3266 13.2862 16.3031
20.0 2.62768 5.30732 8.06714 10.9087 13.8192 16.7827
100.0 3.0209 6.04265 9.06603 12.0918 15.1206 18.153
1000.0 3.12908 6.25815 9.38723 12.5163 15.6454 18.7745
586 | 23 Finite Fourier transforms

Similarly, for any value of Pe, the normalized eigenfunctions corresponding to these
eigenvalues can be determined easily. As an example, Figure 23.13 shows the plots of
normalized eigenfunctions for Pe = 10 corresponding to first six eigenvalues.

Figure 23.13: Normalized eignfunctions corresponding to the first six eigenvalues of the self-adjoint
form of axial dispersion operator for Pe = 10.

Specific solutions
We consider some specific cases of the general solution of the axial dispersion model:

∞ 1 τ
Pe z
∑ e−μn τ ψn (z)[ ∫ ŵ i (z ′ )ψn (z ′ ) dz ′ + ∫ eμn τ ψn (0)ĉ0 (τ′ ) dτ′

C(z, τ) = e 2

n=1 0 0
τ 1

+ ∫ eμn τ (∫ s∗ (z ′ , τ′ )ψn (z ′ ) dz ′ ) dτ′ ]



(23.185)
0 0

(1) Special case 1

ĉ0 (τ′ ) = 0 [c0 (t) = 0] (23.186)


ŵ i (z ) = 0

[ci (t) = 0] (23.187)
S(x, t) = c Lδ(x)δ(t)

(23.188)

i. e., unit source at inlet at time zero


L Lτ
s(z, τ) = c∗ Lδ(Lz)δ( )
⟨u⟩c∗ ⟨u⟩
L2 Lτ
= δ(Lz)δ( )
⟨u⟩ ⟨u⟩

Using the result,

1
δ(αz) = δ(z)
|α|
23.4 Additional applications of FFT in rectangular coordinates | 587

s(z, τ) = δ(z)δ(τ) (23.189)


− Pe2 z
s∗ (z, τ) = e δ(z)δ(τ), ŵ i (z ′ ) = 0, c(τ
̂ ′) = 0 (23.190)

Substituting equation (23.190) in equation (23.178) gives

C(z, τ) = response to a unit source at inlet at time zero


Pe z

=e 2 ∑ ψn (z)αn , (23.191)
n=1

where

τ 1
Pe z ′
−μn (τ−τ′ )
αn = ∫ e [∫ e− 2 ψn (z ′ )δ(z ′ ) dz ′ ]δ(τ′ ) dτ′
0 0
−μn τ
=e ψn (0). (23.192)

E(τ) = RTD curve = exit response to a unit-pulse input


= C(1, τ)
Pe

= e 2 ∑ e−μn τ ψn (1)ψn (0) (23.193)
n=1
Pe sin Λn
ψn (0) = c2 and ψn (1) = c2 [ ⋅ + cos Λn ]
2 Λn

Pe sin Λn
ψn (0)ψn (1) = c22 [ ⋅ + cos Λn ]
2 Λn
4 Pe Λ2n sin Λn 2Λ Λ Pe
= [1 + n ( n − )]
Pe 2
+4 Pe +4Λ2n Λn Pe Pe 4Λn
Λn sin Λn 8Λ2n
= ⋅ [2 Pe + ]
2
Pe +4 Pe +4Λ2n Pe

󳨐⇒

2Λn sin Λn Pe2 +4Λn


ψn (1)ψn (0) = [ ]
Pe2 +4 Pe +4Λ2n Pe


588 | 23 Finite Fourier transforms

2 2
Pe

2 Λn sin Λn (Pe +4Λn ) −( Pe24+4Λ 2
n )τ
E(τ) = e 2 ∑ ( ) e Pe

n=1 Pe 2 2
(Pe +4 Pe +4Λn )


2 2
2 Pe2 ∞ Λn sin Λn (Pe +4Λn ) −( Pe24+4Λ2
n )τ
E(τ) = e ∑ e Pe
Pe 2 2
n=1 (Pe +4 Pe +4Λn )
Pe

(−1)n−1 Λ2n Pe2 +4Λ2n
= 8e 2 ∑ 2
e−( 4 Pe

(23.194)
2
n=1 (Pe +4 Pe +4Λn )

The last simplification follows by expressing sin Λn as

(−1)n−1 Pe Λn
sin Λn =
Pe2
Λ2n + 4

and noting that the sign of sin Λn depends on n and such that (n − 1)π ≤ Λn ≤ nπ. [For
n odd, the sign is positive and n even, sign is negative.] A plot of the solution given by
equation (23.194) is shown in Figure 17.11 for Pe = 0.5, 2.0 and 5.0.
(2) Special case 2

s∗ (z ′ , τ′ ) = 0, ĉ0 (τ′ ) = 0 (23.195)

IC:

ŵ i (z ′ ) = δ(z ′ ) = unit pulse at inlet at t = 0 (23.196)

C(z, τ) is given by equations (23.191)–(23.192)

(3) Special case 3

s∗ (z ′ , τ′ ) = 0, ŵ i (z ′ ) = 0 (23.197)
ĉ0 (τ′ ) = δ(τ′ ) (23.198)

⇒ same solution as that given by equations (23.191)–(23.192).


Thus, we get the same solution (as can be expected intuitively) for the unit pulse
appearing as a source term, in the initial condition or in the inlet boundary condition.

23.4.3 Example 11 (Fourier’s ring problem)

As our next example, we consider the historical problem solved by Fourier, i. e., deter-
mining the temperature distribution in a circular ring. The problem in dimensionless
form is described by
23.4 Additional applications of FFT in rectangular coordinates | 589

𝜕u 𝜕2 u
= 2, 0 < θ ≤ 2π, t>0 (23.199)
𝜕t 𝜕θ

with BCs:
𝜕u 𝜕u
u(θ, t) = u(θ + 2π, t), (θ, t) = (θ + 2π, t) (23.200)
𝜕θ 𝜕θ

and IC:

u(θ, 0) = f (θ). (23.201)

We note that the operator

𝜕2 w
, 0 < θ ≤ 2π (23.202)
𝜕θ2

with periodic boundary conditions

w(θ) = w(θ + 2π), w′ (θ) = w′ (θ + 2π) (23.203)

has eigenvalues:

λn = n2 , n = 0, 1, 2, . . . (23.204)

and normalized eigenfunctions:

1
w0 =
√2π
sin nθ
1 { ≡ wns =
w0 = wn = { n = 1, 2, . . . (23.205)
√π
; cos nθ
√2π ≡ wnc = ,
{ √π

Thus, the solution may be expressed as



u(θ, t) = ∑ e−λn t ⟨f , wn ⟩wn (θ)
n=0

∞ 2 sin nθ sin nθ′ ′
= ⟨f , w0 ⟩w0 + ∑ e−n t ∫ f (θ′ ) dθ
n=0
√π √π
0

∞ 2 cos nθ cos nθ′ ′
+ ∑ e−n t ∫ f (θ′ ) dθ
n=1
√π √π
0


590 | 23 Finite Fourier transforms

2π ∞ −n t 2 2π
1 e
u(θ, t) = ∫ f (θ′ ) dθ′ + ∑ ∫ [sin nθ ⋅ sin nθ′ + cos nθ cos nθ′ ]f (θ′ ) dθ′
2π n=1 π
0 0
2π ∞ −n t 2 2π
1 e
= ∫ f (θ′ ) dθ′ + ∑ ∫ cos[n(θ − θ′ )]f (θ′ ) dθ′ (23.206)
2π n=1 π
0 0


2π 2π
1 1 ∞
u(θ, 0) = f (θ) = ∫ f (θ′ ) dθ′ + ∑ ∫ cos[n(θ − θ′ )]f (θ′ ) dθ′ (23.207)
2π π n=1
0 0

1
u(θ, ∞) = ∫ f (θ′ ) dθ′ . (23.208)

0

Note that u(θ, ∞) is the average (steady-state) temperature.

23.4.4 Example 12: (coupled equations) reaction–diffusion equations

As our final example, we consider the coupled reaction–diffusion equations

𝜕2 u1
𝜕u1
= 𝜕x 2
+ a11 u1 + a12 u2 }
0 < x < 1, t>0 (23.209)
𝜕t
2
𝜕 u2 };
𝜕u2
𝜕t
= 𝜕x 2
+ a21 u1 + a22 u2 }

with homogeneous Dirichlet boundary conditions:

u1 (0, t) = u2 (0, t) = 0, u1 (1, t) = u2 (1, t) = 0 (23.210)

and initial conditions

ui (x, 0) = fi (x); i = 1, 2. (23.211)

The spatial operator

𝜕2 w
− = λw, w(0) = 0, w(1) = 0 (23.212)
𝜕x 2
has eigenvalues

λ n = n2 π 2 (23.213)

and normalized eigenfunctions

wn (x) = √2 sin nπx. (23.214)


23.4 Additional applications of FFT in rectangular coordinates | 591

Let

Vjn = ⟨uj , wn ⟩; j = 1, 2 (23.215)

Equations (23.209)–(23.211) ⇒

d V a − n2 π 2 a12 V
[ 1n ] = [ 11 ] [ 1n ] (23.216)
dt V2n a21 a22 − n2 π 2 V2n
V1n g0 0
[ ] = [ 1n
0 ] @ t = 0, gin = ⟨fi , wn ⟩ i = 1, 2 (23.217)
V2n g2n

Let

a11 − n2 π 2 a12
Bn = [ ] (23.218)
a12 a22 − n2 π 2
μ1n , μ2n = eigenvalues of Bn (23.219)
x1n , x2n = eigenvectors of Bn (23.220)
y∗1n , y∗2n = eigenrows of Bn (23.221)

Solution of equations (23.216)–(23.217) is given by

2 y∗jn g0n
Vn = ∑ eμjn t xjn (23.222)
j=1
y∗jn xjn

2 y∗ g0
u1 (x, t) ∞
jn n μjn t
u=( ) = ∑ √2 sin nπx(∑ ∗ e )xjn (23.223)
u2 (x, t) n=1 y x
j=1 jn jn

The growth or decay of the solution is determined by the eigenvalues of the matrix
Bn (n = 1, 2, . . .). If all μjn have a negative real part, the initial perturbations decay to the
trivial solution, while spatial or spatiotemporal patterns may be formed when μjn cross
the imaginary axis.

Problems
1. Consider the problem of unsteady-state heat/mass transfer in a flat plate

𝜕2 θ 𝜕θ
= ; 0 < x < 1, τ>0
𝜕x 2 𝜕τ
θ(x, 0) = f (x)
𝜕θ
(0, τ) = 0
𝜕x
592 | 23 Finite Fourier transforms

𝜕θ
(1, τ) + Bi θ(1, τ) = 0
𝜕x

(a) Determine the solution using finite Fourier transformation. Compare your re-
sult with that in Carslaw and Jaeger [10].
(b) Consider the case in which f (x) = 1. What is the limiting form of the solution
for the case no external resistance (Bi → ∞) and no internal resistance (Bi → 0)?
2. (a) Obtain the solution of the Poisson’s equation

∇2 u = −f in Ω
u=0 on 𝜕Ω

in two and three dimensions when Ω is a rectangular region. Identify Green’s func-
tion and give a physical interpretation.
(b) The velocity profile for slow viscous flow of a fluid in a rectangular channel is
given by

𝜕2 u 𝜕2 u Δp
+ =
𝜕x 2 𝜕y2 μL
u = 0; @ x = 0 and x = a
u = 0; @ y = 0 and y = b

Obtain a complete solution. Use the solution to derive a rectangular analogue of


Poiseuille’s law (relation between pressure drop and flow rate).
3. Heat transfer between two infinite parallel plates in one-dimensional laminar flow
neglecting conduction in the flow direction may be described by

𝜕2 T 3 y2 𝜕T
kf = ⟨u⟩ρC p (1 − )
𝜕y2 2 a2 𝜕x
T = Tw @ y = ±a, T = Tin F(y), @x = 0

(a) Cast into dimensionless form and find the formal solution without determining
the eigenvalues and eigenfunctions (Graetz functions) explicitly. (b) Determine an
expression for the cup-mixing (velocity weighted) temperature (Tm ) for the case
of uniform inlet temperature, i. e., F(y) = 1 (c) If the heat transfer coefficient (h) is
defined by

−kf 𝜕T (x, y = a)
h(x) =
𝜕y
,
Tm − Tw

where Tw is the wall temperature, obtain an expression for the dimensionless heat
transfer coefficient (or the local Nusselt number)
23.4 Additional applications of FFT in rectangular coordinates | 593

h(x)a
Nu(x) =
kf

(d) Determine the two asymptotes (short and long distance) of the Nusselt number
as a function of position
4. (a) Given the operator

d2 w
Lw = − , 0 < x < 1; w′ (0) = 0, w′ (1) = 0
dx2

determine the eigenvalues and orthonormal set of eigenfunctions.


(b) Determine the expansion of the function f (x) = δ(x − 21 ) in terms of the eigen-
functions determined in (a) above.
(c) Use the above results to solve the diffusion equation

𝜕2 u 𝜕u
= ; 0 < x < 1, t>0
𝜕x 2 𝜕t
1
u′ (0, t) = 0, u′ (1, t) = 0, u(x, 0) = δ(x − )
2

Show schematic profiles of u(x, t) for 0 ≤ x ≤ 1 for t = 0, t → ∞ and a finite value


of t. Give a physical interpretation of the solution.
5. (a) Solve Laplace’s equation on the unit square:

𝜕2 u 𝜕2 u
+ = 0, 0 < x < 1, 0<y<1
𝜕x 2 𝜕y2
u(x, 0) = f (x), u(x, 1) = 0; u(0, y) = 0, u(1, y) = 0

(b) Simplify the solution for the special case of f (x) = 1 and plot a few isotherms
(corresponding to constant values of u).
6. Solve the problem

𝜕2 C 𝜕C
D = ; 0 < x < L, t > 0
𝜕x 2 𝜕t
𝜕C 𝜕C
= 0, @ x = 0, −D = kg [C − C0 (t)], @x = L
𝜕x 𝜕x
C = 0, @ t = 0

where C0 (t) is a given function of time.


7. Given the following set of equations, which describe adsorption in a fixed bed of
adsorbent,
594 | 23 Finite Fourier transforms

𝜕2 C 𝜕C 𝜕C 𝜕n
ε[−D +u + ] + (1 − ε) = 0, 0 < x < L
𝜕x 2 𝜕x 𝜕t 𝜕t
𝜕n
(1 − ε) = kg a(C − C ∗ ), n = kC ∗ (equilibrium relation)
𝜕t
𝜕C 𝜕C
−D = u(C0 − C), x = 0 (C0 is a constant), = 0, x=L
𝜕x 𝜕x
C = f (x), n = g(x), t = 0

Determine the solution. This is adsorption for the case in which adsorption is mass
transfer limited. C(x, t) is the concentration in the interstitial fluid and n(x, t) is the
concentration in the solid phase.
8. Consider the steady-state problem of diffusion and surface reaction in a rectangu-
lar pore. The relevant equations are given by

𝜕2 C 𝜕2 C
+ = 0, −H < y < H, 0<z<L
𝜕y2 𝜕z 2
𝜕C
C = C0 , @ z = 0; = 0, @z = L
𝜕z
𝜕C
±D + ks C = 0; @ y = ±H
𝜕y

Cast into dimensionless form and determine the solution. Use the solution to de-
termine the effectiveness factor (ratio of the actual reaction rate in pore to that if
concentration at all points inside is equal to C0 ).
9. Transient convection–reaction problems in one spatial dimension are described
by hyperbolic system of equations of the form

𝜕y 𝜕y
C1 (x) + C2 (x) = C3 (x)y; a < x < b, t > 0
𝜕t 𝜕x
B. C.: Wa y(a, t) + Wb y(b, t) = 0, t > 0
IC: y(x, 0) = f(x)

where Ci , i = 1, 2, 3 are nonsingular N × N matrices that are continuous in x and


Wa , Wb are constant N × N matrices.
(a) Use the separation of variable technique and identify the eigenvalue problem
that results. What is the adjoint eigenvalue problem?
(b) Use the biorthogonal expansion to obtain formal solution to the above tran-
sient equations. Comment on the usefulness of the solution when the eigen-
values are complex.
(Specific examples of this type of systems include heat exchangers, distillation
columns, autothermal reactors, chromatographs, etc. Try to formulate the sim-
plest transient models of any of these systems and put them in the above form.)
10. Axial dispersion model:
23.4 Additional applications of FFT in rectangular coordinates | 595

The dispersion of a tracer in unidirectional flows in pipes and channels is de-


scribed by the axial dispersion model described by the equations

𝜕c 𝜕c 𝜕2 c
+u = D 2 + S(x, t); 0 < x < L, t > 0
𝜕t 𝜕x 𝜕x
𝜕c 𝜕c
BC: { − D = u[c0 (t) − c]@ x = 0, = 0, @ x = L, t>0
𝜕x 𝜕x
I.C : c = ci (x), @ t = 0, 0 < x < L.

Here, c is the concentration of the tracer, u is the average velocity of the stream,
D is the effective axial dispersion coefficient, S is the sources/sinks of tracer, c0 is
the inlet concentration of tracer and ci is the initial distribution of tracer.
(a) Cast the equations into dimensionless form.
(b) Obtain a formal solution to the model in (a).
(c) Simplify the solution for the special case of c0 (t) = 0, ci (x) = 0 and unit pulse
at the inlet at time zero.
(Note: The solution in case (c) gives the residence time distribution function for
the axial dispersion model.)
11. Transient diffusion–convection–reaction problems for N chemical species are de-
scribed by coupled parabolic equations of the form

𝜕2 c 𝜕c 𝜕c
D − u − Kc = , 0 < x < L, t>0
𝜕x 2 𝜕x 𝜕t
𝜕c 𝜕c
B.C: { − D = u(c0 − c), @ x = 0, = 0, @ x = L, t>0
𝜕x 𝜕x
I.C: c = f(x), t=0

where D is the dispersion coefficient, u is the velocity, K is the matrix of rate con-
stants and c is the concentration vector.
(a) Cast the equations into dimensionless form and identify the linear operators
of interest.
(b) Indicate how the equations may be decoupled into N scalar equations.
(c) Obtain a formal solution to each scalar equation. Write down the form of the
solution to the complete system of equations.
12. Obtain a formal solution to the system of partial differential equations

𝜕2 u1 𝜕2 u1
𝜕x 2
+ 𝜕z 2
+ a1 𝜕u 2
= f1 (x, z) }
0 < x < 1, 0<z<1
𝜕x
𝜕2 u2 𝜕2 u2 };
𝜕x 2
+ 𝜕z 2
+ a1 𝜕x = f2 (x, z) }
𝜕u1

u1 (0, z) = u1 (1, z) = 0, u2 (x, 0) = u2 (x, 1) = 0


𝜕u2 𝜕u
(0, z) = 2 (1, z) = 0, u1 (x, 0) = u1 (x, 1) = 0.
𝜕x 𝜕x
596 | 23 Finite Fourier transforms

Does a solution exist for every choice of f1 and f2 ? Explain.


13. (a) Obtain a formal solution to the following system of linear PDEs describing dif-
fusion and reaction:

𝜕u
= D∇2 u + Au; in Ω
𝜕t
u = 0 on 𝜕Ω (Dirichlet boundary conditions)

where Ω is a rectangle. Here, u is the vector of concentrations, D is the matrix of


diffusion coefficients and A is a constant n × n matrix.
(b) Obtain a formal solution to (1) with Neumann boundary conditions,

∇u.n = 0 on 𝜕Ω

Here, n is the unit outward normal to 𝜕Ω.


(c) Show schematic diagrams of contour plots of the spatial eigenfunctions for
cases (a) and (b).
14. let V be the vector space of complex-values functions u(x) defined on the interval
(0, a) satisfying the periodicity condition u(0) = u(a). Let L be the linear operator
on V defined by

du
Lu(x) = −i ; i = √−1
dx

(a) Show that this operator is self-adjoint w. r. t. the usual inner product on V,
i. e.,
a

⟨u, v⟩ = ∫ u(x)v(x) dx
0

(b) determine the eigenvalues and normalized eigenfunctions of L


(c) determine the coefficient in the expansion of an arbitrary complex values pe-
riodic function f (x) with f (x) = f (x + a) in terms of the eigenfunctions in (b)
(d) State the Parseval’s relation for the expansion in (c)
[Remark: The expansion in (c) is the complex form of the Fourier series for a
periodic function for a periodic function, which is often used in signal pro-
cessing and numerical calculations.]
15. Consider the solution of the diffusion equation in a finite domain

𝜕2 u 𝜕u
= ; 0 < x < 1, t>0
𝜕x 2 𝜕t
u(0, t) = 0, u(1, t) = 0
u(x, 0) = f (x)
23.4 Additional applications of FFT in rectangular coordinates | 597

Simplify the solution for the special case of f (x) = 1 and show that for short times
it reduces to the error function solution.
16. Consider the solution of Laplace’s equation in the rectangle

𝜕2 u 𝜕2 u
+ = 0; −a < x < a, 0 < y < b (a, b > 0)
𝜕x 2 𝜕y2
u(−a, y) = 0, u(a, y) = 0, u(x, b) = 0, u(x, 0) = f (x)

(a) Determine the solution using finite Fourier transform.


(b) Consider the case of a and b going to infinity (or the case of rectangle being ex-
tended to the upper half-plane). Show that for this limiting case, the solution
may be simplified to the Poisson’s formula

1 yf (ξ )
u(x, y) = ∫ dξ
π [(x − ξ )2 + y2 ]
−∞
24 Fourier transforms on infinite intervals
Recall that a regular differential operator had two characteristics (i) it was defined on a
finite interval, and (ii) the leading coefficient p0 (x) did not vanish inside or at the end
points of the interval. We now consider problems in which condition (i) is violated.
This leads to the Fourier transform on infinite and semi-infinite domains.

24.1 Fourier transform on (−∞, ∞)


Consider the eigenvalue problem

d2 u
= −λu, −a < x < a (24.1)
dx2
u(−a) = u(a), u′ (−a) = u′ (a) Periodic BCs. (24.2)

It is easily verified that this is a self-adjoint eigenvalue problem.


Eigenvalues:

n2 π 2
λn = , n = 0, 1, 2, . . . (24.3)
a2

Normalized eigenfunctions:

1
y0 (x) = (24.4)
√2a
1 nπx
un (x) = sin( ) (24.5)
√a a
1 nπx
yn (x) = cos( ); each λn (n > 0) is double (24.6)
√a a

Note that
a
∫−a y0 un (x) dx = 0 n ≠ 0
a } (orthogonality relation) (24.7)
∫−a y0 yn (x) dx =0 n ≠ 0

and
a a
0 m ≠ n
∫ ym (x)yn (x) dx = { ⇒ ∫ yn (x)ym (x) dx = δmn (24.8)
1 if m = n
−a −a
a a
0 m ≠ n
∫ um (x)un (x) dx = { ⇒ ∫ un (x)um (x) dx = δmn (24.9)
1 if m = n
−a −a

https://doi.org/10.1515/9783110739701-025
24.1 Fourier transform on (−∞, ∞) | 599

∫ yn (x)um (x) dx = 0 ∀m, n (m = n included) (24.10)


−a

{λn , un (x), yn (x)} form a basis for ℒ2 [−a, a], Hilbert space of periodic functions with the
standard inner product
a

⟨u, y⟩ = ∫ uy dx (24.11)
−a

If f (x) ∈ ℒ2 [−a, a], then



f (x) = a0 y0 (x) + ∑ [an yn (x) + bn un (x)] (24.12)
n=1
a
1
a0 = ⟨f , y0 ⟩ = ∫ f (ξ ) dξ (24.13)
√2a
−a
a
1 nπξ
an = ⟨f , yn ⟩ = ∫ sin( )f (ξ ) dξ (24.14)
√a a
−a
a
1 nπξ
bn = ⟨f , un ⟩ = ∫ cos( )f (ξ ) dξ (24.15)
√a a
−a


a a
1 ∞
1 nπx nπξ
f (x) = ∫ f (ξ ) dξ + ∑ sin( ) ∫ sin( )f (ξ ) dξ
2a n=1 a a a
−a −a
a
1 nπx nπξ
+ cos( ) ∫ cos( )f (ξ ) dξ .
a a a
−a

Equation (24.12) is the classical Fourier expansion of a periodic function in terms of


the eigenfunctions (which in this case are sines and cosines). Now,
a a
1 1 ∞ nπx nπξ nπx nπξ
f (x) = ∫ f (ξ ) dξ + ∑ ∫ [sin( ) sin( ) + cos( ) cos( )]f (ξ ) dξ
2a a n=1 a a a a
−a −a
a a
1 1 ∞ nπ(x − ξ )
= ∫ f (ξ ) dξ + ∑ ∫ cos[ ]f (ξ ) dξ
2a a n=1 a
−a −a

Thus,
600 | 24 Fourier transforms on infinite intervals

a a
1 1 ∞ nπ(x − ξ )
f (x) = ∫ f (ξ ) dξ + ∑ ∫ cos[ ]f (ξ ) dξ (24.16)
2a a n=1 a
−a −a

This is an identity for any f ∈ ℒ2 [−a, a]. We now obtain the Fourier integral formula
from this equation.

24.1.1 Fourier integral formula

Assume that f (x) is absolutely integrable, i. e.,



󵄨 󵄨
∫ 󵄨󵄨󵄨f (x)󵄨󵄨󵄨 dx < ∞ (24.17)
−∞

Then, as we let a 󳨀→ ∞, the first term on the RHS of equation (24.16) goes to zero.
Define
nπ π
αn = ⇒ Δαn = αn+1 − αn = (24.18)
a a

Then the second term on RHS of equation (24.16) may be written


a
1 ∞
T2 = ∑ ∫ cos[αn (x − ξ )]f (ξ ) dξ Δαn (24.19)
π n=1
−a

The sum
a

1
∑ F(αn )Δαn ; F(αn ) = ∫ cos[αn (x − ξ )]f (ξ ) dξ
n=1 π
−a

is a Riemann sum for the integral


∫ F(α) dα
0

Thus, taking limit a → ∞, we get


∞ ∞
1
f (x) = ∫ ∫ f (ξ ) cos α(x − ξ ) dξ dα (24.20)
π
0 −∞

This formula is also an identity for any f ∈ ℒ2 (−∞, ∞). This is known as the Fourier
integral formula. Since cosine is an even function and sine is an odd function, equation
(24.20) may be written as
24.1 Fourier transform on (−∞, ∞) | 601

∞ ∞ ∞ ∞
1 i
f (x) = ∫ ∫ f (ξ ) cos[α(x − ξ )] dξ dα + ∫ ∫ f (ξ ) sin[α(x − ξ )] dξ dα
2π 2π
−∞ −∞ −∞ −∞

(the second term is zero since sin α is odd in α) (24.21)


∞ ∞
1
f (x) = ∫ ∫ f (ξ )eiα(x−ξ ) dξ dα

−∞ −∞
∞ ∞
1
= ∫ ( ∫ f (ξ )e−iαξ dξ )eiαx dα. (24.22)

−∞ −∞

If we define

F(α) = ∫ e−iαξ f (ξ ) dξ = the Fourier transform of f (x), (24.23)


−∞

then equation (24.22) gives the inversion formula:



1
f (x) = ∫ eiαx F(α) dα. (24.24)

−∞

This transform is useful in solving equations with a second derivative operator (as well
as derivatives of other orders) on an infinite domain:

d2
− , −∞ < x < ∞
dx2

For example, (i) in the solution of the heat equation,

𝜕2 u 𝜕u
= , −∞ < x < ∞
𝜕x2 𝜕t
u(x, 0) = f (x)

or (ii) in the solution of the wave equation

𝜕2 u 𝜕2 u
c2 = 2, −∞ < x < ∞
𝜕x 2 𝜕t
𝜕u
u(x, 0) = f (x), (x, 0) = g(x)
𝜕t

or (iii) in the solution of Laplace’s equation

𝜕2 u 𝜕2 u
+ = 0, −a < x < a, −∞ < y < ∞
𝜕x 2 𝜕y2
602 | 24 Fourier transforms on infinite intervals

in an infinite strip with boundary conditions

u(a, y) = f (y), u(−a, y) = g(y).

24.2 Finite Fourier transform and the Fourier transform


Consider the eigenvalue problem

d2 u
= −λu, −∞ < x < ∞ (24.25)
dx2

We note that u = eiαx (with α = ±√λ) satisfies the equation and is bounded if α is real.
Now,

u′ = iαu, u′′ = (iα)2 u = −λu (24.26)

Thus, every α2 is an eigenvalue of equation (24.25) with eigenfunction u = eiαx . (Note


that sin αx and cos αx are also eigenfunctions.) Thus, in this case we have a continuous
spectrum. When the interval is of finite length, the spectrum is discrete. In the case of
discrete spectrum, the finite Fourier transform

f (x) → ⟨f , un (x)⟩ (24.27)

generates a countable infinite sequence of numbers. As the size of the interval is in-
creased to infinity the spectrum becomes continuous and the finite Fourier transform,
becomes a continuous function or “the Fourier transform,”
a

F(f ) = ∫ f (ξ )un (ξ ) dξ , n = 1, 2, . . . ; Finite FT, (discrete spectrum)


−a

a → ∞, continuous spectrum ⇒ F(α) = ∫ e−iαξ f (ξ ) dξ (24.28)


−∞

Thus, F(α) plays the role of the coefficients in the finite Fourier transform.

Remark. Since the cosine is an even function, the Fourier integral formula, given in
equation (24.20), may also be written as
∞ ∞
1
f (x) = ∫ ∫ f (ξ ) cos α(x − ξ ) dξ dα
π
0 −∞
∞ ∞
1
= ∫ ∫ f (ξ ) cos α(ξ − x) dξ dα (24.29)
π
0 −∞

󳨐⇒
24.2 Finite Fourier transform and the Fourier transform | 603

∞ ∞ ∞ ∞
1 i
f (x) = ∫ ∫ f (ξ ) cos[α(ξ − x)] dξ dα + ∫ ∫ f (ξ ) sin[α(ξ − x)] dξ dα
2π 2π
−∞ −∞ −∞ −∞


∞ ∞
1
f (x) = ∫ ∫ f (ξ )eiα(ξ −x) dξ dα

−∞ −∞
∞ ∞
1
= ∫ ( ∫ f (ξ )eiαξ dξ )e−iαx dα (24.30)

−∞ −∞

If we define
∞ ∞
iαξ 1
F(α) = ∫ e f (ξ ) dξ 󳨐⇒ f (x) = ∫ e−iαx F(α) dα (24.31)

−∞ −∞

Many authors (Carslaw and Jaeger [10]; Sneddon [28]) use these (equation (24.31)) as
the transform pair. However, we shall follow the notation used by Churchill [12] and
use the pair defined in equations (24.23) and (24.24) as

F(α) = ∫ f (ξ )e−iαξ dξ (24.32)


−∞

1
f (x) = ∫ F(α)eiαx dα (24.33)

−∞

The transforms given in equations (24.31) and (24.32)–(24.33) differ only by a change
of sign in α. Since α and ξ vary from −∞ to ∞, they are equivalent.

Example 24.1. Consider the function (also known as the decaying pulse)

e−cx , x≥0
f (x) = { (24.34)
0, x<0

where c is a real constant. The FT of this function is given by


∞ ∞

F(α) = ∫ e −iαξ
f (ξ ) dξ = ∫ e−iαξ ⋅ e−cξ dξ
−∞ 0

e−(iα+c)ξ 󵄨󵄨∞
󵄨󵄨
= ∫ e−(iα+c)ξ dξ = 󵄨
−(iα + c) 󵄨󵄨󵄨0
0
1 c − iα
= = (24.35)
(c + iα) (c2 + α2 )

604 | 24 Fourier transforms on infinite intervals


1 1
f (x) = ∫ eiαx dα
2π (c + iα)
−∞

1 c − iα iαx
= ∫ 2 e dα; (c + iα = s)
2π c + α2
−∞
c+i∞
1 1
= ∫ e(s−c)x ds
2πi s
c−i∞
1
= e−cx ⋅ ℓ−1 { }
s
e−cx , x ≥ 0
={
0, x<0

For an extensive table of Fourier transforms, see the book by Churchill [11].
We note that
∞ ∞
1 1 󵄨2
‖f ‖2 = ∫ e−2cx dx =
󵄨
= ∫ 󵄨󵄨󵄨F(α)󵄨󵄨󵄨 dα.
2c 2π
0 −∞

This relation is similar to Parseval’s theorem and is known as the Plancherel’s theo-
rem. We discuss it in more general form below.

24.2.1 Physical interpretation

Consider the eigenvalue problem:

d2 u
= −α2 u, −∞ < x < ∞ (24.36)
dx2

Every α2 > 0 is an eigenvalue with eigenfunction uα (x) = e−iαx . Note that eiαx is also an
eigenfunction corresponding to eigenvalue α2 . However if we let α vary from −∞ to ∞
we need to consider only one of the eigenfunctions. Thus, we have a continuous spec-
trum. To show that the eigenvalue problem is self-adjoint, we consider two functions
u, v ∈ ℒ2 (−∞, ∞), a Hilbert space with the usual inner product. Now,

𝜕2 u(x) 𝜕2 u(x)

⟨ , v⟩ = ∫ v(x) dx
𝜕x 2 𝜕x 2
−∞

𝜕2 v

𝜕v 󵄨󵄨󵄨

𝜕u
=( v − u )󵄨󵄨󵄨 + ∫ u 2 dx
𝜕x 𝜕x 󵄨󵄨−∞ 𝜕x
−∞
𝜕u 󵄨
𝜕v 󵄨󵄨

= ⟨u, Lv⟩ + ( v − u )󵄨󵄨󵄨 (24.37)
𝜕x 𝜕x 󵄨󵄨−∞
24.2 Finite Fourier transform and the Fourier transform | 605

Assuming that u and 𝜕x 𝜕v


vanish at infinity (this assumption is reasonable since if this
is not the case then u is not absolutely integrable),

⟨Lu, v⟩ = ⟨u, Lv⟩ (24.38)

Thus, we may use the formalism we had before. Consider again the Hilbert space
ℒ2 (−∞, ∞) with the usual inner product

⟨u, v⟩ = ∫ u(x)v(x) dx.


−∞

If f (x) ∈ ℒ2 (−∞, ∞), and

F(α) = ℱ (f (x)) = Fourier transform of f (x)

then

iαx
F(α) = ⟨f (x), e ⟩ = ∫ e−iαξ f (ξ ) dξ (24.39)
−∞

and the inverse is given by



1
f (x) = ∫ eiαx F(α) dα (24.40)

−∞

Thus, F(α) plays the role of the coefficients in the finite Fourier transform, as already
mentioned earlier.

24.2.2 Properties of the Fourier transform

The Fourier transform defined by


F(α) = ℱ {f (x)} = ∫ f (ξ )e−iαξ dξ (24.41)


−∞

can be used to obtain the following properties:


1. Transform of derivatives:

dm f
ℱ{ } = (iα)m F(α), m = 1, 2, 3, . . . (24.42)
dxm

if we assume that f and its derivatives vanish for x → ±∞.


606 | 24 Fourier transforms on infinite intervals

2. Multiplication by x:

dF
ℱ {xf (x)} = i (24.43)

3. Shift in x:

iαc
ℱ {f (x + c)} = e F(α), c real (24.44)

4. Shift in α:

icx
ℱ {f (x)e } = F(α − c), c real (24.45)

5. Scaling in x:

1 x
ℱ{ f ( )} = F(αc), c real, c ≠ 0 (24.46)
|c| c

6. Reflection in x:

ℱ {f (−x)} = F(−α) (24.47)

7. Transform of complex conjugate:

ℱ {f (x)} = F(−α) (24.48)

8. Convolution:

ℱ { ∫ f (x )g(x − x ) dx } = F(α)G(α) (24.49)


′ ′ ′

−∞

9. Transform and representation of Dirac delta function:


1
ℱ {δ(x − s)} = e
−iαs
, δ(x − s) = ∫ eiα(x−s) dα (24.50)

−∞

10. Moments theorem: We can expand F(α) in powers of α,


(−iα)k
F(α) = ∑ Mk (24.51)
k=0
k!

where Mk is kth spatial moment of f (x) [see further explanation in the next sec-
tion].
24.2 Finite Fourier transform and the Fourier transform | 607

11. Plancherel’s theorem:


∞ ∞
1
∫ f (x)g(x) dx = ∫ F(α)G(α) dα. (24.52)

−∞ −∞

For the special case of f (x) = g(x), equation (24.52) gives


󵄩󵄩 󵄩2 󵄨 󵄨2 1 󵄩󵄩 󵄩2
󵄩󵄩f (x)󵄩󵄩󵄩 = ∫ 󵄨󵄨󵄨f (x)󵄨󵄨󵄨 dx = 󵄩F(α)󵄩󵄩󵄩 . (24.53)
2π 󵄩
−∞

The LHS of equation (24.53) may be interpreted as the total energy content of f (x)
while the RHS represents the same in the frequency domain.

24.2.3 Moments theorem for Fourier transform

Let F(α) = ℱ {f (x)}, i. e.,

F(α) = ∫ f (ξ )e−iαξ dξ
−∞

(−iαξ )k
∞ ∞
= ∫ f (ξ ) ∑ dξ
k=0
k!
−∞

Interchanging the sum and integral gives

(−iα)k
∞ ∞

F(α) = ∑ ∫ ξ k f (ξ ) dξ
k=0
k!
−∞
∞ k
(−iα)
= ∑ Mk , (24.54)
k=0
k!

where Mk is the kth spatial moment of f (ξ ), defined by

Mk = ∫ ξ k f (ξ ) dξ , k = 0, 1, 2, . . . (24.55)
−∞

while the central moments are defined by

mk = ∫ (ξ − M1 )k f (ξ ) dξ , k = 1, 2, 3, . . . (24.56)
−∞
608 | 24 Fourier transforms on infinite intervals

Thus, if the Fourier transform of a function f (x) is known, we can determine the kth
spatial moment [or temporal moment for a function of time] without inverting the
Fourier transform as follows from equation (24.51):

M0 = F|α=0
dF 󵄨󵄨󵄨󵄨
M1 = −i 󵄨
dα 󵄨󵄨󵄨α=0
d2 F 󵄨󵄨󵄨
M2 = (−i)2 2 󵄨󵄨󵄨
dα 󵄨󵄨α=0
dk F 󵄨󵄨󵄨
Mk = (−i)k k 󵄨󵄨󵄨 , k = 0, 1, 2, . . . (24.57)
dα 󵄨󵄨α=0

The moment theorem is illustrated in the example below.

Example 24.2. Consider the convective-diffusion equation in an infinite domain

𝜕c 𝜕c 𝜕2 c
+u = Dm 2 , −∞ < x < ∞, t>0
𝜕t 𝜕x 𝜕x
with initial condition

c(x, 0) = δ(x);

and the conditions


𝜕c
c, →0 for x → ±∞.
𝜕x
Let

ĉ(α, t) = ℱ {c(x, t)}

󳨐⇒

dĉ
+ u(iα)ĉ = Dm (iα)2 ĉ; ĉ(t = 0) = 1
dt
󳨐⇒

ĉ = exp[−(iαu + α2 Dm )t]
(iαu + α2 Dm )2 t 2
= 1 − (iαu + α2 Dm )t + − ⋅⋅⋅
2!
u2 t 2
= 1 − iαut − α2 (Dm t + ) + ⋅⋅⋅
2
󳨐⇒
24.2 Finite Fourier transform and the Fourier transform | 609

M0 = 1,
M1 = ut,

M2 = 2Dm t + u2 t 2 ,

M3 = 6Dm t 2 u + u3 t 3 ,

M4 = 12D2m t 2 + 12Dm t 3 u2 + u4 t 4 , etc.

󳨐⇒ The central moments are given by

m2 = σ 2 = M2 − M12 = 2Dm t
m3 = 0

m4 = 12D2m t 2 , and so on.

It can be shown that all the odd central moments are zero. Thus, the dispersion is
symmetric around the centroid located at ut. [Remark: This result is not valid in a
finite domain due to inlet and exit boundary conditions.]

24.2.4 Fourier transform in spatial and cyclic frequencies

We have defined the Fourier transform pair in spatial frequency α (rad/cm) as


F(α) = ℱ {f (x)} = ∫ e−iαx f (x) dx

(24.58)
−∞

1
f (x) = ℱ {F(α)} =
−1
∫ eiαx F(α) dα

−∞

α
If we use cyclic frequency ω = 2π
(cycle/cm), the transform pair is defined by

F(ω) = ℱ {f (x)} = ∫ e−2πiωx f (x) dx

(24.59)
−∞

f (x) = ℱ −1 {F(ω)} = ∫ e2πiωx F(α) dω


−∞

Note that the constant multiplier 2π in inverse FT in spatial frequency has disappeared
when cyclic frequency is used. The cyclic frequency definition is used in analyzing
signals in time or spatially periodic structures.
610 | 24 Fourier transforms on infinite intervals

Fourier transforms in 2D and 3D


Let f (x, y) denote a 2D intensity image where x and y are spatial variables (having
length units). The FT pair of f (x, y) in spatial frequency is defined by
∞ ∞

F(α1 , α2 ) = ℱ {f (x, y)} = ∫ ∫ e−iα1 x−iα2 y f (x, y) dx dy


−∞ −∞
(24.60)
∞ ∞
1
f (x, y) = ℱ −1 {F(α1 , α2 )} = ∫ ∫ eiα1 x+iα2 y F(α1 , α2 ) dα1 dα2
(2π)2
−∞ −∞

α1 α2
where α1 and α2 are in rad/cm. If we use cyclic frequencies ω1 = 2π
and ω2 = 2π
in
cycle/cm, the transform pair can be defined by
∞ ∞

F(ω1 , ω2 ) = ℱ {f (x, y)} = ∫ ∫ e−2πi(ω1 x+ω2 y) f (x, y) dx dy


−∞ −∞
(24.61)
∞ ∞

f (x, y) = ℱ −1 {F(ω1 , ω2 )} = ∫ ∫ e2πi(ω1 x+ω2 y) F(ω1 , ω2 ) dω1 dω2


−∞ −∞

α1
Similarly, in 3D, we can define the pair using the spatial frequency vector α = ( α2 ) as
α3

∞ ∞ ∞

F(α) = ℱ {f (x)} = ∫ ∫ ∫ e−iα.x f (x) dx


−∞ −∞ −∞
(24.62)
∞ ∞ ∞
1
f (x) = ℱ −1 {F(α)} = ∫ ∫ ∫ eiα.x F(α) dα
(2π)3
−∞ −∞ −∞

ω1
or in the cyclic frequency vector ω = ( ω2 ) as
ω3

∞ ∞ ∞

F(ω) = ℱ {f (x)} = ∫ ∫ ∫ e−2πiω.x f (x) dx


−∞ −∞ −∞
(24.63)
∞ ∞ ∞

f (x) = ℱ −1 {F(ω)} = ∫ ∫ ∫ e2πiω.x F(ω) dω


−∞ −∞ −∞

x
where x = ( y ) is the vector of spatial coordinates, and α.x = α1 x + α2 y + α3 z and
z
ω.x = ω1 x + ω2 y + ω3 z represent the usual dot product in ℝ3 .
24.2 Finite Fourier transform and the Fourier transform | 611

24.2.5 Fourier transform and Plancherel’s theorem

Let

F(α) = ℱ {f (x)} and G(α) = ℱ {g(x)} (24.64)

then
∞ ∞
1
∫ f (x)g(x) dx = ∫ F(α)G(α) dα (24.65)

−∞ −∞

[As shown below, the 2π factor disappears if we use cyclic frequencies.] As stated ear-
lier, this is known as Plancherel’s theorem. A proof of this theorem uses integral rep-
resentation of the Dirac delta function:

1
δ(α − α ) = ∫ eix(α−α ) dx (24.66)



−∞

Since we have

1
f (x) = ∫ eiαx F(α) dα

−∞

and property (24.48) 󳨐⇒



1
g(x) = ∫ e−iα x G(α′ ) dα′ ,


−∞

we can write
∞ ∞ ∞ ∞
1
∫ f (x)g(x) dx = ∫ ∫ ∫ eiαx F(α)e−iα x G(α′ ) dα′ dα dx

(2π)2
−∞ −∞ −∞ −∞
∞ ∞ ∞
1 1
∫ ∫ F(α)G(α′ )( ∫ eiαx e−iα x dx) dα′ dα

=
2π 2π
−∞ −∞ −∞
∞ ∞
1
= ∫ ∫ F(α)G(α′ )δ(α − α′ ) dα′ dα

−∞ −∞
∞ ∞
1
= ∫ ∫ F(α)G(α) dα.

−∞ −∞

Taking g(x) = f (x), we get from equation (24.65)


612 | 24 Fourier transforms on infinite intervals

∞ ∞
󵄨 󵄨2 1 󵄨 󵄨2
∫ 󵄨󵄨󵄨f (x)󵄨󵄨󵄨 dx = ∫ 󵄨󵄨󵄨F(α)󵄨󵄨󵄨 dα (24.67)

−∞ −∞

󵄨 󵄨2 α
= ∫ 󵄨󵄨󵄨F(2πω)󵄨󵄨󵄨 dω, ω= . (24.68)

−∞

24.3 Solution of BVPs and IBVPs in infinite intervals using the FT


In this section, we illustrate the application of the Fourier transform to the solution of
linear differential equations on infinite and semi-infinite domains.

24.3.1 Heat equation in an infinite rod

Consider solving the heat equation

𝜕2 u 𝜕u
= , −∞ < x < ∞ (24.69)
𝜕x 2 𝜕t

with initial condition

u(x, 0) = f (x). (24.70)

Assume that u and its derivatives w. r. t. x are bounded at x = ±∞. Let

u(α,
̂ t) = ℱ {u(x, t)}
−∞

= ∫ e−iαξ u(ξ , t) dξ (24.71)


−∞

F(α) = ℱ {f (x)} = u(α,


̂ t = 0) (24.72)

Take the inner product of equations (24.69)–(24.70) with eigenfunctions

𝜕2 u iαx 𝜕u
⟨ 2
,e ⟩ = ⟨ , eiαx ⟩
𝜕x 𝜕t

d
−α2 ⟨u, eiαx ⟩ = ⟨u, e+iαx ⟩
dt
dû
=
dt

24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 613

2
⟨u, e+iαx ⟩ = Ke−α t


2
lim⟨u, e+iαx ⟩ = K lim e−α t or ⟨lim u, e+iαx ⟩ = K
t→0 t→0 t→0

K = ⟨f (x), e+iαx ⟩ = F(α)


2
⟨u, eiαx ⟩ = F(α)e−α t (24.73)



1 2
u(x, t) = ∫ eiαx F(α)e−α t dα

−∞
∞ ∞
1 2
= ∫ eiαx e−α t ( ∫ e−iαξ f (ξ ) dξ ) dα

−∞ −∞
∞ ∞
1 2
= ∫ ∫ eiα(x−ξ )−α t f (ξ ) dξ dα (24.74)

−∞ −∞

This is the formal solution of the heat/diffusion equation in an infinite domain.


The integral in equation (24.74) can be evaluated either directly or by using
Cauchy’s theorem as shown below.

Direct method
Let

2
I = ∫ e−α t+iα(x−ξ ) dα
−∞

2
= ∫ e−α t ⋅ [cos α(x − ξ ) + i sin α(x − ξ )] dα
−∞
∞ ∞
2 2
= ∫ e−α t cos α(x − ξ ) dα + i ∫ e−α t sin α(x − ξ ) dα
−∞ −∞

2
= 2 ∫ e−α t cos α(x − ξ ) dα
0
π − (x−ξ4t )2
=2⋅√ e
4t
614 | 24 Fourier transforms on infinite intervals

∴ Equation (24.74) 󳨐⇒

1 π (x−ξ )2
u(x, t) = ∫ 2 ⋅ √ e− 4t f (ξ ) dξ
2π 4t
−∞

1 (x−ξ )2
= ∫ e− 4t f (ξ ) dξ (24.75)
√4πt
−∞

Method based on Cauchy’s theorem


Let

2
I = ∫ e−α t+iα(x−ξ ) dα
−∞

1 i 2 (x−ξ )2
= ∫ e− t [αt− 2 (x−ξ )] ⋅ e− 4t dα
−∞

(x−ξ )2 2
=e − 4t ∫ e−t[α−iδ] dα (24.76)
−∞

where

(x − ξ )
δ= (24.77)
2t

(x−ξ )2
I = e− 4t J, (24.78)

2
J = ∫ e−t(α−iδ) dα (24.79)
−∞

to evaluate J, we use Cauchy’s theorem around the contour shown in Figure 24.1.

Figure 24.1: Schematic of the contour for Cauchy theorem to evaluate integral.
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 615

2
Since g(z) = e−t(z−iδ) is analytic inside and on the boundary of C,
2
∫ e−t(z−iδ) dz = 0,
C


R −R
−t(x−iδ)2 −t(z−iδ)2 2 2
∫e dx + ∫ e dz + ∫ e−tx dx + ∫ e−t(z−iδ) dz = 0 (24.80)
−R Γ2 R Γ4

Take limit R → ∞, then

δ
2 2
∫ e−t(z−iδ) dx = i ∫ e−t[R+iu−iδ] du
Γ2 0
δ
2 2
= ie−tR ∫ et[u−δ] ⋅ e−2iRt(u−δ) du = 0 for R → ∞
0

Similarly,

lim ∫ = 0
R→∞
Γ4

Equation (24.80) ⇒
∞ ∞
−t(x−iδ)2 2 π
∫ e dx = ∫ e−tx dx = √ (24.81)
t
−∞ −∞



1 (x−ξ )2
u(x, t) = ∫ e− 4t f (ξ ) dξ (24.82)
√4πt
−∞

is the solution to equations (24.69)–(24.70).


If we assume that ∫−∞ |f (ξ )| dξ exists, then so long as t > 0, it can be shown that

the integral on the RHS of equation (24.82) converges absolutely and uniformly in both
x and t for t > 0, as well as all of its derivatives w. r. t. x and t. Thus, differentiation
under the integral sign is valid. To verify the initial condition, let

x−ξ
= y ⇒ ξ = x − y√4t ⇒ dξ = −√4t dy (24.83)
√4t

󳨐⇒
616 | 24 Fourier transforms on infinite intervals


1 2
u(x, t) = ∫ e−y f (x − 2y√t) ⋅ (−√4t) dy
√4πt
−∞

1 −y2
= ∫ e f (x − 2y√t) dy (24.84)
√π
−∞

If f (x) is sectionally smooth or piecewise continuous, then the integral converges uni-
formly and absolutely in x and t. Hence, we can take the limit under the integral sign,

1 2
lim u(x, t) = ∫ e−y lim f (x − 2y√t) dy
t→0 √π t→0
−∞

= f (x) (24.85)

∴ For all f (x) for which ∫−∞ |f (x)| dx exists, the solution of equations (24.69)–(24.70) is

given by equation (24.82).

Physical interpretation
Let

f (x) = δ(x − s)
= unit source of heat at position x = s at time t = 0 (24.86)

1 (x−s)2
u(x, t) = e− 4t
√4πt
= temperature at position x and time t due to
a unit source at position s at t = 0 (24.87)

More generally,

1 (x−s)2
W(x, t, s, τ) = e− 4(t−τ)
√4π(t − τ)
= temp. at position x at time t due to a unit source
at position s at time τ(t > τ) (24.88)

W is called a fundamental solution (or the Green’s function) of the heat equation. It
satisfies the equation,

𝜕W 𝜕2 W
− = δ(x − s)δ(t − τ), (24.89)
𝜕t 𝜕x 2

and the adjoint heat equation


24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 617

𝜕W 𝜕2 W
− − = δ(x − s)δ(t − τ) (24.90)
𝜕τ 𝜕s2

We note that f (ξ ) dξ = amount of heat between ξ and ξ + dξ at time t = 0. Thus,


considering the distributed source as the sum(integral) of point sources and using the
principle of superposition, the temperature at x and time t is

1 (x−ξ )2
u(x, t) = ∫ e− 4t f (ξ ) dξ (24.91)
√4πt
−∞

This solution is valid for any piecewise continuous f (ξ ) for which the integral in equa-
tion (24.91) exists. We now consider some special cases:

Special case
Consider

1, ξ ≤0
f (ξ ) = { (24.92)
0, ξ >0


0
1 (x−ξ )2
u= ∫ e− 4t dξ (24.93)
√4πt
−∞

Let

x−ξ

√4t
x
√4t ∞
1 2 1 2
u= ∫ e−η (−√4t) dη = ∫ e−η dη
√4πt √π
∞ x
√4t
x x
√4t ∞ √4t
1 2 2 2 2
= ⋅ [ ∫ e−η dη + ∫ e−η dη − ∫ e−η dη]
2 √π
0x 0
√4t
x
√4t
1 2 2
= [1 − ∫ e−η dη]
2 √π
0

1 x
u(x, t) = [1 − erf( )] (24.94)
2 √4t

Figure 24.2 shows the solution profile at various times.


618 | 24 Fourier transforms on infinite intervals

Figure 24.2: Temporal evolution of dimensionless temperature distribution for 1D transient diffusion
in infinite domain.

Remark. If the equation is

𝜕2 u 𝜕u
D = (24.95)
𝜕x 2 𝜕t

then

1 x
u = [1 − erf( )], (24.96)
2 √4Dt

which has infinite speed of propagation (a property of parabolic equations).

24.3.2 Solution of the heat equation in semi-infinite domain

Consider the equation

𝜕2 u 𝜕u
= , 0 < x < ∞, t>0 (24.97)
𝜕x 2 𝜕t

with boundary condition

u = 0@x = 0 (24.98)

and initial condition

u(x, 0) = f (x) (24.99)

In the solution (equation (24.91)), suppose that we assume that f (ξ ) is odd, i. e.,

f (−ξ ) = −f (ξ ) (24.100)
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 619

Then
0 ∞
1 (x−ξ )2 (x−ξ )2
u(x, t) = [ ∫ e− 4t f (ξ ) dξ + ∫ e− 4t f (ξ ) dξ ] (24.101)
√4πt
−∞ 0

Setting ξ = −s, and f (−s) = −f (s) in the first integral, gives

0
(x+s)2
I1 = − ∫ e− 4t f (−s) ds

0
(x+s)2
= ∫ f (s)e− 4t ds


(x+ξ )2
= − ∫ f (ξ )e− 4t dξ
0

Thus,

1 (x−ξ )2 (x+ξ )2
u(x, t) = [ ∫ (e− 4t − e− 4t )f (ξ ) dξ ]. (24.102)
√4πt
0

Since the integral (equation (24.102)) converges uniformly and absolutely, we can take
limit under the integral sign:

u(0, t) = 0

Thus, the solution to equations (24.97)–(24.99) is given by equation (24.102).

Example: f (ξ) = 1

∞ ∞
1 (x−ξ )2 1 (x+ξ )2
u(x, t) = ∫ e− 4t dξ − ∫ e− 4t dξ
√4πt √4πt
0 0
1 x 1 x
= [1 + erf( )] − [1 − erf( )]
2 √4t 2 √4t
x
= erf( ). (24.103)
√4t

This is the solution to the heat equation in a semi-infinite domain with initial temper-
ature of unity and boundary (x = 0) temperature of zero (for t > 0). Figure 24.3 shows
the spatial profiles of the solution at various times.
620 | 24 Fourier transforms on infinite intervals

Figure 24.3: Spatial profiles of solution u(x, t) at various times for heat equation in semi-infinite
domain.

Nonhomogeneous problem
Consider the heat/diffusion equation in a semi-infinite domain

𝜕2 u 𝜕u
= , 0<x<∞ (24.104)
𝜕x2 𝜕t
with boundary and initial conditions

u = 1 @ x = 0, t>0 and u(x, 0) = f (x). (24.105)

Define

w =u−1 (24.106)

𝜕2 w 𝜕w
= , (24.107)
𝜕x 2 𝜕t
w(0, t) = 0, w(x, 0) = f (x) − 1 = F(x) (24.108)



1 (x−ξ )2 (x+ξ )2
u=1+ [ ∫ [e− 4t − e− 4t ][f (ξ ) − 1] dξ ] (24.109)
√4πt
0

For special case of f (x) = 0, the solution given in equation (24.109) reduces to

x x
u(x, t)|f =0 = u0 (x, t) = 1 − erf( ) = erfc( ). (24.110)
√4t √4t

Thus, equation (24.110) is the solution to


24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 621

𝜕2 u 𝜕u
= , u(0, t) = 1, u(x, 0) = 0 (24.111)
𝜕x 2 𝜕t
Consider the more general case of a nonhomogeneous problem given by

𝜕2 u 𝜕u
= , 0<x<∞ (24.112)
𝜕x 2 𝜕t
u = g(t) @ x = 0, u = f (x) @ t = 0 (24.113)

We have seen how to solve this problem for g(t) = 1. Call this solution U(x, t), i. e.,

1 (x−ξ )2 (x+ξ )2
U(x, t) = 1 + [ ∫ [e− 4t − e− 4t ][f (ξ ) − 1] dξ ] (24.114)
√4πt
0

Differentiate w. r. t. time to get unit impulse response, i. e., let

𝜕U
W(x, t) = (x, t) (24.115)
𝜕t
Then
t t
𝜕U
u(x, t) = ∫ W(x, t − τ)g(τ) dτ ⇒ u(x, t) = ∫ (x, t − τ)g(τ) dτ (24.116)
𝜕t
0 0

is the solution to equations (24.112)–(24.113). This formula is often called Duham-


mel’s formula in the engineering literature. If f (x) = 0, the solution to equations
(24.112)–(24.113) is

x2

2 2
u(x, t) = ∫ g(t − 2 )e−λ dλ (24.117)
√π
x

√4t

and the complete solution to equations (24.112)–(24.113) is

x2
∞ ∞
2 2 1 (x−ξ )2 (x+ξ )2
u(x, t) = ∫ g(t − 2 )e−λ dλ + [ ∫ [e− 4t − e− 4t ]f (ξ ) dξ ]. (24.118)
√π 4λ √4πt
x 0
√4t

Heat equation in a semi-infinite domain with zero flux at x = 0


Consider again the solution

1 (x−ξ )2
u(x, t) = ∫ e− 4t f (ξ ) dξ
√4πt
−∞

of the equation
622 | 24 Fourier transforms on infinite intervals

uxx = ut , −∞ < x < ∞ with u(x, 0) = f (x)

Let f (x) be an even function, i. e.,

f (−x) = f (x)


0 ∞
1 (x−ξ )2 (x−ξ )2
u(x, t) = [ ∫ e− 4t f (ξ ) dξ + ∫ e− 4t f (ξ ) dξ ]
√4πt
−∞ 0
0 ∞
1 2 (x−ξ )2
= [− ∫ e − (x+s)
4t f (−s) ds + ∫ e− 4t f (ξ ) dξ ]
√4πt
∞ 0

1 (x+ξ )2 (x−ξ )2
= ∫ [e− 4t + e− 4t ]f (ξ ) dξ
√4πt
0



𝜕u 1 2(x + ξ ) − (x+ξ4t )2 2(x − ξ ) − (x−ξ4t )2
= ∫ [− e − e ]f (ξ ) dξ
𝜕x √4πt 4t 4t
0
𝜕u 󵄨󵄨󵄨󵄨
󵄨 =0
𝜕x 󵄨󵄨󵄨x=0

∴ The solution of

𝜕2 u 𝜕u
= , 0<x<∞
𝜕x 2 𝜕t
𝜕u (24.119)
(0, t) = 0
𝜕x
u(x, 0) = f (x)

is given by

1 (x+ξ )2 (x−ξ )2
u(x, t) = ∫ [e− 4t + e− 4t ]f (ξ ) dξ (24.120)
√4πt
0

Extension of the solutions from finite to infinite domains:


In order to solve the heat/diffusion equation in a semi-infinite domain as given by

𝜕u 𝜕2 u
= 2, 0<x<∞
𝜕t 𝜕x
u(x, 0) = f (x),
u(0, t) = 0,
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 623

we first consider the same problem in a finite domain of length a (with u(a, t) = 0).
The solution for this case is given by
a
2 ∞ − n2 π2 2 t nπx nπs
u(x, t) = ∑ e a sin( ) ∫ f (s) sin( ) ds
a n=1 a a
0

Let
nπ π
αn = ⇒ Δαn = and a → ∞
a a

∞ ∞
2 2
u(x, t) = ∫ e−α t sin αx( ∫ f (s) sin αs ds) dα
π
0 0
eiαx − e−iαx eiαs − e−iαs
∞∞
2 2
= ∫ ∫ e−α t f (s)[ ][ ] ds
π 2i 2i
0 0
∞∞
1 2
= ∫ ∫ e−α t [cos α(x − s) − cos α(x + s)]f (s) dα ds
π
0 0

Using the result,



2 π − y4t2
∫ e−α t cos αy dα = √ e
4t
0



1 (x−s)2 (x+s)2
u(x, t) = ∫ [e− 4t − e− 4t ]f (s) ds.
√4πt
0

Thus, the solution in a semi-infinite domain may be obtained from that of finite do-
main by taking the limiting process. Other problems of infinite and semi-infinite do-
mains may also be solved in a similar way.

24.3.3 Transforms on the half-line

Consider the Fourier integral identity


∞ ∞
1
f (x) = ∫ ∫ f (ξ ) cos[α(x − ξ )] dξ dα
π
0 −∞
∞ ∞
1
= ∫ ∫ f (ξ )[cos αx cos αξ + sin αx sin αξ ] dξ dα (24.121)
π
0 −∞
624 | 24 Fourier transforms on infinite intervals

If f (ξ ) is odd, then we use the identity


∞∞
2
f (x) = ∫ ∫ f (ξ ) sin αx sin αξ ⋅ dξ dα
π
0 0
∞ ∞
2 2
= ∫ (√ sin αx)( ∫ √ sin αξ ⋅ f (ξ ) dξ ) dα (24.122)
π π
0 0

to define the transform pair

Fs (α) = √ π2 ∫0 f (ξ ) sin αξ ⋅ dξ }

} Fourier sine transform pair. (24.123)


f (x) = √ π2 ∫0 Fs (α) sin αx ⋅ dα }

Note the similarity with the finite Fourier transform pair

l
2 nπξ
F(n) = √ ∫ f (ξ ) sin( ) dξ (24.124)
l l
0

2 ∞ nπx
f (x) = √ ∑ F(n) sin( ) (24.125)
l n=1 l

The sine transform is useful when we have the operator

d2
, 0<x<∞ (24.126)
dx2
u(0) = 0. (24.127)

If f (x) is an even function, then we use the identity


∞∞
2
f (x) = ∫ ∫ f (ξ ) cos αx cos αξ ⋅ dξ dα (24.128)
π
0 0

Thus, we get the Fourier cosine transform

Fc (α) = √ π2 ∫0 f (ξ ) cos αξ dξ }

} Fourier cosine transform pair. (24.129)


f (x) = √ π2 ∫0 Fc (α) cos αx dα }

This transform is useful for the operator

d2
, 0<x<∞ (24.130)
dx2
u′ (0) = 0 (24.131)
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 625

24.3.4 Solution of heat/diffusion equation with radiation BC

Consider the solution of the heat/diffusion equation in a semi-infinite domain

𝜕2 u 𝜕u
= , 0 < x < ∞, t>0 (24.132)
𝜕x 2 𝜕t

with boundary condition

𝜕u
(0, t) − Bi u(0, t) = 0, (24.133)
𝜕x
and initial condition

u(x, 0) = f (x). (24.134)

We consider the eigenvalue problem

d2 w
= −λw (24.135)
dx 2
w′ − Bi w = 0, x = 0 (24.136)
w + Bi w = 0,

x=l (24.137)

which is self-adjoint. We solve the problem on the finite domain to obtain

l

αn cos αn x + Bi sin αn x
f (x) = 2 ∑ ∫ f (ξ )[αn cos αn ξ + Bi sin αn ξ ] dξ (24.138)
n=1 (αn2 + Bi2 )l + 2 Bi
0

where
αn Bi
cot αn l = − (24.139)
2 Bi 2αn

For large n, αn ≈ (n−1)π


l

π
Δαn = αn+1 − αn =
l
Let l → ∞ and replace the Riemann sum by an integral

∞ ∞
2 α cos αx + Bi sin αx
f (x) = ∫ ( ∫ (α cos αξ + Bi sin αξ )f (ξ ) dξ ) dα (24.140)
π α2 + Bi2
0 0

For some class of functions, this is an identity and is another type of Fourier transform.
We define the transform pair by
626 | 24 Fourier transforms on infinite intervals


2 (α cos αξ + Bi sin αξ )
F(α) = √ ∫ f (ξ ) dξ (24.141)
π √α2 + Bi2
0

2 (α cos αx + Bi sin αx)
f (x) = √ ∫ F(α) dα (24.142)
π √α2 + Bi2
0

This transform pair is useful for solving the heat equation with radiation BCs, i. e.,

d2
, 0<x<∞ (24.143)
dx2
u′ − Bi ⋅u = 0 @ x = 0 (24.144)

Let

u(α,
̂ t) = ℱ (u(x, t))

2 α cos αx + Bi sin αx
=√ ∫ u(x, t) dx (24.145)
π √α2 + Bi2
0

then it is easily shown that

𝜕2 u
ℱ{ (x, t)} = −α2 u(α,
̂ t)
𝜕x 2
u(α,
̂ 0) = ℱ (f (x)) = F(α)


2
û = e−α t ⋅ F(α) (24.146)


∞ ∞
2 2 α cos αx + Bi sin αx 2 α cos αξ + Bi sin αξ
u(x, t) = √ ∫ e−α t ( ∫ √ f (ξ ) dξ ) dα
π √α2 + Bi2 π √α2 + Bi2
0 0
−α2 t
∞∞
2 e .f (ξ )[α cos αx + Bi sin αx][α cos αξ + Bi sin αξ ]
= ∫∫ dξ dα (24.147)
π (α2 + Bi2 )
0 0

Changing the order of integration,



2
e−α t .[α cos αx + Bi sin αx][α cos αξ + Bi sin αξ ]
∞ ∞
2
u(x, t) = ∫ f (ξ )( ∫ dα) dξ (24.148)
π (α2 + Bi2 )
0 0

α Bi
Let cos θ = √α2 +Bi2
⇒ sin θ = √α2 +Bi2
. Then the second integral may be written as
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 627


2
I = ∫ e−α t cos(αx − θ) cos(αξ − θ) dα (24.149)
0

1 2
= ∫ e−α t [cos(α(x + ξ ) − 2θ) + cos α(x − ξ )] dα (24.150)
2
0
∞ ∞
1 2 1 2
= ∫ e−α t cos(αx + αξ − 2θ) dα + ∫ e−α t cos α(x − ξ ) dα. (24.151)
2 2
0 0

Thus, it may be shown after algebraic simplifications and evaluation of the integrals
that

1 (x+ξ )2 (x−ξ )2
u(x, t) = ∫ [e− 4t + e− 4t ]f (ξ ) dξ
√4πt
0

2 x+ξ
− Bi ∫ eBi t+Bi(x+ξ )
erf c[ + Bi √t]f (ξ ) dξ (24.152)
√4t
0

Remarks.
(1) The solution of

ut = uxx , x > 0, t > 0 (24.153)


𝜕u
− (0, t) + Bi u(0, t) = ϕ(t) Bi @ x = 0 (24.154)
𝜕x
u(x, 0) = f (x) (24.155)

is given by

1 (x+ξ )2 (x−ξ )2
u = ∫√ [e− 4t + e− 4t ]f (ξ ) dξ
4πt
0

2 x+ξ
− Bi ∫ eBi t+Bi(x+ξ )
erf c[ + Bi √t]f (ξ ) dξ
√4t
0
∞ 2
e−x /4(t−τ) 2 x
+ Bi ∫ { − Bi eBi t+Bi(x+ξ ) erf c[ + Bi √t − τ]}ϕ(τ) dτ
√π(t − τ) √4(t − τ)
0
(24.156)

(2) Many problems in the infinite and semi-infinite regions can also be solved by using
the Laplace transformation. We refer to the books (Carslaw and Jaeger [10]; Crank
[16]) for further examples.
628 | 24 Fourier transforms on infinite intervals

24.3.5 Fourier transforms on an infinite domain: solution of the wave equation

Consider again the operator

d2
, −∞ < x < ∞, (24.157)
dx2

which has continuous spectrum {α2 > 0} with eigenfunction {e±iαx }. Let

ℱ {f (x)} = F(α) = ∫ f (x)e dx. (24.158)


−iαx

−∞

Inversion formula

1
f (x) = ∫ F(α)eiαx dα (24.159)

−∞

Wave equation

𝜕2 u 𝜕2 u
2
= c2 2 , −∞ < x < ∞, t>0 (24.160)
𝜕t 𝜕x
IC:
𝜕u
u(x, 0) = f (x), (x, 0) = g(x), −∞ < x < ∞ (24.161)
𝜕t

̂ t) = ℱ [u(x, t)] = ∫ u(x, t)e−iαx dx
u(α, (24.162)
−∞

Now

𝜕n u
ℱ{ } = (iα)n ⋅ ℱ {u(x, t)}
𝜕x n

Let

F(α) = ℱ [f (x)] and G(α) = ℱ [g(x)] (24.163)

Then, taking FT on equations (24.160)–(24.161) ⇒

d2 û
= −c2 α2 û
dt 2
dû
u(0)
̂ = F(α), (0) = G(α)
dt
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 629


G(α)
u(α,
̂ t) = F(α) cos[αct] + sin[αct] (24.164)
αc


1
u(x, t) = ∫ u(α,
̂ t)e+iαx dα

−∞
∞ ∞
1 1 G(α)
= ∫ F(α) cos αcte+iαx dα + ∫ sin αcte+iαx dα (24.165)
2π 2π αc
−∞ −∞

Consider the first integral


∞ ∞
1
u1 (x, t) = ∫ cos[αct]e+iαx ( ∫ f (ξ )e−iαξ dξ ) dα

−∞ −∞
∞ ∞
1
= ∫ ∫ e+iα(x−ξ ) f (ξ ) cos[αct] dξ dα

−∞ −∞
iαct
e +e −iαct
cos[αct] =
2

∞ ∞ ∞ ∞
1 1
u1 (x, t) = ∫ ∫ eiαx−iαξ +iαct f (ξ ) dξ dα + ∫ ∫ eiαx−iαξ −iαct f (ξ ) dξ dα (24.166)
4π 4π
−∞ −∞ −∞ −∞

In the Fourier integral formula,


∞ ∞
1
f (z) = ∫ ∫ e−iαξ +iαz f (ξ ) dξ dα,

−∞ −∞

let z = x − ct, x + ct. Then equation (24.166) simplifies to

1
u1 (x, t) = [f (x + ct) + f (x − ct)] (24.167)
2

Now consider the second integral in equation (24.167):


∞ ∞
1 sin[αct]
u2 (x, t) = ∫ eiαx ( ∫ e−iαξ g(ξ ) dξ ) dα
2π αc
−∞ −∞
∞ ∞
1 sin αct
= ∫ ∫ eiαx−iαξ g(ξ ) dξ dα
2πc α
−∞ −∞
630 | 24 Fourier transforms on infinite intervals

eiαct − e−iαct
∞ ∞
1
= ∫ ∫ eiαx−iαξ g(ξ )( ) dξ dα
2πc 2iα
−∞ −∞

eiα(x+ct)−iαξ g(ξ ) eiα(x−ct)−iαξ g(ξ )


∞ ∞ ∞ ∞
1 1
= ∫ ∫ dξ dα − ∫ ∫ dξ dα
2πc 2iα 2πc 2iα
−∞ −∞ −∞ −∞

x 1 x
If h(x) = ∫0 f (x) dx, then ℱ {h(x)} = iα
ℱ {f (x)} ⇒ ℱ −1 { F(α)

} = ∫0 f (x) dx,

x−ct x+ct
−1 1
u2 (x, t) = ∫ g(λ) dλ + ∫ g(λ) dλ
2c 2c
0 0
x+ct
1
= ∫ g(λ) dλ (24.168)
2c
x−ct


x+ct
f (x + ct) + f (x − ct) 1
u(x, t) = + ∫ g(λ) dλ (24.169)
2 2c
x−ct

is the solution to the initial value problem defined in equations (24.160)–(24.161).


x + ct = constant is a wave moving to the left with velocity c, x − ct = constant is a wave
moving to the right with velocity c. Therefore, the solution is a superposition of two
waves, one moving to the left and one to the right with a velocity of c.
Related problems
(i)

𝜕2 u 2
2𝜕 u
= c , 0 < x < ∞, t>0 (24.170)
𝜕t 2 𝜕x 2

BC:

u(0, t) = ϕ(t) (24.171)

IC:

𝜕u
u(x, 0) = 0, (x, 0) = 0 (24.172)
𝜕t

Using the Fourier sine transform, it can be shown that

∞ ct

u(x, t) = ∫ ∫ ϕ(τ) sin α(ct − τ) sin αx dα dτ (24.173)


0 0
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 631

ϕ(ct − x), 0 < x < ct


={ (24.174)
0, x > ct

(ii) 𝜕2 u 𝜕2 u
= 2, 0 < x < ∞, t>0 (24.175)
𝜕t 2 𝜕x
BC:

ux (0, t) = 0 (24.176)

IC:
𝜕u
u(x, 0) = f (x), (x, 0) = 0 (24.177)
𝜕t
Using the cosine transform, one gets
f (x+t)+f (t−x)
2
, 0<x<t
u(x, t) = { f (x+t)+f (x−t)
(24.178)
2
, x>t

(iii) The solution of the inhomogeneous wave equation

𝜕2 u 2
2𝜕 u
= c + q(x, t), −∞ < x < ∞, t>0 (24.179)
𝜕t 2 𝜕x 2
with IC
𝜕u
u(x, 0) = 0, (x, 0) = 0 (24.180)
𝜕t
may be written in the form

t x+c(t−τ)
1
u(x, t) = ∫ ∫ q(ξ , τ) dξ dτ (24.181)
2c
0 x−c(t−τ)

24.3.6 Laplace’s equation in infinite and semi-infinite domains

In this section, we consider the solution of Laplace’s equation using the Fourier trans-
form method.

Laplace’s equation in a strip


Consider Laplace’s equation

𝜕2 u 𝜕2 u
+ = 0; −∞ < x < ∞, 0<y<a (24.182)
𝜕x 2 𝜕y2
632 | 24 Fourier transforms on infinite intervals

u(x, 0) = f (x), u(x, a) = 0 (24.183)

Let

û = ℱ (u(x, y)), F(α) = ℱ {f (x)} (24.184)

d2 û
− α2 û = 0
dy2
u(a)
̂ = 0, u(0)
̂ = F(α)

û = c1 sinh αy + c2 sinh α(a − y)

F(α)
u(a)
̂ = 0 ⇒ c1 = 0 and u(0)
̂ = F(α) ⇒ c2 = sinh αa
. Therefore,

F(α) sinh α(a − y)


u(α,
̂ y) = (24.185)
sinh αa
∞ ∞
1 sinh α(a − y)
u(x, y) = ∫ eiαx ( ∫ e−iαξ f (ξ ) dξ ) dα (24.186)
2π sinh αa
−∞ −∞
∞ ∞
1 sinh α(a − y)
= ∫ ∫ eiαx−iαξ f (ξ ) dξ dα (24.187)
2π sinh αa
−∞ −∞

Changing the order of integration,



∞ ∞
1 sinh α(a − y)
u(x, y) = ∫ f (ξ )( ∫ eiα(x−ξ ) dα) dξ
2π sinh αa
−∞ −∞

This may be simplified to



1 πy f (ξ )
u(x, y) = sinh ∫ dξ . (24.188)
a a π(a−y)
cos a + cosh a
π(x−ξ )
−∞

Laplace’s equation in a half-plane


Consider Laplace’s equation

𝜕2 u 𝜕2 u
+ = 0, −∞ < x < ∞, y>0 (24.189)
𝜕x 2 𝜕y2
u(x, 0) = f (x) (24.190)
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 633

u(x, y) is bounded. Let



2
û s = √ ∫ u(x, y) sin αy dy = ℱs {u(x, y)} (24.191)
π
0

𝜕2 u
ℱs { } = −α2 û s + αu(x, 0) (24.192)
𝜕y2

Thus, taking sine transform, equations (24.189)–(24.190) reduces to

d2 û s
− α2 û s = −αf (x), −∞ < x < ∞ (24.193)
dx2

This equation is easily solved using the Fourier transform

αeiβx
∞ ∞
1
û s = ∫ 2 ( ∫ f (ξ )e−iβξ dξ ) dβ (24.194)
2π α + β2
−∞ −∞

and

2
u(x, y) = √ ∫ û s (x, α) sin αy dy
π
0

αeiβx
∞ ∞ ∞
2 1
= √ ∫[ ∫ 2 ( ∫ f (ξ )e−iβξ dξ ) dβ] sin αy dα
π 2π α + β2
0 −∞ −∞

but

αeiβ(x−ξ )

αe−α(x−ξ )
∫ dβ = 2πi = πe−α(x−ξ )
α2 + β 2 2αi
−∞


∞ ∞
1
u(x, y) = √ ∫ ∫ f (ξ )e−α(x−ξ ) sin αy dξ dα

0 −∞

y f (ξ )
= ∫ dξ (24.195)
√2π (x − ξ )2 + y2
−∞

This is the Poisson’s integral formula.


634 | 24 Fourier transforms on infinite intervals

Special case: f (x) = boxcar function


For the special case of f (x) being boxcar function:

1, |x| ≤ 1
f (x) = {
0, |x| > 1,

the solution (24.195) is reduced to

1 1−x 1+x
u(x, y) = {tan−1 [ ] + tan−1 [ ]}
√2π y y

and a 3D plot of the solution is shown in Figure 24.4.

1, |x|≤1
Figure 24.4: Solution profile for 2D Laplace equation in a half-plane with u(x, y = 0) = { .
0, |x|>1

Laplace’s equation in a half-strip


Consider Laplace’s equation in a half-strip:

𝜕2 u 𝜕2 u
+ = 0, 0 < x < ∞, 0<y<1 (24.196)
𝜕x 2 𝜕y2

with boundary conditions

u(x, 0) = f (x), u(x, 1) = 0, 0<x<∞ (24.197)


u(0, y) = 0, 0 < y < 1. (24.198)

We use the sine transform by defining


2
û = √ ∫ u(ξ , y) sin αξ dξ (24.199)
π
0
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT | 635


2
f̂ = √ ∫ f (ξ ) sin αξ dξ . (24.200)
π
0

Taking the sine transform gives

d2 û
− α2 û = 0
dy2
û = f ̂ @ y = 0, û = 0 @ y = 1

û = c1 sinh αy + c2 sinh α(1 − y)

BCs ⇒

c1 = 0

c2 =
sinh α

f ̂ sinh α(1 − y)
û = (24.201)
sinh α
∞ ∞
2 sinh α(1 − y)
u(x, y) = ∫ sin αx( ∫ f (ξ ) sin αξ dξ ) dα (24.202)
π sinh α
0 0

is the formal solution.

Remark. If the BC at x = 0 is 𝜕u
𝜕x
= 0 then we can use the cosine transform.

24.3.7 Multiple Fourier transforms

Consider the solution of the heat equation in three dimensions

𝜕u 𝜕2 u 𝜕2 u 𝜕2 u
= 2 + 2 + 2, −∞ < x < ∞, −∞ < y < ∞, −∞ < z < ∞, t>0
𝜕t 𝜕x 𝜕y 𝜕x
(24.203)

with initial condition

u(x, y, z, 0) = f (x, y, z) (24.204)

Eigenvalues
636 | 24 Fourier transforms on infinite intervals

{α12 + α22 + α32 } = |α|2 , α = [α1 α2 α3 ] (24.205)

eigenfunctions

e−i(α1 x+α2 y+α3 z) = e−iα⋅x , x = [x y z] (24.206)

Define
∞ ∞ ∞
−i(α1 x+α2 y+α3 z)
ℱ {f (x, y, z)} = ∫ ∫ ∫ e f (x, y, z) dx dy dz = F(α1 , α2 , α3 ) (24.207)
−∞ −∞ −∞

ℱ {u(x, y, z, t)} = u(α


̂ 1 , α2 , α3 , t) (24.208)

Then we get

dû
= −(α12 + α22 + α32 )û
dt
u(0)
̂ = F(α1 , α2 , α3 )


2 2 2
û = F(α1 , α2 , α3 )e−(α1 +α2 +α3 )t
∞ ∞ ∞
2 2 2
= ∫ ∫ ∫ f (ξ , η, λ)e−(α1 +α2 +α3 )t−i(α1 ξ +α2 η+α3 λ) dξ dη dλ (24.209)
−∞ −∞ −∞


∞ ∞ ∞
1 2 2 2
u(x, y, z) = 3
∫ ∫ ∫ e−(α1 +α2 +α3 )t+i(α1 x+α2 y+α3 z) dα1 dα2 dα3
(2π)
−∞ −∞ −∞
∞ ∞ ∞

× ( ∫ ∫ ∫ f (ξ , η, ζ )e−i(α1 ξ +α2 η+α3 λ) dξ dη dζ )


−∞ −∞ −∞

This can be simplified in the same way as the one-dimensional problem. The final
result is
∞ ∞ ∞
1 2 2 2
u(x, y, z, t) = 3
∫ ∫ ∫ e−((x−ξ ) +(y−η) +(z−ζ ) )/4t f (ξ , η, ζ ) dξ dη dζ (24.210)
( 4πt)

−∞ −∞ −∞
∞ ∞ ∞

= ∫ ∫ ∫ G(x, y, z, ξ , η, ζ , t, 0)f (ξ , η, ζ ) dξ dη dζ (24.211)


−∞ −∞ −∞

where
24.4 Relationship between Fourier and Laplace transforms | 637

1
G= exp{−[(x − ξ )2 + (y − η)2 + (z − ζ )2 ]/4(t − τ)} (24.212)
[4π(t − τ)]3/2

is the temperature at position (x, y, z) and time t due to a unit point source at (ξ , η, ζ )
at time τ(t > τ).
G satisfies the heat equation

𝜕G 𝜕2 G 𝜕2 G 𝜕2 G
− 2 − 2 − 2 = δ(x − ξ )δ(y − η)δ(z − ζ )δ(t − τ) (24.213)
𝜕t 𝜕x 𝜕y 𝜕z

as well as the adjoint heat equation

𝜕G 𝜕2 G 𝜕2 G 𝜕2 G
− − − − = δ(x − ξ )δ(y − η)δ(z − ζ )δ(t − τ) (24.214)
𝜕τ 𝜕ξ 2 𝜕η2 𝜕ζ 2

In analogous fashion, one can define double Fourier transforms, double sine trans-
forms, double cosine transforms, triple cosine transforms, and so on.

24.4 Relationship between Fourier and Laplace transforms


We consider the Fourier integral formula
∞ ∞
1
f (x) = ∫ eiαx ( ∫ f (ξ )e−iαξ dξ ) dα (24.215)

−∞ −∞

and assume that f (x) is absolutely integrable. Then f (x) is bounded, i. e., ∃ a constant
M such that
󵄨󵄨 󵄨
󵄨󵄨f (x)󵄨󵄨󵄨 < M. (24.216)

Now, let

e−γx ϕ(x), x>0


f (x) = { (24.217)
0, x<0

Note that ϕ(x) is of exponential order, i. e.,


󵄨󵄨 󵄨 γx
󵄨󵄨ϕ(x)󵄨󵄨󵄨 ≤ Me , γ>0 (24.218)

Equation (24.215)⇒
∞ ∞
1
e−γx ϕ(x) = ∫ eiαx ( ∫ e−γξ ⋅ e−iαξ ϕ(ξ ) dξ ) dα

−∞ 0


638 | 24 Fourier transforms on infinite intervals

∞ ∞
1
ϕ(x) = ∫ e(γ+iα)x ( ∫ e−(γ+iα)ξ ϕ(ξ ) dξ ) dα (24.219)

−∞ 0

Let s = γ + iα ⇒ dα = 1i ds ⇒

γ+i∞ ∞
1
ϕ(x) = ∫ esx ( ∫ e−sξ ϕ(ξ ) dξ ) ds (24.220)
2πi
γ−i∞ 0

This is an identity for some class of functions ϕ(x). Define

ℓ{ϕ(x)} = Φ(s) = ∫ e−sx ϕ(x) dx (24.221)


0

Then equation (24.220) gives the inversion formula

γ+i∞
1
ϕ(x) = ∫ esx Φ(s) ds. (24.222)
2πi
γ−i∞

We showed that Φ(s) is analytical in the right half-plane Re s > γ. Thus, equations
(24.221) and (24.222) define the Laplace transform and the inversion formula.
Consider again the Fourier transform pair

F(α) = ∫ f (ξ )e−iαξ dξ
−∞

1
f (x) = ∫ F(α)eiαx dα

−∞

and suppose that f (ξ ) = 0, ξ < 0 ⇒

F(α) = ∫ f (ξ )e−iαξ dξ
0

1
f (x) = ∫ F(α)eiαx dα

−∞

or

ℱ {f (x)} = ∫ e f (x) dx
−iαx

0
24.4 Relationship between Fourier and Laplace transforms | 639

Compare this with the Laplace transform


ℓ{f (x)} = ∫ e−sx f (x) dx


0

Thus, in the Laplace transform, we replace s by iα, we get the Fourier transform pro-
vided both exist.

Example 24.3.

e−x , x>0
f (x) = {
0, x<0
1
ℓ{f (x)} =
1+s
1 1 − iα
F(α) = ℱ {f (x)} = = .
1 + iα 1 + α2
󵄨 󵄨2 1
󳨐⇒ 󵄨󵄨󵄨F(α)󵄨󵄨󵄨 = .
1 + α2

We note that
∞ ∞
󵄨 󵄨2 1 1 󵄨 󵄨2
∫ 󵄨󵄨󵄨f (x)󵄨󵄨󵄨 dx = = ∫ 󵄨󵄨󵄨F(α)󵄨󵄨󵄨 dα,
2 2π
−∞ −∞

which is Plancherel’s theorem.

Problems
1. The evolution of small amplitude waves on a vertically falling film is described by
the linear partial differential equation,

𝜕h 𝜕h 𝜕3 h 5 𝜕2 h 27 𝜕2 h Re We 𝜕4 h
+3 +3 3 − Re − Re 2 + = 0, −∞ < x < ∞,
𝜕t 𝜕x 𝜕x 32 𝜕x𝜕t 160 𝜕x 12 𝜕x 4

where h is the film height, Re is the Reynolds number and We is the Weber number.
(a) Use the separation of variables method

hα (x, t) = δeiαx eλt

to determine the condition on the eigenvalue λ so that hα (x, t) is a solution. (b) Use
the result in (a) to determine the range of unstable wave numbers (for which the
real part of the eigenvalue is positive), (c) Plot the neutral stability curve (that
demarcates between the stable and unstable wave numbers) in the (α, We) plane.
2. The stability of the conduction state in a porous layer is determined by the homo-
geneous partial differential equations
640 | 24 Fourier transforms on infinite intervals

𝜕2 ψ 𝜕2 ψ 𝜕θ
+ 2 − Ra =0
𝜕x 2 𝜕z 𝜕x
𝜕2 θ 𝜕2 θ 𝜕ψ
+ + = 0, −∞ < x, ∞, 0<z<1
𝜕x 2 𝜕z 2 𝜕x

with boundary conditions θ = 0, ψ = 0 at z = 0, 1. Here, Ra is a physical parameter


known as the Rayleigh number (ψ(x, z) is the stream function while θ(x, z) is the
temperature perturbation) (a) Use the separation of variables method or the ap-
propriate Fourier transforms to determine the general form of the solution, i. e.,
how a typical term in the solution looks like, (b) Determine the condition(s) under
which there exists a nontrivial solution, (c) What is the critical value of Ra below
which only a trivial solution can exist? What is the wave number that leads to this
destabilization? Give a physical interpretation of the eigenfunctions correspond-
ing to this critical Ra and wave number.
3. (a) Determine the Fourier transform of the function

f (x) = e−a|x| , −∞ < x < ∞, a>0

(b) What is the relationship between the Fourier transforms of f (x) and f (x − ξ )?
(c) Use the above results to solve the boundary value problem

d2 u
− + a2 u = δ(x − ξ ); −∞ < x < ∞
dx2

(d) Use the result in (c) to solve the boundary value problem

d2 u
− + a2 u = h(x); −∞ < x < ∞
dx2

4. Consider the following evolution equation describing the dispersion of a tracer in


laminar flow in a very long tube:

𝜕c 𝜕c 𝜕2 c 𝜕2 c
+ p + λp − 2 = 0, −∞ < z < ∞, t > 0; c(z, 0) = δ(z)
𝜕t 𝜕z 𝜕z𝜕t 𝜕z

Here, p and λ are positive constants and δ(z) is the Dirac’s delta function. (a) Use
the Fourier transform (or any other method) to solve the equation and determine
c(z, t) for λ = 0 (b) If the k-th spatial moment (k ≥ 0) of c(z, t) is defined as

mk (t) = ∫ z k c(z, t) dz
−∞

determine the first three spatial moments (k = 0, 1, 2) for any λ, and hence the
variance without solving for c(z, t).
24.4 Relationship between Fourier and Laplace transforms | 641

5. Solve the problem

𝜕2 u 𝜕u
D = ; 0 < x < ∞, t > 0
𝜕x 2 𝜕t
u = f (t); @ x = 0, t > 0
u = g(t); @ t = 0, x>0

Apply your result to the special case g = 0 and f = A cos ωt to determine how a
periodic signal is attenuated.
6. (a) Consider the problem of a very long empty tubular reactor

𝜕2 c 𝜕c 𝜕c
D 2
− u − kc = , 0 < x < ∞, t > 0
𝜕x 𝜕x 𝜕t
𝜕c
−D = u(c0 − c), x = 0, t > 0
𝜕x
c = f (x), t = 0, x > 0

Put it into self-adjoint form. Consider what transform on the semi-infinite interval
might solve it.
(b) Consider the above problem with c0 = c0 (t). Cast the above equations into di-
mensionless form but leave c(x, t) dimensional. Make successively, substitutions
of the following form to put the equation in its simplest form:

x Pe Pe2 2 𝜕w
c = exp( )v; v = exp{−( + k)τ}w; ϕ=w−
2 4 Pe 𝜕x

What is now the form for the problem and what is the solution? Having found ϕ,
how does one find w?
7. A very long slab with two insulated opposite faces has arbitrary temperatures im-
posed on the other two faces so that the system is described by

𝜕2 u 𝜕2 u
+ = 0, −∞ < x < ∞, 0<y<L
𝜕x 2 𝜕y2
u(x, 0) = f (x), −∞ < x < ∞
u(x, L) = g(x), −∞ < x < ∞

Find a formal solution and show that it is identical to the Poisson’s integral for-
mula in the limit L → ∞.
8. Use multiple Fourier transform to solve the problem of heat flow with heat pro-
duction:
𝜕u
− div grad u = q(x, y, z, t), t > 0, −∞ < x, y, z < ∞
𝜕t
u(x, y, z, 0) = 0, u(x, y, z, t) bounded
25 Fourier transforms in cylindrical and spherical
geometries
Recall that for a regular differential operator, the leading coefficient did not vanish in-
side or at the end points of the interval. We now consider problems in which this con-
dition is violated. These problems arise mostly in cylindrical and spherical domains.

25.1 BVP and IBVP in cylindrical and spherical geometries


As in the case of rectangular coordinates, boundary and initial–boundary value prob-
lems usually involve the Laplacian operator ∇2 with various types of boundary condi-
tions. Examples of some of the well-known problems include but are not limited to:
1. Laplace’s equation

∇2 u = 0 in Ω; u = g on 𝜕Ω (or other BCs) (25.1)

2. Poisson’s equation

∇2 u = f in Ω; u = g on 𝜕Ω (or other BCs) (25.2)

3. Heat equation

𝜕u
= ∇2 u in Ω, t > 0; (25.3)
𝜕t
u = u0 @ t = 0, in Ω (25.4)
αu + βn.∇u = γ on 𝜕Ω, t>0 (25.5)

4. Wave equation

𝜕2 u
= c2 ∇2 u in Ω, t > 0; (25.6)
𝜕t 2
u = 0 on 𝜕Ω (25.7)
𝜕u
u = g1 and = g2 on 𝜕Ω @ t > 0 (25.8)
𝜕t

5. Helmholtz/diffusion–reaction equation

∇2 u = ±k 2 u in Ω; u = g on 𝜕Ω (or other BCs) (25.9)

We examine the solution of some of these equations in spherical and cylindri-


cal geometries using the Fourier transform and other methods. As discussed in
Part IV, the case of hollow cylinder and sphere lead to regular Sturm–Liouville

https://doi.org/10.1515/9783110739701-026
25.1 BVP and IBVP in cylindrical and spherical geometries | 643

BVPs and can be treated by the standard FFT method. Thus, the treatment below
is mostly confined to a solid cylinder and sphere.

25.1.1 Cylindrical geometries

The domain of interest may include either the inside or outside of a cylindrical domain
or the annular region or their combinations as shown schematically in Figure 25.1.

Figure 25.1: Schematic of cylindrical geometries: solid and hollow cylinders.

The Laplacian operator in cylindrical geometry can be obtained by the transformation


from Cartesian coordinate to the cylindrical coordinate system as shown in Figure 25.2,
and results in (r, θ, z) as

1 𝜕 𝜕u 1 𝜕2 u 𝜕2 u
∇2 u = (r ) + 2 2 + 2 , (25.10)
r 𝜕r 𝜕r r 𝜕θ 𝜕z

where the finite cylinder is represented as

{ 0 < θ < 2π
{
Ω≡{ 0<r<a (25.11)
{
{ 0 < z < L,
644 | 25 Fourier transforms in cylindrical and spherical geometries

Figure 25.2: Cylindrical coordinate system.

with the boundary

r = a, 0<z<L
𝜕Ω ≡ { (25.12)
z = 0, L, 0 < r < a.

Similarly, the half-cylinder or sector can be represented by

{ 0 < θ < 2π or θ1 < θ < θ2


{
Ω≡{ 0<r<a (25.13)
{
{ 0<z<L

and so forth. Similarly, the Laplacian operator in 1D (in r only) can be simplified to

1 𝜕 𝜕u
∇2 u = (r ) (25.14)
r 𝜕r 𝜕r

or in 2D as

1 𝜕 𝜕u 1 𝜕2 u
(r, θ) : ∇2 u = (r ) + 2 2 (25.15)
r 𝜕r 𝜕r r 𝜕θ
1 𝜕 𝜕u 𝜕2 u
(r, z) : ∇2 u = (r ) + 2 (25.16)
r 𝜕r 𝜕r 𝜕z

and so forth.

25.1.2 Spherical geometries

Similar to the cylindrical case, the BVPs and initial BVPs in spherical geometries usu-
ally involve solid or hollow spheres. The domain of interest may include either the
25.1 BVP and IBVP in cylindrical and spherical geometries | 645

inside or outside of the spherical domain or the annular region or their combinations.
Figure 25.3 shows the spherical coordinate system, where the relationship between
the Cartesian coordinate to the spherical coordinate system is as follows:

Figure 25.3: Spherical coordinate system.

which results in the Laplacian operator in 3D spherical coordinates in (r, θ, ϕ) as

1 𝜕 2 𝜕u 1 𝜕 𝜕u 1 𝜕2 u
∇2 u = (r ) + (sin θ ) + . (25.17)
r 2 𝜕r 𝜕r r 2 sin θ 𝜕θ 𝜕θ r 2 sin2 θ 𝜕ϕ2

The interior domain of sphere of radius a can be represented as

{ 0<r<a
{
Ω≡{ 0<θ<π (25.18)
{
{ 0 < ϕ < 2π

with the boundary

𝜕Ω ≡ {r = a, 0 < θ < π, 0 < ϕ < 2π. (25.19)

Similarly, a hemisphere or cone can be represented by

{ 0 < θ < π2 or 0 < θ < θ0 < π


2
{
Ω={ 0<r<a (25.20)
{
{ 0<z<L

and so forth.
646 | 25 Fourier transforms in cylindrical and spherical geometries

[Note: The polar angle θ is also referred to as latitude, where 0 < θ < π, with
θ = π2 denoting equator, θ = 0 denoting north pole and θ = π denoting the south pole.
Similarly, the azimuthal angle ϕ is also referred to as longitude where 0 < ϕ < 2π].
Similarly, the Laplacian operator in 1D (in r only) can be simplified to

1 𝜕 2 𝜕u
∇2 u = (r ) (25.21)
r 2 𝜕r 𝜕r

or in 2D as

1 𝜕 2 𝜕u 1 𝜕 𝜕u
(r, θ) : ∇2 u = 2
(r )+ 2 (sin θ ) (25.22)
r 𝜕r 𝜕r r sin θ 𝜕θ 𝜕θ

and so forth.

25.1.3 3D eigenvalue problems in cylindrical geometries

Consider the 3D EVP problem in the interior of a finite cylinder:

1 𝜕 𝜕ψ 1 𝜕2 ψ 𝜕2 ψ
∇2 ψ = (r ) + 2 2 + 2 = −λψ,
r 𝜕r 𝜕r r 𝜕θ 𝜕z
0 < r < a, 0 < θ < 2π, 0<z<L (25.23)

with homogeneous Dirichlet boundary condition on the surface

𝜕ψ
ψ = 0 on r = a, z = 0, L; ψ = finite (or, = 0) @ r = 0. (25.24)
𝜕r

The above EVP can be solved using separation of variables. Let

ψ = R(r)Θ(θ)Z(z) (25.25)

then substituting equation (25.25) into equations (25.23)–(25.24) and dividing by ψ, we


get

1 1 d dR 1 1 d2 Θ 1 d2 Z
(r ) + +λ =− , (25.26)
R r dr dr Θ r dθ
2 2 Z dz 2
dR
R = 0 on r = a; R = finite (or, = 0) @ r = 0 (25.27)
dr
Z = 0 at z = 0, L; (25.28)

Note that LHS in equation (25.26) is function of (r, θ) while the RHS is a function of z
only. Thus, we get
25.1 BVP and IBVP in cylindrical and spherical geometries | 647

1 d2 Z
= −λz Z; Z = 0 @ z = 0, L (25.29)
Z dz 2
󳨐⇒ the z-eigenvalues and eigenfunctions are given by

n2 π 2 nπz
λz = = λn and Zn = √2 sin[ ], n = 1, 2, 3, . . . (25.30)
L2 L

which are orthonormal w. r. t. the standard inner product:

L
1
⟨Zi , Zj ⟩ = ∫ Zi Zj dz = δij (25.31)
L
0

Thus, equation (25.26) can be rewritten by using equations (25.29) and (25.30) as fol-
lows:

1 d dR 1 d2 Θ
r (r ) + r 2 (λ − λn ) = − (25.32)
R dr dr Θ dθ2

which leads to the θ-EVP:

d2 Θ
= −λθ Θ; 0 < θ < 2π (25.33)
dθ2
Θ and Θ′ periodic in θ (25.34)

󳨐⇒

λθ = m2 = λm and Θ0 = 1, Θmc = √2 cos mθ, Θms = √2 sin mθ, m = 1, 2, . . .


(25.35)

which are orthonormal w. r. t. the standard inner product:


1
⟨Θi , Θj ⟩ = ∫ Θi Θj dθ = δij (25.36)

0

Thus, equation (25.32) can be rewritten by using equations (25.33) and (25.35) as fol-
lows:

1 d dR m2
(r ) − 2 R = −(λ − λn )R (25.37)
r dr dr r
dR
R = 0 on r = a; R = finite (or, = 0) @ r = 0 (25.38)
dr

Equations (25.37)–(25.38) suggest that the eigenfunction can be expressed in terms


of Bessel function Jm (√λ − λn r) for λ > λn . Thus, the normalized eigenfunctions and
eigenvalues are given by
648 | 25 Fourier transforms in cylindrical and spherical geometries

Jm (√λmnk − λn r)
Rmnk = (25.39)
Jm+1 (√λmnk − λn a)
Jm (√λmnk − λn a) = 0; λmnk > λn ; (25.40)
m = 0, 1, 2, . . . ; n = 1, 2, 3, . . . ; and k = 1, 2, 3, . . .

which are orthonormal w. r. t. the cylindrical inner product:


a
1
⟨Rmnk , Rmnj ⟩ = ∫ 2rRmnk Rmnj dr = δjk (25.41)
a2
0

Thus, the 3D Laplacian operator with Dirichlet condition in cylindrical coordinate sys-
tem has the eigenvalues and eigenfunctions

μ2mk
λmnk = + λn ,
a2
(25.42)
ψ = ψmnk (r, θ, z) = Rmnk (r)Θm (θ)Zn (z),
m = 0, 1, 2, . . . ; n = 1, 2, 3, . . . ; and k = 1, 2, 3, . . .

where Rmnk , Θm and Zn are given in equations (21.26), (21.29) and (25.41), and μmk are
the k th zero of Jm . Similarly, eigenvalues and eigenfunctions with other BCs (compati-
ble with separation of variables) can also be obtained with similar procedure.

25.1.4 3D eigenvalue problems in spherical geometries

Consider the EVP problem in 3D spherical coordinate:

1 𝜕 2 𝜕ψ 1 𝜕 𝜕ψ 1 𝜕2 ψ
∇2 ψ = (r ) + (sin θ ) + = −λψ, (25.43)
r 2 𝜕r 𝜕r r 2 sin θ 𝜕θ 𝜕θ r 2 sin2 θ 𝜕ϕ2
0 < r < a, 0 < θ < π, 0 < ϕ < 2π

with homogeneous Dirichlet boundary conditions

𝜕ψ
ψ = 0 on r = a, ψ = finite (or, = 0) @ r = 0. (25.44)
𝜕r

Using separation of variables approach, i. e., expressing ψ(r, θ, ϕ) as

ψ = R(r)Θ(θ)Φ(ϕ) (25.45)
ψ
and substituting equation (25.45) into equations (25.43)–(25.44) and dividing by r2
, we
get

1 d 2 dR 1 d dΘ 1 d2 Φ
(r ) + r2 λ = − (sin θ )− , (25.46)
R dr dr Θ sin θ dθ dθ Φ sin θ dϕ2
2
25.1 BVP and IBVP in cylindrical and spherical geometries | 649

𝜕R
R = 0 on r = a; R = finite (or, = 0) @ r = 0 (25.47)
𝜕r

Since LHS in equation (25.46) is function of r while the RHS is a function of (θ, ϕ) only,
both terms should be equated to a constant, and thus we can write

1 d dΘ 1 d2 Φ
(sin θ )+ = −λθϕ .
Θ sin θ dθ dθ Φ sin2 θ dϕ2

󳨐⇒ by multiplying with sin2 θ and rearranging,

sin θ d dΘ 1 d2 Φ
(sin θ ) + λθϕ sin2 θ = − = λϕ (25.48)
Θ dθ dθ Φ dϕ2
0 < θ < π, 0 < ϕ < 2π

Thus, ϕ-eigenvalues and eigenfunctions are given by

λϕ = m2 = λm ; Φ0 = 1, Φmc = √2 cos mϕ, Φms = √2 sin mϕ, m = 1, 2, . . . (25.49)

which are orthonormal w. r. t. the standard inner product:


1
⟨Φi , Φj ⟩ = ∫ Φi Φj dϕ = δij . (25.50)

0

Similarly, θ-eigenfunctions can be expressed from equations (25.48) and (25.49) as

1 d dΘ m2
(sin θ )− Θ = −λθϕ Θ, 0<θ<π (25.51)
sin θ dθ dθ sin2 θ

Since Θ is finite for 0 < θ < π and sin θ vanishes at θ = 0, π, we define

z = cos θ (25.52)

which coverts the EVP (equation (25.51)) to

d2 Θ dΘ m2
(1 − z 2 ) − 2z + (λθϕ − )Θ = 0, −1 < z < 1 (25.53)
dz 2 dz 1 − z2

with Θ being finite in the domain (−1 ≤ z ≤ 1). Equation (25.53) is referred to as “Asso-
ciated Legendre equation” [see Chapter 16, Sections 16.4.5 and 16.4.6 for a discussion
of this equation]. It can be shown that the eigenvalues are

λθϕ = λnm = n(n + 1), n = m, m + 1, . . . , i. e. 0 ≤ m ≤ n (25.54)

with normalized eigenfunctions as


650 | 25 Fourier transforms in cylindrical and spherical geometries

(n − m)!(2n + 1) m
Θnm (z) = √ Pn (z) (25.55)
(n + m)!

where Pnm are the associated Legendre polynomials [m = 0 corresponds to Legendre


polynomials]:

(−1)l+m 2 m/2 d
n+m
n
Pnm = (1 − z ) (1 − z 2 ) . (25.56)
2n n! dz n+m

Note that the constant multiplier in equations (25.54)–(25.56) makes the eigenfunc-
tions Θnm (z) orthogonal with respect to the inner product:

1
1
⟨Θnm , Θkm ⟩ = ∫ Θnm Θkm dz = δij . (25.57)
2
−1

Thus, r-eigenfunction now can be expressed from equation (25.46) as

1 d 2 dR
(r ) + r 2 λ = λθϕ = n(n + 1), n = m, m + 1, . . . (25.58)
R dr dr
𝜕R
R = 0 on r = a; R = finite (or, = 0) @ r = 0 (25.59)
𝜕r

When λ = 0, equations (25.58)–(25.59) is referred to as Euler equation. When λ ≠ 0,


substituting

ξ = r √λ (25.60)

in equations (25.58)–(25.59) gives

d2 R dR
ξ2 + 2ξ + [ξ 2 − n(n + 1)]R = 0, n = m, m + 1, . . . (25.61)
dξ 2 dξ

which is spherical Bessel equation (see the discussion in Chapter 16, Section 16.4.4).
Thus, the eigenfunction can be expressed in terms of spherical Bessel functions jn (ξ ) =
jn (r√λnk ) of first kind (due to the BC: R is finite at r = 0, the spherical Bessel function
of second kind is omitted). Thus, using the boundary condition (equation (25.59)), the
eigenvalues and normalized eigenfunctions are given by

eigenvalues: jn (a√λnk ) = 0, k = 1, 2, 3, . . . (25.62)

2 jn (r√λnk )
eigenfunctions: Rnk (r) = √ , (25.63)
3 jn+1 (a√λnk )

which are orthonormal with respect to the standard spherical inner product:
25.2 FFT method for 1D problems in spherical and cylindrical geometries | 651

a
1
⟨Rnk , Rnj ⟩ = ∫ 3r 2 Rnk Rnj dr = δkj (25.64)
a3
0

Thus, the 3D Laplacian operator with Dirichlet boundary condition on the surface and
with domain being the interior of a sphere has the eigenfunctions

ψ = ψmnk (r, θ, z) = Rnk (r)Θnm (cos θ)Φm (ϕ),


(25.65)
m = 0, 1, 2, . . . ; n = m, m + 1, . . . ; and k = 1, 2, 3, . . .

where Rnk , Θmn and Φm are given in equations (25.49)–(25.50), (25.54)–(25.57) and
(25.62)–(25.64).

25.2 FFT method for 1D problems in spherical and cylindrical


geometries
In this section, we illustrate the use of FFT method for the solution of various 1D prob-
lems in cylindrical and spherical geometries and also compare the same with direct
solution.

25.2.1 Steady-state diffusion and reaction in a cylindrical catalyst

Consider 1D diffusion–reaction problem in a cylindrical catalyst with Dirichlet bound-


ary condition:

1 d dc
De ∇2 c = De (r ) = kc, in Ω ≡ 0 < r < a (25.66)
r dr dr
c = c0 on 𝜕Ω ≡ r = a; c = finite @ r = 0 (25.67)

The quantity of interest is the effectiveness factor defined by


a
actual reaction rate ∫0 kc(r)2πr dr
η= = (25.68)
rate if c = c0 in Ω πa2 kc0

Nondimensionalization
Define

c r a2 k
u= ; ξ = ; ϕ2 = (25.69)
c0 a Dm

󳨐⇒
652 | 25 Fourier transforms in cylindrical and spherical geometries

1 d du
(ξ ) = ϕ2 u, 0 < ξ < 1; u(1) = 1; u(0) finite (25.70)
ξ dξ dξ

and
1

η = ∫ 2ξu(ξ ) dξ (25.71)
0

Direct solution
Equation (25.70) can be solved directly in terms of modified Bessel functions, which
with given BCs can be expressed as

I0 (ϕξ )
u(ξ ) = (25.72)
I0 (ϕ)

󳨐⇒ from equation (25.71) as

2 ′ 2 I1 (ϕ)
η= u (1) = (25.73)
ϕ2 ϕ I0 (ϕ)

Here, I0 and I1 are the modified Bessel functions of first kind of order zero and order
one, respectively.

FFT method
The model equation (25.70) can be rewritten by substituting

v =1−u (25.74)

as

1 d dv
(ξ ) − ϕ2 v = −ϕ2 , 0 < ξ < 1; v(1) = 0; v(0) finite (25.75)
ξ dξ dξ

In this case, the relevant EVP is

1 d dw
(ξ ) = −λw, 0 < ξ < 1; w(1) = 0; w(0) finite (25.76)
ξ dξ dξ

As discussed in earlier section, the eigenvalues and normalized eigenfunctions can be


expressed as

J0 (√λk ξ )
J0 (√λk ) = 0, and wk = , k = 1, 2, 3, . . . (25.77)
J1 (√λk )

which are orthonormal with respect to the standard cylindrical inner product:
25.2 FFT method for 1D problems in spherical and cylindrical geometries | 653

⟨wi , wj ⟩ = ∫ 2ξwi wj dξ = δij . (25.78)


0

Take the inner product (FFT) of equation (25.75) with wj 󳨐⇒

ϕ2 ⟨1, wj ⟩
− λj ⟨v, wj ⟩ − ϕ2 ⟨v, wj ⟩ = −ϕ2 ⟨1, wj ⟩ 󳨐⇒ ⟨v, wj ⟩ = (25.79)
ϕ2 + λj

󳨐⇒
∞ ∞ ϕ2 ⟨1, wj ⟩
v(ξ ) = ∑⟨v, wj ⟩wj = ∑ wj (ξ ) (25.80)
j=1 j=1
ϕ2 + λj

󳨐⇒

η = 1 − ⟨1, v⟩

ϕ2
=1−∑ ⟨1, wj ⟩2
j=1
ϕ2 + λj
1 2

ϕ2 J (√λk ξ )
=1−∑ 2
{∫ 2ξ 0 dξ }
j=1
ϕ + λj J1 (√λk )
0

4ϕ2
=1−∑ (25.81)
λ (ϕ2 + λj )
j=1 j

where eigenvalues λj are the roots of J0 (√λj ) = 0. Figure 25.4 shows a comparison of
effectiveness factor evaluated from direct solution (equation (25.73)) and FFT solution
(equation (25.81)) with few terms included in the summation. Table 25.1 lists the eigen-
values of cylindrical Laplacian operator that are utilized in the summation [Remark:
√λn is the nth zero of J0 (x).]

Figure 25.4: Effectiveness factor for a cylindrical catalyst from direct solution and FFT solution.
654 | 25 Fourier transforms in cylindrical and spherical geometries

Table 25.1: First few eigenvalues λj of the Laplacian operator in 1D cylindrical coordinate: J0 (λj ) = 0.

# √λ λ

1 2.40483 5.78319
2 5.52008 30.4713
3 8.65373 74.887
4 11.7915 139.04
5 14.9309 222.932
6 18.0711 326.563
7 21.2116 449.934
8 24.3525 593.043
9 27.4935 755.891
10 30.6346 938.479
11 33.7758 1140.81
12 36.9171 1362.87
13 40.0584 1604.68
14 43.1998 1866.22
15 46.3412 2147.51
16 49.4826 2448.53
17 52.6241 2769.29
18 55.7655 3109.79
19 58.907 3470.03
20 62.0485 3850.01

As expected, the high ϕ asymptote approaches ϕ2 and FFT solution becomes accurate
as more terms are included in the summation. The number of terms to be included to
obtain η accurately is about equal to ϕ and is an indication of the boundary (reaction)
layer thickness which is of the order ϕ1 .
Remarks.
VΩ a
1. When the effective diffusion length RΩ = AΩ
= 2
is used as the length scale to
define the Thiele modulus as

R2Ω k
Φ2 =
De

instead of the radius, then the effectiveness factor η approaches to Φ1 for Φ ≫ 1.


2. If the inner product in equation (25.78) is defined without the numerical factor 2,
the normalized eigenfunction is multiplied by √2. The inner product as defined
arises naturally and we note that η = ⟨u, 1⟩ or the average value of u(ξ ) w. r. t. this
inner product.

25.2.2 Transient heat/mass transfer in an 1D infinite cylinder

Consider the unsteady-state heat/diffusion equation in 1D infinite cylinder:

𝜕T k 𝜕 𝜕T
ρcp = k∇2 T = (r ), 0 < r < a, t>0 (25.82)
𝜕t r 𝜕r 𝜕r
25.2 FFT method for 1D problems in spherical and cylindrical geometries | 655

with BC and IC;

T = Ts @ r = a; T = T0 (r) @ t = 0; T finite @ r = 0. (25.83)

Defining dimensionless variables

r T − Ts kt T0 (aξ ) − Ts
ξ = ; u= ; τ= ; f (ξ ) = ; (25.84)
a Ts ρcp a2 Ts

the dimensionless temperature u is given by the following initial boundary value prob-
lem:

1 𝜕 𝜕u 𝜕u
(ξ ) = , 0 < ξ < 1, τ > 0 (25.85)
ξ 𝜕ξ 𝜕ξ 𝜕τ
u(ξ = 1, τ) = 0; u(ξ = 0, τ) finite; u(ξ , τ = 0) = f (ξ ) (25.86)

The relevant EVP in this case is same as described previously in equations (25.76)
and equations (25.77)–(25.78), i. e.,

J0 (√λk ξ )
J0 (√λk ) = 0, and wk = , k = 1, 2, 3, . . .
J1 (√λk )

which are orthonormal with respect to the standard cylindrical inner product (equa-
tion (25.133)). Thus, taking the inner product of model equations (25.85)–(25.86)
with wj , we get

d
−λj ⟨u, wj ⟩ = ⟨u, wj ⟩, τ > 0; ⟨u, wj ⟩ = ⟨f , wj ⟩ @ τ = 0


∞ ∞
⟨u, wj ⟩ = ⟨f , wj ⟩e−λj τ 󳨐⇒ u(ξ , τ) = ∑⟨u, wj ⟩wj = ∑ e−λj τ ⟨f , wj ⟩wj (25.87)
j=1 j=1

󳨐⇒

1
J (√λj ξ ) J0 (√λj x)
−λj τ 0

u(ξ , τ) = ∑ e ∫ 2xf (x) dx
j=1 J1 (√λj ) 0
J1 (√λj )

󳨐⇒

1
−λj τ J0 ( λj ξ )
∞ √
u(ξ , τ) = 2∑e ∫ xf (x)J0 (√λj x) dx (25.88)
j=1 J12 (√λj )
0
656 | 25 Fourier transforms in cylindrical and spherical geometries

Special case 1: f (ξ) = 1


For the special case of f (ξ ) = 1, the integral in equation (25.88) simplifies to

1
J1 (√ λ j )
∫ xJ0 (√λj x) dx =
√λ j
0

∞ e−λj τ J0 (√λj ξ )
u(ξ , τ) = 2 ∑ , (25.89)
j=1 √λj ⋅ J1 (√λj )

which leads to the average value as

1 ∞
e−λj τ
⟨u⟩(τ) = ⟨1, u(ξ , τ)⟩ = ∫ 2ξu(ξ , τ) dξ = 4 ∑ . (25.90)
j=1
λj
0

Figure 25.5 shows a 2D density plot of the dimensionless temperature u(ξ , τ) in (ξ , τ)


space, as well as its average value ⟨u⟩(τ) and the value at the center u(ξ = 0, τ) as a
function of time.

Figure 25.5: 2D density plot of u(ξ , τ) along with the temporal profile of average temperature ⟨u⟩(τ)
and center temperature u(ξ = 0, τ).
25.2 FFT method for 1D problems in spherical and cylindrical geometries | 657

The numerical solution at τ = 0 in the density plot shows oscillation in the solution
instead of a constant value of unity. This is due to the Gibb’s phenomena which arises
due to the discontinuity in the initial condition at τ = 0 (as discussed in Chapter 21).

Special case 2: f (ξ) = δ2 (ξ)


Consider the special case f (ξ ) being the Dirac-delta function in cylindrical coordinate:

1 ξ2
f (ξ ) = δ2 (ξ ) = lim e− ε ,
ε→0 ε

which satisfies

∫ 2ξδ2 (ξ ) dξ = 1
0

and

∫ 2ξg(ξ )δ2 (ξ ) dξ = g(0).


0

For this case, the solution given by equation (25.88) simplifies to

∞ J0 (√λj ξ )
u(ξ , τ) = ∑ e−λj τ .
j=1 J12 (√λj )

A plot of the profile at different times is shown in Figure 25.6.

Figure 25.6: Spatial profile of the solution u(ξ , τ) at various times.

We note that the more terms are needed in the summation for τ → 0.
658 | 25 Fourier transforms in cylindrical and spherical geometries

25.2.3 Steady-state 1D diffusion and reaction in a spherical catalyst particle

Consider again the 1D diffusion–reaction model with Dirichlet boundary condition but
in a spherical catalyst particle:

1 d 2 dc
De ∇2 c = De (r ) = kc, in Ω ≡ 0 < r < a (25.91)
r 2 dr dr
c = c0 on 𝜕Ω ≡ r = a; c = finite @ r = 0 (25.92)

In this case, the effectiveness factor is given by

2 a
actual reaction rate ∫0 kc(r)4πr dr
η= = (25.93)
rate if c = c0 in Ω 4π 3
a kc0
3

Nondimensionalization
Define

c r a2 k VΩ a
u= ; ξ = ; ϕ2 = ; RΩ = = (25.94)
c0 a Dm AΩ 3

󳨐⇒

1 d du
(ξ 2 ) = ϕ2 u, 0 < ξ < 1; u(1) = 1; u(0) finite (25.95)
ξ 2 dξ dξ

and

η = ∫ 3ξ 2 u(ξ ) dξ (25.96)
0

Direct solution
Equation (25.95) can be solved exactly in terms of spherical Bessel functions of order
zero, which with given BCc can be expressed as

sinh(ϕξ )
u(ξ ) = (25.97)
ξ sinh(ϕ)

󳨐⇒ from equation (25.96) as

3 ′ 3
η= u (1) = 2 (ϕ coth ϕ − 1). (25.98)
ϕ2 ϕ
25.2 FFT method for 1D problems in spherical and cylindrical geometries | 659

FFT method
The model equation (25.95) can be rewritten by substituting

v =1−u (25.99)

as

1 d dv
(ξ 2 ) − ϕ2 v = −ϕ2 , 0 < ξ < 1; v(1) = 0; v(0) finite (25.100)
ξ 2 dξ dξ

In this case, the relevant EVP is

1 d dw
(ξ 2 ) = −λw, 0 < ξ < 1; w(1) = 0; w(0) finite (25.101)
ξ 2 dξ dξ

As discussed in earlier section, the eigenvalues and normalized eigenfunctions can be


expressed as spherical Bessel functions:

sin √λk
j0 (√λk ) = = 0, 󳨐⇒ λk = k 2 π 2 , k = 1, 2, 3, . . . (25.102)
√λk
2 sin kπξ
and wk = √ , (25.103)
3 ξ

which are orthonormal with respect to the standard spherical inner product:

⟨wi , wj ⟩ = ∫ 3ξ 2 wi wj dξ = δij . (25.104)


0

Take the inner product (FFT) of equation (25.100) with wj 󳨐⇒

ϕ2 ⟨1, wj ⟩
− λj ⟨v, wj ⟩ − ϕ2 ⟨v, wj ⟩ = −ϕ2 ⟨1, wj ⟩ 󳨐⇒ ⟨v, wj ⟩ = (25.105)
ϕ2 + λj

󳨐⇒

∞ ∞ ϕ2 ⟨1, wj ⟩
v(ξ ) = ∑⟨v, wj ⟩wj = ∑ wj (ξ ) (25.106)
j=1 j=1
ϕ2 + λj

󳨐⇒

η = 1 − ⟨1, v⟩

ϕ2
=1−∑ ⟨1, wj ⟩2
j=1
ϕ2 + λj
660 | 25 Fourier transforms in cylindrical and spherical geometries


ϕ2
=1−∑ ⟨1, wj ⟩2
j=1
ϕ2 + λj
1 2

ϕ2 2 sin jπξ
= 1 − ∑ 2 2 2 {∫ 3ξ 2 √ dξ }
j=1
ϕ + j π 3 ξ
0
2


=1−∑ (25.107)
j=1
j2 π 2 (ϕ2 + j2 π 2 )

Figure 25.7 shows the comparison of effectiveness factors evaluated from the direct so-
lution (equation (25.98)) and FFT solution (equation (25.107)) with few terms included
in the summation.

Figure 25.7: Effectiveness factor for a spherical catalyst from exact solution and FFT solution.

3
As expected, the high ϕ asymptote approaches ϕ
and FFT solution becomes accurate
as more terms are included in the summation.
a
Remark. If Thiele modulus is defined based on the effective diffusion length RΩ = 3
instead of radius a, i. e.,

R2Ω k
Φ2 = ,
D3

1
then the effectiveness factor η approaches to Φ
for Φ ≫ 1.

25.2.4 Transient 1D heat conduction in a spherical geometry

Consider the 1D transient heat equation (in dimensionless form)

1 𝜕 𝜕u 𝜕u
(ξ 2 ) = , 0 < ξ < 1, τ>0 (25.108)
ξ 2 𝜕ξ 𝜕ξ 𝜕τ
25.2 FFT method for 1D problems in spherical and cylindrical geometries | 661

BC: u = 0 @ ξ = 1, IC: u(ξ , 0) = f (ξ ) and u(ξ , τ) bounded (25.109)

The relevant EVP in this case is same as that described previously in equations (25.101)
and equations (25.102)–(25.104), i. e.

2 sin kπξ
λk = k 2 π 2 and wk = √ ; k = 1, 2, 3, . . .
3 ξ

which are orthonormal w. r. t. the standard spherical inner product (equation (25.104)).
Thus, taking the inner product of model equations (25.108)–(25.109) with wj , we get

d
−λj ⟨u, wj ⟩ = ⟨u, wj ⟩, τ > 0; ⟨u, wj ⟩ = ⟨f , wj ⟩ @ τ = 0


∞ ∞
⟨u, wj ⟩ = ⟨f , wj ⟩e−λj τ 󳨐⇒ u(ξ , τ) = ∑⟨u, wj ⟩wj = ∑ e−λj τ ⟨f , wj ⟩wj (25.110)
j=1 j=1

󳨐⇒
1
∞ 2 2
τ2sin jπξ sin[jπx]
u(ξ , τ) = ∑ e−j π ∫ 3x 2 f (x) dx
j=1
3 ξ x
0

󳨐⇒
1
−j2 π 2 τ sin[jπξ ]

u(ξ , τ) = 2 ∑ e ∫ xf (x) sin[jπx] dx (25.111)
j=1
ξ
0

Special case: f (ξ) = 1


For the special case of f (ξ ) = 1, the integral in equation (25.111) simplifies to

1
(−1)j−1
∫ x sin[jπx] dx =

0

τ sin[jπξ ]
∞ 2 2
u(ξ , τ) = 2 ∑(−1)j−1 e−j π (25.112)
j=1
jπξ

which leads to the average value as


1 2 2
2e−j π τ ∞
⟨u⟩(τ) = ⟨1, u(ξ , τ)⟩ = ∫ 3ξ u(ξ , τ) dξ = 6 ∑ 2 2 . (25.113)
j=1

0
662 | 25 Fourier transforms in cylindrical and spherical geometries

Figure 25.8 shows the 2D density plot of the dimensionless temperature u(ξ , τ) in (ξ −τ)
space, as well as its average temperature ⟨u⟩(τ) and center temperature u(0, τ) as a
function of time τ.

Figure 25.8: 2D density plot of u(ξ , τ) along with the temporal profile of average temperature ⟨u⟩(τ)
and center temperature u(ξ = 0, τ).

It has similar characteristics as observed in cylindrical geometry.

25.3 2D and 3D problems in cylindrical geometry


25.3.1 Solution of Laplace’s equation inside a unit circle

Consider the 2D Laplace equation inside a unit circle with Dirichlet boundary condi-
tion, i. e.

1 𝜕 𝜕u 1 𝜕2 u 0<r<1
∇2 u = (r ) + 2 2 = 0, (25.114)
r 𝜕r 𝜕r r 𝜕θ 0 < θ < 2π
u(1, θ) = f (θ); (25.115)

As shown earlier, the eigenvalues and eigenfunctions for θ-operator are

λθ = m2 = λm and w0 = 1, wmc = √2 cos mθ, wms = √2 sin mθ, m = 1, 2, . . . ,


(25.116)
25.3 2D and 3D problems in cylindrical geometry | 663

which are orthogonal to the standard inner product:



1
⟨wi , wj ⟩ = ∫ wi wj dθ = δij (25.117)

0

These eigenfunctions are complete. Thus, if f (θ) ∈ ℒ2 [0, 2π], Hilbert space of 2π pe-
riodic functions with the standard inner product, then taking the inner product of
equations (25.114)–(25.115) with wj , we can write

r 2 ⟨u, wj ⟩ + r⟨u, wj ⟩ − j2 ⟨u, wj ⟩ = 0, 0<r<1 (25.118)


⟨u, wj ⟩ = ⟨f , wj ⟩ @ r = 1, j = 0, 1, 2, . . . (25.119)

Equation (25.118) is Euler’s equation. The two linearly independent solutions are r j
and r −j . For bounded solution at r → 0, we can only take r j . Thus, the solution to
equations (25.118)–(25.119) can be expressed as

⟨u, wj ⟩ = ⟨f , wj ⟩r j (25.120)

󳨐⇒
∞ ∞
u(r, θ) = ∑ ⟨u, wj ⟩wj = ∑ r j ⟨f , wj ⟩wj (25.121)
j=0 j=0
2π ∞ j 2π
1 r
= ∫ f (θ′ ) dθ′ + ∑ cos[jθ] ∫ f (θ′ ) cos[jθ′ ] dθ′
2π j=1
π
0 0


rj
+∑ sin[jθ] ∫ f (θ′ ) sin[jθ′ ] dθ′
j=1
π
0

1 ∞
= ∫ f (θ′ )[1 + 2 ∑ r j cos[j(θ − θ′ )]] dθ′ (25.122)
2π j=1
0

Consider the summation term inside the integration:


∞ ∞
S = ∑ r j cos[j(θ − θ′ )] = ∑ r j Re[eij(θ−θ ) ]

j=1 j=1

j
= Re[∑{rei(θ−θ ) } ]

j=1

rei(θ−θ )


z
= Re[ ′ ] ( ∵ ∑ zj = z + z2 + z3 + ⋅ ⋅ ⋅ = )
1 − rei(θ−θ ) j=1
1−z
2
r cos[θ − θ ] − r ′
=
1 + r 2 − 2r cos[θ − θ′ ]
664 | 25 Fourier transforms in cylindrical and spherical geometries

󳨐⇒ from equation (25.122)


1 r cos[θ − θ′ ] − r 2
u(r, θ) = ∫ f (θ′ )[1 + 2 ] dθ′
2π 1 + r 2 − 2r cos[θ − θ′ ]
0

1 (1 − r 2 )f (θ′ )
= ∫ dθ′ (25.123)
2π 1 + r − 2r cos[θ − θ′ ]
2
0

which is also referred to as the Poisson’s integral formula (for the interior of a circle).

Special cases
1. f (θ) = 1: In this case,

⟨f , wj ⟩ = ⟨1, wj ⟩ = ⟨w0 , wj ⟩ = δj0

󳨐⇒ from equation (25.121)


u(r, θ) = ∑ r j δj0 wj = 1
j=0

2. f (θ) = A sin pθ or A cos pθ, p ∈ I: In this case,

A
⟨f , wj ⟩ = δ
√2 jp

󳨐⇒ from equation (25.121),


A
u(r, θ) = ∑ r j δ w
j=0
√2 jps j

= Ar p sin pθ or Ar p cos pθ.

As an example, taking cosine input with p = 5, i. e., f (θ) = cos 5θ, the 3D plot and
the density plot are shown in Figure 25.9.

25.3.2 Vibration of a circular membrane

Consider the vibrational motion of a circular membrane described by the partial dif-
ferential equation

𝜕2 u 𝜕2 u 1 𝜕u 1 𝜕2 u
= c2 ( 2 + + ) 0 < r < 1, 0 < θ < 2π, t>0 (25.124)
𝜕t 2 𝜕r r 𝜕r r 2 𝜕θ2
25.3 2D and 3D problems in cylindrical geometry | 665

Figure 25.9: 3D plot (top) and density plot (bottom) for solution of Laplace equation with boundary
value f (θ) = cos 5θ.

BC

u = 0 @ r = 1 (fixed end) (25.125)

ICs

u = f (r, θ) @ t = 0 (initial displacement) (25.126)


𝜕u
= 0, @t = 0 (zero initial velocity) (25.127)
𝜕t
2
1 𝜕
Equation (25.124) contains two operators 𝜕θ𝜕
2 and r 𝜕r (r 𝜕r ). Due to the circular sym-
𝜕

metry, the BCs for the first operator are periodicity in θ with period 2π. The eigen-
values and eigenfunctions for θ-operators are given in equations (25.35)–(25.36) or
(25.116)–(25.117):

λθ = m2 = λm and w0 = 1, wmc = √2 cos mθ, wms = √2 sin mθ, m = 1, 2, . . . ,


(25.128)

which are orthogonal w. r. t. standard inner product:


666 | 25 Fourier transforms in cylindrical and spherical geometries


1
⟨wi , wj ⟩ = ∫ wi wj dθ = δij (25.129)

0

These eigenfunctions are complete. Thus, if f (r, θ) ∈ ℒ2 [0, 2π], then defining

um (r, t) = ⟨u(r, θ, t), wm (θ)⟩, m = 0, 1, 2, (25.130)

and taking inner product of equation (25.124) with the eigenfunctions wm , we get

𝜕2 u m 2
2 𝜕 um 1 𝜕um m2
= c ( + − 2 um ), 0 < r < 1, t>0 (25.131)
𝜕t 2 𝜕r 2 r 𝜕r r

BC

um = 0 @ r = 1 (25.132)

IC

um = fm (r) = ⟨f (r, θ), wm ⟩ @ t = 0 (25.133)


𝜕um
= 0@t = 0 (25.134)
𝜕t
The relevant eigenvalue problem (r-operator) in this case is

1 d dψ m2
(r ) − 2 ψ = −λψ
r dr dr r
ψ(1) = 0, ψ(0) finite

which is a special case of equation (25.35), i. e., 3D to 2D. Thus, the eigenvalues and
eigenfunctions are given by equations (25.39)–(25.41) as

λ = {λmk } : Jm (√λmk ) = 0, k = 1, 2, 3, . . . (25.135)


Jm (√λmk r)
wmk (r) = (25.136)
Jm+1 (√λmk )

which are orthonormal with respect to the standard cylindrical inner product:

⟨wmi , wmk ⟩ = ∫ 2rwmi wmk dr = δik (25.137)


0

and form a complete set. Thus, taking inner product of equation (25.131) with wmk (r)
gives

d2 Umk
= −c2 λmk Umk ; Umk = ⟨um , wmk ⟩ (25.138)
dt 2
25.3 2D and 3D problems in cylindrical geometry | 667

Umk = ⟨fm , wmk ⟩ = fmk @ t = 0, Umk



= 0@t = 0 (25.139)

󳨐⇒

Umk = fmk cos[c√λmk t] (25.140)

∞ ∞
um (r, t) = ∑ ⟨um , wmk ⟩wmk = ∑ Umk (t)wmk (r)
k=1 k=1
1

Jm (√λmk r) Jm (√λmk ξ )
= ∑ cos[c√λmk t] ⋅ ∫ 2ξfm (ξ ) dξ
k=1 Jm+1 (√λmk ) Jm+1 (√λmk )
0
1

cos c√λmk tJm (√λmk r)
=2∑ 2 (√λ )
⋅ ∫ ξfm (ξ )Jm (√λmk ξ ) dξ (25.141)
k=1 Jm+1 mk
0

󳨐⇒


u(r, θ, t) = ∑ um (r, t)wm (θ)
m=0
∞ ∞
= u0 (r, t) + ∑ umc √2 cos mθ + ∑ ums √2 sin mθ
m=1 m=1

= ∑ cos[c√λ0k t]J0 (√λ0k r)A0k
k=1
∞ ∞
+ ∑ ∑ [Ackm cos mθ + Askm sin mθ]Jm (√λmk r) cos[c√λmk t] (25.142)
k=1 m=1

where

1 2π
1
A0k = 2 ∫ ∫ ξJ0 (√λ0k ξ )f (ξ , θ) dθ dξ (25.143)
πJ1 (√λ0k )
0 0
1 2π
2
Ackm = 2 ∫ ∫ ξJm (√λmk ξ )f (ξ , θ) cos mθ dθ dξ (25.144)
πJm+1 (√λmk )
0 0
1 2π
2
Askm = 2 (√λ )
∫ ∫ ξJm (√λmk ξ )f (ξ , θ) sin mθ dθ dξ (25.145)
πJm+1 mk
0 0

m = 1, 2, . . . ; k = 1, 2, 3, . . .
668 | 25 Fourier transforms in cylindrical and spherical geometries

Special case: f (r, θ) = Jm (√λmk r) cos mθ or Jm (√λmk r) sin mθ (i. e., pure
eigenmodes)
For the special case when f (r, θ) are the pure mode of vibration, the solution remains
in the same mode for all times. For example,
(i) when f (r, θ) = J0 (√λ0k r), then equation (25.143), (25.144), (25.145) implies

Acpq = 0 = Aspq , ∀p, q and A0q = δ0p δkq

and equation (25.142) simplifies to

u(r, θ, t) = J0 (√λ0k r) cos[c√λ0k t] = f (r, θ) cos[c√λ0k t].

(ii) when f (r, θ) = Jm (√λmk r) sin mθ (m > 0), then equations (25.143), (25.144), (25.145)
implies

Acpq = 0 = A0q , ∀p, q and Aspq = δpm δqk

and equation (25.142) simplifies to

u(r, θ, t) = sin[mθ]Jm (√λmk r) cos[c√λmk t] = f (r, θ) cos[c√λmk t].

(iii) when f (r, θ) = Jm (√λmk r) cos mθ(m > 0), then equation (25.143)–(25.145) implies
equations (25.143)–(25.145) implies that all sine coefficients vanish, i. e.,

Aspq = 0 = A0q , ∀p, q and Acpq = δpm δqk

and equation (25.142) simplifies to

u(r, θ, t) = cos[mθ]Jm (√λmk r) cos[c√λmk t] = f (r, θ) cos[c√λmk t]

Thus, the solution remains in the same eigenmode with amplitude ratio A.R. vary-
ing with time as

A.R. = cos[c√λmk t].

The contour profile of first few modes of vibration (i. e., eigenmodes) are shown in
Figure 25.10, while eigenvalues corresponds to these modes are listed in Table 25.2.

Similarly, the 2D problems in cylindrical coordinate system in (r, z) or (z, θ) space can
be solved. For additional chemical engineering applications, we refer to the articles
by Balakotaiah and Gupta [6], Ratnakar and Balakotaiah [26], Balakotaiah [5] and Aris
and Balakotaiah [4].
25.3 2D and 3D problems in cylindrical geometry | 669

Figure 25.10: Contour profiles of first few eigenmodes: cos[mθ]Jm [√λmk r].

Table 25.2: First few eigenvalues of cylindrical Laplace operator.

m k μmk λmk
0 1 2.40483 5.78319
0 2 5.52008 30.4713
1 1 3.83171 14.682
1 2 7.01559 49.2185
2 1 5.13562 26.3746
2 2 8.41724 70.8499

25.3.3 Three-dimensional problems in cylindrical geometry

3D Poisson’s equation
Consider the Poisson equation

1 𝜕 𝜕u 1 𝜕2 u 𝜕2 u
∇2 u = (r ) + 2 2 + 2 = −f (r, θ, z) in Ω (25.146)
r 𝜕r 𝜕r r 𝜕θ 𝜕z
670 | 25 Fourier transforms in cylindrical and spherical geometries

Ω ≡ 0 < r < a, 0 < θ < 2π, 0<z<L

with BC:

u = 0 on 𝜕Ω, i. e., @ r = R and @ z = 0, L (25.147)

The eigenvalues and eigenfunctions are obtained in Section 25.1.3 for the Laplacian
operator with Dirichlet condition in the cylindrical geometry (see equations (25.23)–
(25.24) to (25.42)). Using these eigenfunctions and corresponding inner product, the
solution can be obtained by taking the inner product of equations (23.72)–(23.73) with
ψmnk as follows:

⟨f , ψnmk ⟩
− λmnk ⟨u, ψmnk ⟩ = −⟨f , ψnmk ⟩ 󳨐⇒ ⟨u, ψmnk ⟩ = (25.148)
λmnk


∞ ∞ ∞
⟨f , ψnmk ⟩
u = ∑ ⟨u, ψmnk ⟩ψmnk = ∑ ∑ ∑ ψmnk (25.149)
m,n,k n=1 m=0 k=1 λmnk

where

μ2mk n2 π 2
λmnk = + 2 : Jm (μmk ) = 0 (25.150)
a2 L
J ( μ r)
{
{ ψ0nk = √2 J0(√√μ 0ka) sin nπz
L
{ 1 0k
{
{ c Jm (√μmk r)
ψmnk = { ψmnk = 2 J (√μ a) cos mθ sin nπz L
(25.151)
{ m+1 mk
{
{
{ s Jm (√μmk r) nπz
ψ = 2 J (√μ a) sin mθ sin L
{ mnk m+1 mk

and
L a 2π
1
⟨f , ψnmk ⟩ = ∫ ∫ ∫ rf (r, θ, z)ψ(r, θ, z) dθ dr dz (25.152)
πa2 L
0 0 0

3D Heat/Diffusion equation
Consider the transient heat/diffusion equation

𝜕u 1 𝜕 𝜕u 1 𝜕2 u 𝜕2 u
= ∇2 u = (r ) + 2 2 + 2 in Ω (25.153)
𝜕t r 𝜕r 𝜕r r 𝜕θ 𝜕z
Ω ≡ 0 < r < a, 0 < θ < 2π, 0<z<L

with BC:

u = 0 on 𝜕Ω, i. e. @ r = R and @ z = 0, L (25.154)


25.4 2D and 3D problems in spherical geometry | 671

and IC:

u(r, θ, z, t = 0) = f (r, θ, z) (25.155)

Again, using the same eigenvalues and eigenfunctions (obtained in Section 25.1.3 for
the Laplacian operator with Dirichlet condition in the cylindrical geometries), and tak-
ing the inner product of equation (25.153) with ψmnk , we get:

d
⟨u, ψmnk ⟩ = −λmnk ⟨u, ψnmk ⟩, t > 0;
dt
⟨u, ψmnk ⟩ = ⟨f , ψnmk ⟩ @ t = 0

󳨐⇒

⟨u, ψmnk ⟩ = ⟨f , ψnmk ⟩e−λmnk t (25.156)


∞ ∞ ∞
u = ∑ ⟨u, ψmnk ⟩ψmnk = ∑ ∑ ∑ e−λmnk t ⟨f , ψnmk ⟩ψmnk (25.157)
m,n,k n=1 m=0 k=1

where eigenvalues and eigenfunctions are expressed in equations (25.150)–(25.152).


Similarly, other problems in cylindrical geometry with other types of boundary
conditions can be solved once the relevant eigenvalue problems are identified.

25.4 2D and 3D problems in spherical geometry


Consider the spherical coordinate system shown in Figure 25.3. For a sphere of radius
a, 0 < r < a, 0 < θ < π, 0 < ϕ < 2π.

25.4.1 Poisson’s equation in a sphere

Consider the Poisson’s equation in the interior of a sphere with Dirichlet boundary
condition, i. e.,

1 𝜕 2 𝜕u 1 𝜕 𝜕u 1 𝜕2 u
󳶚2 u = (r ) + (sin θ ) + = −f (r, θ, ϕ) in Ω
r 2 𝜕r 𝜕r r 2 sin θ 𝜕θ 𝜕θ r 2 sin2 θ 𝜕ϕ2 (25.158)
Ω ≡ 0 < r < a, 0 < θ < π, 0 < ϕ < 2π

with

u = 0 on 𝜕Ω ≡ r = a, u(0, θ, ϕ) = finite (25.159)


672 | 25 Fourier transforms in cylindrical and spherical geometries

FFT method
The eigenvalues and eigenfunctions of Laplacian operator in spherical coordinate are
described in Section 25.1.4 (see equations (25.43)–(25.44) to (25.65)). Taking the inner
product of equations (25.158)–(25.159) with eigenfunctions ψmnk , we get

⟨f , ψmnk ⟩
− λnk ⟨u, ψmnk ⟩ = −⟨f , ψmnk ⟩ 󳨐⇒ ⟨u, ψmnk ⟩ = (25.160)
λnk

where the eigenvalues λnk are given by

λnk = μ2nk : jn (μnk ) = 0, n = m, m + 1, . . . ; k = 1, 2, 3, . . . (25.161)

and eigenfunctions by

2(n − m)!(2n + 1) jn (μnk r) m


ψmnk (r, θ, ϕ) = √ P (cos θ)Φm (ϕ) (25.162)
3(n + m)! jn+1 (μnk a) n

{ Φ0 = 1
{
{ c
Φm (ϕ) = { Φm = √2 cos mϕ n = m, m + 1, . . . (25.163)
{
{ s
{ Φm = √2 sin mϕ,

󳨐⇒
∞ ∞ ∞
⟨f , ψmnk ⟩
u = ∑ ⟨u, ψmnk ⟩ψmnk = ∑ ∑ ∑ = ψmnk (25.164)
m,n,k m=0 n=m k=1
λnk

For the special case of azimuthal/longitudinal symmetry (i. e., symmetry w. r. t. ϕ),
i. e., for the 2D case in (r, θ), we can disregard the m-operator, and only the m = 0
mode remains. In this case, the formal solution (equation (25.164)) simplifies to

∞ ∞
⟨f , ψnk ⟩
u = ∑⟨u, ψnk ⟩ψnk = ∑ ∑ = ψnk (25.165)
n,k n=0 k=1
λnk

where λnk is obtained the same way (from equation (25.161)) and eigenfunctions sim-
plifies to

2(2n + 1) jn (μnk r)
ψnk (r, θ) = √ P (cos θ) (25.166)
3 jn+1 (μnk a) n

The solution of equations (25.158)–(25.159) can also be obtained by using the FFT
method in combination of direct or Green’s function method, where FFT is used to
reduce the problem to 1D (i. e., 3D to 1D or 2D to 1D), and then solving the 1D problem
directly. We demonstrate this approach below.
25.4 2D and 3D problems in spherical geometry | 673

Green’s function-based method


Two-dimensional problem
We consider first a simplified case in which we assume longitudinal symmetry (i. e.,
symmetry with respect to ϕ). The relevant problem is

1 𝜕 2 𝜕u 1 𝜕 𝜕u
2
(r )+ 2 (sin θ ) = −f (r, θ), (25.167)
r 𝜕r 𝜕r r sin θ 𝜕θ 𝜕θ
0 < r < a, 0<θ<π
u(a, θ) = 0, u(r, θ) finite (25.168)

Looking at the operator w. r. t. θ, we have the following eigenvalue problem:

1 d dy
(sin θ ) = −λy, 0<θ<π (25.169)
sin θ dθ dθ
y(θ) is finite for 0 < θ < π (25.170)

which is the special case of θ-operator in general case (see equation (25.51)). Thus, the
eigenvalues and eigenfunction can be obtained from equations (25.54)–(25.56) with
m = 0, i. e., eigenvalues λ:

λn = n(n + 1), n = 0, 1, 2, . . . (25.171)

eigenfunction yn :

yn (θ) = √2n + 1Pn (cos θ), (25.172)

which are complete and orthonormal w. r. t. to the inner product


π
1
⟨yn , ym ⟩ = ∫ yn (θ)ym (θ) sin θ dθ = δmn . (25.173)
2
0

It can be shown that they form a basis for the Hilbert space ℒ2 [0, π] with the inner
product defined above in equation (25.173), i. e., for any function f (θ), we can express
π

1
f (θ) = ∑ fn yn (θ); fn = ⟨f , yn ⟩ = ∫ f (θ)yn (θ) sin θ dθ (25.174)
n=0
2
0

Now consider the solution of the Poisson equation (25.167). Taking the inner product
with eigenfunctions, we get

1 d 2 dun n(n + 1)
(r )− un = −fn (r); n = 0, 1, 2, . . . (25.175)
r 2 dr dr r2
un (a) = 0, un (r) is finite (25.176)
674 | 25 Fourier transforms in cylindrical and spherical geometries

where
π
1
un = ⟨u, yn ⟩ = ∫ u(r, θ)yn (θ) sin θ dθ (25.177)
2
0
π
1
fn = ⟨f , yn ⟩ = ∫ f (r, θ)yn (θ) sin θ dθ (25.178)
2
0

Equations (25.175)–(25.176) may be solved using the Green’s function method. We note
that
n n
r r r
−n−1
u1 = ( ) and u2 = ( ) −( ) (25.179)
a a a

are two linearly independent solutions of the homogeneous equation satisfying the
BCs at r = 0 and r = a, respectively. Thus,
a

un (r) = ∫ Gn (r, s)fn (s)s2 ds (25.180)


0

where

1 ( s )n [( r )−n−1 − ( ar )n ], s≤r
Gn (r, s) = { ar n as −n−1 (25.181)
(2n + 1)a ( a ) [( a ) − ( as )n ], s⩾r

Therefore, the solution of equations (25.175)–(25.176) is given by



u(r, θ) = ∑ un yn (θ)
n=0
a π

2n + 1
= ∑ Pn (cos θ) ∫ Gn (r, s)s2 (∫ f (s, α)Pn (cos α) sin αdα) ds
n=0
2
0 0
a π ∞
2n + 1
= ∫ ∫[ ∑ Gn (r, s)Pn (cos θ)Pn (cos α)]f (s, α)s2 sin αdα ds
n=0
2
0 0
a π

= ∫ ∫ G(r, s, θ, α)f (s, α)s2 sin αdα ds (25.182)


0 0

where G is the Green’s function for the operator on the LHS of equations (25.175)–
(25.176), and can be expressed as

G(r, s, θ, α) = ∑ Gn (r, s)Kn (θ, α) (25.183)
n=0
25.4 2D and 3D problems in spherical geometry | 675

where Gn is given in equation (25.181) and Kn is given by

2n + 1
Kn = Pn (cos θ)Pn (cos α) (25.184)
2

Three-dimensional problem
Similar to the 2D problem, the solution to the 3D problem can be obtained by consid-
ering the eigenvalue problem in (θ, ϕ) and converting the 3D model to 1D model using
FFT, and then using the direct solution. For example, equations (25.49)–(25.50) and
(25.54)–(25.56) give the eigenvalues and eigenfunctions:

(n − m)!(2n + 1) m
ynm = √ Pn (cos θ)Φm (ϕ) (25.185)
(n + m)!

{ Φ0 = 1
{
{ c
Φm (ϕ) = { Φm = √2 cos mϕ n = m, m + 1, . . . (25.186)
{
{ s
{ Φm = √2 sin mϕ,

with eigenvalues λnm = n(n + 1). Thus, taking the inner product of equations (25.158)–
(25.159) with ynm , we get

1 d 2 dunm n(n + 1)
(r )− unm = −fnm (r); (25.187)
r 2 dr dr r2
unm (a) = 0; unm (r) is finite; m = 0, 1, 2, . . . ; n = m, m + 1, . . . (25.188)

where

π 2π
1
unm = ⟨u, ynm ⟩ = ∫ ∫ u(r, θ, ϕ)ynm (θ, ϕ) sin θ dθ dϕ (25.189)

0 0
π 2π
1
fnm = ⟨f , ynm ⟩ = ∫ ∫ f (r, θ, ϕ)ynm (θ, ϕ) sin θ dθ dϕ. (25.190)

0 0

Note that equations (25.187)–(25.188) is same as equations (25.175)–(25.176), and hence


unm is given by (equation (25.180))

unm (r) = ∫ Gn (r, s)fnm (s)s2 ds, (25.191)


0

where Gn is same as given in equations (25.180)–(25.181). Thus, the solution can be


expressed as
676 | 25 Fourier transforms in cylindrical and spherical geometries

∞ ∞
u(r, θ, ϕ) = ∑ ∑ unm (r)ynm (θ, ϕ)
m=0 n=m
a π 2π
∞ ∞
1
= ∑ ∑ ynm (θ, ϕ) ∫ Gn (r, s)s2 ∫ ∫ f (r, θ′ , ϕ′ )ynm (θ′ , ϕ′ ) sin θ′ dθ′ dϕ′ ds
m=0 n=m 4π
0 0 0
a π 2π

= ∫ ∫ ∫ G(r, s, θ, θ′ , ϕ, ϕ′ )f (s, θ′ , ϕ′ )s2 sin θ′ dθ′ dϕ′ ds. (25.192)


0 0 0

Here, the Green’s function G(r, s, θ, θ′ , ϕ, ϕ′ ) can be expressed as



G(r, s, θ, θ′ , ϕ, ϕ′ ) = ∑ Gn (r, s)Knm (θ, θ′ , ϕ, ϕ′ ) (25.193)
n=0

where Gn (r, s) is given in equations (25.180)–(25.181) and Kn is given by

1
Knm (θ, θ′ , ϕ, ϕ′ ) = y (θ, ϕ)ynm (θ′ , ϕ′ ) (25.194)
4π nm

Similarly, other problems in spherical geometry with other types of boundary con-
ditions can also be solved using this approach. For further example of problems in
cylindrical and spherical geometries, we refer to the books by Carslaw and Jaeger [10]
and Crank [16].

Problems
1. Consider the problem of cooling a circular cylinder of length 10 cms and diameter
2 cms made of copper. Its initial temperature is 500 ∘ C and it is suddenly plunged
into an agitated bath at 100 ∘ C temperature. Assume that the agitation is high
enough so that the surface of the cylinder is at 100 ∘ C.
(a) Derive a mathematical model for the temperature history of the rod and cast
into dimensionless form.
(b) Find the solution of the model. Determine an expression for the maximum
(center) temperature of the rod as a function of time. Compute this value after
0.05, 0.1, 1.0, 5.0 and 10 seconds.
(c) Repeat part (b) by assuming the cylinder to be of infinite length and compare
your result.
2. Consider the solution of the Laplace’s equation in a circle of radius unity with
Dirichlet condition u = f (θ) on the boundary (r = 1, 0 < θ ≤ 2π). Obtain a solution
to this problem using finite Fourier transform. Show that the solution obtained
may be simplified to the Poisson’s integral formula:


1 (1 − r 2 )f (ϕ)
u(r, θ) = ∫ dϕ
2π 1 − 2r cos(θ − ϕ) + r 2
0
25.4 2D and 3D problems in spherical geometry | 677

3. The dispersion of a tracer in laminar flow in a pipe is described by the convective-


diffusion equation

𝜕C r 2 𝜕C
+ 2u[1
̂ − 2] = Dm ∇2 C + f (r, t); 0 < r < R, 0 < z < L, t>0
𝜕t R 𝜕z

with no flux boundary conditions at the pipe wall and C = C0 (r, t) at z = 0. Cast
into dimensionless form and solve using finite Fourier transform. Determine an
expression for the convected mean concentration at any axial position and time.
4. Consider the problem of tracer dispersion described in problem 3 above. Assume
that the pipe is of infinite length. Cast into dimensionless form and solve using
finite Fourier transform.
5. Consider the problem of unsteady-state diffusion and reaction in a porous spher-
ical catalyst

εD 𝜕 2 𝜕C 𝜕C
(r ) − kC = ε ; 0 < r < R, t>0
r 2 𝜕r 𝜕r 𝜕t
C = C0 (t), @ r = R, t > 0; C = F(r), @ t = 0, 0<r<R

Cast into dimensionless form and solve using finite Fourier transform.
6. Solve the problem of vibration of a sphere:

𝜕2 u
= ∇2 u for r < 1, t>0
𝜕t 2
𝜕u
u(r, θ, ϕ, 0) = f (r, θ, ϕ); (r, θ, ϕ, 0) = 0
𝜕t

7. Creeping flow around a sphere placed in an infinite stream of fluid moving at a


constant velocity of U is described by the equations (assuming azimuthal sym-
metry)

1 𝜕 2 1 𝜕
(r vr ) + (v sin θ) = 0 (continuity)
2
r 𝜕r r sin θ 𝜕θ θ
𝜕p 1 𝜕 𝜕v 1 𝜕 𝜕v v 2 𝜕v 2
− + μ[ 2 (r 2 r ) + 2 (sin θ r ) − 2 2r − 2 θ − 2 vθ cot θ] = 0
𝜕r r 𝜕r 𝜕r r sin θ 𝜕θ 𝜕θ r r 𝜕θ r
(r-momentum)
1 𝜕p 1 𝜕 𝜕v 1 𝜕 𝜕v 2 𝜕v vθ
− + μ[ 2 (r 2 θ ) + 2 (sin θ θ ) + 2 r − =0
r 𝜕θ r 𝜕r 𝜕r r sin θ 𝜕θ 𝜕θ r 𝜕θ r 2 sin2 θ
(θ-momentum)

with boundary conditions

@ r = R, vr = vθ = 0
r → ∞, vr = U cos θ, vθ = −U sin θ, p = p0
678 | 25 Fourier transforms in cylindrical and spherical geometries

Obtain a solution using separation of variables.


8. (a) Obtain a formal solution to the following system of linear PDEs describing dif-
fusion and reaction:
𝜕u
= D∇2 u + Au; in Ω (1)
𝜕t
u = 0 on 𝜕Ω

where Ω is a (i) circular disk (ii) sphere. Here, u is the vector of concentrations,
D is the matrix of diffusion coefficients and A is a constant n × n matrix, (b) Obtain
a formal solution to (1) with Neumann boundary conditions

∇u.n = 0 on 𝜕Ω

where n is the outward normal to 𝜕Ω. (c) Show schematic diagrams of contour
plots of the first few eigenfunctions for case a(i).
9. A first-order chemical reaction carried out in a tubular reactor that discharges into
a stirred reactor. The entire assembly is held at a constant temperature. Reaction
occurs in the tube as well as the stirred vessel.
(a) Assuming that the axial dispersion model applies in the tube, develop a math-
ematical model for the system.
(b) Devise a suitable self-adjoint formalism and determine a formal solution to
the model.
(c) Determine the residence time distribution function (response of the system
for a pulse input when there is no reaction) of the model developed in (a).
(d) How does your result change if the axial dispersion model is replaced by a
two-dimensional model that accounts for axial as well as radial dispersion of
the reacting species?
10. (a) Consider the problem of heat loss from a very long heated pipe buried vertically
in the earth. Suppose that the pipe outside radius is R and its surface temperature
is raised and kept constant at Ts . Assume that the initial earth temperature is Ta .
(i) Set up an appropriate mathematical model and determine an expression for
the heat flux at the surface of the pipe as a function of time.
(ii) Use the result in (i) to determine the effective heat transfer coefficient as a
function of time.
(b) A circular cylinder of radius R and infinite length is immersed in a fluid at
rest everywhere, and is suddenly made to move with steady velocity U parallel to
its length. Determine the frictional force per unit length of the cylinder at time t
after the motion has begun. With appropriate change of notation, show that this
expression is identical to that determined in (a) above.
(c) Determine the asymptotic form of the solutions in (a) and (b) for small and
large times.
|
Part VI: Formulation and solution of some classical
chemical engineering problems
Introduction
The aim of this last part is to demonstrate the use of mathematical tools, developed
in earlier parts, in the solution of some classical problems encountered by Chemi-
cal Engineers. It is hoped that these representative problems combined with practical
knowledge gained from experience can help the student in the mathematical analysis
of other such problems.

https://doi.org/10.1515/9783110739701-027
26 The classical Graetz–Nusselt problem
The classical Graetz–Nusselt problem deals with the determination of the heat or mass
transfer from a fluid in a duct to the wall in steady laminar flow. Here, we formulate
the model for heat transfer, and show how the tools of linear analysis may be used to
determine the heat transfer coefficient.

26.1 Model formulations and formal solution


Consider the problem of describing the steady-state temperature of a fluid in laminar
flow in a duct of an arbitrary cross-section as shown schematically in Figure 26.1. As-
sume that the hydraulic diameter of the duct (dh ) is much smaller compared to the
length L (i. e., L/dh ≫ 1) so that axial conduction may be neglected.

Figure 26.1: Schematics of (a) a laminar flow in a duct of an arbitrary cross-section, and (b) typical
cross-sections used in various applications.

The mathematical model with negligible axial conduction is given by

𝜕T 𝜕2 T 𝜕2 T
ρf Cpf uf (y′ , z ′ ) = k ∇′2
f ⊥ T = k f ( + ), x ′ > 0, (y′ , z ′ ) ∈ Ω (26.1)
𝜕x ′ 𝜕y′2 𝜕z ′2

with inlet and boundary conditions

https://doi.org/10.1515/9783110739701-028
684 | 26 The classical Graetz–Nusselt problem

T|x′ =0 = Tin ; T|𝜕Ω = Tw or (n.kf ∇⊥′ T)󵄨󵄨󵄨𝜕Ω = qw . (26.2)


󵄨
f

Here, T, ρf , Cpf and kf are temperature, density, specific heat capacity and thermal
conductivity of the fluid; x ′ is the coordinate along flow direction, (y′ , z ′ ) are transverse
coordinates; Ωf and 𝜕Ωf are transverse domain and its boundary, respectively; Tin and
Tw are fluid temperatures at the inlet and at the transverse boundary (wall); qw is the
heat flux at the transverse boundary entering the domain Ωf ; uf (y′ , z ′ ) is the velocity
profile; and n is unit outward normal to 𝜕Ωf . In the general case, there are two types of
boundary conditions: (i) constant wall temperature and (ii) constant heat flux at the
wall. Here, we discuss the first case and leave the second case for exercises.

26.1.1 Analysis of constant wall temperature boundary condition

We consider the boundary condition given by

T = Tw @ 𝜕Ωf (26.3)

and define the following quantities to nondimensionalize the governing model (equa-
tions (26.1) and (26.3)):

AΩf (y′ , z ′ ) uf (y′ , z ′ )


RΩf = ; (y, z) = ; u(y, z) = ;
PΩf RΩf uf
kf αf x ′ T −T Ω′f
αf = ; x= ; θ= w ; Ω= ; ∇⊥ = RΩf ∇⊥′ (26.4)
ρf Cpf R2Ωf uf Tw − Tin RΩf

Here, AΩf , PΩf and RΩf are the cross-section area, wetted perimeter and hydraulic ra-
dius [hydraulic diameter dh = 4RΩf ], respectively; uf = A1 ∫Ω u(y′ , z ′ ) dy′ dz ′ is the
Ωf f
average velocity in the fluid phase; αf is the thermal diffusivity. This leads to the fol-
lowing dimensionless linear model:

1 𝜕2 θ 𝜕2 θ 𝜕θ
Lθ = ( 2 + 2) = , x > 0, (y, z) ∈ Ω (26.5)
u(y, z) 𝜕y 𝜕z 𝜕x

with inlet and boundary conditions as

θ|x=0 = 1; θ|𝜕Ω = 0. (26.6)

The same model is obtained for mass transfer from the duct interior to the wall with
an infinitely fast wall reaction (so that the species concentration at the wall is zero).
The dimensionless model given by equations (26.5) and (26.6) can be solved by
considering the eigenvalue problem (EVP) defined by
26.1 Model formulations and formal solution | 685

1 1 𝜕2 ψ 𝜕2 ψ
Lψ = ∇⊥2 ψ = ( 2 + 2 ) = −λψ in Ω; ψ|𝜕Ω = 0. (26.7)
u(y, z) u(y, z) 𝜕y 𝜕z

Note that the operator appearing in equation (26.7) is a self-adjoint operator (i. e.,
L∗ = L) w. r. t. the inner product defined by

1
⟨ψi , ψj ⟩ = ∫ u(y, z)ψi ψj dy dz = δij ; AΩ = ∫ dy dz (26.8)

Ω Ω

where

1, i=j
δij = { }
0, i ≠ j

is the Kronecker delta. Hence, equation (26.7) defines a Sturm–Liouville EVP with
eigenvalues λi and eigenfunctions ψi .
Thus, taking the inner product with ψi , equation (26.5) leads to

d
⟨θ, ψi ⟩ = ⟨Lθ, ψi ⟩ = ⟨θ, L∗ ψi ⟩ = ⟨θ, Lψi ⟩ = −λi ⟨θ, ψi ⟩; ⟨θ, ψi ⟩|x=0 = ⟨1, ψi ⟩
dx
󳨐⇒

⟨θ, ψi ⟩ = ⟨1, ψi ⟩ exp(−λi x) 󳨐⇒


θ(x, y, z) = ∑⟨θ, ψi ⟩ψi = ∑ exp(−λi x)⟨1, ψi ⟩ψi (y, z) (26.9)
i i

Heat transfer coefficients and Nusselt number


The heat transfer coefficient from wall to the fluid is defined by

qw 1
h= = ∫ (n.kf ∇⊥′ T) dPΩf , (26.10)
Tw − Tm (Tw − Tm )PΩf
𝜕Ωf

which leads to the Nusselt number, Nu (or the dimensionless heat transfer coefficient)
as
hRΩf −1
NuΩ = = ∫ (n.∇⊥ θ) dPΩ . (26.11)
kf θm PΩ
𝜕Ω

Here, Tm or θm is the cup-mixing (velocity averaged) temperature defined as

1
θm (x) = ⟨1, θ⟩ = ∫ u(y, z)θ(x, y, z) dy dz = ∑ exp(−λi x)⟨1, ψi ⟩2 . (26.12)
AΩ i
Ω

Thus, the Nusselt number can be expressed in terms of eigenvalues and eigen-
functions from equations (26.11)–(26.12) as follows:
686 | 26 The classical Graetz–Nusselt problem

1

∑i exp(−λi x)⟨1, ψi ⟩ ∫𝜕Ω (−n.∇⊥ ψi ) dPΩ
NuΩ = . (26.13)
∑i exp(−λi x)⟨1, ψi ⟩2

The above expression can further be simplified using the EVP (equation (26.7)) that
suggests

1 𝜕2 ψ 𝜕2 ψ
−λ⟨1, ψ⟩ = ⟨1, Lψ⟩ = ∫( 2 + 2 )dAΩ
AΩ 𝜕y 𝜕z
Ω
1 1
= ∫ (n.∇⊥ ψ) dPΩ = ∫ (n.∇⊥ ψ) dPΩ (26.14)
AΩ PΩ
𝜕Ω 𝜕Ω

Note that due to nondimensionlization of the spatial coordinate with hydraulic radius,
the dimensionless domain Ω has the property: AΩ = PΩ . Thus, the general expression
of the Nusselt number can be simplified from equations (26.13) and (26.14) as

∑i λi exp(−λi x)⟨1, ψi ⟩2 ∑i λi exp(−λi x)βi


NuΩ (x) = = , (26.15)
∑i exp(−λi x)⟨1, ψi ⟩2 ∑i exp(−λi x)βi

where βi = ⟨1, ψi ⟩2 is the Fourier weight. Equation (26.15) shows that the local Nusselt
number can be expressed in terms of the eigenvalues and Fourier weights.
[Remark: In the literature, the Nusselt number Nu is defined using hydraulic di-
ameter dh as the length scale. It is clear that Nu = 4 NuΩ .]

Long distance asymptote


In order to examine the limit of NuΩ (x) for large x, we note that the linear operator L is
self-adjoint operator implying that all eigenvalues λi are real. In addition, taking the
inner product with ψ, the EVP defined in equation (26.7) leads to

1
−λ⟨ψ, ψ⟩ = ⟨Lψ, ψ⟩ = ∫ ψ∇⊥2 ψ dy dz

Ω
1 1
= ∫ ψ(n.∇⊥ ψ) dPΩ − ∫(∇⊥ ψ.∇⊥ ψ) dy dz (Green’s identity)
AΩ AΩ
𝜕Ω Ω
1
=0− ∫(∇⊥ ψ.∇⊥ ψ) dy dz

Ω

󳨐⇒

1 ∫Ω (∇⊥ ψi .∇⊥ ψi ) dy dz
λi = >0 ∀i (since u does not change sign)
AΩ ∫ u(y, z)ψ2i dy dz
Ω

Further, from expanding unity in terms of eigenfunctions ψ′i s, we get the Parseval’s
relation:
26.1 Model formulations and formal solution | 687

1 = ∑⟨1, ψi ⟩ψi 󳨐⇒ ∑⟨1, ψi ⟩2 = ∑ βi = ⟨1, 1⟩ = 1


i i i

where the energy content βi of the ith mode is less than unity. Thus, the long distance
asymptote (x ≫ 1) can be obtained from equation (26.15) as follows:

NuΩ∞ = lim NuΩ (x) = λ1 . (26.16)


x≫1

Thus, only the first eigenvalue determines the dimensionless heat transfer coefficient
(NuΩ ) at long distance from inlet.

Short distance asymptote (Leveque solution)


For short distance asymptote, the main gradient exists very close to the wall (i. e., do-
main of interest is the boundary layer region near 𝜕Ωf ). Assuming ξ is the dimension-
less distance normal to the wall, the velocity profile near the wall can be approximated
by

u(ξ ) = u0 ξ (26.17)

and the governing model (equations (26.5) and (26.6)) can be simplified as follows:

1 𝜕2 θ 𝜕θ
= ; θ|x=0 = 1; θ|ξ =0 = 0; θ|ξ →∞ → 1 (26.18)
u0 ξ 𝜕ξ 2 𝜕x

Thus, by defining a new variable as

γξ
z= , (26.19)
x 1/3

we get

𝜕θ 𝜕θ 𝜕z −γξ 𝜕θ
= =
𝜕x 𝜕z 𝜕x 3x 4/3 𝜕z

and

𝜕θ 𝜕θ 𝜕z γ 𝜕θ 𝜕2 θ γ 2 𝜕2 θ
= = 1/3 or 2
= 2/3 2 ,
𝜕ξ 𝜕z 𝜕ξ x 𝜕z 𝜕ξ x 𝜕z

which expresses the governing model (equation (26.18)) as

1 γ 2 𝜕2 θ −γξ 𝜕θ
=
u0 ξ x 2/3 𝜕z 2 3x 4/3 𝜕z

󳨐⇒
688 | 26 The classical Graetz–Nusselt problem

𝜕2 θ u u0
= − 03 z 2 = −3z 2
𝜕θ 𝜕θ
if γ = √3 (26.20)
𝜕z 2 3γ 𝜕z 𝜕z 9

with boundary conditions:

θ = 1 @ z → ∞; and θ = 0@z = 0 (26.21)

󳨐⇒
γξ
z=
x 1/3
1
θ(x, ξ ) = θ(z) = ∫ exp(−t 3 ) dt (26.22)
Γ( 43 )
0

where Γ is the Gamma function.


Thus, the Nusselt number can be obtained from equations (26.10)–(26.11) as

1 𝜕θ 󵄨󵄨󵄨󵄨 γ 𝜕θ 󵄨󵄨󵄨 γ
NuΩ0 = lim NuΩ (x) = lim 󵄨󵄨 = 1/3 󵄨󵄨󵄨 = (26.23)
x≪1 x≪1 θm 𝜕ξ 󵄨󵄨ξ =0 x 𝜕z 󵄨󵄨z=0 Γ( 4 )x 1/3
3

󳨐⇒
1/3
γ 1 u0
NuΩ0 = lim NuΩ (x) = = ( ) . (26.24)
x≪1 Γ( 43 )x 1/3 4
Γ( 3 ) 9x

The two asymptote can also be combined to obtain a simpler approximate expression
for the Nusselt number as discussed in Gundlapally and Balakotaiah [19].

26.2 Parallel plate with fully-developed velocity profile


Consider the flow between parallel plates (Figure 26.1b) where hydraulic radius is the
half-spacing between the plates. In this case, the fully-developed velocity profile in
laminar flow is given by

3
u(y) = (1 − y2 ) (26.25)
2

and the governing model can be expressed in dimensionless form as follows:

3 𝜕θ 𝜕2 θ
(1 − y2 ) = , x > 0 and 0 < y < 1 (26.26)
2 𝜕x 𝜕y2

with boundary and inlet conditions as

θ(x = 0, y) = 1 and θ(x, y = 1) = 0 (26.27)


26.2 Parallel plate with fully-developed velocity profile | 689

The EVP can be defined for parallel plate geometry from equation (26.7) as follows:

2 𝜕2 ψ
Lψ = = −λψ, 0 < y < 1; (26.28)
3(1 − y ) 𝜕y2
2

with boundary conditions

𝜕ψ 󵄨󵄨󵄨󵄨
ψ|y=1 = 0; 󵄨 = 0, (26.29)
𝜕y 󵄨󵄨󵄨y=0

which is self-adjoint w. r. t. the inner product:

1
3
⟨ψi , ψj ⟩ = ∫ (1 − y2 )ψi ψj dy = δij (26.30)
2
0

The first few eigenvalues and Fourier coefficients βi = ⟨1, ψ2i ⟩ are listed in Table 26.1.
The corresponding Graetz eigenfunctions are shown in Figure 21.1.

Table 26.1: First few eigenvalues and Fourier coefficients of the Graetz problem for flow between
parallel plates and constant wall temperature.

i λi βi = ⟨1, ψi ⟩2

1 1.8852 0.91035
2 21.431 0.05314
3 62.317 0.01528
4 124.54 0.00681
5 208.09 0.00374
6 312.98 0.00232

Using these values and equation (26.15), the Nusselt number can be plotted against
the dimensionless axial coordinate x, which is shown in Figure 26.2.
It can be seen from this figure that it exhibits the two asymptotes in the limits of
x ≫ 1 and x ≪ 1. The long distance asymptote (x ≫ 1) can be obtained from equation
(26.16), which simplifies to

NuΩ∞ = lim NuΩ (x) = λ1 = 1.8852. (26.31)


x≫1

Similarly, the short distance asymptote can be obtained from equation (26.15), where
velocity profile is simplified as

3
lim u(y) = lim (1 − y2 ) → 3(1 − y) = 3ξ 󳨐⇒ u0 = 3,
y→1 y󳨀→1 2
690 | 26 The classical Graetz–Nusselt problem

Figure 26.2: Local or position dependent Nusselt number (dimensionless heat transfer coefficient)
for fully developed laminar flow through parallel plates for the case of constant wall temperature.

and hence the short distance asymptote of Nusselt number can be given from equation
(26.24) as

1 3 1 −1/3
NuΩ0 = lim NuΩ (x) = √ x = 0.7765x −1/3 . (26.32)
x≪1 Γ( 43 ) 3

26.3 Circular channel with fully-developed velocity profile


Consider the fully developed laminar flow in a circular duct (Figure 26.1b). Here, hy-
draulic radius RΩf is half of the duct radius a, i. e., RΩf = a2 . While the transverse co-
ordinate can be nondimenionalized by hydraulic radius leading to y ∈ [0, 2], it could
also be nondimensionalized by the duct radius, which will lead y ∈ [0, 1]. In the gen-
eral case, we have shown the use of hydraulic radius in earlier sections and applied
it to parallel plate geometry. Now, we show the use of duct radius to nondimension-
alize for the circular geometry, which leads to the fully-developed velocity profile in
laminar flow as

u(y) = 2(1 − y2 ) (26.33)

and the governing model can be expressed in dimensionless form as

𝜕θ 1 𝜕
2(1 − y2 )
𝜕θ
= (y ), x > 0 and 0 < y < 1 (26.34)
𝜕x y 𝜕y 𝜕y

with boundary and inlet conditions as

θ(x = 0, y) = 1 and θ(x, y = 1) = 0. (26.35)


26.3 Circular channel with fully-developed velocity profile | 691

The EVP can be defined as follows:

1 1 𝜕 𝜕ψ
Lψ = (y ) = −λψ, 0 < y < 1; (26.36)
2(1 − y2 ) y 𝜕y 𝜕y

with boundary conditions

𝜕ψ 󵄨󵄨󵄨󵄨
ψ|y=1 = 0; 󵄨 = 0, (26.37)
𝜕y 󵄨󵄨󵄨y=0

which is self-adjoint w. r. t. the inner product:

1 1

⟨ψi , ψj ⟩ = ∫ 2yu(y)ψi ψj dy = ∫ 4y(1 − y2 )ψi ψj dy = δij . (26.38)


0 0

The first six eigenvalues and Fourier coefficients βi = ⟨1, ψ2i ⟩ are listed in Table 26.2.
The corresponding Graetz eigenfunctions are shown in Figure 26.3.

Table 26.2: First few eigenvalues and Fourier coefficients of the Graetz problem for flow through a
circular duct and constant wall temperature.

i λi βi = ⟨1, ψi ⟩2
1 3.65679 0.819050
2 22.3047 0.097527
3 56.9605 0.032504
4 107.620 0.015440
5 174.282 0.008788
6 256.945 0.005584

Figure 26.3: First six eigenfunctions of the Graetz–Nusselt problem for fully developed laminar flow
through a circular duct with constant wall temperature.

Note that in this case, since duct radius is used to nondimensionlize the transverse
coordinate, AΩ ≠ PΩ (AΩ = 1 and PΩ = 2), and hence we rewrite equation (26.14) as
692 | 26 The classical Graetz–Nusselt problem

1 2
−λ⟨1, ψ⟩ = ⟨1, Lψ⟩ = ∫ (n.∇⊥ ψ) dPΩ = ∫ (n.∇⊥ ψ) dPΩ
AΩ PΩ
𝜕Ω 𝜕Ω

󳨐⇒

1 −λj
∫ (n.∇⊥ ψj ) dPΩ = ⟨1, ψj ⟩.
PΩ 2
𝜕Ω

This simplifies the Nusselt number based on hydraulic diameter from equation (26.14)
as

2ha ∑ λ exp(−λi x)βi


Nu = = 4 NuΩ = i i ; βi = ⟨1, ψi ⟩2 , (26.39)
kf ∑i exp(−λi x)βi

which is plotted against the dimensionless axial coordinate x in Figure 26.4.

Figure 26.4: Local or position dependent Nusselt number (dimensionless heat transfer coefficient)
based on hydraulic diameter for fully developed laminar flow through a circular duct for the case of
constant wall temperature.

As expected, it exhibits the two asymptotes in the limits of x ≫ 1 and x ≪ 1. The long
distance asymptote (x ≫ 1) can be obtained from equation (26.39), which simplifies
to

Nu∞ = lim Nu(x) = λ1 = 3.6568. (26.40)


x≫1

Similarly, the short distance asymptote can be obtained from equation (26.15), where
the velocity profile is simplified as

lim u(y) = lim 2(1 − y2 ) → 4(1 − y) = 4ξ 󳨐⇒ u0 = 4,


y→1 y󳨀→1
26.3 Circular channel with fully-developed velocity profile | 693

and hence the Short distance asymptote of the Nusselt number can be given from equa-
tion (26.24) as

2 3 4 −1/3
Nu0 = √ x = 1.7092x −1/3 . (26.41)
Γ( 43 ) 9

Other duct geometries as well as flux and mixed boundary conditions at the wall can
be analyzed in a similar way.

Problems
1. Consider the Graetz–Nusselt problem in a triangular duct with constant wall tem-
perature boundary condition:
(a) Identify the EVP and obtain the formal solution.
(b) Determine the long and short distance asymptotes for the Nusselt number.
2. Consider the Graetz–Nusselt problem in a circular tube with constant wall flux
boundary condition. Formulate the model equations and identify the EVP. Deter-
mine an expression for the Nusselt number and identify the short and long dis-
tance asymptotes.
3. Repeat problem 2 for parallel plate geometry.
4. The concentration of a reactant in a tubular catalytic reactor in which the flow
is laminar and fully developed is given by (assuming centerline symmetry and
neglecting axial diffusion)

r 2 𝜕C 1 𝜕 𝜕C
2⟨u⟩(1 − ) = Dm (r ); x > 0, 0 < r < a
a2 𝜕x r 𝜕r 𝜕r

with boundary and initial conditions

𝜕C 𝜕C
= 0 at r = 0; −Dm = ks C at r = a
𝜕r 𝜕r
C(r, x = 0) = C0

(a) Cast the model equations in dimensionless form and solve using finite Fourier
transform.
(b) Use the solution in (a) to determine the local Sherwood number (or dimen-
sionless mass transfer coefficient) defined by

2a −Dm 𝜕r (r = a)
𝜕C
2a
Sh = ( )kc = ( ) ,
Dm Dm Cm − C(r = a)

where Cm is the cupmixing (velocity weighted) concentration.


(c) How does the result in (b) simplify for long (L/a ≫ 1) channels?
694 | 26 The classical Graetz–Nusselt problem

(d) Simplify the result in (b) for the two limiting cases of infinitely fast (ks → ∞)
and slow reaction (ks → 0) and comment on your results.
5. The Graetz–Nusselt formulation assumes that the duct length is much larger than
the hydraulic diameter and neglects the axial diffusion term (or takes axial Peclet
number to be infinity). At the other extreme of the large hydraulic diameter com-
pared to length for heat/mass transfer or reaction, for which the axial Peclet num-
ber goes to zero, the appropriate model is the so-called “short tube model.” For
the case of fixed temperature boundary condition, this model may be expressed
as

T − Tin
kf ∇⊥2 T = ρf Cpf uf (y′ , z ′ )( ), (y′ , z ′ ) ∈ Ω
L
T|𝜕Ω = Tw

(a) Cast this model in dimensionless form.


(b) Obtain the solution for the temperature profile and use it to determine the
Nusselt number.
(c) Identify the asymptotic behavior of Nu for small and large values of Ph =
R2Ω ⟨uf ⟩
Lαf
.
6. (a) Discuss the similarities and differences in the solution of the Graetz–Nusselt
problem with fixed wall temperature for the different duct shapes shown in Fig-
ure 26.1
(b) Discuss how the short and long distance asymptotes may be combined to ob-
tain an approximate expression for Nu for any arbitrary geometry.
27 Friction factors for steady-state laminar flow in
ducts
27.1 Model formulations and formal solution
Consider steady-state laminar flow in a duct of a constant but arbitrarily shaped cross-
section as shown schematically in Figure 27.1.

Figure 27.1: Schematic of laminar flow in a duct of an arbitrary cross-section.

For such unidirectional flow in a duct, the x-momentum balance reduces to

ΔP
μ∇′2 ux = ( ) in Ω′ ; ux = 0 on 𝜕Ω′ (no slip boundary condition) (27.1)
L

Here, μ is the fluid viscosity; ux is the axial component of the velocity; −ΔP = P1 −P2 > 0
is the pressure drop; P1 and P2 are pressures at the entrance (x ′ = 0) and the exit
(x ′ = L), respectively; L is the length of the duct; Ω′ and 𝜕Ω′ are the transverse (cross-
sectional) domain and its boundary.
Defining the effective transverse length (i. e., hydraulic radius) RΩ and dimension-
less transverse coordinates as

flow area A ′ (y′ , z ′ )


RΩ = = Ω ; (y, z) = , (27.2)
wetted perimeter PΩ′ RΩ

the characteristic velocity u∗ and dimensionless velocity U as

R2Ω −ΔP ux (y′ , z ′ )


u∗ = ( ); U(y, z) = , (27.3)
μ L u∗

the momentum balance (equation (27.1)) and no-slip wall boundary condition can be
expressed in dimensionless form as follows:

∇2 U = −1 in Ω; U = 0 on 𝜕Ω (27.4)

https://doi.org/10.1515/9783110739701-029
696 | 27 Friction factors for steady-state laminar flow in ducts

Note that the hydraulic radius of scaled (dimensionless) domain Ω is unity.


An overall force balance on the duct gives

(−ΔP)
τw PΩ′ L = (−ΔP)AΩ′ 󳨐⇒ τw = RΩ . (27.5)
L

Here, τw is the average wall shear stress. The friction factor f (or the dimensionless
pressure drop) is defined by
τw
f = 1
, (27.6)
2
ρ⟨ux ⟩2

where ρ is the fluid density and ⟨ux ⟩ is the average velocity. Defining the Reynolds
number Re as

(4RΩ )⟨ux ⟩ρ
Re = , (27.7)
μ

the friction factor can be expressed as


(−ΔP)
L
RΩ (4RΩ )⟨ux ⟩ρ 8u∗ 8
f Re = . = = . (27.8)
1
ρ⟨ux ⟩2 μ ⟨ux ⟩ ⟨U⟩
2

Thus, to obtain the relationship between pressure drop (or f ) and flow rate (or ⟨U⟩),
we need to solve the Poisson equation (27.4) to determine U(y, z) and then ⟨U⟩ can be
determined by

1
⟨U⟩ = ∫ U(y, z) dy dz. (27.9)

Ω

Equation (27.8) gives the required relationship. Equation (27.4) can be solved either by
direct method (e. g., 1D boundary value problem for symmetric geometries) or by finite
Fourier transform.

FFT method
Let λi be the eigenvalue and ψi be the eigenfunction of the EVP:

Lψ = ∇2 ψ = −λψ in Ω; ψ = 0 on 𝜕Ω (27.10)

which is a self-adjoint operator (i. e., L = L∗ ) w. r. t. inner product defined by

1
⟨ψi , ψj ⟩ = ∫ ψi ψj dy dz = δij (27.11)

Ω

Thus, taking the inner product with ψi , equation (27.4) leads to


27.2 Specific example: parallel plates | 697

−⟨1, ψi ⟩ = ⟨∇2 U, ψi ⟩ = ⟨U, ∇2 ψi ⟩ = −λi ⟨U, ψi ⟩

󳨐⇒
1
⟨U, ψi ⟩ = ⟨1, ψi ⟩, (27.12)
λi

which leads to the formal solution for velocity profile as

1
U(y, z) = ∑⟨U, ψi ⟩ψi (y, z) = ∑ ⟨1, ψi ⟩ψi (y, z). (27.13)
i i
λi

Thus, the average velocity can be expressed as

1 β
⟨U⟩ = ⟨1, U⟩ = ∑ ⟨1, ψi ⟩2 = ∑ i ; βi = ⟨1, ψi ⟩2 (27.14)
i
λi i
λi

where βi are the Fourier weights that satisfy the Parseval’s relation: ∑i βi = 1. Thus,
the friction factor can be expressed in terms of eigenvalues of the Laplacian operator
(with Dirichlet boundary condition and Fourier weights) as follows:

8 8 8
f Re = = 1
= βi
. (27.15)
⟨U⟩ ∑i
λi
⟨1, ψi ⟩2 ∑i λi

27.2 Specific example: parallel plates


Consider the parallel plate geometry with half-spacing of a. The hydraulic radius for
this case is RΩ = a. The dimensionless model for velocity can be expressed by

d2 U
= −1; U(y = ±1) = 0 (27.16)
dy2

This model can either be solved directly or by FFT.

27.2.1 Direct solution

The direct solution of equation (27.16) can be obtained by integrating it twice, which
with Dirichlet boundary condition leads to as

1
U(y) = (1 − y2 ) (27.17)
2
󳨐⇒
1 1
1 1
⟨U⟩ = ∫ U(y) dy = ∫ (1 − y2 ) dy = (27.18)
2 3
0 0
698 | 27 Friction factors for steady-state laminar flow in ducts

󳨐⇒
8
f Re = = 24 (27.19)
⟨U⟩

27.2.2 FFT approach

The EVP for parallel plates can be expressed as

d2 ψ
Lψ = = −λψ, ψ[y = ±1] = 0, (27.20)
dy2

which is self-adjoint w. r. t. the inner product defined by

1
1
⟨ψi , ψj ⟩ = ∫ ψi ψj dy = δij (27.21)
2
−1

This gives the following eigenvalues and normalized eigenfunctions:

(2i − 1)2 π 2 πy
λi = , ψi = √2 cos[(2i − 1) ], i = 1, 2, 3, . . . (27.22)
4 2

The average value of these eigenfunctions is


1
1 √ 1 (−1)i−1 2√2
⟨1, ψi ⟩ = ∫ 2 cos[(i − )πy] dy = , (27.23)
2 2 (2i − 1)π
−1

which gives the Fourier weight β as

8
βi = ⟨1, ψi ⟩2 = (27.24)
(2i − 1)2 π 2

Thus, the formal solution can be expressed in terms of eigenvalues and eigenfunctions
from equation (27.13) as

1 ∞
(−1)i−1 16 πy
U(y) = ∑ ⟨1, ψi ⟩ψi (y, z) = ∑ cos[(2i − 1) ]. (27.25)
i
λi i=1
(2i − 1) 3π3 2

This gives the average velocity from equation (27.14) as

βi ∞ 32 1 32 π 4 1
⟨U⟩ = ∑ =∑ 4 = 4 = (27.26)
i
λi i=1 π (2i − 1) 4 π 96 3

which matches the result from direct solution (equation (27.18)). Similarly, the friction
factor can be obtained from equation (27.8) or (27.15) as
27.3 Specific case: elliptical ducts | 699

8
f Re = = 24 (27.27)
⟨U⟩

that matches the direct solution (equation (27.19)).

27.3 Specific case: elliptical ducts


Consider the elliptical duct with the transverse domain given by

y′2 z ′2 a
Ω′ ≡ + 2 − 1 < 0; −a ≤ y′ ≤ a, −b ≤ z ′ ≤ b; σ= (27.28)
a2 b b
where a and b are the lengths of semimajor and semiminor axes, and σ is the aspect
ratio of the ellipse (σ = 1 for a circle). The momentum equation (27.1) can be solved in
many ways. Here, we utilize a single trial function that vanishes at the boundary 𝜕Ω′
and expresses the flow rate as follows:

y′2 z ′2
ux = β(1 − − 2) (27.29)
a2 b

The model equation (27.1) leads to

ΔP ΔP 1 1
μ∇′2 ux = ( ) 󳨐⇒ ( ) = −2β( 2 + 2 ) (27.30)
L μL a b

󳨐⇒ the characteristic velocity u∗ from equation (27.3) is simplified as

1 1
u∗ = R2Ω ( ) = 2βR2Ω ( 2 + 2 ).
−ΔP
(27.31)
μL a b

The average velocity can be obtained from equation (27.29) as follows:

1
⟨ux ⟩ = ∫ ux (y′ , z ′ ) dy′ dz ′
A Ω′
Ω′
′2
b√ 1− y 2
a a
4β y′2 z ′2
= ∫[ ∫ (1 − − 2 ) dz ′ ] dy′
πab a2 b
0 0
a 3/2
4β 2b y′2
= ∫ (1 − 2 ) dy′
πab 3 a
0
π/2

= ∫ cos4 (θ) dθ (taking y′ = a sin θ)

0
8β 3π β
= . = (27.32)
3π 16 2
700 | 27 Friction factors for steady-state laminar flow in ducts

Thus, the friction factor can be expressed from equations (27.8), (27.31) and (27.32) as

8 8u∗ 1 1
f Re = = = 32R2Ω ( 2 + 2 ) (27.33)
⟨U⟩ ⟨ux ⟩ a b

AΩ′
where RΩ = PΩ′
is the hydraulic radius of the duct. The cross-section area of the duct
is AΩ′ = πab while the wetted perimeter PΩ′ is given by the integral:

PΩ′ = ∫ ds = ∫ √dx 2 + dy2


𝜕Ω′ 𝜕Ω′

Since 𝜕Ω′ ≡ ya2 + zb2 − 1 = 0, taking y′ = a sin θ and z ′ = b cos θ, the above expression
′2 ′2

for the perimeter can be simplified as

π/2

PΩ′ = 4 ∫ √a2 cos2 θ + b2 sin2 θ dθ


0
π/2
1
= 4a ∫ √cos2 θ + sin2 θ dθ
σ2
0
π/2
1
= 4a ∫ √1 − (1 − 2 ) sin2 θ dθ = 4a E(σ).
σ
0

The hydraulic radius can be expressed as

π/2
πab πb 1
RΩ = = ; where E(σ) = ∫ √1 − (1 − 2 ) sin2 θdθ. (27.34)
PΩ′ 4E(σ) σ
0

Thus, equation (27.33) can be further simplified to express the friction factor as

R2Ω 1
f Re = 32 2
(1 + 2 )
b σ
2 π/2
1 π 1
= 2(1 + 2 )[ ]; where E(σ) = ∫ √1 − (1 − 2 ) sin2 θ dθ (27.35)
σ E(σ) σ
0

Figure 27.2 plots friction factor for elliptical duct against its aspect ratio and few
of these values are listed in Table 27.1. Note from Figure 27.2 that the friction factor
curve is symmetric around σ = 1 on log-linear plot. This is because aspect ratio of σ
and σ −1 have the same cross-sectional shape of the duct, and hence the same friction
factor. It can also be seen from equation (27.35) that in the limit of σ → ∞, E(σ) →
27.3 Specific case: elliptical ducts | 701

Table 27.1: Friction factor and E-function for elliptical ducts of aspect ratio σ.

σ E(σ) f Re
1 2
σ≪1 σ2
2π = 19.74

0.1 10.16 19.314

0.5 2.422 16.823

π
1.0 2
= 1.571 16.0

1.5 1.322 16.31

2.0 1.211 16.823

10.0 1.016 19.314


2
∞ 1 2π = 19.74
702 | 27 Friction factors for steady-state laminar flow in ducts

Figure 27.2: Friction factor and E-function for elliptical ducts against the aspect ratio σ.

π/2
∫0 cos θ dθ = 1, and hence limσ→∞ f Re = 2π 2 . In addition, since f Re is symmetric
around σ = 1 (on log-linear plot),

lim f Re = lim f Re = 2π 2 (27.36)


σ→0 σ→∞

2π 2
This also implies from equation (27.35) that in the limit of σ → 0, E(σ) → f Re
(1 + σ12 ) →
1
σ2
, which can also be seen from equation (27.35).

Problems
1. Consider the problem of laminar flow in a duct of equilateral triangular cross-
section with side a and height H = √3a/2 as shown in Figure 27.3:

Figure 27.3: Schematic of an equilateral triangular duct.

(a) Verify that the expression

y′ y′ 2 − 3x ′ 2
uz (x ′ , y′ ) = c0 ( − 1)( )
H H2

satisfies the no-slip condition, and hence represents a possible velocity pro-
file. Determine the constant c0 so that equation (27.1) is satisfied.
27.3 Specific case: elliptical ducts | 703

(b) Formulate the momentum balance in dimensionless form so that RΩ = 1.


(c) Use the results in (a) to show that f Re = 40
3
.
2. (a) Determine the dimensionless velocity profile for a duct of circular cross-section
(with RΩ = 1) and use it to slow f Re = 16.
(b) Obtain the expression for f Re using FFT.
3. Consider the problem of steady laminar flow in a rectangular duct of height 2b and
width 2a. Formulate the problem and cast the model in dimensionless form. Use
the Fourier transform method to show that the friction factor is given by

24
f Re =
2 192λ tan h[ (2k−1)π
(1 + λ) [1 − 2λ
]
π5
∑∞
k=1 ]
(2k−1)5

where λ(= a/b) is the aspect ratio. Use the above formula to evaluate and plot f Re
as a function λ for 1 ≤ λ ≤ ∞. Determine the numerical value of f Re for a square
duct.
28 Multicomponent diffusion and reaction
Problems of multicomponent diffusion and reaction are of fundamental importance
in chemical engineering. They arise in the design of catalysts and catalytic reactors,
adsorption and separation processes as well as many other applications. We have al-
ready illustrated the application of linear analysis to these problems in Sections 4.5.6,
5.3 and Chapters 23 and 25. In this chapter, we show some further application to deter-
mine catalyst effectiveness factors and calculation of effluent concentrations in mono-
lith reactors.
In the next section, we examine the problem of catalyst effectiveness factor. We
also introduce the concept of internal mass transfer coefficient and its calculation for
an arbitrary geometry. This is followed by a discussion of the multicomponent case
and illustration of the calculations.

28.1 Generalized effectiveness factor problem


Consider a catalyst (porous) of arbitrary shape with activity profile given by a(x ′ , y′ , z ′ )
as shown in Figure 28.1.

Figure 28.1: Schematic of a porous catalyst particle of arbitrary shape.

Assuming a single step reaction A → B with linear kinetics, the steady-state reactant
concentration profile C(x ′ , y′ , z ′ ) satisfies the following diffusion–reaction problem:

De ∇′2 C = a(x ′ , y′ , z ′ )k0 C in Ω′ ; C = C0 on 𝜕Ω′ (28.1)

where De is the effective diffusivity of the reactant in the porous particle; k0 is the first-
order rate constant (based on unit volume); and a(x′ , y′ , z ′ ) is the normalized activity
profile, i. e.,

1
∫ a(x ′ , y′ , z ′ ) dVΩ′ = 1 (28.2)
V Ω′
Ω′

https://doi.org/10.1515/9783110739701-030
28.1 Generalized effectiveness factor problem | 705

Here, VΩ′ is the volume of the catalyst. Let SΩ′ is the external surface area of the catalyst
particle, then the effective diffusion length RΩ , Thiele modulus ϕ and other quantities
can be defined as

V Ω′ (x ′ , y′ , z ′ ) C(x ′ , y′ , z ′ )
RΩ = ; (x, y, z) = ; c(x, y, z) = ;
SΩ′ RΩ C0

R2Ω k0
ϕ2 = ; g(x, y, z) = a(RΩ x, RΩ y, RΩ z). (28.3)
De

Thus, the model equation (28.1) can be expressed in dimensionless form as follows:

∇2 c = g(x, y, z)ϕ2 c, (x, y, z) ∈ Ω; c = 1 on 𝜕Ω (28.4)

Here, g(x, y, z) satisfies the normalized condition, i. e.,

1 1
∫ g(x, y, z) dΩ = ∫ g(x, y, z) dx dy dz = 1, (28.5)
VΩ VΩ
Ω Ω

where VΩ is the volume of particle in dimensionless units (i. e., of scaled domain). For
the case of uniform catalyst activity, g(x, y, z) = 1. In the general case, g > 0.
To solve equation (28.4), we consider the EVP:

1
Lψ = ∇2 ψ = −λψ in Ω; ψ = 0 on 𝜕Ω (28.6)
g(x, y, z)

which is self-adjoint with respect to the activity weighted inner product:

1
⟨ψi , ψj ⟩ = ∫ g(x, y, z)ψi (x, y, z)ψ(x, y, z) dx dy dz = δij (28.7)

Ω

As shown in earlier chapters, the eigenvalues λj are real and positive. In addition, the
Parseval’s relation suggests that the normalized eigenfunctions and Fourier weights
satisfy

∑ βi = 1; βj = Fourier weight = ⟨1, ψj ⟩2 (28.8)


i

Equation (28.4) can be solved in terms of eigenvalues and eigenfunctions defined in


equation (28.6). For this, we define a new variable u = 1 − c that satisfies the following
equation:

1
Lu = ∇2 u = ϕ2 (u − 1) in Ω; u = 1 − c; u = 0 on 𝜕Ω (28.9)
g(x, y, z)
706 | 28 Multicomponent diffusion and reaction

󳨐⇒ after taking inner product with ψi

⟨Lu, ψi ⟩ = ϕ2 ⟨u, ψi ⟩ − ϕ2 ⟨1, ψi ⟩ = ⟨u, Lψi ⟩ = −λi ⟨u, ψi ⟩

󳨐⇒

ϕ2 ⟨1, ψi ⟩
⟨u, ψi ⟩ =
ϕ2 + λi

󳨐⇒

ϕ2 ⟨1, ψi ⟩
u = ∑⟨u, ψi ⟩ψi = ∑ ψ
i i
ϕ2 + λi i

󳨐⇒ The formal solution for dimensionless concentration c(x, y, z) is

ϕ2 ⟨1, ψi ⟩
c =1−u=1−∑ ψ (x, y, z) (28.10)
i
ϕ2 + λi i

28.1.1 Effectiveness factor

The effectiveness factor can be determined using the concentration profile. The effec-
tiveness factor η is defined by

1
η= ∫ g(x, y, z)c(x, y, z) dx dy dz

Ω

= ⟨1, c⟩ = cm = activity weighted average concentration, (28.11)

󳨐⇒

ϕ2 ⟨1, ψi ⟩2 ϕ 2 βi λβ
η = ⟨1, c⟩ = 1 − ∑ 2
= 1 − ∑ 2
= ∑ 2i i (28.12)
i
ϕ + λi i
ϕ + λi i
ϕ + λi

where βi are Fourier weights defined in equation (28.8). The above expression for ef-
fectiveness factor can be expanded in power series of Thiele modulus ϕ2 as

βi j
η = 1 + ∑ ∑(−1)j j
(ϕ2 ) , (28.13)
i j=1 λi

where the coefficients in the power series are termed as Aris numbers (Balakota-
iah [5]):
28.1 Generalized effectiveness factor problem | 707


j βi
η = 1 + ∑(−1)j Arj (ϕ2 ) ; Arj = ∑ j
(28.14)
j=1 i λi

The Aris numbers depend only on the geometric shape of the catalyst particle. These
can also be obtained by using a perturbation method, by expanding the concentration
in powers of ϕ2 as

j
c = ∑ (−1)j cj (x, y, z)(ϕ2 ) ; (28.15)
j=0

where

Arj = ⟨1, cj ⟩; c0 (x, y, z) = 1;

∇2 cj = −g(x, y, z)cj−1 (x, y, z) in Ω; cj = 0 on 𝜕Ω, j ≥ 1. (28.16)

Equation (28.12) gives the effectiveness factor for a catalyst particle (or layer) in terms
of eigenvalues, Fourier weights and ϕ2 .

28.1.2 Sherwood number (for internal mass-transfer coefficient)

The internal mass-transfer coefficient kci for species exchange between the boundary
and interior of the catalyst particle can be defined as follows:
1
SΩ′
∫𝜕Ω′ De n.∇C dSΩ′
kci = (28.17)
C0 − Cm

where Cm is the activity weighted average concentration, defined by

1
Cm = ∫ a(x ′ , y′ , z ′ )C(x ′ , y′ , z ′ ) dVΩ′ . (28.18)
V Ω′
Ω′

Thus, by using divergence theorem:

k0 VΩ′ Cm
∫ ∇.(∇C) dVΩ′ = ∫ n.∇C dSΩ′ = , (28.19)
De
Ω′ 𝜕Ω′

the internal Sherwood number Shi (or dimensionless mass transfer coefficient) can be
expressed as
λi βi
Cm k0 R2Ω ηϕ2 (∑∞
i=1 ϕ2 +λi
)
Shi = = = (28.20)
C0 − Cm De 1 − η (∑∞ βi
)
i=1 ϕ2 +λi
708 | 28 Multicomponent diffusion and reaction

or
1
η= . (28.21)
ϕ2
1+ Shi

28.1.3 Exact expressions for Shi for some common geometries

The exact expressions for the internal Sherwood number for a slab, an infinite cylinder
and a sphere for uniform activity can be obtained as (see Chapters 23 and 25)

1 1
−1
Shi = ( − ) (slab) (28.22)
ϕ tanh ϕ ϕ2

ϕ2 I1 (2ϕ)
Shi = (infinite cylinder) (28.23)
ϕI0 (2ϕ) − I1 (2ϕ)

3ϕ3 coth(3ϕ) − ϕ2
Shi = (sphere). (28.24)
3ϕ2 − 3ϕ coth(3ϕ) + 1

Several limiting cases of equations (28.12) and (28.20) for the general case are of inter-
est. For ϕ2 → 0, we have

η = 1 − Ar1 ϕ2 + O(ϕ4 ), (28.25)

and
−1
1 ∞
β
Shi ≜ Shi∞ = = (∑ i ) . (28.26)
Ar1 λ
i=1 i

For ϕ2 ≫ 1, the sum defining η may be replaced by an integral. In this limit, it can be
shown that η → ϕ1 while Shi → ϕ. Using these limits, Shi for any ϕ may be approxi-
mated by

Shi = Shi∞ +ϕ tanh[Λ∗ ϕ], (28.27)

where the constants Shi∞ and Λ∗ depend only on the geometric shape of the cata-
lyst particle. These constants can be related to Arj (j = 1, 2, . . .) or λi and βi . Values of
these constants for the case of uniform activity and various common geometries can
be found in the literature (Tu et al. [30]; Sarkar et al. [27]). A plot of Shi for infinite slab
(Shi∞ = 3 and Λ∗ = 0.2), cylinder (Shi∞ = 2 and Λ∗ = 0.33) and sphere (Shi∞ = 35
and Λ∗ = 0.43) is shown in Figure 28.2. The shape of the Shi versus ϕ curve for any
arbitrary geometry is similar to the curves shown in Figure 28.2.
28.2 Multicomponent diffusion and reaction in the washcoat layer of a monolith reactor | 709

Figure 28.2: Internal Sherwood number versus Thiele modulus for slab, infinite cylinder and a
sphere.

28.2 Multicomponent diffusion and reaction in the washcoat layer


of a monolith reactor
So far, we have examined only the case of a single reaction in the catalyst particle or
layer. We now consider the case of multiple reactions with the following assumptions:
(i) steady state, (ii) diffusion in the washcoat/catalyst is governed by Knudsen mech-
anism, (iii) catalyst layer is thin so that gradients are important in only one direction.
Consider R reactions among S species:

R
∑ νij Aj = 0; j = 1, 2, . . . , R (28.28)
j=1

Let ri be the rate of reaction i, then the steady-state species balance for Aj gives

d2 cwj R
Dej + ∑ νij ri (cw ) = 0 (28.29)
dy2 i=1

cwj = csj @ y = 0 (28.30)


dcwj
= 0 @ y = δc . (28.31)
dy

In vector form, equations (28.29) and (28.30)–(28.31) may be written as

d 2 cw dcw
De + νT r(cw ) = 0; cw = cs @ y = 0; = 0 @ y = δc , (28.32)
dy2 dy

where De is the diagonal matrix of Knudsen diffusivities; ν is the stoichiometric co-


efficient matrix; r is a R × 1 vector of reaction rates, cw is the S × 1 vector of species
concentrations and δc is the thickness of the catalyst layer.
710 | 28 Multicomponent diffusion and reaction

We consider the case of linear kinetics, i. e.,

r(cw ) = Kc
̂ w, K
̂ = R × S matrix of rate constants

󳨐⇒

−νT r(cw ) = −νT Kc


̂ w = Kv cw ,

where Kv = S×S matrix of effective rate constants (for species consumption). Equation
(28.32) may be written as

d2 cw
De = Kv cw ; 0 < y < δc .
dy2

Defining
y
ξ = , Φ2w = δc2 D−1
e Kv
δc

󳨐⇒

d 2 cw dcw
= Φ2w cw , 0 < ξ < 1; = 0 @ ξ = 1; cw = cs @ ξ = 0. (28.33)
dξ 2 dξ

Here, Φw is the Thiele matrix, which may be shown to have real and nonnegative
eigenvalues for the case of monomolecular kinetics (Wei and Prater [31]).
Equation (28.33) may be solved using the standard matrix methods (as discussed
in Chapter 5):

cw = cosh[Φw (1 − ξ )].(cosh Φw )−1 cs . (28.34)

The observed or effective rate vector is given by

robs = ∫ Kv cw (ξ ) dξ
0

= Kv (tanh Φw )Φ−1
w cs = Kv cs

(28.35)

where K∗v is the diffusion-disguised rate constant matrix:

K∗v = Kv H; H = (tanh Φw )Φ−1


w

where H is the effectiveness factor matrix.


It is easily verified that

K∗v = Kv for δc → 0 or ‖Φw ‖ → 0, (28.36)


28.2 Multicomponent diffusion and reaction in the washcoat layer of a monolith reactor | 711

while we have
1
Kv∗ = Kv (D−1
e Kv ) ‖Φw ‖ ≫ 1. (28.37)
−1/2
,
δc

Sherwood matrix for internal mass transfer


We define the internal mass transfer coefficient matrix kci by

𝜕cw 󵄨󵄨󵄨󵄨
kci (cs − cw ) = −De 󵄨 = jf −wc (28.38)
𝜕y 󵄨󵄨󵄨y=0

where the average concentration vector in the washcoat cw and the species flux vector
at fluid-washcoat interface jf −wc are given by

cw = Hcs ; jf −wc = δc Kv Hcs = δc K∗v cs , (28.39)

󳨐⇒

kci = δc Kv H(I − H)−1 (28.40)

Defining

Shi = δc D−1
e kci , (28.41)

we obtain

Shi = δc2 D−1


e Kv H(I − H)
−1

w (tanh Φw ) − Φw ] , (28.42)
−1
= [Φ−1 −1 −2

which is a generalization of the expression for the scalar case (slab or parallel plate
geometry).
[Remark: For arbitrary shaped catalyst particle, the expression of internal Sher-
wood number matrix can be shown as

Shi = Shi∞ I + Φw tanh[Λ∗ Φw ], (28.43)

which is a generalization of the scalar case (equation (28.27)). Here, the scalars Shi∞
and Λ∗ depend only on the shape/geometry of the washcoat/catalyst.]
The kci or Shi matrix can be calculated using the spectral theorem or the Cayley–
Hamilton theorem for calculating functions of a matrix. A sample calculation is shown
in the next section.
712 | 28 Multicomponent diffusion and reaction

28.3 Isothermal monolith reactor model for multiple reactions


Consider a single channel of a monolith reactor schematically shown in Figure 28.3.

Figure 28.3: Schematic diagram of a monolith reactor with washcoat/catalytic layer.

Considering the isothermal case, the steady-state reactor model that couples convec-
tion in the channel to transverse diffusion and reaction in the catalyst layer can be
expressed as

dcf
u = −av kce (cf − cs ); cf = cf ,in @ x = 0 (28.44)
dx
jf −wc = kce (cf − cs ) = kci (cs − cw ) (28.45)
= kc0 (cf − cw ) = δc Rv (cw ) (28.46)

where

k−1
c0 = kce + kci ,
−1 −1
(28.47)

is the overall mass transfer coefficient matrix, kce and kci are the external and internal
mass transfer coefficient matrices, respectively. For linear kinetics,

Rv (cw ) = Kv cw (28.48)

x
and we can write equation (28.44) with z = L
as

dcf τ
=− δ K (k + δc Kv )−1 kco cf = −Dacf ; (28.49)
dz RΩ c v c0
cf = cf ,in @ z = 0. (28.50)

L
Here, L is the channel length and τ = ū
is the space time. Thus, the exit concentration
vector is given by

cfe = cf (z = 1) = exp[−Da]cf ,in (28.51)


28.3 Isothermal monolith reactor model for multiple reactions | 713

Here, Da is the Damköhler matrix and it can be computed from Kv and kco as

εf τ
Da = δ K (k + δc Kv )−1 kco (28.52)
RΩ c v c0
1
kce = D Sh , (28.53)
4RΩ m e
kci = δc Kv [Φw (tanh Φw )−1 − I] . (28.54)
−1

Note that in dilute mixture, She may be assumed to be a diagonal matrix of external
Sherwood numbers.
Using these expressions, we can determine the impact of external mass trans-
fer as well as that of pore diffusion on the yield (or selectivity) of intermediate prod-
ucts.

28.3.1 Example: reversible sequential reactions

Consider the reaction scheme shown in Figure 28.4 among four components A, B, C
and D, with rate constants as shown.

Figure 28.4: Reaction scheme (reversible consecutive reactions with 4 components).

For simplicity, assume that the diffusivities of all species (in gas phase and washcoat)
are equal.
󳨐⇒

Dm
Kce = kce I, kce = SheΩ ; (28.55)

1 − 21 0 0
De −1 1 − 41 0
kci = Sh ; Kv = kA, A=( ), (28.56)
δc i 0 − 21 1
2
− 81
0 0 − 41 1
8

⇒ from equation (28.52),

δc kτ
Da = A(kco + δc kA)−1 kco (28.57)

where the overall mass-transfer coefficient matrix is given by


714 | 28 Multicomponent diffusion and reaction

1 δ
k−1
co = I + c Sh−1
kce De i
RΩ δ
= Sh−1
eΩ I + c Sh−1
Dm De i

τ D τ Dm δc
kco = m2 [Sh−1
eΩ I + μShi ] ;
−1 −1
μ= , (28.58)
RΩ RΩ De RΩ

where μ is the ratio of diffusion velocity in the fluid to that in the washcoat. This ex-
presses the Damköhler matrix (equation (28.57)) as

1 Dm τ
−1
Da = A( k + A) eΩ I + μShi )
(Sh−1 −1 −1
kδc co R2Ω
Dm Dm τ
−1
= A( eΩ I + μShi ] + A)
[Sh−1 −1 −1
eΩ I + μShi )
(Sh−1 −1 −1
kδc RΩ 2

Dm τ D
−1
= A{ m I + (Sh−1
eΩ I + μShi )A} ,
−1
2
RΩ kδc RΩ
RΩ −1 R2
−1
=[ A + Ω (Sh−1
eΩ I + μShi )] .
−1
(28.59)
kτδc Dm τ

Limiting cases
1. No external and internal resistances to mass transfer:
In this case, the Damköhler matrix (equation (28.59)) reduces to

δc
Da = kτA (28.60)

2. No pore diffusion resistance:


In this case, the overall mass transfer matrix consists of only the external mass-
transfer resistance, i. e.,

Dm
kc0 = kce = SheΩ I

and the Damkohler matrix (equation (28.59)) reduces to

RΩ −1 R2 1
−1
Da = { A + Ω I} . (28.61)
δc kτ Dm τ SheΩ

Note that in the limit of fast kinetics, k → ∞ (or ϕ2 ≫ 1), the Damköhler matrix
simplifies to diagonal form
28.3 Isothermal monolith reactor model for multiple reactions | 715

Dm τ
Da = SheΩ I, (28.62)
R2Ω

which is independent of kinetics (and hence temperature if average value of dif-


fusivity is used).
3. No external mass transfer resistance:
In this case, the overall mass transfer matrix consists of only the internal mass-
transfer resistance, i. e.,

De
kc0 = kci = Sh
δc i

and the Damköhler matrix simplifies to

RΩ −1 R2Ω μ −1
−1
Da = [ A + Sh ] .
kτδc Dm τ i
δ kτ
= c (A−1 + ϕ2 Sh−1 (28.63)
−1
i ) .

If Shi = Shi∞ I (i. e., asymptotic approximation), the Damköhler matrix reduces
(equation (28.63)) to

δc kτ ϕ2
−1
Da = A(I + A) .
RΩ Shi∞

For ϕ2 ≫ 1 (i. e., at high temperature or fast kinetics), the asymptotic approxima-
tion of internal Sherwood reduces the above expression for Damköhler matrix to
a diagonal form

De τ
Da = Sh I,
RΩ δc i∞

which is independent of kinetics, and hence temperature (except the temperature


dependence may be come through the diffusivity term). This is also true in general
even when external mass-transfer resistance is present (see limiting case 4 below).
4. Asymptotic internal Sherwood number (independent of kinetics):
If internal Sherwood number can be expressed by only asymptotic value, i. e.,
Shi = Shi∞ I, then the overall mass transfer coefficient matrix is diagonal, i. e.,

RΩ δc
−1
kco = ( + ) I
Dm SheΩ De Shi∞
Dm 1 μ
−1
= ( + ) I
RΩ SheΩ Shi∞

and the Damkohler matrix reduces from equation (28.59) to


716 | 28 Multicomponent diffusion and reaction

RΩ −1 R2 1 μ
−1
Da = [ A + Ω ( + )I] . (28.64)
kτδc Dm τ SheΩ Shi∞

If the first term is negligible or k → ∞, the Damköhler matrix can be further sim-
plified from equation (28.64) to

De τ 1 μ
−1
Da = ( + ) I,
R2Ω SheΩ Shi∞

which is independent of kinetics, and hence temperature as discussed earlier.


However, for fast kinetics, the internal Sherwood number usually depends on ki-
netics and asymptotic representation may be oversimplification.
5. Pore diffusion control limit with no external mass-transfer resistance:
If there is no external mass-transfer resistance and kinetics is fast (ϕ2 ≫ 1 or
k → ∞), the reactive transport process may be pore diffusion controlled. In this
case, the internal Sherwood number may be simplified from equation (28.42) or
equation (28.43) with Φ2w = ϕ2 A as

k
Shi ≡ Φw = ϕ√A = δc √ A,
De

which is the fast kinetics asymptote. Thus, in this limit, the Damkohler matrix
(equation (28.59)) reduces to

De τ D τ √kDe
Da = Sh = e ϕ√A = τ√A. (28.65)
δc RΩ i δc RΩ RΩ

Note that for the relative reaction rate constant matrix shown in Figure 28.4, the
matrix A is given in equation (28.56), which leads to

0.9033 −0.2951 −0.0435 −0.0174


√A = ( −0.5901 0.8598 −0.1997 −0.0413
). (28.66)
−0.1738 −0.3994 0.5929 −0.1749
−0.1393 −0.1653 −0.3497 0.2336

The positive square root of matrix A (when it has all positive eigenvalues) can
be obtained either using spectral method or Caley–Hamilton theorem (see Chap-
ter 5). Thus, while original reaction constant matrix is Kv = kA, the observed (pore
√kD
diffusion disguised) reaction rate constant matrix becomes Kobs = R √A. Fig-
e

Ω
ure 28.5 shows the effect of pore diffusion on reaction network suggesting the total
number of observed reaction to be 12 in contrast to 6 original reactions (i. e., 6 new
reactions appear—three reversible reactions between species A and D, A and C and
B and D).
28.3 Isothermal monolith reactor model for multiple reactions | 717

Figure 28.5: Original and pore diffusion disguised reaction networks.

In this case, the exit concentration vector can be expressed from equation (28.51)–
(28.54) as

√kDe
cfe = exp[− τ√A]cf ,in . (28.67)

Taking the temperature dependence of the rate constant and other parameters
values as

−12000 −1
k = 1012 exp( ) s ; T in K
T
RΩ = 100 µm; De = 10−7 m2 /s; τ = 1 ms,

and inlet concentration vector as


T
cf ,in = cin ( 0.97 0.01 0.01 0.01 ) ,

the exit concentration of each of the four components is calculated (using equa-
tion (28.51)) and plotted against temperature in Figure 28.6.
As expected, the concentration of A decreases and concentration of D increases
monotonically, while the concentration of the intermediate components can vary
nonmonotonically with temperature and may exhibit maxima. In this specific
case, the concentration of component C exhibits maxima near T = 915.6 K.

The above calculations may be repeated for the case of negligible pore diffusional re-
sistance or external mass transfer to show that the presence of either external or in-
ternal mass transfer reduces the yield (or maximum attainable concentration) of in-
termediate products.
718 | 28 Multicomponent diffusion and reaction

Figure 28.6: Exit concentrations of the four components in pore diffusion controlled limit.

Problems
1. Consider the three regular geometries of a sphere of diameter 2a, cylinder of height
and diameter 2a and a cube of side 2a.
(a) Show that the effective diffusion length (RΩ ) is the same for all three geome-
tries
(b) Formulate the diffusion–reaction problem with linear kinetics and uniform
activity for the three geometries and obtain the solution.
(c) Determine and plot the effectiveness factor and comment on the shape of the
curves.
(d) Determine and plot the internal Sherwood number for the three cases.
2. Consider a porous catalyst particle in the form of a hollow cylinder of length 2L,
inside radius μa (0 ≤ μ < 1) and outside radius a as shown in Figure 28.7.

Figure 28.7: Schematic of a porous catalyst particle in form of hollow cylinder.

(a) Formulate the diffusion–reaction problem for the case of Dirichlet BC at the
interior and exterior surface.
(b) Solve the model for the case of linear kinetics and determine an expression
for the effectiveness factor η.
(c) Determine the internal Sherwood number Shi and the various limits of η and
Shi for μ → 0, μ → 1, aL → ∞ and aL → 0.
28.3 Isothermal monolith reactor model for multiple reactions | 719

3. The steady-state concentration of a reactant inside a cylindrical pore with wall


reaction is described by the partial differential equation

1 𝜕 𝜕C 𝜕2 C
(r ) + 2 = 0; 0 < r < R, 0 < x < L,
r 𝜕r 𝜕r 𝜕x

with boundary conditions

𝜕C 𝜕C
= 0 @ x = 0; C = C0 @ x = L; C = finite @ r = 0; −D = kC @ r = R.
𝜕x 𝜕r

Here, R is the radius of pore, L is the length, k is the reaction rate constant, D is the
molecular diffusivity of the reactant and C0 is the concentration of the reactant in
the gas at the pore mouth.
(a) Cast the equations in dimensionless form and obtain a formal solution to the
concentration profile
(b) The quantity of interest is the pore effectiveness factor defined by

R
D 𝜕C
η= ∫ (L, r)2πr dr.
2πRLkC0 𝜕x
0

Determine an expression for the effectiveness factor


(c) Simplify the expression in (b) for the limiting case in which the pore radius is
very small compared to the length.
4. Consider the nonlinear boundary value problem describing multicomponent dif-
fusion and reaction in a catalyst layer:

d2 c
D = r(c), 0<x<L
dx 2
dc
c = cs @ x = L; = 0 @ x = 0.
dx

Show that linearization of the BVP at the surface conditions leads to

d2 u
= Φ2 u − α, u′ (0) = 0, u(1) = 0,
dx2

where

u = cs − c, Φ2 = L2 D−1 J, α = L2 D−1 r(cs )

and J is the Jacobian of the rate vector r(c) evaluated at c = cs . Solve the linearized
problem and show that the flux vector

dc 󵄨󵄨󵄨󵄨
js = D 󵄨
dx 󵄨󵄨󵄨x=L
720 | 28 Multicomponent diffusion and reaction

may be approximated by

js = LDHD−1 r(cs ), H = Φ−1 tanh Φ.

Discuss how you would use this result in a reactor model.


29 Packed-bed chromatography
A packed-bed is used to carry out reactions and separations. In Chapter 17, we exam-
ined the problem of heat transfer in a packed-bed. Here, we examine a closely related
problem of mass transfer with chromatography.

29.1 Model formulation


Consider a packed-bed of particles as shown in Figure 29.1 (top) with a schematic of
the porous particle packing and fluid flow around it.

Figure 29.1: Schematic of a packed-bed of porous particles for adsorption of solute from a fluid.

In developing a model, the following assumptions are made: (i) plug flow of fluid,
(ii) uniform and constant transport properties, (iii) small particles (i. e., no internal
gradients), (iv) dilute solution and (v) negligible axial dispersion in fluid phase. The
species (solute) balance can be expressed in fluid phase for the small volume element
(between x and x + Δx) as follows:

𝜕
[A Δxεcf ] = Ac εu0 cf |x − Ac εu0 cf |x+Δx − Ac Δxav kc (cf − cfi ) (29.1)
𝜕t c
󳨐⇒

𝜕cf 𝜕cf
ε = −εu0 − kc av (cf − cfi ) (29.2)
𝜕t 𝜕x

where ε is the bed porosity, Ac is the bed cross-section area, u0 is fluid interstitial veloc-
ity, cf is the solute concentration, kc is the mass transfer coefficient, av is the fluid-solid
interfacial area per unit bed volume, cfi is the solute concentration in fluid phase at

https://doi.org/10.1515/9783110739701-031
722 | 29 Packed-bed chromatography

the solid-fluid interface and cs is the solute concentration in the solid phase. Similarly,
the solid phase species balance can be expressed as

𝜕
[A Δx(1 − ε)cs ] = Ac Δxav kc (cf − cfi ) (29.3)
𝜕t c
󳨐⇒

𝜕cs
(1 − ε) = kc av (cf − cfi ). (29.4)
𝜕t

The above model (equations (29.2) and (29.4)) is not closed until cfi is represented in
terms of cs or cf . For this, we can utilize the adsorption isotherm as described below.

29.1.1 Adsorption isotherm

The rate of adsorption ra can be expressed as

ra = ka cfi (cs0 − cs ) (29.5)

where cs0 is the saturation concentration in the solid phase (adsorption capacity of
the solid). Defining the fractional adsorbed concentration θ as

cs
θ= = fraction of adsorbed sites, (29.6)
cs0

the rate of adsorption can be expressed as

ra = ka cfi cs0 (1 − θ). (29.7)

The rate of desorption may be expressed as

rd = kd cs = kd cs0 θ (29.8)

If we assume local equilibrium, or assume that adsorption/desorption rates are much


faster as compared to convective mass transfer, we have

ka cfi cs0 (1 − θ) = kd cs0 θ

or

θ k ka
= a c = Keq cfi ; Keq = = adsorption equilibrium constant (29.9)
1 − θ kd fi kd

󳨐⇒
29.1 Model formulation | 723

cs Keq cfi
θ= = (29.10)
cs0 1 + Keq cfi

Equation (29.10) gives the so-called Langmuir isotherm, which is also plotted in Fig-
ure 29.2.

Figure 29.2: Langmuir isotherm demonstrating the linear regime (Keq cfi ≪ 1) and saturation regime
(Keq cfi ≫ 1).

When Keq cfi ≪ 1, we can write

cs = (Keq cs0 )cfi = Kcfi ; K = Keq cs0 , (29.11)

which shows that the isotherm can be linearized in this regime (see Figure 29.2). Here,
K is the dimensionless adsorption equilibrium constant. In this linear regime, the
model (equations (29.2) and (29.4)) becomes closed and linear, and can be expressed
as follows:

𝜕cf 𝜕cf
ε( + u0 ) = −kc av (cf − cfi ) (29.12)
𝜕t 𝜕x
𝜕c
(1 − ε) s = kc av (cf − cfi ) (29.13)
𝜕t
cs = Kcfi (29.14)

along with initial and inlet conditions:

cf (x, t = 0) = cf 0 (x); cs (x, t = 0) = cs0 (x) (29.15)


cf (x = 0, t) = cfin (t) (29.16)
724 | 29 Packed-bed chromatography

For consistency, we should have initial conditions related as cs0 (x) = Kcf 0 (x). For a
column that is free from adsorbate (solute) initially, we can take cf 0 (x) = cs0 (x) = 0.
Below we consider only this specific case.

29.1.2 Nondimensional form

We can define the dimensionless spatial coordinate and time as follows:

x u0 t
z= ; τ= L = column length (29.17)
L L

and the following dimensionless groups:

K(1 − ε)
α= = capacitance ratio (29.18)
ε
εu0 1/(kc av )
p= = = local (or transverse) Peclet number (29.19)
kc av L L/(εu0 )

where 1/(kc av ) = tm represents the characteristic time for external mass-transfer; εu0 =
⟨u⟩ is the superficial velocity and L/⟨u⟩ = L/(εu0 ) = tc represents the convection (or
space) time; the local Peclet number p is the ratio of external mass-transfer time to
space time. Thus, the model (equations (29.12)–(29.16)) can be expressed in nondi-
mensional form as follows:

𝜕cf 𝜕cf
p( + ) = −(cf − cfi ) (29.20)
𝜕τ 𝜕z
𝜕cfi
αp = (cf − cfi ), τ > 0, 0<z<1 (29.21)
𝜕τ

along with initial and inlet conditions:

cf (z, τ = 0) = 0; cfi (z, τ = 0) = 0 (29.22)


cf (z = 0, τ) = cin (τ) (29.23)

Note that cs is eliminated by using the local equilibrium relation equation (29.14). Sim-
ilarly, cfi can be eliminated from equation (29.20) and can be expressed in terms of cf
as

𝜕cf 𝜕cf
cfi = cf + p( + ) (29.24)
𝜕τ 𝜕z

which can be substituted in equation (29.21) to obtain

𝜕cf 𝜕2 c f 𝜕2 cf 𝜕cf 𝜕cf


αp( +p +p ) = −p( + )
𝜕τ 𝜕τ2 𝜕τ𝜕z 𝜕τ 𝜕z
29.1 Model formulation | 725

󳨐⇒
2 2
𝜕cf 1 𝜕cf αp 𝜕 cf αp 𝜕 cf
+ + + =0 (29.25)
𝜕τ 1 + α 𝜕z 1 + α 𝜕τ𝜕z 1 + α 𝜕τ2
cf (z, τ = 0) = 0; cf (z = 0, τ) = cin (τ) (29.26)

The above model (equations (29.25)–(29.26)) is a single second-order PDE (of hyper-
bolic type) for cf (z, τ). The exit concentration, i. e., cf (z = 1, τ), can be solved and
plotted as a function of time for any given input (or initial condition). The response
to a unit step input is referred to as the breakthrough curve while response to a Dirac
delta function (or pulse) input is referred to as the dispersion curve.

29.1.3 Limiting case: p → 0

In the limiting case of p → 0, the external mass transfer resistance is neglected and
cfi = cf . In this limit, the model equations (29.25)–(29.26) reduces to a single hyperbolic
equation (plug flow) as

1 𝜕cf
𝜕cf
+ = 0, τ > 0, 0<z<1 (29.27)
𝜕τ 1 + α 𝜕z
cf (z, τ = 0) = 0; cf (z = 0, τ) = cin (τ) (29.28)

The above model can be solved by Laplace transform (LT). Let

ĉf (z, s) = ℒ[cf (z, τ)] = Laplace transform of cf (z, τ) (29.29)

Taking LT, equations (29.27)–(29.28) give

dĉf
= −(1 + α)sĉf ; ĉf (z = 0) = ĉ
in (s)
dz
󳨐⇒

ĉf = exp[−(1 + α)sz]ĉ


in (s) (29.30)

which after taking inverse LT, we get

cf (z, τ) = cin (τ − (1 + α)z) (29.31)

Thus, for unit-step input given by the Heaviside’s unit-step function

1, τ>0
cin (τ) = H(τ) = { (29.32)
0, τ < 0,
726 | 29 Packed-bed chromatography

the concentration profile can be given from equation (29.31) as

1, τ > (1 + α)z
cf (z, τ) = H(τ − (1 + α)z) = { (29.33)
0, τ < (1 + α)z

dz 1
Thus, the step input or discontinuity moves with a dimensionless speed dτ
= 1+α
, or
the adsorption front moves with a velocity

dx u0
= . (29.34)
dt 1 + K(1−ε)
ε

It can also be seen from Figure 29.3, which shows the profiles of unit-step input and
the concentration. However, for finite but small values of p, the front velocity remains
the same but the front is not sharp due to dispersion or mass transfer effects.

Figure 29.3: Solution profile with unit step input for packed-bed chromatography in the plug flow
regime.

Equation (29.34) is of fundamental importance in chromatographic separations. It


shows that a solute that interacts with the packing and having an adsorption equilib-
u0
rium constant of K moves through the bed at a speed of 1+α , α = K( 1−ε
ε
). Thus, if a pulse
containing several solutes (having different Ki ) is injected at the inlet to the column,
the solutes move with different velocities, with the one having high Ki value moving
slowly. Thus, the solutes are separated with the separation distance increasing along
the column.
It also follows from equation (29.34) that when the adsorbate/particles/solid in the
bed is initially free of the solute and a step input is introduced, the bed gets saturated
with the solute and breakthrough occurs (as a step function) at t = uL (1 + α). As we
0
29.2 Similarity with heat transfer in packed-beds | 727

show below, when mass transfer and dispersion are present, the breakthrough curves
(and separation of solutes) are not sharp.

29.2 Similarity with heat transfer in packed-beds


As shown in Section 17.5.4, the analogous heat transfer problem is described by

𝜕θf 𝜕θf
ph ( + ) = −(θf − θs ); (29.35)
𝜕τ 𝜕z
𝜕θs
αh p h = (θf − θs ), τ > 0, 0 < z, 1 (29.36)
𝜕τ
θf (z, 0) = 0; θs (z, 0) = 0 (29.37)

θf (0, τ) = θin (τ) (29.38)

Here, the local Peclet number ph is the ratio of heat transfer time th to the space time tc .
We note that the model (equations (29.35)–(29.38)) for heat transfer in packed-bed is
identical to that for chromatography in packed-bed, where θf , θs , αh , ph and θin can be
replaced by cf , cfi , α, p and cin , respectively.

29.3 Impact of interphase mass transfer


Now consider the packed-bed chromatography model given in equations (29.25)–
(29.26). For small values of p, i. e., p → 0, the model leads to

𝜕cf 1 𝜕cf
≈− + O(p). (29.39)
𝜕τ 1 + α 𝜕z
Using the above approximation in equation (29.25), we get

𝜕cf 𝜕cf 𝜕2 cf 2
1 𝜕 cf
(1 + α) + + αp( − + O(p)) = 0
𝜕τ 𝜕z 𝜕τ𝜕z 1 + α 𝜕τ𝜕z
󳨐⇒
2
𝜕cf 𝜕cf α2 p 𝜕 cf
(1 + α) + + = 0 + O(p2 ), (29.40)
𝜕τ 𝜕z 1 + α 𝜕τ𝜕z
which is the hyperbolic form of the model. The same approximation (equation (29.39))
can be used again in equation (29.40) that leads to
2
𝜕cf 𝜕cf α2 p −1 𝜕 cf
(1 + α) + + ( + O(p)) = 0 + O(p2 )
𝜕τ 𝜕z 1 + α 1 + α 𝜕z 2
728 | 29 Packed-bed chromatography

󳨐⇒
2
𝜕cf 𝜕cf α 2 p 𝜕 cf
(1 + α) + − = 0 + O(p2 )
𝜕τ 𝜕z (1 + α)2 𝜕z 2
󳨐⇒
2
𝜕cf 1 𝜕cf α2 p 𝜕 cf
+ − = 0 + O(p2 ) (29.41)
𝜕τ (1 + α) 𝜕z (1 + α)3 𝜕z 2

which is the parabolic form of the model where the dimensionless dispersion coeffi-
α2 p
cient is (1+α) 3.

The model equation can also be expressed in interfacial concentration mode (cfi )
for chromatography or solid temperature (θs ) for heat transfer. For this, let us con-
sider the two-mode model (equations (29.20)–(29.23)). We can use equation (29.21) to
express cf in terms of cfi as

𝜕cfi
cf = cfi + αp (29.42)
𝜕τ

and substitute it into equation (29.20), which leads to

𝜕cfi 𝜕cfi 𝜕2 cfi 𝜕2 cfi 𝜕cfi


p( + + αp + αp ) = −αp
𝜕τ 𝜕z 𝜕τ2 𝜕τ𝜕z 𝜕τ
󳨐⇒
2 2
𝜕cfi 1 𝜕cfi αp 𝜕 cfi αp 𝜕 cfi
+ + + = 0, (29.43)
𝜕τ (1 + α) 𝜕z (1 + α) 𝜕τ𝜕z (1 + α) 𝜕τ2

which is the same equation as that satisfied by cf as given in equation (29.25). There-
c
fore, using equation (29.11), cfi = Ks can be substituted in above equation (29.43), which
leads to equation in solid phase concentration cs as

𝜕cs 1 𝜕cs αp 𝜕2 cs αp 𝜕2 cs
+ + + =0 (29.44)
𝜕τ (1 + α) 𝜕z (1 + α) 𝜕τ𝜕z (1 + α) 𝜕τ2

Thus, all concentration modes cf , cfi and cs satisfy the same equation (see equations
(29.25), (29.43) and (29.44)).
Thus, in the limit of p → 0, the front velocity is the same in solid and fluid phases.
In addition, dispersion of the front in the two phases is also the same. In general, this
is not true when axial conduction or intraparticle gradients are included.
[Remark: Though the differential equations satisfied by cf (z, τ), cfi (z, τ) and cs (z, τ)
are the same, the initial and inlet conditions are different for these variables.]
29.4 Solution of the hyperbolic model by Laplace transform | 729

29.3.1 Pseudo-homogeneous model

The capacitance weighted concentration can be defined by


cf + αcfi
cm = εcf + K(1 − ε)cfi = . (29.45)
1+α
Since, both cf and cfi satisfy the same linear differential equation, we multiply the cf -
1 α
equation (equations (29.25)–(29.26)) by 1+α and cfi -equation (equation (29.43)) by 1+α ,
and add, which leads to

𝜕cm 𝜕cm 𝜕2 c 𝜕2 c
(1 + α) + + αp m + αp 2m = 0 (29.46)
𝜕τ 𝜕z 𝜕τ𝜕z 𝜕τ
For small values of p, the above hyperbolic model can be expressed in parabolic form
as

𝜕cm 1 𝜕cm α 2 p 𝜕2 c m
+ − =0
𝜕τ (1 + α) 𝜕z (1 + α)3 𝜕z 2

󳨐⇒ in dimensional form as
2
𝜕cm u0 𝜕cm α2 2 ε 𝜕 cm
+ − u 0 =0
𝜕t (1 + α) 𝜕x (1 + α)3 kc av 𝜕x 2
󳨐⇒

𝜕cm 𝜕c 𝜕2 c
+ ueff m − Deff 2m = 0; (29.47)
𝜕t 𝜕x 𝜕x
u0 ε α2 u20
ueff = ; Deff = (29.48)
1+α kc av (1 + α)3

where ueff and Deff are the effective velocity and effective dispersion coefficient, re-
spectively. This leads to the effective axial Peclet number Peeff as

ueff L (1 + α)2 Lkc av (1 + α)2 1


Peeff = = = (29.49)
Deff α2 u0 ε α2 p

29.4 Solution of the hyperbolic model by Laplace transform


Consider the two-phase hyperbolic model (equations (29.20)–(29.23)). Taking the
Laplace transform, we can write equation (29.21) as

ĉf (z, s)
αpsĉfi = (ĉf − ĉfi ) 󳨐⇒ ĉfi (z, s) = (29.50)
1 + αps

Thus, equation (29.20) gives


730 | 29 Packed-bed chromatography

dĉf αs
= −sĉf − ĉ ; ĉf (z = 0, s) = ĉ
in (s)
dz 1 + αps f
󳨐⇒

αsz
ĉf (z, s) = exp[−sz − ]ĉ (s). (29.51)
1 + αps in

We have already discussed the inversion of equation (29.51) in Section 17.5.4.


For a pulse input, the dispersion curve E(τ) is given by

exp[− p1 − (τ−1) 1
].[√ αp2 (τ−1) I1 (2√ (τ−1) ) + δ(τ − 1)], τ>1
E(τ) = { αp αp2 (29.52)
0, τ<1

Long time asymptote


exp[u]
In the limit of u ≫ 1, I1 (u) ≈ √2πu
. Therefore, the dispersion curve in the limit of
τ−1 2
αp2
≫ 1 (or τ ≫ 1 + αp ) can be obtained from equation (29.52) as follows:

1 1 1 (τ − 1) 2 √ (τ − 1)
E(τ)|τ≫1+αp2 = √ exp[− − + ]
4πp [α(τ − 1) ]
3 1/4 p αp p α
1/4
1 (√(τ − 1) − √α)2
=( ) exp[− ]. (29.53)
16π p α(τ − 1)3
2 2 αp

The above expression can be simplified to evaluate the E-value at the effective resi-
dence time τ = 1 + α as
1
E(τ)|τ=1+α = . (29.54)
α√4πp

Figure 29.4 shows the dispersion curves E(τ) for specific value of α = 5 and
p-values from 0.01 to 0.1. It can be seen from this figure that for small p-values (i. e.,
p → 0), the dispersion curve is symmetric around τ = 1 + α with maximum value given
in equation (29.54). For large p-values, the dispersion curve is asymmetric and may
have long tail.
1
For a unit-step input, cin (τ) = 1 (or ĉ
in (s) = s ), the breakthrough curve is given by

0, τ<1
F(τ) = { (29.55)
cf∗ (τ, 1), τ>1
τ−1
1 [ exp[− αp ].I0 (2√ αp2 )
(τ−1)

cf∗ (τ, 1) = exp[− ] ]. (29.56)


p + 1

τ−1
exp[− u
].I (2 u
) du
[ αp 0 αp 0 √ αp 2 ]

The breakthrough curve (i. e., exit concentration versus time plot) at various times is
shown in Figure 29.5 for α = 5 and p-values in the range of 0.01 to 0.2. Here, break-
29.5 Chromatography model with dispersion in fluid phase | 731

Figure 29.4: Dispersion curves E(τ) for packed-bed chromatography for α = 5 and p-values varying
from 0.01 to 0.1.

Figure 29.5: Breakthrough curve for packed-bed chromatography for various transverse Peclet num-
ber p.

through occurs at τ = 1 + α = 6 (as expected), where all the curves (for small p) pass
through the point (6, 0.5). However, as p increases, the breakthrough curves are not
symmetric.

29.5 Chromatography model with dispersion in fluid phase


When dispersion in fluid phase is included, the model can be expressed as

2
𝜕cf 𝜕cf p 𝜕 cf
p( + ) = −(cf − cfi ) + (29.57)
𝜕τ 𝜕z Pemf 𝜕z 2
𝜕cfi
αp = (cf − cfi ), τ > 0, 0<z<1 (29.58)
𝜕τ

with initial conditions:


732 | 29 Packed-bed chromatography

cf (z, 0) = cfi (z, 0) = 0 (zero ICs), (29.59)

inlet condition:

1 𝜕cf
= cf − cin (τ) @ z = 0 (29.60)
Pemf 𝜕z

and exit boundary condition:

𝜕cf
= 0 @ z = 1. (29.61)
𝜕z

Here, we have three dimensionless parameters: α, p and Pemf . The new parameter is
u L
the axial Peclet number Pemf = D0 , where Dxe is the effective axial dispersion coeffi-
xe
cient.

29.5.1 Limiting cases

1. Pemf → ∞ (i. e., negligible dispersion in fluid phase). In this case, the model
reduces to hyperbolic form and is discussed earlier.
2. Pemf → 0 (i. e., fluid phase is well mixed or no axial gradient in fluid or solid
phase). In this case, the model reduces to a set of ODEs:

dcf
p = −(cf − cfi ) + p[cin (τ) − cf ] (29.62)

dcfi
αp = (cf − cfi ), τ > 0, (29.63)

cf = cfi = 0 @ τ = 0. (29.64)

This is also referred to as lumped model (or single stage model).


3. p → 0 or α → 0. In this case, the model reduces to axial dispersion model, which
is discussed in detail in previous chapters.

29.5.2 Lumped model for p → 0

The lumped model (equations (29.62)–(29.64)) in the limit of p → 0 (i. e., neglecting
difference between cf and cfi ) reduces to

dcf dcf
p + αp = p[cin (τ) − cf ]; cf (τ = 0) = 0
dτ dτ
󳨐⇒
29.5 Chromatography model with dispersion in fluid phase | 733

dcf
(1 + α) = cin (τ) − cf , τ > 0; cf = 0 @ τ = 0 (29.65)

LT (τ → s) gives

(1 + α)sĉf = ĉin (s) − ĉf



in (s) ĉ
in (s)
󳨐⇒ ĉf = = (29.66)
1 + s(1 + α) (1 + α)[s + 1 ]
(1+α)

1
For a unit step input: ĉ
in (s) = s , we get


in (s) 1 1
ĉf = = − (29.67)
(1 + α)s[s + 1
] s s+ 1
(1+α) (1+α)

󳨐⇒ the exit concentration or breakthrough curve is given by


τ
cf = 1 − e− 1+α . (29.68)

29.5.3 Lumped model for p > 0

Consider the lumped model for p > 0 as given in equations (29.62)–(29.64). Taking LT
gives

spĉf = −(ĉf − ĉfi ) + p(ĉi n(s) − ĉf ) (29.69)


ĉf
αspĉfi = ĉf − ĉfi 󳨐⇒ ĉfi = (29.70)
1 + αps

Substituting equation (29.70) in equation (29.69) and simplifying leads to

(1 + αps)ĉ
in (s)
ĉf = . (29.71)
[1 + αs + αps + s + αps2 ]

Thus, the breakthrough curves can be obtained by considering a unit step input
1
(ĉ
in (s) = s ) as

(1 + αps)
F(s)
̂ = , (29.72)
s[1 + αs + αps + s + αps2 ]

while the dispersion curve can be obtained by considering a unit impulse input
(ĉ
in (s) = 1) as

(1 + αps)
E(s)
̂ = . (29.73)
[1 + αs + αps + s + αps2 ]

Note that the LT of the dispersion curve (equation (29.73)) has two poles for p ≠ 0 given
by
734 | 29 Packed-bed chromatography

−(1 + α + αp) ± √△
s1 , s2 = ; (29.74)
2αp
△ = (1 + α + αp)2 − 4αp
= (1 + α)2 + α2 p2 + 2α(α − 1)p. (29.75)

It can be shown that △ > 0 for all α > 0 and p > 0, and s1 , s2 are always real and
negative.
Write

̂ = 1 E(s)
F(s) ̂ = 1 + αps
s sαp(s − s1 )(s − s2 )
1
s+ αp
= . (29.76)
s(s − s1 )(s − s2 )

󳨐⇒
1
Residue est F(s)|
̂ s=0 = =1 (29.77)
αps1 s2
1
sτ ̂
es1 τ (s1 + αp
)
Residue e F(s)|s=s1 = = β1 es1 τ (let) (29.78)
s1 (s1 − s2 )
1
es2 τ (s2 + αp
)
Residue esτ F(s)|
̂ s=s = = β2 es2 τ (let) (29.79)
2
s2 (s2 − s1 )

󳨐⇒

F(τ) = 1 + β1 es1 τ + β2 es2 τ ,

where β1 and β2 are given by equations (29.78) and (29.79). An example of the break-
through curve is plotted in Figure 29.6 corresponding to p = 0.01 and α = 5.0.

Figure 29.6: A plot of the breakthrough curve from lumped model with α = 5 and p = 0.01.
29.5 Chromatography model with dispersion in fluid phase | 735

29.5.4 Chromatography model with dispersion in fluid phase for unit impulse input

Consider the chromatography model with dispersion in fluid phase (equations (29.57)–
(29.60)) with unit impulse input cin (τ) = δ(τ). LT of equations (29.57)–(29.60) gives

2
dĉf p d ĉf
p(sĉf + ) = −(ĉf − ĉfi ) + (29.80)
dz Pemf dz 2
ĉf
αpsĉfi = (ĉf − ĉfi ) 󳨐⇒ ĉfi = (29.81)
1 + αps
󳨐⇒
2
1 d ĉf dĉf αs
− − sĉf − ĉ = 0, (29.82)
Pemf dz 2 dz 1 + αps f

with boundary conditions:

1 dĉf 𝜕ĉf
= ĉf − 1 @ z = 0; = 0 @ z = 1. (29.83)
Pemf dz 𝜕z

Note that for p = 0 or α = 0, the above model reduces to axial dispersion model while
in the limit of Pemf → ∞, it reduces to the hyperbolic model. Here, we consider the
mixed case where α, p and Pemf are finite. In this case, the LT of the dispersion curve
can be expressed by solving equations (29.82)–(29.83) as
Pemf
4qe 2
E(s)
̂ = ĉf (z = 1, s) =
Pemf q − Pemf q
(29.84)
[(1 + q)2 e 2 − (1 − q)2 e 2 ]

where

4s α
q = √1 + (1 + ). (29.85)
Pemf 1 + αps

The above equations (29.84)–(29.85) can be numerically inverted. Figure 29.7 shows
a plot of dispersion curves from numerical LT inversion for α = 5, p = 0.01 and for
various values of Pemf in the range of 1 to 1000.
The LT solution (equations (29.84)–(29.85)) can also be expanded as power series
in s to obtain the temporal moments without Laplace inversion. Expanding E(s)
̂ gives

2
̂ = 1 − (1 + α)s + s M2 + O(s3 )
E(s) (29.86)
2
736 | 29 Packed-bed chromatography

Figure 29.7: Dispersion curves for chromatography model with dispersion in fluid phase for α = 5,
p = 0.01 and various values of Pem f in the range of 1 to 1000.

where

(1 + α)2
M2 = 1 + 2α + α2 + 2pα2 + [2 Pemf +2e− Pemf − 2] (29.87)
Pe2mf

Thus, the zeroth and first moments are

M0 = 1; M1 = (1 + α) (29.88)

and the dimensionless variance is

M2 − M12 2pα2 2 2
σθ2 = = +[ − (1 − e− Pemf )], (29.89)
M12 (1 + α)2 Pemf Pe2mf

where the first term (in equation (29.89)) is the dispersion contribution due to external
mass transfer and second term (in square bracket) is the dispersion contribution due
to mixing in fluid phase.

29.5.5 Finite stage chromatography model

Consider the hyperbolic model (equations (29.20)–(29.23)). When the upwinding first-
order discretization scheme is used for the convection term, the discretized model can
be expressed as

dcf ,j cf ,j − cf ,j−1 (cf ,j − cfi,j )


+ =− ; cf ,0 = cin (τ); (29.90)
dτ Δz p
dcfi,j (cf ,j − cfi,j )
α = ; cf ,j (τ = 0) = 0 = cfi,j (τ = 0); j = 1, 2, . . . , N (29.91)
dτ p

Taking LT gives
29.6 Impact of intraparticle gradients | 737

psĉ
f ,j + (ĉ
f ,j − ĉ
f ,j−1 )Np = −(ĉ
f ,j − c
̂ fi,j ); ĉ
f ,0 = c
̂ in (s); (29.92)

f ,j
αpsĉ
fi,j = (ĉ
f ,j − c
̂fi,j ) 󳨐⇒ c
̂fi,j = (29.93)
1 + αps
1
where Δz = N
. Thus, the LT solution can be expressed as


f ,j−1

f ,j = s αs ; ĉ
f ,0 = c
̂ in (s) (29.94)
1+ N
+ (1+αsp)N

󳨐⇒

ĉin (s)

f ,N = s αs (29.95)
[1 + N
+ (1+αsp)N ]N

For large N, i. e., when N → ∞, equation (29.95) reduces to


αs
cf (s) = e .ĉ (29.96)
−(s+ 1+αsp ).
in (s)

The LT solution (equation (29.95)) can be expanded as power series in s to determine


the moments as discussed earlier.

29.6 Impact of intraparticle gradients


The packed-bed chromatography model examined in the previous sections does not
include gradients that could exist within the particles, and is valid only for beds of
small particles. We now extend the model to include intraparticle gradients. Let the
bed be composed of spherical particles of diameter 2a and let De be the effective intra-
particle diffusivity of the solute. Using the same notation, the model may be expressed
as
𝜕cf 𝜕cf 1
+ = − (cf − cfi ) (29.97)
𝜕τ 𝜕z p
1 − ε 𝜕cs 1
( ) = (cf − cfi ) (29.98)
ε 𝜕τ p
1 𝜕
= 2 (ξ 2 s ),
𝜕cs 𝜕c
Λ 0<ξ <1 (29.99)
𝜕τ ξ 𝜕ξ 𝜕ξ
𝜕cs 𝜕cs 󵄨󵄨󵄨󵄨
= 0 @ ξ = 0; 󵄨 = Sh(cf − cfi ) (29.100)
𝜕ξ 𝜕ξ 󵄨󵄨󵄨ξ =1
cs = csi = Kcfi @ ξ = 1 (29.101)

with initial and boundary conditions

cf (z, 0) = 0; cs (ξ , z, 0) = 0; cf (0, τ) = cin (τ). (29.102)


738 | 29 Packed-bed chromatography

The new dimensionless groups appearing above are given by

a2 u0 kc a Sh∗
Λ= ; Sh∗ = ; Sh = .
LDe De 5

Here, cs is the average concentration of the adsorbed solute in the particle:

cs = ∫ 3ξ 2 cs (ξ , z, τ) dξ . (29.103)
0

It may be seen that for (a2 /De ) → 0, the gradient inside the particle becomes negligible
and cs = csi = Kcfi and the model reduces to the hyperbolic two-phase model. When
the gradient inside the particle is small, by simplifying equations (29.99)–(29.101) and
(29.103), it may be shown that

Sh
cs = (1 + )c − Sh cf . (29.104)
K si

[Remark: In the literature, this approximation is known as the linear driving force
or parabolic profile approximation.] Using equation (29.104), the model (equation
(29.98)) may be expressed as

Sh 𝜕cfi 1−ε 𝜕cf 1


α(1 + ) −( ) Sh = (cf − cfi ). (29.105)
K 𝜕τ ε 𝜕τ p

Equations (29.97) and (29.105) along with the initial and inlet conditions:

cf (z, 0) = 0; cfi (z, 0) = 0; cf (0, τ) = cin (τ), (29.106)

define the model that includes intraparticle gradients.


The LT method may be used to obtain an analytic solution to equations (29.97),
(29.105) and (29.106). We note that elimination of cfi gives

2
𝜕cf 𝜕cf Sh 𝜕 cf 𝜕2 cf
(1 + α) + + αp(1 + )( 2 + ) = 0. (29.107)
𝜕τ 𝜕z K 𝜕τ 𝜕z𝜕τ

Using leading order approximation, equation (29.107) may be written as


2
𝜕cf 1 𝜕cf α2 p Sh 𝜕 cf
+ + (1 + ) + O(p2 ) = 0. (29.108)
𝜕τ (1 + α) 𝜕z (1 + α)3 K 𝜕z 2

Thus, as can be expected, including of intraparticle gradients does not change the
speed of the adsorption front but the effective axial Peclet number becomes

1 α2 Sh
= p(1 + ) (29.109)
Peeff (1 + α)3 K
29.6 Impact of intraparticle gradients | 739

or in dimensional form

εu20 α2 1 a2
Deff = [ + ]. (29.110)
(1 + α)3 kc av 15De

We note that the first term in the parentheses of equation (29.110) is the external mass
transfer time while the second term represents the intraparticle diffusion time.

Problems
1. Consider the problem of solute uptake by a spherical particle (of radius a), which
is initially of solute free and surface exposed to a time varying concentration csi (t).
Solve the intraparticle diffusion problem and show that the concentration within
the particle is given by

t
2De ∞ (−1)n+1 nπ nπr D n2 π 2
− e (t−t ′ )
cs (r, t) = ∑ sin[ ] ∫ e a2 csi (t ′ ) dt ′
a n=1 r a
0

2. Obtain the solution, and hence the breakthrough curves for the chromatogra-
phy model with intraparticle gradients defined by equations (29.97), (29.105) and
(29.106) of Section 29.6.
3. Determine the dimensionless second central moment of the dispersion curve for
the chromatographic model that accounts for external mass transfer, dispersion
in the fluid phase and intraparticle gradients.
30 Stability of transport and reaction processes
In this chapter, we discuss two problems in which the stability of a base solution is de-
termined by examining the eigenvalues of the linearized system of differential equa-
tions.

30.1 Lapwood convection in a porous rectangular box


30.1.1 Model formulation

Consider a fluid filled porous medium in a closed rectangular box of dimensions L × H


as shown schematically in Figure 30.1.

Figure 30.1: Schematic diagram illustrating Lapwood convection in a porous rectangular box.

The (continuity, momentum/Darcy’s law and energy balance) equations describing


the velocity, pressure and temperature of the fluid and porous medium (Figure 30.1)
are given by

∇′ ⋅ u′ = 0 (continuity equation) (30.1)


μ
∇′ p′ = −ρgez − u′ (Darcy’s law) (30.2)
κ
𝜕T
σ = −u′ ⋅ ∇′ T + λe ∇′2 T (energy balance) (30.3)
𝜕t

where u′ , p′ and T are velocity vector, pressure and temperature, respectively; σ =


ρm cpm
ρf cpf
is the dimensionless heat capacity ratio; κ is the permeability; λe is the effective
thermal diffusivity; g is the acceleration due to gravity; ez is the unit vector in the
direction of gravity and ρ is the density, which may vary with temperature and can be
expressed using Boussinesq approximation as

ρ = ρ0 ⋅ [1 − β(T − T0 )]. (30.4)

https://doi.org/10.1515/9783110739701-032
30.1 Lapwood convection in a porous rectangular box | 741

The boundary and initial conditions are given by

T = T1 @ z ′ = 0; T = T0 @ z ′ = H (30.5)
𝜕T
= 0 @ x ′ = 0, L; T = T ∗ (x ′ , z ′ ) @ t = 0 (30.6)
𝜕x
n.u′ = 0 @ z ′ = 0, H and x ′ = 0, L (30.7)

Here, n is the unit normal to the boundary.

Dimensionless form
We define following dimensionless variables:

z′ x′ L T − T0 H ′ 1
z= , x= , α= , θ= , u= u, ∇′ = ∇, (30.8)
H H H T1 − T0 λe H
λe t κgρ0 Hβ(T1 − T0 ) Rad p′
τ= , Rad = , p= [ + z] (30.9)
H2 μλe β(T1 − T0 ) ρ0 gH

and express the model equations in the following dimensionless form:

∇⋅u=0 (30.10)
∇p = Rad θez − u (30.11)

= −u ⋅ ∇θ + ∇2 θ
𝜕θ
(30.12)
𝜕τ
with boundary conditions,

θ = 1 @ z = 0; θ = 0 @ z = 1; (30.13)
𝜕θ
= 0 @ x = 0, α; θ = θ∗ @ τ = 0 (30.14)
𝜕x
u ⋅ n = 0 @ z = 0, 1 and x = 0, α (30.15)

where Rad is known as the Darcy–Rayleigh number. Note that Darcy’s law does not
permit the specification of tangential velocity at a boundary. We can only specify the
normal component of the velocity to be zero. In component form, the model may be
written as
𝜕ux 𝜕uz
+ =0 (30.16)
𝜕x 𝜕z
𝜕p
= −ux (30.17)
𝜕x
𝜕p
= Rad θ − uz (30.18)
𝜕z
𝜕θ 𝜕θ 𝜕θ 𝜕2 θ 𝜕2 θ
= −ux − uz + + . (30.19)
𝜕τ 𝜕x 𝜕z 𝜕x 2 𝜕z 2
742 | 30 Stability of transport and reaction processes

We can satisfy the continuity equation and remove the pressure and velocity variables
from these equations by introducing the stream function ψ(x, z) as follows:

𝜕ψ 𝜕ψ
ux = − , uz = (30.20)
𝜕z 𝜕x
󳨐⇒

𝜕p 𝜕ψ 𝜕2 p 𝜕2 ψ
= −ux = ⇒ = 2 (30.21)
𝜕x 𝜕z 𝜕x𝜕z 𝜕z
𝜕p 𝜕ψ 𝜕2 p 𝜕θ 𝜕2 ψ
= Rad θ − ⇒ = Rad − (30.22)
𝜕z 𝜕x 𝜕z𝜕x 𝜕x 𝜕x 2
󳨐⇒

𝜕2 ψ 𝜕2 ψ 𝜕θ
+ 2 = Rad (30.23)
𝜕x 2 𝜕z 𝜕x
Therefore, the model equations may be expressed in terms of two scalar variables
ψ(x, y) and θ(x, y) as

∇2 ψ = Rad
𝜕θ
(30.24)
𝜕x
. + ∇2 θ,
𝜕θ 𝜕ψ 𝜕θ 𝜕ψ 𝜕θ
= . − 0 < x < α, 0<z<1 (30.25)
𝜕τ 𝜕z 𝜕x 𝜕x 𝜕z
with the boundary conditions

𝜕ψ 󵄨󵄨󵄨󵄨 𝜕ψ 󵄨󵄨󵄨󵄨
󵄨󵄨 = 0 and 󵄨 =0 (30.26)
𝜕z 󵄨󵄨x=0,α 𝜕x 󵄨󵄨󵄨z=0,1
𝜕θ 󵄨󵄨󵄨󵄨
󵄨 = 0; θ|z=0 = 1; θ|z=1 = 0, (30.27)
𝜕x 󵄨󵄨󵄨x=0,α

and appropriate initial conditions for the time-dependent case. The boundary condi-
tions on ψ can also be taken as

ψ = 0 @ x = 0, α and z = 0, 1 (30.28)

instead of equation (30.26). Both sets of boundary conditions give the same solution,
and in what follows we use the second set given by equation (30.28).

30.1.2 Conduction state and its stability

The steady-state model can be obtained by setting 𝜕


𝜕τ
= 0 and may be written as

ψ ∇2 ψ − Rad 𝜕θ
𝜕x 0
F( )=( )=( ) (30.29)
θ ∇2 θ + 𝜕z 𝜕θ
𝜕ψ

𝜕ψ 𝜕θ 0
𝜕x 𝜕x 𝜕z
30.1 Lapwood convection in a porous rectangular box | 743

ψ = 0 @ x = 0, α ∧ z = 0, 1 (30.30)
󵄨
𝜕θ 󵄨󵄨󵄨
󵄨 = 0; θ|z=0 = 1; θ|z=1 = 0; (30.31)
𝜕x 󵄨󵄨󵄨x=0,α

Note that the only nonlinear terms in the model equations are the quadratic convec-
tion terms in equation (30.29). The base state or conduction solution that exists for all
values of Rad is given by

ψ0 (x, z) = 0 and θ0 (x, z) = 1 − z (30.32)

As the Rayleigh number Rad increases, the buoyancy force overcomes the viscous
force and fluid begins to move or convection sets in. Our aim is to determine the crit-
ical value of Rad at which the conduction state loses stability leading to convection
states. We also note that if ( ψ(x,z)
θ(x,z)
) is a solution of equations (30.29)–(30.31) then so is
( −ψ(α−x,z)
θ(α−x,z)
). Thus, the convective solutions appear in pairs having reflectional symme-
try in the domain. Let u0 and v given by

v1 (x, z) ψ0 (x, z) 0
v=( ), u0 = ( )=( ), (30.33)
v2 (x, z) θ0 (x, z) 1−z

denote the base state and perturbation to the base state. To determine the stability of
the base state u0 (equation (30.33)), we linearize the model equations:

DF(ψ0 , θ0 , Rad ) ⋅ v
𝜕 𝜕 F (ψ + sv1 , θ0 + sv2 )
= lim F(u0 + sv) = lim ( 1 0 )
s→0 𝜕s s→0 𝜕s F2 (ψ0 + sv1 , θ0 + sv2 )
∇2 ψ0 + s∇2 v1 − Rad 𝜕x0 − s Rad 𝜕v
𝜕θ 2
𝜕 𝜕x
= lim ( )
s→0 𝜕s ∇2 θ + s∇2 v2 + ( 𝜕z0 + s 𝜕v
𝜕ψ 𝜕θ0 𝜕v2 𝜕ψ0 𝜕v1 𝜕θ0 𝜕v2
𝜕z
1
)( 𝜕x
+ s 𝜕x
) − ( 𝜕x
+ s 𝜕x
)( 𝜕z
+ s 𝜕z
)
∇2 v1 − Rad 𝜕v2
𝜕x
∇2 v1 − Rad 𝜕v2
𝜕x
=( )=( )
∇2 v2 +
𝜕θ0 𝜕v1
𝜕x 𝜕z
+
𝜕ψ0 𝜕v2
𝜕z 𝜕x
𝜕θ
− 𝜕z0 𝜕v
𝜕x
1

𝜕ψ0 𝜕v2
𝜕x 𝜕z
∇2 v2 + 𝜕v1
𝜕x

Thus, the linearization around the base solution is given by

∇2 v1 − Rad 𝜕v2
𝜕x
L⋅v=( ) = DF(ψ0 , θ0 , Rad ) ⋅ v (30.34)
∇2 v2 + 𝜕v1
𝜕x

The boundary conditions on v1 and v2 are obtained in a similar way from equations
(30.30) and (30.31) by setting

v1 = ψ − ψ0 and v2 = θ − θ0 , (30.35)

which can be given by


744 | 30 Stability of transport and reaction processes

v1 (0, z) = v1 (α, z) = v1 (x, 0) = v1 (x, 1) = 0 (30.36)


𝜕v2 𝜕v
(0, z) = 2 (α, z) = v2 (x, 0) = v2 (x, 1) = 0 (30.37)
𝜕x 𝜕x

Thus, new steady-state solutions (or bifurcation from trivial solution) can occur
only if the equation

Lv = 0

has a nontrivial solution. Now, we look at the linear homogeneous equation Lv = 0 in


component form:

𝜕2 v1 𝜕2 v1 𝜕v2
+ 2 − Rad =0 (30.38)
𝜕x 2 𝜕z 𝜕x
𝜕2 v2 𝜕2 v2 𝜕v1
+ + =0 (30.39)
𝜕x 2 𝜕z 2 𝜕x

The spatial operator in z-direction is

d2 ϕ
− , ϕ(0) = ϕ(1) = 0,
dz 2

which has eigenfunctions

ϕn = sin nπz (30.40)

corresponding to eigenvalues n2 π 2 , (n = 1, 2, 3, . . .). Thus, we write

v1 (x, z) = w1 (x)ϕn (z) (30.41)


v2 (x, z) = w2 (x)ϕn (z) (30.42)

and substitute (30.40)–(30.42) into (30.38)–(30.39) to obtain the following eigenvalue


problem for the x-direction eigenfunctions:

w1′′ − n2 π 2 w1 − Rad w2′ = 0, w1 (0) = w1 (α) = 0 (30.43)


2 2
w2′′ − n π w2 + w1′ = 0, w2′ (0) = w2′ (α) =0 (30.44)

These equations may be combined to give a single fourth-order boundary value prob-
lem:

d 4 w1 2
2 2 d w1
+ (Rad −2n π ) + n4 π 4 w1 = 0 (30.45)
dx4 dx 2
d2 w1 d2 w1
w1 (0) = w1 (α) = 2
(0) = (α) = 0 (30.46)
dx dx 2
30.1 Lapwood convection in a porous rectangular box | 745

This is a linear equation with constant coefficients and can be solved easily. By inspec-
tion, we see that

mπx
w1 (x) = sin( ), m = 1, 2, . . . (30.47)
α

satisfies equations (30.45)–(30.46) iff

m4 π 4 2 2 −m2 π 2
+ (Ra d −2n π )( ) + n4 π 4 = 0
α4 α2
π 2 (m2 + n2 α2 )2
⇒ Rad = (30.48)
m2 α 2

30.1.3 Neutral curve and critical Rad

We are interested in determining the smallest value of the Darcy–Rayleigh number for
which there is a nontrivial solution. Since Rad is monotonically increasing with n but
nonmonotonic with m, we take n = 1 (In physical terms, this implies that it is first
vertical mode that is always destabilized in this specific problem):

π 2 (m2 + α2 )2
⇒ Rad = (30.49)
m2 α2

Equation (30.49) is plotted below for different values of m (= 1, 2 and 3) in Figure 30.2.

Figure 30.2: Neutral stability curves (bifurcation set) for the Lapwood convection problem: First verti-
cal mode and different horizontal modes.

Also, note that

d Rad d2 Rad 󵄨󵄨󵄨󵄨 2π 2


= 0 ⇒ α 2 = m2 and 󵄨 = − <0 (30.50)
dα2 d(α2 )2 󵄨󵄨󵄨α2 =m2 m4
󳨐⇒
746 | 30 Stability of transport and reaction processes

π 2 (α2 + α2 )2
Rad ≥ Radc = = 4π 2 . (30.51)
α4

Thus, the smallest value of Rad is Radc = 4π 2 and is attained when α = m = 1, 2, 3, . . . .


We also note that when α = √m(m + 1), the value of Rad is given by equation (30.49) is
the same for m and m + 1. This is called a bicritical point where two modes are desta-
bilized at the same time.
In practice, only the lower envelope of the above curves (which is part of the bi-
furcation set) is of interest as it defines the boundary between “conduction only” solu-
tions and “convective” solutions. We can also determine the shape of the bifurcating
solution by noting that the nontrivial solution may be written in a parametric form:

Rad = Radc +O(|ε|) and (30.52)


ψ(x, z) 0
( )=( ) + εy0 + O(|ε|2 ) as ϵ → 0 (30.53)
θ(x, z) 1−z

where y0 is the eigenfunction corresponding to zero eigenvalue of L. In this case, we


have

v1 (x, z) sin πz ⋅ sin πx


y0 = ( )=( 2π ) (30.54)
v2 (x, z) Rad
⋅ sin πz ⋅ cos πx

Thus, we can determine the streamlines and isotherms using equations (30.52)–(30.53)
and (30.54). The eigenfunctions from equation (30.54) are plotted in Figure 30.3 for
α = 2.
Note that there are two circulation cells (the symmetric pair has cells rotating in
opposite direction). For α = n, the solution has n circulation cells.

Remark. The above solution can be modified easily for an infinite layer (α → ∞), In
this case, mπ/α becomes a continuous variable (often called the wave number and
denoted by k) and equation (30.49) modifies to

(n2 π 2 + k 2 )2
⇒ Rad = (30.55)
k2
Thus,

d Rad 2 2 2 4n4 π 4
= 0 ⇒ k = n π ⇒ Ra d ≥ Radc = = 4n2 π 2 (30.56)
dk 2 n2 π 2
Once again, the smallest Rad occurs for n = 1 and is given by

Radc = 4π 2 = 39.48 . . . (critical Rayleigh number)

while the critical wave number is given by kc = π. The corresponding eigenfunction is


given by
30.1 Lapwood convection in a porous rectangular box | 747

Figure 30.3: Contour plots of the eigenfunctions (streamlines and isotherms) for the Lapwood prob-
lem.

v1 (x, z) = sin πz ⋅ sin kc x = sin πz ⋅ sin πx (30.57)


v2 (x, z) = sin πz ⋅ cos πx (30.58)

The corresponding flow pattern is similar to that shown in Figure 30.3 except now the
cells are square-shaped.
The temperature θ for the convective solutions is of the form (from equations
(30.52)–(30.53) and (30.54)),


θ(x, z) = 1 − z ± ε ⋅ sin πz ⋅ cos πz (30.59)
Radc

where ε is the amplitude of the convective branch. The isotherms of these convective
branches

θ1 = 1 − z + ε ⋅ sin πz ⋅ cos πz
Radc

θ2 = 1 − z − ε ⋅ sin πz ⋅ cos πz
Radc

are shown in Figure 30.4 for ε = 1.


748 | 30 Stability of transport and reaction processes

Figure 30.4: Isotherms of the bifurcating convective branches for the Lapwood convection problem.

30.2 Chemical reactor stability and dynamics


Mathematical models of chemical reactors are obtained by writing down the species,
energy and momentum balances, and combining them with the constitutive relations
for the various rate processes. These equations are nonlinear (in most cases) as the
reaction rates are usually nonlinear functions of concentration and/or temperature.
These models may be expressed in the form

= F(x, u, ∇u, ∇2 u, p) in Ω
𝜕u
C
𝜕t
BCs: β(x, u, ∇u) = 0 on 𝜕Ω, t > 0 (30.60)
I.C.: Γ(x, u, ∇u) = 0 in Ω @ t = 0

When spatial dependence of the state variables is ignored, we get the so-called
“lumped resistance” models or simply lumped models. In this case, equations (30.60)
are of the form:

du
C = F(u, p), t > 0; u = u0 @ t = 0 (30.61)
dt
30.2 Chemical reactor stability and dynamics | 749

and is a set of (non)linear Ordinary Differential Equations (ODE) or Differential Alge-


braic Equations (DAE) system. Here, C is the capacitance matrix, u is the vector of state
variables and p is a vector of system parameters.
The models described by equation (30.60) are known to exhibit complex steady-
state and transient/dynamic behavior. These may include (i) multiple steady states,
(ii) periodic states in time, (iii) periodic states in space, (iv) quasi-periodic or multifre-
quency states in time and/or space, (v) traveling waves or pulses, (vi) aperiodic states
in space and time, (vii) complex and irregular spatiotemporal patterns or chaos. We
discuss here the first two of these using the simplest of the reactor models.

30.2.1 Model of a cooled CSTR

Consider an ideal CSTR (shown schematically in Figure 30.5) in which NR reactions


among Ns species, represented by

NS
∑ νij Aj = 0; i = 1, 2, . . . , NR (30.62)
j=1

are taking place. Assuming (a) constant physical properties and constant density, (b)
volume of reactor and volumetric flow rate being constant, the species and energy
balances are given by

NR
dcj cj,in (t) − cj
= + ∑ νij ri (c, T), j = 1, 2, 3, . . . Ns (30.63)
dt τc i=1
NR
dT Tin (t) − T (−ΔHR,i ) UAh
LeR = +∑ ri (c, T) − (T − Tc (t)), (30.64)
dt τc i=1
ρf Cpf VR ρf Cpf

where
(MCp )wall
LeR = 1 + = reactor Lewis number
VR ρf Cpf

Figure 30.5: Schematic of an ideal CSTR (continuous stirrered tank reactor).


750 | 30 Stability of transport and reaction processes

and U is overall heat transfer coefficient for heat exchange between reactor contents
and coolant; Ah is heat transfer area; VR is the volume of reactor; τc = Vq R is the space
0
(residence or convection) time and q0 is volumetric flow rate.
Equations (30.63) and (30.64) represent (Ns + 1) nonlinear ODEs that describe the
variation of reactor composition and temperature with time. These equations have to
be integrated numerically with appropriate initial conditions:

cj = cj0 and T = T0 @ t = 0 (30.65)

Denoting

      1  0  ⋅⋅⋅  0  0                    c1
      0  1  ⋅⋅⋅  0  0                    c2
C = ( ⋮  ⋮   ⋱   ⋮  ⋮   ),       u = (   ⋮   )       (30.66)
      0  0  ⋅⋅⋅  1  0                    cNs
      0  0  ⋅⋅⋅  0  LeR                  T

and considering only the special case in which the inputs cj,in (t), Tin (t) and Tc (t) are
independent of time, we can write equations (30.63), (30.64) and (30.65) in the au-
tonomous form given by equation (30.61). If inputs vary with time, equation (30.61)
can be modified to
C du/dt = F(t, u, p), t > 0; u = u0 @ t = 0   (30.67)
The above form (equation (30.67)) of the lumped model is known as the forced or
nonautonomous system. Here, we consider only the autonomous case.

30.2.2 Dimensionless form of model for a single reaction

Assuming a single-step exothermic reaction of the form A → B with linear kinetics, we
can express the rate of reaction (disappearance) of species A as

r = k(T)cA = k0 exp(−Ea/(RT)) cA   (30.68)

where k0 is the preexponential factor, Ea is the activation energy and cA is the
concentration of species A. Since 0 ≤ cA ≤ cA,in while the absolute temperature T can
be a large number, it is convenient to use dimensionless quantities that have the same
order of magnitude. Thus, we define

τ = k(Tin)t;   χ = 1 − cA/cA,in;   y = (T − Tin)/Tin;   Da = k(Tin)τc;
ΔTad = (−ΔHR)cA,in/(ρf Cpf);   γ = Ea/(RTin);   β = ΔTad/Tin;   τh = ρf Cpf VR/(UAh);   (30.69)
α = 1/(k(Tin)τh);   χ0 = 1 − cA0/cA,in;   yc = (Tc − Tin)/Tin;   y0 = (T0 − Tin)/Tin

where τ is the dimensionless time (scaled with reaction time at the inlet temperature);
χ is the conversion; y is the dimensionless temperature of fluid; Da is Damköhler num-
ber at inlet temperature; γ is dimensionless activation energy; ΔTad is the adiabatic
temperature rise; β is dimensionless adiabatic temperature rise and τh is the heat ex-
change time with the coolant (or cooling time); α is the ratio of characteristic reaction
time at the inlet temperature to the cooling time; y0 is the initial fluid temperature and
χ0 is the conversion corresponding to initial concentration. With these dimensionless
quantities, the model equations (30.63) and (30.64) in dimensionless form reduce to
two nonlinear ODEs:

dχ/dτ = −χ/Da + (1 − χ) exp(γy/(1 + y));   (30.70)
LeR dy/dτ = −y/Da + β(1 − χ) exp(γy/(1 + y)) − α(y − yc);   (30.71)
χ = χ0 and y = y0 @ τ = 0   (30.72)

This more general model has three additional parameters LeR , α and yc compared to
the simpler adiabatic case (α = 0) with LeR = 1 (or negligible reactor wall thermal
capacitance).
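
For later use, the right-hand sides of equations (30.70)–(30.71) are easily encoded in
Mathematica. The sketch below is one possible encoding (the function names f1 and f2
and the argument ordering are our own choices, not fixed notation); it is reused in the
stability computations that follow.

    (* Right-hand sides of the dimensionless model (30.70)-(30.71) *)
    f1[χ_, y_, da_, γ_] := -χ/da + (1 - χ) Exp[γ y/(1 + y)];
    f2[χ_, y_, da_, γ_, β_, α_, yc_, le_] :=
      (-y/da + β (1 - χ) Exp[γ y/(1 + y)] - α (y - yc))/le;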

30.2.3 Stability analysis

In what follows, we consider the special case yc = 0 (i. e., the coolant and feed
temperatures are equal). For this case, the steady-state model reduces to

ys = βχs/(1 + α Da)   (30.73)
χs = Da(1 − χs) exp(γβχs/(1 + βχs + α Da))   (30.74)

where χs and ys are steady-state conversion and dimensionless temperature.


Note that for the adiabatic case (i. e., α = 0), the Damköhler number Da can be
expressed explicitly in terms of χs as Da = [χs/(1 − χs)] exp(−γβχs/(1 + βχs)). However,
in the case of a cooled CSTR, Da cannot be expressed explicitly in terms of χs or ys.
Thus, the determination of the different types of conversion versus Da curves is more
difficult. We summarize here the results without detailed derivations. It turns out that
equation (30.74) has five different types of χ versus Da diagrams, depending on the
specific values selected for the parameters γ, β and α. The phase diagram for any γ > 4
is shown schematically in Figure 30.6, along with the five different types of χ versus
Da diagrams in Figure 30.7.
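
Since Da enters equation (30.74) implicitly, the χ versus Da diagrams are most
conveniently traced numerically, for instance as a zero contour. A minimal Mathematica
sketch (the parameter values are illustrative and correspond to region b):

    (* Steady-state locus (30.74) drawn as a zero contour in the (Da, χ) plane *)
    With[{γ = 30, β = 1, α = 35},
     ContourPlot[χ == da (1 - χ) Exp[γ β χ/(1 + β χ + α da)],
      {da, 0.01, 1}, {χ, 0, 1}, FrameLabel -> {"Da", "χ"}]]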

Figure 30.6: Phase diagram for cooled CSTR.

Figure 30.7: Different types of χ versus Da diagrams in each of the five regions denoted in Fig-
ure 30.6.

The solid and dashed lines in the phase diagram shown in Figure 30.6 are referred to as
the isola and hysteresis locus, respectively. These loci divide the (α, β) plane into five
regions in each of which a different type of χ versus Da diagram is obtained. Regions c
and e can exhibit isola (or an isolated solution branch) as shown in Figure 30.7(c) and
Figure 30.7(e).
To determine the stability of the steady state, we write equations (30.70)–(30.72)
for the case of yc = 0, as

dχ/dτ = F1(χ, y) = −χ/Da + (1 − χ) exp(γy/(1 + y));   (30.75)
dy/dτ = F2(χ, y) = (1/LeR)[−y/Da + β(1 − χ) exp(γy/(1 + y)) − αy]   (30.76)

We linearize these equations around the steady state and determine the eigenvalues
of the linearized matrix:
A = ( 𝜕F1/𝜕χ   𝜕F1/𝜕y
      𝜕F2/𝜕χ   𝜕F2/𝜕y ) evaluated at (χs, ys)   (30.77)

In this case, the eigenvalues can be complex for LeR ≥ 1 and the trace of the matrix can
change sign leading to periodic solutions in time. Usually this occurs when the reactor
is cooled strongly. In such cases, even when a single steady state exists, it could be lo-
cally unstable leading to sustained oscillations of the exit conversion and temperature
(though the inlet concentration and temperature remain constant).
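
The entries of A need not be worked out by hand. A minimal Mathematica sketch, building
on the functions f1 and f2 defined in Section 30.2.2 (the name amat is ours; χ and y are
used as formal symbols here, so they must not carry global values):

    (* Linearized matrix (30.77): Jacobian of (F1, F2) with yc = 0, at (χs, ys) *)
    amat[χs_, ys_, da_, γ_, β_, α_, le_] :=
      D[{f1[χ, y, da, γ], f2[χ, y, da, γ, β, α, 0, le]}, {{χ, y}}] /.
       {χ -> χs, y -> ys};

Mapping Eigenvalues[amat[...]] along a steady-state branch should reproduce
stable/unstable assignments of the type shown in Figure 30.8.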
For example, consider the parameters β = 1, γ = 30, yc = 0 and α = 35 (these lie in
region b of Figure 30.6); the steady-state diagram, obtained by solving F1 = 0 and
F2 = 0, is shown in Figure 30.8. The top
plot shows the conversion while the bottom plot shows the dimensionless temperature
at steady state. At each point of the steady-state curve in Figure 30.8, the eigenvalues
of the matrix A defined in equation (30.77) can be obtained. When the real part of
any of the two eigenvalues is positive, the solution becomes unstable (such states are
shown in Figure 30.8 by the dashed lines). The system has a stable solution only when
real parts of both eigenvalues are negative (shown in Figure 30.8 by solid lines). The
singular points where solution transitions from stable to unstable regimes are shown
by diamond marker points in Figure 30.8.
To be specific, choose the point denoted by the black circle in Figure 30.8 as an
example case; it lies in the cooled region with Da = 0.2 and has the steady-state
conversion and temperature

χs = 0.6705;   ys = 0.0838

Figure 30.8: Steady-state diagram of (a) conversion (χ) versus Damköhler number (Da) and (b) di-
mensionless temperature (y) versus Damköhler number (Da) for the cooled CSTR corresponding to
β = 1, γ = 30, yc = 0 and α = 35. The dashed curve corresponds to the unstable region while the
solid curve corresponds to the stable region.

Taking LeR = 1.5 and computing the eigenvalues of the linearized matrix, we find that
they are complex with a positive real part, i. e.,

λ1,2 = 7.62033 ± 7.82384i.

Thus, the steady state is unstable. Integrating the full nonlinear equations numeri-
cally, we find that the conversion and temperature oscillate with time, as shown in
Figure 30.9. In this figure, the top plot shows the transient oscillation of conversion
and temperature, while the bottom plot shows the periodic orbit (i. e., limit cycle) in
the χ − y phase plane.
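
Both the quoted eigenvalues and the limit cycle of Figure 30.9 can be checked with a
few lines of Mathematica; the sketch below reuses f1, f2 and amat from above (the
initial condition (0, 0) and the integration interval are arbitrary choices):

    (* Eigenvalues at the steady state of the example; complex, positive real part *)
    Eigenvalues[amat[0.6705, 0.0838, 0.2, 30, 1, 35, 1.5]] // N

    (* Integrate (30.75)-(30.76) and plot the orbit in the χ-y plane *)
    sol = NDSolve[{χv'[τ] == f1[χv[τ], yv[τ], 0.2, 30],
        yv'[τ] == f2[χv[τ], yv[τ], 0.2, 30, 1, 35, 0, 1.5],
        χv[0] == 0, yv[0] == 0}, {χv, yv}, {τ, 0, 10}];
    ParametricPlot[Evaluate[{χv[τ], yv[τ]} /. sol], {τ, 0, 10},
     AxesLabel -> {"χ", "y"}]

The distinct names χv and yv avoid a clash with the formal symbols χ and y used inside
amat.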
Similar analysis can be performed in other regions and for the more general case
when the feed temperature and coolant temperature are different.

Figure 30.9: Transient solution corresponding to the parameters β = 1, γ = 30, yc = 0, α = 35,
Da = 0.2 and LeR = 1.5, demonstrating (a) conversion and temperature variation with time and
(b) the periodic orbit in the phase plane.

Problems
1. Bifurcation set for discrete model of thermohaline convection
Consider the following discrete model of thermohaline convection (where the den-
sity is assumed to vary both with temperature and salt concentration):

dx/dt = Pr(y − x + u) = f1
dy/dt = −xz + [4/(27π⁴)] RaT x − y = f2
dz/dt = xy − (8/3)z = f3
du/dt = −xv − Le [4/(27π⁴)] Rac x − Le u = f4
dv/dt = xu − (8/3) Le v = f5

Here x, y, z, u, v are the amplitudes for velocity, temperature and concentration;
Pr, Le, RaT and Rac are the Prandtl, Lewis, thermal Rayleigh and concentration Rayleigh
numbers, respectively. Assuming the vector of variables ψ = (x, y, z, u, v)T, write the
above equations as

dψ/dt = f(ψ) = (f1, f2, f3, f4, f5)T   (30.78)

(a) Show that equation (30.78) has a trivial steady-state solution, i. e., ψs =
(x, y, z, u, v)T = 0 is one of the steady-state solutions.
(b) Determine the Jacobian J = {𝜕fi/𝜕ψj} of the function f and show that it is given by

                    −Pr                      Pr    0      Pr     0
                    [4/(27π⁴)] RaT − z       −1    −x     0      0
J = {𝜕fi/𝜕ψj} = (   y                        x     −8/3   0      0     )   (30.79)
                    −Le [4/(27π⁴)] Rac − v   0     0      −Le    −x
                    u                        0     0      x      −(8/3) Le

(c) Neutral curve near the trivial solution: Determine the neutral curve near the
trivial solution ψs (by setting the determinant of the Jacobian J at ψs to zero)
and show that it is given by

|J|ψs = −[64 Le² Pr/(243π⁴)] (27π⁴ + 4 Rac − 4 RaT) = 0
⇒ RaT = Rac + 27π⁴/4   (30.80)

(A symbolic check of this determinant is sketched after the problem set.)

(d) Give a physical interpretation of the neutral curve.


2. Consider the Lapwood problem in a porous rectangular box open at the top and
insulated on the sides (Figure 30.1).
(a) Formulate the mathematical model and cast it into dimensionless form. State
any assumptions clearly.
(b) Determine the conduction state. Show that this state loses its stability when
there is a nontrivial solution to the following boundary value problem:

d²ϕ/dz² − k²ϕ + ψ = 0
d²ψ/dz² − k²ψ + Rad k²ϕ = 0
ϕ(0) = ϕ(1) = 0;   ψ(0) = (dψ/dz)(1) = 0;   k² = m²π²/α²

where α is the aspect ratio (width to height of the box) and Rad is the Darcy–
Rayleigh number.
(c) Determine the marginal/neutral stability boundary and compare the critical
Rad to that of the box closed at the top.

3. Consider the homogeneous boundary value problem (BVP)

d²w/dx² = −λw,   0 < x < 1;   w′(0) = 0 = w(1)

(a) What is the smallest value of λ for which the BVP is compatible?
(b) If λ = λ1 is the value determined in (a), show that for −∞ < λ < λ1 , the only
solution to the BVP is the trivial one.
(c) Now, consider the nonlinear boundary value problem

d²w/dx² = −f(w),   0 < x < 1;   w′(0) = 0 = w(1)

and reason that it has only one solution if the maximum value of f ′ (w) < λ1 .
[Hint: Consider the case that it has two solutions and take their difference and
use the mean value theorem of calculus.]
(d) Use the result in (c) to determine the maximum value of ϕ2 for which the fol-
lowing nonlinear BVP has only one solution:

d²w/dx² = −ϕ²(B − w)eʷ,   0 < x < 1;   w′(0) = 0 = w(1)

Here, B is a positive constant and 0 < w < B.


4. Consider the Glass–Mackey equation

dz/dt = β − [z(t − τ)³/(1 + z(t − τ)³)] z(t);   t > 0,   z(t) = z0 for −τ ≤ t ≤ 0

where β and τ are positive constants.
(a) Determine the steady state(s) and the equation that determines the stability
of the steady state.
(b) Determine the locus in the (β, τ) plane for which the system becomes unstable.
(c) Integrate the equation for a set of parameter values in the unstable region and
plot the solution (a starting sketch follows the problem set).
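
Mathematica can take over the routine algebra and integration in these problems. For
Problem 1(c), the determinant in equation (30.80) can be checked symbolically (jtriv
below is our name for the Jacobian (30.79) evaluated at the trivial state):

    (* Jacobian (30.79) at x = y = z = u = v = 0 *)
    jtriv = {{-Pr, Pr, 0, Pr, 0},
       {4 RaT/(27 Pi^4), -1, 0, 0, 0},
       {0, 0, -8/3, 0, 0},
       {-4 Le RaC/(27 Pi^4), 0, 0, -Le, 0},
       {0, 0, 0, 0, -8 Le/3}};
    Simplify[Det[jtriv]]
    (* should reduce to -(64 Le^2 Pr/(243 Pi^4)) (27 Pi^4 + 4 RaC - 4 RaT) *)

For Problem 4(c), NDSolve handles the delay term directly once a constant history is
supplied; the parameter values below are illustrative only and may need adjusting to
land in the unstable region:

    (* Glass-Mackey delay equation; z0 is the history on -τd <= t <= 0 *)
    With[{β = 2, τd = 2, z0 = 0.5},
     dde = NDSolve[{z'[t] == β - (z[t - τd]^3/(1 + z[t - τd]^3)) z[t],
         z[t /; t <= 0] == z0}, z, {t, 0, 100}];
     Plot[Evaluate[z[t] /. dde], {t, 0, 100}, AxesLabel -> {"t", "z"}]]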