Applied Linear Analysis for Chemical Engineers
A Multi-scale Approach with Mathematica®
Authors
Prof. Vemuri Balakotaiah
University of Houston
Dept of Chemical and Biomolecular Engineering
4800 Calhoun Road
Houston, TX 77204-4004, USA
bala@uh.edu

Dr. Ram R. Ratnakar
Shell International Exploration & Production
Houston, TX 77082, USA
Ram.Ratnakar@shell.com
The citation of registered names, trade names, trade marks, etc. in this work does not imply, even in
the absence of a specific statement, that such names are exempt from laws and regulations
protecting trade marks etc. and therefore free for general use.
ISBN 978-3-11-073969-5
e-ISBN (PDF) 978-3-11-073970-1
e-ISBN (EPUB) 978-3-11-073978-7
www.degruyter.com
Preface
This book is based on a course that the first author taught at the University of Hous-
ton for about 30 years. This course was a requirement for all first-year graduate stu-
dents and was a prerequisite for two other optional courses, taken mostly by graduate
students whose research involved modeling, computational and nonlinear analysis.
As we state in the Introduction, while this book deals only with the solution of lin-
ear equations, linear analysis is the foundation of all numerical and nonlinear tech-
niques.
Since there are many books already available on applied mathematics for chem-
ical engineers, it is fair to ask: why another book? Our response is that every author
has a unique perspective that may appeal to others. Further, the authors are not aware
of any book that deals exclusively with the solution of the various linear equations that
arise in engineering in a unified manner, and with examples.
The senior author had the pleasure of taking the applied mathematics course
from Professor Neal R. Amundson and later teaching the same course when Pro-
fessor Amundson retired. Both authors have used the material extensively in their
own research and would like to point out the following highlights of the material
presented: (i) use of symbolic software (Mathematica® ) for illustrating and enhanc-
ing the impact of physical parameter changes on solutions, (ii) multiscale analysis
of chemical engineering problems with physical interpretation of time and length
scales in terms of eigenvalues and eigenvectors/eigenfunctions, (iii) detailed discussion
of compartment models for various finite-dimensional problems and their
solution in phase spaces, (iv) evaluation and illustration of functions of matrices
(and use of symbolic manipulation) to solve multicomponent diffusion-convection-
reaction problems, (v) illustration of the techniques and interpretation of solutions
to several classical chemical engineering and related problems, (vi) emphasis on
the connection between discrete (matrix algebra) and continuum models (initial,
boundary and initial-boundary value problems), (vii) physical interpretation of ad-
joint operator and adjoint systems and their application in solving inverse problems
and (viii) use of complex analysis and algebra in the solution of practical engineering
problems.
The senior author has taught most of the contents of Parts I, III, IV and V in a single
semester (14 weeks, or 28 lectures of 90 minutes each). However, the entire con-
tents of the book can be taught in a two-semester course. For a single-semester course,
we recommend covering Chapters 1 to 5, 14, 17, selected sections of Chapters 18 to 21,
and Chapters 23 to 25.
We wish to acknowledge many colleagues, former students and our mentors who
over the years contributed to our understanding and organization of the subject.
https://doi.org/10.1515/9783110739701-201
We also want to thank Karin Sora, Nadja Schedensack and Vilma Vaičeliūnienė of
De Gruyter for their help during production.
The second author wishes to acknowledge the constant encouragement and sup-
port of his family, especially his eldest brother Siddhesh Satyakar.
Finally, the first author wishes to acknowledge the patience and understanding of
his wife, Nalini Vemuri, and dedicates this book to her with affection and gratitude.
Introduction
This book deals with the solution of linear equations. We discuss the solutions of lin-
ear algebraic equations, linear initial value problems, linear boundary value prob-
lems, linear integral equations and linear partial differential equations along with
their application to various chemical engineering problems.
It should be pointed out that most practical problems encountered by engineers
are nonlinear and are often solved on a computer using numerical techniques. In most
cases, the nonlinear problem is linearized around a known or approximate solution
and a more accurate solution is obtained by solving a sequence of linear problems. The
nonlinear methods of analysis as well as the numerical techniques used by engineers
draw heavily from the linear analysis. In other words, linear analysis is the foundation
of all nonlinear and numerical techniques.
Generally speaking, most linear problems that arise in applications may be clas-
sified into two groups: (i) problems describing the steady state or equilibrium state of
a physical system and (ii) problems describing the dynamic or transient behavior of
a physical system. The first type of problems are described by linear equations of the
form
Lu = f (1)
where L is a linear operator, u is a state vector and f is a source function. For example,
in finite dimensions, equation (1) may be a set of n linear algebraic equations in n
unknowns,
Au = b (2)
where A is a n×n matrix, u and b are n×1 vectors. When the state vector u belongs to an
infinite-dimensional space, equation (1) may be a two-point boundary value problem
such as
    −d/dx (p(x) du/dx) + q(x)u = f(x),  a < x < b   (3)
u(a) = u(b) = 0 (4)
or an integral equation such as the Fredholm integral equation of the first kind given
by
    ∫_a^b k(x, s) u(s) ds = f(x),  a ≤ x ≤ b   (5)
Another example is the two-dimensional Poisson equation on a domain Ω with boundary 𝜕Ω:
    −(𝜕²u/𝜕x² + 𝜕²u/𝜕y²) = f(x, y)  in Ω   (6)
    u = 0  on 𝜕Ω   (7)
The second type of problems, describing the transient behavior of a physical system, is governed by evolution equations of the form
    du/dt = Lu,  t > 0   (8)
    u = u0  at t = 0   (9)
where t is the time and the evolution equation (8) describes the system behavior for
t > 0, while equation (9) gives the initial condition. In the simpler case of the finite-
dimensional problems, equations (8)–(9) may be of the form
    du/dt = Au   (10)
    u = u0  at t = 0   (11)
where A is a constant coefficient n×n matrix, u is a n×1 vector of state variables and u0
is a n × 1 vector of initial conditions. An example of an initial value problem in infinite
dimensions is the heat equation in one spatial coordinate and time:
    𝜕u/𝜕t = 𝜕²u/𝜕x²;  0 < x < 1, t > 0   (12)
    u(0, t) = u(1, t) = 0  (boundary conditions)   (13)
    u(x, 0) = f(x)  (initial condition)   (14)
We shall see that many of the concepts involved in the solution of linear ordinary
and partial differential equations are generalizations of the ideas involved in the solu-
tion of the finite-dimensional problems represented by equations (2) and (10). There-
fore, we shall focus first on the finite-dimensional case.
For example, the solution of equation (2) may be expressed in the spectral form
    u = ∑_{j=1}^{n} (1/λj) cj xj   (15)
where the scalars λj (eigenvalues) and the (eigen)vectors xj depend only on the ma-
trix A, while the constants cj are given by
    cj = ⟨b, yj⟩ / ⟨xj, yj⟩,   (16)
where yj are known as the left eigenvectors of A. Here, ⟨x, y⟩ denotes the dot or inner
product of vectors. When the matrix A is symmetric (self-adjoint), xj = yj and the
eigenvectors are normalized to have unit length (⟨xj , xj ⟩ = 1), the expression for the
constants cj simplifies to
cj = ⟨b, xj ⟩. (17)
The above form of the solution has advantages over the direct solution (e. g., by Gaus-
sian elimination) when n is large. For example, when A is symmetric and the eigenval-
ues are well separated (0 < |λ1 | ≪ |λ2 | ≪ |λ3 | ≪ ⋅ ⋅ ⋅ ≪ |λn |), the first few terms may be
sufficient to compute the solution if the desired accuracy is not high. A second advan-
tage is that the solution has the same form for all linear equations of the form given
by (1). For example, when the linear (differential/integral) operator L is symmetric,
the same solution is applicable with a slight modification:
    u = ∑_{j=1}^{∞} (1/λj) cj ϕj;  cj = ⟨f, ϕj⟩,   (18)
where λj are the eigenvalues and ϕj are the normalized eigenfunctions of the operator L.
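The spectral form of the solution can be checked numerically. The book's examples use Mathematica®; as a stand-in, the following Python sketch (with an illustrative 2 × 2 symmetric matrix whose eigenpairs are known in closed form, not a matrix from the text) assembles u = ∑ (1/λj)⟨b, xj⟩xj and verifies that it solves Au = b:

```python
import math

# Symmetric matrix A and right-hand side b (illustrative values, not from the text)
A = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0]

# Eigenpairs of A, normalized to unit length (known in closed form for this A)
s = 1.0 / math.sqrt(2.0)
eigs = [(1.0, [s, -s]), (3.0, [s, s])]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

# Spectral form of the solution: u = sum_j (1/lambda_j) <b, x_j> x_j
u = [0.0, 0.0]
for lam, x in eigs:
    c = dot(b, x)          # c_j = <b, x_j> for a symmetric (self-adjoint) A
    for i in range(2):
        u[i] += c / lam * x[i]

# Verify that u indeed solves A u = b
residual = [sum(A[i][j] * u[j] for j in range(2)) - b[i] for i in range(2)]
print(u, residual)
```

The residual is zero to machine precision, confirming that the eigenvector expansion reproduces the direct solution.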
The solution of the initial value problem, equations (10)–(11) may be expressed as
    u(t) = ∑_{j=1}^{n} cj e^{λj t} xj;  cj = ⟨u0, yj⟩ / ⟨xj, yj⟩   (19)
which for the case of symmetric matrix simplifies to cj = ⟨u0 , xj ⟩. The generalization
of this result for the case of a symmetric differential operator is
    u(t) = ∑_{j=1}^{∞} cj e^{λj t} ϕj;  cj = ⟨f, ϕj⟩.   (20)
Part I: Applied matrix algebra
1 Matrices and linear algebraic equations
1.1 Simultaneous linear equations
We consider m simultaneous linear equations in n unknowns:
    a11 u1 + a12 u2 + ⋅ ⋅ ⋅ + a1n un = b1
    a21 u1 + a22 u2 + ⋅ ⋅ ⋅ + a2n un = b2
    . . .
    am1 u1 + am2 u2 + ⋅ ⋅ ⋅ + amn un = bm   (1.1)
or in matrix notation,
    Au = b   (1.2)
where A is the coefficient matrix with m rows and n columns (m × n matrix),
    A = [aij ];  i = 1, 2, . . . , m;  j = 1, 2, . . . , n,
aij is the element of A in the i-th row and j-th column, u is the unknown vector (n × 1
matrix) and b is a m × 1 vector of constants. The elements aij of the matrix A and bi of
the vector b may be real or complex numbers. The matrix [A b], with m rows and (n + 1)
columns, is called the augmented matrix and is denoted by
    aug A = [A b]
When b = 0, we obtain the homogeneous system of equations
    Au = 0   (1.3)
As stated in the Introduction, many of the ideas involved in the solution of linear differ-
ential equations are generalizations of those involved in the solution of the homoge-
neous algebraic equation (1.3) and the inhomogeneous algebraic equation (1.2). Gen-
erally speaking, linear equations have either zero, one or infinitely many solutions. In what
follows, we shall discuss the conditions under which equations (1.1)–(1.2) have no so-
lution (inconsistent), a unique solution or an infinite number of solutions.
1.2 Review of basic matrix operations
A matrix will be referred to as real-valued if all its elements are real numbers or real-valued functions. It will be
called complex-valued if one or more of the elements is a complex number or complex-
valued function.
By convention, the elements of a matrix are double subscripted to denote location.
For example, aij refers to the element appearing in the i-th row and j-th column. If
the number of rows equals the number of columns (m = n), the matrix is referred
to as a square matrix of order n. (Square matrices appear in most of our applications.)
In a square matrix, the elements aii (i = 1, 2, 3, . . . , n) are called diagonal elements. For
the special case n = 1, the matrix is called a column vector (with m elements) while
for m = 1, we have a row vector. The transpose of an m × n matrix A is the n × m matrix
obtained by interchanging the rows and columns of A and is denoted by AT .
C = A + B,
where
    cij = aij + bij ;  i = 1, 2, . . . , m;  j = 1, 2, . . . , n,
i. e., the sum is obtained by adding the corresponding elements. Similarly, we define
for any scalar k,
    A ± kB = [aij ± kbij ]
The product of an m × n matrix A and an n × r matrix B is the m × r matrix C,
    AB = C,
with elements
    cij = ∑_{k=1}^{n} aik bkj ;  i = 1, 2, . . . , m,  j = 1, 2, . . . , r
From this definition, it can be shown that matrix multiplication is associative and dis-
tributes over addition. However, it is not commutative. Thus,
A(BC) = (AB)C
A(B + C) = AB + AC,
AB ≠ BA
even in the cases in which both the products are defined. Two square matrices A and
B for which AB = BA are said to commute with each other. Also, it may be shown that
(AB)T = BT AT
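These properties are easy to verify numerically. The short Python sketch below (with illustrative 2 × 2 matrices, not examples from the text) implements the product rule cij = ∑k aik bkj and demonstrates both the non-commutativity of multiplication and the transpose rule:

```python
def matmul(A, B):
    """Multiply matrices via c_ij = sum_k a_ik * b_kj."""
    n, m, r = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(r)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

print(matmul(A, B))                      # AB
print(matmul(B, A))                      # BA -- generally different from AB
print(matmul(A, B) == matmul(B, A))      # False: multiplication is not commutative
# (AB)^T equals B^T A^T:
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
```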
We now review some special types of square matrices that play an important role in
our applications.
A diagonal matrix is a square matrix of all zero elements except possibly those on
the main diagonal (aij = 0 if i ≠ j). The zero matrix is a matrix having all its elements
equal to zero.
An identity matrix of order n is a diagonal matrix of order n having all its diagonal
elements equal to one. It is usually denoted by In, or simply by I when the order is not
specified. Thus,
    I4 = ( 1 0 0 0
           0 1 0 0
           0 0 1 0
           0 0 0 1 )
A matrix with real elements is called symmetric if it is equal to its transpose, i. e.,
A = AT
or
aij = aji
for a real symmetric matrix. A square matrix with complex elements is called Hermi-
tian if it equals its conjugate transpose, i. e.,
    A = (Ā)T = A*
or
    aij = āji ,
where '*' stands for the transpose combined with complex conjugation and the overbar
stands for complex conjugation only. Thus,
    A = ( 1  2
          2 −4 )
is a real symmetric matrix, while
    B = ( 1       i    2 + 3i
          −i     −4    3
          2 − 3i  3    6 )
is a Hermitian matrix.
A square matrix is said to be normal if
AA∗ = A∗ A,
i. e., if it commutes with its conjugate transpose. If A has real elements then AT has
real elements and A is normal if it commutes with its transpose.
A square matrix A is called lower triangular if aij = 0 for j > i, i. e., all the elements
above the diagonal are zero. Similarly, A is called upper triangular if aij = 0 for i > j,
or equivalently all the elements below the diagonal are zero. For example,
    A = ( 1 3 5
          0 7 9
          0 0 8 )
is upper triangular. A square matrix whose nonzero elements lie only on the main
diagonal and the two adjacent diagonals, such as
    A = ( −2  1  0  0  0
           1 −3  1  0  0
           0  1 −4  1  0
           0  0  1 −5  1
           0  0  0  1 −6 )
is called tridiagonal.
1.3 Elementary row operations and row echelon form of a matrix
Consider the linear system
    Au = b
and recall the following operations that are used to simplify the system and obtain a
solution:
We note that these operations do not change the solution. Thus, we define elementary
row operations (ERO) of three basic types on the rows of a matrix:
(E1): interchange of any two rows of a matrix
(E2): multiplication of any row by a nonzero scalar
(E3): multiplication of a row by a constant and addition of the result to another row,
element by element
Examples 1.1. Consider the matrices
    A = ( 1 3 1 5 4
          0 1 2 3 1
          0 0 1 4 6
          0 0 0 1 0
          0 0 0 0 0 )
    B = ( 1 0 5
          2 0 0
          0 0 1 );   C = ( 1 2 3 4
                           0 1 1 5
                           0 3 6 1 )
and show that all EROs on A can be performed by doing the same operations on the
m × m identity matrix and premultiplying A by the resulting matrix. For this example,
we take
1 0 0
I3 = ( 0 1 0 )
0 0 1
1 0 0
E1 = ( 0 0 1 )
0 1 0
We note that premultiplying A by E1 interchanges rows 2 and 3 of A. Similarly, let
1 0 0
E2 = ( 0 k 0 ), k ≠ 0
0 0 1
1 0 0
E3 = ( 0 1 k )
0 0 1
Then premultiplying A by E2 multiplies row 2 of A by k, while premultiplying A by E3
adds k times row 3 to row 2. More generally, for any m × n matrix A there exists a nonsingular m × m matrix
P such that PA is in row echelon form. [The matrix P is the product of the elementary
matrices Ei ].
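The action of the elementary matrices can be demonstrated with a short Python sketch (the 3 × 3 matrix A below is an illustrative placeholder, not one from the text):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]          # any 3-row matrix will do (illustrative values)

E1 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]   # interchange rows 2 and 3
k = 5
E2 = [[1, 0, 0], [0, k, 0], [0, 0, 1]]   # multiply row 2 by k
E3 = [[1, 0, 0], [0, 1, k], [0, 0, 1]]   # add k times row 3 to row 2

print(matmul(E1, A))   # rows 2 and 3 of A interchanged
print(matmul(E2, A))   # row 2 of A multiplied by 5
print(matmul(E3, A))   # row 2 replaced by (row 2 + 5 * row 3)
```

Each ERO on A is thus realized by performing the same operation on the identity matrix and premultiplying.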
Example 1.2. Reduce the following matrices to row echelon form:
    A = ( 2 1 0
          3 6 1
          5 7 1 );   B = ( 1  2 −1
                           3  8  9
                           2 −1  2 )
1.4 Rank of a matrix and condition for existence of solutions
We now consider the linear equations Au = 0 and Au = b and state the conditions
under which they have solutions. Consider first the homogeneous system
    Au = 0,  u ∈ ℝn/ℂn   (1.5)
Theorem. A necessary and sufficient condition for (1.5) to have a nontrivial (nonzero)
solution is
    rank(A) < n
Proof. The necessity is clear, for suppose that rank A = n. Then reducing A to echelon
form gives the following equivalent set of equations:
    u1 + γ12 u2 + ⋅ ⋅ ⋅ + γ1n un = 0
         u2 + ⋅ ⋅ ⋅ + γ2n un = 0
              . . .
                   un = 0   (1.6)
Here, γij are the elements in the echelon form of A. We note that the only solution to
equations (1.6) is the trivial one (back substitution gives un = 0, then un−1 = 0, and so on).
To prove sufficiency (i. e., there is a nonzero solution when rank A < n), we let
rank A = r. Then, based on the row echelon form of A, the reduced equivalent system
may be written as
    u1 + γ12 u2 + ⋅ ⋅ ⋅ + γ1n un = 0
         . . .
    ur + γr,r+1 ur+1 + ⋅ ⋅ ⋅ + γrn un = 0   (1.7)
Now, we can choose nonzero values for (ur+1 , . . . , un ) and evaluate (u1 , u2 , . . . , ur )
uniquely from equations (1.7). Hence, we get a nontrivial solution when r < n.
Example 1.3. Consider the homogeneous system
    u1 − u2 = 0
    u2 − u3 = 0
    u1 + u3 = 0
for which rank A = 3. Thus, the only solution is the trivial one.
Example 1.4. Consider the homogeneous system in four variables (with complex co-
efficients)
u1 − iu2 = 0
u2 + u3 = 0
u1 + u2 − u4 = 0
u2 + iu3 + iu4 = 0
or Au = 0 with
    A = ( 1 −i 0  0
          0  1 1  0
          1  1 0 −1
          0  1 i  i );   i = √−1.
It may be verified that rank A = 3 < 4, so that a nontrivial solution exists. One such
solution (together with its scalar multiples) is
    u = α (i, 1, −1, 1 + i)T
for any scalar α.
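The solution of Example 1.4 is easy to verify numerically, since Python handles complex arithmetic natively (1j is the imaginary unit); this is a quick check, not part of the text's solution method:

```python
# Coefficient matrix of Example 1.4 (complex entries); 1j is Python's imaginary unit
A = [[1, -1j, 0, 0],
     [0, 1, 1, 0],
     [1, 1, 0, -1],
     [0, 1, 1j, 1j]]

alpha = 2 - 3j                      # any scalar multiple works
u = [alpha * v for v in (1j, 1, -1, 1 + 1j)]

# Each equation of A u = 0 should evaluate to zero
residual = [sum(A[i][j] * u[j] for j in range(4)) for i in range(4)]
print(residual)
```

All four residuals vanish, confirming that u = α(i, 1, −1, 1 + i)T lies in the null space of A.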
Example 1.5. Consider the homogeneous system
    u1 − 2u2 − u4 = 0
    −2u1 + 3u2 + 3u3 = 0
    −u2 + 3u3 − 2u4 = 0
    3u1 − 7u2 + 3u3 − 5u4 = 0
for which rank A = 2. Then
    u = c1 (3, 0, 2, 3)T + c2 (0, 1, −1, −2)T
is a solution for any constants c1 and c2 . [We discuss in Chapter 3 how to obtain this
solution.]
We now consider the inhomogeneous system Au = b. Reducing the augmented matrix
[A b] to row echelon form, the last (m − r) equations take the form
    0 = αr+1
    . . .
    0 = αm   (1.8)
where r = rank A and the scalars αi are determined by b.
If αi ≠ 0 for any r + 1 ≤ i ≤ m, the rank of A and aug A are different and the equations
are inconsistent. Hence, no solution exists in this case. Thus, a necessary condition
for solutions to exist is
    rank A = rank(aug A)   (1.9)
We now show that the above condition is also sufficient. Suppose that (1.9) is satisfied
and let
    rank A = r
Case 1: r ≤ m ≤ n (more unknowns than equations). In this case, the reduced equivalent
system may be written as
    u1 + γ12 u2 + ⋅ ⋅ ⋅ + γ1n un = β1
         . . .
    ur + γr,r+1 ur+1 + ⋅ ⋅ ⋅ + γrn un = βr   (1.10)
Thus, we can choose (n − r) of the variables (ur+1 , . . . , un ) as we please and obtain the
values of remaining variables using (1.10) above to get a solution. [In Chapter 3, we
shall show that the solution space has dimension (n − r)].
Case 2: r ≤ n ≤ m (more equations than unknowns). In this case again, for consis-
tency, we require
    αr+1 = ⋅ ⋅ ⋅ = αm = 0
and the last (m − r) equations are redundant. If r = n, then there is a unique solution.
If r < n, we can assign the values of (n − r) variables at pleasure and determine the
values of the other variables using (1.10). Thus, we have the following theorem.
Theorem.
(a) A necessary and sufficient condition for the system Au = b to have solutions is
rank A = rank(aug A)
(b) If rank A = rank(aug A) = r and n is the number of unknowns, we can assign val-
ues of (n − r) of the unknowns and determine the remaining r unknowns uniquely
provided the matrix of coefficients of these unknowns has rank r.
An important (and very useful) corollary that follows from this theorem is given below.
Corollary. Suppose that A is a square matrix and consider the inhomogeneous system
Au = b. This system has a unique solution for any b iff (if and only if) the only solution
to the corresponding homogeneous system Au = 0 is the trivial one.
The generalization of the above theorem to the case in which A is replaced by a linear
operator is called the Fredholm alternative (theorem) and will be discussed later.
Example 1.6. Consider the inhomogeneous system
    u1 − u2 = b1
    u2 − u3 = b2
    u1 + u3 = b3
for which rank(A) = 3 (see Example 1.3). Thus, there is a unique solution to the above
equations for any choice of b1 , b2 and b3 .
Example 1.7. Consider the inhomogeneous system
    u1 − 2u2 − u4 = b1
    −2u1 + 3u2 + 3u3 = b2
    −u2 + 3u3 − 2u4 = b3
    3u1 − 7u2 + 3u3 − 5u4 = b4
for which rank(A) = 2. It may be verified that the system is consistent iff b3 = 2b1 + b2
and b4 = 5b1 + b2 . We shall return to this example in Chapter 3.
1.5 Gaussian elimination and LU decomposition
Consider the system of n linear equations in n unknowns
    Ax = b   (1.11)
where A is an n × n matrix, assumed nonsingular. We first consider the case in which
A is lower triangular and write equation (1.11) as
    Lx = b   (1.12)
or in expanded form
    l11 x1 = b1
    l21 x1 + l22 x2 = b2
    l31 x1 + l32 x2 + l33 x3 = b3
    . . .
    ln1 x1 + ln2 x2 + ⋅ ⋅ ⋅ + lnn xn = bn   (1.13)
Since we assumed L is nonsingular, lii ≠ 0 for any i and we can solve equations (1.13)
by forward substitution:
    x1 = b1/l11
    x2 = (b2 − l21 x1)/l22
    . . .
    xk = (bk − ∑_{j=1}^{k−1} lkj xj)/lkk ;  k = 1, 2, . . . , n   (1.14)
Counting the number of additions/subtractions (AS) and multiplications/divisions
(MD) required gives
    #AS = 0 + 1 + 2 + ⋅ ⋅ ⋅ + (n − 1) = n(n − 1)/2 ≈ n²/2  for large n   (1.15)
    #MD = 1 + 2 + ⋅ ⋅ ⋅ + n = n(n + 1)/2 ≈ n²/2  for large n   (1.16)
Usually, when n is large, the number of AS is approximately equal to the number of MD
and hereafter, we shall only count MD.
(Another reason for this is that multiplication or division on the computer takes much
longer than addition or subtraction). Thus, the operation count (OC) for solving a
lower triangular system, given by (1.12), by forward substitution is 0.5n² (n ≫ 1).
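The forward-substitution recursion (1.14) translates directly into a few lines of code. The Python sketch below (with an illustrative lower-triangular system, not one from the text) implements it:

```python
def forward_substitution(L, b):
    """Solve L x = b for lower-triangular L with nonzero diagonal,
    using x_k = (b_k - sum_{j<k} l_kj x_j) / l_kk, eq. (1.14)."""
    n = len(b)
    x = [0.0] * n
    for k in range(n):
        s = sum(L[k][j] * x[j] for j in range(k))
        x[k] = (b[k] - s) / L[k][k]
    return x

# Illustrative lower-triangular system (values are not from the text)
L = [[2.0, 0.0, 0.0],
     [1.0, 3.0, 0.0],
     [4.0, -1.0, 5.0]]
b = [2.0, 4.0, 8.0]
x = forward_substitution(L, b)
print(x)   # [1.0, 1.0, 1.0]
```

The inner sum over j < k performs the k − 1 multiplications counted in (1.16), so the total work is indeed 0.5n² MD for large n.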
Next, we consider the upper triangular system
Ux = c (1.17)
or in expanded form
    u11 x1 + u12 x2 + ⋅ ⋅ ⋅ + u1n xn = c1
         . . .
    un−1,n−1 xn−1 + un−1,n xn = cn−1
              unn xn = cn   (1.18)
Again, we assume that U is not singular, i. e., uii ≠ 0 for any i. The solution of (1.18)
can be obtained by back substitution as
    xn = cn/unn
    xn−1 = (cn−1 − un−1,n xn)/un−1,n−1
    . . .
    xk = (ck − ∑_{j=k+1}^{n} ukj xj)/ukk ;  k = n, n − 1, . . . , 1   (1.19)
We note that the operation count for solving the upper triangular system is also 0.5n2
(for n ≫ 1).
We now describe the Gaussian elimination algorithm for solving the general linear
system given by equation (1.11). In this method, we first reduce Ax = b to an equivalent
upper triangular system. Let aug A(1) = [A b] denote the initial augmented matrix, with
elements a(1)ij = aij and b(1)i = bi. Assuming a(1)11 ≠ 0, define the row multipliers
    mi1 = a(1)i1 / a(1)11 ;  i = 2, . . . , n.
Multiply row 1 by mi1 and subtract from row i (i = 2, . . . , n). At the end of the step, the
form of the augmented system is given by aug A(2), where
    a(2)ij = a(1)ij − mi1 a(1)1j ,  b(2)i = b(1)i − mi1 b(1)1 ;  i, j = 2, . . . , n.
In the second step, we assume a(2)22 ≠ 0 and
continue to eliminate the unknowns leaving the first row undisturbed. After (n − 1)
steps, we obtain the upper triangular system
    aug A(n) = ( a(1)11  a(1)12  .  .  .  a(1)1n | b(1)1
                 0       a(2)22  .  .  .  a(2)2n | b(2)2
                 0       0   a(3)33  .   a(3)3n | b(3)3
                 .       .       .  .  .  .     | .
                 0       0       .  .  .  a(n)nn | b(n)n )
or equivalently,
    Ux = c   (1.20)
This completes the elimination procedure. The upper triangular system given by (1.20)
can be solved by back substitution as shown earlier. [Remark: The element a(i)ii, which
is at the upper left corner of the active submatrix after i − 1 steps, is called the pivot.
When the Gaussian elimination algorithm is implemented in practice, the rows are
interchanged so that the pivot element has the maximum absolute value. This partial
pivoting procedure minimizes round-off errors when solving large systems of linear
equations. However, this procedure does not preserve the initial matrix.]
1.5.3 LU decomposition/factorization
Let
    U = ( a(1)11  a(1)12  .  .  .  a(1)1n
          0       a(2)22  .  .  .  a(2)2n
          0       0   a(3)33  .   a(3)3n
          .       .       .  .  .  .
          0       0       .  .  .  a(n)nn ),
    L = ( 1    0    0   .  .  0
          m21  1    0   .  .  0
          m31  m32  1   .  .  0
          .    .    .   .  .  .
          mn1  mn2  .   .  .  1 )
where mij are the row multipliers determined in the elimination process (Remark:
These row multipliers can be stored in place of zeros during the elimination process).
A straightforward but tedious calculation shows that
A = LU (1.21)
We also note that the number of operations (AS or MD) needed to factorize A as in
(1.21) is given by
    OC = (n − 1)² + (n − 2)² + ⋅ ⋅ ⋅ + 1² = (n − 1)n(2n − 1)/6 ≈ n³/3  (for n ≫ 1)
Thus, the total operation count for solving Ax = b is n³/3 + n² (for n ≫ 1). Hence,
for large n, the major part of the work is the LU decomposition. We also note that a
matrix and vector multiplication involves n² operations while multiplication of two
n × n matrices requires n³ operations.
Example 1.8. Solve the following system by Gaussian elimination:
    x1 + 2x2 + x3 = 3
    2x1 + 3x2 − x3 = −6
    3x1 − 2x2 − 4x3 = −2
The initial augmented matrix is
    aug A(1) = ( 1  2  1 |  3
                 2  3 −1 | −6
                 3 −2 −4 | −2 )
Eliminating the first column (with multipliers m21 = 2, m31 = 3) gives
    aug A(2) = ( 1  2  1 |  3
                 0 −1 −3 | −12
                 0 −8 −7 | −11 )
and eliminating the second column (with multiplier m32 = 8) gives
    aug A(3) = ( 1  2  1 |  3
                 0 −1 −3 | −12
                 0  0 17 |  85 )
Back substitution gives
    17x3 = 85  ⇒  x3 = 5
    −x2 − 3x3 = −12  ⇒  x2 = 12 − 3x3 = −3
    x1 + 2x2 + x3 = 3  ⇒  x1 = 3 + 6 − 5 = 4
The LU factors are
    U = ( 1  2  1          L = ( 1 0 0
          0 −1 −3                2 1 0
          0  0 17 ),             3 8 1 )
and it may be verified that
    LU = ( 1 0 0   ( 1  2  1     ( 1  2  1
           2 1 0     0 −1 −3   =   2  3 −1   = A
           3 8 1 )   0  0 17 )     3 −2 −4 )
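The elimination steps of Example 1.8 can be automated. The Python sketch below mirrors the procedure in the text (Doolittle LU factorization without pivoting, then forward and back substitution) and reproduces both the multipliers and the solution of the example:

```python
def lu_decompose(A):
    """Doolittle LU factorization without pivoting (assumes nonzero pivots),
    following the elimination steps in the text."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]          # row multiplier m_ik
            L[i][k] = m
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return L, U

def solve(A, b):
    L, U = lu_decompose(A)
    n = len(b)
    # forward substitution: L c = b
    c = [0.0] * n
    for k in range(n):
        c[k] = b[k] - sum(L[k][j] * c[j] for j in range(k))
    # back substitution: U x = c
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(U[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (c[k] - s) / U[k][k]
    return x

# The system of Example 1.8
A = [[1.0, 2.0, 1.0], [2.0, 3.0, -1.0], [3.0, -2.0, -4.0]]
b = [3.0, -6.0, -2.0]
print(lu_decompose(A)[0])   # L with multipliers 2, 3 and 8
print(solve(A, b))          # [4.0, -3.0, 5.0]
```

The computed L carries exactly the multipliers m21 = 2, m31 = 3, m32 = 8 found by hand above, and the solution x = (4, −3, 5) matches the back-substitution result.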
1.6 Inverse of a square matrix
A square matrix B of order n is said to be an inverse of the n × n matrix A if
    AB = BA = In   (1.22)
The inverse of A is often denoted by A−1 . When A has an inverse, it is said to be non-
singular or invertible. If A does not have an inverse, it is said to be singular.
The following properties may be verified from the definition of the inverse:
1. A has an inverse if and only if it has rank n.
2. When it exists, the inverse of A is unique.
3. When A is nonsingular,
    (A−1)−1 = A   (1.23)
4. If A and B are square matrices of the same order and both have inverses, then
    (AB)−1 = B−1 A−1   (1.24)
5. If A is invertible, so is AT and
    (AT)−1 = (A−1)T   (1.25)
It may also be shown that for square matrices it is sufficient to require only
    AB = I,   (1.26)
since this condition implies
    BA = I   (1.27)
Thus, to calculate A−1 , it is sufficient to satisfy the relation given by (1.26). Suppose
that the columns of B are denoted by b1 , b2 , b3 , . . . , bn and let ej (j = 1, 2, . . . , n) be
the n-dimensional column vector having unity element in row j and zeros everywhere
else. Then (1.26) is equivalent to
Abj = ej ; j = 1, 2, 3, . . . , n (1.28)
and the j-th column of A−1 can be found by solving the linear equations given by equa-
tion (1.28). Thus, we have the following two methods for finding the inverse of a non-
singular matrix A.
Method 1: Use LU decomposition to factor A = LU. Then solve LUbj = ej ; j =
1, 2, . . . , n. We note that this procedure gives A−1 with a total of n³/3 operations (for LU
decomposition) plus n × n² operations (for solving equation (1.28)). Thus, the total
operation count is 4n³/3.
Method 2: We form the n × 2n augmented matrix [A I] and use elementary row
operations to transform it to the form [I B], where B = A−1 . It can be shown that the
operation count for this procedure is the same as that for method 1.
Example 1.9. Find the inverse of
    A = ( 5 8  1
          0 2  1
          4 3 −1 )
We form the augmented matrix
    ( 5 8  1 | 1 0 0
      0 2  1 | 0 1 0
      4 3 −1 | 0 0 1 )
The operation (−4/5)R1 + R3 gives
    ( 5  8      1    | 1     0  0
      0  2      1    | 0     1  0
      0 −17/5  −9/5  | −4/5  0  1 )
Then R1/5, (17/10)R2 + R3 and R2/2 give
    ( 1  8/5  1/5    | 1/5   0      0
      0  1    1/2    | 0     1/2    0
      0  0   −1/10   | −4/5  17/10  1 )
Multiplying R3 by −10 and performing (−1/2)R3 + R2 gives
    ( 1  8/5  1/5 | 1/5  0    0
      0  1    0   | −4   9    5
      0  0    1   | 8   −17  −10 )
Finally, (−1/5)R3 + R1 and (−8/5)R2 + R1 give
    ( 1 0 0 | 5   −11  −6
      0 1 0 | −4   9    5
      0 0 1 | 8   −17  −10 )
Thus,
    A−1 = ( 5   −11  −6
            −4   9    5
            8   −17  −10 )
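Method 2 is straightforward to program. The Python sketch below reduces [A | I] to [I | A⁻¹] by elementary row operations and reproduces the inverse of Example 1.9; unlike the hand calculation above, it adds partial pivoting for numerical safety:

```python
def inverse(A):
    """Invert a nonsingular matrix by reducing [A | I] to [I | A^-1]
    with elementary row operations (Method 2; partial pivoting added
    for numerical safety)."""
    n = len(A)
    M = [list(map(float, A[i])) + [1.0 if j == i else 0.0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))  # pivot row
        M[col], M[p] = M[p], M[col]
        piv = M[col][col]
        M[col] = [v / piv for v in M[col]]           # normalize pivot row
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

A = [[5, 8, 1], [0, 2, 1], [4, 3, -1]]
Ainv = inverse(A)
print([[round(v, 6) for v in row] for row in Ainv])
# entries match Example 1.9: [[5, -11, -6], [-4, 9, 5], [8, -17, -10]]
```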
1.7 Vector-matrix formulation of some chemical engineering problems
Consider an ideal, well-stirred batch reactor in which several chemical reactions
occur. There are R reactions among S species. Let VR be the volume of reactor contents
(assumed to be constant) and Cj be the molar concentration of species Aj . Further as-
sumptions are: (i) the reactor contents are well mixed so that there are no spatial gra-
dients and Cj is uniform throughout the tank, (ii) the density of the fluid is constant,
(iii) the system is isothermal and (iv) the volume of fluid in the tank remains constant.
Let ri (C1 , . . . , CS ) be the rate of reaction i and νij be the stoichiometric coefficient of
species Aj in reaction i. The mole balance for species Aj is
    d/dt {VR Cj} = (∑_{i=1}^{R} νij ri) VR ;  j = 1, 2, . . . , S   (1.30)
Since VR is assumed to be constant, the above balance may be simplified and written
in the following vector form:
    dc/dt = ν T r(c)   (1.31)
or in expanded form
    dCj/dt = ∑_{i=1}^{R} νij ri (C1 , . . . , CS );  j = 1, 2, . . . , S   (1.32)
Here, c is the S × 1 vector of concentrations, r(c) is the R × 1 vector of reaction rates (as
a function of various concentrations) and ν is the R × S matrix of stoichiometric coeffi-
cients. To complete the model, we also specify the initial condition corresponding to
the species concentrations at time zero, i. e.,
c(t = 0) = c0 (1.33)
[Note that the same model is obtained for the case of an ideal isothermal tubular plug
flow reactor (PFR) with time replaced by space time. In this case, the initial condition
is the vector of inlet concentrations]. For the special case of linear kinetics, we have
    r(c) = K̂.c,   (1.34)
where K̂ is the R × S matrix of rate constants. Defining the S × S matrix
    K = ν T .K̂   (1.35)
the batch reactor model becomes
    dc/dt = K.c,  t > 0;  c(t = 0) = c0   (1.36)
For the monomolecular reaction scheme among three species shown in Figure 1.1 (with
the six reactions ordered as A1 → A2, A2 → A1, A1 → A3, A3 → A1, A2 → A3, A3 → A2),
we have
    ν T = ( −1  1 −1  1  0  0
             1 −1  0  0 −1  1
             0  0  1 −1  1 −1 )   (1.37)
    K̂ = ( k21  0    0
           0    k12  0
           k31  0    0
           0    0    k13
           0    k32  0
           0    0    k23 );
    K = ( −(k21 + k31)   k12            k13
           k21           −(k12 + k32)   k23
           k31            k32           −(k13 + k23) )   (1.38)
[Remark: The ordering of the reactions changes the matrices ν and K̂, but K depends
only on the ordering of the species]. Since the total concentration is fixed for this spe-
cific reaction system, i. e., C1 + C2 + C3 = C10 + C20 + C30 = C0 , we can define the mole
fraction of species Aj as xj = Cj /C0 and write the evolution equation (1.36) as
    dx/dt = K.x,  t > 0;  x(t = 0) = x0 .   (1.39)
Figure 1.1: Schematic diagram of monomolecular first-order reaction scheme between 3 species.
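The construction K = ν T.K̂ for the three-species scheme can be carried out numerically. The Python sketch below (with illustrative, arbitrarily chosen rate constants) builds K from (1.37)–(1.38) and confirms that each column of K sums to zero, the matrix expression of total mole conservation:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Illustrative rate constants k_ij (rate constant for the step A_j -> A_i)
k21, k12, k31, k13, k32, k23 = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0

# nu^T (3 x 6) for the six reactions A1<->A2, A1<->A3, A2<->A3, eq. (1.37)
nuT = [[-1, 1, -1, 1, 0, 0],
       [1, -1, 0, 0, -1, 1],
       [0, 0, 1, -1, 1, -1]]

# K-hat (6 x 3): each reaction rate is linear in one concentration, eq. (1.38)
Khat = [[k21, 0, 0], [0, k12, 0], [k31, 0, 0],
        [0, 0, k13], [0, k32, 0], [0, 0, k23]]

K = matmul(nuT, Khat)
print(K)
# Each column of K sums to zero: total moles are conserved
print([sum(K[i][j] for i in range(3)) for j in range(3)])
```

The printed K has exactly the structure of equation (1.38): off-diagonal entries kij and diagonal entries −∑ (loss terms).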
We now extend the above batch reactor model to include the flow terms. With the same
assumptions as those in the batch reactor case, the species balance equations for the
isothermal case may be expressed as
    d/dt [VR Cj] = qin Cj,in(t) − qout Cj + (∑_{i=1}^{R} νij ri) VR   (1.40)
For the special case of constant and equal in and out flow rates qin = qout = q and
constant VR , the above equation simplifies to
    d/dt [Cj] = (Cj,in(t) − Cj)/τ + ∑_{i=1}^{R} νij ri ,   (1.41)
where τ is the residence or space time, defined as the volume of reactor contents over
the volumetric flow rate (τ = VR /q). In vector-matrix form, the above equations may
be expressed as
    dc/dt = (1/τ)(cin(t) − c) + ν T r(c),  t > 0   (1.42)
along with the initial condition given by equation (1.33). Here, the S × 1 vector c rep-
resents the species molar concentration and ν is the R × S stoichiometric coefficient
matrix. If the inlet concentrations are independent of time, the steady-state reactor ef-
fluent concentrations are described by the following set of nonlinear algebraic equa-
tions:
    (1/τ)(cin − cs) + ν T r(cs) = 0.   (1.43)
For the special case of linear kinetics and constant feed/inlet concentrations, we ob-
tain the linear system of equations
    (I + K* τ) cs = cin ;  K* = −ν T .K̂   (1.44)
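As a quick illustration of (1.44), consider the hypothetical single linear reaction A1 → A2 (not a scheme from the text). The Python sketch below assembles (I + K*τ) and solves for the steady-state effluent concentrations:

```python
def solve2(M, b):
    """Solve a 2x2 linear system directly via the determinant (illustrative)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - b[0] * M[1][0]) / det]

# Single linear reaction A1 -> A2 with rate constant k (illustrative values)
k, tau = 2.0, 1.0
cin = [1.0, 0.0]

# K* = -nu^T K-hat for this scheme: nu^T = [[-1], [1]], K-hat = [[k, 0]]
Kstar = [[k, 0.0], [-k, 0.0]]

# (I + K* tau) c_s = c_in, eq. (1.44)
M = [[1.0 + Kstar[0][0] * tau, Kstar[0][1] * tau],
     [Kstar[1][0] * tau, 1.0 + Kstar[1][1] * tau]]
cs = solve2(M, cin)
print(cs)   # [1/3, 2/3]: the conversion of A1 is k*tau/(1 + k*tau) = 2/3
```

Note that cs[0] + cs[1] equals the total inlet concentration, as required by stoichiometry for this monomolecular step.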
1.7.3 Two interacting tank system: transient model for mixing with in- and outflows
Consider the interacting two tank system shown in Figure 1.2. To develop a mathemat-
ical model for describing the transient behavior of the system, we make the follow-
ing assumptions: (i) each tank is an ideal mixer so that the concentration is uniform
within each tank so that the concentration of species A (salt, tracer or a chemical) in
the stream leaving each tank is equal to that in the tank, (ii) the flow rate entering
each tank (q) is constant (independent of time) but the inlet concentration to tank 1,
Cin (t), may change with time, (iii) the exchange or circulation flow rate (qe ) between
the tanks is constant, (iv) the density of the fluid is constant and the volume of fluid
that each tank holds is constant at VR1 and VR2 [This assumption implies that the to-
tal volumetric flow rate of the streams entering must be equal to that of the streams
leaving the tank], (v) no chemical reaction takes place in either tank. The notation for
various quantities (volumes of tanks, flow rates and concentrations of species A in
each tank) is as shown in the figure.
Figure 1.2: Schematic diagram of two interacting tanks with in and outflows.
The species balances for the two tanks are
    VR1 dC1/dt = q Cin(t) + qe C2 − (q + qe) C1   (1.45)
    VR2 dC2/dt = (q + qe) C1 − (q + qe) C2   (1.46)
[These are mass balances on species if concentration is measured in kg/m3 , and mole
balances on species A if the concentration is measured in molar units, moles/m3 or
moles/liter].
To complete the model, we have to supplement it with initial conditions which
specify the concentration of the species at time zero, i. e., C1 = C10 and C2 = C20 at
t = 0. In vector-matrix form, the model may be written as
    dc/dt = Ac + b(t);  c (at t = 0) = c0 = (C10 , C20 )T   (1.49)
where
    c = (C1 , C2 )T ,
    A = ( −(q + qe)/VR1    qe/VR1
           (q + qe)/VR2   −(q + qe)/VR2 ),
    b(t) = ( (q/VR1) Cin(t), 0 )T   (1.50)
Note that the matrix A can be written as the sum of a diffusive (or exchange) matrix and
a convective (flow) matrix:
    A = Ad + Ac ,
where
    Ad = ( −qe/VR1    qe/VR1
            qe/VR2   −qe/VR2 ),    Ac = ( −q/VR1    0
                                           q/VR2   −q/VR2 )
One special case of this model is obtained when the two tanks are of equal volume
(VR1 = VR2 = VR). In this case, we can define a dimensionless time t′ = qe t/VR and a
Peclet number PeD = q/qe (qe ≠ 0) and write the model as
    dc/dt′ = Âc + PeD b̂(t′);  c (at t′ = 0) = c0   (1.51)
where
    Â = ( −1  1      + PeD ( −1  0
           1 −1 )             1 −1 );    b̂(t′) = ( Cin(t′), 0 )T .   (1.52)
Another special case is obtained when there is no exchange flow between the tanks
(qe = 0) and the tanks are of equal volume. Defining the total residence time
    τ = 2VR/q,
the model may be written as
    (τ/2) dc/dt = ( −1  0
                     1 −1 ) c + ( Cin(t), 0 )T ;   c (at t = 0) = c0 .   (1.53)
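The transient behavior predicted by (1.45)–(1.46) is easy to explore numerically. The Python sketch below (all parameter values are illustrative, not from the text) integrates the two-tank model with a crude explicit-Euler scheme for a step change in inlet concentration:

```python
# Explicit-Euler simulation of the two-tank model, eqs. (1.45)-(1.46)
# (parameter values are illustrative, not from the text)
q, qe, VR1, VR2 = 1.0, 0.5, 2.0, 3.0
Cin = 1.0                      # step change in inlet concentration at t = 0
C1, C2 = 0.0, 0.0              # tanks initially contain pure solvent
dt = 1.0e-3

for _ in range(100000):        # integrate to t = 100 (many residence times)
    dC1 = (q * Cin + qe * C2 - (q + qe) * C1) / VR1
    dC2 = ((q + qe) * C1 - (q + qe) * C2) / VR2
    C1 += dt * dC1
    C2 += dt * dC2

# At steady state both tanks equilibrate to the inlet concentration
print(C1, C2)   # both close to 1.0
```

Setting the right-hand sides of (1.45)–(1.46) to zero confirms the limit analytically: C2 = C1 from the second balance, and then C1 = Cin from the first.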
The example above of two interacting tanks (or cells) can be generalized to any number of cells that interact through exchange (diffusion) and imposed flow (convection), with or without reaction. We consider these models without reaction here so that their structure can be seen more clearly. These models are referred to as cell or compartment models and are discrete (or finite-dimensional) analogs of the continuous diffusion–convection–reaction models. With the same assumptions as above, the formulation of the transient models for these cases is similar to that for the two-tank (cell) system. We provide here only the final model equations for the different cases, as their derivation is straightforward.
as their derivation is straightforward. [Remark: The compartment models formulated
here for discrete interacting systems also appear when partial differential equations
of diffusion–convection–reaction type are discretized using finite difference or finite
volume methods.]
For $N$ cells in a linear array that interact only through exchange (diffusive) flows, the model is

$$\frac{d\mathbf{c}}{dt'} = \mathbf{A}_d\,\mathbf{c}; \qquad \mathbf{c}\,(\text{at } t'=0) = \mathbf{c}_0 \qquad (1.54)$$

$$\mathbf{A}_d = \begin{pmatrix} -1 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\ 1 & -2 & 1 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 & 0 & 0 \\ \vdots & & & \ddots & & & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & -2 & 1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 1 & -2 & 1 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 & -1 \end{pmatrix} \qquad (1.55)$$
Note that this matrix is symmetric and that the sum of each row and each column is zero. The same matrix is obtained when the one-dimensional transient diffusion equation (with zero-flux boundary conditions at the ends) is discretized using the second-order finite difference or finite volume method. If the cells are arranged in a circular array (so that cell N is connected to cell N − 1 as well as to cell 1), the exchange matrix is modified to
$$\mathbf{A}_d = \begin{pmatrix} -2 & 1 & 0 & 0 & \cdots & 0 & 0 & 1 \\ 1 & -2 & 1 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 & 0 & 0 \\ \vdots & & & \ddots & & & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & -2 & 1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 1 & -2 & 1 \\ 1 & 0 & 0 & 0 & \cdots & 0 & 1 & -2 \end{pmatrix}. \qquad (1.56)$$
Again, this discrete model (with the symmetric circulant matrix) is obtained when the
one-dimensional transient diffusion equation on a circle (and periodic boundary con-
ditions) is discretized using second-order finite difference or finite volume methods.
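These exchange matrices are easy to generate and check programmatically; a minimal NumPy sketch (the helper name `exchange_matrix` is ours, not the text's):

```python
import numpy as np

def exchange_matrix(N, circular=False):
    """N-cell exchange (diffusion) matrix of eq. (1.55) or, if circular, (1.56)."""
    Ad = -2.0 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)
    if circular:
        Ad[0, -1] = Ad[-1, 0] = 1.0   # cell N also exchanges with cell 1
    else:
        Ad[0, 0] = Ad[-1, -1] = -1.0  # end cells have only one neighbor
    return Ad

for circ in (False, True):
    Ad = exchange_matrix(7, circular=circ)
    assert np.allclose(Ad, Ad.T)            # symmetric
    assert np.allclose(Ad.sum(axis=0), 0)   # every column sums to zero
    assert np.allclose(Ad.sum(axis=1), 0)   # every row sums to zero
```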
When an imposed through-flow (convection) is added to the exchange, the coupling matrix becomes

$$\hat{\mathbf{A}} = \mathbf{A}_d + Pe_D\,\mathbf{A}_c \qquad (1.57)$$

where $\mathbf{A}_d$ is as defined by equation (1.55), and the $N \times N$ convective flow matrix $\mathbf{A}_c$ and the $N \times 1$ forcing vector $\hat{\mathbf{b}}(t')$ (assuming that a single inlet stream enters tank 1) are given by
$$\mathbf{A}_c = \begin{pmatrix} -1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\ 1 & -1 & 0 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & \cdots & 0 & 0 & 0 \\ \vdots & & & \ddots & & & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 & -1 \end{pmatrix} \qquad (1.58)$$

$$\hat{\mathbf{b}}(t') = C_{in}(t')\,\mathbf{e}_1. \qquad (1.59)$$
Here, e1 is the N × 1 unit vector (corresponding to unity in the first element and zeros
in all other elements). Once again, the discrete model in this case is obtained from the
continuum transient diffusion-convection model by using differencing methods.
For cells arranged in a closed convective loop (with the flow from cell N returning to cell 1), the model takes the form

$$\frac{d\mathbf{c}}{dt'} = \mathbf{A}_L\,\mathbf{c}; \qquad \mathbf{c}\,(\text{at } t'=0) = \mathbf{c}_0 \qquad (1.60)$$

where, for three cells,

$$\mathbf{A}_L = \begin{pmatrix} -1 & 0 & 1 \\ 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix} \qquad (1.61)$$

and, for N cells,

$$\mathbf{A}_L = \begin{pmatrix} -1 & 0 & 0 & 0 & \cdots & 0 & 0 & 1 \\ 1 & -1 & 0 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & \cdots & 0 & 0 & 0 \\ \vdots & & & \ddots & & & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 & -1 \end{pmatrix}. \qquad (1.62)$$
We note that A_L is not a symmetric matrix, but it is a special case of a circulant matrix. More generally, a circulant matrix is one in which every row, starting with the second, is obtained from the previous row by moving each of its elements one column to the right, with the last element wrapping around to become the first. Each diagonal of a circulant matrix consists of identical elements. A circulant matrix is, in turn, a special case of a Toeplitz matrix, in which the elements along each diagonal are constant. The matrices appearing in the above examples are also known as banded Toeplitz matrices.
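The circulant (right-shift) property is easy to verify numerically for the loop matrix of equation (1.61); a small sketch:

```python
import numpy as np

# 3-cell loop matrix A_L of equation (1.61)
AL = np.array([[-1,  0,  1],
               [ 1, -1,  0],
               [ 0,  1, -1]])

# Circulant: each row is the previous row shifted one column to the right,
# with the last element wrapping around to become the first.
for i in range(1, AL.shape[0]):
    assert np.array_equal(AL[i], np.roll(AL[i - 1], 1))

# Toeplitz: elements along each diagonal are constant
assert AL[0, 0] == AL[1, 1] == AL[2, 2]
```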
Figure 1.4: Schematic diagram of flow past a particle on the surface of which chemical reactions
occur.
1.8 Application of elementary matrix concepts

A simplified model of such a system at steady state is obtained by using the concept of a mass transfer coefficient between the fluid and the solid. For the case of a single reaction of the form A → B, the steady-state model is obtained by equating the external flux to the reaction rate (the rate of disappearance of A), which for linear kinetics (a first-order reaction) may be expressed as

$$k_c a_v\,(C_{Ab} - C_{As}) = k_s a_v\,C_{As}.$$

Here, $k_c$ is the external mass transfer coefficient, $a_v$ is the solid–fluid exchange area per unit volume, $k_s$ is the first-order surface reaction rate constant ($k_v = k_s a_v$ is the first-order volumetric rate constant) and $C_{Ab}$ and $C_{As}$ are the bulk (cup-mixing) and surface concentrations of the reactant species A, respectively. For the case of several reactions among N species with linear kinetics, the external flux vector may be expressed as

$$\mathbf{j}_e = \mathbf{K}_c\,(\mathbf{C}_b - \mathbf{C}_s) \qquad (1.66)$$

and the rate vector as

$$\mathbf{r} = \mathbf{K}_{Rs}\,\mathbf{C}_s. \qquad (1.67)$$

Equating the flux and rate vectors and eliminating the surface concentrations $\mathbf{C}_s$ gives the observed rate in terms of the bulk concentrations,

$$\mathbf{r} = \mathbf{K}^{*}\,\mathbf{C}_b \qquad (1.68)$$
where the apparent, or mass-transfer disguised, rate constant matrix K* is defined by

$$\mathbf{K}^{*} = \mathbf{K}_{Rs}\,(\mathbf{K}_{Rs} + \mathbf{K}_c)^{-1}\,\mathbf{K}_c. \qquad (1.69)$$

It is convenient to write

$$\mathbf{K}_{Rs} = k_s\,\mathbf{A}, \qquad \mathbf{K}_c = k_c\,\mathbf{M} \qquad (1.70)$$

where A is the matrix of relative rate constants and M is the matrix of relative mass transfer coefficients. This allows us to write (1.69) as
$$\mathbf{K}^{*} = \mathbf{K}_{Rs}\,\mathbf{H} \qquad (1.71)$$

$$\mathbf{H} = (Da_{pm}\,\mathbf{A} + \mathbf{M})^{-1}\,\mathbf{M} = (\mathbf{I} + Da_{pm}\,\mathbf{M}^{-1}\mathbf{A})^{-1} \qquad (1.72)$$

where

$$Da_{pm} = \frac{k_s}{k_c} = \frac{k_v}{k_c a_v}.$$
[Remarks: The matrices A, M and H are dimensionless, i.e., their elements have no units. In simplifying the expression for H, we have assumed that M is invertible and used the property given by equation (1.24).] To illustrate how external mass transfer can disguise the true reaction network (leading to the so-called falsified kinetics), we consider the following numerical values:

$$Da_{pm} = 1, \qquad \mathbf{M} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \mathbf{A} = \begin{pmatrix} 1 & -\frac{1}{2} & 0 \\ -1 & 1 & -\frac{1}{4} \\ 0 & -\frac{1}{2} & \frac{1}{4} \end{pmatrix},$$
which gives

$$\mathbf{H} = \frac{1}{33}\begin{pmatrix} 19 & 5 & 1 \\ 10 & 20 & 4 \\ 4 & 8 & 28 \end{pmatrix}$$

and

$$\mathbf{K}^{*} = \frac{k_s}{33}\begin{pmatrix} 14 & -5 & -1 \\ -10 & 13 & -4 \\ -4 & -8 & 5 \end{pmatrix}.$$
The true and disguised rate constants (and reaction networks) are shown in Figure 1.5.
We note that two extra reactions (P → R and R → P) (that are not present in the true
reaction network) appear in the mass transfer disguised reaction network.
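The matrices H and K* above can be reproduced with a few lines of NumPy (rather than the Mathematica® computation the text uses later); a sketch with $k_s$ set to 1:

```python
import numpy as np

Dapm = 1.0
M = np.eye(3)
A = np.array([[ 1.0, -0.5 ,  0.0 ],
              [-1.0,  1.0 , -0.25],
              [ 0.0, -0.5 ,  0.25]])

# H = (Da_pm A + M)^(-1) M, eq. (1.72); K* = ks A H, eq. (1.71), with ks = 1
H = np.linalg.solve(Dapm * A + M, M)
Kstar = A @ H

assert np.allclose(33 * H, [[19, 5, 1], [10, 20, 4], [4, 8, 28]])
assert np.allclose(33 * Kstar, [[14, -5, -1], [-10, 13, -4], [-4, -8, 5]])
```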
In the general case of several reactions among many species, the true reaction network may be represented by a sparse matrix with about N nonzero rate constants, while the mass-transfer disguised rate constant matrix may have up to N² nonzero rate constants. Thus, the number of spurious reactions that appear in the mass-transfer disguised rate constant matrix can approach N² − N, which is quite large for large N. The case of large N is illustrated below; there, symbolic manipulation and computer algebra software (such as Mathematica®, Matlab®, Fortran/Python or others) can be used, as the calculations quickly become tedious by hand.
1.9 Application of computer algebra and symbolic manipulation | 33
Figure 1.5: Schematic representation of the true (left) and mass transfer disguised reaction net-
works.
$$A_1 \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} A_2 \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} A_3 \;\cdots\; A_i \underset{k_{-i}}{\overset{k_i}{\rightleftharpoons}} A_{i+1} \;\cdots\; \underset{k_{-(s-1)}}{\overset{k_{s-1}}{\rightleftharpoons}} A_s \qquad (1.73)$$

where $k_i$ and $k_{-i}$ are the forward and backward rate constants between species $A_i$ and $A_{i+1}$, with linear kinetics. Thus, the net rates of formation of these species are given by

$$\frac{d[A_1]}{dt} = -k_1[A_1] + k_{-1}[A_2] \qquad (1.74)$$

$$\frac{d[A_2]}{dt} = k_1[A_1] - (k_{-1}+k_2)[A_2] + k_{-2}[A_3] \qquad (1.75)$$

$$\vdots$$

$$\frac{d[A_i]}{dt} = k_{i-1}[A_{i-1}] - (k_{-(i-1)}+k_i)[A_i] + k_{-i}[A_{i+1}] \qquad (1.76)$$

$$\vdots$$

$$\frac{d[A_{s-1}]}{dt} = k_{s-2}[A_{s-2}] - (k_{-(s-2)}+k_{s-1})[A_{s-1}] + k_{-(s-1)}[A_s] \qquad (1.77)$$

$$\frac{d[A_s]}{dt} = k_{s-1}[A_{s-1}] - k_{-(s-1)}[A_s] \qquad (1.78)$$

Defining the mole fractions

$$x_i = \frac{[A_i]}{\sum_{j=1}^{s}[A_j]}, \qquad (1.79)$$
we can write

$$\frac{d\mathbf{x}}{dt} = -\mathbf{K}_{Rs}\,\mathbf{x}, \quad t > 0; \qquad \mathbf{x} = \mathbf{x}_0 \text{ at } t = 0 \qquad (1.80)$$

(note the minus sign: with the definition below, the entries of $\mathbf{K}_{Rs}$ collect the rates of disappearance, consistent with equations (1.74)–(1.78)), where

$$\mathbf{K}_{Rs} = \begin{pmatrix} k_1 & -k_{-1} & 0 & & & & 0 \\ -k_1 & k_{-1}+k_2 & -k_{-2} & \ddots & & & \\ 0 & \ddots & \ddots & \ddots & & & \\ & & -k_{i-1} & k_{-(i-1)}+k_i & -k_{-i} & & \\ & & & \ddots & \ddots & \ddots & \\ & & & & -k_{s-2} & k_{-(s-2)}+k_{s-1} & -k_{-(s-1)} \\ 0 & & & & 0 & -k_{s-1} & k_{-(s-1)} \end{pmatrix} \qquad (1.81)$$

and the mole fraction vector $\mathbf{x} = (x_1 \; x_2 \; \cdots \; x_s)^T$. (1.82)
Thus, depending on the transfer coefficient matrix $\mathbf{K}_c$, the apparent (mass-transfer disguised) rate constant matrix $\mathbf{K}^{*}$ given by equation (1.69) can have all elements nonzero, i.e., spurious reactions

$$A_i \underset{k_{ji}^{*}}{\overset{k_{ij}^{*}}{\rightleftharpoons}} A_j, \qquad |i-j| > 1 \qquad (1.84)$$

appear in the observed network. The number of new reactions appearing for the true reaction network of equation (1.73) is shown in Table 1.1. For a large number of species, mass transfer disguises the sequential network as a densely coupled (complex) reaction network, and a significant number of spurious reactions appear.
Note that when the number of species (S) is large, the evaluation of the mass-transfer disguised rate constants can be time consuming; hence symbolic/numerical software such as Mathematica®, Matlab® or Maple can be used for this purpose. Here, we use Mathematica® to demonstrate some examples.
1.9.1 Example 1: mass transfer disguised matrix for a five species system
Assume S = 5 and take the true rate constants (see equation (1.73)) as

$$k_1 = k_s, \quad k_{-1} = \tfrac{1}{2}k_s; \qquad k_2 = \tfrac{1}{2}k_s, \quad k_{-2} = \tfrac{1}{4}k_s; \qquad k_3 = \tfrac{1}{4}k_s, \quad k_{-3} = \tfrac{1}{8}k_s; \qquad k_4 = \tfrac{1}{10}k_s, \quad k_{-4} = \tfrac{1}{20}k_s.$$

Then $\mathbf{K}_{Rs} = k_s\,\mathbf{A}$, where

$$\mathbf{A} = \begin{pmatrix} 1 & -1/2 & 0 & 0 & 0 \\ -1 & 1 & -1/4 & 0 & 0 \\ 0 & -1/2 & 1/2 & -1/8 & 0 \\ 0 & 0 & -1/4 & 9/40 & -1/20 \\ 0 & 0 & 0 & -1/10 & 1/20 \end{pmatrix}.$$
In addition, assume that the transfer coefficient matrix $\mathbf{K}_c$ is diagonal with all diagonal elements equal, i.e.,

$$\mathbf{K}_c = k_c\,\mathbf{M}, \qquad \text{where } \mathbf{M} = \mathbf{I}_5,$$

such that $Da_{pm} = k_s/k_c = 1$. The mass-transfer disguised rate constant matrix $\mathbf{K}^{*}$, computed from equations (1.71) and (1.72), then has all elements nonzero. In other words, S(S − 1) = 20 reactions are observed due to mass transfer. The true and mass-transfer disguised reaction networks are shown in Figure 1.6, where the additional reactions and the corresponding rate constants are depicted.
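The same computation can be sketched in Python/NumPy as a check (the text uses Mathematica®); with $Da_{pm} = 1$ and $\mathbf{M} = \mathbf{I}$, equations (1.71)–(1.72) reduce to $\mathbf{K}^*/k_s = \mathbf{A}(\mathbf{A}+\mathbf{I})^{-1}$:

```python
import numpy as np

A = np.array([[ 1.0 , -0.5  ,  0.0  ,  0.0  ,  0.0 ],
              [-1.0 ,  1.0  , -0.25 ,  0.0  ,  0.0 ],
              [ 0.0 , -0.5  ,  0.5  , -0.125,  0.0 ],
              [ 0.0 ,  0.0  , -0.25 ,  0.225, -0.05],
              [ 0.0 ,  0.0  ,  0.0  , -0.1  ,  0.05]])
M = np.eye(5)
Kstar = A @ np.linalg.solve(A + M, M)       # K*/ks

off = ~np.eye(5, dtype=bool)
assert np.all(np.abs(Kstar[off]) > 1e-12)   # all 20 off-diagonal entries nonzero
assert np.all(Kstar[off] < 0)               # off-diagonal entries are negative
assert np.allclose(Kstar.sum(axis=0), 0)    # column sums vanish (conservation)
# Of the 20 observed reactions, those with |i - j| > 1 are new: 12 of them
assert sum(1 for i in range(5) for j in range(5) if abs(i - j) > 1) == 12
```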
Figure 1.6: Effect of mass-transfer on observed kinetics: (left) true reaction network and (right) mass-
transfer disguised reaction network for Example 1 (5 species leading to 12 new reactions).
1.9.2 Example 2: mass transfer disguised matrix for a ten species system
Assume S = 10 and take the true rate constants (see equation (1.73)) as

$$k_1 = k_s, \quad k_{-1} = 0.5\,k_s; \qquad k_2 = 0.5\,k_s, \quad k_{-2} = 0.25\,k_s; \qquad k_3 = 0.25\,k_s, \quad k_{-3} = 0.125\,k_s;$$
$$k_4 = 0.1\,k_s, \quad k_{-4} = 0.05\,k_s; \qquad k_5 = 0.25\,k_s, \quad k_{-5} = 0.1\,k_s; \qquad k_6 = 0.1\,k_s, \quad k_{-6} = 0.15\,k_s;$$
$$k_7 = 0.2\,k_s, \quad k_{-7} = 0.3\,k_s; \qquad k_8 = 0.1\,k_s, \quad k_{-8} = 0.05\,k_s; \qquad k_9 = 0.2\,k_s, \quad k_{-9} = 0.1\,k_s.$$
Then $\mathbf{K}_{Rs} = k_s\,\mathbf{A}$, where

$$\mathbf{A} = \begin{pmatrix}
1 & -0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 1 & -0.25 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -0.5 & 0.5 & -0.125 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -0.25 & 0.225 & -0.05 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -0.1 & 0.3 & -0.1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -0.25 & 0.2 & -0.15 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -0.1 & 0.35 & -0.3 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -0.2 & 0.4 & -0.05 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -0.1 & 0.25 & -0.1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -0.2 & 0.1
\end{pmatrix}$$
The transfer coefficient matrix is taken as $\mathbf{K}_c = k_c\,\mathbf{M}$ with

$$\mathbf{M} = \mathrm{diag}(1,\, 2,\, 3,\, 1,\, 4,\, 2,\, 1,\, 3,\, 5,\, 1),$$

such that $Da_{pm} = k_s/k_c = 2$. The mass-transfer disguised rate constant matrix, obtained using equations (1.71) and (1.72), is then
$$\mathbf{K}^{*} = k_s \begin{pmatrix}
0.478 & -0.043 & -0.065 & -0.022 & -0.087 & -0.043 & -0.022 & -0.065 & -0.109 & -0.022 \\
-0.043 & 0.913 & -0.130 & -0.043 & -0.174 & -0.087 & -0.043 & -0.130 & -0.217 & -0.043 \\
-0.065 & -0.130 & 1.304 & -0.065 & -0.261 & -0.130 & -0.065 & -0.196 & -0.326 & -0.065 \\
-0.022 & -0.043 & -0.065 & 0.478 & -0.087 & -0.043 & -0.022 & -0.065 & -0.109 & -0.022 \\
-0.087 & -0.174 & -0.261 & -0.087 & 1.652 & -0.174 & -0.087 & -0.261 & -0.434 & -0.087 \\
-0.043 & -0.087 & -0.130 & -0.043 & -0.174 & 0.913 & -0.043 & -0.130 & -0.217 & -0.043 \\
-0.022 & -0.043 & -0.065 & -0.022 & -0.087 & -0.043 & 0.478 & -0.065 & -0.109 & -0.022 \\
-0.065 & -0.130 & -0.196 & -0.065 & -0.261 & -0.130 & -0.065 & 1.304 & -0.326 & -0.065 \\
-0.109 & -0.217 & -0.326 & -0.109 & -0.434 & -0.217 & -0.109 & -0.326 & 1.956 & -0.109 \\
-0.022 & -0.043 & -0.065 & -0.022 & -0.087 & -0.043 & -0.022 & -0.065 & -0.109 & 0.478
\end{pmatrix}$$

which has all elements nonzero. Note that all diagonal elements are positive while the off-diagonal elements are negative, as expected. In this case, the total number of observed reactions is S(S − 1) = 90, i.e., 72 new reactions appear, as shown in Figure 1.7, where the true and mass-transfer disguised reaction networks are depicted.
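For Example 2, the structure of K* can be checked numerically; the sketch below rebuilds the tridiagonal matrix A from the listed rate constants and verifies the sign pattern and the count of disguised reactions:

```python
import numpy as np

S = 10
kf = [1, 0.5, 0.25, 0.1, 0.25, 0.1, 0.2, 0.1, 0.2]        # k1..k9 (in units of ks)
kb = [0.5, 0.25, 0.125, 0.05, 0.1, 0.15, 0.3, 0.05, 0.1]  # k-1..k-9

# Tridiagonal relative rate constant matrix A, cf. equation (1.81)
A = np.zeros((S, S))
for i in range(S - 1):
    A[i, i] += kf[i];      A[i + 1, i] -= kf[i]
    A[i, i + 1] -= kb[i];  A[i + 1, i + 1] += kb[i]

M = np.diag([1.0, 2, 3, 1, 4, 2, 1, 3, 5, 1])
Dapm = 2.0
Kstar = A @ np.linalg.solve(Dapm * A + M, M)   # K*/ks, equations (1.71)-(1.72)

off = ~np.eye(S, dtype=bool)
assert np.all(Kstar[off] < 0)              # off-diagonal elements negative
assert np.all(np.diag(Kstar) > 0)          # diagonal elements positive
assert np.allclose(Kstar.sum(axis=0), 0)   # column sums vanish
# S(S-1) = 90 observed reactions, of which 90 - 18 = 72 are new
assert sum(1 for i in range(S) for j in range(S) if abs(i - j) > 1) == 72
```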
Problems
1. (Formulation of linear models): Consider the following simplified economic model
of a small town represented by three major industries: coal mining, transportation
and electricity. It is estimated that the production of a one dollar value of coal re-
quires the purchase of 10 cents of electricity and 20 cents of transportation. Simi-
larly, the production of one dollar output of transportation requires the purchase
of 2 cents of coal and 35 cents of electricity, while production of one dollar value
of electricity requires purchase of 10 cents of electricity, 50 cents of coal and 30
cents of transportation. The town has external contracts for $1,000,000 of coal,
$1,600,000 of transportation and $500,000 of electricity. Show that the problem can be formulated as

$$\mathbf{x} = \mathbf{A}\mathbf{x} + \mathbf{d} \qquad \text{or} \qquad (\mathbf{I} - \mathbf{A})\mathbf{x} = \mathbf{d}$$

where x is called the production vector, d is the demand vector and A is the consumption matrix. Determine the production vector by solving the linear equations. [Remark: Models of this type were first developed by W. W. Leontief, who won the 1973 Nobel Prize in economics.]

Figure 1.7: Effect of mass transfer on kinetics: (a) true reaction network and (b) mass-transfer disguised reaction network for Example 2 (10 species leading to 72 new reactions).
2. (Formulation of linear models): Paul, Jim and Mike decide to help each other build
houses. Paul will spend half of his time on his own house and a quarter of his
time on each of the houses of Jim and Mike. Jim will spend one-third of his time
on each of the three houses under construction. Mike will spend one-sixth of his
time on Paul’s house, one-third on Jim’s house and one-half on his own house.
For tax purposes, each must place a price on his labor, but they want to do so in a
way that each will break even. Formulate the relevant equations, solve them and
suggest a price on the labor of each person such that the hourly wages for each
exceed the minimum wage.
3. (Gaussian elimination and LU decomposition):
(a) Consider the linear system
x1 + 2x2 − x3 = 0
2x1 + x2 + 2x3 = 8
6x1 + 2x2 + 2x3 = 14
$$\mathbf{A} = \begin{pmatrix} 4 & -1 & 0 & 0 & 0 \\ -1 & 4 & -1 & 0 & 0 \\ 0 & -1 & 4 & -1 & 0 \\ 0 & 0 & -1 & 4 & -1 \\ 0 & 0 & 0 & -1 & 4 \end{pmatrix}, \qquad \mathbf{b} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$

$$M_D = \frac{(n-1)\,n\,(n+1)}{3} \approx \frac{n^3}{3} \quad (n \gg 1), \qquad A_S = \frac{n\,(n-1)\,(2n-1)}{6} \approx \frac{n^3}{3} \quad (n \gg 1)$$

$$u_2 - 2u_3 - 4u_4 = 0$$
$$u_3 - 3u_4 = 0$$
$$2u_1 + u_2 + 3u_3 + 7u_4 = 0$$
$$6u_1 + 2u_2 + 10u_3 + 28u_4 = 0$$
(b) Determine the relationship between the parameters k and Ra for which the
following set of equations has a nontrivial solution:
c1 − (π 2 + k 2 )c2 = 0
−(π 2 + k 2 )c1 + Ra k 2 c2 = 0.
[Remark: This relation is called the neutral stability curve and defines the onset of convection in a fluid layer heated from below. Here, k is the wave number and Ra is the Rayleigh number.] Determine the minimum value of Ra for which a nontrivial solution exists.
(c) Determine all of the nontrivial solutions for the following system of homoge-
neous equations:
u1 − iu2 = 0
u2 + u3 = 0
u1 + u2 − u4 = 0
u2 + iu3 + iu4 = 0, where i = √−1
u1 − 3u3 = −3
2u1 + λu2 − u3 = −2
u1 + 2u2 + λu3 = 1.
(b) Verify that the following linear system is inconsistent, and hence has no so-
lution:
u1 + 4u2 − u3 = −5
5u1 + 2u2 − 3u3 = −1
−2u1 + u2 + u3 = −2
u1 + 5u2 = −2
u1 − u3 + 2u4 + u5 + 6u6 = −3
u2 + u3 + 3u4 + 2u5 + 4u6 = 1
u1 − 4u2 + 3u3 + u4 + 2u6 = 0
2u1 − 4u2 + 2u3 + 3u4 + u5 + 8u6 = −3.
where α = L/h, β = GK/h, L is the liquid (heavy-phase) flow rate, h is the holdup (which is assumed to be constant and the same for all stages), G is the gas (light-phase) flow rate and x_j is the composition of the transferable component in the liquid stream leaving stage j. State any other assumptions involved. (b) Generalize the model in (a) to the case of N stages and put it in vector-matrix form. Assuming that the compositions x_0 and x_{N+1} = y_{N+1}/K of the entering streams are known, show that the model may be written in the form

$$\mathbf{A}\mathbf{x} = \mathbf{b}.$$

Identify the vectors x, b and the matrix A. (c) Compute the steady-state values of y_1 and x_3 when N = 3 (a three-stage process) and the other parameters are given as follows:

$$L = 5, \quad G = 3, \quad h = 1, \quad K = 1, \quad x_0 = 0, \quad y_4 = 0.5$$
$$\mathbf{Q}\mathbf{c} = V_R\left[\frac{d\mathbf{c}}{dt} + \mathbf{r}(\mathbf{c})\right] - \mathbf{q}_{in}\,\mathbf{c}_{in}(t) + \mathbf{q}_e\,\mathbf{c} \qquad (1.85)$$

with an appropriate initial condition. Here, c (c_in(t)) is the vector representing the limiting reactant (inlet) concentrations in the various cells, q_in is a diagonal matrix representing the inlet volumetric flow rates to the cells, q_e is a matrix representing the auxiliary flow rates leaving the various cells (excluding the main convective flow), and Q is the N × N (loop or cell connectivity) matrix defined by

$$\mathbf{Q} = Q_L \begin{pmatrix} 1 & 0 & \cdots & 0 & -1 \\ -1 & 1 & \cdots & 0 & 0 \\ 0 & -1 & \ddots & 0 & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & -1 & 1 \end{pmatrix}, \qquad (1.86)$$
and r(c) is the vector of reaction rates. Cast the model in dimensionless form and
identify the various matrices for the special case of one entering stream in cell 1
and one exit stream in cell j (1 ≤ j ≤ N).
11. Compartment models for 2D and 3D transient diffusion: Consider the arrangement
of cells shown in Figure 1.11. Assuming that all cells are of equal volume and ex-
change flow rates are identical in magnitude, determine the coupling matrix for
each case.
12. Discrete interacting convective loops: Determine the coupling matrix for the dis-
crete interacting convective loops shown in Figure 1.12. Assume all cells to be of
identical volume and magnitude of all exchange/convective flows to be equal.
The determinant of an $n \times n$ matrix $\mathbf{A} = [a_{ij}]$ is defined by

$$\det \mathbf{A} = \sum (-1)^h\, a_{1k_1} a_{2k_2} \cdots a_{nk_n} \qquad (2.1)$$

where the summation is taken over all possible products $a_{1k_1} a_{2k_2} \cdots a_{nk_n}$ in which each product has n elements, with exactly one element arising from each row and each column of A. The value of h is the number of transpositions required to put the sequence $(k_1, k_2, \ldots, k_n)$ in its natural order. Though h is not unique, it may be shown that it is always even or always odd for a given sequence. Note that there are n! terms in the summation in equation (2.1) and that $(k_1, k_2, \ldots, k_n)$ is a permutation of the sequence $(1, 2, 3, \ldots, n)$.
https://doi.org/10.1515/9783110739701-002
2.1 Definition of determinant | 45
For a 2 × 2 matrix

$$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$

the two permutations give

$$(k_1, k_2) = (1, 2) \Rightarrow h = 0; \qquad (k_1, k_2) = (2, 1) \Rightarrow h = 1.$$

Therefore, we get

$$|\mathbf{A}| = a_{11}a_{22} - a_{12}a_{21}.$$

For a 3 × 3 matrix, the six permutations give

$$(1,2,3) \Rightarrow h = 0; \quad (2,3,1) \Rightarrow h = 2; \quad (3,1,2) \Rightarrow h = 2;$$
$$(3,2,1) \Rightarrow h = 1; \quad (1,3,2) \Rightarrow h = 1; \quad (2,1,3) \Rightarrow h = 1,$$

so that

$$|\mathbf{A}| = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33}$$
$$= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}).$$
Example 2.3. What is the sign of the term a13 a24 a31 a42 in the expansion of a 4 × 4
determinant?
Since the first subscripts are ordered, we need to look at the sequence formed
by the second subscripts (3, 4, 1, 2). Permuting 1 and 3 gives (1, 4, 3, 2). Permuting 2
and 4 gives (1, 2, 3, 4). Thus, h = 2 and the term a13 a24 a31 a42 appears with a positive
sign.
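Though a transposition count h is not unique, its parity is, and that parity equals the parity of the number of inversions in the sequence. A small Python sketch:

```python
def parity(seq):
    """Parity (h mod 2) of a permutation, via its inversion count."""
    inversions = sum(1 for i in range(len(seq))
                     for j in range(i + 1, len(seq)) if seq[i] > seq[j])
    return inversions % 2

# Example 2.3: the second subscripts (3, 4, 1, 2) form an even permutation,
# so the term a13*a24*a31*a42 carries a positive sign.
assert parity((3, 4, 1, 2)) == 0
# The 2x2 case: (2, 1) is odd, giving the minus sign of a12*a21
assert parity((2, 1)) == 1
```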
46 | 2 Determinants
$$\det \mathbf{A} = \sum (-1)^h\, a_{1k_1} a_{2k_2} a_{3k_3} \cdots a_{nk_n} \quad \text{(rows ordered, columns permuted)}$$

$$\det \mathbf{A}^T = \sum (-1)^{h'}\, a_{k_1 1} a_{k_2 2} a_{k_3 3} \cdots a_{k_n n} \quad \text{(columns ordered, rows permuted)}$$
From the definition, the following properties may be deduced. If two rows (or columns) of A are interchanged to give Ã, then |Ã| = −|A|. If two rows (or columns) of A are identical, then |A| = 0. If all the elements in a row (or column) are multiplied by α, then the determinant is multiplied by α.
6. If row j is multiplied by α (≠ 0) and added to row i, the determinant is unchanged: the extra term is α times a determinant with two identical rows, so det(new matrix) = det A + 0.
7. If A is an upper or lower triangular matrix, then det A is the product of all diagonal elements. In particular, from the definition we have det I_n = 1.
(a) Let E_{1n} be the elementary matrix obtained by performing a row operation of type 1 (interchange of two rows) on I_n. Then |E_{1n}| = −1.
(b) Let E_{2n} be the elementary matrix obtained by performing a row operation of type 2 (multiplication of a row by k ≠ 0) on I_n. Then |E_{2n}| = k.
(c) Let E_{3n} be the elementary matrix obtained by performing a row operation of type 3 (adding a multiple of one row to another) on I_n. Then |E_{3n}| = 1.

Let B = E_{in} A (i = 1, 2 or 3). Then |B| = |E_{in}||A|. Now suppose that

$$\mathbf{B} = \mathbf{E}_m \mathbf{E}_{m-1} \cdots \mathbf{E}_1 \mathbf{A}.$$

Then

$$|\mathbf{B}| = |\mathbf{E}_m||\mathbf{E}_{m-1}| \cdots |\mathbf{E}_1||\mathbf{A}|.$$
It follows from this property that if a square matrix is not singular, then its echelon
form is an upper triangular matrix with nonzero elements along the diagonal.
$$\det \mathbf{A} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$
Example 2.4. Evaluate the determinant of

$$\mathbf{A} = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 0 & -1 \\ 1 & 4 & 3 \end{pmatrix}$$

Set D_0 = 1 (the accumulated multiplier). Multiply row 1 by 2 and subtract from row 2; subtract row 1 from row 3 (type 3 operations, so D_1 = 1):

$$\mathbf{A} \to \begin{pmatrix} 1 & 3 & 5 \\ 0 & -6 & -11 \\ 0 & 1 & -2 \end{pmatrix}$$

Multiply row 2 by 1/6 (so D_2 = 6 compensates for this factor):

$$\mathbf{A} \to \begin{pmatrix} 1 & 3 & 5 \\ 0 & -1 & -\frac{11}{6} \\ 0 & 1 & -2 \end{pmatrix}$$

Add row 2 to row 3:

$$\mathbf{A} \to \begin{pmatrix} 1 & 3 & 5 \\ 0 & -1 & -\frac{11}{6} \\ 0 & 0 & -\frac{23}{6} \end{pmatrix}$$

Thus,

$$\det \mathbf{A} = 6 \times 1 \times (-1) \times \left(-\frac{23}{6}\right) = 23.$$

2.4 Minors, cofactors and Laplace's expansion
Consider the determinant

$$\det \mathbf{A} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\ \vdots & & & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{in} \\ \vdots & & & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nj} & \cdots & a_{nn} \end{vmatrix} = \sum (-1)^h\, a_{1k_1} a_{2k_2} \cdots a_{nk_n}$$

and the (n − 1) × (n − 1) matrix obtained from A by deleting the i-th row and j-th column of A. Denote this matrix by M_ij; it is called a minor. The cofactor of the element a_ij is defined by

$$A_{ij} = (-1)^{i+j} \det \mathbf{M}_{ij}.$$

Note that the cofactor is a number, whereas the minor is a matrix of order (n − 1) × (n − 1). Laplace's expansion of an n-th order determinant in terms of determinants of order (n − 1) may be stated as follows:

$$\det \mathbf{A} = \sum_{j=1}^{n} a_{ij} A_{ij} \quad \text{for any } i = 1, 2, 3, \ldots, n \qquad (2.4)$$
That is, take any row and determine the cofactors of the elements of this row; multiply the elements by the corresponding cofactors and sum to get the determinant. Similarly, the expansion in terms of columns is given by

$$\det \mathbf{A} = \sum_{i=1}^{n} a_{ij} A_{ij} \quad \text{for any } j = 1, 2, 3, \ldots, n \qquad (2.5)$$
To prove Laplace's expansion (the proofs of equations (2.4) and (2.5) are identical), first we note that (2.5) has n! terms and each term contains exactly one element from each row and column of A. This follows from the fact that each |M_ij| has (n − 1)! terms, which do not include any elements from the i-th row or j-th column of A. Thus, the sum in (2.5) has (n)(n − 1)! = n! terms. To account for the signs of the terms, we consider the matrix Â obtained by moving the i-th row of A to the last row and the j-th column of A to the last column. Then, when we expand the determinant of Â, each term in it differs from that of A by exactly (n − i) row transpositions and (n − j) column transpositions. Equivalently, the sign of each term differs by a factor (−1)^{(n−i)+(n−j)} = (−1)^{2n−(i+j)} = (−1)^{i+j}. Thus, the expansion given by (2.5) gives the determinant of A.
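Laplace's expansion translates directly into a recursive routine; a minimal sketch (it does O(n!) work, so it is for illustration only, not practical computation):

```python
import numpy as np

def det_laplace(A):
    """Determinant by cofactor expansion along the first row, eq. (2.4) with i = 1."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)  # M_1j
        total += (-1) ** j * A[0, j] * det_laplace(minor)      # a_1j * A_1j
    return total

A = [[1, 3, 5], [2, 0, -1], [1, 4, 3]]
assert abs(det_laplace(A) - 23) < 1e-12              # agrees with Example 2.4
assert abs(det_laplace(A) - np.linalg.det(A)) < 1e-9
```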
Akj is called the alien cofactor of aij and equation (2.7) is called the alien cofactor ex-
pansion. To establish this expansion, we consider the identity
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & & & & \vdots \\ a_{i1} & a_{i2} & a_{i3} & \cdots & a_{in} \\ \vdots & & & & \vdots \\ a_{k1} & a_{k2} & a_{k3} & \cdots & a_{kn} \\ \vdots & & & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{vmatrix} = a_{k1} A_{k1} + a_{k2} A_{k2} + \cdots + a_{kn} A_{kn}$$
and replace (a_k1, a_k2, ..., a_kn) by (a_i1, a_i2, ..., a_in), i.e., replace the elements of the k-th row by the elements of the i-th row. Then the left-hand side (LHS) has two identical rows, and hence LHS = 0. This replacement of the k-th row by the elements of the i-th row does not change the cofactors A_kj (j = 1, ..., n). Now, RHS = Σ_{j=1}^{n} a_ij A_kj (we are multiplying the cofactors of the k-th row by the elements of the i-th row). Therefore, we have Σ_{j=1}^{n} a_ij A_kj = 0 for i ≠ k.
The Laplace and alien cofactor expansions may be combined into the single statement

$$\sum_{j=1}^{n} a_{ij} A_{kj} = \delta_{ik} \det \mathbf{A},$$

that is, A (adj A) = (adj A) A = (det A) I, where the adjugate adj A is the transpose of the matrix of cofactors. This leads to the following theorem.

Theorem. If det A ≠ 0, then the inverse matrix A^{-1} exists and is given by

$$\mathbf{A}^{-1} = \frac{1}{|\mathbf{A}|}\,(\mathrm{adj}\,\mathbf{A}).$$
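The adjugate formula for the inverse, and the alien cofactor identity behind it, can be checked numerically; a sketch (the helper name `adjugate` is ours):

```python
import numpy as np

def adjugate(A):
    """Transpose of the matrix of cofactors of A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)  # cofactor A_ij
    return C.T

A = np.array([[1.0, 3, 5], [2, 0, -1], [1, 4, 3]])
# A (adj A) = (det A) I combines the Laplace (diagonal) and
# alien cofactor (zero off-diagonal) expansions
assert np.allclose(A @ adjugate(A), np.linalg.det(A) * np.eye(3))
# Hence the inverse formula A^{-1} = adj(A) / det(A)
assert np.allclose(adjugate(A) / np.linalg.det(A), np.linalg.inv(A))
```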
To motivate the product rule |AB| = |A||B|, consider (for 2 × 2 matrices A and B) the partitioned matrix

$$\mathbf{P} = \begin{pmatrix} a_{11} & a_{12} & 0 & 0 \\ a_{21} & a_{22} & 0 & 0 \\ -1 & 0 & b_{11} & b_{12} \\ 0 & -1 & b_{21} & b_{22} \end{pmatrix}$$

Expanding along the first row,

$$|\mathbf{P}| = a_{11}\begin{vmatrix} a_{22} & 0 & 0 \\ 0 & b_{11} & b_{12} \\ -1 & b_{21} & b_{22} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & 0 & 0 \\ -1 & b_{11} & b_{12} \\ 0 & b_{21} & b_{22} \end{vmatrix}$$

which simplifies to

$$|\mathbf{P}| = (a_{11}a_{22} - a_{12}a_{21})\begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix} = |\mathbf{A}||\mathbf{B}|.$$
To show |P| = |AB|, we use elementary row operation of type 3 to transform P to P.̂
Since ERO of type 3 does not change the value of a determinant, |P|̂ = |P|. To get P,̂ we
multiply row 3 by a11 and add to row 1, row 4 by a12 and add to row 1, row 3 by a21 and
add to row 2, row 4 by a22 and add to row 2. This gives
$$\hat{\mathbf{P}} = \begin{pmatrix} 0 & 0 & c_{11} & c_{12} \\ 0 & 0 & c_{21} & c_{22} \\ -1 & 0 & b_{11} & b_{12} \\ 0 & -1 & b_{21} & b_{22} \end{pmatrix}$$

where

$$c_{ij} = \sum_{k=1}^{2} a_{ik} b_{kj}; \qquad i, j = 1, 2,$$

i.e., the upper-right block is the product C = AB. Expanding |P̂| along the first column,

$$|\hat{\mathbf{P}}| = (-1)\begin{vmatrix} 0 & c_{11} & c_{12} \\ 0 & c_{21} & c_{22} \\ -1 & b_{21} & b_{22} \end{vmatrix} = \begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix} = |\mathbf{AB}|.$$

Since |P̂| = |P|, it follows that |AB| = |A||B|. As a consequence, taking B = A^{-1} (when the inverse exists) gives

$$\det(\mathbf{A}^{-1}) = \frac{1}{\det \mathbf{A}}. \qquad (2.8)$$

2.6 Rank of a matrix defined in terms of determinants
Recall our earlier definition of the rank of a matrix as the number of nonzero rows in the row echelon form of A. Now, let E_i be the elementary matrix obtained by performing an ERO of type i on the identity matrix I. Then we have seen that

$$|\mathbf{E}_1| = -1, \qquad |\mathbf{E}_2| = k \;(k \neq 0), \qquad |\mathbf{E}_3| = 1.$$

Now, let A be any m × n matrix and A_e be the row echelon form of A. Then

$$\mathbf{A}_e = \mathbf{P}\mathbf{A} \qquad (2.9)$$

where P is a product of elementary matrices E_i; thus, |P| ≠ 0. Suppose that the rank of A is r. Then, without loss of generality, we may assume that A_e is of the form
$$\mathbf{A}_e = \begin{pmatrix} 1 & \hat{a}_{12} & \hat{a}_{13} & \cdots & \hat{a}_{1r} & \cdots & \hat{a}_{1n} \\ 0 & 1 & \hat{a}_{23} & \cdots & \hat{a}_{2r} & \cdots & \hat{a}_{2n} \\ 0 & 0 & \ddots & \cdots & \hat{a}_{3r} & \cdots & \hat{a}_{3n} \\ \vdots & & & \ddots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & \cdots & \hat{a}_{rn} \\ 0 & 0 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 & \cdots & 0 \end{pmatrix}$$
Here, A_e has m − r zero rows, and the first nonzero element in row i appears in the i-th column (this can always be arranged by renumbering the columns of A). Thus, A_e has an r × r minor whose determinant is not zero. From equation (2.9), it follows that A also has an r × r minor with a nonzero determinant. Thus, if A has rank r, there is at least one r × r minor of A whose determinant is not zero. Conversely, if A has rank r, then all k × k minors (k > r) of A have a zero determinant.
Consider the linear system

$$\mathbf{A}\mathbf{u} = \mathbf{b} \qquad (2.10)$$

and the corresponding homogeneous system

$$\mathbf{A}\mathbf{u} = \mathbf{0}. \qquad (2.11)$$

When det A ≠ 0, the system (2.10) has a unique solution. This solution may be expressed in terms of determinants using Cramer's rule. Let D = |A| ≠ 0. Then we have
$$u_j D = \begin{vmatrix} a_{11} & \cdots & a_{1j}u_j & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2j}u_j & \cdots & a_{2n} \\ \vdots & & \vdots & & \vdots \\ a_{n1} & \cdots & a_{nj}u_j & \cdots & a_{nn} \end{vmatrix} \qquad (2.12)$$

Next, for each k ≠ j, add u_k times column k to column j of the matrix in equation (2.12). This does not change the value of the determinant. But

$$\sum_{k=1}^{n} a_{ik} u_k = b_i; \qquad i = 1, 2, \ldots, n,$$

so column j becomes the right-hand side vector b. Thus,

$$u_j D = \begin{vmatrix} a_{11} & a_{12} & \cdots & b_1 & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & b_2 & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & b_n & \cdots & a_{nn} \end{vmatrix} \equiv D_j$$

and

$$u_j = \frac{D_j}{D}, \qquad j = 1, 2, \ldots, n. \qquad (2.13)$$

This is the explicit solution of the linear system given by equation (2.10). The result given by equation (2.13) is referred to as Cramer's rule.
For the 2 × 2 system

$$a_{11}u_1 + a_{12}u_2 = b_1$$
$$a_{21}u_1 + a_{22}u_2 = b_2$$

we have

$$u_1 = \frac{\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}, \qquad u_2 = \frac{\begin{vmatrix} a_{11} & b_1 \\ a_{21} & b_2 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}} \qquad (2.14)$$

Similarly, for a 3 × 3 system we have

$$u_1 = \frac{\begin{vmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}, \qquad u_2 = \frac{\begin{vmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}, \qquad u_3 = \frac{\begin{vmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}$$
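Cramer's rule (2.13) is a one-loop routine in NumPy; a sketch, checked against `numpy.linalg.solve` on the system of Problem 3(a) from Chapter 1:

```python
import numpy as np

def cramer_solve(A, b):
    """Solve A u = b by Cramer's rule, eq. (2.13); requires det A != 0."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    D = np.linalg.det(A)
    u = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.copy()
        Aj[:, j] = b                  # replace column j by b
        u[j] = np.linalg.det(Aj) / D  # u_j = D_j / D
    return u

A = np.array([[1.0, 2, -1], [2, 1, 2], [6, 2, 2]])
b = np.array([0.0, 8, 14])
u = cramer_solve(A, b)
assert np.allclose(A @ u, b)
assert np.allclose(u, np.linalg.solve(A, b))
```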
It follows from Cramer’s rule that for the special case of the homogeneous system
(b = 0), we obtain u = 0. Thus, as already seen before, when det A ≠ 0, the only
solution to the homogeneous system is the trivial one.
It should also be noted that when det A ≠ 0, the solution given by Cramer’s rule
is unique. To show this, suppose that there are two solutions and call them u and y.
We have
Au = b
Ay = b
Subtracting, we get
Az = 0 (2.15)
where
z=u−y
Since D ≠ 0, the only solution to the homogeneous system (2.15) is the trivial one.
Thus, z = 0 and u = y.
Cramer’s rule is not used in practice for higher order systems (e. g., n > 10) as it
requires more computational time than the Gaussian elimination procedure.
Consider a 2 × 2 determinant whose elements are differentiable functions of t,

$$D(t) = \begin{vmatrix} a_{11}(t) & a_{12}(t) \\ a_{21}(t) & a_{22}(t) \end{vmatrix}.$$

Then

$$\frac{dD}{dt} = a_{11}\frac{da_{22}}{dt} + \frac{da_{11}}{dt}a_{22} - a_{12}\frac{da_{21}}{dt} - \frac{da_{12}}{dt}a_{21} = \begin{vmatrix} \frac{da_{11}}{dt} & \frac{da_{12}}{dt} \\ a_{21} & a_{22} \end{vmatrix} + \begin{vmatrix} a_{11} & a_{12} \\ \frac{da_{21}}{dt} & \frac{da_{22}}{dt} \end{vmatrix} = D_1 + D_2.$$

Thus, the derivative of the determinant is the sum of two determinants, in each of which a single row is differentiated. This result is easily generalized to the n-th order determinant:

$$\frac{dD}{dt} = \sum_{j=1}^{n} D_j,$$

where

$$D_j = \begin{vmatrix} a_{11}(t) & a_{12}(t) & \cdots & a_{1n}(t) \\ \vdots & \vdots & & \vdots \\ \frac{da_{j1}(t)}{dt} & \frac{da_{j2}(t)}{dt} & \cdots & \frac{da_{jn}(t)}{dt} \\ \vdots & \vdots & & \vdots \\ a_{n1}(t) & a_{n2}(t) & \cdots & a_{nn}(t) \end{vmatrix}.$$
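The row-by-row differentiation rule is easy to check on a concrete 2 × 2 example of our choosing, say A(t) = [[t, 1], [2, t²]], for which D(t) = t³ − 2:

```python
import numpy as np

def A_of_t(t):
    return np.array([[t, 1.0], [2.0, t * t]])

def dA_of_t(t):
    return np.array([[1.0, 0.0], [0.0, 2.0 * t]])

t = 0.7
# dD/dt = D1 + D2: differentiate one row at a time
D1 = np.linalg.det(np.vstack([dA_of_t(t)[0], A_of_t(t)[1]]))
D2 = np.linalg.det(np.vstack([A_of_t(t)[0], dA_of_t(t)[1]]))

# Direct differentiation of D(t) = t^3 - 2 gives 3 t^2
assert abs((D1 + D2) - 3 * t ** 2) < 1e-12
```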
Now, equations (2.18) to (2.21) are a set of n homogeneous linear equations for the
coefficients {c1 , c2 , . . . , cn } whose only solution is the trivial one. Thus, the determinant
of the coefficient matrix (also called the Wronskian determinant), defined by
$$W(x) = \begin{vmatrix} w_1(x) & w_2(x) & \cdots & w_n(x) \\ w_1'(x) & w_2'(x) & \cdots & w_n'(x) \\ \vdots & \vdots & & \vdots \\ w_1^{[n-1]}(x) & w_2^{[n-1]}(x) & \cdots & w_n^{[n-1]}(x) \end{vmatrix} \neq 0$$
Consider a system of n equations in n unknowns depending on a parameter α,

$$f_i(u_1, u_2, \ldots, u_n, \alpha) = 0; \qquad i = 1, 2, \ldots, n. \qquad (2.22)$$

Assuming that the functions f_i have continuous partial derivatives, the implicit function theorem of multivariable calculus states that if the determinant of the (linearized) Jacobian matrix of equations (2.22) does not vanish, then the solution is a continuous function of the parameter α. Equivalently, the number of solutions to equations (2.22) can change only when the determinant of the Jacobian matrix

$$\mathbf{J} = \left[\frac{\partial f_i}{\partial u_j}(\mathbf{u}, \alpha)\right]; \qquad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, n$$

vanishes, i.e.,

$$\det \mathbf{J} = 0. \qquad (2.23)$$
The elimination of the state variables u from equations (2.22) and (2.23) gives a locus in the α parameter space. This locus is called the bifurcation set, as new solutions can emerge (or bifurcate) only when the α values cross this set. [Remark: It can be shown that if the zero eigenvalue of J is simple, the number of solutions of equations (2.22) changes if and only if the parameters α cross the bifurcation set. However, when the eigenvalue is not simple, the vanishing of the Jacobian determinant is a necessary but not a sufficient condition for bifurcation.] Thus, the solution of the original set of equations (2.22), along with the vanishing of the Jacobian determinant, can be used to determine the bifurcation set for nonlinear problems.
As an example, we consider the steady-state equation describing the temperature (u) in an adiabatic CSTR with parameters B > 0 and Da > 0:

$$f(u, B, Da) = u - \frac{B\,Da\,e^u}{1 + Da\,e^u} = 0 \qquad (2.24)$$

$$\frac{df}{du} = 1 - \frac{B\,Da\,e^u}{(1 + Da\,e^u)^2} = 0. \qquad (2.25)$$

Writing

$$t = Da\,e^u, \qquad (2.26)$$

the two equations may be solved for B and Da, and the bifurcation set may be expressed in parametric form as

$$B = \frac{(1+t)^2}{t} \qquad (2.27)$$

$$Da = t\,e^{-1-t}, \qquad t > 0. \qquad (2.28)$$
This is plotted in Figure 2.1. The bifurcation set consists of two branches, the upper ignition branch and the lower extinction branch. It divides the (B, Da) plane into two regions, corresponding to a unique solution and to three solutions of equation (2.24). [Remark: The cusp point in Figure 2.1, where the ignition and extinction branches meet, is at B = 4 and Da = e^{−2}. We note that for B < 4 there is only a unique solution for any value of the dimensionless residence time Da.] For higher-dimensional problems, symbolic manipulation or computer programming can be used to determine the bifurcation set.
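The parametric representation (2.27)–(2.28) can be verified numerically: on the bifurcation set, t = Da e^u gives u = 1 + t, and both f and df/du vanish there. A sketch:

```python
import numpy as np

def f(u, B, Da):
    return u - B * Da * np.exp(u) / (1 + Da * np.exp(u))

def fu(u, B, Da):
    return 1 - B * Da * np.exp(u) / (1 + Da * np.exp(u)) ** 2

for t in (0.2, 1.0, 3.0):
    B = (1 + t) ** 2 / t          # equation (2.27)
    Da = t * np.exp(-1 - t)       # equation (2.28)
    u = 1 + t                     # from t = Da * e^u
    assert abs(f(u, B, Da)) < 1e-10   # on the fold: f = 0
    assert abs(fu(u, B, Da)) < 1e-10  # and df/du = 0

# Cusp point (t = 1), where the two branches meet: B = 4, Da = e^{-2}
assert (1 + 1) ** 2 / 1 == 4.0
assert abs(1 * np.exp(-1 - 1) - np.exp(-2)) < 1e-15
```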
Example 2.8 (Lorenz equations). Consider the following discrete model of convection
(where the fluid density is assumed to vary only with temperature):
$$\frac{1}{Pr}\frac{dx}{dt} = y - x = f_1 \qquad (2.29)$$

$$\frac{dy}{dt} = -xz + Rx - y = f_2 \qquad (2.30)$$

$$\frac{dz}{dt} = xy - \frac{8}{3}z = f_3 \qquad (2.31)$$
where R is the (scaled) Rayleigh number and Pr is the Prandtl number (taken to be
unity here).
Writing the vector of variables as ψ = (x, y, z)T , then equations (2.29)–(2.31) can
be written as
dψ
= f(ψ) = (f1 , f2 , f3 )T . (2.32)
dt
It can easily be verified that equations (2.29)–(2.31) or (2.32) has a trivial steady-state
solution, i. e., ψs = (x, y, z)T = 0 is a steady-state solution for any R and Pr. [Remark:
The steady-state solution does not depend on Pr.] The Jacobian of the function f can
be calculated easily and is given by
𝜕fi −1 1 0
J={ }=( R−z −1 −x ) (2.33)
𝜕ψj y x −8
3
−1 1 0
⇒ Js = J|ψs =0 = ( R −1 0 ). (2.34)
0 0 −8
3
Thus, the Jacobian matrix at the trivial steady-state is of rank 3 except when R = 1
(where it has rank 2, i. e., det Js = 0 when R = 1).
Note that the linearized system of equations at R = 1,

$$\mathbf{J}_s \boldsymbol{\psi} = \mathbf{0},$$

has a nontrivial solution $\boldsymbol{\psi} = \alpha\,(1, 1, 0)^T$ for arbitrary α. Thus, the linearized matrix J_s has a simple zero eigenvalue, and R = 1 is a bifurcation point, i.e., new solutions appear or disappear when the R-value crosses unity.

In this specific case, we can determine all solutions, since at steady state equations (2.29) and (2.31) give

$$x = y \qquad (2.35)$$

$$z = \frac{3}{8}xy = \frac{3}{8}x^2, \qquad (2.36)$$

and substituting these into equation (2.30) gives

$$-\frac{3}{8}x^3 - x + Rx = 0$$

or

$$x\left(R - 1 - \frac{3}{8}x^2\right) = 0 \quad \Rightarrow \quad x = 0 \quad \text{or} \quad x = \pm\sqrt{\frac{8}{3}(R-1)}. \qquad (2.37)$$

In other words, for R > 1 the system has three steady-state solutions (including the trivial steady state).
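Both the degeneracy of the Jacobian at R = 1 and the steady states of equation (2.37) can be checked numerically; a sketch:

```python
import numpy as np

def f(psi, R, Pr=1.0):
    x, y, z = psi
    return np.array([Pr * (y - x),          # eq. (2.29), rewritten as dx/dt
                     -x * z + R * x - y,    # eq. (2.30)
                     x * y - 8.0 * z / 3])  # eq. (2.31)

def J(psi, R):
    x, y, z = psi
    return np.array([[-1.0, 1.0, 0.0],
                     [R - z, -1.0, -x],
                     [y, x, -8.0 / 3]])

# det Js = (8/3)(R - 1): the Jacobian is singular exactly at R = 1
assert abs(np.linalg.det(J(np.zeros(3), 1.0))) < 1e-12
assert np.linalg.det(J(np.zeros(3), 2.0)) > 0

# Nontrivial steady states for R > 1, equation (2.37)
R = 2.5
for s in (1.0, -1.0):
    x = s * np.sqrt(8 * (R - 1) / 3)
    psi = np.array([x, x, 3 * x * x / 8])
    assert np.allclose(f(psi, R), 0)
```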
The solution diagram (referred to as pitchfork bifurcation in the literature) is
shown in Figure 2.2. In this figure, x = 0 is the trivial conduction solution while x ≠ 0
correspond to convective solutions. As stated in the Introduction to this chapter, the
condition for the existence of a nontrivial solution to a system of linearized equa-
tions (expressed in terms of a determinant) may be used to determine the possible
bifurcation points of many nonlinear systems.
For further discussion on the theory of determinants, we refer to the books by
Amundson [3] and Lipschutz and Lipson [22].
62 | 2 Determinants
Figure 2.2: Steady-state solution diagram of Lorenz equations illustrating the pitchfork bifurcation.
Problems
1. (Simplification of determinants):
(a) Show that
Dn = | 1+c1    1      1    . .    1
       1     1+c2    1    . .    1
       1      1     1+c3  . .    1
       .      .      .    . .    .
       1      1      1    . .  1+cn |
   = c1 c2 ⋅ ⋅ ⋅ cn (1 + 1/c1 + 1/c2 + ⋅ ⋅ ⋅ + 1/cn)
c1 − (π² + k²)c2 = 0
Le c1 − (π² + k²)c3 = 0
−(π² + k²)c1 + Ra_t k² c2 + Ra_c k² c3 = 0.
[Remark: This relation is called the neutral stability curve and defines the onset of
convection in a fluid layer heated from below. Here, k is the wave number, Le is the
Lewis number and Rat , Rac are the thermal and concentration Rayleigh numbers,
respectively.]
4. (Equation for a plane and a circle in terms of determinants)
(a) Show that the equation of a plane passing through the points (xi, yi, zi), i = 1, 2, 3 may be expressed as

| x    y    z   1
  x1   y1   z1  1
  x2   y2   z2  1
  x3   y3   z3  1 | = 0
(b) (Equation for a circle in terms of determinants) Show that the equation of a circle passing through the points (xi, yi), i = 1, 2, 3 may be expressed as

| x² + y²     x    y   1
  x1² + y1²   x1   y1  1
  x2² + y2²   x2   y2  1
  x3² + y3²   x3   y3  1 | = 0
64 | 2 Determinants
5. (Common root condition for polynomial equations) Show that the necessary and sufficient condition for the equations

x³ + ax² + bx + c = 0
x² + αx + β = 0

to have a common root is

| 1  a  b  c  0
  0  1  a  b  c
  1  α  β  0  0
  0  1  α  β  0
  0  0  1  α  β | = 0
3 Vectors and vector expansions
In this chapter, we review some elementary concepts about vectors and vector expan-
sions. A more general discussion will be given in Part II when we deal with abstract
vector space concepts.
For the purpose of this chapter, we define a vector to be an n-tuple of real or complex numbers arranged in a single row or column:

u = ( u1
      u2
      .
      .
      un )   (column vector)

u^T = (u1 u2 u3 . . . un)   (row vector)
For simplicity, we shall deal only with column vectors in the discussion below. However, all the concepts and properties of column vectors also apply to row vectors. (When the elements of the column vector u are complex numbers, the corresponding row vector is defined using the complex conjugates of the elements.) We define the sum of two vectors u and v by

u + v = ( u1 + v1
          u2 + v2
          .
          un + vn ),
and the product (scalar multiplication) of a vector u by a real (or complex) number α by

αu = ( αu1
       αu2
       .
       αun )
https://doi.org/10.1515/9783110739701-003
The set V with the above operations is called a vector space. We now deal with the algebraic and geometric properties of this set. A set of vectors {u1, u2, . . . , ur} is said to be linearly independent if

c1 u1 + c2 u2 + ⋅ ⋅ ⋅ + cr ur = 0   (3.1)

implies

c1 = c2 = ⋅ ⋅ ⋅ = cr = 0.

Otherwise, the set is called linearly dependent. We note that equation (3.1) defines a system of n homogeneous equations in r unknowns.
c1 u1 + c2 u2 + c3 u3 = 0.

Using the elementary row operations, we reduce this system to the following echelon form:

( 1  2  3/4
  0  1   1
  0  0   0 ) c = 0.

Thus, we have c2 = −c3, c1 = −(5/2)c3, and can get a nontrivial solution (e. g., by taking c3 = 2, c2 = −2 and c1 = −5); hence, the vectors are linearly dependent.
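A rank computation gives a quick numerical test of this kind of dependence. The sketch below uses NumPy with an illustrative set of our own choosing (not the vectors of the example above):

```python
import numpy as np

# Vectors u1,...,ur are linearly dependent iff the matrix with them as
# columns has rank < r.  Illustrative set: u3 = u1 + 2*u2 by construction.
u1 = np.array([1.0, 2.0, 3.0])
u2 = np.array([0.0, 1.0, 1.0])
u3 = np.array([1.0, 4.0, 5.0])

U = np.column_stack([u1, u2, u3])
rank = int(np.linalg.matrix_rank(U))
print(rank)   # 2 < 3  ->  linearly dependent

# A nontrivial solution of U c = 0 exhibits the linear relation
c = np.array([1.0, 2.0, -1.0])   # u1 + 2*u2 - u3 = 0
print(np.allclose(U @ c, 0.0))   # True
```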
3.2 Dot or scalar product of vectors | 67
The following facts may be established from the above definitions of linear depen-
dence:
1. The zero vector is linearly dependent (since 1 · 0 = 0).
2. Any single nonzero vector is linearly independent.
3. If a set of vectors is linearly dependent, then any larger set containing this set is
also linearly dependent.
4. Any subset of a linearly independent set is also linearly independent.
5. Any set of vectors containing the zero vector is linearly dependent.
6. If r > n, the set {u1 , u2 , . . . , ur } is linearly dependent, i. e., there can be at most
n linearly independent vectors in a set where all the vectors have n elements. As
already noted, equation (3.1) defines a set of n linear homogeneous equations in
r unknowns. Where r > n, there are more unknowns than equations and we can
always find a nontrivial solution.
The collection of all vectors that are linear combinations of elements of the set S = {u1, u2, . . . , ur} is called the subspace spanned by S. A set S = {u1, u2, . . . , ur} is called
a basis for a vector space V if it is linearly independent and spans V. The number
of elements in a basis is called the dimension of the vector space V. The following
theorem may be established easily from the above properties of n-tuples.
Theorem. The vector space V of all n-tuples of real numbers ℝn (or complex num-
bers, ℂn ) has dimension n.
e1 = ( 1        e2 = ( 0
       0 ),            1 )

in ℝ². This is called the standard basis. The set in Example 3.1 is another basis for ℝ².
A scalar-valued function ⟨u, v⟩ of pairs of vectors u, v ∈ V is called a scalar or dot product (or, more generally, an inner product) if it satisfies the following three rules:
(i) ⟨αu + βv, w⟩ = α⟨u, w⟩ + β⟨v, w⟩; for u, v, w ∈ V and α, β are scalars
(ii) ⟨u, v⟩ = ⟨v, u⟩
(iii) ⟨u, u⟩ ≥ 0 and ⟨u, u⟩ = 0 iff u = 0
It is important to note that the scalar product maps pairs of vectors in V to the set of
real or complex numbers. The first property requires linearity in the first variable. The
second property is called Hermitian symmetry. For the case in which u and v contain
real elements, this simply requires the scalar product to be symmetric. The third prop-
erty known as the positive definiteness requires that the scalar product of a vector with
itself to be positive for all vectors in V except the zero vector. A vector space in which
a scalar product is defined has a geometric structure (and we can change this geomet-
ric structure by properly choosing the scalar product for a particular application. This
will be demonstrated in the second part.) We define the length (or norm) of a vector by

‖u‖ = √⟨u, u⟩.

Using the Schwarz inequality

⟨u, v⟩² ≤ ⟨u, u⟩⟨v, v⟩,   (3.4)

we can also define the angle between two vectors. [A proof of Schwarz's inequality is given in Part II.] When V is the set of n-tuples of real numbers, we define the angle between two vectors u, v ∈ V as

cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖).   (3.5)

When V is the set of n-tuples of complex numbers, we define the angle between two vectors u, v ∈ V as

cos θ = |⟨u, v⟩| / (‖u‖ ‖v‖).   (3.6)
[Remark: It can be shown that the angle defined by equation (3.5) satisfies 0 ≤ θ ≤ π while that defined by equation (3.6) satisfies 0 ≤ θ ≤ π/2.] The vectors u, v ∈ V are
said to be orthogonal if ⟨u, v⟩ = 0. A vector u is said to be normalized (or is a unit
vector) if ‖u‖ = 1. If the set of vectors {u1 , u2 , . . . , un } is linearly independent and forms
a basis for V, then this basis is called an orthonormal basis if each vector in the set is
orthogonal to the other vectors and is normalized to have unit length. In terms of the scalar product, an orthonormal basis satisfies the condition

⟨ui, uj⟩ = δij,

where δij is the Kronecker delta function (δij = 1 for i = j and zero otherwise).
For the space of n-tuples of real numbers, we take

⟨u, v⟩ = u1 v1 + u2 v2 + ⋅ ⋅ ⋅ + un vn.

This is the usual inner (dot) product and it may be verified that it satisfies all three axioms. The length of a vector with respect to this inner product is given by

‖u‖ = √(u1² + u2² + ⋅ ⋅ ⋅ + un²).

The set consisting of the unit vectors e1 = (1, 0, . . . , 0), e2 = (0, 1, . . . , 0), . . . , en = (0, 0, . . . , 1) is one possible orthonormal basis for this space. This vector space is often referred to as the n-dimensional Euclidean space.
For the space of n-tuples of complex numbers, we take

⟨u, v⟩ = u1 v̄1 + u2 v̄2 + ⋅ ⋅ ⋅ + un v̄n,

where the bar denotes the complex conjugate. Again, it may be verified that all three axioms are satisfied. The length of a vector with respect to this inner product is given by

‖u‖ = √(u1 ū1 + u2 ū2 + ⋅ ⋅ ⋅ + un ūn).

This space has a geometric structure similar to that of the n-dimensional Euclidean space and is an example of a finite-dimensional Hilbert space.
It may be shown that every finite-dimensional inner product space has an or-
thonormal basis. If {u1 , u2 , . . . , un } is a basis for V but is not orthogonal, the following
Gram–Schmidt procedure may be used to transform it to an orthogonal basis. Define
v1 = u1,
vk = uk − Σ_{i=1}^{k−1} (⟨uk, vi⟩ / ‖vi‖²) vi,   k = 2, 3, . . . , n.
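The Gram–Schmidt recursion translates directly into code. A minimal NumPy sketch (the basis of ℝ³ below is our own illustrative choice):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize linearly independent vectors:
    v1 = u1,  vk = uk - sum_{i<k} (<uk, vi> / ||vi||^2) vi."""
    basis = []
    for u in vectors:
        v = u.astype(float)
        for w in basis:
            v = v - (np.dot(u, w) / np.dot(w, w)) * w
        basis.append(v)
    return basis

# Illustrative non-orthogonal basis of R^3
u = [np.array([1.0, 1.0, 0.0]),
     np.array([1.0, 0.0, 1.0]),
     np.array([0.0, 1.0, 1.0])]
v = gram_schmidt(u)

# All pairwise dot products of the new basis vanish
ok = all(abs(np.dot(v[i], v[j])) < 1e-12
         for i in range(3) for j in range(i + 1, 3))
print(ok)  # True
```

Dividing each vk by its length afterwards gives an orthonormal basis.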
Au = 0 (3.8)
Suppose that the rank of A is r(r ≤ m, r ≤ n). Then there is at least one r × r minor of
A whose determinant is not zero. Without loss of generality, we can assume that the
nonzero r × r minor is at the upper left corner. We can solve for the first r variables in
terms of the remaining (n − r) variables to obtain
Suppose that we choose values for the variables {ur+1 , ur+2 , . . . , un } and calculate
{u1 , u2 , . . . , ur } from equation (3.9). Suppose that we make (n − r + 1) choices for
{ur+1 , ur+2 , . . . , un } and arrange the solution in rows. Then, in this solution matrix,
the first r columns are obtained as linear combination of the last (n − r) columns.
Hence, the rank of this matrix is at most (n − r). If the choices of {ur+1 , ur+2 , . . . , un }
are such that the rank of the solution matrix is equal to (n − r), then the last row is
a linear combination of the first (n − r) rows. Thus, there can be at most (n − r) lin-
early independent solutions. The (n − r) linearly independent solutions are called a
3.3 Linear algebraic equations | 71
fundamental set of solutions. The following theorem may be stated for the solutions of
equation (3.8).
Theorem. Every solution of equation (3.8) may be expressed as

u = c1 u1 + c2 u2 + ⋅ ⋅ ⋅ + c_{n−r} u_{n−r},   (3.10)

where r is the rank of A and {u1, u2, . . . , u_{n−r}} is a set of fundamental (linearly independent) solutions and the ci are arbitrary constants.
u = uh + up (3.11)
where uh is the general solution of the homogeneous system given by (3.10) and up is any
(particular) solution of the inhomogeneous equations.
[Remark: The theorem is also valid for the case r = 0 but this is omitted as it corre-
sponds to zero equations in n unknowns.]
for which rank A = 2. We have already seen [see Example 1.7] that every solution to the homogeneous system is of the form

uh = c1 (3, 0, 2, 3)^T + c2 (0, 1, −1, −2)^T.
u1 − 2u2 − u4 = b1
−2u1 + 3u2 + 3u3 = b2
−u2 + 3u3 − 2u4 = b3
3u1 − 7u2 + 3u3 − 5u4 = b4
This system has a solution if and only if b is of the form

b = b1 (1, 0, 2, 5)^T + b2 (0, 1, 1, 1)^T,

i. e., b3 = 2b1 + b2 and b4 = 5b1 + b2.
Taking b^T = (0, 1, 1, 1), the general solution of the inhomogeneous system may be written as

u = c1 (3, 0, 2, 3)^T + c2 (0, 1, −1, −2)^T + (−2, −1, 0, 0)^T.
3.4.1 Stoichiometry
A single chemical reaction among S species A1, A2, . . . , AS may be written as

Σ_{j=1}^{S} νj Aj = 0,   (3.12)

where νj is the stoichiometric coefficient of species Aj. For example, consider the methanol synthesis reaction

CO + 2H2 = CH3OH,   (3.13)

and denote A1 = CH3OH, A2 = CO and A3 = H2. With this notation, we can write equation (3.13) in the form of equation (3.12) as

A1 − A2 − 2A3 = 0.   (3.14)
Σ_{j=1}^{S} νij Aj = 0,   (i = 1, 2, . . . , R)   (3.15)
where νij is the stoichiometric coefficient of species Aj in the i-th reaction. For obvi-
ous reasons, the R × S matrix {νij } is called the stoichiometric coefficient matrix. For
example, consider the above methanol synthesis reaction with the side reactions
CO2 + H2 = H2 O + CO (3.16)
CO2 + 3H2 = H2 O + CH3 OH (3.17)
With A4 = CO2 and A5 = H2O, these side reactions may be written as

A2 + A5 − A3 − A4 = 0   (3.18)
A1 + A5 − 3A3 − A4 = 0.   (3.19)

The three reactions (3.14), (3.18) and (3.19) can be written compactly as νa = 0, where ν is the stoichiometric coefficient matrix

ν = ( 1  −1  −2   0  0
      0   1  −1  −1  1
      1   0  −3  −1  1 )
and a is the species vector defined by aT = (A1 A2 A3 A4 A5 ). The two reactions (3.13)
and (3.16) are independent of each other while the third reaction (3.17) is the sum of
the other two reactions. It is important to know just how many independent reactions
there are in a given system. This can be answered using the vector space concepts in
two different ways:
1. When we already know the system of reactions, we can determine the number of
independent reactions and pick one such set by looking at the stoichiometric co-
efficient matrix. In the above example, the rank of ν is two and only two reactions
are independent.
2. We can also determine the number of independent reactions between S species
(A1 , A2 , . . . , AS ) by determining the rank of the atomic matrix. Suppose that each
species Aj is made up of atoms αi and let the number of atoms αi in species Aj be
denoted by λij . A table may be made up listing the species Aj along the top row
and the building blocks of the species (i. e., the atoms) αi vertically at the left so
that the element at the intersection of the i-th row and j-th column is λij . The n × S
matrix {λij } is called the atomic matrix:
A1 A2 A3 . . AS
α1 λ11 λ12 λ13 . . λ1S
α2 λ21 λ22 λ23 . . λ2S
. . . . . . .
. . . . . . .
αn λn1 λn2 λn3 . . λnS
Suppose that the rank of the atomic matrix is r. Then the number of independent vec-
tors is r and the remaining (S − r) vectors (species) may be represented as a linear
combination of r basis vectors (species). These (S − r) relations are nothing but the
independent reactions between the species.
Example 3.7. Consider a reaction mixture consisting of CH3 OH, CO, H2 , CO2 and H2 O.
There are five species and the three distinct atoms. We form the atomic matrix and
see that it has rank 3. Thus, there are two independent reactions between these five
species.
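The rank computation of Example 3.7 is easy to verify numerically. A NumPy sketch (our own illustration; columns CH3OH, CO, H2, CO2, H2O; rows C, H, O):

```python
import numpy as np

# Atomic matrix of Example 3.7: each entry is the number of atoms of the
# row element contained in the column species
A = np.array([
    [1, 1, 0, 1, 0],   # C
    [4, 0, 2, 0, 2],   # H
    [1, 1, 0, 2, 1],   # O
])

r = int(np.linalg.matrix_rank(A))
S = A.shape[1]
print(r, S - r)   # rank 3  ->  5 - 3 = 2 independent reactions
```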
Dimensional analysis is useful to analyze and correlate the behavior of a physical sys-
tem when it is not possible to write down the governing equations explicitly or when
they are too complicated to solve. In such cases, the Buckingham method may be used
to determine the dimensionless groups that characterize the behavior of the system.
In this method, one lists all the variables that are significant in a given problem and
determines the number of independent dimensionless groups formed by these vari-
ables by using the Buckingham pi-theorem. This theorem states that the number of
dimensionless groups used to describe a system involving n variables is equal to n − r,
where r is the rank of the dimensional matrix of the variables. Thus,

i = n − r,

where i is the number of independent dimensionless groups and n is the number of variables.
The exponents of the fundamental dimensions may be used to represent each variable
as a vector in the m-dimensional space. Suppose that the rank of this matrix is r(≤ m).
Then only r of these vectors are linearly independent, and hence, (n−r) of these vectors
may be expressed as a linear combination of the r independent vectors. These (n − r)
relations are the dimensionless groups formed by the variables.
Example 3.8. Consider the motion of a solid body through a fluid. The drag force ex-
erted by the fluid (FD ) depends on the velocity V0 of this solid body, the size of the solid
body (such as diameter, D), the fluid density (ρ) and the fluid viscosity (μ). Determine
the relevant dimensionless groups.
We note that there are five vectors (variables) in a three-dimensional space and only
three of them can be linearly independent. We take the three linearly independent
vectors to be V0 , ρ and D. The two dimensionless groups may be formed by expanding
the remaining two vectors in terms of these three linearly independent ones. Equiv-
alently, we can form the product of each of the other variables with these three and
choose the exponents so that the resulting combination has no dimensions. (This is
the pi-method.) To form the first group, we write

π1 = FD^a V0^b ρ^c D^d.

To make π1 dimensionless, the net exponent of each fundamental dimension (mass, length and time) must be zero. Solving these three homogeneous equations, we get a = −c, d = 2c and b = 2c. Thus,

π1 = (ρ D² V0² / FD)^c.

The value for c is arbitrary and we take it to be −1 so that π1 becomes the familiar Euler number:

Eu = FD / (ρ D² V0²)
Similarly, the second group is

π2 = (D V0 ρ / μ)^b,

and taking b = 1 gives the familiar Reynolds number

Re = D V0 ρ / μ.
Thus, we can relate the five variables in terms of two dimensionless groups Eu and
Re. A relationship of the form Eu = f (Re) may be determined experimentally. In the
literature, the Euler number is often replaced by the drag coefficient, which is defined
by
CD = FD / ((1/2) ρ V0² Ac),
where Ac is the projected area of the body in the direction of flow (For a sphere, Ac =
πD2 /4). Plots of experimentally determined drag coefficient curves (CD versus Re) for
various shapes (e. g., sphere) may be found in standard fluid mechanics textbooks.
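The pi-theorem bookkeeping amounts to finding the nullspace of the dimensional matrix. A SymPy sketch for the drag problem (our own ordering: rows M, L, T; columns FD, V0, D, ρ, μ), checking that the exponent vectors of Eu and Re are valid dimensionless groups:

```python
import sympy as sp

# Dimensional matrix: rows are the exponents of M, L, T for each variable
Dm = sp.Matrix([
    [ 1,  0, 0,  1,  1],   # mass M
    [ 1,  1, 1, -3, -1],   # length L
    [-2, -1, 0,  0, -1],   # time T
])

print(Dm.rank())            # 3, so n - r = 5 - 3 = 2 dimensionless groups
print(len(Dm.nullspace()))  # 2

# Exponent vectors (F_D, V0, D, rho, mu) of Eu = F_D/(rho D^2 V0^2)
# and Re = D V0 rho / mu must lie in the nullspace
Eu = sp.Matrix([1, -2, -2, -1, 0])
Re = sp.Matrix([0, 1, 1, 1, -1])
print(list(Dm * Eu), list(Dm * Re))   # [0, 0, 0] [0, 0, 0]
```

Any other dimensionless group for this problem is a product of powers of Eu and Re, since the nullspace is two-dimensional.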
It is often the case that many species are found in a chemical reactor and various reaction pathways are conjectured. For example, during oxidative dehydrogenation of methane in a catalytic reactor, the following gas-phase species are found:

CH4, C2H6, H2O, H2, O2, CO, CO2, C2H4, C2H2,   (3.20)

and the following reactions are conjectured:

2CH4 + (1/2)O2 → C2H6 + H2O   (3.21)
2CH4 + O2 → C2H4 + 2H2O   (3.22)
CH4 + (1/2)O2 → CO + 2H2   (3.23)
3.5 Application of computer algebra and symbolic manipulation | 77
CO + (1/2)O2 → CO2   (3.24)
2CH4 ⇌ C2H6 + H2   (3.25)
C2H6 ⇌ C2H4 + H2   (3.26)
C2H4 ⇌ C2H2 + H2   (3.27)
CH4 + H2O ⇌ CO + 3H2   (3.28)
CO + H2O ⇌ CO2 + H2,   (3.29)
where not all of the reactions are independent. There are many ways to determine the independent reactions, as described earlier; however, when the number of species is large the determination may be cumbersome, and computer programming can be utilized for the task. As an example, we consider the above example of oxidative dehydrogenation of methane, where the nine species (given in equation (3.20)) are made up of 3 atoms, leading to the following atomic matrix:
     CH4  C2H6  H2O  H2  O2  CO  CO2  C2H4  C2H2
C  (   1    2    0    0   0   1   1    2     2
H      4    6    2    2   0   0   0    4     2
O      0    0    1    0   2   1   2    0     0  )   (3.30)
It may be verified that the rank of this matrix is 3 from its row echelon form:

( 1  0  0   2  −4  −5  −7  −2  −4
  0  1  0  −1   2   3   4   2   3
  0  0  1   0   2   1   2   0   0 ).   (3.31)
ν = ( −2   1   1   0  −1/2    0   0   0   0
      −2   0   2   0   −1     0   0   1   0
      −1   0   0   2  −1/2    1   0   0   0
       0   0   0   0  −1/2   −1   1   0   0
      −2   1   0   1    0     0   0   0   0
       0  −1   0   1    0     0   0   1   0
       0   0   0   1    0     0   0  −1   1
      −1   0  −1   3    0     1   0   0   0
       0   0  −1   1    0    −1   1   0   0 )   (3.32)
where species numbers are assigned in the order they appear in equation (3.20). The row echelon form of this matrix can be obtained as follows:
ν̂ = ( 1   0   0   0   0   0    0    −3/2    1
       0   1   0   0   0   0    0    −2      1
       0   0   1   0   0   0  −1/2   −5/4   3/2
       0   0   0   1   0   0    0    −1      1
       0   0   0   0   1   0   −1    −1/2    1
       0   0   0   0   0   1  −1/2    1/4  −1/2
       0   0   0   0   0   0    0     0      0
       0   0   0   0   0   0    0     0      0
       0   0   0   0   0   0    0     0      0 )   (3.33)
which shows that the rank of the stoichiometric matrix is 6, i. e., the number of independent reactions is 6. These independent reactions can be obtained by multiplying ν̂ (given in equation (3.33)) with the species vector given in equation (3.20).
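The rank-6 result can be reproduced symbolically. A SymPy sketch with the matrix ν of equation (3.32) (species columns ordered as in equation (3.20); this is our own illustration of the computation, not the book's Mathematica code):

```python
import sympy as sp
from sympy import Rational

half = Rational(1, 2)

# Stoichiometric matrix of reactions (3.21)-(3.29); columns ordered
# CH4, C2H6, H2O, H2, O2, CO, CO2, C2H4, C2H2
nu = sp.Matrix([
    [-2,  1,  1, 0, -half,  0, 0,  0, 0],   # (3.21)
    [-2,  0,  2, 0, -1,     0, 0,  1, 0],   # (3.22)
    [-1,  0,  0, 2, -half,  1, 0,  0, 0],   # (3.23)
    [ 0,  0,  0, 0, -half, -1, 1,  0, 0],   # (3.24)
    [-2,  1,  0, 1,  0,     0, 0,  0, 0],   # (3.25)
    [ 0, -1,  0, 1,  0,     0, 0,  1, 0],   # (3.26)
    [ 0,  0,  0, 1,  0,     0, 0, -1, 1],   # (3.27)
    [-1,  0, -1, 3,  0,     1, 0,  0, 0],   # (3.28)
    [ 0,  0, -1, 1,  0,    -1, 1,  0, 0],   # (3.29)
])

rref, pivots = nu.rref()
print(nu.rank())   # 6 independent reactions
print(pivots)      # (0, 1, 2, 3, 4, 5): pivots in the first six species columns
```

The nonzero rows of `rref` are the rows of ν̂ in equation (3.33).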
Note that this set of independent reactions is not unique and can be determined in many other ways. For example, another method, based on eliminating each atom, is shown below:
CH4 = C + 2H2
C2 H6 = 2C + 3H2
H2 O = H2 + O
H2 = H2
O2 = 2O
CO = C + O
CO2 = C + 2O
C2 H4 = 2C + 2H2
C2 H2 = 2C + H2
Eliminating the atom O (using O = (1/2)O2), these become

CH4 = C + 2H2
C2H6 = 2C + 3H2
H2O = H2 + (1/2)O2
H2 = H2
CO = C + (1/2)O2
CO2 = C + O2
C2H4 = 2C + 2H2
C2H2 = 2C + H2
Finally, eliminating C from the above seven equations, we get the six linearly independent reactions:

2CH4 = C2H6 + H2           (dimerization/pyrolysis)
H2 + (1/2)O2 = H2O         (hydrogen oxidation)
CH4 + (1/2)O2 = CO + 2H2   (partial oxidation to syngas)
CO + (1/2)O2 = CO2         (CO oxidation)
C2H6 = C2H4 + H2           (dehydrogenation of ethane)
C2H4 = C2H2 + H2           (dehydrogenation of ethylene)
Problems
1. (Linear dependence and independence of vectors): Which of the following sets of
vectors are linearly independent? Find the corresponding linear relations:
(i) (5, 4, 3), (3, 3, 2), (8, 1, 3)
(ii) (4, −5, 2, 6), (2, −2, 1, 3), (6, −3, 3, 9), (4, −1, 5, 6)
(iii) Suppose we have a set of vectors
such that
|ajj| > Σ_{i=1, i≠j}^{n} |aij|,   j = 1, 2, . . . , n
(a) Determine the number of independent reactions in the following set by exam-
ining the rank of the stoichiometric coefficient matrix:
(b) Determine the number of independent reactions in the above system by ex-
amining the rank of the atomic matrix.
(c) The following species are found to be present in the pyrolysis of a low molec-
ular hydrocarbon:
Determine the number of independent reactions and write down a set of in-
dependent reactions.
3. (Application of vector expansions to dimensional analysis):
(a) When a gas and liquid flow simultaneously in a horizontal pipe, several dif-
ferent flow patterns are obtained (e. g., stratified flow, bubble flow, slug flow,
annular flow, etc.). The type of flow pattern obtained in a particular system
depends on the pipe diameter (D), the liquid and gas superficial velocities
(ULS , UGs ), the density and viscosities of the phases (ρL , ρG , μL , μG ), the inter-
facial (surface) tension (σ) and the gravitational acceleration (g). Determine
the relevant dimensionless groups and give a physical interpretation.
(b) Small droplets of liquid are formed when a liquid jet breaks up in spray and
fuel injection processes. Assume that the droplet diameter (d) depends on the
liquid density, viscosity and surface tension, as well as the jet speed (V) and
diameter (D). Determine the relationship between these quantities by dimen-
sional analysis. Give a physical interpretation of the dimensionless groups.
4. (Application of vector expansions to dimensional analysis): It was shown by G. I.
Taylor that the energy (E) released in a nuclear explosion may be estimated from
the relation
R = (E/ρ0)^{1/5} c t^{2/5},
where R is the radius of the spherical shock wave generated by the explosion, ρ0 is
the ambient density, t is the time and c is a constant. Taylor suggested to determine
the constant c (which turns out to be close to unity) by using experimentation
with lighter explosives (such as TNT) and E by using photographic data of R as a
function of time.
(a) Assuming that R depends on E, ρ0 , t and the ambient pressure p0 , derive the
relevant dimensionless groups.
(b) Discuss the additional assumptions or approximations involved in obtaining
Taylor’s formula from the result in (a).
5. (Gas phase microkinetics): Consider a gas phase system consisting of molecules
H2 , Br2 , HBr and free radicals H and Br.
(a) Determine the number of independent reactions and write down one such set.
(b) Determine the number of reactions if the system has no free radicals.
6. (Catalytic microkinetics) In the oxidation of CO on a catalytic site (s), the following
gas phase and surface species are present:
Determine the number of independent reactions and write down one such set.
4 Solution of linear equations by eigenvector
expansions
The main goal of this chapter is to solve the linear algebraic equations

Au = b,   (4.1)

and the linear initial value problem

du/dt = Au,  t > 0;  u(t = 0) = u0,   (4.2)

using eigenvector expansions. To do this, we first consider the matrix eigenvalue problem

Ax = λx,   (4.3)

where λ is a scalar.
To illustrate, consider the 2 × 2 case with

x = ( x1        A = ( a11  a12
      x2 ),           a21  a22 )

and
https://doi.org/10.1515/9783110739701-004
4.1 The matrix eigenvalue problem | 83
y = Ax = ( a11  a12   ( x1       ( a11 x1 + a12 x2       ( y1
           a21  a22 )   x2 )  =    a21 x1 + a22 x2 )  ≡    y2 )
The matrix A operating on the vector x gives the vector y. In general, the length of y
is different from that of x as the operator A stretches (or contracts) and rotates x to
obtain y. However, when y = λx (with λ real) we see that when A operates on x we get
only a stretching (or contraction) of x but there is no rotation (Figure 4.1 shows this for
λ real and positive).
We have seen that this homogeneous system has a nontrivial solution iff det(A−λI) = 0.
⇒ Pn(λ) ≡ | a11−λ    a12    . .    a1n
             a21    a22−λ   . .    a2n
              .       .     . .     .
             an1     an2    . .  ann−λ | = 0   (4.5)
Equation (4.5) is called the characteristic equation of the square matrix A. The LHS of (4.5) is a polynomial of degree n in λ and may be written as

Pn(λ) = (λ1 − λ)(λ2 − λ) ⋅ ⋅ ⋅ (λn − λ).   (4.6)
Theorem. Every polynomial of degree n has exactly n roots (real or complex with count-
ing of repetition or multiplicity).
It follows from this theorem that a square matrix A of order n has n eigenvalues.
Corresponding to each eigenvalue λi, there is at least one nontrivial solution (eigenvector) of

(A − λi I)xi = 0.

Theorem. If λi is a simple eigenvalue of A (i. e., Pn′(λi) ≠ 0), then there is only one linearly independent eigenvector corresponding to λi.

Proof.

Pn(λ) = | a11−λ    a12    . .    a1n
           a21    a22−λ   . .    a2n
            .       .     . .     .
           an1     an2    . .  ann−λ |

Differentiating with respect to λ expresses Pn′(λ) as a sum of n determinants of order (n − 1), each obtained by deleting the row and column of one diagonal entry. If Pn′(λi) ≠ 0, then at least one of these (n − 1) × (n − 1) determinants is not zero.
4.2 Left eigenvectors and the adjoint eigenvalue problem (eigenrows) | 85
⇒ rank(A − λi I) = n − 1.
∴ There is only one linearly independent solution to the homogeneous system (A −
λi I)xi = 0.
∴ The result.
Remark. The eigenvalues are also called characteristic values, characteristic roots or
latent roots.
For the 2 × 2 case,

P2(λ) = λ² − (tr A)λ + det A = (λ1 − λ)(λ2 − λ),

where tr A is the trace of A (sum of the diagonal elements) and det A is the determinant. For the 3 × 3 case, we have

P3(λ) = | a11−λ    a12     a13
           a21    a22−λ    a23
           a31     a32    a33−λ |
       = −λ³ + (tr A)λ² − (a11 a22 + a11 a33 + a22 a33 − a12 a21 − a13 a31 − a23 a32)λ + det A
       = (λ1 − λ)(λ2 − λ)(λ3 − λ)
Let y be an n × 1 column vector,

y = ( y1
      y2
      .
      yn ),

and let

y∗ = (ȳ)^T = ( ȳ1  ȳ2  ȳ3  .  ȳn )

denote its conjugate transpose (a row vector). Consider the row (left) eigenvalue problem
y∗ A = μy∗ (4.7)
For the 2 × 2 case, this reads

( ȳ1  ȳ2 ) ( a11  a12
             a21  a22 ) = μ ( ȳ1  ȳ2 ).

Multiplying A on the left by a row vector gives another row vector.
Definition. A real or complex number μ for which (4.7) has nontrivial solutions is
called an eigenvalue of A and the nontrivial solution y∗ is called eigenrow or more
precisely left eigenvector of A corresponding to eigenvalue μ.
Taking the complex conjugate transpose of both sides, equation (4.7) may also be written as

(y∗ A)∗ = (μ y∗)∗
⇒ A∗ y = μ̄ y.   (4.8)
Thus, the left eigenvalue problem for matrix A (also called the adjoint eigenvalue prob-
lem) is an eigenvalue problem for A∗ [Considered as an operator, the matrix A∗ is
called the adjoint of A]. We shall refer to the column vector y as the adjoint eigen-
vector. (The complex conjugate of the transpose of y, namely y∗ will be referred to as
the eigenrow or left eigenvector of A).
When A has real elements, equation (4.8) reduces to

A^T y = μ̄ y.   (4.9)
4.3 Properties of eigenvectors/eigenrows | 87
Theorem. The set of eigenvalues defined by equation (4.8) is identical to that defined
by equation (4.3).
Proof. The eigenvalues of equation (4.8) are the roots of the polynomial

Qn(μ) = |A∗ − μ̄I| = |(A − μI)∗| = P̄n(μ)

(the determinant of a conjugate transpose is the complex conjugate of the determinant), and P̄n(μ) = 0 iff Pn(μ) = 0.
Thus, the adjoint problem has the same set of eigenvalues. This can be seen more directly from equation (4.7), which may be written as the homogeneous system

y∗(A − μI) = 0,   (4.10)

with a nontrivial solution iff

|A − μI| = Pn(μ) = 0.   (4.11)

Thus, the eigenvalues defined by (4.3) and (4.7) are the same. However, note that if (4.7) is written in the form given by equation (4.8), then the adjoint eigenvalue problem (for A∗) has eigenvalues

μ̄ = λ̄.   (4.12)
1. Any nonzero scalar multiple of an eigenvector is also an eigenvector:

Axj = λj xj ⇒ A(αxj) = λj(αxj) ⇒ αxj (α ≠ 0) is also an eigenvector.

2.(a) Eigenvectors corresponding to distinct eigenvalues λi ≠ λj are linearly independent.
Proof. Suppose that xi and xj are linearly dependent. Then there exist constants ci
and cj such that
ci xi + cj xj = 0
cj
xi = − xj
ci
cj
⇒ Axi = − Axj
ci
cj
⇒ λi xi = − λj xj
ci
−cj
⇒ λi xi = λj ( x ) = λj xi
ci j
⇒ (λi − λj )xi = 0
or
xi = 0
But xi cannot be zero since it is an eigenvector. We have also assumed that the eigen-
values are distinct, i. e., λi ≠ λj . Therefore, we arrive at a contradiction. ⇒ xi and xj
are linearly independent.
2.(b) Suppose that A has simple and distinct eigenvalues {λ1 , λ2 , . . . , λn } with eigen-
vectors {x1 , x2 , . . . , xn }. Then the set of eigenvectors {x1 , x2 , x3 , . . . , xn } is linearly
independent.
Proof. Suppose the set is linearly dependent. Then there exist constants ci, not all zero, such that

c1 x1 + c2 x2 + ⋅ ⋅ ⋅ + cn xn = 0.   (4.13)

Premultiplying (4.13) by (A − λ1 I) and using Axi = λi xi ⇒

0 + c2(λ2 − λ1)x2 + ⋅ ⋅ ⋅ + cn(λn − λ1)xn = 0   (4.14)
Premultiplying (4.14) by (A − λ2 I), and continuing in the same way with (A − λ3 I), . . . , (A − λ_{n−1} I), ⇒

cn [∏_{i=1}^{n−1} (λn − λi)] xn = 0   (4.15)
Since the eigenvalues are all distinct and xn ≠ 0, (4.15) ⇒ cn = 0. Repeating the same
argument with the remaining part of equation (4.13), we can show that
cn−1 = cn−2 = ⋅ ⋅ ⋅ = c1 = 0
Thus, all constants are zero and we have a contradiction. This implies that the eigen-
vectors are linearly independent.
3. Properties (i) and (ii) are also valid for the eigenrows.
4.(a) If xi is an eigenvector of A corresponding to eigenvalue λi and y∗j is the eigenrow of A corresponding to eigenvalue λj (≠ λi), then we have

y∗j xi = 0.   (4.16)

Proof. We have

Axi = λi xi   (4.17)
y∗j A = λj y∗j.   (4.18)

Postmultiplying (4.18) by xi and using (4.17) ⇒

y∗j λi xi = λj y∗j xi
λi y∗j xi = λj y∗j xi   (since λi is a scalar)
(λi − λj) y∗j xi = 0
Since λi ≠ λj ⇒ y∗j xi = 0. In terms of the dot product, this result may be written as
⟨xi , yj ⟩ = 0; i ≠ j.
4.(b) Suppose that A has simple and distinct eigenvalues λ1, λ2, . . . , λn with eigenvectors {x1, x2, . . . , xn} and eigenrows {y∗1, y∗2, . . . , y∗n}. Then

y∗j xi = 0 for i ≠ j   (4.19)

and

y∗j xj ≠ 0.   (4.20)

We have already proved equation (4.19). To prove (4.20), we use the property that the eigenvectors and eigenrows are linearly independent. Now, if y∗j xj = 0, then xj is orthogonal to the n linearly independent vectors yi, i = 1, . . . , n. However, the only such vector is the zero vector. But xj ≠ 0 ⇒ y∗j xj ≠ 0. We can normalize the eigenrows (or eigenvectors) such that

y∗i xj = δij.   (4.21)

[Note: The symbol δij is the Kronecker delta, which takes a value of unity when the indices are equal and zero otherwise.]
Example 4.2. Consider

A = ( −3   2
       4  −5 ).

The characteristic polynomial is

P(λ) = | −3−λ    2
           4   −5−λ |
     = λ² + 8λ + 7
     = (λ + 1)(λ + 7),

P(λ) = 0 ⇒ λ1 = −1, λ2 = −7
eigenvectors:

(A − λ1 I)x1 = 0 ⇒ ( −2   2
                      4  −4 ) x1 = 0 ⇒ x1 = (1, 1)^T

(A − λ2 I)x2 = 0 ⇒ ( 4  2
                     4  2 ) x2 = 0 ⇒ x2 = (1, −2)^T
eigenrows:

yT1 (A − λ1 I) = 0 ⇒ yT1 ( −2   2
                            4  −4 ) = 0 ⇒ yT1 = (2, 1)

yT2 (A − λ2 I) = 0 ⇒ yT2 ( 4  2
                           4  2 ) = 0 ⇒ yT2 = (1, −1)
Thus, we have

eigenvalues: λ1 = −1, λ2 = −7
eigenvectors: x1 = (1, 1)^T, x2 = (1, −2)^T
eigenrows: yT1 = (2, 1), yT2 = (1, −1)

Biorthogonality relations:

yT1 x1 = 3 ≠ 0,   yT2 x1 = 0
yT1 x2 = 0,       yT2 x2 = 3 ≠ 0
Figure 4.2 shows a plot of the eigenvectors and eigenrows (dashed lines). The biorthog-
onality relationship can be seen clearly.
Normalizing the eigenrows such that yTi xj = δij gives ŷT1 = (1/3)(2, 1) and ŷT2 = (1/3)(1, −1).
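The eigenvalues, eigenvectors and biorthogonality relations of Example 4.2 are easy to check numerically. A NumPy sketch (left eigenvectors obtained as eigenvectors of A^T; our own illustration):

```python
import numpy as np

# Matrix of Example 4.2
A = np.array([[-3.0, 2.0],
              [4.0, -5.0]])

lam, X = np.linalg.eig(A)      # right eigenvectors (columns of X)
mu, Y = np.linalg.eig(A.T)     # left eigenvectors = eigenvectors of A^T

# Line up corresponding eigenpairs by sorting on the eigenvalue
X = X[:, np.argsort(lam)]
Y = Y[:, np.argsort(mu)]

# Biorthogonality: Y^T X is diagonal (off-diagonal entries vanish)
B = Y.T @ X
offdiag = B - np.diag(np.diag(B))
print(np.sort(lam))               # [-7. -1.]
print(np.allclose(offdiag, 0.0))  # True
```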
Example 4.3.
−1 1
A=( )
−1 −1
Figure 4.2: Schematic plot of the eigenvectors and eigenrows of the matrix in Example (4.2).
Its adjoint is

A∗ = A^T = ( −1  −1
              1  −1 ).

|A − λI| = | −1−λ     1
             −1    −1−λ | = λ² + 2λ + 2,

P(λ) = 0 ⇒ λ1 = −1 + i, λ2 = −1 − i
Eigenvectors:

(A − λ1 I)x1 = 0 ⇒ ( −i   1
                     −1  −i ) x1 = 0 ⇒ x1 = (1, i)^T

(A − λ2 I)x2 = 0 ⇒ (  i  1
                     −1  i ) x2 = 0 ⇒ x2 = (1, −i)^T
eigenrows:

y∗1 (A − λ1 I) = 0 ⇒ ( ȳ11  ȳ12 ) ( −i   1
                                    −1  −i ) = ( 0  0 ) ⇒ y∗1 = ( i  1 ),  y1 = (−i, 1)^T

y∗2 (A − λ2 I) = 0 ⇒ ( ȳ21  ȳ22 ) (  i  1
                                    −1  i ) = ( 0  0 ) ⇒ y∗2 = ( −i  1 ),  y2 = (i, 1)^T
To summarize, we have

eigenvalues: λ1 = −1 + i, λ2 = −1 − i
eigenvectors: x1 = (1, i)^T, x2 = (1, −i)^T
eigenrows: y∗1 = ( i  1 ), y∗2 = ( −i  1 )
or adjoint eigenvectors: y1 = (−i, 1)^T, y2 = (i, 1)^T

Biorthogonality:
y∗1 x1 = ⟨x1 , y1 ⟩ = i + i = 2i ≠ 0
y∗2 x1 = ⟨x1 , y2 ⟩ = −i + i = 0
y∗1 x2 = ⟨x2 , y1 ⟩ = i − i = 0
y∗2 x2 = ⟨x2 , y2 ⟩ = −i − i = −2i ≠ 0
Example 4.4. Consider the real symmetric matrix

A = ( −1   1
       1  −1 ).

Its eigenvalues are λ1 = 0 and λ2 = −2, and the corresponding normalized eigenvectors

x1 = (1/√2, 1/√2)^T,   x2 = (1/√2, −1/√2)^T

are orthogonal: xT1 x2 = 0.
Example 4.5. Consider the complex matrix

A = ( 1    1−i
      1+i   2  ).

Its complex conjugate is

Ā = ( 1    1+i
      1−i   2  ),

and hence

(Ā)^T = A∗ = ( 1    1−i
               1+i   2  ) = A,

i. e., A is Hermitian (self-adjoint). The characteristic polynomial is

P(λ) = | 1−λ   1−i
         1+i  2−λ | = λ² − 3λ,

P(λ) = 0 ⇒ λ1 = 0, λ2 = 3
Eigenvectors:

(A − λ1 I)x1 = 0 ⇒ ( 1    1−i
                     1+i   2  ) x1 = 0 ⇒ x1 = (−1+i, 1)^T

y∗1 (A − λ1 I) = 0 ⇒ ( ȳ11  ȳ12 ) ( 1    1−i
                                    1+i   2  ) = ( 0  0 )
⇒ y∗1 = ( −1−i  1 )  or  y1 = (−1+i, 1)^T = x1
(A − λ2 I)x2 = 0 ⇒ ( −2    1−i
                     1+i   −1 ) x2 = 0 ⇒ x2 = (1−i, 2)^T = y2

The orthogonality relations hold:

y∗2 x1 = ( 1+i  2 ) ( −1+i
                        1  ) = 0,   y∗1 x2 = 0.
Theorem. Suppose that the square matrix A is such that A∗ = A. Then the eigenvalues
of A are real and the left and right eigenvectors of A are related by y∗i = x∗i , i. e., the
eigenrows are the conjugate transposes of the eigenvectors (equivalently, the eigenvec-
tors and adjoint eigenvectors are the same).
Ax = λx (4.22)
Premultiplying (4.22) by x∗ (or taking the dot or inner product with x) we get
x∗ Ax = λx∗ x (4.23)
Now, x∗ Ax is a scalar and equation (4.23) is a scalar identity. We take the ∗ operation
(complex conjugation and transpose) on both sides of (4.23) ⇒
x∗ A∗ x = λ̄ x∗ x   (4.24)
⇒ x∗ A x = λ̄ x∗ x   (since A∗ = A)
⇒ x∗ λ x = λ̄ x∗ x   (using (4.22))
⇒ λ x∗ x = λ̄ x∗ x
⇒ λ = λ̄   (since x∗ x ≠ 0)
⇒ λ is real
Next, taking the conjugate transpose of (4.22) (with λ = λi real and A∗ = A) gives

x∗i A = λi x∗i,   (4.25)

while the eigenrow corresponding to λi satisfies

y∗i A = λi y∗i.   (4.26)

Comparing (4.25) and (4.26), we see that y∗i may be chosen to be a scalar multiple
of x∗i . Thus, we choose y∗i = x∗i . Because of this very important property (real eigen-
values and orthogonal set of eigenvectors) of real symmetric (or complex Hermitian)
matrices, many problems involving such matrices can be solved using only orthogonal
expansions.
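These conclusions (real eigenvalues, orthogonal eigenvectors) are easy to verify numerically for the Hermitian matrix worked out above. A NumPy sketch (our own illustration; the book's tool is Mathematica):

```python
import numpy as np

# Hermitian matrix of the worked example: A* = A
A = np.array([[1.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 2.0]])

lam, X = np.linalg.eigh(A)   # eigh assumes (and exploits) A* = A
print(np.round(lam, 10))     # real eigenvalues, approximately [0, 3]

# Eigenvectors for distinct eigenvalues are orthogonal in the complex
# inner product <x1, x2> = x2* x1
ip = np.vdot(X[:, 1], X[:, 0])
print(abs(ip) < 1e-12)       # True
```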
Let {x1, x2, . . . , xn} be a basis for V. Then any vector z ∈ V may be expanded as

z = Σ_{i=1}^{n} αi xi.   (4.27)

This expansion is unique, i. e., the coefficients αi are uniquely determined for each vector z. These coefficients are called the coordinates of z w. r. t. the basis {x1, x2, . . . , xn}. In general, we have to solve a set of n linear equations in n unknowns to determine the {αi}.
Example 4.6. Expand z = (1, −1)^T in terms of the basis x1 = (1, 2)^T, x2 = (2, 5)^T:

(  1      = α1 ( 1     + α2 ( 2
  −1 )           2 )          5 )

⇒ α1 + 2α2 = 1
  2α1 + 5α2 = −1.

Solving, α1 = 7 and α2 = −3.
Suppose now that the basis vectors are mutually orthogonal, i. e.,

x∗i xj = 0 if i ≠ j.   (4.28)

Premultiplying the expansion z = Σ αi xi by x∗j gives

x∗j z = Σ_{i=1}^{n} αi x∗j xi.   (4.29)
4.4 Orthogonal and biorthogonal expansions | 97
Since x∗j xj = ‖xj‖² ≠ 0, there is only one nonzero term on the RHS of equation (4.29). Now, the linear equations for the αj are decoupled and we can solve for αj as

αj = x∗j z / (x∗j xj).   (4.30)

Thus, the orthogonal expansion of z is

z = Σ_{j=1}^{n} (⟨z, xj⟩ / ‖xj‖²) xj.   (4.31)

If the basis is orthonormal (‖xj‖ = 1), then
αj = x∗j z = ⟨z, xj ⟩.
Example 4.7.
(a) Consider ℝ² and take $e_1 = \binom{1}{0}$, $e_2 = \binom{0}{1}$ as the orthogonal set. Taking
$$z = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
we have $z = \alpha_1 e_1 + \alpha_2 e_2$ with
$$\alpha_1 = e_1^Tz = (1\;\; 0)\begin{pmatrix} 1 \\ -1 \end{pmatrix} = 1 \quad\text{(first element of } z\text{)}$$
$$\alpha_2 = e_2^Tz = (0\;\; 1)\begin{pmatrix} 1 \\ -1 \end{pmatrix} = -1 \quad\text{(second element of } z\text{)}$$
(b) Taking instead the orthonormal set $x_1 = \frac{1}{\sqrt{2}}\binom{1}{1}$, $x_2 = \frac{1}{\sqrt{2}}\binom{1}{-1}$, we have $z = \alpha_1 x_1 + \alpha_2 x_2$ with
$$\alpha_1 = x_1^Tz = \left(\tfrac{1}{\sqrt{2}}\;\; \tfrac{1}{\sqrt{2}}\right)\begin{pmatrix} 1 \\ -1 \end{pmatrix} = 0$$
$$\alpha_2 = x_2^Tz = \left(\tfrac{1}{\sqrt{2}}\;\; -\tfrac{1}{\sqrt{2}}\right)\begin{pmatrix} 1 \\ -1 \end{pmatrix} = \sqrt{2}$$
Let {x1 , x2 , . . . , xn } be a set of n linearly independent column vectors and {y∗1 , y∗2 , . . . , y∗n }
be another set of n linearly independent row vectors. Suppose that
$$y_j^*x_i = 0 \quad\text{if } i \neq j \tag{4.32}$$
These vectors satisfy the biorthogonality relation stated above. Now, let z be any vector
and consider the expansion of z in terms of the set {x1 , x2 , x3 , . . . , xn } :
$$z = \sum_{i=1}^{n}\alpha_i x_i \tag{4.33}$$
Premultiplying by $y_j^*$,
$$y_j^*z = \sum_{i=1}^{n}\alpha_i\,y_j^*x_i$$
Again, due to the biorthogonality property, there is only one nonzero term (corre-
sponding to i = j) in the sum and we get
$$\alpha_j = \frac{y_j^*z}{y_j^*x_j} \tag{4.34}$$
and the biorthogonal expansion of z is
$$z = \sum_{j=1}^{n}\left(\frac{y_j^*z}{y_j^*x_j}\right)x_j \tag{4.35}$$
Given the two sets of vectors {xi } and {y∗j }, we can define a transformation that con-
verts the vector z into the vector α through equation (4.34). Given the vector α, we can
recover z through equation (4.33). Thus, we have a transform:
$$z \to \alpha, \qquad \alpha_i = \frac{y_i^*z}{y_i^*x_i}, \tag{4.36}$$
This procedure may be used to decouple and solve many linear equations containing
a square matrix A. This is illustrated in the next section.
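The book's worked examples use Mathematica; as a language-neutral illustration, the biorthogonal transform of equations (4.33)-(4.34) can be sketched in Python/NumPy. The matrix and vector below are arbitrary test data, not taken from the text:

```python
import numpy as np

# Arbitrary test matrix with distinct eigenvalues (illustrative data only)
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])

# Right eigenvectors: columns of X; left eigenvectors: rows of inv(X),
# normalized so that y_j^* x_j = 1
lam, X = np.linalg.eig(A)
Y = np.linalg.inv(X)

z = np.array([1.0, -1.0, 2.0])

# Coordinates of z w.r.t. the eigenvector basis: alpha_j = y_j^* z / (y_j^* x_j)
alpha = Y @ z

# Reconstruction, equation (4.33): z = sum_j alpha_j x_j
z_rec = X @ alpha
```

Because the rows of `inv(X)` are automatically normalized left eigenvectors, the denominators $y_j^*x_j$ in (4.34) are all unity here.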
4.5 Solution of linear equations using eigenvector expansions
$$Au = b \tag{4.38}$$
where A is a square matrix of order n and u, b are n × 1 vectors. First, we consider the case in which rank A = n. Premultiplying equation (4.38) by the eigenrow $y_j^*$ gives
$$y_j^*Au = y_j^*b \;\Rightarrow\; \lambda_j(y_j^*u) = y_j^*b \tag{4.39}$$
[Remark: Taking the dot product of equation (4.38) with $y_j$, or multiplying on the left by $y_j^*$ and using the fact that $y_j^*$ is a left eigenvector of A, decouples the equations.]
Now, equation (4.39) gives
$$\frac{y_j^*u}{y_j^*x_j} = \frac{y_j^*b}{y_j^*x_j}\,\frac{1}{\lambda_j}$$
and the biorthogonal expansion (4.35) then yields
$$u = \sum_{j=1}^{n}\left(\frac{y_j^*b}{y_j^*x_j}\right)\frac{1}{\lambda_j}\,x_j \tag{4.40}$$
is the solution to the linear equations (4.38) in terms of the eigenvalues, eigenvectors
and eigenrows of the matrix A. We shall see later that this formula is also applicable
for many other types of linear equations. A special case of the above result for the case
of a symmetric matrix A with normalized eigenvectors (orthonormal set) is
$$u = \sum_{j=1}^{n}\frac{\langle b, x_j\rangle}{\lambda_j}\,x_j \tag{4.41}$$
where ⟨b, xj ⟩ = xTj b is the standard dot (inner) product. The above solution in terms of
the eigenvector expansion should be compared with that of direct solution methods
(e. g., by Gaussian elimination). When n is large and the eigenvalues are well separated
(e. g., |λ1 | ≪ |λ2 | ≪ |λ3 | ⋅ ⋅ ⋅ ≪ |λn |), only a few terms of the expansion may be sufficient
to compute the solution to the desired accuracy (the extreme case being one eigen-
value being very small in magnitude compared to all others, requiring only one term
in the expansion). In such cases, the eigenvector expansion is more useful than the di-
rect solution method, especially for large values of n, where the number of operations
required to solve the system varies as $\frac{1}{3}n^3$. A second application of the solution in terms
of the eigenfunctions is in the development of the so-called multigrid methods for the
solution of large sparse systems (obtained by discretization of Laplace–Poisson-type
equations).
Consider, for example, the symmetric matrix
$$A = \begin{pmatrix} 4 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 4 \end{pmatrix}$$
Taking $b^T = (1, 1, 1)$ and calculating the three terms in equation (4.41) gives the solution. In this specific case, though the separation between the eigenvalues is not extreme ($\lambda_3/\lambda_1 = 2.09$), the first term still gives the solution to within about 10 % error.
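The 10 % claim is easy to check numerically; the following NumPy sketch (not from the book, whose examples use Mathematica) compares the one-term truncation of (4.41) with the direct solution:

```python
import numpy as np

A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 1.0, 1.0])

# Symmetric matrix: orthonormal eigenvectors, so equation (4.41) applies
lam, X = np.linalg.eigh(A)            # eigenvalues in ascending order

# Full eigenvector-expansion solution and its one-term truncation
u_full = sum((X[:, j] @ b) / lam[j] * X[:, j] for j in range(3))
u_one = (X[:, 0] @ b) / lam[0] * X[:, 0]

u_direct = np.linalg.solve(A, b)
rel_err = np.linalg.norm(u_one - u_direct) / np.linalg.norm(u_direct)
```

For this matrix the single term corresponding to the smallest eigenvalue reproduces the solution to a relative error of roughly 8 %.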
If rank A < n, then the solution given by equation (4.40) needs to be modified. Suppose that rank A = r (1 ≤ r < n). Then we have seen that the homogeneous system Au = 0 has n − r linearly independent solutions, and the inhomogeneous system Au = b has solutions if and only if
y∗j b = 0, j = r + 1, . . . , n, (4.42)
where yj are the linearly independent solutions of the adjoint homogeneous system
A∗ yj = 0, j = r + 1, . . . , n. (4.43)
Equation (4.42) is another way of expressing the consistency of the linear system, i. e.,
if A has rank r (< n), then Au = b is consistent iff b is orthogonal to the adjoint eigen-
vectors corresponding to the zero eigenvalue. In this case, the general solution of
Au = b
is of the form
$$u = \sum_{j=1}^{r}\left(\frac{y_j^*b}{y_j^*x_j}\right)\frac{1}{\lambda_j}\,x_j + \sum_{j=r+1}^{n}c_jx_j = u_p + u_h \tag{4.44}$$
u1 − 2u2 − u4 = b1
−2u1 + 3u2 + 3u3 = b2
−u2 + 3u3 − 2u4 = b3
3u1 − 7u2 + 3u3 − 5u4 = b4 .
We have seen that the rank of the 4 × 4 coefficient matrix is two, implying that it has
two zero eigenvalues. We note that the adjoint homogeneous system
$$A^Ty = \begin{pmatrix} 1 & -2 & 0 & 3 \\ -2 & 3 & -1 & -7 \\ 0 & 3 & 3 & 3 \\ -1 & 0 & -2 & -5 \end{pmatrix}y = 0$$
has the two linearly independent solutions
$$y_1 = \begin{pmatrix} -5 \\ -1 \\ 0 \\ 1 \end{pmatrix}, \qquad y_2 = \begin{pmatrix} -2 \\ -1 \\ 1 \\ 0 \end{pmatrix}$$
Thus, the solvability conditions, equations (4.42) lead to the relations b4 = 5b1 + b2
and b3 = 2b1 + b2 . Equivalently, the system is consistent and has solutions if and only
if b is of the form
$$b = b_1\begin{pmatrix} 1 \\ 0 \\ 2 \\ 5 \end{pmatrix} + b_2\begin{pmatrix} 0 \\ 1 \\ 1 \\ 1 \end{pmatrix}.$$
Taking bT = (0, 1, 1, 1), the general solution of the inhomogeneous system may be writ-
ten as
$$u = c_1\begin{pmatrix} 3 \\ 0 \\ 2 \\ 3 \end{pmatrix} + c_2\begin{pmatrix} 0 \\ 1 \\ -1 \\ -2 \end{pmatrix} + \begin{pmatrix} -2 \\ -1 \\ 0 \\ 0 \end{pmatrix}.$$
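The solvability conditions (4.42) and the general solution (4.44) for this rank-deficient example can be verified directly; the sketch below (NumPy, with arbitrary values for the free constants) is illustrative only:

```python
import numpy as np

A = np.array([[1.0, -2.0, 0.0, -1.0],
              [-2.0, 3.0, 3.0, 0.0],
              [0.0, -1.0, 3.0, -2.0],
              [3.0, -7.0, 3.0, -5.0]])
b = np.array([0.0, 1.0, 1.0, 1.0])

# rank A = 2: two independent solutions of A^T y = 0
y1 = np.array([-5.0, -1.0, 0.0, 1.0])
y2 = np.array([-2.0, -1.0, 1.0, 0.0])

# Solvability conditions (4.42): y_j^T b = 0
cond1, cond2 = y1 @ b, y2 @ b

# Particular + homogeneous pieces of the general solution
u_p = np.array([-2.0, -1.0, 0.0, 0.0])
x_h1 = np.array([3.0, 0.0, 2.0, 3.0])
x_h2 = np.array([0.0, 1.0, -1.0, -2.0])
u = u_p + 0.7 * x_h1 - 1.3 * x_h2   # arbitrary choice of c1, c2
```

Any choice of the constants c1, c2 leaves Au = b satisfied, since the homogeneous vectors lie in the null space of A.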
$$\frac{du}{dt} = Au \tag{4.45}$$
$$u = u_0 \;@\; t = 0 \tag{4.46}$$
Equation (4.45) defines a set of n coupled linear equations while (4.46) gives the initial conditions. For the case of n = 2, equation (4.45) in component form is
$$\frac{du_1}{dt} = a_{11}u_1 + a_{12}u_2, \qquad \frac{du_2}{dt} = a_{21}u_1 + a_{22}u_2$$
$$u_1 = u_{10},\; u_2 = u_{20} \;@\; t = 0.$$
To solve equations (4.45) and (4.46), we use the biorthogonal expansion. Multiplying (4.45) on the left by $y_j^*$ gives
$$y_j^*\frac{du}{dt} = y_j^*Au \;\Rightarrow\; \frac{d}{dt}(y_j^*u) = \lambda_j\,y_j^*u$$
$$\Rightarrow\; y_j^*u = c_je^{\lambda_jt}$$
At t = 0, u = u₀, so $c_j = y_j^*u_0$. Therefore,
$$\frac{y_j^*u}{y_j^*x_j} = \frac{y_j^*u_0}{y_j^*x_j}\,e^{\lambda_jt}$$
and the biorthogonal expansion gives
$$u = \sum_{j=1}^{n}\left(\frac{y_j^*u_0}{y_j^*x_j}\right)e^{\lambda_jt}\,x_j \tag{4.47}$$
with expansion coefficients
$$\hat{c}_j = \frac{y_j^*u_0}{y_j^*x_j}$$
Thus, the solution is a linear combination of terms of the form xj eλj t . Equivalently, the
state of the system at any time is a linear combination of the eigenvectors. For this
reason, the eigenvectors are also called the fundamental modes or basic states of the
system. Note that if u0 = αxi then equation (4.47) simplifies to
u = u0 eλi t
Thus, if the initial state corresponds to one of the eigenvectors then the system will
be in that state at all times. For this reason, the eigenvectors are called invariants for
the flow (or trajectory) defined by equation (4.45). Note also that the reciprocal of the
eigenvalue λj determines the rate of change or the time constant for the system evolu-
tion for this special initial condition.
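The modal expansion (4.47) can be checked against the matrix exponential. The following sketch uses a hypothetical 2 × 2 matrix (not from the text) and SciPy's `expm` as the reference:

```python
import numpy as np
from scipy.linalg import expm

# Illustrative matrix with distinct eigenvalues (not from the text)
A = np.array([[-2.0, 1.0], [1.0, -3.0]])
u0 = np.array([1.0, 2.0])
t = 0.7

lam, X = np.linalg.eig(A)
Y = np.linalg.inv(X)                      # rows are normalized eigenrows

# Equation (4.47): u(t) = sum_j (y_j^* u0) e^{lam_j t} x_j
u_modes = X @ (np.exp(lam * t) * (Y @ u0))

# Reference solution via the matrix exponential
u_ref = expm(A * t) @ u0
```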
Next, consider the inhomogeneous problem
$$\frac{du}{dt} = Au + b, \quad t > 0; \qquad u = u_0 \;@\; t = 0$$
We first determine the steady-state solution by setting the time derivative to zero. Assuming that A is invertible,
$$Au_s + b = 0 \;\Rightarrow\; u_s = -A^{-1}b$$
Define $z = u - u_s$. Then
$$\frac{dz}{dt} = A(u_s + z) + b = Az, \qquad z = z_0 \;@\; t = 0;\quad z_0 = u_0 - u_s$$
If $\lambda_n > 0$, the term containing $e^{\lambda_nt}$ will be the dominant term in the solution as it increases without bound for t → ∞. If $\lambda_n < 0$ (and hence is smaller in magnitude than all other $\lambda_i$), then the term containing $e^{\lambda_nt}$ will determine the time taken by the system to approach the steady state. All other terms decay to zero more rapidly than this term. Thus, the time constant of the system (the time required for the system to approach steady state to within, say, 5 % deviation, $\approx 3/|\lambda_n|$) is determined by the eigenvalue having the smallest magnitude.
Now consider the case of complex eigenvalues. If $\lambda_j = a_j + ib_j$, then
$$e^{\lambda_jt} = e^{a_jt}(\cos b_jt + i\sin b_jt).$$
If $a_j < 0$, then the approach to steady state is oscillatory. Once again, the eigenvalue with the smallest real part (in absolute value) determines the time constant of the system.
The vibrations of many systems such as coupled point masses and springs, molecules and structures are described by equations of the form
$$M\frac{d^2u}{dt^2} = -Ku \tag{4.50}$$
$$u = u_0 \;@\; t = 0 \quad (\text{initial displacement}) \tag{4.51}$$
$$\frac{du}{dt} = v_0 \;@\; t = 0 \quad (\text{initial velocity}) \tag{4.52}$$
where M is an n × n matrix (called the inertia matrix), K is an n × n matrix (called the stiffness matrix or matrix of spring constants) and u is the displacement vector. We assume that M is invertible and let $M^{-1}K = A$, so that
$$\frac{d^2u}{dt^2} = -Au \tag{4.53}$$
Premultiplying by $y_j^*$, an eigenrow of A,
$$y_j^*\frac{d^2u}{dt^2} = -y_j^*Au \;\Rightarrow\; \frac{d^2}{dt^2}(y_j^*u) = -\lambda_j(y_j^*u)$$
The general solution of this scalar equation is $y_j^*u = c_{1j}\sin\sqrt{\lambda_j}\,t + c_{2j}\cos\sqrt{\lambda_j}\,t$, and the initial conditions give
$$u = u_0 \;@\; t = 0 \;\Rightarrow\; c_{2j} = y_j^*u_0, \qquad \frac{du}{dt} = v_0 \;@\; t = 0 \;\Rightarrow\; c_{1j}\sqrt{\lambda_j} = y_j^*v_0$$
Therefore,
$$y_j^*u = (y_j^*v_0)\frac{\sin\sqrt{\lambda_j}\,t}{\sqrt{\lambda_j}} + (y_j^*u_0)\cos\sqrt{\lambda_j}\,t$$
This is the formal solution to the initial value problem defined by equations (4.50) to
(4.52).
For the special initial condition $u_0 = \alpha x_i$, $v_0 = 0$, only the i-th mode is excited and $u = \alpha x_i\cos\sqrt{\lambda_i}\,t$, i.e., the system oscillates in the single mode $x_i$ with frequency $\sqrt{\lambda_i}$.
Similarly, the solution of the forced problem
$$\frac{d^2u}{dt^2} = -Au + b, \qquad u = u_0,\; \frac{du}{dt} = v_0 \;@\; t = 0$$
is obtained by applying the above formulas to $z = u - u_s$, where
$$u_s = A^{-1}b \quad\text{and}\quad z_0 = u_0 - u_s.$$
The problem of diffusion and reaction in a catalyst pore (with no radial and only longitudinal gradients) or a slab (or plate) of catalyst is described by the vector boundary value problem
$$D\frac{d^2c}{ds^2} = Kc, \qquad 0 < s < L$$
Here, K is the matrix of first-order rate constants, c is the vector of species concentrations and D is a (positive definite) matrix of diffusivities. Define
$$\xi = \frac{s}{L}, \qquad A = D^{-1}KL^2 \;(= \Phi^2;\; \Phi = \text{Thiele matrix})$$
so that
$$\frac{d^2c}{d\xi^2} = Ac \tag{4.58}$$
$$c = c_0 \;@\; \xi = 0 \tag{4.59}$$
$$\frac{dc}{d\xi} = 0 \;@\; \xi = 1 \tag{4.60}$$
Let {λ1 , λ2 , . . . , λn } be the eigenvalues, {x1 , x2 , . . . , xn } be the eigenvectors and {y∗1 , y∗2 , . . . ,
y∗n } be the eigenrows of A, respectively. Then, from equation (4.58) we obtain after
premultiplying by y∗j ,
$$\frac{d^2}{d\xi^2}(y_j^*c) = y_j^*Ac = \lambda_j(y_j^*c)$$
The general solution may be written as
$$y_j^*c = \alpha_{1j}\cosh\sqrt{\lambda_j}\,\xi + \alpha_{2j}\cosh\sqrt{\lambda_j}\,(1 - \xi)$$
The boundary condition at ξ = 1 gives $\alpha_{1j} = 0$, while the condition at ξ = 0 gives $\alpha_{2j} = (y_j^*c_0)/\cosh\sqrt{\lambda_j}$, so that
$$y_j^*c = (y_j^*c_0)\,\frac{\cosh\sqrt{\lambda_j}\,(1-\xi)}{\cosh\sqrt{\lambda_j}}$$
The quantity of practical interest is the average (or observed) reaction rate vector in the pore,
$$r_{obs} = \frac{1}{L}\int_0^L Kc(s)\,ds = \frac{D}{L^2}\left(-\frac{dc}{d\xi}\Big|_{\xi=0}\right) = \frac{D}{L^2}\sum_{j=1}^{n}\frac{y_j^*c_0}{y_j^*x_j}\left(\sqrt{\lambda_j}\tanh\sqrt{\lambda_j}\right)x_j \tag{4.62}$$
This may be written as
$$r_{obs} = K^*c_0 \tag{4.63}$$
where
$$K^* = \frac{D}{L^2}\sum_{j=1}^{n}\left(\sqrt{\lambda_j}\tanh\sqrt{\lambda_j}\right)\frac{x_jy_j^*}{y_j^*x_j} = \frac{D}{L^2}\sqrt{D^{-1}KL^2}\,\tanh\left(\sqrt{D^{-1}KL^2}\right) \tag{4.64}$$
The second equality follows from the spectral theorem to be discussed in the next
chapter. In terms of the Thiele matrix, equation (4.64) may be expressed as
$$K^* = \frac{D}{L^2}\,\Phi\tanh(\Phi) \tag{4.65}$$
It follows from equation (4.64) that when the pore diffusional effects are negligible (L → 0) the diffusion-disguised rate constant matrix is equal to the true rate constant matrix ($K^* = K$), while for the case of strong pore diffusional limitations (L → ∞), or more precisely $|\lambda_j| \gg 1$ for all j, we have $K^* = \frac{1}{L}\sqrt{DK}$. (Remark: The square root of a matrix is uniquely defined only for positive definite matrices. See Chapter 7 for further details.)
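The matrix function in (4.64) and its two limits can be evaluated through the eigendecomposition. The sketch below uses hypothetical data (unit diffusivities D = I and a symmetric K, none of which come from the text):

```python
import numpy as np

def k_star(D, K, L):
    """Diffusion-disguised rate constant matrix (D/L^2) Phi tanh(Phi),
    with Phi^2 = A = D^{-1} K L^2, evaluated via the eigendecomposition."""
    A = np.linalg.inv(D) @ K * L**2
    lam, X = np.linalg.eig(A)
    f = np.sqrt(lam) * np.tanh(np.sqrt(lam))      # sqrt(lam) tanh(sqrt(lam))
    return (D / L**2) @ X @ np.diag(f) @ np.linalg.inv(X)

# Hypothetical data (not from the text)
D = np.eye(2)
K = np.array([[3.0, 1.0], [1.0, 2.0]])

K_true_limit = k_star(D, K, 1e-4)   # L -> 0: K* -> K (no diffusional disguise)
```

For very large L the same function approaches $\frac{1}{L}\sqrt{DK}$, the strong-diffusion-limitation asymptote quoted above.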
4.6 Diagonalization of matrices and similarity transforms
$$B = TAT^{-1} \tag{4.66}$$
$$A = T^{-1}BT \tag{4.67}$$
2. Similar matrices have the same eigenvalues. To prove this, let λ be an eigenvalue of A and x be the corresponding eigenvector. Then
$$Ax = \lambda x \tag{4.68}$$
Substituting $A = T^{-1}BT$,
$$T^{-1}BTx = \lambda x \;\Rightarrow\; B(Tx) = \lambda(Tx)$$
or
$$By = \lambda y, \qquad y = Tx \tag{4.69}$$
Consider the 3-tank interacting system shown in Figure 4.3. The model describing the
transient behavior of this system may be expressed as
$$V_T\frac{dc}{dt'} = Qc, \tag{4.70}$$
where $V_T$ is the diagonal capacitance matrix (of tank volumes) and Q is the (symmetric) matrix of exchange flows. With proper normalization of time and dividing by the respective capacitances, the dimensionless form of the model may be written as
$$\frac{dc}{dt} = Ac, \quad t > 0 \quad\text{and}\quad c = c_0 \;@\; t = 0, \tag{4.71}$$
where
$$A = \begin{pmatrix} -2 & 1 & 1 \\ 1/2 & -1 & 1/2 \\ 1 & 1 & -2 \end{pmatrix}. \tag{4.72}$$
[Note that even though Q is symmetric, the matrix A is not symmetric.] If we label the
tanks differently, i. e., denoting the large tank by subscript 1 as shown in Figure 4.4,
the system is now described by
$$\frac{dc}{dt} = Bc, \quad t > 0 \quad\text{and}\quad c = c_0 \;@\; t = 0, \tag{4.73}$$
where
$$B = \begin{pmatrix} -1 & 1/2 & 1/2 \\ 1 & -2 & 1 \\ 1 & 1 & -2 \end{pmatrix}. \tag{4.74}$$
It can be seen that A and B are similar matrices. They have the same eigenvalues but
different eigenvectors.
The difference between the two cases is flipping of tanks 1 and 2. Thus, if we take
$$T = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} = T^{-1}, \tag{4.75}$$
which is obtained by flipping the first and second rows of the identity matrix $I_3$, then
$$T^{-1}AT = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} -2 & 1 & 1 \\ 1/2 & -1 & 1/2 \\ 1 & 1 & -2 \end{pmatrix}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 1/2 & 1/2 \\ 1 & -2 & 1 \\ 1 & 1 & -2 \end{pmatrix} = B.$$
$$TAT^{-1}y = \lambda y \;\Rightarrow\; By = \lambda y \tag{4.77}$$
In other words, eigenvalues of A and B are the same but eigenvectors are related by
flipping the first and second elements of x. The eigenvalues of A are given by $\lambda_1 = 0$, $\lambda_2 = -2$ and $\lambda_3 = -3$ with corresponding eigenvectors
$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad x_2 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}, \qquad x_3 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}.$$
Similarly, the eigenvalues of B are the same as those of A, but the corresponding eigenvectors have the first and second elements flipped:
$$y_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad y_2 = \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}, \qquad y_3 = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}.$$
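These similarity relations for the two tank labelings are easy to confirm numerically; the following NumPy sketch (illustrative, not from the book's Mathematica notebooks) checks the permutation similarity and the flipped eigenvectors:

```python
import numpy as np

A = np.array([[-2.0, 1.0, 1.0],
              [0.5, -1.0, 0.5],
              [1.0, 1.0, -2.0]])
T = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])   # swaps tank labels 1 and 2; T = T^{-1}

B = T @ A @ T                      # T^{-1} A T, with T its own inverse

# Similar matrices have the same eigenvalues
lamA = np.sort(np.linalg.eigvals(A).real)
lamB = np.sort(np.linalg.eigvals(B).real)

# Eigenvectors are related by y = T x (first two elements flipped)
x3 = np.array([1.0, 0.0, -1.0])    # eigenvector of A for lambda = -3
y3 = T @ x3                        # corresponding eigenvector of B
```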
Remark. Though we use the same symbol, the matrix T in the above example is not
the modal matrix. It relates the two coordinate systems.
Consider again the system
$$\frac{dc}{dt} = Ac, \quad t > 0 \quad\text{and}\quad c = c_0 \;@\; t = 0,$$
and let
$$T = (x_1\; x_2\; x_3) = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \\ 1 & 1 & -1 \end{pmatrix}$$
be the modal matrix, i.e., the matrix whose columns are the eigenvectors $x_j$ of A. Substituting $c = T\hat{c}$,
$$\frac{dc}{dt} = Ac \;\Rightarrow\; T\frac{d\hat{c}}{dt} = AT\hat{c} \;\Rightarrow\; \frac{d\hat{c}}{dt} = T^{-1}AT\hat{c} = \Lambda\hat{c}, \tag{4.79}$$
or, in component form,
$$\frac{d\hat{c}_1}{dt} = 0, \qquad \frac{d\hat{c}_2}{dt} = -2\hat{c}_2, \qquad \frac{d\hat{c}_3}{dt} = -3\hat{c}_3 \tag{4.80}$$
Here, ĉi are the canonical variables in which the original system becomes diagonal.
Thus, the solution can be expressed in these canonical variables as
$$\hat{c}_1 = \hat{c}_{10}, \qquad \hat{c}_2 = \hat{c}_{20}e^{-2t}, \qquad \hat{c}_3 = \hat{c}_{30}e^{-3t}. \tag{4.81}$$
Further, since the rows of T−1 define the left eigenvectors, each left eigenvector defines
a canonical variable:
$$T^{-1} = \begin{pmatrix} y_1^T \\ y_2^T \\ y_3^T \end{pmatrix} = \frac{1}{4}\begin{pmatrix} 1 & 2 & 1 \\ 1 & -2 & 1 \\ 2 & 0 & -2 \end{pmatrix}$$
and
$$\hat{c} = T^{-1}c = \frac{1}{4}\begin{pmatrix} 1 & 2 & 1 \\ 1 & -2 & 1 \\ 2 & 0 & -2 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix},$$
which leads to
$$\hat{c}_1 = \frac{1}{4}(c_1 + 2c_2 + c_3), \qquad \hat{c}_2 = \frac{1}{4}(c_1 - 2c_2 + c_3), \qquad \hat{c}_3 = \frac{1}{2}(c_1 - c_3). \tag{4.82}$$
2
Similarly,
$$\hat{c}_{10} = \frac{1}{4}(c_{10} + 2c_{20} + c_{30}), \qquad \hat{c}_{20} = \frac{1}{4}(c_{10} - 2c_{20} + c_{30}), \qquad \hat{c}_{30} = \frac{1}{2}(c_{10} - c_{30})$$
Thus, we can also write
$$c(t) = \hat{c}_{10}x_1 + \hat{c}_{20}e^{-2t}x_2 + \hat{c}_{30}e^{-3t}x_3.$$
These relations also define the initial conditions in the phase space for which the tran-
sient behavior is confined to a subspace. For example, if c10 = c30 , the concentration
vector at any time is in the subspace spanned by the first two eigenvectors. Since the
system is decoupled in the variables ĉj , we refer to them as the “canonical variables.”
When A is real and symmetric, the eigenvalues are real and the eigenvectors can be normalized to have unit length. They form an orthonormal set, i.e.,
$$x_i^Tx_j = \delta_{ij}$$
In this case the modal matrix T is orthogonal ($T^{-1} = T^T$) and represents either a rotation of the axes, i.e., the two coordinate systems are related by a pure rotation of the axes (about the origin), or a combination of rotations and reflections.
For example, with
$$T = (x_1\; x_2) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \tag{4.83}$$
we have
$$T^{-1} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = T^T. \tag{4.84}$$
The system
$$\frac{dc}{dt} = Ac \tag{4.85}$$
can be converted into canonical form as discussed earlier by using the transform $c = T\hat{c} \Rightarrow \hat{c} = T^{-1}c$, which gives
$$\hat{c}_1 = \frac{c_1 + c_2}{\sqrt{2}}, \qquad \hat{c}_2 = \frac{c_1 - c_2}{\sqrt{2}}.$$
Figure 4.5: Canonical transform and rotation of axes for initial value problems.
The matrix
$$\mathrm{Rot}(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \tag{4.86}$$
represents a (counterclockwise) rotation about the origin by an angle θ, while the matrix
$$\mathrm{Ref}(\phi) = \begin{pmatrix} \cos 2\phi & \sin 2\phi \\ \sin 2\phi & -\cos 2\phi \end{pmatrix} \tag{4.87}$$
represents a reflection about a line through the origin that makes an angle ϕ with the x-axis. The set of all 2 × 2 orthogonal matrices describing rotations and reflections forms the orthogonal group, denoted by O(2). As discussed in Part II, orthogonal or unitary matrices represent rotations and reflections.
Problems
1. (a) Given the matrix
$$A = \begin{pmatrix} -3 & 0 & 2 \\ 1 & -2 & 1 \\ 1 & 1 & -4 \end{pmatrix}$$
(b) Given the matrix
$$A = \begin{pmatrix} -1 & 2 & 2 \\ -5 & -3 & -1 \\ -3 & -2 & -2 \end{pmatrix}$$
2. Given the matrix
$$A = \begin{pmatrix} -3 & 0 & 2 \\ 1 & -2 & 1 \\ 1 & 1 & -4 \end{pmatrix}$$
solve the initial value problems
$$\frac{du}{dt} = Au; \qquad u(t = 0) = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$$
$$\frac{d^2y}{dt^2} = Ay; \qquad y(0) = \begin{pmatrix} 2 \\ 2 \\ 0 \end{pmatrix}, \qquad \frac{dy}{dt}(0) = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
(a) Formulate the differential equations that describe the evolution of the con-
centration vector.
(b) Determine the eigenvalues and eigenvectors of the matrix in (a).
(c) Give a physical interpretation to the eigenvalues and eigenvectors (Hint: Con-
sider what happens when the initial concentration vector is equal to the equi-
librium vector plus a constant multiple of one of the other two eigenvectors).
4. Consider the vibration of the spring-mass system shown in Figure 4.7. Suppose
that the masses are equal and the springs are identical.
(a) Formulate Newton's equations of motion and cast them in dimensionless form.
(b) Determine the eigenvalues and eigenvectors of the matrix and give a physical
interpretation. Sketch the different modes of vibration.
(c) Generalize the above results to N equal masses with identical springs.
5. The thermal conductivity tensor of an anisotropic solid is given by
$$K = \begin{pmatrix} 6 & 2 & -2 \\ 2 & 6 & -2 \\ -2 & -2 & 10 \end{pmatrix}$$
(a) Determine the principal conductivities (eigenvalues) and the principal axes of conductivity (eigenvectors).
(b) Write the expanded form of the heat conduction equation
$$\rho c\,\frac{\partial T}{\partial t} = \nabla\cdot(K\,\nabla T)$$
and compare it with the isotropic case
$$\rho c\,\frac{\partial T}{\partial t} = k\,\nabla^2T$$
where k is a scalar.
6. (a) A real square matrix A is called idempotent if A2 = A. Show that the eigenval-
ues of an idempotent matrix are equal to either 0 or 1.
(b) A real square matrix A is called orthogonal if $A^{-1} = A^T$ (the transpose of A). Show that the real eigenvalues of an orthogonal matrix are equal to ±1. Show also that the complex eigenvalues (if any) of an orthogonal matrix have an absolute value of unity.
Remark. The above special matrices play a very important role in group theory,
tensor analysis, numerical computation of eigenvalues, etc.
7. Let A and B be k × k matrices and define
$$L_1 = \begin{pmatrix} A & B \\ B & A \end{pmatrix}, \qquad L_2 = \begin{pmatrix} A & B & B \\ B & A & B \\ B & B & A \end{pmatrix}$$
(a) Show that the eigenvalues of the 2k × 2k matrix L1 are the same as those of
k × k matrices (A + B) and (A − B).
(b) Show that the eigenvalues of the 3k × 3k matrix L2 are the same as those of
(A + 2B) and (A − B) (repeated twice).
(c) Discuss how you would find the eigenvectors of L1 and L2 from the lower-
dimensional matrices.
Remark. Matrices of the above type appear in the transient analysis of coupled
systems such as cells (catalyst particles), reactors, distillation columns, etc.
8. Consider the system
$$C\frac{du}{dt} = Au, \qquad u(t = 0) = u_0,$$
written in component form (with a singular capacitance matrix C) as
$$0 = u_1 + 2u_2 + u_3$$
$$\frac{du_2}{dt} = u_2 + 6u_3$$
$$\frac{du_3}{dt} = -u_1 + u_2 + 3u_3.$$
9. Circulant matrices play an important role in digital signal processing and in the
computation of discrete and fast Fourier transforms. A circulant is a constant-diagonal matrix with the special form
$$A = \begin{pmatrix} f_0 & f_1 & f_2 & \cdots & f_{n-1} \\ f_{n-1} & f_0 & f_1 & \cdots & f_{n-2} \\ f_{n-2} & f_{n-1} & f_0 & \cdots & f_{n-3} \\ \vdots & & & \ddots & \vdots \\ f_1 & f_2 & f_3 & \cdots & f_0 \end{pmatrix}$$
Show that the eigenvalues of A are given by
$$\lambda = f_0 + f_1\omega + f_2\omega^2 + \cdots + f_{n-1}\omega^{n-1},$$
where ω is any of the n roots of $\omega^n = 1$.
Fc = f.
Show that the discrete transform (and its inverse) can be computed by a sim-
ple matrix multiplication. Write the explicit form of these formulas (for the
transform and its inverse) in terms of the components of the Fourier matrix.
Note: The discrete transform is extremely useful in applications. Its computa-
tion involves n2 multiplications. However, by taking advantage of the special
structure of the Fourier matrix and choosing n as a power of 2, we can reduce
the number of multiplications to n log2 n. This is the basis for the fast Fourier
transform.
10. Consider the initial value problem
$$\frac{du}{dt} = Au, \qquad u(t = 0) = u_0$$
where A is a n × n matrix with real elements and u is a n × 1 vector. Suppose that all
the eigenvalues of A are simple. (a) Write the general form of the solution to the
initial value problem in terms of eigenvalues, eigenvectors and eigenrows of A.
(b) Discuss the asymptotic form (t → ∞) of the solution if (i) all eigenvalues have
negative real part, (ii) one zero eigenvalue while all others have negative real part
and (iii) if A has a pair of purely imaginary eigenvalues while all others have neg-
ative real part.
11. Consider the linear system Ax = b, where A is an m × n matrix and x, b are n × 1
and m×1 vectors, respectively. Reason that a necessary and sufficient condition for
the system Ax = b to have a solution is that every solution of the system y∗ A = 0
(where y∗ is a row vector) should also satisfy y∗ b = 0.
12. A square matrix A is called normal if AA* = A*A. Show that the eigenvectors of a normal matrix corresponding to distinct eigenvalues are orthogonal.
13. Suppose that $A = X\Lambda X^{-1}$, where Λ is a diagonal matrix. Find a similarity transformation Y that diagonalizes the matrix
$$B = \begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix}.$$
What is the diagonal form of B? Here, all matrices except B are n × n while B is 2n × 2n.
14. (a) The same similarity transformation diagonalizes both matrices A and B. Show that A and B must commute. (b) Two Hermitian matrices A and B have the same eigenvalues. Show that A and B are related by a unitary similarity transformation.
15. The transient response of a two-phase system (containing a solid and a fluid) is described by the equation
$$\frac{dc}{dt} = Ac + b(c, t) \tag{1}$$
where
$$c = \begin{pmatrix} c_f \\ c_s \end{pmatrix} \;\text{(subscript } f \text{ refers to the fluid and } s \text{ to the solid)}, \qquad A = \begin{pmatrix} -\dfrac{1}{\epsilon_f} & \dfrac{1}{\epsilon_f} \\ \dfrac{1}{1-\epsilon_f} & -\dfrac{1}{1-\epsilon_f} \end{pmatrix}, \qquad b = \begin{pmatrix} b_1(c_f, c_s, t) \\ b_2(c_f, c_s, t) \end{pmatrix},$$
$\epsilon_f$ = volume fraction/capacitance of the fluid phase and b is some nonlinear function of c as well as time. (a) Determine the eigenvalues and eigenvectors of A and give a physical interpretation. (b) What is the transformation that will reduce the linear part of (1) to a diagonal form? Perform this transformation and write equation (1) in terms of the canonical variables. Give a physical interpretation of the canonical variables.
16. Consider the linear system Au = b, where A is an n × n matrix and u and b are
n × 1 vectors. Suppose that rank of A = n − 2. (a) Write the conditions the vector b has to satisfy so that the system is consistent and has solutions. (b) Assuming that the conditions in (a) are satisfied, write down the general form of the solution in terms of the eigenvalues, eigenvectors and eigenrows of A. (c) What is the general form of the solution to the initial value problem $\frac{du}{dt} = Au - b$, $u(0) = 0$, where A and b are as above?
17. Consider the 3-tank interacting system with the tanks having different volumes
but the exchange flows being all identical. (a) Formulate the transient model
and cast it in dimensionless form (but without dividing by the capacitances).
(b) Rewrite the model in (a) by dividing each equation by the respective capaci-
tance term and reason that the resulting model may be interpreted as three equal
sized tanks but with different exchange flow rates. (c) Give a physical interpreta-
tion of the transient system obtained when the matrix appearing in (b) is replaced
by its transpose (or adjoint matrix).
5 Solution of linear equations containing a square matrix
While the biorthogonal expansions were useful to express the solution of linear equa-
tions in terms of eigenvectors and eigenrows, they may not be convenient for numer-
ical calculations, especially when the order of the matrix is large. In this chapter, we
discuss other important properties of square matrices and alternate methods for deter-
mining the solutions of linear equations containing a square matrix A. The methods
discussed here also give a procedure to calculate functions defined on square matri-
ces.
5.1 Cayley–Hamilton theorem

Theorem. Every square matrix satisfies its own characteristic equation, i.e.,
$$P_n(A) = 0$$
where
$$P_n(\lambda) = |A - \lambda I| = (-\lambda)^n + a_1(-\lambda)^{n-1} + a_2(-\lambda)^{n-2} + \cdots + a_{n-1}(-\lambda) + a_n$$
As an example, for
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$$
the characteristic polynomial is
$$P_2(\lambda) = \begin{vmatrix} 1-\lambda & 2 \\ 3 & 4-\lambda \end{vmatrix} = \lambda^2 - 5\lambda - 2$$
and the theorem asserts that
$$A^2 - 5A - 2I = 0.$$
To verify this directly,
$$A^2 = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 7 & 10 \\ 15 & 22 \end{pmatrix}$$
$$5A + 2I = \begin{pmatrix} 5 & 10 \\ 15 & 20 \end{pmatrix} + \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 7 & 10 \\ 15 & 22 \end{pmatrix}$$
$$A^2 - 5A - 2I = \begin{pmatrix} 7 & 10 \\ 15 & 22 \end{pmatrix} - \begin{pmatrix} 7 & 10 \\ 15 & 22 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = 0$$
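The Cayley–Hamilton identity for this matrix can be checked in a few lines of NumPy (an illustrative sketch; the book itself works in Mathematica):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
I = np.eye(2)

# Characteristic polynomial of A: P2(lambda) = lambda^2 - 5 lambda - 2
coeffs = np.poly(A)                  # coefficients [1, -5, -2]

# Cayley-Hamilton: A satisfies its own characteristic equation
residual = A @ A - 5.0 * A - 2.0 * I
```

`np.poly` applied to a square array returns the characteristic polynomial coefficients directly, which is a convenient way to generalize this check to larger matrices.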
Proof. We start from the identity
$$(A - \lambda I)\,\mathrm{adj}(A - \lambda I) = |A - \lambda I|\,I = P_n(\lambda)I \tag{5.2}$$
where adj(A − λI) is the classical adjoint of (A − λI). As seen in Chapter 2, this is the matrix formed from the (n − 1) × (n − 1) determinants of the minors of (A − λI). Thus, each element of the matrix adj(A − λI) is at most a polynomial of degree (n − 1) in λ and, by collecting the coefficients of the various powers of λ, we can write
$$\mathrm{adj}(A - \lambda I) = B_0\lambda^{n-1} + B_1\lambda^{n-2} + \cdots + B_{n-2}\lambda + B_{n-1} \tag{5.3}$$
where $B_0, B_1, \ldots, B_{n-1}$ are n × n constant matrices. Substituting the expansion of $P_n(\lambda)$ and (5.3) in (5.2) gives
$$(A-\lambda I)(B_0\lambda^{n-1} + B_1\lambda^{n-2} + \cdots + B_{n-2}\lambda + B_{n-1}) = [(-\lambda)^n + a_1(-\lambda)^{n-1} + a_2(-\lambda)^{n-2} + \cdots + a_n]I \tag{5.4}$$
Equating the coefficients of like powers of λ on both sides of (5.4) and postmultiplying each equation by the appropriate power of A gives
$$-A^nB_0 = (-1)^nA^n$$
$$A^nB_0 - A^{n-1}B_1 = (-1)^{n-1}a_1A^{n-1}$$
$$A^{n-1}B_1 - A^{n-2}B_2 = (-1)^{n-2}a_2A^{n-2}$$
$$\vdots$$
$$A^2B_{n-2} - AB_{n-1} = (-1)a_{n-1}A$$
$$AB_{n-1} = a_nI$$
Adding these equations, the left-hand sides telescope to zero, and the right-hand sides sum to $P_n(A)$, so that
$$P_n(A) = 0 \tag{5.10}$$
5.2 Functions of matrices

The Cayley–Hamilton theorem implies that
$$A^n = \alpha_1A^{n-1} + \alpha_2A^{n-2} + \cdots + \alpha_nI \tag{5.11}$$
for some constants $\alpha_i$. Multiplying (5.11) by A and using it again to eliminate $A^n$,
$$A^{n+1} = \alpha_1A^n + \alpha_2A^{n-1} + \cdots + \alpha_nA = \alpha_1(\alpha_1A^{n-1} + \cdots + \alpha_nI) + \alpha_2A^{n-1} + \cdots + \alpha_nA = \beta_1A^{n-1} + \beta_2A^{n-2} + \cdots + \beta_nI \tag{5.12}$$
where $\beta_1 = \alpha_1^2 + \alpha_2, \ldots, \beta_n = \alpha_1\alpha_n$ are some constants. It follows from equation (5.12) that $A^{n+1}$ can be expressed as a polynomial of degree (n − 1) in A. Continuing this procedure, we see that if
$$Q(\lambda) = \sum_{i=0}^{\infty}c_i\lambda^i \tag{5.13}$$
is any function that has a Maclaurin series expansion, then Q(A) can be expressed as a polynomial of degree (n − 1) in A.
Now, suppose that A is invertible, i.e., A⁻¹ exists. Then, multiplying both sides of (5.11) by A⁻¹, we get
$$A^{n-1} = \alpha_1A^{n-2} + \cdots + \alpha_{n-1}I + \alpha_nA^{-1}$$
Assuming that $\alpha_n \neq 0$ (this is the case if A is invertible), we can rewrite the above equation as
$$A^{-1} = \frac{1}{\alpha_n}\left(A^{n-1} - \alpha_1A^{n-2} - \cdots - \alpha_{n-1}I\right) \tag{5.14}$$
Thus, negative powers of A can also be expressed as polynomials of degree (n − 1) in A, and any function representable by a power (or Laurent) series may be written as
$$f(A) = c_1A^{n-1} + c_2A^{n-2} + \cdots + c_nI \tag{5.15}$$
for some constants $\{c_1, c_2, \ldots, c_n\}$. This result can be used to extend the definition of familiar scalar functions to functions of matrices as well as to compute them. For example, we define the exponential matrix function by the series
$$e^A = \sum_{i=0}^{\infty}\frac{A^i}{i!} = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots \tag{5.16}$$
Similarly,
$$\sin A = \sum_{i=0}^{\infty}\frac{(-1)^iA^{2i+1}}{(2i+1)!} = A - \frac{A^3}{3!} + \frac{A^5}{5!} - \cdots \tag{5.17}$$
It follows from Cayley–Hamilton theorem that the above matrix functions can be ex-
pressed as polynomials of degree (n − 1) in A. These functions may also be computed if we can evaluate the coefficients in equation (5.15).
To determine these coefficients, let $\lambda_j$ be an eigenvalue of A with eigenvector $x_j$:
$$Ax_j = \lambda_jx_j \;\Rightarrow\; A^2x_j = \lambda_jAx_j = \lambda_j^2x_j$$
and, in general,
$$A^kx_j = \lambda_j^kx_j$$
Similarly, for invertible A, $A^{-1}x_j = \lambda_j^{-1}x_j$. Thus, if
$$f(x) = \sum_{k=-\infty}^{\infty}c_kx^k$$
then
$$f(A)x_j = f(\lambda_j)x_j$$
Now, express f(A) through the Cayley–Hamilton theorem as in equation (5.15) and apply both sides to $x_j$:
$$f(\lambda_j)x_j = (c_1\lambda_j^{n-1} + c_2\lambda_j^{n-2} + \cdots + c_n)x_j$$
Since $x_j \neq 0$,
$$f(\lambda_j) = c_1\lambda_j^{n-1} + c_2\lambda_j^{n-2} + \cdots + c_n, \qquad j = 1, 2, \ldots, n \tag{5.22}$$
Thus, the n constants $(c_1, \ldots, c_n)$ can be determined by solving the linear equations defined by equation (5.22). If an eigenvalue $\lambda_i$ is repeated r times (r ≥ 1), then we supplement (5.22) with the derivative relations
$$f^{(k)}(\lambda_i) = \frac{d^k}{d\lambda^k}\left(c_1\lambda^{n-1} + \cdots + c_n\right)\Big|_{\lambda=\lambda_i}, \qquad k = 1, \ldots, r - 1.$$
Example 5.2. Develop a formula for f(A) when A is a 2 × 2 matrix with distinct eigenvalues.
From the Cayley–Hamilton theorem, we can write
$$f(A) = c_1A + c_2I,$$
where the constants follow from
$$f(\lambda_1) = c_1\lambda_1 + c_2, \qquad f(\lambda_2) = c_1\lambda_2 + c_2$$
Solving, $c_1 = \dfrac{f(\lambda_1) - f(\lambda_2)}{\lambda_1 - \lambda_2}$ and $c_2 = \dfrac{\lambda_1f(\lambda_2) - \lambda_2f(\lambda_1)}{\lambda_1 - \lambda_2}$. Thus, for any 2 × 2 matrix with distinct eigenvalues we obtain the formula
$$f(A) = \frac{f(\lambda_1) - f(\lambda_2)}{\lambda_1 - \lambda_2}A + \frac{\lambda_1f(\lambda_2) - \lambda_2f(\lambda_1)}{\lambda_1 - \lambda_2}I$$
Example 5.3. Develop a formula for f(A) when A is a 2 × 2 matrix with repeated eigenvalues.
Now, the constants $c_1$ and $c_2$ are given by
$$f(\lambda_1) = c_1\lambda_1 + c_2, \qquad f'(\lambda_1) = c_1 \;\Rightarrow\; c_2 = f(\lambda_1) - \lambda_1f'(\lambda_1)$$
so that
$$f(A) = f'(\lambda_1)A + \left[f(\lambda_1) - \lambda_1f'(\lambda_1)\right]I$$
Example 5.4. Develop a formula for f(At) when A is a 2 × 2 matrix with distinct eigenvalues.
First, we note that if λ₁ and λ₂ are eigenvalues of A, then λ₁t and λ₂t are eigenvalues of At. This follows from the fact that
$$(At)x_i = t(Ax_i) = (\lambda_it)x_i,$$
where the last equality uses the fact that $\lambda_i$ is an eigenvalue of A with eigenvector $x_i$. Now replace λ₁ by λ₁t, λ₂ by λ₂t and A by At in the formula of Example 5.2. For $f(x) = e^x$, this gives
$$e^{At} = \frac{e^{\lambda_1t} - e^{\lambda_2t}}{\lambda_1 - \lambda_2}A + \frac{\lambda_1e^{\lambda_2t} - \lambda_2e^{\lambda_1t}}{\lambda_1 - \lambda_2}I = e^{\lambda_1t}\left[\frac{1}{\lambda_1 - \lambda_2}A - \frac{\lambda_2}{\lambda_1 - \lambda_2}I\right] + e^{\lambda_2t}\left[-\frac{1}{\lambda_1 - \lambda_2}A + \frac{\lambda_1}{\lambda_1 - \lambda_2}I\right]$$
This property may be used to write the solution of many vector differential equations
containing a square matrix A with constant coefficients in terms of functions of A using
the solution for the scalar case. This is illustrated in this section with several examples.
$$\frac{du}{dt} = Au, \qquad u = u_0 \;@\; t = 0 \tag{5.26}$$
We claim that
u = eAt u0 (5.27)
is a solution to the above initial value problem. To prove this, first we verify the initial condition: at t = 0, $e^{A\cdot0} = I$, so $u(0) = u_0$. Differentiating,
$$\frac{du}{dt} = e^{At}Au_0 = Ae^{At}u_0 = Au \quad (\text{since } A \text{ commutes with } e^{At})$$
Thus, the expression given by equation (5.27) is the solution to the initial value prob-
lem defined by (5.26).
$$\frac{d^2u}{dt^2} = -Au, \tag{5.28}$$
$$u(0) = u_0, \qquad \frac{du}{dt}(0) = v_0 \tag{5.29}$$
Using the Cayley–Hamilton theorem, the general solution of equation (5.28) may be written as
$$u = \cos(\sqrt{A}\,t)\,c_1 + \sin(\sqrt{A}\,t)\,c_2$$
The constant vectors $c_1$ and $c_2$ may be determined from the initial conditions:
$$u(0) = u_0 \;\Rightarrow\; c_1 = u_0$$
$$\frac{du}{dt} = -\sqrt{A}\sin(\sqrt{A}\,t)c_1 + \sqrt{A}\cos(\sqrt{A}\,t)c_2$$
$$\frac{du}{dt}(0) = v_0 \;\Rightarrow\; \sqrt{A}\,c_2 = v_0 \;\Rightarrow\; c_2 = (\sqrt{A})^{-1}v_0$$
Thus,
$$u(t) = \cos(\sqrt{A}\,t)\,u_0 + \sin(\sqrt{A}\,t)(\sqrt{A})^{-1}v_0$$
is a formal solution. Note that $\cos\sqrt{A}\,t$ and $[\sin\sqrt{A}\,t](\sqrt{A})^{-1}$ contain only integral powers of A, and hence are polynomials of degree (n − 1) in A.
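The formal solution $u(t) = \cos(\sqrt{A}\,t)u_0 + \sin(\sqrt{A}\,t)(\sqrt{A})^{-1}v_0$ can be evaluated through the eigendecomposition and checked against the differential equation by finite differences. The matrix below is a hypothetical symmetric positive definite example, not from the text:

```python
import numpy as np

# Illustrative symmetric positive definite A (not from the text)
A = np.array([[2.0, -1.0], [-1.0, 2.0]])
u0 = np.array([1.0, 0.0])
v0 = np.array([0.0, 1.0])

lam, X = np.linalg.eigh(A)
Xi = X.T                                  # X orthogonal: inverse = transpose

def u(t):
    """u(t) = cos(sqrt(A) t) u0 + sin(sqrt(A) t) (sqrt(A))^{-1} v0,
    evaluated through the eigendecomposition of A."""
    w = np.sqrt(lam)
    cos_part = X @ np.diag(np.cos(w * t)) @ Xi @ u0
    sin_part = X @ np.diag(np.sin(w * t) / w) @ Xi @ v0
    return cos_part + sin_part

# Check d^2u/dt^2 = -A u by a centered finite difference
t, h = 1.3, 1e-4
d2u = (u(t + h) - 2 * u(t) + u(t - h)) / h**2
```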
Next, reconsider the boundary value problem
$$\frac{d^2c}{d\xi^2} = \Phi^2c, \quad 0 < \xi < 1; \qquad c = c_0 \;@\; \xi = 0; \qquad \frac{dc}{d\xi} = 0 \;@\; \xi = 1$$
representing diffusion and reaction in a flat plate geometry (see Section 4.5.6). The solution may be written as
$$c(\xi) = [\cosh\Phi\xi]\,\alpha_1 + [\cosh\Phi(1-\xi)]\,\alpha_2$$
Applying the boundary conditions,
$$\frac{dc}{d\xi} = 0 \;@\; \xi = 1 \;\Rightarrow\; \alpha_1 = 0$$
$$c = c_0 \;@\; \xi = 0 \;\Rightarrow\; c_0 = [\cosh\Phi]\,\alpha_2$$
$$\Rightarrow\; c(\xi) = [\cosh\Phi(1-\xi)](\cosh\Phi)^{-1}c_0$$
is the solution. The quantity of practical interest is the average (or observed) reaction rate vector in the pore defined by
$$r_{obs} = \int_0^1 Kc(\xi)\,d\xi = K\Phi^{-1}\tanh(\Phi)\,c_0 = K^*c_0$$
where $\Phi^2 = L^2D^{-1}K$ and $K^*$ is the diffusion-disguised rate constant matrix given by
$$K^* = K\Phi^{-1}\tanh(\Phi) \tag{5.32}$$
The two limiting cases of equation (5.32) can be seen more easily now. For the case of negligible pore diffusional limitations ($\Phi \to 0$), we have $K^* = K$, while for the case of strong pore diffusional limitations ($\Phi \to \infty$), we have $K^* = \frac{1}{L}\sqrt{DK}$.
Consider a polynomial $Q_{n-1}(x)$ of degree (n − 1) and n distinct points $x_1, \ldots, x_n$, and write
$$Q_{n-1}(x) = \sum_{j=1}^{n}c_j\prod_{i=1, i\neq j}^{n}(x - x_i)$$
If we require that the polynomial pass through the point $(x_j, Q_{n-1}(x_j))$, we get
$$c_j = \frac{Q_{n-1}(x_j)}{\prod_{i=1, i\neq j}^{n}(x_j - x_i)} \tag{5.34}$$
Here, the notation $\prod_{i=1, i\neq j}^{n}$ means that the product excludes the term corresponding to i = j. Thus,
$$Q_{n-1}(x) = \sum_{j=1}^{n}Q_{n-1}(x_j)\frac{\prod_{i=1, i\neq j}^{n}(x - x_i)}{\prod_{i=1, i\neq j}^{n}(x_j - x_i)} \tag{5.35}$$
We now convert equation (5.35) into a matrix identity by the following steps: (i) assume that the square matrix A has n distinct eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_n\}$ and replace $x_j$ by $\lambda_j$; (ii) replace x by A on both sides of equation (5.35). This gives
$$Q_{n-1}(A) = \sum_{j=1}^{n}Q_{n-1}(\lambda_j)\frac{\prod_{i=1, i\neq j}^{n}(A - \lambda_iI)}{\prod_{i=1, i\neq j}^{n}(\lambda_j - \lambda_i)} \tag{5.36}$$
Note that the order of the products on the RHS of this equation is immaterial since $(A - \lambda_iI)$ and $(A - \lambda_jI)$ commute. As it stands, equation (5.36) is valid for any polynomial of degree (n − 1) in A. However, we note that any arbitrary polynomial in A and A⁻¹ may be expressed as a polynomial of degree (n − 1) in A (Cayley–Hamilton theorem). Now, if f(λ) is any function of the form
$$f(\lambda) = \sum_{k=-\infty}^{\infty}c_k\lambda^k \tag{5.37}$$
then
$$f(A) = \sum_{j=1}^{n}f(\lambda_j)\frac{\prod_{i=1, i\neq j}^{n}(A - \lambda_iI)}{\prod_{i=1, i\neq j}^{n}(\lambda_j - \lambda_i)} \tag{5.38}$$
This is Sylvester's formula. We can simplify it further by using the identity (see proof below)
$$\prod_{i=1, i\neq j}^{n}(A - \lambda_iI) = (-1)^{n-1}\,\mathrm{adj}(A - \lambda_jI). \tag{5.39}$$
We note that
$$\prod_{i=1, i\neq j}^{n}(\lambda_j - \lambda_i) = (-1)^{n-1}\prod_{i=1, i\neq j}^{n}(\lambda_i - \lambda_j) \tag{5.40}$$
so that
$$f(A) = \sum_{j=1}^{n}f(\lambda_j)\frac{(-1)^{n-1}\,\mathrm{adj}(A - \lambda_jI)}{(-1)^{n-1}\prod_{i=1, i\neq j}^{n}(\lambda_i - \lambda_j)} = \sum_{j=1}^{n}f(\lambda_j)\frac{\mathrm{adj}(A - \lambda_jI)}{\prod_{i=1, i\neq j}^{n}(\lambda_i - \lambda_j)} \tag{5.41}$$
Defining
$$E_j = \frac{\mathrm{adj}(A - \lambda_jI)}{\prod_{i=1, i\neq j}^{n}(\lambda_i - \lambda_j)} \tag{5.42}$$
we may write
$$f(A) = \sum_{j=1}^{n}f(\lambda_j)E_j \tag{5.43}$$
It can be shown that the matrices $E_j$ (j = 1, 2, 3, …, n) have rank one and satisfy the relations
$$E_iE_j = 0,\; i \neq j \qquad\text{and}\qquad \sum_{i=1}^{n}E_i = I \tag{5.44}$$
The matrix Ei is called a projection. We shall prove the above relations as well as estab-
lish other properties of projections in the next section when we deal with the spectral
theorem.
where Φ(y, x) = Φ(x, y) is of degree (n − 1). Convert this scalar polynomial identity to
a matrix identity by letting x = Iλ, y = A ⇒
But
As an example, consider the matrix
$$A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}$$
with eigenvalues
$$\lambda_1 = -1, \qquad \lambda_2 = 3,$$
eigenvectors
$$x_1 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}, \qquad x_2 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$
and eigenrows
$$y_1^* = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{4} \end{pmatrix}, \qquad y_2^* = \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{4} \end{pmatrix}$$
The adjoint matrices are
$$(A - \lambda_1I) = \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix} \;\Rightarrow\; \mathrm{adj}(A - \lambda_1I) = \begin{pmatrix} 2 & -1 \\ -4 & 2 \end{pmatrix}$$
$$(A - \lambda_2I) = \begin{pmatrix} -2 & 1 \\ 4 & -2 \end{pmatrix} \;\Rightarrow\; \mathrm{adj}(A - \lambda_2I) = \begin{pmatrix} -2 & -1 \\ -4 & -2 \end{pmatrix}$$
The projections are
$$E_1 = \frac{\mathrm{adj}(A - \lambda_1I)}{\lambda_2 - \lambda_1} = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{4} \\ -1 & \tfrac{1}{2} \end{pmatrix} = x_1y_1^*$$
$$E_2 = \frac{\mathrm{adj}(A - \lambda_2I)}{\lambda_1 - \lambda_2} = \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{4} \\ 1 & \tfrac{1}{2} \end{pmatrix} = x_2y_2^*$$
Direct multiplication verifies the projection properties:
$$E_1^2 = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{4} \\ -1 & \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{4} \\ -1 & \tfrac{1}{2} \end{pmatrix} = \begin{pmatrix} \tfrac{1}{4} + \tfrac{1}{4} & -\tfrac{1}{8} - \tfrac{1}{8} \\ -\tfrac{1}{2} - \tfrac{1}{2} & \tfrac{1}{4} + \tfrac{1}{4} \end{pmatrix} = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{4} \\ -1 & \tfrac{1}{2} \end{pmatrix} = E_1$$
$$E_2^2 = E_2, \qquad E_1 + E_2 = I$$
$$A = \lambda_1E_1 + \lambda_2E_2, \qquad f(A) = f(\lambda_1)E_1 + f(\lambda_2)E_2$$
For example,
$$A^{100} = (-1)^{100}\begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{4} \\ -1 & \tfrac{1}{2} \end{pmatrix} + (3)^{100}\begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{4} \\ 1 & \tfrac{1}{2} \end{pmatrix}$$
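The projection properties and the spectral evaluation of $A^{100}$ above are quick to verify numerically (an illustrative NumPy sketch, not the book's Mathematica code):

```python
import numpy as np

A = np.array([[1.0, 1.0], [4.0, 1.0]])

l1, l2 = -1.0, 3.0
E1 = np.array([[0.5, -0.25], [-1.0, 0.5]])   # x1 y1^*
E2 = np.array([[0.5, 0.25], [1.0, 0.5]])     # x2 y2^*

# Spectral decomposition: A = l1 E1 + l2 E2
A_spec = l1 * E1 + l2 * E2

# f(A) = f(l1) E1 + f(l2) E2, here with f(x) = x^100
A100 = (-1.0)**100 * E1 + 3.0**100 * E2
A100_direct = np.linalg.matrix_power(A, 100)
```

The spectral route needs only two scalar powers, while `matrix_power` performs repeated matrix multiplications.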
Let A be a square matrix with simple eigenvalue $\lambda_j$. Let $x_j$ and $y_j^*$ be the corresponding eigenvector and eigenrow. Consider the dyadic product of $x_j$ and $y_j^*$ defined by
$$E_j = \frac{x_jy_j^*}{y_j^*x_j} \tag{5.51}$$
[Remark: Note that $y_j^*x_j = \langle x_j, y_j\rangle$ is a scalar and is a normalization constant, while $x_jy_j^*$ is an n × n matrix of rank one.] We now show that $E_j$ is a projection, using the fact that matrix (or vector) multiplication is associative:
$$E_j^2 = \frac{x_jy_j^*x_jy_j^*}{(y_j^*x_j)(y_j^*x_j)} = \frac{x_j(y_j^*x_j)y_j^*}{(y_j^*x_j)(y_j^*x_j)} = \frac{x_jy_j^*}{y_j^*x_j} = E_j \tag{5.52}$$
Now,
\[ E_i u = \sum_{j=1}^{n} \alpha_j \left( \frac{x_i y_i^*}{y_i^* x_i} \right) x_j = \alpha_i x_i \tag{5.55} \]
It follows from equations (5.54) and (5.55) that Ei u is the component of u in the space
spanned by the eigenvector xi , i. e., Ei is the projection operator onto the eigenspace
spanned by xi . If i ≠ j, we have
\[ E_i E_j = \frac{x_i y_i^*}{y_i^* x_i}\, \frac{x_j y_j^*}{y_j^* x_j} = \frac{x_i (y_i^* x_j) y_j^*}{(y_i^* x_i)(y_j^* x_j)} = 0, \]
since y_i^* x_j = 0 for i ≠ j. Similarly, E_j E_i = 0.
Proof. Let x1 , x2 , x3 , . . . , xn be the eigenvectors and y∗1 , y∗2 , . . . , y∗n be the eigenrows of A.
Defining
\[ E_i = \frac{x_i y_i^*}{y_i^* x_i}, \qquad T = (x_1 \; x_2 \; \ldots \; x_n), \]
we have
\[ AT = A(x_1 \; x_2 \; \ldots \; x_n) = (Ax_1 \; Ax_2 \; \ldots \; Ax_n) = (\lambda_1 x_1 \; \lambda_2 x_2 \; \ldots \; \lambda_n x_n) \]
\[ = (x_1 \; x_2 \; \ldots \; x_n) \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} = T\Lambda, \tag{5.56} \]
so that
\[ A = T \Lambda T^{-1} \tag{5.57} \]
We now show that the rows of the matrix T−1 are the normalized eigenrows of A. Let
\[ S = \begin{pmatrix} y_1^*/(y_1^* x_1) \\ y_2^*/(y_2^* x_2) \\ \vdots \\ y_n^*/(y_n^* x_n) \end{pmatrix} \]
Then
\[ ST = \begin{pmatrix} y_1^*/(y_1^* x_1) \\ \vdots \\ y_n^*/(y_n^* x_n) \end{pmatrix} (x_1 \; x_2 \; \ldots \; x_n) = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} = I, \]
so that
\[ S = T^{-1} \tag{5.58} \]
Thus,
\[ A = (\lambda_1 x_1 \; \lambda_2 x_2 \; \ldots \; \lambda_n x_n) \begin{pmatrix} y_1^*/(y_1^* x_1) \\ \vdots \\ y_n^*/(y_n^* x_n) \end{pmatrix} = \sum_{i=1}^{n} \lambda_i \frac{x_i y_i^*}{(y_i^* x_i)} = \sum_{i=1}^{n} \lambda_i E_i \]
\[ A_1 = A - \lambda_1 E_1 \tag{5.59} \]
\[ A_1 x_1 = (A - \lambda_1 E_1) x_1 = 0 \]
\[ A_1 x_j = \lambda_j x_j - \lambda_1 E_1 x_j = \lambda_j x_j; \quad j = 2, \ldots, n \]
\[ A_2 = A - \lambda_1 E_1 - \lambda_2 E_2 \tag{5.60} \]
\[ A_2 x_1 = 0, \quad A_2 x_2 = 0, \quad A_2 x_j = \lambda_j x_j; \quad j = 3, \ldots, n \]
\[ A_n x_j = 0; \quad j = 1, 2, 3, \ldots, n \tag{5.62} \]
We now show that equation (5.62) implies that An = 0 (n × n zero matrix), and hence
\[ A = \sum_{i=1}^{n} \lambda_i E_i \]
Writing equation (5.62) in terms of the rows of A_n,
\[ \begin{pmatrix} \alpha_1^T \\ \alpha_2^T \\ \vdots \\ \alpha_n^T \end{pmatrix} x_j = 0, \quad j = 1, 2, \ldots, n, \]
where α_k^T = k-th row of A_n. Since the eigenvectors are linearly independent, the only solution to the system of equations
\[ \alpha_k^T x_j = 0, \quad k = 1, 2, 3, \ldots, n \]
is
\[ \alpha_k = 0 \text{ for each } k = 1, 2, 3, \ldots, n \;\Rightarrow\; A_n = 0 \]
we let
\[ S = \sum_{i=1}^{n} E_i. \]
Since E_i u = α_i x_i for any u = Σ_j α_j x_j, we have Su = u, i.e.,
\[ (S - I)u = 0 \tag{5.63} \]
Equation (5.63) implies that each row of the matrix S − I is orthogonal to n linearly independent vectors (u1, u2, ..., un), and hence S − I = 0, i.e., Σ E_i = I.
\[ A = \sum_{j=1}^{n} \lambda_j E_j \;\Rightarrow\; A^2 = \Big(\sum_{j=1}^{n} \lambda_j E_j\Big)\Big(\sum_{i=1}^{n} \lambda_i E_i\Big) = \sum_{j=1}^{n} \lambda_j^2 E_j \]
Similarly,
\[ A^k = \sum_{j=1}^{n} \lambda_j^k E_j, \quad \text{for } k = 0, 1, 2, \ldots \]
Thus, for any function with the power series representation
\[ f(\lambda) = \sum_{i=0}^{\infty} c_i \lambda^i, \]
we have
\[ f(A) = \sum_{j=1}^{n} f(\lambda_j) E_j \]
Other and more general forms of the spectral theorem may be found in the books on
linear algebra (Halmos [20]; Naylor and Sell [24]; Lipschutz and Lipson [22]).
\[ A = \begin{pmatrix} -3 & 2 \\ 4 & -5 \end{pmatrix} \]
whose eigenvalues are λ1 = −1, λ2 = −7. We have eigenvectors and normalized eigenrows:
\[ x_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}; \qquad x_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix} \]
\[ y_1^* = \begin{pmatrix} \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix}; \qquad y_2^* = \begin{pmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \end{pmatrix} \]
\[ E_1 = \frac{x_1 y_1^*}{y_1^* x_1} = \begin{pmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix} \]
\[ E_2 = \frac{x_2 y_2^*}{y_2^* x_2} = \begin{pmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{2}{3} & \tfrac{2}{3} \end{pmatrix} \]
\[ E_1^2 = \begin{pmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix}\begin{pmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix} = \begin{pmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix} = E_1 \]
\[ E_2^2 = \begin{pmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{2}{3} & \tfrac{2}{3} \end{pmatrix}\begin{pmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{2}{3} & \tfrac{2}{3} \end{pmatrix} = \begin{pmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{2}{3} & \tfrac{2}{3} \end{pmatrix} = E_2 \]
\[ E_1 E_2 = \begin{pmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix}\begin{pmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{2}{3} & \tfrac{2}{3} \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \]
\[ E_1 + E_2 = \begin{pmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix} + \begin{pmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{2}{3} & \tfrac{2}{3} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
\[ \lambda_1 E_1 + \lambda_2 E_2 = -1 \begin{pmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix} - 7 \begin{pmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{2}{3} & \tfrac{2}{3} \end{pmatrix} = \begin{pmatrix} -\tfrac{2}{3} - \tfrac{7}{3} & -\tfrac{1}{3} + \tfrac{7}{3} \\ -\tfrac{2}{3} + \tfrac{14}{3} & -\tfrac{1}{3} - \tfrac{14}{3} \end{pmatrix} = \begin{pmatrix} -3 & 2 \\ 4 & -5 \end{pmatrix} = A \]
\[ f(A) = f(\lambda_1) E_1 + f(\lambda_2) E_2. \]
To show that the same expression is obtained when we use the Cayley–Hamilton theorem, we note that
\[ f(A) = \alpha_0 I + \alpha_1 A, \]
where
\[ f(\lambda_1) = \alpha_0 - \alpha_1 = f(-1) \quad \text{and} \quad f(\lambda_2) = \alpha_0 - 7\alpha_1 = f(-7). \]
\[ E_1 = e_1 e_1^T = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad E_2 = e_2 e_2^T = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \]
If z = (z1, z2)^T is any vector, then we can write
\[ z = z_1 e_1 + z_2 e_2 = E_1 z + E_2 z, \]
5.6 Projections operators and vector projections | 143
where
\[ E_1 z = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} z_1 \\ 0 \end{pmatrix} = z_1 e_1, \qquad E_2 z = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} 0 \\ z_2 \end{pmatrix} = z_2 e_2. \]
These projections satisfy
\[ E_j^2 = E_j, \; j = 1, 2; \qquad E_1 E_2 = E_2 E_1 = 0; \qquad \sum_{j=1}^{2} E_j = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]
y_2^T x_1 = y_1^T x_2 = 0 and y_1^T x_1 = y_2^T x_2 = 1. Here, the nonorthogonal projection matrices are given by
\[ E_1 = x_1 y_1^T = \frac{1}{3}\begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix}, \qquad E_2 = x_2 y_2^T = \frac{1}{3}\begin{pmatrix} 1 & -1 \\ -2 & 2 \end{pmatrix}, \]
and, for example, for z = (2, −1)^T,
\[ E_1 z = \frac{1}{3}\begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} 2 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = x_1, \qquad E_2 z = \frac{1}{3}\begin{pmatrix} 1 & -1 \\ -2 & 2 \end{pmatrix}\begin{pmatrix} 2 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \end{pmatrix} = x_2. \]
Hence, for the case of real eigenvalues, the eigenvectors and normalized left eigenvec-
tors define the projections onto the one-dimensional eigenspaces.
\[ \frac{du}{dt} = Au, \; t > 0; \qquad u = u_0 \; @ \; t = 0 \]
with A = \begin{pmatrix} -3 & 2 \\ 4 & -5 \end{pmatrix}. The eigenvalues of A are λ1 = −1 and λ2 = −7 with corresponding eigenvectors
\[ x_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad x_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}, \]
and projections
\[ E_1 = x_1 y_1^T = \begin{pmatrix} 2/3 & 1/3 \\ 2/3 & 1/3 \end{pmatrix} \quad \text{and} \quad E_2 = x_2 y_2^T = \begin{pmatrix} 1/3 & -1/3 \\ -2/3 & 2/3 \end{pmatrix}, \]
where y_1^T and y_2^T are the corresponding normalized eigenrows. Assuming u_0 = (u_{10}, u_{20})^T,
\[ \alpha_1 = y_1^T u_0 = \frac{2u_{10} + u_{20}}{3} \quad \text{and} \quad \alpha_2 = y_2^T u_0 = \frac{u_{10} - u_{20}}{3}. \]
If
\[ u_{10} = u_{20} = \beta \quad \text{or} \quad u_0 = \beta \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \beta x_1, \]
then α2 = 0 and the solution is u(t) = β x1 e^{−t}. In other words, when the initial condition is in the eigenstate x1, it always remains in the same state (see Figure 5.3). Similarly, when the initial state is in x2, i.e.,
\[ \alpha_1 = 0 \quad \text{or} \quad u_{10} = -\frac{u_{20}}{2} = \gamma, \]
then
\[ u_0 = \gamma \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \gamma x_2 \]
and the solution remains along x2: u(t) = γ x2 e^{−7t}. For a general initial condition, for example u0 = (β, 0)^T, we have α1 = 2β/3 and α2 = β/3, so that
\[ u(t) = \frac{2}{3}\beta x_1 e^{-t} + \frac{1}{3}\beta x_2 e^{-7t}. \]
Since e^{−7t} → 0 much faster than e^{−t} (i.e., the component in eigenstate x2 vanishes faster), the solution for t ≫ 1 simplifies to
\[ u(t) = \frac{2}{3}\beta x_1 e^{-t}, \quad t \gg 1. \]
In other words, the solution approaches the steady state along the x1 direction. This
is also true for any initial condition except when the initial condition is along x2 (see
Figures 5.3 and 5.4).
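The two-time-scale behavior described above is easy to verify numerically. A minimal cross-check in Python with NumPy (the book itself uses Mathematica; the choice u0 = (1, 0)^T, i.e. β = 1, is just one illustrative initial condition):

```python
import numpy as np

A = np.array([[-3.0, 2.0],
              [4.0, -5.0]])
x1, x2 = np.array([1.0, 1.0]), np.array([1.0, -2.0])
y1, y2 = np.array([2/3, 1/3]), np.array([1/3, -1/3])

u0 = np.array([1.0, 0.0])
a1, a2 = y1 @ u0, y2 @ u0      # 2/3 and 1/3

def u(t):
    # spectral solution u(t) = a1 e^{-t} x1 + a2 e^{-7t} x2
    return a1 * np.exp(-t) * x1 + a2 * np.exp(-7.0 * t) * x2

# u(t) satisfies du/dt = Au (check the analytic derivative at one point)
t = 0.3
dudt = -a1 * np.exp(-t) * x1 - 7.0 * a2 * np.exp(-7.0 * t) * x2
assert np.allclose(dudt, A @ u(t))

# For t >> 1 the trajectory aligns with the slow eigenvector x1 (ratio 1)
ut = u(5.0)
assert abs(ut[0] / ut[1] - 1.0) < 1e-3
```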
The trajectory of the solution in the (u1, u2) plane can be determined for any general initial condition corresponding to the point (u10, u20) in the phase plane. The solution in the two-dimensional phase plane is represented parametrically in t; the parameter can be eliminated by solving for e^{−t} and e^{−7t}, which leads to the equation of the trajectory
Figure 5.4: Trajectory of solution with stable node in the phase-plane for various initial points. The
component of solution along x2 direction (with large eigenvalue) vanishes first, and solution ap-
proaches steady state in x1 direction (smaller eigenvalue).
in the phase-plane. Such trajectories for various initial points are demonstrated in Figure 5.4. The trivial steady state here is stable because of the negative real eigenvalues (and is referred to as a node in the dynamical systems literature).
5.6.4 Geometrical interpretation with complex eigenvalues with negative real part
Consider a matrix A = \begin{pmatrix} -1 & 1 \\ -1 & -1 \end{pmatrix}, whose eigenvalues are λ1 = −1 + i and λ2 = −1 − i with corresponding eigenvectors
\[ x_1 = \begin{pmatrix} 1 \\ i \end{pmatrix} \quad \text{and} \quad x_2 = \begin{pmatrix} 1 \\ -i \end{pmatrix}. \]
The solution of the initial value problem
\[ \frac{du}{dt} = Au, \; t > 0 \quad \text{and} \quad u = u_0 = (u_{10}, u_{20})^T \; @ \; t = 0 \]
can be expressed as
\[ u = c_1 x_1 e^{\lambda_1 t} + c_2 x_2 e^{\lambda_2 t} = (y_1^* u_0)\, e^{\lambda_1 t} x_1 + (y_2^* u_0)\, e^{\lambda_2 t} x_2 \]
The solution trajectory for this example is shown in Figure 5.5 for various initial conditions. As can be seen from this figure, the solution is oscillatory in time, as expected because of the complex eigenvalues, but has a stable focus since the real part of the complex eigenvalues is negative.
Figure 5.5: Solution trajectory with stable focus in phase-plane with complex eigenvalues leading to
oscillating response.
Consider a matrix A = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}, whose eigenvalues are λ1 = 0 and λ2 = −2 with corresponding eigenvectors
\[ x_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad x_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \]
and normalized eigenrows y_1^T = (1/2, 1/2) and y_2^T = (1/2, −1/2), respectively. The initial value problem
\[ \frac{du}{dt} = Au, \; t > 0 \quad \text{and} \quad u = u_0 = (u_{10}, u_{20})^T \; @ \; t = 0 \]
represents the mass exchange between two identical tanks in the absence of convection and reaction. The solution for this example can be expressed as
\[ u = c_1 x_1 e^{\lambda_1 t} + c_2 x_2 e^{\lambda_2 t} = (y_1^* u_0)\, e^{\lambda_1 t} x_1 + (y_2^* u_0)\, e^{\lambda_2 t} x_2 = \left(\frac{u_{10} + u_{20}}{2}\right)\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \left(\frac{u_{10} - u_{20}}{2}\right)\begin{pmatrix} 1 \\ -1 \end{pmatrix} e^{-2t} \]
The term y_1^T u = (u_1 + u_2)/2 = (u_{10} + u_{20})/2 is constant and denotes the mass conservation of
species. The solution trajectory for this example is shown in Figure 5.6 for various
initial conditions. The bullet points in this figure represent the equilibrium state (x1) corresponding to the zero eigenvalue. Thus, for any initial condition, the component of the solution along the x2 direction decreases with time (due to the negative eigenvalue λ2) and the solution approaches the equilibrium composition at steady state, which is practically reached for all t ≫ 2.
Figure 5.6: Solution trajectory in phase-plane for various initial conditions: the bullet points repre-
senting the equilibrium state (corresponding to zero eigenvalue).
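The conservation property y_1^T u = const. and the approach to equilibrium can be confirmed numerically. A minimal sketch in Python with NumPy (the initial state u0 = (2, 0.4)^T is an arbitrary illustrative choice; the book's own computations are in Mathematica):

```python
import numpy as np

A = np.array([[-1.0, 1.0],
              [1.0, -1.0]])
x1, x2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])
y1, y2 = np.array([0.5, 0.5]), np.array([0.5, -0.5])

u0 = np.array([2.0, 0.4])

def u(t):
    # u(t) = (y1.u0) x1 + (y2.u0) e^{-2t} x2
    return (y1 @ u0) * x1 + (y2 @ u0) * np.exp(-2.0 * t) * x2

# The solution satisfies du/dt = Au ...
t = 0.7
dudt = -2.0 * (y2 @ u0) * np.exp(-2.0 * t) * x2
assert np.allclose(dudt, A @ u(t))

# ... conserves the total mass y1^T u ...
for t in (0.0, 0.5, 2.0, 10.0):
    assert np.isclose(y1 @ u(t), y1 @ u0)

# ... and settles to the equilibrium state along x1
assert np.allclose(u(10.0), (y1 @ u0) * x1, atol=1e-8)
```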
Consider the flow system shown in Figure 5.7, containing three interacting tanks with volumes V1 = (1/2)V2 = V3 = VR and a fixed exchange rate qe.
Figure 5.7: Flow system containing three interacting tanks with volumes V1 = (1/2)V2 = V3 = VR, with fixed mass-exchange rates in the absence of net convection and reaction.
Assuming no net convection and reaction, and the tanks being well mixed, the model equations describing the species balances in these tanks can be expressed as
\[ V_R \frac{dc_1}{dt'} = -2q_e c_1 + q_e c_2 + q_e c_3 \]
\[ 2V_R \frac{dc_2}{dt'} = q_e c_1 - 2q_e c_2 + q_e c_3 \]
\[ V_R \frac{dc_3}{dt'} = q_e c_1 + q_e c_2 - 2q_e c_3. \]
Defining the dimensionless time
\[ t = \frac{q_e}{V_R} t', \]
these equations take the form
\[ \frac{dc}{dt} = Ac, \; t > 0 \quad \text{and} \quad c = c_0 = (c_{10}, c_{20}, c_{30})^T \; @ \; t = 0, \]
where
\[ A = \begin{pmatrix} -2 & 1 & 1 \\ 1/2 & -1 & 1/2 \\ 1 & 1 & -2 \end{pmatrix}. \]
Note that the matrix A is not symmetric, but has zero row sums, implying the existence of a zero eigenvalue, which also reflects the mass conservation of the species.
Eigensystem
The eigenvalues of matrix A are λ1 = 0, λ2 = −2 and λ3 = −3 corresponding to the eigenvectors
\[ x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} \quad \text{and} \quad x_3 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \]
and normalized eigenrows
\[ y_1^T = \frac{1}{4}(1, 2, 1), \quad y_2^T = \frac{1}{4}(1, -2, 1) \quad \text{and} \quad y_3^T = \frac{1}{2}(1, 0, -1). \]
Interpretation of eigenvectors
The eigenvector x1 = (1, 1, 1)^T corresponding to the eigenvalue λ1 = 0 indicates the steady-state or equilibrium composition c_s:
\[ @ \; t \to \infty, \quad \frac{dc}{dt} = 0 \;\Rightarrow\; A c_s = 0 \;\Rightarrow\; c_s = \alpha_1 x_1 = \alpha_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]
In order to obtain α1, the initial condition can be utilized. Since y_1^T is the eigenrow corresponding to the zero eigenvalue λ1, we have
\[ y_1^T \frac{dc}{dt} = y_1^T A c = 0^T c = 0 \;\Rightarrow\; y_1^T c = \text{constant} \;\Rightarrow\; y_1^T c_s = y_1^T c_0 \;\Rightarrow\; \alpha_1 = y_1^T c_0 = \frac{c_{10} + 2c_{20} + c_{30}}{4} \]
Interpretation of eigenrows
To see the meaning of eigenrows, we express the solution as
\[ c(t) = \sum_{i=1}^{3} \frac{y_i^T c_0}{y_i^T x_i}\, x_i e^{\lambda_i t} = \sum_{i=1}^{3} (y_i^T c_0)\, x_i e^{\lambda_i t} = \sum_{i=1}^{3} \alpha_i x_i e^{\lambda_i t} = \alpha_1 x_1 + \alpha_2 x_2 e^{-2t} + \alpha_3 x_3 e^{-3t}, \qquad \alpha_i = y_i^T c_0 \]
We note that along the solution
\[ y_1^T c = \alpha_1 = y_1^T c_0. \]
Thus, the first eigenrow corresponds to the overall mass conservation and determines the equilibrium/steady-state composition as shown earlier, i.e., c_s = α1 (1, 1, 1)^T. When α2 = 0, the solution reduces to
\[ c(t) = \alpha_1 x_1 + \alpha_3 x_3 e^{-3t}, \]
which implies that yT2 c = 0. In other words, if the initial concentration vector c0 is
such that α2 = 0, i. e.,
c0 = α1 x1 + α3 x3 ,
then the concentration c(t) at all times remains a linear combination of x1 and x3 with c(t = 0) = c0 and c(t → ∞) = α1 x1 = c_s. This can be seen by combining the condition α2 = 0 with the mass balance constraint as follows:
\[ c_0 = \begin{pmatrix} \alpha_1 + \alpha_3 \\ \alpha_1 \\ \alpha_1 - \alpha_3 \end{pmatrix} = \alpha_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \alpha_3 \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} = \alpha_1 x_1 + \alpha_3 x_3 = c_s + \alpha_3 x_3, \]
which approaches the steady state along a straight line on the triangular diagram as shown in Figure 5.8 (see the line DEF). For example, for the special initial condition c0 = (0.5, 0.25, 0)^T corresponding to the point D, we have
\[ \alpha_1 = y_1^T c_0 = \frac{1}{4}, \quad \alpha_2 = y_2^T c_0 = 0 \quad \text{and} \quad \alpha_3 = y_3^T c_0 = \frac{1}{4}, \]
so that
\[ c(t) = \frac{1}{4} x_1 + \frac{1}{4} x_3 e^{-3t} = \frac{1}{4}\begin{pmatrix} 1 + e^{-3t} \\ 1 \\ 1 - e^{-3t} \end{pmatrix}. \]
Figure 5.8: Solution trajectory in triangular phase-plane: E is the equilibrium state, AEB is the slow
transient state (x2 ) and DEF is the fast transient state (x3 ).
Similarly, when α3 = 0,
\[ c_0 = \begin{pmatrix} \alpha_1 + \alpha_2 \\ \alpha_1 - \alpha_2 \\ \alpha_1 + \alpha_2 \end{pmatrix} = \alpha_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \alpha_2 \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} = \alpha_1 x_1 + \alpha_2 x_2 = c_s + \alpha_2 x_2, \]
and the solution approaches the steady state along the straight line BEA shown in the triangular phase-trajectory in Figure 5.8. For example, for the special initial condition c0 = (0, 0.5, 0)^T corresponding to the point B, we have
\[ \alpha_1 = y_1^T c_0 = \frac{1}{4}, \quad \alpha_2 = y_2^T c_0 = -\frac{1}{4} \quad \text{and} \quad \alpha_3 = y_3^T c_0 = 0, \]
so that
\[ c(t) = \frac{1}{4} x_1 - \frac{1}{4} x_2 e^{-2t} = \frac{1}{4}\begin{pmatrix} 1 - e^{-2t} \\ 1 + e^{-2t} \\ 1 - e^{-2t} \end{pmatrix}. \]
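The three-tank eigensystem and the special trajectory through point D can be verified numerically. A minimal sketch in Python with NumPy (the book's computations are in Mathematica; the checks below only restate the results already derived in the text):

```python
import numpy as np

A = np.array([[-2.0, 1.0, 1.0],
              [0.5, -1.0, 0.5],
              [1.0, 1.0, -2.0]])
x = [np.array([1.0, 1.0, 1.0]), np.array([1.0, -1.0, 1.0]), np.array([1.0, 0.0, -1.0])]
y = [np.array([1, 2, 1]) / 4, np.array([1, -2, 1]) / 4, np.array([1, 0, -1]) / 2]
lam = [0.0, -2.0, -3.0]

# x_i are eigenvectors, y_i the matching normalized eigenrows
for xi, yi, li in zip(x, y, lam):
    assert np.allclose(A @ xi, li * xi)
    assert np.allclose(yi @ A, li * yi)
    assert np.isclose(yi @ xi, 1.0)

# Trajectory from point D: c0 = (0.5, 0.25, 0)^T gives alpha = (1/4, 0, 1/4)
c0 = np.array([0.5, 0.25, 0.0])
alpha = [yi @ c0 for yi in y]
assert np.allclose(alpha, [0.25, 0.0, 0.25])

def c(t):
    return sum(a * np.exp(l * t) * xi for a, l, xi in zip(alpha, lam, x))

t = 1.2
assert np.allclose(c(t), 0.25 * np.array([1 + np.exp(-3*t), 1.0, 1 - np.exp(-3*t)]))
```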
Problems
1. Given the matrix
\[ A = \begin{pmatrix} 0.15 & -0.01 \\ -0.25 & 0.15 \end{pmatrix} \]
2. (a) Show that the solution of the inhomogeneous system
\[ \frac{du}{dt} = Au + f(t), \quad u(t = 0) = u_0 \]
is given by
\[ u = e^{At} u_0 + \int_0^t e^{A(t - \tau)} f(\tau)\, d\tau \]
(b) Determine a similar formula for the solution of the inhomogeneous system
\[ \frac{d^2 u}{dt^2} = -Au + f(t), \quad u(t = 0) = u_0, \; \frac{du}{dt}(t = 0) = v_0 \]
3. Given the matrix
\[ A = \begin{pmatrix} 3 & 1 & 1 \\ 2 & 4 & 2 \\ 1 & 1 & 3 \end{pmatrix} \]
(a) Compute the eigenvalues and eigenvectors.
(b) What matrix in a similarity transform will reduce A to diagonal form?
(c) Determine the spectral decomposition of A.
(d) If f(λ) is any function defined on the spectrum of A, develop a formula for determining f(A).
4. We had three representations for the solution of
\[ \frac{du}{dt} = Au, \quad u(t = 0) = u_0 \]
(a)
\[ u = \sum_{j=1}^{n} \frac{y_j^* u_0}{y_j^* x_j} \exp(\lambda_j t)\, x_j \]
(b)
\[ u = e^{At} u_0 \]
(c)
\[ u = \sum_{j=1}^{n} \exp(\lambda_j t)\, E_j u_0 \]
5. Consider the reversible reaction network
\[ A \underset{k_2}{\overset{k_1}{\rightleftarrows}} B \underset{k_4}{\overset{k_3}{\rightleftarrows}} C \]
where
\[ k_1 = k, \quad k_2 = \frac{k}{2}, \quad k_3 = \frac{k}{2}, \quad k_4 = \frac{k}{4}. \]
(a) Formulate the species balances for steady-state operation and show that the concentration vector in the stream leaving tank N is given by
\[ c_N = \left[ I + \frac{k\tau}{N} A \right]^{-N} c_0, \]
where c0 is the feed concentration vector, τ is the space time and A is the matrix of relative rate constants defined by
\[ A = \begin{pmatrix} 1 & -\tfrac{1}{2} & 0 \\ -1 & 1 & -\tfrac{1}{4} \\ 0 & -\tfrac{1}{2} & \tfrac{1}{4} \end{pmatrix} \]
(b) If the reactor is an ideal plug flow reactor, show that the exit concentration vector is given by
\[ c_N = \exp\{-k\tau A\}\, c_0. \]
(c) Compute the exit concentration vector for cases (a) and (b) above for the following parameter values:
\[ N = 10, \quad k\tau = 2, \quad c_0 = \begin{pmatrix} 1 \\ 0.2 \\ 0 \end{pmatrix} \]
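The connection between the CSTR-cascade formula of part (a) and the plug-flow limit of part (b) can be explored numerically. A minimal sketch in Python with NumPy (this is a study aid, not the book's solution; the relative-rate matrix and feed are those of part (c), and the matrix exponential is built from a truncated Taylor series):

```python
import numpy as np
from math import factorial

A = np.array([[1.0, -0.5, 0.0],
              [-1.0, 1.0, -0.25],
              [0.0, -0.5, 0.25]])
c0 = np.array([1.0, 0.2, 0.0])
ktau = 2.0

def cascade(N):
    # c_N = [I + (k tau / N) A]^{-N} c0  (N equal CSTRs in series)
    M = np.linalg.inv(np.eye(3) + (ktau / N) * A)
    return np.linalg.matrix_power(M, N) @ c0

def pfr():
    # c = exp(-k tau A) c0, exp evaluated by a truncated Taylor series
    E = sum(np.linalg.matrix_power(-ktau * A, k) / factorial(k) for k in range(40))
    return E @ c0

# The cascade result approaches the plug-flow limit as N grows
err10 = np.linalg.norm(cascade(10) - pfr())
err100 = np.linalg.norm(cascade(100) - pfr())
assert err100 < err10
assert np.linalg.norm(cascade(20000) - pfr()) < 1e-2
```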
6. The concentration vector c(x) satisfies
\[ D \frac{d^2 c}{dx^2} = Kc, \quad 0 < x < L \]
\[ D \frac{dc}{dx} = k_c (c - c_0) \; @ \; x = 0 \]
\[ \frac{dc}{dx} = 0 \; @ \; x = L \]
If the observed rate is defined by
\[ r_{obs} = \frac{1}{L} \int_0^L K c(x)\, dx, \]
show that
\[ r_{obs} = K^* c_0, \qquad K^* = \frac{D}{L^2}\, \Phi \tanh \Phi \, (Bi + \Phi \tanh \Phi)^{-1} Bi. \]
If we write
\[ K^* = KH, \]
where H is called the effectiveness matrix, show that when D is diagonal and K is nonsingular, H is given by
\[ H = \Phi^{-1} \tanh \Phi \, (I + Bi^{-1} \Phi \tanh \Phi)^{-1}. \]
(d) Use the result in (b) to calculate H for the reaction networks (i) $A \xrightarrow{k_1} B \xrightarrow{k_2} C$ and (ii) $A \xrightarrow{k_1} B \xrightarrow{k_2} C$, $A \xrightarrow{k_3} D$, when the diffusivities and mass-transfer coefficients of all the species are identical. Give a physical interpretation of the diagonal elements of H.
(e) Consider again the special case in which
\[ D = dI; \qquad Bi = Bi_m I; \qquad Bi_m = \frac{L k_c}{d}, \]
and we write
\[ K = kA; \qquad K^* = kA^*; \qquad \Phi^2 = \phi^2 A; \qquad \phi^2 = \frac{kL^2}{d}, \]
where A (A*) is the relative rate constant (diffusion-disguised relative rate constant) matrix. Show that the result in (b) may be expressed in dimensionless form as
\[ A \underset{k_2}{\overset{k_1}{\rightleftarrows}} B \underset{k_4}{\overset{k_3}{\rightleftarrows}} C \]
where
\[ k_1 = k, \quad k_2 = \frac{k}{2}, \quad k_3 = \frac{k}{2}, \quad k_4 = \frac{k}{4}. \]
7. The concentration vector in a tubular reactor with axial dispersion satisfies the equations
\[ \frac{1}{Pe} \frac{d^2 c}{d\xi^2} - \frac{dc}{d\xi} - Da\, c = 0, \quad 0 < \xi < 1 \]
\[ \frac{1}{Pe} \frac{dc}{d\xi} = c - c_0 \; @ \; \xi = 0 \]
\[ \frac{dc}{d\xi} = 0 \; @ \; \xi = 1 \]
with
\[ f(Pe, Da) = 4 \exp\left(\frac{Pe}{2}\right) Y \left[ (I + Y)^2 \exp\left(\frac{Pe}{2} Y\right) - (I - Y)^2 \exp\left(-\frac{Pe}{2} Y\right) \right]^{-1} \]
and
\[ Y = \left( I + \frac{4\, Da}{Pe} \right)^{1/2} \]
(c) Use the result in (b) to calculate c(ξ = 1) for the reaction network in problem #5 and numerical values:
\[ Pe = 2.0; \quad k\tau = 2; \quad c_0 = \begin{pmatrix} 1 \\ 0.2 \\ 0 \end{pmatrix}; \quad Da = k\tau A; \quad A = \begin{pmatrix} 1 & -\tfrac{1}{2} & 0 \\ -1 & 1 & -\tfrac{1}{4} \\ 0 & -\tfrac{1}{2} & \tfrac{1}{4} \end{pmatrix}. \]
6 Generalized eigenvectors and canonical forms
When an n × n matrix A that is not symmetric has repeated eigenvalues and fewer than
n eigenvectors, it is not possible to find a matrix P that reduces A to a diagonal form.
However, Jordan's theorem states that any n × n matrix can be reduced to a canonical form called the Jordan canonical form. This chapter gives a brief introduction to generalized eigenvectors and the theory of Jordan forms.
Ax = λx (6.1)
where A is a matrix of constants and the solution x is the eigenvector that depends on
eigenvalue λ. Rewriting equation (6.1) as
where
\[ Ax_2 = \lambda_1 x_2 + x_1 \tag{6.4} \]
or
\[ (A - \lambda_1 I)x_2 = x_1. \]
Now, we have two eigenvectors: one regular (x1) and one generalized eigenvector (x2) of rank 2. This procedure can be generalized if the eigenvalue λ = λ1 has multiplicity greater than 2 (say r ≥ 2) and the rank of (A − λ1 I) is greater than (n − r).
https://doi.org/10.1515/9783110739701-006
6.1 Repeated eigenvalues and generalized eigenvectors | 161
Defining
\[ x_3 = \frac{1}{2!} x''(\lambda) \tag{6.7} \]
and
\[ x_4 = \frac{1}{3!} x'''(\lambda) \tag{6.9} \]
gives
and so forth.
6.1.1 Linearly independent solutions of du/dt = Au with repeated eigenvalues
Let λ be an eigenvalue of A with eigenvector x. Then we have seen that u(t) = x e^{λt} is a solution of du/dt = Au.
If
\[ (A - \lambda_1 I)x_1 = 0 \quad \text{and} \quad (A - \lambda_1 I)x_2 = x_1, \]
then
\[ u_1(t) = x_1 e^{\lambda_1 t} \quad \text{and} \quad u_2(t) = x_2 e^{\lambda_1 t} + x_1 t e^{\lambda_1 t} \]
are the two linearly independent solutions of the vector equation du/dt = Au.
\[ c_1 u_1 + c_2 u_2 = 0 \;\Rightarrow\; c_1 x_1 e^{\lambda_1 t} + c_2 \left( x_2 e^{\lambda_1 t} + x_1 t e^{\lambda_1 t} \right) = 0 \]
Evaluating at t = 0,
\[ c_1 x_1 + c_2 x_2 = 0 \]
Theorem. If A is a symmetric matrix with real elements, then it cannot have any gener-
alized eigenvectors of rank 2 or higher.
\[ (A - \lambda_1 I)^2 x = 0 \tag{6.11} \]
\[ (A - \lambda_1 I)x \neq 0 \tag{6.12} \]
Multiplying equation (6.11) on the left by x^T (or taking the dot product with x) gives
\[ x^T (A - \lambda_1 I)^2 x = 0 \;\Rightarrow\; x^T (A - \lambda_1 I)^T (A - \lambda_1 I) x = 0 \;\Rightarrow\; y^T y = 0, \; \text{where } y = (A - \lambda_1 I)x \;\Rightarrow\; y = 0 \;\Rightarrow\; (A - \lambda_1 I)x = 0, \]
which contradicts equation (6.12).
For example, for
\[ A = \begin{pmatrix} -2 & 1 \\ -1 & -4 \end{pmatrix}, \]
the characteristic equation gives
\[ |A - \lambda I| = 0 \;\Rightarrow\; (\lambda + 3)^2 = 0 \;\Rightarrow\; \lambda_1 = \lambda_2 = -3. \]
Note that
\[ (A - \lambda_1 I) = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix} \;\Rightarrow\; \operatorname{rank}(A - \lambda_1 I) = 1, \]
i.e., only one eigenvector and one generalized eigenvector exist. In addition,
\[ (A - \lambda_1 I)^2 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}. \]
Thus,
\[ (A - \lambda_1 I)^2 x_2 = 0 \;\Rightarrow\; x_2 = \begin{pmatrix} a \\ b \end{pmatrix} \]
for arbitrary a and b, and the regular eigenvector is
\[ x_1 = (A - \lambda_1 I)x_2 = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = (a + b)\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad (a + b) \neq 0. \]
If we take a = 1 and b = 0, then the GEV x2 = (1, 0)^T and the regular eigenvector x1 = (1, −1)^T. Using these two eigenvectors (one GEV and one regular eigenvector), the modal matrix can be constructed as
\[ T = (x_1, x_2) = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix} \;\Rightarrow\; T^{-1} = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix} \]
so that
\[ T^{-1} A T = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} -2 & 1 \\ -1 & -4 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 4 \\ -3 & -3 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} -3 & 1 \\ 0 & -3 \end{pmatrix} = J = \text{Jordan form of } A \]
Also, note that the solution of du/dt = Au, u = u0 @ t = 0 is given by u = e^{At} u0. But e^{At} = T e^{Jt} T^{-1}, where J = \begin{pmatrix} -3 & 1 \\ 0 & -3 \end{pmatrix} is the Jordan form of A. Since e^{Jt} = e^{-3t} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix},
\[ u = e^{At} u_0 = e^{-3t} \begin{pmatrix} t + 1 & t \\ -t & -t + 1 \end{pmatrix}\begin{pmatrix} u_{10} \\ u_{20} \end{pmatrix} \]
Definition. An upper (lower) Jordan block of order m is an m × m matrix with the eigenvalue along the diagonal and unity on the upper (lower) diagonal.
Examples of upper and lower Jordan blocks of order two and three are given below:
\[ J(\lambda) = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix} \quad \text{(upper Jordan block of order 2)} \]
\[ J(\lambda) = \begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{pmatrix} \quad \text{(upper Jordan block of order 3)} \]
\[ J(\lambda) = \begin{pmatrix} \lambda & 0 \\ 1 & \lambda \end{pmatrix} \quad \text{(lower Jordan block of order 2)} \]
\[ J(\lambda) = \begin{pmatrix} \lambda & 0 & 0 \\ 1 & \lambda & 0 \\ 0 & 1 & \lambda \end{pmatrix} \quad \text{(lower Jordan block of order 3)} \]
The theory for upper and lower Jordan blocks is identical, the only difference being the arrangement of the generalized eigenvectors. Here, we present the theory for upper Jordan blocks.
\[ T^{-1} A T = B, \qquad B = \begin{pmatrix} J(\lambda_1) & 0 & \cdots & 0 \\ 0 & J(\lambda_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & J(\lambda_r) \end{pmatrix}, \]
and the number of Jordan blocks is equal to the number of linearly independent eigenvectors of A; there may be more than one block with the same eigenvalue along the diagonal.
The proof of this theorem may be found in standard matrix algebra books (Bronson and Costa [9]; Gantmacher [18]).
Then the following possible Jordan forms may exist depending on the number of eigenvectors:
1. There are 5 eigenvectors (3 corresponding to λ1 and 2 corresponding to λ2). In this case, A can be diagonalized and
\[ T^{-1}AT = \begin{pmatrix} \lambda_1 & 0 & 0 & 0 & 0 \\ 0 & \lambda_1 & 0 & 0 & 0 \\ 0 & 0 & \lambda_1 & 0 & 0 \\ 0 & 0 & 0 & \lambda_2 & 0 \\ 0 & 0 & 0 & 0 & \lambda_2 \end{pmatrix} \]
2. There are 4 eigenvectors (2 corresponding to λ1 and 2 corresponding to λ2):
\[ B = T^{-1}AT = \begin{pmatrix} \lambda_1 & 1 & 0 & 0 & 0 \\ 0 & \lambda_1 & 0 & 0 & 0 \\ 0 & 0 & \lambda_1 & 0 & 0 \\ 0 & 0 & 0 & \lambda_2 & 0 \\ 0 & 0 & 0 & 0 & \lambda_2 \end{pmatrix} \quad \text{(4 Jordan blocks)} \]
3. There are 4 eigenvectors (3 corresponding to λ1 and 1 corresponding to λ2):
\[ B = T^{-1}AT = \begin{pmatrix} \lambda_1 & 0 & 0 & 0 & 0 \\ 0 & \lambda_1 & 0 & 0 & 0 \\ 0 & 0 & \lambda_1 & 0 & 0 \\ 0 & 0 & 0 & \lambda_2 & 1 \\ 0 & 0 & 0 & 0 & \lambda_2 \end{pmatrix} \quad \text{(4 Jordan blocks)} \]
4. There are 3 eigenvectors (2 corresponding to λ1 and 1 corresponding to λ2):
\[ B = T^{-1}AT = \begin{pmatrix} \lambda_1 & 1 & 0 & 0 & 0 \\ 0 & \lambda_1 & 0 & 0 & 0 \\ 0 & 0 & \lambda_1 & 0 & 0 \\ 0 & 0 & 0 & \lambda_2 & 1 \\ 0 & 0 & 0 & 0 & \lambda_2 \end{pmatrix} \quad \text{(3 Jordan blocks)} \]
5. There are 3 eigenvectors (1 corresponding to λ1 and 2 corresponding to λ2):
\[ B = T^{-1}AT = \begin{pmatrix} \lambda_1 & 1 & 0 & 0 & 0 \\ 0 & \lambda_1 & 1 & 0 & 0 \\ 0 & 0 & \lambda_1 & 0 & 0 \\ 0 & 0 & 0 & \lambda_2 & 0 \\ 0 & 0 & 0 & 0 & \lambda_2 \end{pmatrix} \quad \text{(3 Jordan blocks)} \]
6. There are 2 eigenvectors, each corresponding to λ1 and λ2. In this case, the canonical form of A consists of two Jordan blocks and is of the form
\[ B = T^{-1}AT = \begin{pmatrix} \lambda_1 & 1 & 0 & 0 & 0 \\ 0 & \lambda_1 & 1 & 0 & 0 \\ 0 & 0 & \lambda_1 & 0 & 0 \\ 0 & 0 & 0 & \lambda_2 & 1 \\ 0 & 0 & 0 & 0 & \lambda_2 \end{pmatrix} \quad \text{(2 Jordan blocks)} \]
Note that Jordan's theorem tells us that A can be reduced to a canonical form B but does not tell us how to find the matrix T in the similarity transformation. We now focus on this aspect.
\[ m_1 + m_2 + m_3 + \cdots + m_r = n, \tag{6.14} \]
where m_i is called the algebraic multiplicity of the eigenvalue λ_i. Let the number of linearly independent eigenvectors corresponding to λ_i be M_i. We note that if rank(A − λ_i I) = n − M_i, then there are M_i independent solutions of the homogeneous equation (A − λ_i I)x = 0.
6.3 Multiple eigenvalues and generalized eigenvectors | 167
Thus,
\[ A(x_1 \; x_2 \; \ldots \; x_n) = (x_1 \; x_2 \; \ldots \; x_n)\begin{pmatrix} \lambda_1 & 1 & 0 & \cdots & 0 \\ 0 & \lambda_1 & 1 & \cdots & 0 \\ 0 & 0 & \lambda_1 & \ddots & 0 \\ \vdots & & & \ddots & 1 \\ 0 & 0 & 0 & \cdots & \lambda_1 \end{pmatrix} \]
\[ \Rightarrow\; Ax_1 = \lambda_1 x_1 \quad \text{or} \quad (A - \lambda_1 I)x_1 = 0 \tag{6.16} \]
\[ \phantom{\Rightarrow\;} Ax_2 = x_1 + \lambda_1 x_2 \quad \text{or} \quad (A - \lambda_1 I)x_2 = x_1 \tag{6.17} \]
\[ \phantom{\Rightarrow\;} Ax_3 = x_2 + \lambda_1 x_3 \quad \text{or} \quad (A - \lambda_1 I)x_3 = x_2 \tag{6.18} \]
\[ \vdots \]
\[ \phantom{\Rightarrow\;} Ax_n = x_{n-1} + \lambda_1 x_n \quad \text{or} \quad (A - \lambda_1 I)x_n = x_{n-1} \tag{6.19} \]
These equations define the columns of T. The vector x1 is the regular eigenvector. We
call the vectors x2 , x3 , . . . , xn the generalized eigenvectors. To see the properties of these
vectors, premultiplying equation (6.17) by (A − λ1 I), we get
\[ (A - \lambda_1 I)^2 x_2 = 0, \quad (A - \lambda_1 I)^3 x_3 = 0, \quad \ldots, \quad (A - \lambda_1 I)^n x_n = 0 \tag{6.20} \]
These are the equations for determining the generalized eigenvectors. The vector x2
is called a generalized eigenvector of rank 2, x3 is a generalized eigenvector of rank 3,
etc. Note that once xn is determined, all the others can be determined by simply using
the above equations in reverse order, i. e.,
\[ x_{n-1} = (A - \lambda_1 I)x_n, \quad x_{n-2} = (A - \lambda_1 I)x_{n-1}, \quad \ldots, \quad x_1 = (A - \lambda_1 I)x_2 \]
\[ (A - \lambda_1 I)^m x_m = 0 \quad \text{and} \quad (A - \lambda_1 I)^{m-1} x_m \neq 0 \]
Chains
A chain generated by a GEV x_m of rank m associated with the eigenvalue λ1 is a set of vectors {x_m, x_{m−1}, ..., x_1} defined recursively as
\[ x_j = (A - \lambda_1 I)x_{j+1}, \quad j = m - 1, m - 2, \ldots, 1. \]
The number of vectors in the set is called the length of the chain.
The procedure for finding T may be summarized as follows: Let Pn (λ) given by
equation (6.13) be the characteristic polynomial. Determine the chain of GEVs corre-
sponding to each eigenvalue λj . Arrange these as columns of T, i. e.,
and construct the chain generated by this vector. Each of these vectors is part of
the canonical basis.
4. Reduce each positive Nk by one. If all Nk are zero, we are finished. If not, find the
highest value of k for which Nk is not zero and determine a GEV of that rank, which
is linearly independent of all previously determined GEV associated with λ1 . De-
termine the chain generated by this vector and include this in the canonical ba-
sis.
5. Repeat step 4 until all GEVs are found.
Example 6.2.
\[ A = \begin{pmatrix} 3 & 2 & 0 & 1 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 3 & -1 \\ 0 & 0 & 0 & 3 \end{pmatrix} \]
\[ (A - 3I) = \begin{pmatrix} 0 & 2 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \end{pmatrix}; \qquad \operatorname{rank}(A - 3I) = 2 \]
Thus, there are two eigenvectors, and hence two generalized eigenvectors. This can also be confirmed from the following calculation:
\[ (A - 3I)^2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}; \qquad \operatorname{rank}[(A - 3I)^2] = 0 \;\Rightarrow\; p_1 = 2 \]
The two GEVs of rank 2 may be taken as
\[ x_2 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}, \qquad y_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \]
with corresponding regular eigenvectors
\[ x_1 = (A - 3I)x_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \end{pmatrix} \quad \text{and} \quad y_1 = (A - 3I)y_2 = \begin{pmatrix} 2 \\ 0 \\ 0 \\ 0 \end{pmatrix}. \]
Hence,
\[ T = (y_1 \; y_2 \; x_1 \; x_2) = \begin{pmatrix} 2 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]
and
\[ T^{-1} A T = \begin{pmatrix} 3 & 1 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 3 \end{pmatrix} \]
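Example 6.2 can be reproduced numerically in a few lines. A minimal sketch in Python with NumPy (the chain vectors and the ordering T = (y1, y2, x1, x2) follow the text; the book itself uses Mathematica):

```python
import numpy as np

A = np.array([[3.0, 2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0, 0.0],
              [0.0, 0.0, 3.0, -1.0],
              [0.0, 0.0, 0.0, 3.0]])
N = A - 3.0 * np.eye(4)

# Two chains of length 2, each started from a rank-2 GEV
x2 = np.array([0.0, 0.0, 0.0, 1.0]); x1 = N @ x2
y2 = np.array([0.0, 1.0, 0.0, 0.0]); y1 = N @ y2

assert np.linalg.matrix_rank(N) == 2 and np.allclose(N @ N, 0.0)
assert np.allclose(x1, [1, 0, -1, 0]) and np.allclose(y1, [2, 0, 0, 0])

T = np.column_stack([y1, y2, x1, x2])
J = np.linalg.inv(T) @ A @ T
expected = np.array([[3, 1, 0, 0],
                     [0, 3, 0, 0],
                     [0, 0, 3, 1],
                     [0, 0, 0, 3]], dtype=float)
assert np.allclose(J, expected)
```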
Example 6.3.
\[ A = \begin{pmatrix} 4 & 2 & 1 & 0 & 0 & 0 \\ 0 & 4 & -1 & 0 & 0 & 0 \\ 0 & 0 & 4 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 & 2 & 0 \\ 0 & 0 & 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 0 & 0 & 7 \end{pmatrix} \;\Rightarrow\; P_6(\lambda) = (4 - \lambda)^5 (7 - \lambda) \]
For λ2 = 7, the system (A − 7I)x = 0 reads
\[ \begin{pmatrix} -3 & 2 & 1 & 0 & 0 & 0 \\ 0 & -3 & -1 & 0 & 0 & 0 \\ 0 & 0 & -3 & 0 & 0 & 0 \\ 0 & 0 & 0 & -3 & 2 & 0 \\ 0 & 0 & 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \;\Rightarrow\; z_1 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \text{ is an eigenvector corresponding to } \lambda_2. \]
\[ (A - 4I) = \begin{pmatrix} 0 & 2 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 3 \end{pmatrix} \;\Rightarrow\; \operatorname{rank}(A - 4I) = 4 \]
\[ (A - 4I)^2 = \begin{pmatrix} 0 & 0 & -2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 9 \end{pmatrix} \;\Rightarrow\; \operatorname{rank}[(A - 4I)^2] = 2 \]
\[ (A - 4I)^3 = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 27 \end{pmatrix} \;\Rightarrow\; \operatorname{rank}[(A - 4I)^3] = 1 \]
The chains corresponding to λ1 will consist of one GEV of rank 3, two GEVs of rank 2 and two GEVs of rank 1. We first determine the GEV of rank 3 by solving
\[ (A - 4I)^3 x_3 = 0; \qquad (A - 4I)^2 x_3 \neq 0, \]
giving
\[ x_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad x_2 = (A - 4I)x_3 = \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad x_1 = (A - 4I)x_2 = \begin{pmatrix} -2 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}. \]
\[ (A - 4I)^2 y_2 = 0; \qquad (A - 4I)y_2 \neq 0 \]
⇒ y23 = y26 = 0 and y21, y22, y24, y25 are arbitrary but must be chosen such that (A − 4I)y2 ≠ 0 and y2 is independent of x3, x2 and x1 determined above:
\[ (A - 4I)\begin{pmatrix} y_{21} \\ y_{22} \\ 0 \\ y_{24} \\ y_{25} \\ 0 \end{pmatrix} \neq 0 \;\Rightarrow\; y_{22} \neq 0 \;\text{or}\; y_{25} \neq 0 \]
6.4 Determination of f (A) when A has multiple eigenvalues | 173
Hence, we take
\[ y_2 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \]
which is independent of x3, x2 and x1. The remaining vector of this chain is given by
\[ y_1 = (A - 4I)y_2 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 2 \\ 0 \\ 0 \end{pmatrix} \]
Taking
\[ T = (x_1 \; x_2 \; x_3 \; y_1 \; y_2 \; z_1), \]
then
\[ T^{-1} A T = \begin{pmatrix} 4 & 1 & 0 & 0 & 0 & 0 \\ 0 & 4 & 1 & 0 & 0 & 0 \\ 0 & 0 & 4 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 & 1 & 0 \\ 0 & 0 & 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 0 & 0 & 7 \end{pmatrix} \]
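The rank sequence, the two chains and the resulting Jordan form for Example 6.3 can be checked numerically. A minimal sketch in Python with NumPy (vector names follow the text; the book's own computations use Mathematica):

```python
import numpy as np

A = np.array([[4., 2., 1., 0., 0., 0.],
              [0., 4., -1., 0., 0., 0.],
              [0., 0., 4., 0., 0., 0.],
              [0., 0., 0., 4., 2., 0.],
              [0., 0., 0., 0., 4., 0.],
              [0., 0., 0., 0., 0., 7.]])
N = A - 4.0 * np.eye(6)

# The rank sequence fixes the block structure: 5 = 3 + 2 for lambda = 4
ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(N, k)) for k in (1, 2, 3)]
assert ranks == [4, 2, 1]

x3 = np.array([0., 0., 1., 0., 0., 0.]); x2 = N @ x3; x1 = N @ x2   # length-3 chain
y2 = np.array([0., 0., 0., 0., 1., 0.]); y1 = N @ y2                # length-2 chain
z1 = np.array([0., 0., 0., 0., 0., 1.])                             # eigenvector of 7

T = np.column_stack([x1, x2, x3, y1, y2, z1])
J = np.linalg.inv(T) @ A @ T
expected = np.diag([4., 4., 4., 4., 4., 7.]) + np.diag([1., 1., 0., 1., 0.], k=1)
assert np.allclose(J, expected)
```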
When A is diagonalizable, f(A) = T f(Λ) T^{-1}, where Λ is the diagonal matrix containing the eigenvalues. Now, let T be such that
\[ T^{-1} A T = B = \begin{pmatrix} J_1(\lambda) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & J_r(\lambda) \end{pmatrix} = \text{Jordan canonical form of } A. \]
Then
\[ f(A) = T f(B) T^{-1} = T \begin{pmatrix} f(J_1) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & f(J_r) \end{pmatrix} T^{-1}. \]
Thus, if we can evaluate f(J) where J is a Jordan block, then we can evaluate f(A).
\[ J_m(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ 0 & 0 & \lambda & \ddots & 0 \\ \vdots & & & \ddots & 1 \\ 0 & 0 & 0 & \cdots & \lambda \end{pmatrix}, \]
\[ f(J_m(\lambda)) = \begin{pmatrix} f(\lambda) & \frac{f'(\lambda)}{1!} & \frac{f''(\lambda)}{2!} & \cdots & \frac{f^{[m-1]}(\lambda)}{(m-1)!} \\ 0 & f(\lambda) & \frac{f'(\lambda)}{1!} & \cdots & \frac{f^{[m-2]}(\lambda)}{(m-2)!} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & f(\lambda) \end{pmatrix} \]
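The formula for f(J_m(λ)) is easy to confirm for a concrete choice of f. A minimal sketch in Python with NumPy, taking m = 3 and f(λ) = λ³ (so f′ = 3λ² and f″/2! = 3λ), checked against the directly computed matrix power:

```python
import numpy as np

lam = 2.0
J3 = np.array([[lam, 1.0, 0.0],
               [0.0, lam, 1.0],
               [0.0, 0.0, lam]])

# f(lambda) = lambda^3: f'(lam) = 3 lam^2 and f''(lam)/2! = 3 lam
fJ = np.array([[lam**3, 3*lam**2, 3*lam],
               [0.0, lam**3, 3*lam**2],
               [0.0, 0.0, lam**3]])

assert np.allclose(fJ, np.linalg.matrix_power(J3, 3))
```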
\[ U = J_m(\lambda)\, t = \begin{pmatrix} \lambda t & t & 0 & \cdots & 0 \\ 0 & \lambda t & t & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda t \end{pmatrix} \]
Note that U is not a Jordan block because it does not have ones on the superdiagonal. It is easily verified that U may be written as
\[ U = V J_m(\lambda t) V^{-1}, \]
where
\[ V = \operatorname{diag}(t^{m-1}, t^{m-2}, \ldots, t, 1) \]
6.5 Application of Jordan canonical form to differential equations | 175
\[ J_m(\lambda t) = \begin{pmatrix} \lambda t & 1 & 0 & \cdots & 0 \\ 0 & \lambda t & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda t \end{pmatrix} \quad \text{and} \quad V^{-1} = \operatorname{diag}(t^{1-m}, t^{2-m}, \ldots, 1). \]
Hence,
\[ \exp(J_m(\lambda)t) = \exp(\lambda t)\begin{pmatrix} 1 & \frac{t}{1!} & \frac{t^2}{2!} & \cdots & \frac{t^{m-1}}{(m-1)!} \\ 0 & 1 & \frac{t}{1!} & \cdots & \frac{t^{m-2}}{(m-2)!} \\ 0 & 0 & 1 & \cdots & \frac{t^{m-3}}{(m-3)!} \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix} \]
\[ \frac{dx}{dt} = Ax, \quad x(t = 0) = x_0 \tag{6.21} \]
\[ \frac{dz}{dt} = T^{-1}AT\, z, \quad z(t = 0) = T^{-1}x_0 \equiv z_0 \tag{6.22} \]
If T^{-1}AT is diagonal, these equations uncouple completely:
\[ \frac{dz_i}{dt} = \lambda_i z_i \;\Rightarrow\; z_i = z_{i0} \exp(\lambda_i t) \]
If T^{-1}AT consists of Jordan blocks, then equations (6.22) are uncoupled only partially (or equivalently in blocks). In this case, each component of the solution to (6.21) consists of terms like exp(λ_i t) as well as terms of the form t exp(λ_i t), t² exp(λ_i t), etc. To illustrate this, we consider a simple case in which the Jordan canonical form is
\[ T^{-1}AT = \begin{pmatrix} 4 & 1 & 0 & 0 & 0 & 0 \\ 0 & 4 & 1 & 0 & 0 & 0 \\ 0 & 0 & 4 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 & 1 & 0 \\ 0 & 0 & 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 0 & 0 & 7 \end{pmatrix} = B \]
Then
\[ \frac{dz_1}{dt} = 4z_1 + z_2, \quad \frac{dz_2}{dt} = 4z_2 + z_3, \quad \frac{dz_3}{dt} = 4z_3, \quad \frac{dz_4}{dt} = 4z_4 + z_5, \quad \frac{dz_5}{dt} = 4z_5, \quad \frac{dz_6}{dt} = 7z_6, \]
or,
\[ \frac{dz}{dt} = Bz; \quad z = z_0 \; @ \; t = 0 \tag{6.23} \]
with solution
\[ z = \exp(Bt)\, z_0, \]
where
\[ \exp(Bt) = \begin{pmatrix} e^{4t} & \frac{t}{1!}e^{4t} & \frac{t^2}{2!}e^{4t} & 0 & 0 & 0 \\ 0 & e^{4t} & \frac{t}{1!}e^{4t} & 0 & 0 & 0 \\ 0 & 0 & e^{4t} & 0 & 0 & 0 \\ 0 & 0 & 0 & e^{4t} & \frac{t}{1!}e^{4t} & 0 \\ 0 & 0 & 0 & 0 & e^{4t} & 0 \\ 0 & 0 & 0 & 0 & 0 & e^{7t} \end{pmatrix}. \]
We close this topic by stating a general theorem on the solution of the linear initial value problem
\[ \frac{du}{dt} = Au, \quad u\,(@ \; t = 0) = u_0 \tag{6.24} \]
Let λ1, ..., λr be the distinct eigenvalues of A with multiplicities m1, ..., mr,
\[ \sum_{i=1}^{r} m_i = n \tag{6.25} \]
Then the solution of the initial value problem defined by equation (6.24) is given by
\[ u(t) = \sum_{j=1}^{r} \left[ \sum_{k=0}^{m_j - 1} \frac{t^k}{k!} (A - \lambda_j I)^k \exp(\lambda_j t) \right] u_{0,j}, \]
where
\[ u_0 = \sum_{j=1}^{r} u_{0,j} \quad \text{and} \quad u_{0,j} \in W_j(A). \]
Problems
1. Given the matrix
\[ A = \begin{pmatrix} -2 & 1 \\ -1 & -4 \end{pmatrix} \]
(a) Determine the eigenvalues, generalized eigenvectors and generalized adjoint eigenvectors.
(b) What is the Jordan canonical form of A?
(c) Determine the matrix exp[At].
2. Given the matrix
8 −2 −2 0
0 6 2 −4
A=( )
−2 0 8 −2
2 −4 0 6
du
= Au, u(t = 0) = u0 ,
dt
−7 −1 −3 1
−1 7 1 −3
A=( ).
−3 1 7 −1
1 −3 −1 7
7 Quadratic forms, positive definite matrices and
other applications
Quadratic forms appear in many applications such as the determination of maxima or
minima of functions of several variables, optimization theory, solution of linear and
nonlinear equations by iterative methods, tensor analysis and coordinate transforma-
tions, stability and control theory, definitions of metric or inner products, classifica-
tion of partial differential equations, etc. The first four sections of this chapter give a
brief introduction to this topic, and the rest of the chapter deals with various other
applications.
where
\[ c = f(x = 0), \qquad b_i = \frac{\partial f}{\partial x_i}(x = 0), \qquad \{a_{ij}\} = \{a_{ji}\} = \frac{1}{2!}\, \frac{\partial^2 f}{\partial x_i\, \partial x_j}(x = 0). \]
The real symmetric matrix 2A (of second partial derivatives of f(x1, x2, ..., xn)) is also called the Hessian matrix. We can remove the constant and linear terms in equation (7.1) by a translation of the origin. Defining
\[ y = x - \alpha \tag{7.2} \]
and choosing α such that
\[ A\alpha = -\frac{1}{2} b, \tag{7.3} \]
https://doi.org/10.1515/9783110739701-007
Then the linear terms vanish. Defining Q(x) = f (x) − (c + bT α + αT Aα), the quadratic
form simplifies to
\[ Q(y) = y^T A y \tag{7.4} \]
Note that equation (7.3) can be solved for any b only if A is not singular (or if the vector
−b/2 is in the column space of A). For the case of n = 2 with
\[ A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \tag{7.5} \]
the linear terms can be removed only if b2 − ac ≠ 0, i. e., the quadratic form is not of
parabolic type.
The quadratic form given by equation (7.4) can be put in canonical form by noting that for a real symmetric matrix A, there exists an orthogonal matrix U such that U^T A U = Λ (diagonal). Making the coordinate transformation (which is a rotation)
\[ y = Uz \tag{7.6} \]
we get
\[ Q = (Uz)^T A\, Uz = z^T U^T A U z = z^T \Lambda z = \lambda_1 z_1^2 + \lambda_2 z_2^2 + \cdots + \lambda_n z_n^2. \]
Example 7.1. Examine the nature of the curve defined by 5x1² − 8x1x2 + 5x2² = 10.
The quadratic form Q = 5x1² − 8x1x2 + 5x2² may be written as
\[ Q = (x_1 \; x_2)\begin{pmatrix} 5 & -4 \\ -4 & 5 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 10 \]
The eigenvalues and eigenvectors of A = \begin{pmatrix} 5 & -4 \\ -4 & 5 \end{pmatrix} are λ1 = 9, λ2 = 1, with
\[ x_1 = \begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix}; \qquad x_2 = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}. \]
Thus, making the substitution (rotation) x = Uz, where
\[ U = \begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix}, \tag{7.7} \]
the quadratic form becomes
\[ 9z_1^2 + z_2^2 = 10 \]
7.1 Quadratic forms | 181
or equivalently,
\[ \frac{z_1^2}{(10/9)} + \frac{z_2^2}{10} = 1. \]
Now, since
\[ z = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = U^T x = \begin{pmatrix} \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} (x_1 - x_2)/\sqrt{2} \\ (x_1 + x_2)/\sqrt{2} \end{pmatrix}, \]
the curve may be written as
\[ 9\left( \frac{x_1 - x_2}{\sqrt{2}} \right)^2 + \left( \frac{x_1 + x_2}{\sqrt{2}} \right)^2 = 10. \]
This represents an ellipse with semiaxis lengths of √(10/9) and √10, respectively. Figure 7.1 shows the two coordinate systems as well as the curve (ellipse) represented by the quadratic form. In this case, equation (7.7) represents a 7π/4 rotation of the axes in the positive (or counterclockwise) direction (or π/4 in the clockwise direction).
Figure 7.1: Ellipse represented by 5x12 − 8x1 x2 + 5x22 = 10 with standard and canonical coordinate
systems.
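The diagonalization underlying Example 7.1 can be confirmed numerically. A minimal sketch in Python with NumPy: the rotation U reduces A to diag(9, 1), and the quadratic form evaluated in the original and rotated coordinates agrees for an arbitrary point:

```python
import numpy as np

A = np.array([[5.0, -4.0],
              [-4.0, 5.0]])
U = np.array([[1.0, 1.0],
              [-1.0, 1.0]]) / np.sqrt(2.0)

# U^T A U = diag(9, 1), so Q = 9 z1^2 + z2^2 in the rotated frame
assert np.allclose(U.T @ A @ U, np.diag([9.0, 1.0]))

rng = np.random.default_rng(0)
x = rng.standard_normal(2)     # an arbitrary test point
z = U.T @ x
Q = x @ A @ x
assert np.isclose(Q, 9.0 * z[0]**2 + z[1]**2)
```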
Example 7.2. The quadratic form defined by 5x12 + 8x1 x2 + 5x22 = 10 is also an ellipse
identical to the one in the above example but the major (longer) axis makes an angle
3π/4 with the positive x1 -axis (see Figure 7.2).
Example 7.3. The quadratic form defined by 3x1² − 8x1x2 − 3x2² = 10 is a hyperbola, as seen from the following analysis. The eigenvalues and eigenvectors of A = \begin{pmatrix} 3 & -4 \\ -4 & -3 \end{pmatrix} are λ1 = −5, λ2 = 5, with
\[ x_1 = \begin{pmatrix} 1/\sqrt{5} \\ 2/\sqrt{5} \end{pmatrix}; \qquad x_2 = \begin{pmatrix} 2/\sqrt{5} \\ -1/\sqrt{5} \end{pmatrix}. \]
Making the substitution (rotation) x = Uz,
where
where
Figure 7.2: Ellipse represented by 5x12 + 8x1 x2 + 5x22 = 10 with standard and canonical coordinate
systems.
\[ U = \begin{pmatrix} \tfrac{1}{\sqrt{5}} & \tfrac{2}{\sqrt{5}} \\ \tfrac{2}{\sqrt{5}} & -\tfrac{1}{\sqrt{5}} \end{pmatrix}, \tag{7.8} \]
the quadratic form becomes −5z1² + 5z2² = 10, or equivalently,
\[ \frac{z_2^2}{2} - \frac{z_1^2}{2} = 1. \]
Now, since
\[ z = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = U^T x = \begin{pmatrix} \tfrac{1}{\sqrt{5}} & \tfrac{2}{\sqrt{5}} \\ \tfrac{2}{\sqrt{5}} & -\tfrac{1}{\sqrt{5}} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} (x_1 + 2x_2)/\sqrt{5} \\ (2x_1 - x_2)/\sqrt{5} \end{pmatrix}, \]
the curve may be written as
\[ \left( \frac{2x_1 - x_2}{\sqrt{5}} \right)^2 - \left( \frac{x_1 + 2x_2}{\sqrt{5}} \right)^2 = 2. \]
Figure 7.3 shows the curve represented by the quadratic form along with the axes and
asymptotes.
Another application of quadratic forms is in obtaining the nature of the extrema
of multivariable functions (i. e., maxima, minima, saddle points, etc.), which we will
discuss in Section 7.4.
7.2 Positive definite matrices | 183
Figure 7.3: Hyperbola represented by 3x12 − 8x1 x2 − 3x22 = 10 with standard and canonical coordinate
systems.
\[ Q(x) = x^T A x \tag{7.9} \]
Definition. Q(x) is called positive definite if it takes only positive values for any choice
of x ≠ 0 and is zero only for x = 0, i. e., xT Ax > 0 for all x ≠ 0. Similarly, Q(x) is called
negative definite if xT Ax < 0 for all x ≠ 0.
For example, Q(x) = 5x1² − 8x1x2 + 5x2² = 9((x1 − x2)/√2)² + ((x1 + x2)/√2)² is positive definite.
The following tests may be used to check for the positive definiteness of a quadratic form or symmetric (or Hermitian) matrix:
(i) A is positive definite if and only if it can be reduced to an upper triangular form
using only elementary row (or column) operations of type 3 and the diagonal ele-
ments of the resulting matrix (the pivots) are all positive.
(ii) A is positive definite if and only if all its principal minors are positive. A principal
minor of A is the determinant of any submatrix obtained from A by deleting its
last k rows and k columns (k = 0, 1, . . . , n − 1).
(iii) A is positive definite if and only if all its eigenvalues are positive.
The proof of the first two statements may be found in the book by Bronson [8]. State-
ment (iii) is proved in the next section.
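These tests are easy to try numerically. Below is a minimal sketch in plain Python (not from the book, which uses Mathematica; the function names are ours) implementing test (ii) via leading principal minors, using a small cofactor-expansion determinant. The matrix is the 3 × 3 example that follows in the text.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_positive_definite(A):
    """Test (ii): a symmetric matrix is positive definite iff all leading
    principal minors d_k = det of the upper-left k x k block are positive."""
    n = len(A)
    return all(det([row[:k] for row in A[:k]]) > 0 for k in range(1, n + 1))

A = [[6, 2, -2], [2, 6, -2], [-2, -2, 10]]
print([det([row[:k] for row in A[:k]]) for k in (1, 2, 3)])  # [6, 32, 288]
print(is_positive_definite(A))                               # True
```

For large matrices one would use a Cholesky factorization instead of cofactor determinants; the recursion above is only meant to mirror the definition.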
Consider

A = ( 6 2 −2 ; 2 6 −2 ; −2 −2 10 ).

(i) Elementary row operations of type 3 (adding a multiple of one row to another) first give

A₁ = ( 6 2 −2 ; 0 16/3 −4/3 ; 0 −4/3 28/3 ),

and then

A₂ = ( 6 2 −2 ; 0 16/3 −4/3 ; 0 0 9 ).
Since all the diagonal elements (pivots) of A2 are positive, A is positive definite.
(ii) The principal minors of A are

d₁ = det[6] = 6,

d₂ = det( 6 2 ; 2 6 ) = 32,

and

d₃ = det( 6 2 −2 ; 2 6 −2 ; −2 −2 10 ) = 288.
Since all three principal minors are positive, the matrix is positive definite.
Let the eigenvalues of A be ordered as

λ₁ ≤ λ₂ ≤ ⋯ ≤ λₙ.

We define the Rayleigh quotient (which defines a mapping from ℝⁿ/ℂⁿ to the field of
real numbers) as

R(x) = ⟨Ax, x⟩ / ⟨x, x⟩.

It satisfies the bounds

λ₁ ≤ R(x) ≤ λₙ,
i. e., the Rayleigh quotient attains its minimum value (equal to the smallest eigen-
value of A) when x is the eigenvector corresponding to λ1 . Similarly, R(x) attains its
maximum value when x is the eigenvector corresponding to λn . This may be shown as
follows:
Suppose that U is the orthogonal (unitary) matrix that diagonalizes A. Then

U∗AU = Λ = diag(λ₁, . . . , λₙ).

Writing x = Uz gives R(x) = ⟨Λz, z⟩/⟨z, z⟩ = ∑ᵢ λᵢ|zᵢ|² / ∑ᵢ |zᵢ|². The upper and lower
bounds follow from this expression and the assumption that λ₁ ≤ λ₂ ≤ ⋯ ≤ λₙ.
df/dx |ₓ₌ₓ₀ = 0.   (7.10)

A sufficient condition is d²f/dx² |ₓ₌ₓ₀ ≠ 0. Further, the extremum is a local minimum if

d²f/dx² |ₓ₌ₓ₀ > 0,   (7.11)

and a local maximum if

d²f/dx² |ₓ₌ₓ₀ < 0.   (7.12)

The case of d²f/dx² |ₓ₌ₓ₀ = 0 corresponds to neither a maximum nor a minimum, but a saddle (inflection) point.
Figure 7.4: Schematic of extremum points: minimum, maximum and saddle points.
where

∇f(x₀) = ( 𝜕f/𝜕x₁ ; 𝜕f/𝜕x₂ ; . . . ; 𝜕f/𝜕xₙ )|ₓ₌ₓ₀ = gradient vector of f(x) at x = x₀   (7.14)

and

A = {aᵢⱼ}ₙ×ₙ = { 𝜕²f/𝜕xᵢ𝜕xⱼ }ₙ×ₙ |ₓ₌ₓ₀ = Hessian matrix
  = symmetric matrix of second partial derivatives of f(x) evaluated at x = x₀.   (7.15)
From equations (7.13)–(7.15), it can be seen that a necessary condition for f (x) to have
an extremum at x = x0 is
7.4 Maxima/minima for a function of several variables | 187
∇f (x0 ) = 0, (7.16)
Q(x) = xT Ax
where AT = A is a real symmetric matrix. If Q(x) > 0 for all x near the origin, then f (x)
has a local minimum, while if Q(x) < 0 for all x near the origin (0), then f (x) has a
local maximum. Otherwise, the extremum is a saddle point.
Let λ1 , λ2 , . . . , λn be the eigenvalues (real) and T be the orthogonal matrix that di-
agonalizes A. Let
x = Tz ⇒ z = T⁻¹x = Tᵀx   (7.19)

⇒ Q = zᵀTᵀATz = zᵀΛz = ∑ᵢ₌₁ⁿ λᵢzᵢ²   (7.20)
Thus, if all λi > 0 (i = 1, 2, . . . , n), Q only takes positive values, and hence the extremum
is a minimum. If λi < 0 for all i, Q is negative and the extremum is a maximum. If λi = 0
for some i or if eigenvalues of A are both positive and negative, the extremum is neither
a maximum nor a minimum.
It may be shown that a symmetric matrix A has positive eigenvalues (or is positive
definite) if all the principal minors have positive determinants. A principal minor is
obtained by striking out the last k rows and k columns (k = 0, 1, . . . , n − 1) of A. For
example, for the 2 × 2 case,

A = ( a₁₁ a₁₂ ; a₂₁ a₂₂ ),  a₁₂ = a₂₁,

is positive definite if

a₁₁ > 0  and  a₁₁a₂₂ − a₁₂a₂₁ = a₁₁a₂₂ − a₁₂² > 0.
Similarly, a 3 × 3 symmetric matrix A = {aᵢⱼ} is positive definite if

a₁₁ > 0,

det( a₁₁ a₁₂ ; a₂₁ a₂₂ ) > 0,

det( a₁₁ a₁₂ a₁₃ ; a₂₁ a₂₂ a₂₃ ; a₃₁ a₃₂ a₃₃ ) > 0.
Remark. A is negative definite if (−A) is positive definite. Then, for the 2 × 2 case, A is
negative definite if
a₁₁ < 0,  det( a₁₁ a₁₂ ; a₂₁ a₂₂ ) > 0.
Example 7.5. State the conditions for a function of two variables f (x1 , x2 ) to have a
local maximum.
For f (x1 , x2 ) to have a local maximum at (x10 , x20 ), a necessary condition is that the
first partial derivatives of f (x1 , x2 ) w. r. t. x1 and x2 vanish, i. e.,
𝜕f/𝜕x₁ (x₁₀, x₂₀) = 0;  𝜕f/𝜕x₂ (x₁₀, x₂₀) = 0.

A sufficient condition is that the quadratic form

Q = (1/2!) [ 𝜕²f/𝜕x₁² (x₁₀, x₂₀)(x₁ − x₁₀)² + 2 𝜕²f/𝜕x₁𝜕x₂ (x₁₀, x₂₀)(x₁ − x₁₀)(x₂ − x₂₀)
    + 𝜕²f/𝜕x₂² (x₁₀, x₂₀)(x₂ − x₂₀)² ]
is negative definite. This is the case if the following two conditions are satisfied:
𝜕²f/𝜕x₁² (x₁₀, x₂₀) < 0,

( 𝜕²f/𝜕x₁² (x₁₀, x₂₀) )( 𝜕²f/𝜕x₂² (x₁₀, x₂₀) ) − ( 𝜕²f/𝜕x₁𝜕x₂ (x₁₀, x₂₀) )² > 0.
Example 7.6. Consider the function f (x, y) = 2x 2 − 2xy + 5y2 − 18y + 23, and examine it
for extremum values.
The extremum points can be obtained by setting ∇f = 0:

𝜕f/𝜕x = 4x − 2y = 0,
𝜕f/𝜕y = −2x + 10y − 18 = 0,

which after solving leads to (x₀, y₀) = (1, 2) as a possible extremum point. We evaluate
the Hessian matrix (matrix of second derivatives),

A = ( 𝜕²f/𝜕x²  𝜕²f/𝜕x𝜕y ; 𝜕²f/𝜕y𝜕x  𝜕²f/𝜕y² ) evaluated at (x₀, y₀) = (1, 2), i.e., A = ( 4 −2 ; −2 10 ).

Its principal minors are

d₁ = |4| = 4 > 0,  d₂ = det( 4 −2 ; −2 10 ) = 36 > 0,

so A is positive definite and the point (1, 2) is a local minimum.
Example 7.7. Consider the function g(x, y) = 2x² − 8xy − y² + 18y − 9. The extremum
points can be obtained by setting ∇g = 0:

𝜕g/𝜕x = 4x − 8y = 0,
𝜕g/𝜕y = −8x − 2y + 18 = 0.

Solving these linear equations gives a possible extremum point (x₀, y₀) = (2, 1). We
examine the Hessian matrix

A = ( 𝜕²g/𝜕x²  𝜕²g/𝜕x𝜕y ; 𝜕²g/𝜕y𝜕x  𝜕²g/𝜕y² ) evaluated at (x₀, y₀) = (2, 1), i.e., A = ( 4 −8 ; −8 −2 ).
The eigenvalues of A are λ1 = 1 + √73 > 0 and λ2 = 1 − √73 < 0. Thus, the point x0 = 2,
y0 = 1 is neither a maximum nor a minimum. It is a saddle point.
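The classification in Examples 7.6 and 7.7 is easy to check numerically. Below is a small sketch in plain Python (our own helper, not from the book) for symmetric 2 × 2 Hessians, using the eigenvalues λ = (tr ± √(tr² − 4 det))/2; degenerate cases with a zero eigenvalue would need higher-order terms and are not handled.

```python
import math

def classify_2x2(a11, a12, a22):
    """Classify a critical point from its symmetric 2x2 Hessian.
    For symmetric matrices tr^2 - 4 det = (a11 - a22)^2 + 4 a12^2 >= 0,
    so both eigenvalues are real."""
    tr = a11 + a22
    det = a11 * a22 - a12 * a12
    root = math.sqrt(tr * tr - 4 * det)
    lam1, lam2 = (tr + root) / 2, (tr - root) / 2
    if lam1 > 0 and lam2 > 0:
        return "minimum"
    if lam1 < 0 and lam2 < 0:
        return "maximum"
    return "saddle"

print(classify_2x2(4, -2, 10))   # Example 7.6 Hessian -> minimum
print(classify_2x2(4, -8, -2))   # Example 7.7 Hessian -> saddle
```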
Here, uk is the system state at time k, which is assumed to be discrete, i. e., taking
values k = 0, 1, . . . ∞ while A is the connectivity, coupling or transition probability
matrix.
xₙ = ( uₙ₋₁ ; uₙ )  and  xₙ₊₁ = ( uₙ ; uₙ₊₁ ),   (7.23)

xₙ₊₁ = ( 0 1 ; b a ) xₙ = Axₙ  with x₁ given.   (7.24)

Similarly, for a third-order equation uₙ₊₂ = auₙ₊₁ + buₙ + cuₙ₋₁, we have

xₙ₊₁ = Axₙ,  where xₙ = ( uₙ₋₁ ; uₙ ; uₙ₊₁ ) and A = ( 0 1 0 ; 0 0 1 ; c b a ),   (7.26)

with given x₁ = ( u₀ ; u₁ ; u₂ ). Thus, higher-order scalar difference equations can be converted
to vector equations.
7.5 Linear difference equations | 191
u₁ = Au₀,
u₂ = Au₁ = A²u₀,
⋮
uₖ = Aᵏu₀,  k = 0, 1, 2, . . .   (7.27)
Thus, the computation of uk requires the evaluation of Ak , which is given by the spec-
tral theorem as follows:
Aᵏ = ∑ⱼ₌₁ⁿ λⱼᵏ Eⱼ,   (7.28)

Eⱼ = xⱼy∗ⱼ / (y∗ⱼxⱼ) = xⱼy∗ⱼ  (if y∗ⱼ are normalized).   (7.29)

Thus,

uₖ = ∑ⱼ₌₁ⁿ λⱼᵏ Eⱼu₀,  k = 0, 1, 2, . . .   (7.30)
We consider some special cases of this solution based on the magnitude of the eigen-
values.
Case 1: |λj | < 1 for all j = 1, 2, . . . , n
In this case, limk→∞ |λj |k → 0, and thus, uk → 0 for k → ∞ and the system
approaches the trivial state u = 0 for k → ∞.
Case 2: λ1 = 1, |λj | < 1 for all j = 2, 3, . . . , n
In this case, limk→∞ uk → E1 u0 = (y∗1 u0 )x1 , and the system approaches a scalar
multiple of the state given by the eigenvector x1 corresponding to λ1 = 1 (eigenvalue
of unity).
Case 3: |λj | > 1 for some j
In this case, limk→∞ uk → ∞ and the solution is not bounded.
Case 4: A pair of complex eigenvalues of unit modulus and all other eigenvalues
inside the unit circle, i. e., λ₁,₂ = α ± iβ with |λ₁| = |λ₂| = √(α² + β²) = 1 while |λⱼ| < 1 for
j = 3, 4, . . . , n. The solution is again bounded and lies on an invariant circle.
It should be pointed out that all these special cases occur in the solution of non-
linear algebraic equations by local linearization, e. g., Newton–Raphson or other iter-
ative methods.
Example 7.8 (Two-stage Markov process). Consider the case in which the state vector
u is of the form

uₖ = ( aₖ ; bₖ ),

where aₖ is the fraction of the population in state A and bₖ is the fraction of the pop-
ulation in state B at time k. For example, aₖ can be the fraction of students in a class
having a mobile phone of type I, while bₖ is the fraction having a phone of type II. Let
A = {pᵢⱼ} = P,

where P is the transition (switching) probability matrix, i. e., pᵢⱼ is the probability
of switching from state j to state i. Then

∑ᵢ₌₁ⁿ pᵢⱼ = 1  for j = 1, 2, . . . , n,
or the columns of P sum to unity for a Markov matrix. For example, for n = 2,

P = ( 2/3 1/2 ; 1/3 1/2 )

is a Markov matrix, where p₁₁ = 2/3 is the probability of a student in state A staying in
state A, p₂₁ = 1/3 is the probability of switching from state A to state B, p₁₂ = 1/2 is the
probability of switching from state B to state A, and p₂₂ = 1/2 is the probability of
staying in state B (here p₁₁ + p₂₁ = 1, p₁₂ + p₂₂ = 1).
We note that if P is a Markov matrix then λ1 = 1 is an eigenvalue of P with left
eigenvector yT1 = (1 1 . . . 1). This follows from the fact that ∑ni=1 pij = 1 for each j.
Thus, for a Markov process with initial state u0 , we have
uₖ = ∑ⱼ₌₁ⁿ λⱼᵏ Eⱼu₀,
Figure 7.5: Convergence in Markov chains with P = ( 2/3 1/2 ; 1/3 1/2 ) and u₀ = ( 0.05 ; 0.95 ).
Remark 7.1. For the case of n = 2 (a 2 × 2 Markov matrix), limₖ→∞ uₖ can also be
found by solving a simple algebraic equation. Let u₀ = ( β ; 1 − β ) and u∞ = ( α ; 1 − α ). Now,
since

u∞ = Pu∞ ⇒ ( α ; 1 − α ) = ( 2/3 1/2 ; 1/3 1/2 )( α ; 1 − α )

⇒ α = (2/3)α + (1/2)(1 − α) ⇒ α = 3/5

⇒ u∞ = ( 3/5 ; 2/5 ),

which is independent of β, i. e., of the initial point (provided the sum of the two compo-
nents of the initial condition is unity).
u∞ = Pu∞
along with the constraints that the sum of all u∞ components is unity. Note that
this is proportional to the eigenvector x1 corresponding to the eigenvalue λ1 where
the proportionality constant can be obtained with the constraints stated earlier, i. e.,
yT1 u∞ = 1.
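The approach of uₖ to u∞ is easy to verify by direct iteration; a short sketch in plain Python (the book itself works in Mathematica) with the matrix and initial state of Example 7.8:

```python
# Iterate u_{k+1} = P u_k for the 2-state Markov chain of Example 7.8.
P = [[2/3, 1/2],
     [1/3, 1/2]]
u = [0.05, 0.95]           # any initial fractions summing to 1

for _ in range(100):        # subdominant eigenvalue is 1/6, so convergence is fast
    u = [P[0][0]*u[0] + P[0][1]*u[1],
         P[1][0]*u[0] + P[1][1]*u[1]]

print(u)  # approximately [0.6, 0.4], i.e., u_inf = (3/5, 2/5)
```

The error contracts by the factor |λ₂| = |2/3 + 1/2 − 1| = 1/6 per step, so machine precision is reached after a few dozen iterations regardless of the starting fractions.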
xn+1 = xn + xn−1 , n = 1, 2, . . .
x0 = x1 = 1
which is a second-order linear difference equation. One way to solve this equation is
by assuming the solution of the form
xₙ = rⁿ,

which leads to

r² = r + 1 ⇒ r₁,₂ = (1 ± √5)/2  (r₁ ≈ 1.618, r₂ ≈ −0.618).
Thus, the general solution can be expressed as
xn = c1 r1n + c2 r2n
where c1 and c2 can be solved from initial points by setting for n = 0 and 1, respectively,
i. e.,
n = 0 ⇒ c₁ + c₂ = 1  and  n = 1 ⇒ c₁r₁ + c₂r₂ = 1

⇒ c₁,₂ = (√5 ± 1)/(2√5),

so that

xₙ = (1/√5) · [ (1 + √5)ⁿ⁺¹ − (1 − √5)ⁿ⁺¹ ] / 2ⁿ⁺¹.
Alternatively, the problem can be solved in vector form by defining

xₙ = ( xₙ₋₁ ; xₙ )

⇒ xₙ₊₁ = ( xₙ ; xₙ₊₁ ) = ( 0 1 ; 1 1 )( xₙ₋₁ ; xₙ )

⇒ xₙ₊₁ = Axₙ,  x₁ = ( 1 ; 1 ).

Thus, the characteristic equation of A is

λ² − λ − 1 = 0 ⇒ λ₁,₂ = (1 ± √5)/2,

with corresponding eigenvectors and eigenrows

x₁,₂ = ( 1 ; λ₁,₂ ),  yᵀ₁,₂ = (1/√(1 + λ₁,₂²)) ( 1  λ₁,₂ ),

so that the spectral projections are

E₁,₂ = x₁,₂yᵀ₁,₂ / (yᵀ₁,₂x₁,₂) = (1/(1 + λ₁,₂²)) ( 1 λ₁,₂ ; λ₁,₂ λ₁,₂² ).

Hence

( xₙ₋₁ ; xₙ ) = xₙ = Aⁿ⁻¹x₁ = Aⁿ⁻¹( 1 ; 1 )
  = ( (1 + λ₁)λ₁ⁿ⁻¹/(1 + λ₁²) + (1 + λ₂)λ₂ⁿ⁻¹/(1 + λ₂²) ; (1 + λ₁)λ₁ⁿ/(1 + λ₁²) + (1 + λ₂)λ₂ⁿ/(1 + λ₂²) )
⇒ xₙ = [ (1 + λ₁)/(1 + λ₁²) ] λ₁ⁿ + [ (1 + λ₂)/(1 + λ₂²) ] λ₂ⁿ.

But λ₁,₂ = (1 ± √5)/2 and λ₁,₂² = λ₁,₂ + 1, so

[ (1 + λ₁,₂)/(1 + λ₁,₂²) ] λ₁,₂ⁿ = [ λ₁,₂²/(1 + λ₁,₂²) ] λ₁,₂ⁿ = λ₁,₂ⁿ⁺²/(1 + λ₁,₂²)
  = ( (1 ± √5)/2 )ⁿ⁺² / ( 1 + (3 ± √5)/2 ) = (1/2ⁿ⁺¹) · (1 ± √5)ⁿ⁺²/(5 ± √5).

Therefore,

xₙ = (1/2ⁿ⁺¹) [ (1 + √5)ⁿ⁺²/(5 + √5) + (1 − √5)ⁿ⁺²/(5 − √5) ]
  = (1/√5) · [ (1 + √5)ⁿ⁺¹ − (1 − √5)ⁿ⁺¹ ] / 2ⁿ⁺¹,
which is the same result obtained earlier. Since |λ₂| = 0.618 < 1 and λ₂ⁿ → 0 for n → ∞,
xₙ may be approximated for large n as

xₙ ≈ [ (1 + λ₁)/(1 + λ₁²) ] λ₁ⁿ = 0.7236(1.618)ⁿ.
Au = b (7.31)
If we denote the j-th component of (Au − b) as ej , which represents the error in j-th
equation, then
f(u) = eᵀe = ∑ⱼ₌₁ᵐ eⱼ² = sum of squares of residuals.   (7.33)
To minimize f, we set

𝜕f/𝜕uₖ = 0,  k = 1, 2, . . . , n

⇒ ∇f = gradient of f = 0.   (7.34)

∇f = 2AᵀAu − 2Aᵀb = 0
⇒ AᵀAu = Aᵀb
⇒ u = (AᵀA)⁻¹Aᵀb.   (7.35)
Definition. A† = (AᵀA)⁻¹Aᵀ is called the generalized (or Moore–Penrose) inverse of A.
Properties of A†
(i) If m = n and A is an invertible square matrix, then A† = A−1 .
(ii) A† A = In = Identity matrix.
(iii) AA† A = A.
The equations
AT Au = AT b (7.36)
are often referred to as the “least squares equations” and can be solved for u if AT A is
invertible. If we let
B = AT A, (7.37)
then

Bᵀ = AᵀA = B,   (7.38)

so B is a symmetric matrix; moreover, xᵀBx = ‖Ax‖² ≥ 0, and hence the eigenvalues of B
are real and nonnegative. The positive square roots of the eigenvalues of B are called
the "singular values" of A.
A = U ( D 0 ; 0 0 ) Vᵀ,   (7.39)
where U and V are orthogonal matrices and D is a diagonal matrix having all the pos-
itive singular values of A as its diagonal elements. Equation (7.39) is referred to as
“singular-value decomposition” of A.
x1 + 3x2 = 5
x1 − x2 = 1
x1 + x2 = 0.
Here, we have

A = ( 1 3 ; 1 −1 ; 1 1 ),  b = ( 5 ; 1 ; 0 ),

⇒ AᵀA = ( 3 3 ; 3 11 ),  Aᵀb = ( 6 ; 14 ),

and solving AᵀAx̂ = Aᵀb gives

x̂₁ = 1 = x̂₂.
Note that the least squares solution is equivalent to fitting the line y = α + βx through
the data points (3, 5), (−1, 1) and (1, 0). The least squares solution α = 1 and β = 1 gives
the line closest to the data points, as shown in Figure 7.6.
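The normal equations of this example can be formed and solved in a few lines; a sketch in plain Python (the 2 × 2 solve is hard-coded via Cramer's rule, just for this example):

```python
# Fit y = alpha + beta*x to (3,5), (-1,1), (1,0) via the normal equations A^T A u = A^T b.
pts = [(3, 5), (-1, 1), (1, 0)]
A = [[1, x] for x, _ in pts]          # design matrix with columns (1, x)
b = [y for _, y in pts]

# form A^T A (2x2) and A^T b (2x1)
AtA = [[sum(A[i][r] * A[i][c] for i in range(3)) for c in range(2)] for r in range(2)]
Atb = [sum(A[i][r] * b[i] for i in range(3)) for r in range(2)]

# solve the 2x2 system by Cramer's rule
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
alpha = (Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1]) / det
beta = (AtA[0][0] * Atb[1] - Atb[0] * AtA[1][0]) / det

print(AtA, Atb)     # [[3, 3], [3, 11]] [6, 14]
print(alpha, beta)  # 1.0 1.0
```

In practice one solves the least squares problem with a QR or SVD factorization rather than forming AᵀA explicitly, since AᵀA squares the condition number.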
Problems
1. Given the quadratic form F = 6x12 + 5x22 + 7x32 − 4x1 x2 + 4x1 x3 :
(a) Reduce it to its canonical form.
(b) If F = 1, what are the lengths of the semiaxes?
(c) What are the directions of the principal axes with respect to the original axes?
2. Find the values of the parameter λ for which the following quadratic forms are
positive definite:
(a) 2x12 + x22 + 3x32 + 2λx1 x2 + 2x1 x3
(b) x12 + 4x22 + x32 + 2λx1 x2 + 10x1 x3 + 6x2 x3
7.6 Generalized inverse and least square solutions | 199
∫_{−∞}^{∞} ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} exp(−xᵀAx) dx = π^{n/2} / (det A)^{1/2}.
a 𝜕²u/𝜕x² + b 𝜕²u/𝜕y² + c 𝜕²u/𝜕z² + 2f 𝜕²u/𝜕x𝜕y + 2g 𝜕²u/𝜕y𝜕z + 2h 𝜕²u/𝜕z𝜕x
  + l 𝜕u/𝜕x + m 𝜕u/𝜕y + n 𝜕u/𝜕z + qu = 0

(a) For the two-dimensional equation

a 𝜕²u/𝜕x² + 2b 𝜕²u/𝜕x𝜕y + c 𝜕²u/𝜕y² + g 𝜕u/𝜕x + h 𝜕u/𝜕y + du = 0,

state explicitly (in terms of a, b and c) the conditions under which the PDE is
parabolic, elliptic or hyperbolic.
(b) The dispersion of a tracer in a capillary is described by the equation
𝜕C/𝜕t + 𝜕C/𝜕z + P 𝜕²C/𝜕t𝜕z − (1/Pe) 𝜕²C/𝜕z² = 0,
where P and Pe are constants known as the transverse and axial Peclet numbers,
respectively. Determine whether this equation is parabolic, elliptic or hyperbolic.
7. (Scalar differential equation with constant coefficients)
(a) Consider the scalar second-order equation

a d²u/dt² + b du/dt + cu = f(t),  t > 0,
u(0) = α, α ≠ 0,
u′(0) = β.

Express it in the vector form

du/dt = Au + b(t),  t > 0;  u(t = 0) = u₀,

where u = ( u₁ ; u₂ ), and identify the coefficient matrix A and vector b.
(b) Show that the nth order scalar initial value problem can also be expressed in
the same vector form. Identify the coefficient matrix A and vector b.
8. (Solution of inhomogeneous vector IVP)
Consider a general IVP in vector form
du/dt = Au + b(t),  t > 0;  u(t = 0) = u₀.

(a) Obtain a formal solution of the above equation using the integrating factor e⁻ᴬᵗ.
(b) Show that when b(t) = b₀ is a constant vector for t > 0 and A is invertible, the
solution simplifies to u(t) = eᴬᵗu₀ + A⁻¹(eᴬᵗ − I)b₀.

Consider next the homogeneous system

du/dt = Au,  t > 0;  u = u₀ at t = 0,

where A is a 3 × 3 matrix with one real eigenvalue λ₁ and a pair of complex eigen-
values λ₂ = α + iβ, λ₃ = α − iβ.
(a) Show that the eigenvector x₁ has real elements, while x₂ and x₃ may be ex-
pressed as complex conjugates:

x₂ = x₂R + i x₂I,  x₃ = x₂R − i x₂I,

where x₂R and x₂I are real vectors obtained from the real and imaginary parts of x₂.
(b) If we define

T = ( x₁  x₂R  x₂I ),

verify that

i.  T⁻¹AT = Λ̂ = ( λ₁ 0 0 ; 0 α β ; 0 −β α ),  or equivalently  A = TΛ̂T⁻¹,

ii. f(A) = Tf(Λ̂)T⁻¹; in particular,

eᴬᵗ = Te^{Λ̂t}T⁻¹ = T ( e^{λ₁t} 0 0 ; 0 e^{αt}cos(βt) e^{αt}sin(βt) ; 0 −e^{αt}sin(βt) e^{αt}cos(βt) ) T⁻¹.

[Remark: Here, Λ̂ is referred to as the real canonical form of A.]
A = ( −1 2 ; 2 −3 ; −1 3 ),  b = ( 4 ; 1 ; 2 ).
11. Consider a set of data points [(x1 , y1 ), (x2 , y2 ), . . . , (xN , yN )] and you want to fit a lin-
ear model y = α0 + α1 x.
(a) Show that this is equivalent to solving the system of equations Aα = y, where
A = ( 1 x₁ ; ⋮ ⋮ ; 1 x_N ),  α = ( α₀ ; α₁ ),  y = ( y₁ ; ⋮ ; y_N ).
(b) Show that the least squares solution is given by solving the normal equations:
(AT A)α = AT y.
du/dt = Au,   (1)
u(t = 0) = u₀,   (2)

is given by

u = ∑ⱼ₌₁ⁿ ( y∗ⱼu₀ / y∗ⱼxⱼ ) xⱼ e^{λⱼt},   (3)
where A is a constant coefficient n × n matrix, λj , xj and y∗j are the eigenvalues, eigen-
vectors and eigenrows of A, respectively. If A is a real symmetric (self-adjoint) matrix,
we have shown that yj = xj and the eigenvectors may be normalized to form an or-
thonormal basis. In this case, the solution given by Eq. (3) simplifies to
u = ∑ⱼ₌₁ⁿ ⟨u₀, xⱼ⟩ xⱼ e^{λⱼt},   (4)
where ⟨u₀, xⱼ⟩ = xⱼᵀu₀ is the scalar (dot) product. Now, consider the linear partial dif-
ferential equation
𝜕u/𝜕t = 𝜕²u/𝜕ξ²;  0 < ξ < 1, t > 0,   (5)

u(0, t) = u(1, t) = 0,   (6)

u(ξ, 0) = u₀(ξ).   (7)
https://doi.org/10.1515/9783110739701-008
206 | Abstract vector space concepts
where

⟨u₀(ξ), xⱼ(ξ)⟩ = ∫₀¹ u₀(ξ) xⱼ(ξ) dξ,  xⱼ(ξ) = √2 sin jπξ.   (9)
Here, λj = −j2 π 2 (j = 1, 2, . . .) are the eigenvalues and xj (ξ ) = √2 sin jπξ are the normal-
ized eigenfunctions of the linear differential operator
Lv = 𝜕²v/𝜕ξ²;  v(0) = v(1) = 0.   (10)
The striking similarity between the two solutions is due to the fact that they are both
linear equations and contain linear operators A and L which are symmetric or self-
adjoint. Thus, it is useful to study the general properties of such linear operators.
In what follows, we consider some abstract vector space concepts that are useful
in solving linear equations. The advantage of the abstract formalism is the unified
treatment of various cases that arise in applications.
8 Vector space over a field
8.1 Definition of a field
A field F is a collection of objects called scalars (or numbers) such that two binary op-
erations called addition and multiplication are defined with the following properties:
If α ∈ F, β ∈ F, then
(i) α + β ∈ F (addition)
(ii) α.β ∈ F (multiplication)
Further, the following axiomatic laws of addition and multiplication must hold:
1. α + β = β + α and α.β = β.α (Commutativity)
2. (α + β) + γ = α + (β + γ) and (α.β).γ = α.(β.γ) (Associativity)
3. α.(β + γ) = α.β + α.γ (Distributivity)
4. There are two distinct elements, denoted 0 and 1, respectively, with the properties
α + 0 = α, α.1 = α ∀α ∈ F
The element 0 is called the identity element for addition while 1 is called the iden-
tity element for multiplication.
5. For every α ∈ F, ∃ an element x in F ∋ α + x = 0.
x is called the additive inverse of α and is denoted by (−α).
6. For every α ∈ F (α ≠ 0), ∃ an element x in F ∋
   α.x = 1.
   x is called the multiplicative inverse of α and is denoted by α⁻¹.
The operations of subtraction and division are merely extensions of addition and mul-
tiplication, respectively. For example, b + (−a) is denoted as b − a and b.a−1 is denoted
as b/a(a ≠ 0).
Examples.
1. The set of rational numbers ( pq , q ≠ 0), (p and q integers) forms a field.
2. The set of real numbers (positive, negative and zero) forms a field (denoted by ℝ).
3. The set of complex numbers forms a field (denoted by ℂ).
4. The set of integers does not form a field since no multiplicative inverse exists.
5. The set of quaternions satisfies all the field axioms except commutativity of multi-
   plication, so it forms a division ring (a "skew field") rather than a field. Quaternions
   are hypercomplex numbers of the form a + ib + jc + kd, where a, b, c and d are real
   and ij = k = −ji, jk = i, ki = j, i² = j² = k² = −1.
https://doi.org/10.1515/9783110739701-009
208 | 8 Vector space over a field
u + 0 = u ∀u ∈ V
Note that, as the definition states, a vector space is a composite object consisting of a
field, a set of vectors and two operations with the above special properties.
u + v = (α1 + β1 , . . . , αn + βn )
Then it can be shown that V has all the above properties. Hence, V is a vector
space and is denoted by ℝn /ℂn .
8.2 Definition of an abstract vector or linear space: | 209
γ∈F
p(x) = α0 + α1 x + α2 x 2 + ⋅ ⋅ ⋅ + αN x N , αj ∈ F
The set of all solutions y of

Ly = 0,

where L is the linear n-th order differential operator

L = a₀(x) dⁿ/dxⁿ + a₁(x) dⁿ⁻¹/dxⁿ⁻¹ + ⋯ + aₙ(x),

is a vector space.
8.2.1 Subspaces
1. The subspace containing the single element or the zero vector {0} is a subspace.
This is called the zero subspace of V
2. Let V be the set of n-tuples defined over F and
x = (α1 α2 . . . αn ) ∈ V.
Ax = 0,
Definition. Let V be a vector space over F and S be a set of vectors in V. Then the
subspace spanned by S is defined to be the intersection W of all subspaces of V, which
contain S.
α1 x1 + α2 x2 + ⋅ ⋅ ⋅ + αn xn = 0.
If no such set of scalars exists, i. e., if the equation holds only when αᵢ = 0 for all i, then
the set S is said to be linearly independent.
Definition. Let V be a vector space over F. A basis for V is a linearly independent set
of vectors in V, which spans V.
Definition. The dimension of a vector space is the largest number of linearly indepen-
dent vectors in that space. If there is no largest number, then we say that the vector
space is infinite-dimensional.
Example. The vector space of continuous functions over the unit interval C[0, 1] is of
infinite dimension.
8.2.3 Coordinates
Let V be a finite-dimensional vector space over the field F. Let (x1 , x2 , . . . , xn ) be a basis
for V. Let z be any vector in V. Then we have
z = α1 x1 + α2 x2 + ⋅ ⋅ ⋅ + αn xn (8.1)
z = β1 y1 + β2 y2 + ⋅ ⋅ ⋅ + βn yn (8.2)
y₁ = p₁₁x₁ + p₂₁x₂ + ⋯ + pₙ₁xₙ = ( x₁ x₂ ⋯ xₙ ) ( p₁₁ ; p₂₁ ; ⋮ ; pₙ₁ )
⋮
yₙ = p₁ₙx₁ + p₂ₙx₂ + ⋯ + pₙₙxₙ,

so that

z = β₁(p₁₁x₁ + p₂₁x₂ + ⋯ + pₙ₁xₙ) + β₂(p₁₂x₁ + p₂₂x₂ + ⋯ + pₙ₂xₙ)
  + ⋯ + βₙ(p₁ₙx₁ + p₂ₙx₂ + ⋯ + pₙₙxₙ)
  = x₁(p₁₁β₁ + p₁₂β₂ + ⋯ + p₁ₙβₙ) + x₂(p₂₁β₁ + p₂₂β₂ + ⋯ + p₂ₙβₙ)
  + ⋯ + xₙ(pₙ₁β₁ + pₙ₂β₂ + ⋯ + pₙₙβₙ).   (8.3)

Comparing with (8.1), we conclude

α = Pβ ⇒ β = P⁻¹α  (∵ P is nonsingular).
Theorem. Let V be a n-dimensional vector space over the field F, let B and B′ be two
ordered bases of V. Then there is a unique, necessarily invertible, n × n matrix P with
entries in F such that
1.
[z]B = P[z]B′
An important point to note is that once a set of basis vectors is selected, coordi-
nates can be defined and all algebraic operations in the abstract vector space (of finite
dimension) can be reduced to operations on matrices and n-tuples.
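As a concrete sketch of this reduction (plain Python, 2 × 2 case; the particular basis is our own illustration, not from the book): coordinates relative to a new basis are obtained by solving α = Pβ, where the columns of P are the new basis vectors expressed in the old one.

```python
# Old basis: standard e1, e2. New basis (columns of P): u1 = (1, 1), u2 = (1, -2).
P = [[1, 1],
     [1, -2]]

def coords_in_new_basis(alpha):
    """Solve P beta = alpha for beta (Cramer's rule, 2x2)."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    b1 = (alpha[0] * P[1][1] - P[0][1] * alpha[1]) / det
    b2 = (P[0][0] * alpha[1] - alpha[0] * P[1][0]) / det
    return [b1, b2]

beta = coords_in_new_basis([3, 0])  # the vector (3, 0) in old coordinates
print(beta)                          # [2.0, 1.0] since (3,0) = 2*u1 + 1*u2
```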
Problems
1. Consider the vector space of polynomials of degree ≤ 3. Do the following four poly-
   nomials form a linearly independent set?

   1 − 2t + t² − 3t³,  −2 + t − 4t² + 5t³,  −1 − 4t − 5t² + t³,  3 + t − 4t² + 3t³
2. Consider the vector space of 3 × 3 symmetric matrices. What is its dimension? Find
a basis.
3. Consider the vector space of complex numbers over the real field. What is its di-
mension? Consider the vector space of complex numbers over the complex num-
ber field. What is its dimension?
4. Given the vector space ℝ4 , suppose xT = (a, b, c, d) where a, b, c, d ∈ ℝ. Consider
the subset with a + c = 0, b = 3d. Is this subset a vector space?
5. Consider a linear homogeneous n-th order differential equation with constant co-
efficients. Consider the space of polynomials of degree ≤ N where N is arbitrary
but fixed. By operating in the coordinate space show that no finite polynomial can
ever be a solution.
6. Let V be the vector space generated by the polynomials
p1 = x 3 − 2x 2 + 4x + 1
p2 = x 3 + 6x − 5
p3 = 2x 3 − 3x 2 + 9x − 1
p4 = 2x 3 − 5x 2 + 7x + 5
w = α1 y1 + α2 y2 + ⋅ ⋅ ⋅ + αn yn
where αi are constants (scalars in the field). Show that the representation of w is
unique.
8. Let U be the vector space of all 2 × 3 matrices over field ℝ.
(a) Determine a basis for U.
(b) Determine if the following matrices in U are dependent or independent:
A = ( 1 2 −3 ; 4 0 1 ),  B = ( 1 3 −4 ; 6 5 4 ),  C = ( 3 8 −11 ; 16 10 9 ).
9 Linear transformations
9.1 Definition of a linear transformation
Recall from calculus the definition of a function. A function consists of the following:
1. a set X called the domain of the function
2. a set Y called the codomain of the function
3. a rule (or correspondence) f , which associates with each element x of X a single
element f (x) of Y.
We write f : X → Y to indicate the function. Figure 9.1 shows the key features of a
function schematically.
Figure 9.1: Schematic diagram illustrating the domain and codomain of a function or a linear trans-
formation.
https://doi.org/10.1515/9783110739701-010
9.1 Definition of a linear transformation | 215
Examples.
1. Let V be any vector space. Then the identity transformation I defined by
Iu = u,
T(u) = Au
f (x) = c0 + c1 x + ⋅ ⋅ ⋅ + ck xk
Let (Df )(x) = c1 + 2c2 x + ⋅ ⋅ ⋅ + kck x k−1 . Then D is a linear transformation from V into
V—the differentiation transformation.
4. Let V be the vector space of polynomials in t defined over the field ℝ and T be
integration operation from 0 to 1, i. e.,
T : V → ℝ,  Tf = ∫₀¹ f(t) dt.
T(A) = PAQ.
T(0v ) = 0w
Examples.
1. Let V = ℝ2 and define T(x1 , x2 ) = (x2 , x1 ), then T is a linear operator on V
2. Let V = space of 2 × 2 matrices defined on ℝ. For A ∈ V, define T(A) = AB − BA
where B is a fixed 2 × 2 matrix, e. g.,
B = ( 1 2 ; 3 4 ).
aij ∈ F. These equations contain the essential information about T, i. e., they tell us
how T transforms each of the basis vectors (e1 , e2 , . . . , en ). Now, let u ∈ V be an arbi-
trary vector. Then
u = α1 e1 + ⋅ ⋅ ⋅ + αn en
α = ( α₁ ; α₂ ; ⋮ ; αₙ ).
Tu = w ∈ W
w = β 1 f1 + ⋅ ⋅ ⋅ + β m fm
β = ( β₁ ; β₂ ; ⋮ ; βₘ ),  βⱼ ∈ F.
Tu = α1 Te1 + ⋅ ⋅ ⋅ + αn Ten
= α1 [a11 f1 + a21 f2 + ⋅ ⋅ ⋅ + am1 fm ] + ⋅ ⋅ ⋅
+ αn [a1n f1 + a2n f2 + ⋅ ⋅ ⋅ + amn fm ]
= f1 [a11 α1 + a12 α2 + ⋅ ⋅ ⋅ + a1n αn ] + ⋅ ⋅ ⋅
+ fm [am1 α1 + am2 α2 + ⋅ ⋅ ⋅ + amn αn ]
β₁ = a₁₁α₁ + ⋯ + a₁ₙαₙ
⋮
βₘ = aₘ₁α₁ + ⋯ + aₘₙαₙ,
or β = Aα, where A = {aij } is called the matrix representation of the linear transforma-
tion T with respect to the bases e (in V) and f (in W). Thus, if u ∈ V, Tu ∈ W and once
we choose a basis for V and W then the linear transformation T : V → W (in abstract
spaces) is equivalent to the transformation
T : ℝⁿ → ℝᵐ ⇒ ( β₁ ; ⋮ ; βₘ ) = ( a₁₁ ⋯ a₁ₙ ; ⋮ ⋱ ⋮ ; aₘ₁ ⋯ aₘₙ ) ( α₁ ; ⋮ ; αₙ ).
218 | 9 Linear transformations
Examples.
1. Let V = ℝ2 and consider the linear operator T(u1 , u2 ) = (u2 , u1 ). Choose e1 = (1, 0),
e2 = (0, 1) as a basis for V.
0 1
A=( )
1 0
2. Let T be the linear operator on ℝ² whose matrix with respect to the standard basis is

A = ( 1 −1 ; 2 4 ).

Choosing instead the basis of eigenvectors

e₁ = ( 1 ; −1 ),  e₂ = ( 1 ; −2 )

gives

A = ( 2 0 ; 0 3 ).
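This change of basis can be verified directly: with P built from the eigenvectors as columns, P⁻¹AP should be diagonal. A minimal check in plain Python (2 × 2 only, inverse hard-coded; helper names are ours):

```python
# A in the standard basis; columns of P are the eigenvectors (1, -1) and (1, -2).
A = [[1, -1],
     [2,  4]]
P = [[ 1,  1],
     [-1, -2]]

def mat2(X, Y):
    """2x2 matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
Pinv = [[ P[1][1] / det, -P[0][1] / det],
        [-P[1][0] / det,  P[0][0] / det]]

print(mat2(Pinv, mat2(A, P)))  # [[2.0, 0.0], [0.0, 3.0]]
```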
3. Let V be the space of polynomials of degree ≤ 3 with basis {1, t, t², t³} and D = d/dt. Then

d/dt (1) = 0 = 0·1 + 0·t + 0·t² + 0·t³
d/dt (t) = 1 = 1·1 + 0·t + 0·t² + 0·t³
d/dt (t²) = 2t = 0·1 + 2·t + 0·t² + 0·t³
d/dt (t³) = 3t² = 0·1 + 0·t + 3·t² + 0·t³

⇒
9.2 Matrix representation of a linear transformation | 219
A = ( 0 1 0 0 ; 0 0 2 0 ; 0 0 0 3 ; 0 0 0 0 ).

Let

f = α₀ + α₁t + α₂t² + α₃t³.

Then

df/dt = A ( α₀ ; α₁ ; α₂ ; α₃ ) = ( α₁ ; 2α₂ ; 3α₃ ; 0 ) = α₁·1 + 2α₂·t + 3α₃·t² + 0·t³.
4. Let V be the same space of polynomials and T the integration operation from 0 to 1.
With respect to the basis {1, t, t², t³},

∫₀¹ 1 dt = 1,  ∫₀¹ t dt = 1/2,  ∫₀¹ t² dt = 1/3,  ∫₀¹ t³ dt = 1/4,

so

A = [ 1  1/2  1/3  1/4 ].

If

f = α₀ + α₁t + α₂t² + α₃t³ ∈ V,

then

Tf = [ 1  1/2  1/3  1/4 ] ( α₀ ; α₁ ; α₂ ; α₃ ) = α₀ + α₁/2 + α₂/3 + α₃/4.
5. Let V be the vector space of all 2 × 2 real matrices and T be a linear operator on V
defined by
T(A) = A ( 0 1 ; 1 0 ) − ( 0 1 ; 1 0 ) A.

With respect to the basis e₁ = ( 1 0 ; 0 0 ), e₂ = ( 0 1 ; 0 0 ), e₃ = ( 0 0 ; 1 0 ), e₄ = ( 0 0 ; 0 1 ):

Te₁ = ( 1 0 ; 0 0 )( 0 1 ; 1 0 ) − ( 0 1 ; 1 0 )( 1 0 ; 0 0 ) = ( 0 1 ; 0 0 ) − ( 0 0 ; 1 0 )
    = ( 0 1 ; −1 0 ) = 0·e₁ + 1·e₂ − 1·e₃ + 0·e₄,

Te₂ = ( 0 1 ; 0 0 )( 0 1 ; 1 0 ) − ( 0 1 ; 1 0 )( 0 1 ; 0 0 ) = ( 1 0 ; 0 0 ) − ( 0 0 ; 0 1 )
    = ( 1 0 ; 0 −1 ) = 1·e₁ + 0·e₂ + 0·e₃ − 1·e₄,

Te₃ = ( 0 0 ; 1 0 )( 0 1 ; 1 0 ) − ( 0 1 ; 1 0 )( 0 0 ; 1 0 ) = ( 0 0 ; 0 1 ) − ( 1 0 ; 0 0 )
    = ( −1 0 ; 0 1 ) = −1·e₁ + 0·e₂ + 0·e₃ + 1·e₄,

Te₄ = ( 0 0 ; 0 1 )( 0 1 ; 1 0 ) − ( 0 1 ; 1 0 )( 0 0 ; 0 1 ) = ( 0 0 ; 1 0 ) − ( 0 1 ; 0 0 )
    = ( 0 −1 ; 1 0 ) = 0·e₁ − 1·e₂ + 1·e₃ + 0·e₄.

Hence

[T]{eᵢ} = ( 0 1 −1 0 ; 1 0 0 −1 ; −1 0 0 1 ; 0 −1 1 0 ).
Theorem. Let V be a finite-dimensional vector space over the field F, and let B1 =
(e1 , e2 , . . . , en ), B2 = (u1 , u2 , . . . , un ) be two sets of ordered bases for V. Suppose that T is
a linear operator on V. If P is the n×n transition matrix, which expresses the coordinates
of each vector in V relative to B1 in terms of its coordinates relative to B2 , then
A2 = P−1 A1 P
Sketch of the proof. A₁ is the matrix representation of T w. r. t. B₁. Express the new ba-
sis in terms of the old one: uⱼ = ∑ᵢ pᵢⱼeᵢ. If x ∈ V, then

x = α₁e₁ + ⋯ + αₙeₙ = α₁′u₁ + ⋯ + αₙ′uₙ,  with  α = Pα′.

Also,

[Tx] = A₁α = y = β₁e₁ + ⋯ + βₙeₙ ⇒ β = A₁α.

Now, relative to B₂ we have β′ = A₂α′ and β = Pβ′, so Pβ′ = A₁Pα′, i. e., β′ = (P⁻¹A₁P)α′.
Hence A₂ = P⁻¹A₁P.
Definition. Let V and W be vector spaces over the field F and T : V → W be a linear
transformation. Then the range of T is the set of all w ∈ W such that w = Tu for some
u ∈ V, and the null space (kernel) of T, denoted N_T, is the set of all u ∈ V such that
Tu = 0.
Proof.
1. Let R_T denote the range of T. Let w₁ ∈ R_T and w₂ ∈ R_T. Then ∃u₁, u₂ ∈ V such
that

Tu₁ = w₁,  Tu₂ = w₂.

Now consider αw₁ + βw₂ = αTu₁ + βTu₂ = T(αu₁ + βu₂) ∈ R_T, since αu₁ + βu₂ ∈ V.
∴ R_T is a subspace.
2. Similarly, if u₁, u₂ ∈ N_T, then T(αu₁ + βu₂) = αTu₁ + βTu₂ = 0, so αu₁ + βu₂ ∈ N_T.
∴ N_T is a subspace.
Rank of T = dim(range of T)
Nullity of T = dim(kernel of T) = dim(null space of T)
Proof. Let u1 , . . . , uk be a basis for NT (null space of T). Then there are vectors
(uk+1 , . . . , un ) in V such that (u1 , . . . , un ) is a basis for V. We shall prove that {Tuk+1 , . . . ,
Tun } is a basis for the range of T. The vectors {Tu1 , . . . , Tun } certainly span the range
of T and since Tuj = 0, j = 1, . . . , k, we see that {Tuk+1 , . . . , Tun } span the range of T.
To see that these vectors are linearly independent, suppose we have scalars αi such
that
∑ᵢ₌ₖ₊₁ⁿ αᵢ(Tuᵢ) = 0 ⇒ T( ∑ᵢ₌ₖ₊₁ⁿ αᵢuᵢ ) = 0,

so that y = ∑ᵢ₌ₖ₊₁ⁿ αᵢuᵢ belongs to the null space of T and can therefore be written as

y = ∑ᵢ₌₁ᵏ βᵢuᵢ.

Thus,

∑ᵢ₌₁ᵏ βᵢuᵢ − ∑ᵢ₌ₖ₊₁ⁿ αᵢuᵢ = 0,

and since (u₁, . . . , uₙ) is linearly independent,

αₖ₊₁ = ⋯ = αₙ = 0,  β₁ = ⋯ = βₖ = 0.
Thus, if k is the nullity of T, the fact that {Tuk+1 , . . . , Tun } form a basis for range of T
implies that rank of
T = n − k = dim(Range of T)
∴ The result.
Ax = b, x ∈ ℝn , b ∈ ℝm
The matrix A may be viewed as the linear transformation A : ℝn → ℝm and the solu-
tion to Ax = b is the preimage of b ∈ℝm . Furthermore, the solution of the associated
homogeneous equation Ax = 0 may be viewed as the kernel of the linear mapping
A :ℝn → ℝm . Figure 9.2 illustrates the kernel and range of a linear transformation
schematically.
Figure 9.2: Schematic diagram illustrating the kernel and range of a linear transformation. The dot
represents the zero vector.
Ax = 0
9.2.4 Isomorphism
If V and W are vector spaces over the field F, any one-to-one linear transformation
T of V onto W (i. e., any bijective linear transformation from V to W) is called an iso-
morphism of V onto W. If there exists an isomorphism of V onto W, we say that V is
isomorphic to W.
Theorem. Every n-dimensional vector space V over the field F is isomorphic to Fⁿ.

Proof. Let V be the n-dimensional space over F and let B = {e₁, e₂, . . . , eₙ} be a basis
for V. Define a function T : V → Fⁿ as follows:
If x ∈ V, let Tx be the n-tuple of coordinates (α₁, . . . , αₙ) of x relative to the basis B.
Now it is easily verified that T is linear, one-to-one and maps V onto Fⁿ.
∴ The result.
For many purposes one often regards isomorphic vector spaces as being “the
same,” although the vectors and operations in the spaces may be quite different.
Proof. Note that T is nonsingular implies that T is one-to-one and onto. To show this,
let

Tx = Ty
⇒ T(x − y) = 0w (by linearity)
⇒ x − y = 0v (since T is nonsingular)
⇒ x = y. ∴ Tx = Ty ⇒ x = y, i. e., T is one-to-one.
To show that T is onto, it is sufficient if we show that if {e1 , . . . , en } is a basis for V
then {Te1 , . . . , Ten } is a basis for W. We claim that Te1 , . . . , Ten are an independent set
of vectors; for suppose
α1 Te1 + ⋅ ⋅ ⋅ + αn Ten = 0w
⇒
α1 e1 + ⋅ ⋅ ⋅ + αn en = 0v since T is nonsingular
TT−1 = Iw = identity on W
T−1 T = Iv = identity on V
Theorem. Let V and W be finite-dimensional vector spaces over the field F such that dim
V = dim W. If T is a linear transformation from V into W, the following are equivalent:
1. T is invertible
2. T is nonsingular
3. The range of T is W
4. If {e1 , . . . , en } is a basis for V, then {Te1 , . . . , Ten } is a basis for W.
Tx₁ = y₁ or x₁ = T⁻¹y₁,
Tx₂ = y₂ or x₂ = T⁻¹y₂.

Now, T(αx₁ + βx₂) = αy₁ + βy₂, so that

T⁻¹(αy₁ + βy₂) = αx₁ + βx₂ = αT⁻¹y₁ + βT⁻¹y₂.

Thus, T⁻¹ is also a linear transformation.
Application
Consider the solution of Ax = b where A is n × n, x, b ∈ℝn . The above theorems can be
used to prove the following theorem.
Theorem (Linear algebraic equations).
1. If the homogeneous system Ax = 0 (A is n × n and x is n × 1) has only the trivial
solution, then the inhomogeneous system has a unique solution for any b ∈ℝn .
2. If Ax = 0 has nonzero solution, then ∃b ∈ℝn for which Ax = b has no solution.
Furthermore, if a solution exists for some b, it is not unique.
Example. Consider the system

u₁ − 2u₂ = b₁
2u₁ − 4u₂ = b₂

or Au = b with A = ( 1 −2 ; 2 −4 ) and b = (b₁, b₂)ᵀ. The homogeneous system Au = 0 has the nonzero solutions

u_h = c x₁; x₁ = (2, 1)ᵀ,

and the inhomogeneous system is consistent only when b₂ = 2b₁; e.g., for b = (−3, −6)ᵀ the general solution is

u = c₁x₁ + x₂; x₂ = (1, 2)ᵀ,
where c1 is an arbitrary constant. Figure 9.3 shows the solution space of the homoge-
neous and (consistent) inhomogeneous system.
Figure 9.3: The eigenvectors of A and the solution spaces of the homogeneous equation Au = 0 and
inhomogeneous equation Au = b (when they are consistent).
Remark. The vectors x1 and x2 are the eigenvectors of the matrix A, with x1 corre-
sponding to the zero eigenvalue. The solution space of Au = b is a translation of that
of Au = 0 by x2 . Such a space is called an affine space.
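These observations are easy to check numerically. The book's own computations use Mathematica; the following NumPy sketch is an added illustration that verifies the eigenvectors and the affine solution space:

```python
import numpy as np

# Singular matrix from the example above: eigenvalues 0 and -3.
A = np.array([[1.0, -2.0],
              [2.0, -4.0]])
x1 = np.array([2.0, 1.0])   # eigenvector for the zero eigenvalue (spans null space)
x2 = np.array([1.0, 2.0])   # eigenvector for eigenvalue -3

# Homogeneous solutions: A(c x1) = 0 for any scalar c.
assert np.allclose(A @ (3.7 * x1), 0.0)

# A consistent right-hand side: b = A x2 = -3 x2 lies in the range of A.
b = A @ x2
# Every u = c1 x1 + x2 solves A u = b -- an affine space parallel to span{x1}.
for c1 in (-2.0, 0.0, 5.0):
    assert np.allclose(A @ (c1 * x1 + x2), b)
```

The loop shows that shifting any homogeneous solution by the particular solution x₂ again solves Au = b, which is exactly the translation pictured in Figure 9.3.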
Problems
1. Determine which of the following transformations are linear:
(a) T : ℝ2 → ℝ2 defined by T(x, y) = (x + y, x)
(b) T : ℝ2 → ℝ3 defined by T(x, y) = (x + 1, 2y, x + y)
(c) T : C[0, 1] → C[0, 1] defined by g(t) = (Tf)(t) = ∫₀¹ K(t, s)f(s) ds, where g(t) and f(t) are elements in C[0, 1] and K(t, s) is continuous in [0, 1] × [0, 1].
2. Let T : ℝ⁴ → ℝ³ be the linear transformation defined by
T(x, y, z, w) = (x − y + z + w, x + 2z − w, x + y + 3z − 3w).
Find a basis and the dimension of (a) the range of T, (b) the kernel of T.
4. Consider the linear operator T on ℝ3 defined by
In a normed linear space, in addition to length, we can also measure distance between
vectors by defining a distance function
d(x, y) = ‖x − y‖.
Examples.
1. Let V = ℝn and for x ∈ V, define
‖x‖ₚ = (∑_{i=1}^{n} |xᵢ|ᵖ)^{1/p}, p ≥ 1.

It can be shown that this definition satisfies all three rules of a norm:
(a) For p = 1, ‖x‖₁ = ∑_{i=1}^{n} |xᵢ|.
(b) For p = 2, ‖x‖₂ = (∑_{i=1}^{n} |xᵢ|²)^{1/2}, which is the standard Euclidean norm.
(c) For p → ∞, ‖x‖_∞ = max_{1≤i≤n} |xᵢ|, which is also referred to as the supremum norm.
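As a small numerical illustration (a NumPy sketch, not from the text), the three norms can be computed directly from the definition and compared with `numpy.linalg.norm`:

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])

# p-norm: ||x||_p = (sum |x_i|^p)^(1/p)
def pnorm(x, p):
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

assert np.isclose(pnorm(x, 1), 7.0)        # ||x||_1 = 3 + 4
assert np.isclose(pnorm(x, 2), 5.0)        # ||x||_2 = sqrt(9 + 16)
assert np.isclose(max(abs(x)), 4.0)        # ||x||_inf = max |x_i|
# numpy's built-in norms agree:
assert np.isclose(np.linalg.norm(x, 1), 7.0)
assert np.isclose(np.linalg.norm(x, np.inf), 4.0)
# the induced distance function d(x, y) = ||x - y||:
y = np.zeros(3)
assert np.isclose(np.linalg.norm(x - y, 2), 5.0)
```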
2. Let V = C[a, b], the space of continuous functions defined on [a, b]. For f(x) ∈ V, define

‖f‖₁ = ∫_a^b |f(x)| dx, ‖f‖₂ = (∫_a^b |f(x)|² dx)^{1/2}, ‖f‖_∞ = max_{a≤x≤b} |f(x)|.
230 | 10 Normed and inner product vector spaces
Again, it may be shown that these three definitions satisfy the three rules of a
norm. [Note that the vector space C[a, b] is infinite-dimensional.]
If f(x), g(x) ∈ V, the distance function corresponding to each of the above norms is d(f, g) = ‖f − g‖.

Completeness is a delicate issue in such spaces. For example, the partial sums

Sₙ = 1 + 1/1! + 1/2! + 1/3! + ⋅ ⋅ ⋅ + 1/n! = ∑_{k=0}^{n} 1/k!

form a Cauchy sequence of rational numbers whose limit e is not rational. Similarly, one can construct sequences of functions fₙ and gₙ that are continuous, i.e., fₙ ∈ C[0, 1] and gₙ ∈ C[−1, 1] for any finite n, but for n → ∞,
f∞(x) = { 1, x = 0;  0, x ≠ 0 } ∉ C[0, 1],

g∞(x) = { −1, x < 0;  0, x = 0;  1, x > 0 } ∉ C[−1, 1].
10.2 Inner product vector spaces | 231
Thus, the limits of sequences of continuous functions may not be continuous and the
space C[a, b] may not be complete depending on how d(f , g) is defined. Similarly, in
infinite-dimensional vector spaces, the functions (or vectors) may have an uncountable number of discontinuities, as illustrated by the Dirichlet function:

f_D(x) = { 0, x is rational;  1, x is irrational },  x ∈ [0, 1].
The Riemann integral ∫₀¹ f_D(x) dx does not exist. However, since the set of rational numbers has zero measure in [0, 1], the Lebesgue integral exists. Further, there is no difference between f_D(x) and the function f̂(x) = 1 ∀x ∈ [0, 1] in the Lebesgue theory of integration. The Lebesgue integral of f_D(x) is

∫₀¹ f_D(x) dx = 1.
The Fourier coefficients of fD and ̂f are identical and d2 (̂f , fD ) = 0. We return to this
example in Chapter 21 when we discuss the theory of convergence in function spaces.
Definition. Let V be a vector space over the field F (ℝ or ℂ). An inner product on V is a function that assigns to each pair of vectors u, v ∈ V a scalar ⟨u, v⟩ ∈ F satisfying:
1. Linearity: ⟨αu + βw, v⟩ = α⟨u, v⟩ + β⟨w, v⟩
2. Hermitian symmetry: ⟨u, v⟩ = \overline{⟨v, u⟩} (the bar denotes complex conjugation)
3. Positivity: ⟨u, u⟩ > 0 for u ≠ 0
Remarks.
1. If F = ℝ, then the bar denoting complex conjugation is superfluous.
2. Inner product is a generalization to an abstract vector space of the dot (scalar)
product in two and three dimensions.
Examples.
1. Let F = ℝ and V = ℝⁿ. For u, v ∈ V, define

⟨u, v⟩ = u₁v₁ + u₂v₂ + ⋅ ⋅ ⋅ + uₙvₙ.

The distance between two points (vectors) is defined as d(u, v) = ‖u − v‖, where ‖u‖ = √⟨u, u⟩. These are the standard dot product and distance function in the Euclidean space ℝⁿ.
2. Let F = ℂ and V = ℂⁿ. For u, v ∈ V define

⟨u, v⟩ = u₁v̄₁ + u₂v̄₂ + ⋅ ⋅ ⋅ + uₙv̄ₙ.

Then
(a) linearity in the first argument follows directly from the definition;
(b) ⟨v, u⟩ = v₁ū₁ + ⋅ ⋅ ⋅ + vₙūₙ, so that

\overline{⟨v, u⟩} = v̄₁u₁ + ⋅ ⋅ ⋅ + v̄ₙuₙ = ⟨u, v⟩;

(c) ⟨u, u⟩ = u₁ū₁ + ⋅ ⋅ ⋅ + uₙūₙ = |u₁|² + ⋅ ⋅ ⋅ + |uₙ|² > 0 for u ≠ 0.
3. Let F = ℝ and V = ℝⁿ, and let G be an n × n symmetric, positive-definite matrix. For u, v ∈ V, define

⟨u, v⟩ = vᵀGu.

Then
(a) linearity follows from the linearity of matrix multiplication;
(b) ⟨u, v⟩ = vᵀGu = (vᵀGu)ᵀ (since it is a scalar) = uᵀGᵀv = uᵀGv (since Gᵀ = G) = ⟨v, u⟩;
(c) ⟨u, u⟩ = uᵀGu > 0 for u ≠ 0,

since G is positive definite. G is called the matrix of the inner product (or metric of the inner product) space. For the standard (Euclidean) inner product in Example 1, G = I, the identity matrix.
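The three axioms in Example 3 are easy to verify numerically. In this NumPy sketch (an added illustration, not from the text), a symmetric positive-definite metric G is generated as MᵀM + 3I:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))
G = M.T @ M + 3 * np.eye(3)        # symmetric positive-definite metric

def ip(u, v):
    """Inner product <u, v> = v^T G u."""
    return v @ G @ u

u, v, w = rng.normal(size=(3, 3))  # three random test vectors
a, b = 2.0, -1.5
# (a) linearity in the first argument
assert np.isclose(ip(a * u + b * w, v), a * ip(u, v) + b * ip(w, v))
# (b) symmetry (real case)
assert np.isclose(ip(u, v), ip(v, u))
# (c) positivity
assert ip(u, u) > 0
```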
4. Let F = ℝ and V = space of all continuous real valued functions in the interval
a ≤ t ≤ b. For f (t), g(t) ∈ V, define
⟨f, g⟩ = ∫_a^b f(t)g(t) dt.
This satisfies the axioms of an inner product. Note that the space V in this example
is infinite-dimensional. This space is denoted by C[a, b]. To verify the positivity axiom, note that

‖f‖ = 0 ⇒ ∫_a^b f(t)² dt = 0.
If f (t) is continuous, the only way the integral can be zero is f (t) ≡ 0, a ≤ t ≤ b.
This inner product is very useful in applications. Very often, we are interested in
solving nonlinear equations of the form:
N(y) = 0
When an exact solution is not available, y(t) is approximated by some f(t). The closeness of this approximation to the exact solution can be quantified only if a norm (or inner product) is defined on the space.
There exist a variety of inner products on the space C[a, b]. For example,

⟨f, g⟩ = ∫_a^b ρ(t)f(t)g(t) dt, ρ(t) > 0 for a < t < b.

It can be shown that this also satisfies all the axioms of an inner product.
Definition. Let V be an inner product space. The length of a vector u ∈ V (also called
the norm of u denoted by ‖u‖) is defined by ‖u‖ = √⟨u, u⟩.
Theorem (Cauchy–Schwarz inequality). For any u, v ∈ V,

|⟨u, v⟩|² ≤ ⟨u, u⟩ ⋅ ⟨v, v⟩.
Proof. If v = 0, both sides of the inequality are zero and it is satisfied. Assume v ≠ 0.
Then, by property (3) of the inner product,

⟨w, w⟩ ≥ 0, w ∈ V.

Let w = u − αv ⇒

⟨u − αv, u − αv⟩ = ⟨u, u⟩ − ᾱ⟨u, v⟩ − α⟨v, u⟩ + αᾱ⟨v, v⟩ ≥ 0.

Let α = ⟨u, v⟩/⟨v, v⟩ ⇒

⟨u, u⟩ − |⟨u, v⟩|²/⟨v, v⟩ ≥ 0 ⇒ ⟨u, u⟩ ⋅ ⟨v, v⟩ ≥ |⟨u, v⟩|².

∴ The result.
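A quick numerical check of the Cauchy–Schwarz inequality in ℂⁿ (a NumPy sketch, added as an illustration; `np.vdot(v, u)` computes ∑ uᵢv̄ᵢ, matching the inner product of Example 2):

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(100):
    u = rng.normal(size=4) + 1j * rng.normal(size=4)
    v = rng.normal(size=4) + 1j * rng.normal(size=4)
    ipuv = np.vdot(v, u)                  # <u, v> = sum u_i conj(v_i)
    lhs = abs(ipuv) ** 2
    rhs = np.vdot(u, u).real * np.vdot(v, v).real
    assert lhs <= rhs + 1e-12             # |<u,v>|^2 <= <u,u><v,v>
# equality holds when u is a scalar multiple of v:
u = (2 - 3j) * v
assert np.isclose(abs(np.vdot(v, u)) ** 2,
                  np.vdot(u, u).real * np.vdot(v, v).real)
```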
Definition. The angle between two vectors u, v in an inner product vector space V is
defined by
cos θ = |⟨u, v⟩| / (‖u‖ ⋅ ‖v‖).
Definition. Let V be an inner product space over a field F. Two vectors u, v ∈ V are said to be orthogonal (w.r.t. the inner product) if ⟨u, v⟩ = 0.
Remark. If V is an inner product space, we can define (1) distances between vectors
(2) lengths of vectors (3) angles between vectors, i. e., an inner product space has a
geometrical structure.
A vector space in which only distances are defined is called a metric space.
A vector space in which lengths are defined is called a normed linear space. The
schematic diagram of Figure 10.1 shows the relationship between these spaces.
Theorem. Let V be a finite dimensional inner product vector space and {u1 , u2 , . . . , un }
be a set of orthogonal vectors. Then, this set is linearly independent provided it does not
include the zero vector.
Proof. Let

v = ∑_{i=1}^{n} αᵢuᵢ

⇒

⟨v, uⱼ⟩ = ⟨∑_{i=1}^{n} αᵢuᵢ, uⱼ⟩ = αⱼ⟨uⱼ, uⱼ⟩ (by orthogonality).

If ⟨uⱼ, uⱼ⟩ ≠ 0 ⇒

αⱼ = ⟨v, uⱼ⟩ / ‖uⱼ‖².

In particular, if v = 0, then every αⱼ = 0, which proves linear independence. ∴

v = ∑_{i=1}^{n} (⟨v, uᵢ⟩ / ‖uᵢ‖²) uᵢ.
Theorem (Gram–Schmidt). Every finite-dimensional inner product space has an orthonormal basis.

Proof. Let {u₁, u₂, . . . , uₙ} be a basis for V. From this basis, we shall show how to obtain an orthogonal basis {v₁, v₂, . . . , vₙ}. When each of the vectors in this orthogonal basis is normalized to have unit length,

eᵢ = vᵢ / ‖vᵢ‖, i = 1, 2, . . . , n,

we obtain an orthonormal basis. Let

v₁ = u₁
v₂ = u₂ − (⟨u₂, v₁⟩ / ‖v₁‖²) v₁.

Then

⟨v₂, v₁⟩ = ⟨u₂, v₁⟩ − (⟨u₂, v₁⟩ / ‖v₁‖²) ⟨v₁, v₁⟩ = 0.

∴ v₂ is orthogonal to v₁. Let

v₃ = u₃ − ∑_{i=1}^{2} (⟨u₃, vᵢ⟩ / ‖vᵢ‖²) vᵢ.

Then v₃ is orthogonal to v₁ and v₂. Continuing in this manner, we define

v_k = u_k − ∑_{i=1}^{k−1} (⟨u_k, vᵢ⟩ / ‖vᵢ‖²) vᵢ, k = 2, . . . , n.

Note that v_k ≠ 0; otherwise we would have

u_k = ∑_{i=1}^{k−1} (⟨u_k, vᵢ⟩ / ‖vᵢ‖²) vᵢ,

contradicting the linear independence of {u₁, . . . , u_k}.
If {e₁, . . . , eₙ} is an orthonormal basis and x ∈ V, write

x = ∑_{i=1}^{n} αᵢeᵢ

⇒

⟨x, eⱼ⟩ = ⟨∑_{i=1}^{n} αᵢeᵢ, eⱼ⟩ = αⱼ.

∴

x = ∑_{i=1}^{n} ⟨x, eᵢ⟩ eᵢ.
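The Gram–Schmidt recursion translates directly into code. The following NumPy sketch (an added illustration; the inner product is passed in as a function, so a weighted inner product could be substituted for the dot product) orthonormalizes a basis and checks the expansion x = ∑⟨x, eᵢ⟩eᵢ:

```python
import numpy as np

def gram_schmidt(U, ip):
    """Orthonormalize the columns of U w.r.t. the inner product ip(u, v)."""
    E = []
    for k in range(U.shape[1]):
        v = U[:, k].astype(float)
        for e in E:
            v = v - ip(v, e) * e          # subtract the projection onto e
        E.append(v / np.sqrt(ip(v, v)))   # normalize: e_i = v_i / ||v_i||
    return np.column_stack(E)

dot = lambda u, v: u @ v
U = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
E = gram_schmidt(U, dot)
assert np.allclose(E.T @ E, np.eye(3))    # columns are orthonormal
# the expansion x = sum <x, e_i> e_i recovers any vector:
x = np.array([2.0, -1.0, 3.0])
assert np.allclose(sum(dot(x, E[:, i]) * E[:, i] for i in range(3)), x)
```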
Remark. The above procedure also indicates to us how to define the inner product
so that {u1 , u2 , . . . , un } is an orthonormal basis. Thus, we may formulate a theorem as
follows.
Theorem. Let V be a finite dimensional vector space over a field F. Let {u1 , u2 , . . . , un } be
a basis for V. Then, ∃ an inner product on V such that {u1 , u2 , . . . , un } is an orthonormal
basis.
Proof. Let x, y ∈ V be any two vectors. Expand x and y in terms of the basis {uᵢ}:

x = ∑_{i=1}^{n} αᵢuᵢ, y = ∑_{i=1}^{n} βᵢuᵢ.

Define

⟨x, y⟩ = ∑_{i=1}^{n} αᵢβᵢ.

It is easily checked that this is an inner product and that ⟨uᵢ, uⱼ⟩ = δᵢⱼ, i.e., {uᵢ} is an orthonormal basis. If the coordinates w.r.t. another basis {eᵢ} are related by

α = Pα′, β = Pβ′,

then ⟨x, y⟩ = βᵀα = (β′)ᵀ(PᵀP)α′. This is the inner product in the e-basis that makes {uᵢ} an orthonormal set. Note that PᵀP is a positive definite matrix.
Definition. A scalar-valued linear transformation f : V → F is called a linear functional on V.

Examples.
1. Let V = C[a, b] = space of continuous real valued functions over the field ℝ. For g(t) ∈ V, define

f(g) = ∫_a^b g(t) dt.

This linear functional maps the space V into the real line.
2. Let F = ℝ and V = ℝn and let {e1 , e2 , . . . , en } be a basis for V. Let f : ℝn → ℝ be a lin-
ear functional on V and f (ej ) = aj . Then the matrix of f in the basis {e1 , e2 , . . . , en }
is a row vector [a1 , a2 , . . . , an ]. If x ∈ V is any vector and
x = ∑_{j=1}^{n} βⱼeⱼ,

then

f(x) = f(∑_{j=1}^{n} βⱼeⱼ) = ∑_{j=1}^{n} βⱼf(eⱼ) (since f is linear) = ∑_{j=1}^{n} βⱼaⱼ = ⟨β, a⟩.
Thus, every linear functional on ℝⁿ appears as the standard inner product of the coordinate vector β with a fixed vector a, i.e., f(x) = ⟨β, a⟩.
Through the use of an orthonormal basis, this adjoint operation on linear operators
is identified with the operation of forming the conjugate transpose of a matrix. These
ideas are illustrated below.
Theorem (Riesz representation). Let V be a finite-dimensional inner product space and f a linear functional on V. Then there exists a unique vector y ∈ V such that f(x) = ⟨x, y⟩ for all x ∈ V.

Proof. Let {e₁, . . . , eₙ} be an orthonormal basis for V and let

y = ∑_{j=1}^{n} \overline{f(eⱼ)} eⱼ.

We shall show that this y is the y of the theorem. Let f̂ be the linear functional on V defined by f̂(x) = ⟨x, y⟩ ⇒

f̂(e_k) = ⟨e_k, ∑_{j=1}^{n} \overline{f(eⱼ)} eⱼ⟩ = ∑_{j=1}^{n} f(eⱼ)⟨e_k, eⱼ⟩ = f(e_k).

Since f̂ and f agree on each basis vector, we have f = f̂, and hence the theorem is proved. Now suppose that there are two such vectors (say y and z). Then

⟨x, y⟩ = f(x) = ⟨x, z⟩ for all x ∈ V ⇒ ⟨x, y − z⟩ = 0 for all x ∈ V.

Take x = y − z ⇒

⟨y − z, y − z⟩ = 0 ⇒ y − z = 0 or y = z.
Theorem. Let T be a linear operator on a finite-dimensional inner product space V. Then there exists a unique linear operator T∗ on V such that ⟨Tx, y⟩ = ⟨x, T∗y⟩ for all x, y ∈ V.

Proof.
1. T∗ exists: Let y ∈ V be a fixed vector. Then f(x) = ⟨Tx, y⟩ defines a linear functional on V, so by the Riesz representation theorem there exists a unique vector ŷ ∈ V such that ⟨Tx, y⟩ = ⟨x, ŷ⟩ for all x ∈ V. Define

T∗y = ŷ.

[Note: T∗ is called the adjoint operator, so that ⟨Tx, y⟩ = ⟨x, T∗y⟩ for x, y ∈ V.]
2. T∗ is a linear operator: Consider

⟨x, T∗(αy₁ + βy₂)⟩ = ⟨Tx, αy₁ + βy₂⟩ = ᾱ⟨Tx, y₁⟩ + β̄⟨Tx, y₂⟩ = ᾱ⟨x, T∗y₁⟩ + β̄⟨x, T∗y₂⟩ = ⟨x, αT∗y₁ + βT∗y₂⟩

for all x ∈ V, so that T∗(αy₁ + βy₂) = αT∗y₁ + βT∗y₂.
∴ T∗ is a linear operator.
Theorem. Let {e₁, . . . , eₙ} be an orthonormal basis for V and let T be a linear operator with matrix A = (αᵢⱼ) in this basis. Then
(a) αᵢⱼ = ⟨Teⱼ, eᵢ⟩, and
(b) the matrix of T∗ with respect to the same basis is A∗ (conjugate transpose of A).
Proof. Let

Teⱼ = ∑_{i=1}^{n} αᵢⱼeᵢ. (10.1)

(Expanding Teⱼ in terms of the basis vectors gives the j-th column of A.) Taking the inner product with eᵢ and using orthonormality,

⟨Teⱼ, eᵢ⟩ = αᵢⱼ.

Similarly, if T∗eⱼ = ∑_{i=1}^{n} βᵢⱼeᵢ (the j-th column of the matrix of T∗), then

⟨T∗eⱼ, eᵢ⟩ = βᵢⱼ.

But

βᵢⱼ = ⟨T∗eⱼ, eᵢ⟩ = \overline{⟨eᵢ, T∗eⱼ⟩} = \overline{⟨Teᵢ, eⱼ⟩} = ᾱⱼᵢ,

i.e., the matrix of T∗ is the conjugate transpose A∗.
∴ The result.
Remark. If the basis of V is not orthonormal, the relationship between the matrix of
T and T∗ is more complicated than given in the theorem above.
Since ⟨Tx, y⟩ = ⟨x, T∗y⟩, if T = T∗ (self-adjoint) ⇒

⟨Tx, y⟩ = ⟨x, Ty⟩ for all x, y ∈ V.

Thus, self-adjointness or the symmetry property of a linear operator or its matrix representation very much depends on the definition of inner product (or equivalently, adjointness depends on the definition of inner product).
Definition. A linear operator T is called normal if it commutes with its adjoint, i. e.,
TT∗ = T∗ T.
10.3 Linear functionals and adjoints | 243
Characteristic values
Let T : V → V be a linear operator over a field F. A scalar λ ∈ F is called a characteristic value or eigenvalue of T if Tx = λx for some nonzero x ∈ V. The requirement λ ∈ F is crucial: in some cases, because of the limitation on the field, there may be no characteristic values. If we choose an orthonormal basis for V, the eigenvalues of T and T∗ (adjoint) as well as the corresponding eigenvectors can be found from the matrix representations. Suppose that {e₁, e₂, . . . , eₙ} is an orthonormal basis for V, and A = [T]ₑ = n × n matrix with aᵢⱼ ∈ F. Then the eigenvalues of T are given by the algebraic equations

(A − λI)x = 0 (10.2)

and of T∗ by

(A∗ − ηI)y = 0.

Now,

Tx = λx or Ax = λx
T∗y = ηy or A∗y = ηy.

Since A∗ is the conjugate transpose of A, the eigenvalues of T∗ are the complex conjugates of those of T, i.e., η = λ̄.
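The conjugate relationship between the spectra of A and A∗ is easy to observe numerically (a NumPy sketch, added as an illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
lam = np.linalg.eigvals(A)
lam_adj = np.linalg.eigvals(A.conj().T)   # eigenvalues of the adjoint A*
# eta = conj(lambda): the two spectra are complex conjugates of each other
assert np.allclose(np.sort_complex(lam_adj), np.sort_complex(np.conj(lam)))
```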
Theorem. Let V be an inner product space over F and let T be a linear operator on V.
1. If T is self-adjoint (T∗ = T), then (a) all eigenvalues of T are real, and (b) eigenvectors corresponding to distinct eigenvalues are orthogonal.
2. If T is skew-adjoint (T∗ = −T), then the eigenvalues of T are purely imaginary.
3. If T = S∗S for some linear operator S, then the eigenvalues of T are real and nonnegative.

Proof.
1. (a) Let Tx = λx with x ≠ 0. Then

λ⟨x, x⟩ = ⟨λx, x⟩ = ⟨Tx, x⟩ = ⟨x, T∗x⟩ = ⟨x, Tx⟩ = ⟨x, λx⟩ = λ̄⟨x, x⟩

⇒ λ = λ̄, i.e., λ is real.
(b) Let Txᵢ = λᵢxᵢ and Txⱼ = λⱼxⱼ with λᵢ ≠ λⱼ. Then

⟨Txᵢ, xⱼ⟩ = λᵢ⟨xᵢ, xⱼ⟩,

and also

⟨Txᵢ, xⱼ⟩ = ⟨xᵢ, T∗xⱼ⟩ = ⟨xᵢ, Txⱼ⟩ = ⟨xᵢ, λⱼxⱼ⟩ = λⱼ⟨xᵢ, xⱼ⟩ (since λⱼ is real)

⇒ (λᵢ − λⱼ)⟨xᵢ, xⱼ⟩ = 0, and as λᵢ ≠ λⱼ ⇒ ⟨xᵢ, xⱼ⟩ = 0.
2. Let T∗ = −T and Tx = λx, x ≠ 0. Then

λ⟨x, x⟩ = ⟨λx, x⟩ = ⟨Tx, x⟩ = ⟨x, T∗x⟩ = ⟨x, −Tx⟩ = ⟨x, −λx⟩ = −λ̄⟨x, x⟩

⇒ λ = −λ̄, i.e., λ is purely imaginary.
3. Let T = S∗S and Tx = λx, x ≠ 0. Then

λ⟨x, x⟩ = ⟨λx, x⟩ = ⟨Tx, x⟩ = ⟨S∗Sx, x⟩ = ⟨Sx, Sx⟩ ≥ 0

⇒ λ = ⟨Sx, Sx⟩/⟨x, x⟩ ≥ 0, i.e., λ is real and nonnegative.
Theorem. Let V be a finite dimensional inner product space defined over F and let T be a self-adjoint linear operator. Then T has a complete set of n linearly independent eigenvectors.

Proof. It is sufficient if we prove the theorem for a Hermitian matrix A. We need only to show that a Hermitian matrix does not have any generalized eigenvectors of rank 2. This implies there are no generalized eigenvectors of any rank ≥ 2, for if there were a GEV of rank k (k ≥ 3), it would generate a chain of GEVs of rank k, k − 1, . . . , 2, 1. Thus, it is sufficient to prove the theorem for k = 2. Assume x is a GEV of rank 2 with eigenvalue λ ⇒

(A − λI)²x = 0, (A − λI)x ≠ 0.

Then

0 = ⟨(A − λI)²x, x⟩ = ⟨(A − λI)x, (A − λI)∗x⟩ = ⟨(A − λI)x, (A − λI)x⟩ (since A∗ = A and λ is real)
= ‖(A − λI)x‖² ⇒ (A − λI)x = 0,

a contradiction. ∴ The result.
Theorem. Let T be a self-adjoint operator on a finite-dimensional inner product space V. Then V has an orthogonal basis consisting of eigenvectors of T.

Proof. It follows from the previous two theorems. Let λ₁, λ₂, . . . , λ_r be the distinct eigenvalues. If r = n, then there are n orthogonal eigenvectors. If r < n, there are repeated eigenvalues. Suppose λᵢ is an eigenvalue of multiplicity mᵢ. We showed that there cannot be any GEV of rank ≥ 2.
⇒ There are mᵢ linearly independent eigenvectors corresponding to λᵢ. Apply the Gram–Schmidt procedure to make these orthogonal. Then they are not only orthogonal to each other but also to the other eigenvectors (which correspond to distinct eigenvalues).
∴ The result.
If x is an arbitrary vector, then expanding in the orthonormal eigenvector basis {x₁, . . . , xₙ},

x = ∑_{i=1}^{n} αᵢxᵢ, αᵢ = ⟨x, xᵢ⟩.

Define the projection Eᵢ by Eᵢx = αᵢxᵢ = ⟨x, xᵢ⟩xᵢ. ∴

x = ∑_{i=1}^{n} Eᵢx ⇒ E₁ + E₂ + ⋅ ⋅ ⋅ + Eₙ = I. (10.4)

∴

Tx = ∑_{i=1}^{n} αᵢTxᵢ = ∑_{i=1}^{n} αᵢλᵢxᵢ = ∑_{i=1}^{n} λᵢEᵢx

⇒

λ₁E₁ + λ₂E₂ + ⋅ ⋅ ⋅ + λₙEₙ = T. (10.5)
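Equations (10.4) and (10.5) can be verified directly for a small symmetric matrix (a NumPy sketch, added as an illustration):

```python
import numpy as np

# Symmetric matrix: real eigenvalues, orthonormal eigenvectors.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, X = np.linalg.eigh(A)                 # columns of X are orthonormal
# Projections E_i x = <x, x_i> x_i, i.e. E_i = x_i x_i^T
E = [np.outer(X[:, i], X[:, i]) for i in range(2)]

assert np.allclose(E[0] + E[1], np.eye(2))             # (10.4): sum E_i = I
assert np.allclose(lam[0] * E[0] + lam[1] * E[1], A)   # (10.5): sum lam_i E_i = T
assert np.allclose(E[0] @ E[1], 0)                     # the E_i are orthogonal
```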
Correspondingly, the space decomposes as V = W₁ ⊕ W₂ ⊕ ⋅ ⋅ ⋅ ⊕ Wₙ, where Wᵢ = range of Eᵢ.
Invariant subspaces
Let V be a finite-dimensional vector space and T be a linear operator on V. Let W be
a subspace of V. W is called an invariant subspace (w. r. t. T) if T maps W into itself,
i. e., x ∈ W ⇒ Tx ∈ W. A schematic diagram of such invariant subspaces is shown in
Figure 10.2 for V = W1 ⊕ W2 .
Definition. Let W₁, W₂, . . . , W_k be subspaces of V such that every x ∈ V can be expressed as

x = ∑_{i=1}^{k} wᵢ, where wᵢ ∈ Wᵢ,

and the {wᵢ} are uniquely determined. Then we say that V is a direct sum of W₁, . . . , W_k and write

V = W₁ ⊕ W₂ ⊕ ⋅ ⋅ ⋅ ⊕ W_k.
Example.
1. Let V = ℝ², W₁ = space spanned by e₁ = (1, 0), W₂ = space spanned by e₂ = (0, 1). If x ∈ V, then x = α₁e₁ + α₂e₂ uniquely, so

V = W₁ ⊕ W₂.

2. Let V = ℝ³, W₁ = space spanned by {(1, 0, 0), (0, 1, 0)}, W₂ = space spanned by {(0, 0, 1)}. Then we can write

V = W₁ ⊕ W₂.
A schematic diagram of the spaces W1 and W2 is shown in Figure 10.3. Here, W1 is the
(x, y) plane of dimension 2 and W2 is the z-axis of dimension 1.
Projections
Let E : V → V be a linear operator on a finite-dimensional vector space V. Then E is
called a projection if E2 x = Ex for all x in V.
Theorem. Let E be a projection. Let R = range of E = {y/y = Ex} and N = null space of
E = {x/Ex = 0}. Then V = R ⊕ N.
Proof. If x ∈ V, then

x = x + Ex − Ex = (x − Ex) + Ex.

Now E(x − Ex) = Ex − E²x = 0, so x − Ex ∈ N, while Ex ∈ R. Thus every x can be written as

x = y + z, y ∈ R, z ∈ N.

To show that the sum is direct, let x ∈ R ∩ N. Since x ∈ R,

x = Ey for some y ∈ V ⇒ Ex = E(Ey) = E²y = Ey = x.

But x is also in N ⇒ Ex = 0
⇒ x = 0.
∴ V = R ⊕ N.
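A concrete NumPy illustration of this theorem (added here; E is an oblique projection, chosen only to satisfy E² = E):

```python
import numpy as np

# E projects onto the x-axis along the direction (1, 1): E^2 = E.
E = np.array([[1.0, -1.0],
              [0.0,  0.0]])
assert np.allclose(E @ E, E)

x = np.array([3.0, 5.0])
y = E @ x            # component in R = range(E)
z = x - E @ x        # component in N = null(E), since E z = Ex - E^2 x = 0
assert np.allclose(E @ z, 0.0)
assert np.allclose(y + z, x)   # every x splits as x = y + z with y in R, z in N
```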
Definition. E₁ and E₂ are called orthogonal projections on the vector space V if E₁E₂x = E₂E₁x = 0 for all x ∈ V.
In general, let {E₁, E₂, . . . , E_k} be orthogonal projections on a finite-dimensional inner product space V. Let

Eⱼx = xⱼ, j = 1, 2, . . . , k.

Then

Eⱼ²x = xⱼ, . . . , Eⱼⁿx = xⱼ, n = 3, 4, . . .
This form of the spectral theorem is a generalization of that stated in Part I. A proof
of this may be found in the book by Halmos [20].
Finally, the following diagonalization theorems can be stated.
In particular, a real orthogonal matrix is orthogonally similar to a block-diagonal matrix of the form

diag( 1, . . . , 1, −1, . . . , −1, R(θ₁), . . . , R(θₙ) ),  R(θᵢ) = ( cos θᵢ  −sin θᵢ
                                                                      sin θᵢ   cos θᵢ ),

in which the eigenvalues ±1 appear as 1 × 1 diagonal entries and the complex-conjugate pairs e^{±iθᵢ} appear as 2 × 2 rotation blocks.
The proof of these theorems may be found in the books by Halmos [20], Naylor
and Sell [24] and Lipschutz and Lipson [22].
Problems
1. Consider the vector space V of polynomials of degree at most N defined over the interval (a, b). Equip V with the inner product

⟨f, g⟩ = ∫_a^b ρ(t)f(t)g(t) dt,
where ρ(t) > 0 for a < t < b. Indicate how one may determine an orthogonal basis
set by applying the Gram–Schmidt process to the basis {1, t, t 2 , . . . , t N }. Find the
first three members for the following cases:
(a) ρ(t) = 1, a = 0, b = 1 (Legendre polynomials on the unit interval)
(b) ρ(t) = 1, a = −1, b = 1 (classical Legendre polynomials)
(c) ρ(t) = exp(−t), a = 0, b = ∞ (Laguerre polynomials)
(d) ρ(t) = exp(−t 2 ), a = −∞, b = ∞ (Hermite polynomials)
(e) ρ(t) = [t(1 − t)]^{−1/2}, a = 0, b = 1 (Chebyshev polynomials on the unit interval)
2. Consider the space ℂn of n-tuples of complex numbers. Let W be a nonsingular
n × n matrix. For u, v ∈ℂn , define
(u, v)W = v∗ W∗ Wu
λ₁ ≤ (Tu, u)/(u, u) ≤ λₙ
Let V be the space of 2 × 2 real matrices, let

M = ( 1 −1
     −2  2 ),

and define the linear operator T on V by

T(A) = MA − AM for A ∈ V.
(a) Find a basis and the dimension of the kernel and image of T.
(b) Show that ⟨A, B⟩ = tr(B∗ A), where tr stands for the trace (sum of diagonal
elements) satisfies the requirements of an inner product.
(c) Find the adjoint operator.
(d) Determine the eigenvalues and eigenvectors of T.
8. In the numerical solution of transport and reaction problems in a tube in which
the flow is laminar, we need a set of polynomial trial functions (to approximate
the unknown solution) on the unit interval 0 < r < 1 such that each function
vanishes at r = 1 while its derivative vanishes at r = 0. The functions should also
be orthogonal w. r. t. the weight function
ρ(r) = 4r(1 − r 2 )
w′ = w − (c1 u1 + c2 u2 + ⋅ ⋅ ⋅ + cr ur )
where
cⱼ = ⟨w, uⱼ⟩/⟨uⱼ, uⱼ⟩; j = 1, 2, . . . , r.
Now suppose that T is not self-adjoint with respect to the standard inner product. Then
the question is if we can make T self-adjoint w. r. t. the inner product
⟨x, y⟩ = ∑_{i=1}^{n} ∑_{j=1}^{n} gᵢⱼxⱼyᵢ = yᵀGx, (11.2)

where G = (gᵢⱼ) is a symmetric, positive-definite matrix,
The requirement ⟨Ax, y⟩ = ⟨x, Ay⟩ for all x and y gives yᵀGAx = yᵀAᵀGx, or equivalently,

GA = AᵀG = AᵀGᵀ (since G = Gᵀ),

so that

GA = (GA)ᵀ. (11.4)
Thus, the matrix A is symmetric (or T is self-adjoint) with respect to the inner product (11.2) if and only if the matrix GA is symmetric with respect to the standard inner product (11.1).
Now consider the special case in which G is a diagonal matrix
254 | 11 Applications of finite-dimensional linear algebra
G = diag(g₁, g₂, . . . , gₙ), gᵢ > 0, i = 1, 2, . . . , n.

Then

(GA)ᵢⱼ = gᵢaᵢⱼ.

G is positive definite ⟺ gᵢ > 0 for all i. Thus, if we can choose gᵢ such that gᵢ > 0 and

aᵢⱼgᵢ = aⱼᵢgⱼ,

then A is symmetric with respect to the weighted inner product

⟨x, y⟩ = ∑_{i=1}^{n} gᵢxᵢyᵢ. (11.5)
Note that we can do so only if aij and aji are of the same sign. The above generaliza-
tion of the inner product to make a nonsymmetric matrix into a symmetric matrix has
many applications in chemical engineering. We illustrate here the use of weighted dot
product with examples.
Example 11.1. Let

A = ( 1 2
      3 2 )

and consider the weighted inner product

⟨x, y⟩ = yᵀGx, G = ( g₁ 0 ; 0 g₂ ); g₁, g₂ > 0.

Then

GA = ( g₁ 0 ; 0 g₂ )( 1 2 ; 3 2 ) = ( g₁  2g₁
                                      3g₂ 2g₂ ).

GA is symmetric if

2g₁ = 3g₂.

Take g₁ = 1 ⇒ g₂ = 2/3.
11.1 Weighted dot/inner product in ℝn | 255
With the inner product

⟨x, y⟩ = x₁y₁ + (2/3)x₂y₂,

A is symmetric. Note that this definition satisfies all the rules of an inner product.
Eigenvalues and vectors of A:

det(A − λI) = (1 − λ)(2 − λ) − 6 = 0
⇒ λ² − 3λ − 4 = 0 ⇒ (λ − 4)(λ + 1) = 0 ⇒ λ = 4, −1.

λ₁ = −1 ⇒ ( 2 2 ; 3 3 )( x₁₁ ; x₁₂ ) = 0 ⇒ x₁ = ( 1, −1 )ᵀ

λ₂ = 4 ⇒ ( −3 2 ; 3 −2 )( x₂₁ ; x₂₂ ) = 0 ⇒ x₂ = ( 2, 3 )ᵀ

It can be seen that ⟨x₁, x₂⟩ = 1·2 + (2/3)·(−1)·3 = 0. Thus, x₁ and x₂ are orthogonal w.r.t. the new inner product.
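A numerical check of Example 11.1 (a NumPy sketch, added as an illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])
G = np.diag([1.0, 2.0 / 3.0])      # weights g1 = 1, g2 = 2/3

# GA is symmetric, so A is self-adjoint w.r.t. <x, y> = y^T G x
GA = G @ A
assert np.allclose(GA, GA.T)

x1 = np.array([1.0, -1.0])   # eigenvector for lambda = -1
x2 = np.array([2.0,  3.0])   # eigenvector for lambda =  4
assert np.allclose(A @ x1, -1.0 * x1)
assert np.allclose(A @ x2,  4.0 * x2)
# orthogonal in the weighted inner product, though not in the standard one:
assert np.isclose(x2 @ G @ x1, 0.0)
assert not np.isclose(x2 @ x1, 0.0)
```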
In fact, it may be shown that any real or complex n × n matrix A that has real
eigenvalues and a complete set of eigenvectors is self-adjoint (symmetric) with respect
to some inner product.
Suppose that A has real eigenvalues and a complete set of eigenvectors. Then, ∃ a
nonsingular matrix T such that
A = TΛT−1 (11.6)
where

Λ = spectral matrix = diag(λ₁, λ₂, . . . , λₙ).
(11.6) ⇒

Aᵀ = (TΛT⁻¹)ᵀ = (T⁻¹)ᵀΛTᵀ. (11.7)

For A to be symmetric w.r.t. the inner product

⟨x, y⟩ = yᵀGx, (11.8)

we should have

GA = (GA)ᵀ = AᵀG (11.9)
or

GAG⁻¹ = Aᵀ. (11.10)

Substituting A = TΛT⁻¹ into (11.10),

GTΛ(GT)⁻¹ = Aᵀ. (11.11)

Comparing (11.11) with (11.7), we may take GT = (T⁻¹)ᵀ, i.e.,

G = (TTᵀ)⁻¹, (11.12)

and the corresponding inner product is

⟨x, y⟩ = yᵀ(TTᵀ)⁻¹x. (11.13)

For the special case in which G is a diagonal matrix, equation (11.13) reduces to equation (11.5).
Example 11.2. Consider again

A = ( 1 2
      3 2 ),

with eigenvectors (1, −1)ᵀ and (2, 3)ᵀ scaled by arbitrary constants c₁ and c₂:

T = ( c₁ 2c₂
     −c₁ 3c₂ ),  Tᵀ = ( c₁  −c₁
                        2c₂  3c₂ )

⇒

TTᵀ = ( c₁² + 4c₂²   −c₁² + 6c₂²
        −c₁² + 6c₂²   c₁² + 9c₂² ).

Take c₁ = c₂ = c ⇒

G = (TTᵀ)⁻¹ = (1/(25c⁴)) ( 10c²  −5c²
                           −5c²   5c² ) = (2/(5c²)) ( 1    −1/2
                                                     −1/2   1/2 ).

Take c² = 2/5
⇒

G = ( 1    −1/2
     −1/2   1/2 )

and

⟨x, y⟩ = yᵀGx = (y₁ y₂) ( 1 −1/2 ; −1/2 1/2 ) ( x₁ ; x₂ )
= x₁y₁ − (1/2)x₁y₂ − (1/2)x₂y₁ + (1/2)x₂y₂.
Take

x = ( 1, 0 )ᵀ

⇒

⟨x, x⟩ = x₁² − x₁x₂ + (1/2)x₂² = 1,

so x has unit length. A vector y orthogonal to x must satisfy

⟨x, y⟩ = 0 ⇒ y₁ − (1/2)y₂ = 0 ⇒ y₂ = 2y₁.

Take y₁ = 1 ⇒ y₂ = 2; then ⟨y, y⟩ = 1 − 1·2 + (1/2)·4 = 1, so y also has unit length.
∴ e₁ = (1, 0), e₂ = (1, 2) is an orthonormal basis for ℝ². In this inner product space e₁, e₂ are orthonormal, i.e., they have unit length and are orthogonal to each other. Clearly, the geometry of this space is quite different from what it is with the standard inner product (see Figure 11.1).
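The construction G = (TTᵀ)⁻¹ of Example 11.2 can be verified numerically (a NumPy sketch, added as an illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])
c = np.sqrt(2.0 / 5.0)                 # c1 = c2 = sqrt(2/5)
T = np.column_stack([c * np.array([1.0, -1.0]),    # eigenvector for -1
                     c * np.array([2.0,  3.0])])   # eigenvector for  4
G = np.linalg.inv(T @ T.T)             # G = (T T^T)^(-1), equation (11.12)

assert np.allclose(G, [[1.0, -0.5], [-0.5, 0.5]])
GA = G @ A
assert np.allclose(GA, GA.T)           # A is symmetric w.r.t. <x,y> = y^T G x
# e1 = (1, 0) and e2 = (1, 2) are orthonormal in this inner product:
e1, e2 = np.array([1.0, 0.0]), np.array([1.0, 2.0])
for u, v, val in [(e1, e1, 1.0), (e2, e2, 1.0), (e1, e2, 0.0)]:
    assert np.isclose(v @ G @ u, val)
```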
Figure 11.1: Schematic diagram of orthonormal basis vectors w. r. t. weighted inner product defined
by equation (11.12).
Consider the initial value problem

C du/dt = Qu, u = u₀ @ t = 0, (11.14)

where Q is a symmetric matrix and C is a diagonal matrix with positive entries,

C = diag(α₁, α₂, . . . , αₙ), αᵢ > 0. (11.15)

Multiplying through by C⁻¹,

du/dt = C⁻¹Qu = Au, u = u₀ @ t = 0. (11.16)
dt
Here, the matrix A = C−1 Q is not symmetric w. r. t. the usual inner product. However,
if we define the weighted inner product
⟨u, v⟩ = vᵀCu = ∑_{i=1}^{n} αᵢuᵢvᵢ for all real vectors u and v, (11.17)
then

CA = C(C⁻¹Q) = Q = Qᵀ = (CA)ᵀ

⇒ A is symmetric w.r.t. the inner product (11.17); hence its eigenvalues λⱼ are real and its eigenvectors xⱼ are orthogonal w.r.t. (11.17). Thus, the solution of equations (11.14) or (11.16) can be obtained by taking the inner product (defined in equation (11.17)) with the eigenvectors xⱼ of A:

(d/dt)⟨u, xⱼ⟩ = ⟨Au, xⱼ⟩ = ⟨u, Axⱼ⟩ = λⱼ⟨u, xⱼ⟩ (∵ A is symmetric, λⱼ are real)

⇒

⟨u, xⱼ⟩ = ⟨u₀, xⱼ⟩ e^{λⱼt}

⇒

u(t) = ∑_{j=1}^{n} (⟨u, xⱼ⟩/⟨xⱼ, xⱼ⟩) xⱼ = ∑_{j=1}^{n} (⟨u₀, xⱼ⟩/⟨xⱼ, xⱼ⟩) e^{λⱼt} xⱼ. (11.20)
Example 11.3 (Interacting two tank system). As an example, consider the two inter-
acting tank system shown in Figure 11.2.
The model describing this system is given by
VR₁ dc₁/dt = −q_e c₁ + q_e c₂
VR₂ dc₂/dt = q_e c₁ − q_e c₂,
where VR1 and VR2 are the volumes of the tanks and qe is the exchange flow rate. The
model can be written in matrix-vector form:
dc/dt = Ac, c(t = 0) = c₀ = ( c₁₀, c₂₀ )ᵀ, (11.21)

where

A = ( −α  α
       β −β );  α = q_e/VR₁;  β = q_e/VR₂. (11.22)
The matrix A is not symmetric w. r. t. the usual inner product unless α = β or VR1 = VR2 .
Standard solution
The eigenvalues of matrix A are

λ₁ = 0, λ₂ = −(α + β), (11.23)

with corresponding eigenvectors

x₁ = ( 1, 1 )ᵀ; x₂ = ( −α, β )ᵀ (11.24)

and eigenrows (left eigenvectors)

y₁ᵀ = ( β, α ); y₂ᵀ = ( −1, 1 ). (11.25)

The standard eigenvector–eigenrow expansion gives

c(t) = ∑_{j=1}^{2} (yⱼᵀc₀/yⱼᵀxⱼ) e^{λⱼt} xⱼ (11.26)

= ((βc₁₀ + αc₂₀)/(β + α)) ( 1, 1 )ᵀ + ((c₂₀ − c₁₀)/(β + α)) e^{−(α+β)t} ( −α, β )ᵀ. (11.27)
11.2 Application of weighted inner product to interacting tank systems | 261
Weighted inner product solution
Alternatively, define the weighted inner product

⟨u, v⟩ = vᵀGu, G = ( g₁ 0 ; 0 g₂ ) with gᵢ > 0 (11.28)
= g₁u₁v₁ + g₂u₂v₂. (11.29)

Then A is symmetric if

GA = (GA)ᵀ.

But

GA = ( g₁ 0 ; 0 g₂ )( −α α ; β −β ) = ( −αg₁  αg₁
                                         βg₂ −βg₂ ),

which is symmetric if

αg₁ = βg₂.

Thus, if we take

g₁ = β and g₂ = α, (11.30)

then

G = ( β 0 ; 0 α ) and ⟨u, v⟩ = vᵀGu = βu₁v₁ + αu₂v₂. (11.31)

Note that

⟨x₁, x₂⟩ = ⟨( 1, 1 )ᵀ, ( −α, β )ᵀ⟩ = β·1·(−α) + α·1·β = 0.
Similarly,

⟨x₁, x₁⟩ = β + α and ⟨x₂, x₂⟩ = βα² + αβ² = αβ(α + β).

Therefore,

c(t) = ∑_{j=1}^{2} (⟨c₀, xⱼ⟩/⟨xⱼ, xⱼ⟩) e^{λⱼt} xⱼ
= (⟨c₀, x₁⟩/⟨x₁, x₁⟩) x₁ + (⟨c₀, x₂⟩/⟨x₂, x₂⟩) e^{−(α+β)t} x₂
= ((βc₁₀ + αc₂₀)/(β + α)) ( 1, 1 )ᵀ + ((c₂₀ − c₁₀)/(β + α)) e^{−(α+β)t} ( −α, β )ᵀ. (11.32)
Equation (11.32) is the same solution as in equation (11.27), but we do not need to compute the eigenrows. Note that the vectors x₁ = (1, 1)ᵀ and x₂ = (−α, β)ᵀ are orthogonal w.r.t. the inner product defined in equation (11.31). The solution for α = 1 and β = 3 is shown in Figure 11.3 for the initial condition c₀ = (1, 0)ᵀ.
Figure 11.3: Solution diagram of interacting two tank system for α = 1, β = 3, c10 = 1 and c20 = 0.
It can be seen from the plots that at steady state (or after a long time, t → ∞), the solution lies in the space spanned by the eigenvector x₁ corresponding to the zero eigenvalue (as expected).
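The weighted-inner-product solution (11.32) is easily coded. The following NumPy sketch (an added illustration) builds c(t) from the eigenvectors and the weighted inner product only (no eigenrows), then confirms that it satisfies the ODE (11.21), the initial condition, and the expected steady state:

```python
import numpy as np

alpha, beta = 1.0, 3.0
A = np.array([[-alpha, alpha],
              [beta, -beta]])
c0 = np.array([1.0, 0.0])
G = np.diag([beta, alpha])            # weighted inner product <u, v> = v^T G u

x = [np.array([1.0, 1.0]), np.array([-alpha, beta])]   # eigenvectors (11.24)
lam = [0.0, -(alpha + beta)]                           # eigenvalues (11.23)

def c(t):
    # eigen-expansion (11.32): only the weighted inner product is needed
    return sum((x[j] @ G @ c0) / (x[j] @ G @ x[j]) * np.exp(lam[j] * t) * x[j]
               for j in range(2))

assert np.allclose(c(0.0), c0)        # initial condition
h = 1e-6
for t in (0.0, 0.3, 2.0):             # dc/dt = A c, checked by central differences
    assert np.allclose((c(t + h) - c(t - h)) / (2 * h), A @ c(t), atol=1e-5)
# steady state lies along x1, the zero-eigenvalue eigenvector:
assert np.allclose(c(50.0), 0.75 * x[0])
```

The steady-state value (βc₁₀ + αc₂₀)/(α + β) = 3/4 matches the long-time plateau in Figure 11.3.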
Consider n species A₁, A₂, . . . , Aₙ undergoing first-order (monomolecular) interconversions; for n = 3 the mass balances are

d[A₁]/dt = −(k₂₁ + k₃₁)[A₁] + k₁₂[A₂] + k₁₃[A₃]
d[A₂]/dt = k₂₁[A₁] − (k₁₂ + k₃₂)[A₂] + k₂₃[A₃]
d[A₃]/dt = k₃₁[A₁] + k₃₂[A₂] − (k₁₃ + k₂₃)[A₃]

@ t = 0, [Aᵢ] = [Aᵢ₀] (initial concentration of Aᵢ).
Let

xᵢ = [Aᵢ] / ∑_{i=1}^{3} [Aᵢ] = mole fraction of Aᵢ;
then we have

dx/dt = Kx, x(t = 0) = x₀, (11.33)

where K is the matrix of rate constants: the off-diagonal entry kᵢⱼ (i ≠ j) is the rate constant for the reaction Aⱼ → Aᵢ, and the diagonal entries are

kᵢᵢ = −∑_{j≠i} kⱼᵢ.
Note that each column of K sums to zero. Thus, rank of K < n, and there is a nonzero
solution of Kx = 0. This is the equilibrium solution, denoted by x∗ . We assume that
rank K = n − 1 so that there is no other equilibrium point. Clearly, the matrix K is
nonsymmetric with respect to the usual inner product. A feature crucial to the follow-
ing analysis is the principle of microscopic reversibility or principle of detailed balanc-
ing, which states that at equilibrium the rate of every reaction and its reverse must be
equal, i.e.,

kᵢⱼ[Aⱼ∗] = kⱼᵢ[Aᵢ∗], (11.34)

where [Aⱼ∗] is the equilibrium concentration of species Aⱼ. Thus, we have (in terms of mole fractions)

kᵢⱼxⱼ∗ = kⱼᵢxᵢ∗ ⇒ kᵢⱼ/xᵢ∗ = kⱼᵢ/xⱼ∗.
We now define the weighted inner product

⟨u, v⟩ = ∑_{i=1}^{n} uᵢvᵢ/xᵢ∗ (11.35)

and show that the matrix K is symmetric w.r.t. this inner product. The i-th element of the vector Ku is given by

(Ku)ᵢ = ∑_{j=1}^{n} kᵢⱼuⱼ.
Thus,

⟨Ku, v⟩ = ∑_{i=1}^{n} (∑_{j=1}^{n} kᵢⱼuⱼ) vᵢ/xᵢ∗ = ∑_{i=1}^{n} ∑_{j=1}^{n} kᵢⱼuⱼvᵢ/xᵢ∗,

but

kᵢⱼ/xᵢ∗ = kⱼᵢ/xⱼ∗.
11.3 Application of weighted inner product to monomolecular kinetics | 265
∴

⟨Ku, v⟩ = ∑_{j=1}^{n} ∑_{i=1}^{n} kⱼᵢuⱼvᵢ/xⱼ∗.

Since the indices i and j are dummy, we may change i → j and j → i without changing the sum. Thus,

⟨Ku, v⟩ = ∑_{i=1}^{n} ∑_{j=1}^{n} kᵢⱼuᵢvⱼ/xᵢ∗ = ∑_{i=1}^{n} (∑_{j=1}^{n} kᵢⱼvⱼ) uᵢ/xᵢ∗ = ⟨u, Kv⟩.
Therefore, K is self-adjoint w.r.t. this inner product, which implies that all its eigenvalues are real and it has a complete set of n eigenvectors. Let z₁ (= x∗), z₂, z₃, . . . , zₙ be the eigenvectors and λ₁ (= 0), λ₂, . . . , λₙ be the eigenvalues. Now, let us show that K is nonpositive, i.e., all the nonzero eigenvalues are strictly negative. We consider the quadratic form
⟨Ku, u⟩ = ∑_{i=1}^{n} (uᵢ/xᵢ∗) ∑_{j=1}^{n} kᵢⱼuⱼ = ∑_{i=1}^{n} ∑_{j=1}^{n} kᵢⱼuᵢuⱼ/xᵢ∗
= ∑_{i=1}^{n} ( ∑_{j≠i} kᵢⱼuᵢuⱼ/xᵢ∗ + kᵢᵢuᵢ²/xᵢ∗ )
= ∑_{i=1}^{n} ( ∑_{j≠i} kᵢⱼuᵢuⱼ/xᵢ∗ − ∑_{j≠i} kⱼᵢuᵢ²/xᵢ∗ ).

Using kᵢⱼ/xᵢ∗ = kⱼᵢ/xⱼ∗, this may be written as

⟨u, Ku⟩ = ∑_{i≠j} √(kⱼᵢ/xⱼ∗) √(kᵢⱼ/xᵢ∗) uⱼuᵢ − ∑_{i≠j} ( √(kᵢⱼ/xⱼ∗) uⱼ )²

⇒

2⟨Ku, u⟩ = −∑_{i≠j} ( √(kᵢⱼ/xⱼ∗) uⱼ − √(kⱼᵢ/xᵢ∗) uᵢ )² ≤ 0. (11.36)
Thus, the solution of equation (11.33) may be written as

x(t) = ∑_{j=1}^{n} ⟨x₀, zⱼ⟩ e^{λⱼt} zⱼ = ⟨x₀, x∗⟩x∗ + ∑_{j=2}^{n} ⟨x₀, zⱼ⟩ e^{λⱼt} zⱼ, (11.37)

where zⱼ are an orthonormal set of eigenvectors and the inner product is defined by equation (11.35). To obtain this form of the solution, take the inner product of (11.33) with zⱼ

⇒

(d/dt)⟨x, zⱼ⟩ = ⟨Kx, zⱼ⟩ = ⟨x, Kzⱼ⟩ = ⟨x, λⱼzⱼ⟩ = λⱼ⟨x, zⱼ⟩

⇒

⟨x, zⱼ⟩ = ⟨x₀, zⱼ⟩ e^{λⱼt}.
The first term is the equilibrium solution, while the nonzero eigenvalues (λj , j =
2, . . . , n) determine the time scales associated with the transient process.
The Wei–Prater scheme is an experimental method for determining the eigenvalues and eigenvectors. From these experimental values, we can determine the rate constant matrix K.
Let kⱼ = (k₁ⱼ, . . . , kₙⱼ)ᵀ = j-th column of K. Expanding kⱼ in the orthonormal eigenvectors,

kⱼ = ∑_{r=1}^{n} ⟨kⱼ, z_r⟩ z_r.
Now consider

⟨kⱼ, z_r⟩ = ∑_{i=1}^{n} kᵢⱼzᵢᵣ/xᵢ∗ = ∑_{i=1}^{n} (kⱼᵢ/xⱼ∗) zᵢᵣ = (1/xⱼ∗) ∑_{i=1}^{n} kⱼᵢzᵢᵣ.

Since

Kz_r = λ_r z_r ⇒ ∑_{i=1}^{n} kⱼᵢzᵢᵣ = λ_r zⱼᵣ,

∴

⟨kⱼ, z_r⟩ = (1/xⱼ∗) λ_r zⱼᵣ

⇒

kⱼ = ∑_{r=1}^{n} (λ_r zⱼᵣ/xⱼ∗) z_r = (1/xⱼ∗) ∑_{r=2}^{n} λ_r zⱼᵣ z_r (since λ₁ = 0),

or, componentwise,

kᵢⱼ = (1/xⱼ∗) ∑_{r=2}^{n} λ_r zⱼᵣ zᵢᵣ. (11.39)
All quantities on RHS are experimentally determinable. Thus, we can determine all
the rate constants from the eigenvectors and eigenvalues. An example in which this
procedure was used by Wei and Prater [31] is the catalytic isomerization of butenes.
The reaction network along with the relative values of the rate constants is shown in
Figure 11.5.
Figure 11.5: Reaction network for monomolecular kinetics for catalytic isomerization of butenes.
In this case, the eigenvalues and the orthonormal set of eigenvectors are given by

λ₁ = 0, z₁ = x∗ = ( 0.1436, 0.3213, 0.5351 )ᵀ,
λ₂ = −9.2602, z₂ = ( 0.1903, 0.3050, −0.4953 )ᵀ,
λ₃ = −19.418, z₃ = ( 0.2946, −0.3536, 0.0590 )ᵀ.

(Orthogonality of z₂ and z₃ to z₁ = x∗ w.r.t. the inner product (11.35) requires that the components of z₂ and z₃ each sum to zero, which fixes the signs.)
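Equation (11.39) can be checked against these data. In the following NumPy sketch (an added illustration; the loose tolerances reflect the four-digit experimental data), K is reconstructed from the measured eigenvalues and eigenvectors and tested for the structural properties derived above:

```python
import numpy as np

# Equilibrium composition (eigenvector for lambda_1 = 0) and measured eigen-data
xstar = np.array([0.1436, 0.3213, 0.5351])
lam = [-9.2602, -19.418]
z = [np.array([0.1903, 0.3050, -0.4953]),     # z2
     np.array([0.2946, -0.3536, 0.0590])]     # z3

# Equation (11.39): k_ij = (1 / x*_j) * sum_r lambda_r z_jr z_ir
K = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        K[i, j] = sum(l * zr[j] * zr[i] for l, zr in zip(lam, z)) / xstar[j]

assert np.allclose(K.sum(axis=0), 0.0, atol=1e-2)   # columns sum to zero
assert np.allclose(K @ xstar, 0.0, atol=1e-2)       # x* is the equilibrium point
# detailed balance (11.34): k_ij x*_j = k_ji x*_i
for i in range(3):
    for j in range(3):
        assert np.isclose(K[i, j] * xstar[j], K[j, i] * xstar[i], atol=1e-2)
```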
The experimentally observed reaction paths (which include straight line reaction
paths) are shown in Figure 11.6.
Figure 11.6: Experimentally observed reaction paths for catalytic isomerization of butenes, obtained
from Wei and Prater [31].
Other applications of weighted inner product to stage operations, kinetics and weight-
ed least squares are given in the exercises. Additional applications may also be found
in the book by Ramkrishna and Amundson [25].
Problems
1. Consider the vector space of 2-tuples of real numbers over the real field, i. e., ℝ2 .
(a) Find an inner product on ℝ² w.r.t. which the matrix A = ( −1 1 ; 4 −4 ) is self-adjoint (symmetric).
(b) Determine the eigenvalues and eigenvectors of A and verify that the eigenvec-
tors are orthogonal w. r. t. the inner product defined in (a).
(c) Determine the normalized eigenvectors and show a schematic plot of these
eigenvectors.
(d) Use the above results to obtain the solution of initial value problem
du/dt = Au; u(t = 0) = ( 1, 0 )ᵀ.
−γ α 0
A=( β −γ α ),
0 β −γ
dxⱼ/dt = αxⱼ₋₁ − (α + β)xⱼ + βxⱼ₊₁, j = 1, 2, 3, . . . , N,

where α = L/h, β = GK/h, L is the liquid (heavy-phase) flow rate, h is the holdup (of the heavy phase), G is the gas (light-phase) flow rate and xⱼ is the composition of the transferable component in the liquid stream leaving stage j. State any other assumptions involved.
(b) Assuming that the compositions x₀(t) and x_{N+1}(t) = y_{N+1}/K of the entering streams are known, show that the model may be written in the form

dx/dt = Ax + b(t).
(c) Show that A is symmetric w.r.t. the weighted inner product

⟨x, y⟩ = ∑_{i=1}^{N} (β/α)^{i−1} xᵢyᵢ.
(d) Show that

⟨Ax, x⟩ = −∑_{i=0}^{N} α (β/α)^{i−1} [xᵢ − (β/α)xᵢ₊₁]² < 0 for x ≠ 0.

Thus, A is negative definite and all its eigenvalues are strictly negative.
(e) Compute the steady-state values for y1 and x3 when N = 3 (three stage process)
and other parameters are given as follows:
L = 5, G = 3, h = 1, K = 1, x0 = 0, y4 = 0.5
Compute and plot the transient response when there is a step change in y4
from 0.5 to 0.7.
4. Consider a well-stirred batch reactor in which the following consecutive reversible
first-order reactions occur:
A₁ ⇌ A₂ ⇌ ⋅ ⋅ ⋅ ⇌ Aₙ
(a) Denote the concentration vector by c = (c₁, c₂, . . . , cₙ)ᵀ and identify the matrix K, which yields the batch reactor equation

dc/dt = Kc.
(b) Find the inner product on ℝn with respect to which K becomes self-adjoint.
Is K negative or nonpositive? Obtain the solution to the differential equation
above subject to the initial condition c(0) = c0 .
(c) Solve the transient continuous stirred tank reactor equation
dc/dt = (1/τ)(c_f − c) + Kc, c(0) = c₀,
c_{k+1} = Ac_k + b_k; c₀ = α,
can be written in the matrix form given in (a). Obtain an explicit solution to
this equation.
6. Consider an extraction process in which G kg/s of a light phase containing Y0 kg
solute A per kg of carrier is fed to a cascade containing N ideal stages and con-
tacted with L kg/s of solute-free heavy phase containing XN+1 kg solute A per kg of
carrier.
(a) Formulate the relevant equations.
(b) If the concentrations of A in the exit light and heavy streams are Y_N and X₁, respectively, show that the concentration of A in the light phase leaving stage i is given by

Yᵢ = ((Y₀ − KX₁)/(1 − KG/L)) (KG/L)ⁱ + (KX_{N+1} − (KG/L)Y_N)/(1 − KG/L).
A ⇌ B → C
and all the reactions are first order. There is no B or C in the feed but it may be
assumed that the feed contains a catalyst that initiates the reaction as soon as the
feed enters the first reactor.
8. Consider the three interacting tanks arranged in series as shown in Figure 11.7
below. The volume of tanks VR1 , VR2 and VR3 may not be identical. Similarly, the
exchange flow rate q1 between the tanks 1 and 2 may not be same as q2 between
tanks 2 and 3.
(a) Considering the transient process, formulate the model for concentrations ci
in each tank in the form:
dc
VR = Qc
dt
and show that (i) VR is the diagonal positive definite matrix, and (ii) Q is a
symmetric matrix with zero row and column sum (i. e., has a zero eigenvalue).
(b) Define the matrix A = V_R^{−1} Q and show that (i) A is not symmetric w. r. t. the usual inner product

⟨x, y⟩ = y^T x = Σ_{i=1}^{3} x_i y_i ,

but (ii) A is self-adjoint w. r. t. the weighted inner product

⟨x, y⟩ = y^T V_R x.
y = α0 + α1 x.
Suppose that data point j is given a weight wj (> 0). Formulate the weighted
least squares problem and determine the normal equations to be solved for α0
and α1 .
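The weighted normal equations can be verified numerically. A minimal sketch in Python (NumPy assumed; the data and weights below are made up purely for illustration):

```python
import numpy as np

# Weighted least squares for y = a0 + a1*x: minimize sum_j w_j (y_j - a0 - a1*x_j)^2.
# Setting the two partial derivatives to zero gives the normal equations:
#   a0*sum(w)   + a1*sum(w*x)   = sum(w*y)
#   a0*sum(w*x) + a1*sum(w*x^2) = sum(w*x*y)
def weighted_fit(x, y, w):
    A = np.array([[np.sum(w), np.sum(w * x)],
                  [np.sum(w * x), np.sum(w * x**2)]])
    rhs = np.array([np.sum(w * y), np.sum(w * x * y)])
    return np.linalg.solve(A, rhs)   # (a0, a1)

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 0.5 * x                    # exact line, so any positive weights recover it
w = np.array([1.0, 2.0, 1.0, 4.0])
a0, a1 = weighted_fit(x, y, w)
```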
Part III: Linear ordinary differential equations - initial value problems, complex variables and Laplace transform
12 The linear initial value problem
In earlier chapters, we have discussed the solution of linear initial value problems in
which a square matrix of constant coefficients appeared. In this chapter, we consider
the case of more general form of the linear initial value problem and discuss the rele-
vant theory along with applications.
12.1 The vector initial value problem

Consider the initial value problem

du/dt = A(t)u + b(t),   0 < t < a   (12.1)

u(@ t = 0) = u0   (12.2)

where

u = (u1 (t), u2 (t), . . . , un (t))^T
is an n-tuple of real (or complex) valued functions ui ∈ C 1 [0, a], a is a positive constant
and u0 is a constant vector determining the initial state of the system. The forcing
vector b(t) is also an n-tuple of real (or complex) valued function of t. The following
fundamental existence and uniqueness theorem may be stated for the initial value
problem defined by equations (12.1) and (12.2).
Theorem. Consider the IVP defined by equations (12.1) and (12.2), suppose that the elements aij (t) of A(t) and the components bi (t) of b(t) are continuous on [0, a], and let u0 be any vector in ℝn. Then there exists one and only one
solution u(t), 0 < t < a, satisfying equations (12.1) and (12.2). This solution is a continu-
ous function of the initial conditions u0 and if aij (t) depend continuously on a parameter,
so does the solution. [For proof of this theorem, see Coddington and Levinson [13].]
We now outline a method for obtaining the solution of the initial value problem defined by equations (12.1) and (12.2). As discussed previously, since equation (12.1) is linear, the principle of superposition may be used and the general solution may be written as

u(t) = uh (t) + up (t),   (12.3)

where the homogeneous solution uh and the particular solution up satisfy

duh/dt = A(t)uh   (12.4)

uh (@ t = 0) = u0   (12.5)

dup/dt = A(t)up + b(t)   (12.6)

up (@ t = 0) = 0.   (12.7)
We now consider the homogeneous IVP defined by equation (12.4). The following
properties may be easily established:
1. The set of all solutions to equation (12.4) form a vector space.
2. There are n linearly independent solutions and every solution of equation (12.4)
is expressible as a linear combination of these n solutions.
3. Let {α1 , α2 , . . . , αn } be any set of n linearly independent constant vectors in ℝn /ℂn
and uj (t) be a solution of (12.4) satisfying the initial condition uj (0) = αj . Then the
set {u1 (t), u2 (t), . . . , un (t)} is linearly independent and forms a basis for the solu-
tion space.
If U(t) is a fundamental matrix for equation (12.4), then every solution is of the form

u(t) = U(t)c,

where c is a constant vector. A fundamental matrix has the following properties:
(a) It satisfies the matrix differential equation

dU/dt = A(t)U(t)   (12.9)

(b) det U(t) = det U(0) exp{∫_0^t tr A(s) ds}, and U(t) is nonsingular at every point in [0, a]. [Here, tr A stands for the trace of the matrix A.]
(c) If B is any constant nonsingular matrix, then V(t) = U(t)B is also a fundamental matrix and every fundamental matrix may be written in this form.

Note that the specific solution of equations (12.4) and (12.5) is given by uh (t) = U(t)U(0)^{−1} u0 . The unique solution of equations (12.1) and (12.2) is given by

u(t) = U(t)U(0)^{−1} u0 + U(t) ∫_0^t U(s)^{−1} b(s) ds.   (12.15)
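Equation (12.15) is easy to check numerically for a constant-coefficient case, where a fundamental matrix is U(t) = e^{At} (so U(0) = I and U(s)^{−1} = e^{−As}). The sketch below uses illustrative values of A, b and u0 (not from the text; SciPy assumed) and compares the formula against direct integration:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[-2.0, 1.0], [1.0, -3.0]])
b = np.array([1.0, 0.5])
u0 = np.array([1.0, -1.0])
t = 1.5

# midpoint rule for the integral of U(s)^{-1} b = e^{-As} b over [0, t]
N = 4000
ds = t / N
s_mid = (np.arange(N) + 0.5) * ds
integral = sum(expm(-A * s) @ b for s in s_mid) * ds

# u(t) = U(t)U(0)^{-1}u0 + U(t) * integral, with U(0) = I
u_formula = expm(A * t) @ (u0 + integral)

# cross-check by direct numerical integration of du/dt = Au + b
sol = solve_ivp(lambda tt, u: A @ u + b, (0.0, t), u0, rtol=1e-10, atol=1e-12)
u_num = sol.y[:, -1]
```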
12.2 The n-th order initial value problem

Consider the n-th order linear differential operator

Lu = p0 (t) d^n u/dt^n + p1 (t) d^{n−1} u/dt^{n−1} + ⋅ ⋅ ⋅ + p_{n−1} (t) du/dt + pn (t)u,   0 < t < a,   (12.16)

where pi (t), i = 0, 1, . . . , n are real (or complex) valued functions of a real variable t. We assume that L is a regular differential operator, i. e., p0 (t) ≠ 0 for 0 ≤ t ≤ a and pi (t) ∈ C[0, a]. Suppose that u(t) ∈ C^n [0, a], the class of n-times differentiable functions defined on the interval [0, a]. Then the most general form of a linear n-th order initial value problem is given by

Lu = f (t),   0 < t < a   (12.17)

u(0) = α0 ,   u′ (0) = α1 ,   . . . ,   u^{[n−1]} (0) = α_{n−1} .   (12.18)
Defining

u1 (t) = u(t),   u2 (t) = du1/dt = du/dt,   . . . ,   un (t) = du_{n−1}/dt = d^{n−1} u/dt^{n−1} ,   (12.19)

equations (12.17) and (12.18) may be written in the vector form of equations (12.1) and (12.2) with u0 = α and

u = (u(t), u′ (t), . . . , u^{[n−1]} (t))^T ,

A(t) = [ 0 1 0 0 . . . 0 ; 0 0 1 0 . . . 0 ; 0 0 0 1 . . . 0 ; . . . ; −pn (t)/p0 (t)  −p_{n−1} (t)/p0 (t)  . . .  −p1 (t)/p0 (t) ],

b(t) = [f (t)/p0 (t)] (0, 0, . . . , 0, 1)^T = [f (t)/p0 (t)] e_n .   (12.20)
Thus, the n-th order IVP is a special case of the vector initial value problem. In fact,
every linear IVP (e. g., coupled higher-order scalar equations) may be written in the
form given by equation (12.1).
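The reduction to companion form can be sketched in code for constant coefficients; the helper name `companion` below is ours, not the book's:

```python
import numpy as np

# Companion matrix of equation (12.20) for p0*u^(n) + p1*u^(n-1) + ... + pn*u = 0,
# with p = [p0, p1, ..., pn] constant.
def companion(p):
    p = np.asarray(p, dtype=float)
    n = len(p) - 1
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)       # ones on the super-diagonal
    A[-1, :] = -p[:0:-1] / p[0]      # last row: -pn/p0, ..., -p1/p0
    return A

A = companion([1.0, 0.0, 1.0])       # u'' + u = 0
eigs = np.linalg.eigvals(A)          # eigenvalues of the companion matrix: +/- i
```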
The solutions of Lu = 0 are precisely those vectors (functions) whose image under
L is zero, i. e., the kernel of L. It is easily shown that ker L is of dimension n and a basis
for ker L consists of n linearly independent solutions.
Definition. Any vector whose components form a basis for ker L is called a fundamen-
tal vector for Lu = 0. If ψT = [ψ1 (t) ψ2 (t) . . . ψn (t)] is a fundamental vector, the
general solution of Lu = 0 is of the form u = ψT c, where c is a constant vector.
The fundamental matrix of the companion vector equation is called the Wronskian matrix of equation (12.17). The Wronskian matrix is

K(ψ(t)) = [ ψ1 (t) . . . ψn (t) ; ψ1′ (t) . . . ψn′ (t) ; . . . ; ψ1^{[n−1]} (t) . . . ψn^{[n−1]} (t) ].

Similarly, for a scalar function ψ(t), we define the Wronskian vector

k(ψ(t)) = (ψ(t), ψ′ (t), . . . , ψ^{[n−1]} (t))^T .
Let

uj = k(ψj (t)) = (ψj , dψj /dt, . . . , d^{n−1} ψj /dt^{n−1} )^T ,   j = 1, 2, . . . , n.
Proof. Suppose that ψ1 , ψ2 , . . . , ψn are linearly independent, i. e., the only solution of
Since the only solution to the above system of n equations is the trivial one, we have
Theorem. The Wronskian W(t) is either identically zero or is never zero, i. e., it never
changes sign.
Proof. By definition,

W(t) = det [ ψ1 (t) ψ2 (t) . . . ψn (t) ; ψ1′ (t) ψ2′ (t) . . . ψn′ (t) ; ψ1′′ (t) ψ2′′ (t) . . . ψn′′ (t) ; . . . ; ψ1^{[n−1]} (t) ψ2^{[n−1]} (t) . . . ψn^{[n−1]} (t) ].

Differentiating the determinant row by row gives a sum of n determinants, in the j-th of which only the j-th row is differentiated. In each of the first (n − 1) terms, the differentiated row coincides with the row immediately below it, so those determinants vanish:

dW/dt = 0 + 0 + ⋅ ⋅ ⋅ + 0 + det [ ψ1 (t) . . . ψn (t) ; ψ1′ (t) . . . ψn′ (t) ; . . . ; ψ1^{[n−2]} (t) . . . ψn^{[n−2]} (t) ; ψ1^{[n]} (t) . . . ψn^{[n]} (t) ].
Now,

Lψj = 0

⇒

ψj^{[n]} = −(p1 /p0 ) ψj^{[n−1]} − (p2 /p0 ) ψj^{[n−2]} − ⋅ ⋅ ⋅ − (p_{n−1} /p0 ) ψj′ − (pn /p0 ) ψj .
Substitute this in the above determinant and perform the following (n − 1) row opera-
tions:
(i) Multiply row 1 by pn /p0 and add to the last row.
(ii) Multiply row 2 by p_{n−1} /p0 and add to the last row.
⋮
After these (n − 1) operations, the last row becomes −(p1 /p0 )(ψ1^{[n−1]} , . . . , ψn^{[n−1]} ), so that

dW/dt = −(p1 /p0 ) W(t)

⇒

W(t) = W(t0 ) exp{− ∫_{t0}^t [p1 (s)/p0 (s)] ds}.
Consider now the inhomogeneous equation

Lu = f (t),   (12.22)

which, in companion form, reads

du/dt = A(t)u + b(t).   (12.23)
Theorem. Let L be regular and f (t) ∈ C[t0 , b], t0 < b < ∞. Then the solution of the IVP
Lu = f (t)
k(u(t0 )) = α
is given by
u(t) = [ψ(t)]^T c + [ψ(t)]^T ∫_{t0}^t [K(ψ(s))]^{−1} e_n [f (s)/p0 (s)] ds   (12.24)

where

e_n = (0, 0, . . . , 0, 1)^T ,   ψ(t) = (ψ1 (t), ψ2 (t), . . . , ψn (t))^T
is a fundamental vector of the homogeneous system and K(ψ(t)) is the Wronskian matrix.
The constant vector c is determined from the algebraic equations
α0 = [ψ(t0 )]^T c
α1 = [ψ′ (t0 )]^T c
⋮
α_{n−1} = [ψ^{[n−1]} (t0 )]^T c

or

c = [K(ψ(t0 ))]^{−1} α.
Proof. The proof of equation (12.24) follows directly from that for the vector equation du/dt = A(t)u + b(t), whose solution is given by equation (12.15). Now, let

b(s) = (0, 0, . . . , 0, f (s)/p0 (s))^T = e_n f (s)/p0 (s)

and use the fact that the fundamental matrix for the scalar problem is the Wronskian matrix, i. e.,

U(s) = K(ψ(s)).
For the case n = 2, equation (12.24) reduces to

u(t) = c1 ψ1 (t) + c2 ψ2 (t) + ∫_{t0}^t {[ψ1 (s)ψ2 (t) − ψ1 (t)ψ2 (s)]/W(s)} [f (s)/p0 (s)] ds   (12.25)
Example. Consider the equation

u′′ + u = 2 sin t.

The two linearly independent solutions of the homogeneous equation are given by

ψ1 = sin t,   ψ2 = cos t

so that

K(ψ) = [ sin t  cos t ; cos t  − sin t ],   W(t) = −1 ≠ 0.

Taking t0 = 0, equation (12.25) gives the particular solution

up (t) = ∫_0^t [sin s cos t − sin t cos s] (2 sin s)/(−1) ds
      = 2 sin t ∫_0^t sin s cos s ds − 2 cos t ∫_0^t sin^2 s ds
      = sin^3 t − cos t (t − (1/2) sin 2t)
      = sin t − t cos t
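The claimed particular solution is easy to verify numerically:

```python
import numpy as np

# Check that u_p(t) = sin t - t*cos t satisfies u'' + u = 2 sin t.
# By hand: u_p' = t*sin t, u_p'' = sin t + t*cos t, so u_p'' + u_p = 2 sin t.
t = np.linspace(0.0, 10.0, 201)
u = np.sin(t) - t * np.cos(t)
upp = np.sin(t) + t * np.cos(t)        # second derivative, computed by hand above
residual = upp + u - 2.0 * np.sin(t)   # should vanish identically
```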
(a) The first special case is that of constant coefficients and constant forcing:

du/dt = Au + b   (12.26)

u = u0 @ t = 0   (12.27)

Here, a fundamental matrix is U(t) = e^{At} with U(0) = I, so equation (12.15) gives

u(t) = e^{At} u0 + e^{At} ∫_0^t e^{−As} b ds = e^{At} u0 + e^{At} A^{−1} b − A^{−1} b.

Another method: The solution to equation (12.26) may also be obtained by determining the steady-state solution and subtracting it to obtain a homogeneous equation. Let z = u − us , us = −A^{−1} b ⇒

dz/dt = Az,   z = z0 = u0 − us @ t = 0

z = e^{At} z0 ⇒ u = −A^{−1} b + e^{At} (u0 + A^{−1} b).

Thus, for the case of constant coefficients and constant forcing vector, the solution may be written in either of the two forms

u = e^{At} u0 + e^{At} A^{−1} b − A^{−1} b = −A^{−1} b + e^{At} (u0 + A^{−1} b).

[Remark: Since A^{−1} and e^{At} commute, the two forms of the solution are identical.]
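The two equivalent forms of the constant-coefficient solution can be checked with the matrix exponential (illustrative A, b, u0, not from the text; SciPy assumed):

```python
import numpy as np
from scipy.linalg import expm

# u(t) = e^{At}u0 + e^{At}A^{-1}b - A^{-1}b  =  -A^{-1}b + e^{At}(u0 + A^{-1}b)
A = np.array([[-1.0, 0.5], [0.0, -2.0]])   # stable: eigenvalues -1 and -2
b = np.array([1.0, 2.0])
u0 = np.array([3.0, -1.0])
Ainv_b = np.linalg.solve(A, b)

def u(t):
    E = expm(A * t)
    return E @ u0 + E @ Ainv_b - Ainv_b

u_alt = -Ainv_b + expm(A * 1.0) @ (u0 + Ainv_b)   # second form, at t = 1
u_ss = -Ainv_b                                    # steady state as t -> infinity
```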
(b) The second special case is that of the scalar n-th order IVP with constant coefficients. Let

u = e^{λt} .

Substituting into Lu = 0 gives the characteristic equation

p0 λ^n + p1 λ^{n−1} + ⋅ ⋅ ⋅ + p_{n−1} λ + pn = 0

with roots

λ1 , λ2 , . . . , λn ,

and for simplicity assume that they are all distinct. Then

ψj (t) = e^{λj t} ,   j = 1, 2, 3, . . . , n

and

uh = Σ_{j=1}^n cj e^{λj t} ,

while the particular solution is

up (t) = ψ(t)^T ∫_0^t [K(ψ(s))]^{−1} e_n [f (s)/p0 (s)] ds
where
K(ψ(t)) = [ e^{λ1 t}  e^{λ2 t}  . . .  e^{λn t} ; λ1 e^{λ1 t}  λ2 e^{λ2 t}  . . .  λn e^{λn t} ; . . . ; λ1^{n−1} e^{λ1 t}  λ2^{n−1} e^{λ2 t}  . . .  λn^{n−1} e^{λn t} ].
Evaluation of K(ψ(t)) at t = 0 leads to the Vandermonde matrix, whose inverse can be expressed analytically for the case of distinct eigenvalues.
Many examples illustrating the application of the above theory to first- and second-
order scalar as well as vector equations are given as problems below and in Chapter 14.
Problems
1. The vibration of the spring-mass system (with identical springs and masses) may
be described by the equations
(m/k) d^2 u1 /dt^2 = −2u1 + u2
(m/k) d^2 u2 /dt^2 = u1 − 2u2
where u1 and u2 are the displacements of the masses from their equilibrium po-
sitions. (a) Cast the above equations in dimensionless form, (b) Determine the
natural frequencies of vibration and the modes (eigenvectors) of vibration and (c)
What will be the form of the equations if damping is included? Write the equations
in vector/matrix form.
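For problem 1, the natural frequencies follow from the eigenvalues of the dimensionless coefficient matrix; a sketch of the computation (NumPy assumed):

```python
import numpy as np

# Dimensionless form: d^2 u/d tau^2 = K u, with tau = t*sqrt(k/m) and
# K = [[-2, 1], [1, -2]].  For u ~ exp(i*omega*tau), omega^2 = -lambda(K).
K = np.array([[-2.0, 1.0], [1.0, -2.0]])
lam, modes = np.linalg.eigh(K)     # K is symmetric; lam = [-3, -1]
omegas = np.sqrt(-lam)             # natural frequencies in units of sqrt(k/m)
# in-phase mode (1, 1) oscillates at omega = 1; out-of-phase (1, -1) at sqrt(3)
```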
2. Show that the solution of the inhomogeneous system

du/dt = Au + f(t),   u(t = 0) = u0

is given by

u(t) = e^{At} u0 + ∫_0^t e^{A(t−s)} f(s) ds.
3. Determine a formula similar to that given in problem (2) for the solution of the
inhomogeneous system
d^2 u/dt^2 = −Au + f(t),   u(t = 0) = u0 ,   du/dt (t = 0) = v0 .
4. Consider the flow system shown in Figure 12.1. Assume that each tank is well
mixed and species A enters tank 1 at a concentration of cin (t) and leaves at c1 (t).
Assume further that VR1 = 1 m3 , VR2 = 32 m3 and q1 = q2 = 2 m3 / min. (a) Formulate
the differential equations describing the transient behavior of the system and put
them in vector/matrix form. (b) Determine the response of the system (i. e., how
the exit concentration c1 (t) varies with time) for a unit step input (cin (t) = 1 for
t > 0 and 0 for t < 0). Assume that no A is present initially in either tank.
5. Solve problem (4) by converting the model to a scalar initial value problem for
c2 (t).
6. Consider the linear system
du/dt = [ 0 1 ; −2 3 ] u + [ 1 ; 1 ].
Determine (a) a fundamental matrix for the homogeneous system (b) a particular
solution to the inhomogeneous system (c) the general form of the solution (d) the
solution to the initial value problem u(0) = (2, 1)^T .
7. (a) Determine a fundamental matrix for the linear system
du/dt = [ 1 1 ; 0 1 ] u.
(b) Determine the vector differential equations for which the following are funda-
mental matrices
(i) U(t) = [ e^t  t e^t ; e^t  (1 + t) e^t ]   (ii) U(t) = [ 1 t t^2 ; 0 1 t ; 0 0 1 ]
Lu = u′′ − 6t^{−2} u = 0.
Lu = f (t); k(u(0)) = b,
Here, c, α and a are positive constants. Plot the response for the overdamped
(c = 6, α2 = 5, ω = 1), critically damped (c = 2, α2 = 1, ω = 1) and underdamped
cases (c = α2 = 2, ω = 1). Also, plot the amplitude of the asymptotic response as a
function of the forcing frequency for a fixed α2 (say α2 = 4) and varying values of
c(≥ 0).
13 Linear systems with periodic coefficients
The theory of linear differential equations with periodic coefficients is encountered
in applications such as the analysis of transport phenomena with spatially periodic
transport properties, in the development of time averaged models of periodically
forced systems, in the determination of the stability of periodic solutions of nonlinear
systems, and so forth. In this chapter, we outline the theory briefly.
13.1 Scalar equation with a periodic coefficient

Consider first the scalar equation

du/dt = a(t)u,   u = u0 @ t = 0   (13.1)

with

a(t + T) = a(t).
Separating variables and integrating,

ln u = ∫_0^t a(s) ds + c;   t = 0, u = u0 ⇒ c = ln u0

⇒

ln(u/u0 ) = ∫_0^t a(s) ds.   (13.2)
To investigate the nature of this solution when a(s) is periodic, we consider the differ-
ent time intervals:
(i) 0 < t ≤ T:

u(t)/u0 = exp{∫_0^t a(s) ds}   (13.3)
(ii) T < t ≤ 2T:

u(t)/u0 = exp{∫_0^T a(s) ds + ∫_T^t a(s) ds}.

Let

m = exp{∫_0^T a(s) ds}.

Then we get

u(t)/u0 = m exp{∫_T^t a(s) ds}.

Substituting ŝ = s − T and using the periodicity of a,

u(t)/u0 = m exp{∫_0^{t−T} a(ŝ) dŝ};   T < t < 2T   (13.4)
(iii) If 2T < t < 3T, following the above procedure, it is easily seen that
u(t)/u0 = m^2 exp{∫_0^{t−2T} a(s) ds};   2T < t < 3T, etc.
Thus, we can compute u(t) for all t if we know u(t) for 0 < t < T.
Let

m = e^{ρT} = exp{∫_0^T a(s) ds};   ρ = (1/T) ∫_0^T a(s) ds.

Thus, when a(t) is periodic, the solution of equation (13.1) is of the form

u(t) = u0 p(t) e^{ρt} ,

where

p(t + T) = p(t).

⇒ The solution is periodic if and only if ρ = 0, i. e.,

∫_0^T a(s) ds = 0.
Example. Consider

du/dt = (cos t)u;   u(@ t = 0) = u0 .

Here, a(s) = cos s with period T = 2π, and ρ = (1/2π) ∫_0^{2π} cos s ds = 0, so the solution u(t) = u0 e^{sin t} is periodic (Figure 13.1).
Figure 13.1: Plot of the periodic solution of the scalar linear equation with ρ = 0.
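The example can be reproduced numerically; the check below (SciPy assumed) confirms that u(2π) = u0, i.e., the multiplier m = e^{ρT} = 1:

```python
import numpy as np
from scipy.integrate import solve_ivp

# du/dt = (cos t) u: a(t) has period T = 2*pi and rho = (1/T) int_0^T cos s ds = 0,
# so the exact solution u(t) = u0 * exp(sin t) is periodic.
u0 = 1.0
sol = solve_ivp(lambda t, u: np.cos(t) * u, (0.0, 2.0 * np.pi), [u0],
                t_eval=[0.5 * np.pi, 2.0 * np.pi], rtol=1e-10, atol=1e-12)
u_quarter, u_full = sol.y[0]
# u(pi/2) = e^{sin(pi/2)} = e, and u(2*pi) = u0 (multiplier m = 1)
```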
13.2 Vector equation with periodic coefficient matrix

Consider now the vector equation

du/dt = A(t)u   (13.6)

with

A(t + T) = A(t).   (13.7)

Theorem (Floquet). If U(t) is a fundamental matrix of equations (13.6) and (13.7), so is V(t), where

V(t) = U(t + T).

Corresponding to every such U(t), ∃ a nonsingular matrix P(t), which is periodic with period T, and a constant n × n matrix C such that

U(t) = P(t)e^{tC} .
Proof. Since

dU/dt = A(t)U ⇒ V′ (t) = U′ (t + T) = A(t + T)U(t + T) = A(t)V(t)

and

det V(t) = det{U(t + T)} ≠ 0,

V(t) is a fundamental matrix. Hence, there exists a constant nonsingular matrix M such that

U(t + T) = U(t)M.

Define C by

M = e^{TC}   (i. e., ln M = TC or C = (1/T) ln M).

Then

U(t + T) = U(t)e^{TC} = U(t)e^{−tC} e^{(t+T)C} = P(t)e^{(t+T)C} ,

where

P(t) = U(t)e^{−tC} .

Finally,

P(t + T) = U(t + T)e^{−(t+T)C} = P(t)e^{(t+T)C} e^{−(t+T)C} = P(t).

∴ The result.
The significance of the above theorem is that the determination of the fundamen-
tal matrix U(t) over a finite interval of length T (e. g., 0 ≤ t ≤ T) leads at once to the
determination of U(t) over (−∞, ∞). This follows from the periodicity property.
U(t + T) = P(t)e^{(t+T)C} = P(t)e^{tC} e^{TC} = U(t)e^{TC} = U(t)M
Let U(0) = U0 . Then

U(T) = U0 e^{TC} = U0 M

⇒

C = (1/T) ln[U0^{−1} U(T)].

In particular, if U(0) = I, then

C = (1/T) ln U(T).
Thus, C can be determined from U(T). Each column of U(T) may be determined by
integrating the IVP
du/dt = A(t)u;   u(0) = ej .
Definition. M is called the monodromy matrix. The eigenvalues of M are called the
characteristic multipliers of the periodic initial value problem. The matrix C is called
the Floquet matrix and the eigenvalues of C are called the characteristic exponents or
Floquet exponents.
The characteristic multipliers μj and the characteristic exponents ρj are related by Tρj = ln μj . If we take the principal value of the logarithm, then ρj is also uniquely defined.
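The monodromy matrix can be computed exactly as described: integrate the IVP once per unit vector over one period. A sketch (SciPy assumed; the diagonal example A(t) is ours, chosen so that M = I is known in advance):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Columns of U(T) with U(0) = I form the monodromy matrix M;
# the eigenvalues of M are the characteristic multipliers.
def monodromy(Afun, T):
    n = Afun(0.0).shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = 1.0
        sol = solve_ivp(lambda t, u: Afun(t) @ u, (0.0, T), e,
                        rtol=1e-10, atol=1e-12)
        M[:, j] = sol.y[:, -1]
    return M

# Known answer: A(t) = diag(cos t, -cos t), T = 2*pi gives
# U(t) = diag(e^{sin t}, e^{-sin t}), so M = I and mu = 1, 1.
Afun = lambda t: np.array([[np.cos(t), 0.0], [0.0, -np.cos(t)]])
M = monodromy(Afun, 2.0 * np.pi)
mult = np.linalg.eigvals(M)
```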
Lemma 13.1. M is not unique but any two are related by a similarity transform.
Proof. Let U1 (t) and U2 (t) be two fundamental matrices with monodromy matrices M1 and M2 :

U1 (t + T) = U1 (t)M1 ,   U2 (t + T) = U2 (t)M2 .

Since U2 (t) = U1 (t)D for some constant nonsingular matrix D,

M2 = U2^{−1} (t)U2 (t + T) = D^{−1} U1^{−1} (t)U1 (t + T)D = D^{−1} M1 D.
In order to see the explicit form of the solution of the periodic system (equations
(13.6) and (13.7)), we consider the simple case in which matrix C is diagonalizable, i. e.,
∃ a nonsingular constant matrix S such that
C = SρS^{−1}

where

ρ = diag(ρ1 , . . . , ρn ).

Then

e^{tC} = S diag(e^{ρ1 t} , . . . , e^{ρn t} ) S^{−1}

and

U(t) = P(t) S diag(e^{ρ1 t} , . . . , e^{ρn t} ) S^{−1} .

From this, it is clear that U(t) consists of columns u1 (t), . . . , un (t), which are of the form

ui (t) = pi (t)e^{ρi t}

so that the general solution may be written as

u = p1 (t)e^{ρ1 t} + ⋅ ⋅ ⋅ + pn (t)e^{ρn t} .
Proof. This follows from above discussion. Note that from this lemma it follows that
u(t) is a periodic solution iff ∃ a characteristic exponent of C, which is zero, i. e., ρi = 0
for some i or equivalently μi = +1 for some i.
Lemma. The product of the characteristic multipliers is given by

∏_{i=1}^n μi = exp{∫_0^T tr A(s) ds}.

Proof. Take U(0) = I, so that U(T) = M. By the Liouville formula,

det U(t) = det U(0) exp{∫_0^t tr A(s) ds}.

Let t = T:

det U(T) = det M = exp{∫_0^T tr A(s) ds},

but

det M = μ1 μ2 ⋅ ⋅ ⋅ μn = ∏_{i=1}^n μi .

∴ The result.
Problems
1. Consider the linear system with periodic coefficients:
du1 /dt = (− sin 2t)u1 + [(cos 2t) − 1]u2
du2 /dt = (1 + cos 2t)u1 + (sin 2t)u2
is a fundamental matrix.
(b) Obtain the corresponding monodromy matrix and determine the characteris-
tic multipliers and characteristic exponents.
4. Consider the nonlinear system
du1 /dt = u1 − u2 − u1^3 − u1 u2^2
du2 /dt = u1 + u2 − u1^2 u2 − u2^3   (13.9)

dz1 /dt = (−1 − cos 2t)z1 − [1 + sin 2t]z2
dz2 /dt = (1 − sin 2t)z1 + (cos 2t − 1)z2   (13.10)
is a fundamental matrix.
(d) Determine the monodromy matrix and the Floquet multipliers of the periodic system defined by equation (13.10).
5. Consider an ideal mixing tank with constant fluid density and tank volume but
periodically varying flow rate. With appropriate notation,
(a) Show that the concentration of a solute satisfies the scalar equation,
dc/dt = [(1 + ε sin(ωt))/τ] [cin (t) − c],   t > 0;   c = c0 @ t = 0
(b) Obtain the solution of the above equation for c0 = 0, and cin = H(t) = Heavi-
side’s unit step function.
(c) Obtain the solution for cin (t) = 0 and c0 = 1, and show a plot of the solution
for ε = 0 and ε > 0.
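Problem 5(b) can be explored numerically; because the equation is scalar and linear, a closed form is available for comparison (illustrative parameter values; SciPy assumed):

```python
import numpy as np
from scipy.integrate import solve_ivp

# dc/dt = (1 + eps*sin(w*t))/tau * (cin - c), c(0) = 0, cin = 1 (unit step).
# Closed form: c(t) = 1 - exp(-I(t)),  I(t) = [t + (eps/w)*(1 - cos(w*t))]/tau.
tau, eps, w = 1.0, 0.5, 2.0
rhs = lambda t, c: (1.0 + eps * np.sin(w * t)) / tau * (1.0 - c)
sol = solv = solve_ivp(rhs, (0.0, 5.0), [0.0], rtol=1e-10, atol=1e-12)
c_end = sol.y[0, -1]

I = (5.0 + (eps / w) * (1.0 - np.cos(w * 5.0))) / tau
c_exact = 1.0 - np.exp(-I)
```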
14 Analytic solutions, adjoints and integrating
factors
We have already shown that the solution to linear scalar and vector differential equa-
tions with constant coefficients can be expressed analytically in terms of the eigenval-
ues. This chapter deals with other cases in which it is possible to express the solutions
in explicit form.
14.1 Analytic solutions

Consider the n-th order linear equation

p0 (t) d^n u/dt^n + p1 (t) d^{n−1} u/dt^{n−1} + ⋅ ⋅ ⋅ + p_{n−1} (t) du/dt + pn (t)u = f (t)

or

Lu = f (t)   (14.1)

with the corresponding homogeneous equation

Lu = 0.   (14.2)
Under certain special conditions, we can obtain the solution of equation (14.2) analyt-
ically. We discuss some of these special cases here.
(i) Scale invariance in t

The second-order equation

p0 (t) d^2 u/dt^2 + p1 (t) du/dt + p2 (t)u = 0   (14.3)

is scale invariant in t if the substitution t → at leaves it unchanged; this requires p0 ∝ t^2 , p1 ∝ t and p2 = constant, which gives

α0 t^2 u′′ + α1 t u′ + α2 u = 0.

Let t′ = at ⇒ d/dt = a d/dt′ .
⇒

α0 (t′/a)^2 a^2 d^2 u/dt′^2 + α1 (t′/a) a du/dt′ + α2 u = 0

i. e.,

α0 t′^2 d^2 u/dt′^2 + α1 t′ du/dt′ + α2 u = 0.
Thus, the equation is invariant to the scaling t → at. Scale invariant equations can be
converted to constant coefficient equations by the transformation
t = ex or x = ln t (14.4)
Since t = e^x ,

d/dt = (1/t) d/dx ⇒ t d/dt = d/dx

and

d^2/dt^2 = (1/t^2) d^2/dx^2 − (1/t^2) d/dx ⇒ t^2 d^2/dt^2 = d^2/dx^2 − d/dx.

Thus, using the transformation given by equation (14.4), equation (14.3) reduces to

α0 d^2 u/dx^2 + (α1 − α0 ) du/dx + α2 u = 0.
Alternatively, direct substitution of u = t^λ , with

u′ = du/dt = λ t^{λ−1} ,   u′′ = λ(λ − 1) t^{λ−2} ,

gives the characteristic equation

α0 λ(λ − 1) + α1 λ + α2 = 0   or   α0 λ^2 + (α1 − α0 )λ + α2 = 0.

For distinct roots λ1 , λ2 , the general solution is

u(t) = c1 t^{λ1} + c2 t^{λ2} ,

while a repeated root (e. g., λ = 1/2) gives

u(t) = c1 √t + c2 √t ln t.

Thus, for scale invariant equations in t, the linearly independent solutions are of the form t^λ or t^λ ln t for some λ determined by the characteristic equation.
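The characteristic-root recipe for scale-invariant equations is easy to check numerically (illustrative coefficients, not from the text; NumPy assumed):

```python
import numpy as np

# Euler equation a0*t^2 u'' + a1*t u' + a2*u = 0; u = t^lam gives
# a0*lam^2 + (a1 - a0)*lam + a2 = 0.
a0, a1, a2 = 2.0, 3.0, -1.0
lams = np.roots([a0, a1 - a0, a2])     # roots 1/2 and -1: u = c1*sqrt(t) + c2/t

# residual of the ODE for each u = t^lam at a few sample points
t = np.array([0.5, 1.0, 2.0])
resids = []
for lam in lams:
    u = t**lam
    up = lam * t**(lam - 1)
    upp = lam * (lam - 1) * t**(lam - 2)
    resids.append(a0 * t**2 * upp + a1 * t * up + a2 * u)
```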
(ii) Scale invariant equations in u
The equation

p0 (t) d^2 u/dt^2 + p1 (t) du/dt + p2 (t)u = 0   (14.5)

is invariant under the scaling u → au. Let

u = e^{w(t)}

⇒

u′ = e^w w′ = uw′ ,   u′′ = e^w (w′ )^2 + uw′′ = u[(w′ )^2 + w′′ ]

⇒

p0 (t) u[(w′ )^2 + w′′ ] + p1 (t) uw′ + p2 (t)u = 0

⇒

p0 [(w′ )^2 + w′′ ] + p1 w′ + p2 = 0.   (14.7)

Let w′ = y ⇒

p0 (t) dy/dt + p0 (t)y^2 + p1 (t)y + p2 (t) = 0,   (14.8)

which is a Riccati equation.
Examples.
(1) (u′ )^2 + uu′′ + t = 0 is scale invariant under t → at and u → a^{3/2} u.
(2) du/dt = F(u/t) is scale invariant under t → at and u → au.
First-order equations
(a) The linear equation
du/dt + p(t)u = q(t)   (14.9)

(b) Bernoulli's equation

du/dt + p(t)u = q(t)u^m ,   (14.10)

which is reduced to a linear equation by the substitution

y = [u(t)]^{1−m} .

(c) The Riccati equation

du/dt = a(t)u^2 + b(t)u + c(t)   (14.11)
dt
a(t) = 0 ⇒ linear equation
c(t) = 0 ⇒ Bernoulli’s equation
When a ≠ 0, let
u = −w′ (t)/[a(t)w(t)]

⇒

w′′ (t) − [a′ (t)/a(t) + b(t)] w′ (t) + a(t)c(t)w = 0.
Finally, consider the general first-order equation

a(u, t) du/dt + b(u, t) = 0

or

a(u, t) du + b(u, t) dt = 0.   (14.14)

If a(u, t) = a1 (u)a2 (t) and b(u, t) = b1 (u)b2 (t), then equation (14.14) is separable and can be solved by quadratures. If

∂a/∂t = ∂b/∂u
then we have an exact equation that can be solved analytically.
A listing of all ODEs (linear and nonlinear) that have analytic solutions can be found
in the book by E. Kamke [21].
14.2 Adjoints and integrating factors

Consider the first-order operator

Lu ≡ p0 (t) du/dt + p1 (t)u = 0   (14.15)

and multiply by a function v(t):

vLu ≡ v(t)p0 (t) du/dt + v(t)p1 (t)u = 0.   (14.16)

Using

d/dt (vp0 u) = vp0 du/dt + u d/dt (vp0 ),

we can write

vLu ≡ d/dt [p0 (t)u v] − u d/dt (p0 (t)v) + v p1 (t)u = 0

⇒ vLu ≡ (p0 uv)′ + u[−(p0 v)′ + p1 v] = 0.   (14.17)

If v(t) is chosen so that

−(p0 v)′ + p1 v = 0,   (14.18)

then the LHS of equation (14.17) is an exact derivative. The function v(t) is called the integrating factor of equation (14.15). To find the integrating factor of equation (14.15), we end up with equation (14.18), and to find the integrating factor of equation (14.18) we end up with equation (14.15). Hence, Lagrange called equation (14.18) the adjoint equation.
Thus, the adjoint operator of equation (14.15) is defined by
L∗ v = −(p0 v)′ + p1 v = −p0 (t) dv/dt + [p1 (t) − p′0 (t)]v.   (14.19)

Also, we have the Lagrange identity

vLu − uL∗ v = d/dt [p0 (t)u v].   (14.20)

If v is a nontrivial solution of the adjoint equation

L∗ v = −p0 (t) dv/dt + [p1 (t) − p′0 (t)]v = 0,   (14.21)

then

vLu = d/dt [v(t)p0 (t)u(t)].   (14.22)
Thus, the solution of the adjoint equation gives an integrating factor to equation
(14.15). Now suppose that u(t) satisfies Lu = 0. Then equation (14.20) gives
uL∗ v = − d/dt [p0 (t)u v].   (14.23)
Thus, u(t) is an integrating factor for the adjoint equation. It is also seen that (L∗ )∗ = L,
i. e., the adjoint of the adjoint equation is the original equation.
If Lu = 0 and L∗ v = 0, equation (14.20) gives
d/dt (vp0 u) = 0   (14.24)

or

v(t)p0 (t)u(t) = constant.
Next, consider the second-order operator

Lu = p0 (t) d^2 u/dt^2 + p1 (t) du/dt + p2 (t)u.

Then

vLu = (vp0 u′ )′ − [u(p0 v)′ ]′ + u(p0 v)′′ + (vp1 u)′ − u(p1 v)′ + vp2 u
    = u[(p0 v)′′ − (p1 v)′ + vp2 ] + (vp0 u′ )′ − [u(p0 v)′ ]′ + (vp1 u)′
    = uL∗ v + d/dt [vp0 u′ − u(p0 v)′ + vp1 u]

where the adjoint operator is now

L∗ v = (p0 v)′′ − (p1 v)′ + p2 v

⇒

vLu − uL∗ v = d/dt [π(u, v)]

where the concomitant π(u, v) is defined by

π(u, v) = vp0 u′ − u(p0 v)′ + vp1 u,

and

k(u) = (u, u′ )^T = Wronskian vector.

If Lu = 0 and L∗ v = 0, then

π(u, v) = constant.

In particular, if ψ1 and ψ2 are two linearly independent solutions of the adjoint equation with π(u, ψ1 ) = c01 and π(u, ψ2 ) = c02 , these two relations, linear in u and u′ , may be solved for u:

u(t) = [ψ2 (t)c01 − ψ1 (t)c02 ]/[p0 (t)W(t)]   (14.30)

where

W(t) = det [ ψ1  ψ2 ; ψ1′  ψ2′ ] = ψ1 ψ2′ − ψ1′ ψ2 ≠ 0.
Equation (14.30) gives the general solution to Lu = 0. In the general case, we have n
relations
π(u, ψi ) = c0i , i = 1, 2, . . . n
Since π is linear in u, u′ , u′′ , . . . , u[n−1] (t), we can eliminate (or solve) for these variables.
The same reasoning also applies to the vector equation.
14.4 Vector initial value problem

Consider the vector equation

du/dt = A(t)u.   (14.31)

If we define the operator

Lu = du/dt − A(t)u   (14.32)

then

v^T Lu = v(t)^T du/dt − v(t)^T A(t)u = d/dt (v^T u) − (dv^T /dt) u − v(t)^T A(t)u   (14.33)

is an exact derivative if

(dv^T /dt) u + v^T A(t)u = 0

or

dv^T /dt = −v^T A(t)

or

dv/dt = −A(t)^T v.   (14.34)

Thus, the adjoint operator is

L∗ v = −dv/dt − A(t)^T v   (14.35)

and

v^T Lu = d/dt (v^T u) + (L∗ v)^T u

⇒

v^T Lu − (L∗ v)^T u = d/dt (v^T u).   (14.36)
Thus, if u(0) = α, the Lagrange condition: ⟨Lu, v⟩ = ⟨u, L∗ v⟩ can be satisfied when we
have vT (a)u(a) = vT (0)α. If we choose v(a) = α, then αT u(a) = vT (0)α = αT v(0) ⇒
v(0) = u(a), i. e., the final condition of original IVP is the initial condition of the adjoint
problem.
Thus, if we integrate the IVP,
du/dt = A(t)u,   0 < t ≤ a   with   u(t = 0) = α,   (14.38)

in the forward direction, we can integrate the adjoint IVP

dv/dt = −A^T (t)v,   0 ≤ t < a   with   v(t = a) = α   (14.39)
in the backward direction to get v(0). Thus, forward integration of equation (14.38) and backward integration of equation (14.39) are coupled. Further, if the final condition of the original IVP is u(a) = β, the initial condition of the adjoint problem is v(0) = u(a) = β. In other words, the adjoint IVP can be integrated in the forward direction with the initial condition v(0) = β = u(a) to get v(a) = α = u(0). Thus, the adjoint problem may be
used to determine what initial condition on equation (14.31) may lead to a given final
state at t = a. This observation is useful in many applications.
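The backward-adjoint observation can be illustrated for constant A, where the forward solution is known in closed form (illustrative values; SciPy assumed). Along trajectories, v^T u is constant, so integrating the adjoint backward from v(a) = e1 yields a vector v(0) with v(0)^T u(0) = [u(a)]_1 for any initial state u(0):

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
a = 2.0

# backward integration of dv/dt = -A^T v from t = a to t = 0, v(a) = e1
sol = solve_ivp(lambda t, v: -A.T @ v, (a, 0.0), [1.0, 0.0],
                rtol=1e-10, atol=1e-12)
v0 = sol.y[:, -1]

u0 = np.array([0.7, -1.3])      # arbitrary initial state
u_a = expm(A * a) @ u0          # exact forward solution at t = a
# invariance of v^T u implies v0 @ u0 == first component of u(a)
```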
Let u1 , u2 , . . . , un be n linearly independent solutions of
Lu ≡ du/dt − A(t)u = 0

and let v be a solution of the adjoint equation

dv/dt + A^T (t)v = 0;
then we have
d/dt (v^T u) = 0 ⇒ v^T u = constant

⇒

v^T ui = ci ,   i = 1, 2, . . . , n

⇒

v^T U = c^T   (14.40)

or

U^T v = c ⇒ v = [U^T ]^{−1} c.   (14.41)
Thus, we can find solutions of the adjoint equation if we know the solution of Lu = 0. It can also be shown that [U(t)^T ]^{−1} is a fundamental matrix for the adjoint equation.
We return to the concept of adjoint again when we deal with boundary value problems
in Chapter 18.
Problems
1. (Linear first-order equation): Find the general solution (or a particular solution if
the initial condition is given) of the following first-order differential equations:
(a) dy/dx + y cos x = (1/2) sin 2x, (b) (1 − x^2 ) dy/dx + 2xy = √(x^2 /(1 − x^2 )); y(0) = 0, (c) dy/dx − y tan x = e^x sec x and (d) (1 + x^3 ) dy/dx + 2xy = 4x^2
2. (Bernoulli’s equation): Find the general solution of the following first-order dif-
(a) x dy/dx + y = y^2 ln x, (b) dy/dx + y/x = e^x y^2 , (c) (1/y^2 ) dy/dx + 1/(xy) = 1 and (d) dy/dx = x^2 y^3 − xy
3. (Transient behavior of a mixing tank): A tank initially holds 80 gal of a brine solu-
tion containing 0.125 lb of salt per gallon. At t = 0, another brine solution contain-
ing 1 lb of salt per gallon is poured into the tank at a rate of 4 gal/min, while the
well-stirred mixture leaves the tank at a rate of 8 gal/min. (a) Formulate a model
for describing the transient behavior of the tank and (b) Find the amount of salt
in the tank when the tank contains exactly 40 gal of the solution.
4. (Compound interest modeling): A depositor currently has $6,000 and plans to in-
vest it in an account that accrues interest continuously. What is the required in-
terest rate if the depositor needs to have $10,000 in 4 years? Formulate the model,
solve it and use the solution to calculate the required rate of interest.
5. (Application of Newton’s second law for a free falling body): A body weighing m kg
is dropped from a height H with no initial velocity. As it falls, the body encoun-
ters a force due to air resistance, which may be assumed to be proportional to its
velocity. If the limiting/terminal velocity of this body is v0 m/s, determine (a) an
expression for the velocity of the body at any time t and (b) an expression for the
position of the body at any time t.
6. (Cooling of a pie): A hot pie that was cooked at 325 ∘ F is taken directly from an oven
and placed outdoors in the shade to cool on a day when the air temperature in the
shade is 85 ∘ F. After 5 minutes in the shade, the temperature of the pie had been
reduced to 250 ∘ F. Determine (a) the temperature of the pie after 20 minutes and
(b) the time at which the pie cools to a temperature of 90 ∘ F.
7. (Population growth): The population of a certain state is known to grow at a rate
proportional to the number of people presently living in the state. If after 10 years
the population has trebled and if after 20 years the population is 200,000, find the
number of people initially living in the state.
8. (First-order process model): The process model for a first-order system is given by
τ dy/dt + y = f (t);   y(0) = 0
dt
Here, τ > 0 is the first-order time constant. (a) Determine and plot the response of
the system for a unit step input and a unit impulse input and (b) Determine and
plot the response of the system (amplitude and phase lag) when f (t) = A sin ωt.
(π/6) dp^3 ρs dv/dt = (π/6) dp^3 ρs g − (π/6) dp^3 ρf g − 3πμ dp v

(a) Explain the meaning of each term in the above equation and (b) Solve the above model with the initial condition specified and show that the velocity of the particle
at any time may be expressed as
v(t) = v∞ [1 − exp(−18μt/(ρs dp^2 ))]
where v∞ is the terminal velocity (for t → ∞). (c) Determine an expression for v∞ .
11. (Bernoulli’s equation): (a) Find the general solution of the following first-order dif-
ferential equations:
(i) x dy/dx + y = y^2 ln x   (ii) dy/dx + y/x = e^x y^2
Consider a population balance model in which the birth rate is proportional to the
square of the population size while the death rate varies linearly: (i) With appro-
priate assumptions, show that the evolution equation is of the form
dN/dt = bN^2 − aN
where a and b are positive constants and N(t) is the population size at time t,
(ii) Solve the above equation and plot the solution for the two cases of N(t = 0) < a/b and N(t = 0) > a/b.
12. (Consecutive first-order reactions in a batch reactor): Consider a well-stirred batch
reactor in which the consecutive reactions A → B → C occur. Assume that the
density of the reaction mixture (and the volume of the reactor) remains constant
and at time zero the reactor is charged with a solution containing only reactant
A at a concentration of C0 . Further assume that the rate of the first reaction is
given by r1 = −rA = k1 CA and that of the second reaction by r2 = rC = k2 CB :
(a) Formulate the differential equations (and the initial conditions) describing the
concentrations of all the species as a function of time, (b) Solve the equations
in (a) and determine the concentrations and (c) Determine the time at which the
concentration of species B is maximum.
13. The process model for a second order system is given by
τ^2 d^2 y/dt^2 + 2γτ dy/dt + y = f (t);   y(0) = 0;   y′ (0) = 0
Here, τ > 0 is the system time constant, while γ > 0 is called the damping con-
stant. (a) Determine and plot the response of the system for a unit step input and
a unit impulse input and (b) Determine and plot the response of the system (am-
plitude and phase lag) when f (t) = A sin ωt for three cases: γ = 0.1, 1, 2.
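The step responses asked for in problem 13 can be generated by converting to a first-order system in (y, y′); a sketch (SciPy assumed; τ = 1):

```python
import numpy as np
from scipy.integrate import solve_ivp

# tau^2 y'' + 2*gamma*tau*y' + y = 1 (unit step), y(0) = y'(0) = 0
def step_response(gamma, tau=1.0, tend=30.0):
    rhs = lambda t, s: [s[1], (1.0 - s[0] - 2.0 * gamma * tau * s[1]) / tau**2]
    return solve_ivp(rhs, (0.0, tend), [0.0, 0.0], rtol=1e-9, atol=1e-11)

under = step_response(0.1)       # underdamped: oscillates and overshoots y = 1
over = step_response(2.0)        # overdamped: monotonic approach to y = 1
y_under_max = under.y[0].max()
y_over_max = over.y[0].max()
```

For γ < 1 the peak overshoot is exp(−γπ/√(1 − γ²)) above the final value, which the underdamped run exhibits clearly.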
14. The transient response of a U-tube manometer to changes in pressure is described
by the initial value problem
d^2 h/dt^2 + (24μ/(ρd^2 )) dh/dt + (3g/(2L)) h = (3/4) Δp/(ρL);   t > 0,   h(0) = 0;   h′ (0) = 0
where h is the deviation of the manometer liquid level from the equilibrium po-
sition, t is time, g is the gravitational acceleration, d is the tube diameter, ρ and
μ are the density and viscosity of the manometer fluid, L is the total length of the
fluid column and Δp is the change in pressure (a) Cast the above model in dimen-
sionless form (b) Determine the critical tube radius above which the response is
oscillatory (c) Determine and plot the transient response of the manometer for
a step change in the pressure when the tube diameter is twice the critical value
determined in (b).
15. The motion of a periodically driven pendulum (for small amplitudes), the clas-
sic mechanical spring-mass-dashpot oscillator and the RLC (resistor-inductor-
capacitor) electric circuit may be described by the second- order IVP:
d^2 u/dt^2 + 2γ du/dt + ω0^2 u = f (t);   u(0) = 0;   u′ (0) = 0
18. Consider the second-order process model described by the following IVP:
τ^2 d^2 y/dt^2 + 2γτ dy/dt + y = f (t);   y(0) = 0;   y′ (0) = 0
Determine and plot the response of the system for a unit step input when τ = 1
and three values of the damping constant: γ = 0.1, 1, 2
19. The motion of a periodically driven pendulum (for small amplitudes) may be de-
scribed by the second-order IVP:
d^2 u/dt^2 + 2γ du/dt + ω0^2 u = f (t);   u(0) = 0;   u′ (0) = 0
20. The transient response of a U-tube manometer (cf. problem 14) is described by

d^2 h/dt^2 + (24μ/(ρd^2 )) dh/dt + (3g/(2L)) h = (3/4) Δp/(ρL);   t > 0,   h(0) = 0;   h′ (0) = 0

(a) If the viscous damping term can be neglected, determine the period of oscillation
(or the natural frequency) for the case where the total length of the liquid col-
umn is 2 meters (b) Examine the eigenvalues of the homogeneous equation with
the damping term and derive a formula for the critical diameter of the tube be-
low which the system does not oscillate. Calculate this value when the manome-
ter fluid is water at 20∘ C with density ≈ 1.0 g/cm3 and viscosity ≈ 1 centipoise
(= 0.01 g.cm−1 s−1 ).
21. Find a general solution of the inhomogeneous equation y′′ + 3y′ + 2y = f(x) for the following cases: (i) f(x) = 1, (ii) f(x) = x², (iii) f(x) = e⁻⁴ˣ, (iv) f(x) = e⁻ˣ and (v) f(x) = cos x
22. Find a general solution of the inhomogeneous equation y′′ + 2y′ + y = f(x) for the following cases: (i) f(x) = 1, (ii) f(x) = x², (iii) f(x) = e⁻⁴ˣ, (iv) f(x) = e⁻ˣ and (v) f(x) = sin x
23. Find a general solution of the inhomogeneous equation y′′ + 4y = f(x) for the following cases: (i) f(x) = 1, (ii) f(x) = x², (iii) f(x) = e⁻⁴ˣ, (iv) f(x) = e⁻ˣ and (v) f(x) = sin 2x
24. Find a general solution of the inhomogeneous equation y′′ + 2y′ + 5y = f(x) for the following cases: (i) f(x) = 1, (ii) f(x) = x², (iii) f(x) = e⁻ˣ sin 2x, (iv) f(x) = e⁻ˣ and (v) f(x) = sin 2x
25. Use the method of variation of parameters to obtain a general formula for the solution of the equation in problem (21) and then solve the specific cases.
14.4 Vector initial value problem
26. Find a general solution of the following ODEs: (i) y⁗ + 2y′′ + y = 0, (ii) y⁗ + 4y′′ = 0 and (iii) y′′′ + 4y′′ + 13y′ = 0
27. Determine the solution of the initial value problem:
t³ d³y/dt³ + t dy/dt − y = t²; y(1) = 1; y′(1) = 3; y′′(1) = 14
u′ = A(t)u (14.42)
(a) Show that the expression vᵀLu is an exact derivative if v satisfies the adjoint equation
v′ = −A(t)ᵀv (14.43)
This is known as Lagrange's identity; by integrating it we get the so-called Green's formula.
(c) If U is a fundamental matrix for equation (14.42) and V for equation (14.43)
show that UT V = C, where C is a nonsingular constant matrix.
(d) Discuss the relationship between scalar and vector adjoints for the case n = 2.
15 Introduction to the theory of functions of a complex variable
The theory of functions of a complex variable is helpful in determining and under-
standing the solutions (and their properties) of linear equations. Specifically, it is use-
ful for (i) inverting the Laplace transformation, (ii) evaluation and inversion of Fourier
transforms, (iii) series solutions of ordinary differential equations, (iv) solution of lin-
ear partial differential equations (conformal mapping and solution of Laplace’s equa-
tion in the plane) and (v) evaluation of certain definite integrals. This chapter pro-
vides an introduction to the theory of functions of a complex variable with examples
selected from applications.
We have already seen that the set of all complex numbers forms a field, which we shall
denote by ℂ. The symbol z = x + iy, which can stand for any complex number in the
set ℂ is called a complex variable. We shall use the notation Re{z} = real part of z = x
and Im{z} = imaginary part of z = y. The complex conjugate of z is a complex number
x − iy and will be denoted by z or z ∗ .
If z1 = x1 + iy1 ∈ ℂ and z2 = x2 + iy2 ∈ ℂ, then the usual algebraic operations
(addition/subtraction and multiplication/division) are defined by
z1 ± z2 = (x1 ± x2 ) + i(y1 ± y2 )
z1 z2 = (x1 + iy1 )(x2 + iy2 )
= (x1 x2 − y1 y2 ) + i(x1 y2 + x2 y1 )
z₁/z₂ = (x₁ + iy₁)/(x₂ + iy₂) = (x₁ + iy₁)(x₂ − iy₂)/(x₂² + y₂²)
= (x₁x₂ + y₁y₂)/(x₂² + y₂²) + i(x₂y₁ − x₁y₂)/(x₂² + y₂²) (for z₂ ≠ 0)
Since a complex number z = x + iy can be identified with the ordered pair (x, y), it
can be represented as a point in the x − y plane, called the complex plane or Argand
diagram. This is shown in Figure 15.1.
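The component formulas above can be checked directly against Python's built-in complex arithmetic (a small illustration of ours, not from the text):

```python
# Verify the product and quotient component formulas for two sample numbers.
z1, z2 = 3 + 4j, 1 - 2j
x1, y1, x2, y2 = z1.real, z1.imag, z2.real, z2.imag

prod = complex(x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)    # product formula
denom = x2**2 + y2**2
quot = complex((x1 * x2 + y1 * y2) / denom,             # quotient formula
               (x2 * y1 - x1 * y2) / denom)
```

Both agree with the built-in operations `z1 * z2` and `z1 / z2`.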
https://doi.org/10.1515/9783110739701-016
15.1 Complex valued functions
In polar coordinates,
x = r cos θ; y = r sin θ; r = √(x² + y²),
where r is the modulus and the angle θ with the positive x-axis (in the counterclockwise direction) is called the argument. In polar form, the complex number is written as
z = r(cos θ + i sin θ) = re^{iθ} (15.1)
The polar form of the complex number is convenient for some operations such as multiplication or division. For example, if we denote zₖ = rₖe^{iθₖ} (k = 1, 2, . . . , n), then
z₁z₂ ⋅ ⋅ ⋅ zₙ = r₁r₂ ⋅ ⋅ ⋅ rₙ e^{i(θ₁+θ₂+⋅⋅⋅+θₙ)}
Applying this to the n-th power of a complex number, we get De Moivre's theorem
zⁿ = (re^{iθ})ⁿ = rⁿ(cos θ + i sin θ)ⁿ = rⁿ(cos nθ + i sin nθ) (15.2)
or
(cos θ + i sin θ)ⁿ = cos nθ + i sin nθ (15.3)
By expanding the RHS of equation (15.3) and equating the real and imaginary parts,
we can express cos nθ or sin nθ (for n = 1, 2, 3, . . .) in terms of cos θ and sin θ.
If n is a positive integer, then from De Moivre's theorem and the relation e^{2kπi} = 1 for k = 0, 1, 2, . . ., we can write
z^{1/n} = (re^{iθ})^{1/n} = r^{1/n}(e^{iθ+2kπi})^{1/n} = r^{1/n} e^{i(θ+2kπ)/n}, k = 0, 1, . . . , n − 1 (15.4)
It follows from equation (15.4) that there are n distinct values of z^{1/n}, located on a circle of radius r^{1/n} and with arguments differing by 2π/n. For example, for n = 4 and r = 1, we get the fourth roots of unity: 1, −1, i and −i, located on a circle of unit radius
as shown in Figure 15.2.
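Equation (15.4) can be exercised with the stdlib `cmath` module (a Python sketch of ours, not code from the book):

```python
# Sketch (ours) of equation (15.4): the n distinct n-th roots of
# z = r e^{i*theta}, computed from modulus and argument.
import cmath
import math

def nth_roots(z, n):
    r, theta = abs(z), cmath.phase(z)
    return [r**(1 / n) * cmath.exp(1j * (theta + 2 * math.pi * k) / n)
            for k in range(n)]

roots = nth_roots(1, 4)   # fourth roots of unity: 1, i, -1, -i
```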
If to each value of the complex number z in a set, we assign another complex number
w such that w = f (z), where f is a function, then w is called a function of a complex
variable. The function can be single-valued or multivalued. In general, we write
w = f (z) = f (x + iy)
= u(x, y) + iv(x, y) (15.5)
where u(x, y) and v(x, y) are the real and imaginary part of f (z). Unless otherwise spec-
ified, we assume f (z) is single-valued.
15.2 Limits, continuity and differentiation
Example 15.1. We consider some elementary functions and determine their real and
imaginary parts.
1. Some elementary functions of a complex variables:
(a) w = z² = (x + iy)² = (x² − y²) + i(2xy) ≜ u(x, y) + iv(x, y).
(b) w = ez = ex+iy = ex (cos y + i sin y) = ex cos y + iex sin y ≜ u(x, y) + iv(x, y). The
function ez is periodic with complex period 2πi, since ez = ez+2πi
(c) w = sin z = sin(x + iy) = sin x cos iy + cos x sin iy = sin x cosh y + i cos x sinh y ≜
u(x, y) + iv(x, y). The function sin z is periodic with real period 2π.
2. Some functions in polar coordinates:
(a) w = z 2 = (reiθ )2 = r 2 e2iθ = r 2 cos 2θ + ir 2 sin 2θ ≜ u(r, θ) + iv(r, θ).
(b) w = ln z. Writing z = reiθ = reiθ+2kπi , k = 0, ±1, ±2, . . . we get w = ln(reiθ+2kπi ) =
ln r+i(θ+2kπ) ≜ u(x, y)+iv(x, y), k = 0, ±1, ±2, . . . . The function ln z is infinitely
many valued. The principal or main branch is given by ln r + iθ and is denoted
by Ln z.
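The decompositions in Example 15.1 can be verified numerically (a stdlib-only check of ours, not from the text):

```python
# Check (ours) of the decomposition sin z = sin x cosh y + i cos x sinh y.
import cmath
import math

z = 0.7 + 1.3j
x, y = z.real, z.imag
u = math.sin(x) * math.cosh(y)   # real part u(x, y)
v = math.cos(x) * math.sinh(y)   # imaginary part v(x, y)
```

The pair (u, v) matches `cmath.sin(z)` to machine precision.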
15.2.1 Limit
Let f (z) be a function defined in some neighborhood of a point z₀, with the possible exception of z₀ itself. We say that the limit of f (z) as z approaches z₀ is w₀ and write lim_{z→z₀} f (z) = w₀.
15.2.2 Continuity
Let f (z) be a function defined in a neighborhood of z0 . Then we say that f (z) is contin-
uous at z0 if (i) limz→z0 f (z) = w0 exists, (ii) f (z0 ) is defined and (iii) w0 = f (z0 ).
A function f (z) is said to be continuous in a region if it is continuous at all points
of the region.
15.2.3 Derivative
Remarks.
(i) Analyticity is a property defined over open sets, while differentiability could conceivably hold at one point only. For example, for the function f (z) = |z|², the derivative exists at z₀ = 0 but does not exist at any other point.
(ii) As in the case of functions of a real variable, differentiability implies continuity, but the converse is not true. For example, the function f (z) = |z|² is continuous everywhere but is differentiable only at z₀ = 0. Similarly, the function f (z) = z̄ is continuous everywhere but is not differentiable at any point.
f ′(z₀) = (∂u/∂x)(x₀, y₀) + i(∂v/∂x)(x₀, y₀) (15.10)
f ′(z₀) = (1/i)[(∂u/∂y)(x₀, y₀) + i(∂v/∂y)(x₀, y₀)]
= (∂v/∂y)(x₀, y₀) − i(∂u/∂y)(x₀, y₀) (15.11)
Thus, if f ′ (z0 ) exists, a necessary condition from equation (15.10) and equation (15.11)
is
∂u/∂x = ∂v/∂y, ∂u/∂y = −∂v/∂x at (x₀, y₀) (15.12)
In polar coordinates, these become
∂u/∂r = (1/r)(∂v/∂θ), ∂v/∂r = −(1/r)(∂u/∂θ) (15.13)
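The Cauchy–Riemann equations (15.12) can be illustrated numerically (our own finite-difference sketch, not the book's):

```python
# Numerical illustration (ours) of the Cauchy-Riemann equations for
# f(z) = z^2, i.e. u = x^2 - y^2 and v = 2xy, via central differences.
def u(x, y):
    return x * x - y * y

def v(x, y):
    return 2 * x * y

x0, y0, h = 0.8, -0.3, 1e-5
ux = (u(x0 + h, y0) - u(x0 - h, y0)) / (2 * h)
uy = (u(x0, y0 + h) - u(x0, y0 - h)) / (2 * h)
vx = (v(x0 + h, y0) - v(x0 - h, y0)) / (2 * h)
vy = (v(x0, y0 + h) - v(x0, y0 - h)) / (2 * h)
# Expect u_x = v_y and u_y = -v_x to discretization accuracy.
```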
Definition. A real-valued function ϕ(x, y) is said to be harmonic in a region ℛ if all its
second partial derivatives are continuous and
∂²ϕ/∂x² + ∂²ϕ/∂y² = 0 (Laplace's eq.) (15.14)
at each point of ℛ.
The proof of this theorem follows from the C–R equations. u and v are called the
conjugate harmonic functions.
w = Pₙ(z) = a₀ + a₁z + a₂z² + ⋅ ⋅ ⋅ + aₙzⁿ,
For example,
w₁(z) = z^{1/2}
w₂(z) = z^{1/2} + z^{1/3}
are algebraic functions. [w₁(z) is a two-valued function while w₂(z) is six-valued.]
w(z) = (az + b)/(cz + d); a, b, c, d ∈ ℂ.
4. Exponential functions
5. Logarithmic functions
w = ln z = ln(re^{iθ}) = ln(re^{iθ+2kπi}), k = 0, ±1, ±2, . . .
= ln r + i(θ + 2kπ), k = 0, ±1, ±2, . . .
As stated earlier, this function (which is the inverse of the exponential function) is infinitely many-valued. The principal branch, corresponding to k = 0, is denoted by Ln z = ln r + iθ.
z α = eα ln z
6. Trigonometric functions. The functions sin z = (e^{iz} − e^{−iz})/(2i), cos z = (e^{iz} + e^{−iz})/2, etc. are periodic with period 2π. Inverse functions such as
sin⁻¹ z = (1/i) ln(iz + √(1 − z²))
is infinite-valued.
7. Hyperbolic functions. Similarly, the inverse function
sinh⁻¹ z = ln(z + √(z² + 1))
is infinite-valued.
(a) Poles
If z₀ is a singular point of f (z) such that
lim_{z→z₀} (z − z₀)ⁿ f (z) = A ≠ 0,
then z₀ is called a pole of order n (a simple pole when n = 1).
Example 15.2.
(i) f₁(z) = e^z/(z − 1) has a simple pole at z = 1.
(ii) f₂(z) = 1/sin z has simple poles at z = ±kπ, k = 0, 1, 2, . . .
(iii) f₃(z) = 1/z³ has a pole of order 3 at z = 0.
Example 15.3.
(i) f1 (z) = √z + 1 has a branch point at z = −1.
(ii) f2 (z) = ln z has a branch point at z = 0.
(iii) f3 (z) = cos √z has no branch points and is analytic for all z.
For example, f (z) = sin √z/√z has a removable singularity at z = 0.
1
g(z) = f ( ).
z
Definition. If f (z) is analytic for all z (with |z| < ∞), it is called an entire function.
For example, sin z, cos √z, ez and J0 (z) are entire functions.
∫_C f (z) dz = lim_{n→∞, max|Δzₖ|→0} Sₙ (15.16)
From the above definition, it is seen that the complex line integral of f (z) = u(x, y) +
iv(x, y) can be expressed in terms of two real line integrals:
Further, many properties of the real definite/line integrals also apply to complex inte-
grals.
A region ℛ is called simply connected if any simple curve, which lies in ℛ can be
shrunk to a point without leaving ℛ (equivalently, the region ℛ has no holes). In Fig-
ure 15.4, ℛ1 and ℛ2 are simply-connected while ℛ3 and ℛ4 are not. The regions ℛ3
and ℛ4 are multiply-connected (or they have one or more holes).
Let ℛ be a region in the complex plane and C be its boundary. The boundary is said to
be traversed in the positive direction if an observer moving on C has the region to the
left. We use the notation
∮ f (z) dz (15.18)
C
Figure 15.4: Schematic diagram illustrating simply and multiply connected domains.
Example 15.4. For C the unit circle z = e^{iθ}, 0 ≤ θ ≤ 2π,
∮_C z³ dz = ∫₀^{2π} e^{i3θ} ie^{iθ} dθ = i ∫₀^{2π} e^{i4θ} dθ = 0
Cauchy's theorem. If f (z) is analytic inside and on a simple closed curve C, then
∮_C f (z) dz = 0. (15.19)
This fundamental theorem may be shown to be valid for both simply and multiply
connected domains. It was proved by Cauchy with the further assumption of f ′ (z) to
be continuous. However, Goursat removed the restriction and for this reason, it is also
referred to as the Cauchy–Goursat theorem. Cauchy’s proof utilizes Green’s theorem in
the plane.
Green’s theorem. Let P(x, y) and Q(x, y) be continuous and have continuous partial
derivatives in a region ℛ and on its boundary C. Then
15.3 Complex integration, Cauchy's theorem and integral formulas
∮_C (P dx + Q dy) = ∬_ℛ (∂Q/∂x − ∂P/∂y) dx dy (15.20)
where C1 and C2 are traversed in the positive sense (see Figure 15.5).
Figure 15.5: Schematic diagram illustrating positive traversal and Cauchy’s theorem.
Example 15.5. Evaluate ∮_C dz/(z − a), where C is any simple closed curve and z = a is (i) outside of C and (ii) inside C.
(i) If z = a is outside of C, by Cauchy's theorem ∮_C dz/(z − a) = 0.
(ii) If z = a is inside C, let C2 be a circle of radius ϵ centered at z = a. Then
I₁ = ∮_C dz/(z − a) = ∮_{C₂} dz/(z − a)
But on C₂,
z − a = ϵe^{iθ}; dz = iϵe^{iθ} dθ
Thus,
I₁ = ∮_{C₂} dz/(z − a) = ∫₀^{2π} iϵe^{iθ} dθ/(ϵe^{iθ}) = 2πi
Example 15.6. Evaluate Iₙ = ∮_C dz/(z − a)ⁿ, n = 2, 3, 4, . . . where C is any simple closed curve.
Using the same procedure as above, it is easily seen that In = 0 for both cases, i. e.,
when z = a is inside or outside of C.
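Examples 15.5–15.6 can be confirmed numerically (our own stdlib sketch, not from the book), using the trapezoidal rule, which converges spectrally fast for smooth periodic integrands:

```python
# Numerical check (ours) of Examples 15.5-15.6: integrate 1/(z - a)^n
# around the unit circle by the trapezoidal rule.
import cmath
import math

def contour_integral(f, center=0.0, radius=1.0, m=2000):
    total = 0j
    for k in range(m):
        t = 2 * math.pi * k / m
        z = center + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t) * (2 * math.pi / m)
        total += f(z) * dz
    return total

inside = contour_integral(lambda z: 1 / (z - 0.3))      # pole inside C
outside = contour_integral(lambda z: 1 / (z - 2.0))     # pole outside C
second = contour_integral(lambda z: 1 / (z - 0.3)**2)   # n = 2 case
```

The three results are 2πi, 0 and 0, matching the analysis.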
Let f (z) be analytic inside and on a simple closed curve C and let a be any point in-
side C. Then
f (a) = (1/2πi) ∮_C f (z)/(z − a) dz (15.22)
f ⁽ⁿ⁾(a) = (n!/2πi) ∮_C f (z)/(z − a)ⁿ⁺¹ dz, n = 1, 2, 3, . . . (15.23)
These formulas follow from Cauchy’s theorem and are remarkable as they imply that
if f (z) is known on a simple closed curve C, then the values of the function and all its
derivatives can be found at all points inside C. An extended form of equation (15.22)
is also useful for developing an inversion formula for the Laplace transform. It is also
useful for solving Laplace’s equation in two spatial dimensions with Dirichlet bound-
ary conditions.
The following are some consequences of Cauchy’s integral formulas:
1. Every polynomial equation of degree n ≥ 1,
Pₙ(z) ≡ a₀ + a₁z + ⋅ ⋅ ⋅ + aₙzⁿ = 0 (aₙ ≠ 0),
has at least one root (the fundamental theorem of algebra).
2. If f (z) is analytic inside and on a circle C with center at a and radius r, then f (a)
is the mean value of f (z) on C, i. e.,
f (a) = (1/2π) ∫₀^{2π} f (a + re^{iθ}) dθ
3. If f (z) is analytic inside and on a simple closed curve C except for a finite number of poles, and f (z) ≠ 0 on C, then
(1/2πi) ∮_C f ′(z)/f (z) dz = N − P
where N and P are, respectively, the number of zeros and poles of f (z) inside C.
4. Suppose that f (z) and g(z) are analytic inside and on a simple closed curve C and suppose that |g(z)| < |f (z)| on C. Then f (z) + g(z) and f (z) have the same number of zeros inside C. [This is also known as Rouché's theorem.]
For a proof of these and many other related theorems, we refer to the book by Spiegel
[29].
Example 15.7 (Poisson's integral formula for a circle). Let f (z) be analytic inside and on a circle C defined by |z| = R and let z = re^{iθ} be any point inside C. We have by Cauchy's integral formula
f (z) = f (re^{iθ}) = (1/2πi) ∮_C f (w)/(w − z) dw. (15.24)
The inverse of the point z with respect to the circle C lies outside of C and is given by R²/z̄. By Cauchy's theorem,
0 = (1/2πi) ∮_C f (w)/(w − R²/z̄) dw. (15.25)
Writing w = Re^{iϕ} on C, combining equations (15.24) and (15.25), and separating the real and imaginary parts of the resulting equation (15.27) gives
u(r, θ) = (1/2π) ∫₀^{2π} (R² − r²)u(R, ϕ)/(R² − 2Rr cos(θ − ϕ) + r²) dϕ (15.28)
and
v(r, θ) = (1/2π) ∫₀^{2π} (R² − r²)v(R, ϕ)/(R² − 2Rr cos(θ − ϕ) + r²) dϕ (15.29)
We note that Poisson's integral formula (15.28) is the solution of Laplace's equation
∇²u = (1/r)(∂/∂r)(r ∂u/∂r) + (1/r²)(∂²u/∂θ²) = 0
in a circle with boundary value u(R, ϕ) specified [Dirichlet problem]. A similar formula
may be obtained for the solution of ∇2 u = 0 in the upper half-plane (y > 0) with u(x, 0)
specified.
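Formula (15.28) is easy to check numerically (our own sketch, not the book's): take boundary data from a known harmonic function and compare the Poisson integral with the exact interior value.

```python
# Numerical sketch (ours) of Poisson's formula (15.28): boundary data
# u(R, phi) = R^2 cos(2*phi) comes from the harmonic function Re z^2, so
# the integral should reproduce u(r, theta) = r^2 cos(2*theta) inside.
import math

R, r, theta, m = 1.0, 0.5, 0.8, 2000
total = 0.0
for k in range(m):
    phi = 2 * math.pi * k / m
    kernel = (R * R - r * r) / (R * R - 2 * R * r * math.cos(theta - phi) + r * r)
    total += kernel * (R * R * math.cos(2 * phi))
u_num = total / m                      # the 1/(2*pi) cancels d(phi) = 2*pi/m
u_exact = r * r * math.cos(2 * theta)
```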
A sequence {uₙ(z)} is said to converge to the limit U(z) if, given ϵ > 0, we can find a number N (depending in general on both ϵ and z) such that |uₙ(z) − U(z)| < ϵ for all n > N. In such a case, we say that the sequence converges to U(z). If a sequence {uₙ(z)} con-
verges for all z in a region ℛ, we call ℛ the region of convergence of the sequence.
A sequence which is not convergent is called divergent.
15.4 Infinite series: Taylor's and Laurent's series
Example 15.8. Consider the sequence {uₙ = (2n − 1)/n + i(n + 2)/n} = {1 + 3i, 3/2 + 2i, 5/3 + i 5/3, . . .}. We claim that the sequence converges to 2 + i. Indeed,
|uₙ − (2 + i)| = |(2n − 1)/n + i(n + 2)/n − 2 − i| = |−1/n + i(2/n)| = √5/n
and √5/n < ε ⇒ n > √5/ε. Thus, if ε = 0.01, n > 223.6, so N = 223 and all terms after the 223rd are inside a circle of radius 0.01 centered at 2 + i.
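The index N of Example 15.8 can be found by direct search (a quick computation of ours, not from the text):

```python
# Find the first index from which u_n = (2n - 1)/n + i(n + 2)/n stays
# within 0.01 of the limit 2 + i.
n = 1
while abs(complex((2 * n - 1) / n, (n + 2) / n) - (2 + 1j)) >= 0.01:
    n += 1
# The distance is sqrt(5)/n, so the threshold is first met at n = 224,
# i.e., every term after the 223rd lies inside the 0.01 circle.
```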
Taylor's theorem. Let f (z) be analytic in a region ℛ and let z = a be any point in ℛ. Then there exists precisely one power series with center at z = a representing f (z), and it is given by
f (z) = Σ_{n=0}^∞ (f ⁽ⁿ⁾(a)/n!)(z − a)ⁿ = f (a) + (f ′(a)/1!)(z − a) + (f ″(a)/2!)(z − a)² + ⋅ ⋅ ⋅
The above representation is referred to as Taylor’s series expansion of f (z) and is valid
in the largest open disk with center at z = a in ℛ, i. e., the radius of convergence of the
above Taylor’s series is equal to the distance of the point z = a to the nearest singularity
of f (z), or to the boundary as shown below in Figure 15.6 schematically.
Figure 15.6: Schematic diagram illustrating the region of convergence of Taylor series.
Let f (z) be analytic in a region ℛ and let z = a be any point in ℛ. Then we have seen that f (z) has a power series representation
f (z) = Σ_{n=0}^∞ (f ⁽ⁿ⁾(a)/n!)(z − a)ⁿ
that converges uniformly in the disk |z − a| < b. The power series may be obtained either by evaluating the derivatives of f (z) or by other methods.
Example 15.9.
(a) f (z) = 1/(1 + z) = 1 − z + z² − z³ + ⋅ ⋅ ⋅ + (−1)ⁿzⁿ + ⋅ ⋅ ⋅ (|z| < 1)
(b) f (z) = ln(1 + z) = z − z²/2 + z³/3 − ⋅ ⋅ ⋅ + (−1)ⁿ⁻¹zⁿ/n + ⋅ ⋅ ⋅ (|z| < 1)
(c) f (z) = tan⁻¹ z = z − z³/3 + z⁵/5 − ⋅ ⋅ ⋅ + (−1)ⁿz²ⁿ⁺¹/(2n + 1) + ⋅ ⋅ ⋅ (|z| < 1)
In cases (b) and (c), the series represent the principal value of the function.
Example 15.10. Consider f (z) = 1/[z²(1 − z)], which is analytic for 0 < |z| < 1. Using the geometric series
1/(1 − z) = 1 + z + z² + z³ + ⋅ ⋅ ⋅ ; |z| < 1,
we obtain
f (z) = 1/z² + 1/z + 1 + z + z² + z³ + ⋅ ⋅ ⋅ ; 0 < |z| ≤ γ < 1
15.5 The residue theorem and integration by the method of residues
which is valid in the punctured disk 0 < |z| ≤ γ (i. e., all points of the disk |z| ≤ γ excluding the center). This is the Laurent series of the function f (z). The first part, containing the reciprocal powers of z, is called the principal part, while the rest of the series, containing the constant and positive powers of z, is called the analytic part of f (z).
Theorem (Laurent). If f (z) is analytic and single-valued on two concentric circles 𝒞1 and
𝒞2 with center at z = a and in the annulus between them, then f (z) can be represented
by the (Laurent) series
f (z) = Σ_{n=−∞}^∞ aₙ(z − a)ⁿ
where
aₙ = (1/2πi) ∮_𝒞 f (w)/(w − a)ⁿ⁺¹ dw
and 𝒞 is a closed curve in the annulus that encircles 𝒞2 (Figure 15.7). The series converges
and represents f (z) in the open annulus obtained from the given annulus by continuously
increasing 𝒞1 and decreasing 𝒞2 until each of the two circles reaches a point where f (z)
is singular.
If f (z) is analytic throughout a neighborhood of a point z = a, then by Cauchy's theorem
∮_𝒞 f (z) dz = 0 (15.30)
for any contour 𝒞 in that neighborhood. If, however, f (z) has a pole or an isolated
essential singularity at z = a, and z = a lies in the interior of 𝒞 , then the integral
(equation (15.30)) will, in general, be different from zero. In this case, we may represent
f (z) by a Laurent series
f (z) = Σ_{n=0}^∞ aₙ(z − a)ⁿ + Σ_{n=1}^∞ bₙ/(z − a)ⁿ, (15.31)
which converges in the annulus 0 < |z − a| < R. Here, R is the distance from z = a to
the nearest singularity of f (z) and
aₙ = (1/2πi) ∮_{𝒞₂} f (w)/(w − a)ⁿ⁺¹ dw (15.32)
bₙ = (1/2πi) ∮_{𝒞₁} (w − a)ⁿ⁻¹ f (w) dw (15.33)
where 𝒞1 and 𝒞2 are the contours enclosing the point z = a as shown in Figure 15.7.
In particular,
b₁ = (1/2πi) ∮_{𝒞₁} f (w) dw (15.34)
The coefficient b₁ in the expansion (equation (15.31)) is called the residue of f (z) at z = a.
Formally, we may obtain equation (15.34) from equation (15.31) by integrating term
by term as follows:
∮_{𝒞₁} f (z) dz = ∮_{𝒞₁} [Σ_{n=0}^∞ aₙ(z − a)ⁿ + Σ_{n=1}^∞ bₙ/(z − a)ⁿ] dz
= Σ_{n=0}^∞ aₙ ∮_{𝒞₁} (z − a)ⁿ dz + Σ_{n=1}^∞ bₙ ∮_{𝒞₁} dz/(z − a)ⁿ
= 0 + Σ_{n=1}^∞ bₙ ∮_{𝒞₁} dz/(z − a)ⁿ
Taking 𝒞₁ to be a circle of radius ε centered at z = a, so that z − a = εe^{iθ},
∮_{𝒞₁} dz/(z − a)ⁿ = ∫₀^{2π} iεe^{iθ} dθ/(εⁿe^{inθ}) = iε^{1−n} ∫₀^{2π} e^{i(1−n)θ} dθ = { 0, n ≠ 1; 2πi, n = 1 }
Thus,
∮_{𝒞₁} f (z) dz = Σ_{n=1}^∞ bₙ ∮_{𝒞₁} dz/(z − a)ⁿ = 2πib₁ (15.35)
Since b1 is the only term that contributes to the integral, it is called the residue.
Note that the Laurent expansion may be obtained by various methods without
using the formula (equations (15.32)–(15.33)). Hence, we may determine the residue
by one of these methods and then use the formula (equation (15.35)) for evaluating
the contour integral.
Example 15.11.
1. Evaluate ∮_𝒞 e^z dz, where 𝒞 is any simple closed curve. Since e^z is entire, Cauchy's theorem gives ∮_𝒞 e^z dz = 0.
2. Evaluate
∮_𝒞 z²e^{1/z} dz, where 𝒞 : |z| = 5
Note that the point z = 0 is an essential singularity of e^{1/z}. The expansion of e^{1/z} can be expressed as
e^{1/z} = 1 + 1/z + (1/2!)(1/z²) + (1/3!)(1/z³) + (1/4!)(1/z⁴) + ⋅ ⋅ ⋅
⇒
z²e^{1/z} = z² + z + 1/2! + (1/3!)(1/z) + (1/4!)(1/z²) + ⋅ ⋅ ⋅
⇒
b₁ = Res(z²e^{1/z})_{z=0} = 1/3!
Thus,
∮_𝒞 z²e^{1/z} dz = 2πib₁ = πi/3
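The value πi/3 can be confirmed by direct numerical quadrature of the contour integral (our own stdlib sketch, not from the text):

```python
# Numerical confirmation (ours) that the integral of z^2 e^{1/z}
# around |z| = 5 equals pi*i/3, via the trapezoidal rule.
import cmath
import math

m, radius = 4000, 5.0
total = 0j
for k in range(m):
    t = 2 * math.pi * k / m
    z = radius * cmath.exp(1j * t)
    dz = 1j * radius * cmath.exp(1j * t) * (2 * math.pi / m)
    total += z * z * cmath.exp(1 / z) * dz
```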
3. Evaluate
∮_𝒞 (sin z/z³) dz, where 𝒞 : |z| = 1
The expansion of sin z/z³ can be expressed as
sin z/z³ = (1/z³)(z − z³/3! + z⁵/5! − ⋅ ⋅ ⋅) = 1/z² − 1/3! + z²/5! − ⋅ ⋅ ⋅
Thus, sin z/z³ has a second-order pole at z = 0 with residue
b₁ = Res(sin z/z³)_{z=0} = 0
Thus,
∮_𝒞 (sin z/z³) dz = 2πib₁ = 0.
Suppose f (z) = p(z)/q(z), where p(z) and q(z) are analytic at z = a,
q(z) = (z − a)[q′(a) + ((z − a)/2!)q″(a) + ((z − a)²/3!)q‴(a) + ⋅ ⋅ ⋅],
and
q′(a) ≠ 0, p(a) ≠ 0,
i. e., q(z) has a simple zero at z = a. Thus, the residue of f (z) is given by
Res f (z)|_{z=a} = lim_{z→a} (z − a)p(z)/q(z) = p(a)/q′(a) (15.36)
Example 15.12.
(i) Consider f (z) = (4 − 3z)/(z(z − 1)), which has simple poles at z = 0 and z = 1. Thus,
Res f (z)|_{z=0} = lim_{z→0} (4 − 3z)/(z − 1) = −4
Res f (z)|_{z=1} = lim_{z→1} (4 − 3z)/z = 1
(ii) Consider f (z) = tan z = sin z/cos z, which has simple poles at
aₖ = (2k − 1)π/2, k = 0, ±1, ±2, . . .
By formula (15.36) with p(z) = sin z and q(z) = cos z,
Res f (z)|_{z=aₖ} = sin aₖ/(−sin aₖ) = −1
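Formula (15.36) is easy to sanity-check numerically (our own sketch, not from the book): close to a simple pole a, the product (z − a)f(z) approaches the residue.

```python
# Numerical check (ours) of the simple-pole residues of
# f(z) = (4 - 3z)/(z(z - 1)) at z = 0 and z = 1.
def f(z):
    return (4 - 3 * z) / (z * (z - 1))

eps = 1e-6
res0 = eps * f(0 + eps)   # residue at z = 0, close to -4
res1 = eps * f(1 + eps)   # residue at z = 1, close to 1
```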
If f (z) has a pole of order m at z = a, then
(z − a)ᵐ f (z) = bₘ + bₘ₋₁(z − a) + ⋅ ⋅ ⋅ + b₁(z − a)ᵐ⁻¹ + Σ_{n=0}^∞ aₙ(z − a)ⁿ⁺ᵐ
⇒
(dᵐ⁻¹/dzᵐ⁻¹)[(z − a)ᵐ f (z)]_{z=a} = (m − 1)! b₁
Thus,
Res f (z)|_{z=a} = b₁ = (1/(m − 1)!) lim_{z→a} {(dᵐ⁻¹/dzᵐ⁻¹)[(z − a)ᵐ f (z)]} (15.37)
Example 15.13. Consider f (z) = e^z/(z − 1)³, which has a pole of order 3 at z = 1. Thus,
Res f (z)|_{z=1} = (1/2!) lim_{z→1} {(d²/dz²)[e^z]} = e/2
Alternatively, substituting z = 1 + u,
f = e^{1+u}/u³ = (e/u³)(1 + u + u²/2! + u³/3! + u⁴/4! + ⋅ ⋅ ⋅)
= e/u³ + e/u² + e/(2!u) + e/3! + eu/4! + ⋅ ⋅ ⋅
⇒
b₁ = Res f (z)|_{z=1} = Res f (u)|_{u=0} = e/2! = e/2
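Formula (15.37) can be mimicked numerically with a central second difference (our own sketch, not the book's):

```python
# Numerical sketch (ours) of formula (15.37) for the third-order pole of
# f(z) = e^z/(z - 1)^3: here (z - 1)^3 f(z) = e^z, so the residue is
# (1/2!) d^2/dz^2 [e^z] at z = 1 = e/2, approximated by a second difference.
import math

def g(z):                       # (z - 1)^3 f(z), analytic at z = 1
    return math.exp(z)

h = 1e-4
second_deriv = (g(1 + h) - 2 * g(1) + g(1 - h)) / h**2
residue = second_deriv / math.factorial(2)
```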
Let f (z) be a single-valued function which is analytic inside a simple closed path 𝒞 and on 𝒞 , except for finitely many singular points α₁, α₂, . . . , αₘ inside 𝒞 . Then
∮_𝒞 f (z) dz = 2πi Σ_{j=1}^m Res f (z)|_{z=αⱼ}
Proof. Consider the schematic of a simple closed contour 𝒞 and positively oriented circles 𝒞ₖ (interior to 𝒞 and centered at z = αₖ) as shown in Figure 15.8.
It follows from Cauchy’s theorem that
Figure 15.8: Simply connected contour 𝒞 containing positively oriented circles 𝒞k centered at z = αk .
This important theorem has various applications in connection with complex and real integrals that appear in the inversion of Laplace and Fourier transforms. This will be illustrated in later chapters.
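The residue theorem can be verified end to end for Example 15.12 (our own numerical sketch, not from the text): both residues of f(z) = (4 − 3z)/(z(z − 1)) lie inside |z| = 2, so the integral should be 2πi(−4 + 1) = −6πi.

```python
# Check (ours) of the residue theorem for f(z) = (4 - 3z)/(z(z - 1))
# on the circle |z| = 2, by direct trapezoidal quadrature.
import cmath
import math

m, radius = 4000, 2.0
total = 0j
for k in range(m):
    t = 2 * math.pi * k / m
    z = radius * cmath.exp(1j * t)
    dz = 1j * radius * cmath.exp(1j * t) * (2 * math.pi / m)
    total += (4 - 3 * z) / (z * (z - 1)) * dz
```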
Problems
1. (Complex numbers and functions):
(a) Evaluate the following:
(i) sin−1 2, (ii) cos(1 + i), (iii) ii , (iv) ln(−4) and (v) sinh−1 z
(b) Use the definition of elementary functions to prove the following:
(i) sin iz = i sinh z, (ii) cos iz = cosh z and (iii) tan z = sin 2x/(cos 2x + cosh 2y) + i sinh 2y/(cos 2x + cosh 2y)
2. (Real and complex roots of nonlinear equations): Determine all the roots (real and
complex) of the following equations:
(a) z sin z = γ cos z (γ = 2)
(b) 1 + kₚe^{−zD}/(1 + τz) = 0 (kₚ = 1, τ = 1, D = 1)
Remark. Equation (a) appears in the solution of unsteady-state heat/mass transfer problems, while (b) appears in the stability analysis of a closed-loop control system with delay.
7. Show that
(1/2πi) ∮_C e^{zt}/(z² + 1) dz = sin t, t > 0
(c) Expand each of the following in a Laurent series about z = 0, naming the type
of singularity in each case:
(i) z²/(1 − cos z), (ii) e^z/z³, (iii) z²e^{−z}, (iv) (cosh z − 1)/z and (v) z sinh √z
(d) State whether each of the following functions is entire, meromorphic or neither:
(i) z²e^{−z}, (ii) cot 2z, (iii) (1 − cos z)/z, (iv) z sin(1/z), (v) sin √z/√z and (vi) z + 1/z
10. Show that
∫₀^∞ sin ax/(e^{2πx} − 1) dx = (1/4) coth(a/2) − 1/(2a)
Hint: Integrate e^{aiz}/(e^{2πz} − 1) around a rectangle with vertices at 0, R, R + i, i and let R → ∞.
11. Nyquist stability criterion: Let f (z) = Pₙ(z) + Qₘ(z)e^{−zD}, where D > 0; Pₙ(z) and Qₘ(z) are relatively prime polynomials, n > m, and f (z) has no zeros on the imaginary axis while Pₙ(z) has N zeros in the right half-plane. Prove that for the function
f (z) to have no zeros in the right half-plane, it is necessary and sufficient that the
point
w = −(Qₘ(z)/Pₙ(z)) e^{−zD}
wind around the point w = 1, N times in the positive direction while the point z
traverses the entire imaginary axis upwards.
(a) Apply the theorem to the following function to determine the domain of the
real numbers kp and τ for which all the zeros of the function lie in the left
half-plane:
f (z) = kp e−zD + τz + 1
∫₀^∞ x^{b−1}/(1 + x) dx = π/sin bπ (0 < b < 1).
16 Series solutions and special functions
In this chapter, we illustrate ordinary, regular and irregular singular points of first- and second-order differential equations and methods of obtaining series solutions. We also introduce various special functions that arise in applications.
Consider the first-order linear equation
dw/dz + p(z)w = 0 (16.1)
and four special cases of the coefficient function p(z), as discussed below.
Case 1: p(z) = 1
The exact solution is w(z) = c exp(−z), where c is a constant.
z = 0 is an ordinary point and the solution is an entire function.
Case 2: p(z) = 1/(2z)
The exact solution is w(z) = c/√z.
z = 0 is a regular singular point and the solution has a branch point at z = 0.
Case 3: p(z) = 3/z
The exact solution is w(z) = c/z³.
z = 0 is a regular singular point and the solution has a pole of order 3 at z = 0.
Case 4: p(z) = 1/z²
The exact solution is w(z) = c exp(1/z).
z = 0 is an irregular singular point and the solution has an essential singularity at z = 0.
More generally, when p(z) = kz^{n−1} with k ≠ 0, the exact solution is given by
w(z) = { c exp(−(k/n)zⁿ), n ≠ 0; cz^{−k}, n = 0 } (16.2)
https://doi.org/10.1515/9783110739701-017
16.2 Ordinary and regular singular points
Thus, for n = 0 (i. e., p(z) = k/z): (i) for any noninteger k, z = 0 is a regular singular point and the solution has a branch point at z = 0; (ii) for any positive integer k, z = 0 is a regular singular point and the solution has a pole of order k; and (iii) for any negative integer k, z = 0 is an ordinary point and the solution is an entire function.
Consider the general n-th order linear homogeneous equation
w^{[n]} + pₙ₋₁(z)w^{[n−1]} + ⋅ ⋅ ⋅ + p₁(z)w^{[1]} + p₀(z)w = 0 (16.3)
where
w^{[i]} = dⁱw/dzⁱ
Examples. (a)
w′′(z) + z²w′(z) + (z − 3)w = 0 (16.4)
(b)
w′′(z) + w′(z)/(1 + z²) + e^z w = 0 (16.5)
Theorem. Suppose that z0 is an ordinary point of (16.3). Then (16.3) has n linearly in-
dependent solutions that are analytic at z0 . Each solution may be expanded in a Taylor
series
∞
w(z) = ∑ an (z − z0 )n (16.6)
n=0
and the radius of convergence of this series is at least as large as the distance to the
nearest singularity of the coefficient functions pi (z).
Consider
w′′(z) + w′(z)/(1 + z²) + e^z w = 0, (16.7)
Substituting
w(z) = Σ_{n=0}^∞ aₙzⁿ (16.8)
and multiplying through by (1 + z²) gives
(1 + z²) Σ_{n=0}^∞ n(n − 1)aₙzⁿ⁻² + Σ_{n=0}^∞ naₙzⁿ⁻¹ + (1 + z²)(Σ_{n=0}^∞ zⁿ/n!)(Σ_{n=0}^∞ aₙzⁿ) = 0
etc.
⇒
w(z) = a₀(1 − z²/2 + ⋅ ⋅ ⋅) + a₁(z − z²/2 + z⁴/24 + ⋅ ⋅ ⋅) = a₀w₀(z) + a₁w₁(z)
Here, w0 (z) and w1 (z) are the two linearly independent solutions.
Definition (Regular singular point, r. s. p.). The point z₀ is a regular singular point of equation (16.3) if not all pᵢ(z), i = 0, 1, 2, . . . , (n − 1) are analytic at z₀, but (z − z₀)ⁿp₀(z), (z − z₀)ⁿ⁻¹p₁(z), . . . , (z − z₀)pₙ₋₁(z) are analytic at z₀.
Example. The equation
z²w′′ + zw′ − w = 0
may be written as
w′′ + w′/z − w/z² = 0
with
p₀(z) = −1/z², p₁(z) = 1/z
so z = 0 is a regular singular point.
Theorem. Suppose that z0 is regular singular point of (16.3). Then the n linearly inde-
pendent solutions of (16.3) have one of the following forms:
w₄(z) = (z − z₀)^γ Σ_{i=0}^k [ln(z − z₀)]ⁱ fᵢ(z), for some k = 1, 2, . . . , n − 1
where fi (z), gi (z) are analytic at z0 and have a Taylor series, which converges in a disk
extending to the nearest singularity of pi (z). αi , β and γ are called indicial exponents.
A solution of equation (16.3) is either analytic at z0 or if it is not analytic, the singularity
must be either a pole or an algebraic or logarithmic branch point. The indicial exponents
can be obtained by solving a polynomial of degree n whose coefficients depend on pi (z).
Frobenius method
To determine the solution(s) of the form w1 (z), we write
w₁(z) = (z − z₀)^{α₁} Σ_{i=0}^∞ aᵢ(z − z₀)ⁱ
and expand the coefficient functions as
pₖ(z) = Σ_{i=0}^∞ pₖᵢ(z − z₀)ⁱ, k = 0, 1, . . . , n − 1
To find other forms of solutions, expand fi (z) and gi (z) in a Taylor series around z0 . We
now consider the special case n = 2 (second-order differential equation). Let α1 and α2
be the two indicial exponents. Then the different forms of the solutions are
Case 1: α₁ ≠ α₂; α₁ − α₂ ≠ integer. Two Frobenius solutions of the form given in equation (16.9) and equation (16.10).
Case 2: α₁ − α₂ = 0, 1, 2, . . . Either two Frobenius solutions or one Frobenius solution and one with a logarithm.
Example 16.3.
zw′′ + w′ + zw = 0
(This equation is a special case of z 2 w′′ +zw′ +(z 2 −n2 )w = 0, which is Bessel’s equation
of order n)
Dividing by z gives
w′′ + w′/z + w = 0
with
p₀(z) = 1, p₁(z) = 1/z
Indicial equation: with
p₀₀ = lim_{z→0} z² ⋅ 1 = 0 and p₁₀ = lim_{z→0} z ⋅ (1/z) = 1,
the indicial equation α(α − 1) + p₁₀α + p₀₀ = 0 becomes
α² = 0 ⇒ α = 0, 0
Substituting
w = Σ_{n=0}^∞ aₙzⁿ
gives
zw′′ + w′ + zw = Σ_{n=0}^∞ {aₙn² + aₙ₋₂}zⁿ⁻¹
⇒
aₙ = −aₙ₋₂/n²
a_{2n+1} = 0
a₂ = −a₀/2², a₄ = −a₂/4², a₆ = −a₄/6², etc.
⇒
w₀(z) = a₀ Σ_{k=0}^∞ (−1)ᵏz²ᵏ/(2²ᵏ(k!)²) = a₀J₀(z)
where
J₀(z) = Σ_{k=0}^∞ (−1)ᵏ(z/2)²ᵏ/(k!)² = Bessel function of the first kind of order zero
and I₀(z) = J₀(iz) is the modified Bessel function of the first kind of order zero. For |z| ≫ 1, J₀(z) ≈ √(2/(πz)) cos(z − π/4). A graph of J₀(z) is shown in Figure 16.1.
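The series for J₀ can be cross-checked against the standard integral representation J₀(z) = (1/π)∫₀^π cos(z sin t) dt (a stdlib-only sketch of ours, not code from the book):

```python
# Cross-check (ours) of the J0 series against its integral representation,
# evaluated with the midpoint rule (spectrally accurate here, since the
# integrand is smooth and periodic).
import math

def j0_series(z, terms=30):
    return sum((-1)**k * (z / 2)**(2 * k) / math.factorial(k)**2
               for k in range(terms))

def j0_integral(z, m=2000):
    return sum(math.cos(z * math.sin(math.pi * (k + 0.5) / m))
               for k in range(m)) / m
```

The two evaluations agree to near machine precision for moderate z.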
Since the indicial roots are repeated, we seek a second solution of the form w₁(z) = g₁(z) + g₂(z) ln z, where
g₁(z) = Σ_{n=0}^∞ aₙzⁿ, g₂(z) = Σ_{n=0}^∞ bₙzⁿ
Then
w₁′ = g₁′ + (ln z)g₂′ + g₂/z
w₁′′ = g₁′′ + g₂′′ ln z + 2g₂′/z − g₂/z²
Substitute these in the differential equation and compare coefficients. After some algebra, we get the second solution
where
Y₀(z) = (2/π){ln(z/2) + γ}J₀(z)
+ (2/π){z²/2² − (z⁴/(2²·4²))(1 + 1/2) + (z⁶/(2²·4²·6²))(1 + 1/2 + 1/3) − ⋅ ⋅ ⋅}
and
γ = lim_{n→∞} {(1 + 1/2 + 1/3 + ⋅ ⋅ ⋅ + 1/n) − ln n} = 0.5772 (Euler's constant)
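The limit defining Euler's constant is slow but easy to evaluate numerically (our own illustration, not from the text); the partial differences approach γ like 1/(2n).

```python
# Numerical illustration (ours) of the limit defining Euler's constant.
import math

def gamma_approx(n):
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)
```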
For |z| ≫ 1, Y₀(z) ≈ √(2/(πz)) sin(z − π/4). A graph of Y₀(z) for real z is shown below (see Figure 16.2).
p₀(z) d²u/dz² + p₁(z) du/dz + p₂(z)u = 0 (16.12)
Let
p(z) = p₁(z)/p₀(z), q(z) = p₂(z)/p₀(z)
so that
d²u/dz² + p(z) du/dz + q(z)u = 0 (16.13)
Let ψ₁(z) and ψ₂(z) be the two linearly independent solutions of equation (16.13), and
W(z) = | ψ₁(z) ψ₂(z); ψ₁′(z) ψ₂′(z) | = ψ₁ψ₂′ − ψ₁′ψ₂ = Wronskian
Now,
dW/dz = | ψ₁ ψ₂; ψ₁′′ ψ₂′′ | = | ψ₁ ψ₂; −pψ₁′ − qψ₁ −pψ₂′ − qψ₂ | = −p(z) | ψ₁ ψ₂; ψ₁′ ψ₂′ |
⇒
dW/dz = −p(z)W(z)
or
W(z) = W(z₀) exp[−∫_{z₀}^z p(t′) dt′]
⇒
(ψ₁ψ₂′ − ψ₁′ψ₂)/ψ₁² = W(z)/ψ₁² = (W(z₀)/ψ₁²) exp[−∫_{z₀}^z p(t′) dt′]
⇒
(d/dz)(ψ₂/ψ₁) = W(z)/ψ₁² = (W(z₀)/ψ₁²(z)) exp[−∫_{z₀}^z p(t′) dt′]
⇒
ψ₂(z) = ψ₁(z) ∫_{z₀}^z (1/ψ₁²(t′)) exp[−∫_{z₀}^{t′} p(y) dy] dt′ (16.14)
Example. Consider
u′′ + u/(4z²) = 0
Here, p(z) = 0 and q(z) = 1/(4z²). It can be seen that ψ₁(z) = √z is a solution, as ψ₁′ = 1/(2√z), ψ₁′′ = −1/(4z√z) ⇒ 4z²ψ₁′′ + ψ₁ = 0. Thus, from equation (16.14), we get
ψ₂ = √z ∫_{z₀}^z (1/t′) dt′ = √z(ln z − ln z₀) = √z ln z + c₁ψ₁
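The second solution produced by (16.14) can be checked with finite differences (our own sketch, not from the text):

```python
# Finite-difference check (ours) that psi2 = sqrt(z) * ln(z), obtained by
# reduction of order, satisfies u'' + u/(4 z^2) = 0 away from z = 0.
import math

def psi2(z):
    return math.sqrt(z) * math.log(z)

z0, h = 2.0, 1e-4
upp = (psi2(z0 + h) - 2 * psi2(z0) + psi2(z0 - h)) / h**2
residual = upp + psi2(z0) / (4 * z0**2)
```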
For Legendre's equation (1 − z²)w′′ − 2zw′ + n(n + 1)w = 0, we have p(z) = −2z/(1 − z²) ⇒
exp[−∫ p(t) dt′] = 1/(1 − z²)
For n = 0, ψ₁ = P₀(z) = 1 and
Q₀(z) = 1 ⋅ ∫ dz/((1 − z²) ⋅ 1²) = (1/2) ln((1 + z)/(1 − z))
Similarly, with
P₂(z) = (1/2)(3z² − 1)
we get
Q₂(z) = ((3z² − 1)/4) ln((1 + z)/(1 − z)) − 3z/2
16.4 Special functions defined by second-order ODEs
The Airy equation is
w′′ = zw
for which z = 0 is an ordinary point. The two linearly independent solutions are
Ai(z) = (1/(π3^{2/3})) Σ_{n=0}^∞ (Γ((n + 1)/3)/n!) (3^{1/3}z)ⁿ sin[2(n + 1)π/3]
Bi(z) = (1/(π3^{1/6})) Σ_{n=0}^∞ (Γ((n + 1)/3)/n!) (3^{1/3}z)ⁿ |sin[2(n + 1)π/3]|
where Γ is the Gamma function. For a plot of the Airy Ai(z) and Bairy Bi(z) functions, see Figure 16.3. Note that Ai(0) = 1/(3^{2/3}Γ(2/3)) = 0.3550 and Bi(0) = 1/(3^{1/6}Γ(2/3)) = 0.6149.
Bessel's equation of order ν is
z²w′′ + zw′ + (z² − ν²)w = 0
where ν need not be an integer. Note that z = 0 is a regular singular point. The two linearly independent solutions are denoted by Jν(z) and Yν(z) and referred to as Bessel functions of the first and second kind, respectively. These are defined by
Jν(z) = Σ_{m=0}^∞ [(−1)ᵐ/(m!Γ(m + ν + 1))] (z/2)^{2m+ν}, ∀ν (including nonintegers)
Yν(z) = [Jν(z) cos(νπ) − J₋ν(z)]/sin(νπ), ν is not an integer
For ν = 0, the equation reduces to
zw′′ + w′ + zw = 0
considered earlier. For integer ν = n, the second solution is given by
Yₙ(z) = (2/π)[ln(z/2) + γ]Jₙ(z) − (1/π) Σ_{k=0}^{n−1} ((n − k − 1)!/k!) (z/2)^{2k−n}
− (1/π) Σ_{k=0}^∞ (−1)ᵏ ([ϕ(k) + ϕ(n + k)]/((n + k)!k!)) (z/2)^{2k+n}
where
ϕ(p) = 1 + 1/2 + 1/3 + ⋅ ⋅ ⋅ + 1/p; ϕ(0) = 0.
The modified Bessel equation of order ν is
z²w′′ + zw′ − (z² + ν²)w = 0
where ν is a real constant. When ν is not an integer, the two linearly independent solutions are denoted by Iν(z) and Kν(z), where
Iν(z) = i^{−ν}Jν(iz) = Σ_{m=0}^∞ [1/(m!Γ(m + ν + 1))] (z/2)^{2m+ν}, ∀ν (including nonintegers)
Kν(z) = (π/2)[I₋ν(z) − Iν(z)]/sin(νπ), ν is not an integer
zw′′ + w′ − zw = 0
with the two linearly independent solutions I0 (z) and K0 (z), where
z2 z4 z2
I0 (z) = 1 + + + + ⋅⋅⋅
22 22 .42 22 .42 .62
z z2 z4 1 z6 1 1
K0 (z) = −[ln( ) + γ]I0 (z) + 2 + 2 2 (1 + ) + 2 2 2 (1 + + ) + ⋅ ⋅ ⋅
2 2 2 .4 2 2 .4 .6 2 3
For ν = ± 21 , I 1 (z) = √ πz
2 2
sinh z and I− 1 (z) = √ πz cosh z. A plot of modified Bessel
2 2
functions for real z is shown in Figure 16.5. Note that z = 0 is a regular singular point
of modified Bessel equation.
356 | 16 Series solutions and special functions
The spherical Bessel equation is
$$z^2w'' + 2zw' + \left[z^2 - n(n+1)\right]w = 0;\quad n = 0, 1, 2, \ldots$$
For n = 0, it reduces to
$$zw'' + 2w' + zw = 0$$
with the two linearly independent solutions $j_0(z)$ and $y_0(z)$, where
$$j_0(z) = \frac{\sin z}{z}\quad\text{and}\quad y_0(z) = -\frac{\cos z}{z}$$
Note that the negative sign in $y_0(z)$ is used by convention so that these functions are similar to $J_0(z)$ and $Y_0(z)$.
For n > 0, the two linearly independent solutions of the spherical Bessel equation are
$$j_n(z) = \sqrt{\frac{\pi}{2}}\,\frac{J_{n+\frac12}(z)}{\sqrt{z}}$$
$$y_n(z) = \sqrt{\frac{\pi}{2}}\,\frac{Y_{n+\frac12}(z)}{\sqrt{z}} = (-1)^{n+1}\sqrt{\frac{\pi}{2}}\,\frac{J_{-(n+\frac12)}(z)}{\sqrt{z}}$$
Next, consider the Legendre equation
$$(1-z^2)w'' - 2zw' + n(n+1)w = 0;\quad n = 0, 1, 2, \ldots$$
Note that z = ±1 is a regular singular point. The two linearly independent solutions are denoted by $P_n(z)$ and $Q_n(z)$ and are called Legendre functions of the first kind and of the second kind, respectively. The first few are
$$P_0(z) = 1;\qquad Q_0(z) = \frac12\ln\left(\frac{1+z}{1-z}\right)$$
$$P_1(z) = z;\qquad Q_1(z) = \frac{z}{2}\ln\left(\frac{1+z}{1-z}\right) - 1$$
Figure 16.7: Legendre polynomial Pn(z) and Legendre's function of second kind Qn(z) for n = 0, 1, 2, 3, 4.
The associated Legendre equation is
$$(1-z^2)w'' - 2zw' + \left[n(n+1) - \frac{m^2}{1-z^2}\right]w = 0,$$
where m and n are nonnegative integers. Note that z = ±1 is a regular singular point. The two linearly independent solutions are denoted by $P_n^m(z)$ and $Q_n^m(z)$ and are called the associated Legendre functions of the first kind and of the second kind, respectively. For m = 0, $P_n^0$ reduces to the Legendre polynomials. Figure 16.8 shows the plots of $P_n^1$ and $Q_n^1$ for n = 1, 2, 3, 4, 5.
Figure 16.8: Associated Legendre polynomial Pn1(z) and Legendre's function of second kind Q1n(z) for n = 1, 2, 3, 4, 5.
Consider the Hermite equation
$$w'' - 2zw' + 2nw = 0;\quad n = 0, 1, 2, \ldots$$
which has an irregular singular point at infinity. The bounded solutions are the Hermite polynomials $H_n(z)$, given by
$$H_n(z) = (-1)^n e^{z^2}\frac{d^n}{dz^n}e^{-z^2}$$
Next, consider the Laguerre equation
$$zw'' + (1-z)w' + nw = 0;\quad n = 0, 1, 2, \ldots$$
where the bounded solutions are the Laguerre polynomials $L_n(z)$, given by
$$L_n(z) = \frac{e^z}{n!}\frac{d^n}{dz^n}\left(z^ne^{-z}\right)$$
Finally, consider the Chebyshev equation
$$(1-z^2)w'' - zw' + n^2w = 0;\quad n = 0, 1, 2, \ldots$$
where the two linearly independent solutions are $T_n(z)$ and $\sqrt{1-z^2}\,U_{n-1}(z)$, with $T_n(z)$ and $U_n(z)$ being Chebyshev polynomials of degree n of the first and second kind, respectively. These polynomials are given by the recurrence relations
$$T_{n+1}(z) = 2zT_n(z) - T_{n-1}(z);\qquad T_0(z) = 1,\quad T_1(z) = z$$
$$U_{n+1}(z) = 2zU_n(z) - U_{n-1}(z);\qquad U_0(z) = 1,\quad U_1(z) = 2z$$
For additional discussion on series solutions and special functions, we refer to the
book by Bender and Orszag [7].
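The recurrence above generates the polynomials directly. A short Python sketch (a stand-in for the book's Mathematica® computations) that also checks the defining identity $T_n(\cos\theta) = \cos(n\theta)$:

```python
import math

def chebyshev_T(n, z):
    # T_{n+1}(z) = 2 z T_n(z) - T_{n-1}(z), with T_0 = 1, T_1 = z
    t_prev, t = 1.0, z
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t = t, 2.0 * z * t - t_prev
    return t

# check the defining identity T_n(cos theta) = cos(n theta)
theta = 0.7
for n in range(6):
    assert abs(chebyshev_T(n, math.cos(theta)) - math.cos(n * theta)) < 1e-12
print(chebyshev_T(3, 0.5))  # T_3(z) = 4 z^3 - 3 z, so T_3(0.5) = -1.0
```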
Problems
1. Determine the Taylor series expansion about the point z = 0 of the solution to the following initial value problems:
(a) $w'' = (z-1)w$; $w(0) = 1$, $w'(0) = 0$
(b) $w''' = z^3w$; $w(0) = 1$, $w'(0) = 0$, $w''(0) = 0$
2. Determine the series expansions of all solutions of the following differential equations (about the point z = 0) and identify the functions that appear:
(a) $zw'' + w = 0$
(b) $w'' - z^2w = 0$
3. Determine the linearly independent (series) solutions of the following differential equations and identify the functions that appear:
(a) $zw'' + w' - zw = 0$
(b) $w'' + \lambda z^2w = 0$, where λ is a positive constant.
(c) $(1-z^2)w'' - 2zw' + n(n+1)w = 0$; $n = 0, 1, 2, \ldots$
17 Laplace transforms
The Laplace transform is a special case of a general linear integral transformation of a function of variable t with parameter s. The general transformation with kernel K(t, s) is of the form
$$F(s) = \int_a^b K(t,s)f(t)\,dt$$
where F(s) is called the image or transform of f(t). Integral transforms of the above form were studied by Laplace (1749–1827) and Cauchy (1789–1857), and hence the name. When a = 0, b → ∞ and K(t, s) = e^{−st}, the general integral transform becomes the Laplace transform.
The Laplace transform technique is useful for solving (i) linear differential equa-
tions (initial value problems and boundary value problems), (ii) linear difference
equations, (iii) linear integral equations, (iv) linear ordinary differential equations
with time delay, (v) linear integrodifferential equations and (vi) linear partial differ-
ential equations that arise in many applications. We review first the theory of Laplace
transform and then illustrate its usefulness with some chemical engineering applica-
tions. Further applications are given in the last section.
$$|f(t)| \le Me^{\gamma t} \quad (17.2)$$
(This last condition implies that the function f(t) does not grow faster than an exponential function. Thus, for functions such as $e^{t^2}$, the Laplace transform does not exist. There are also other classes of functions, such as $\frac{1}{t^n}$, $n \ge 1$, which are unbounded at t = 0 and for which the Laplace transform does not exist.)
The Laplace transform associates with f(t) a function F(s) of the complex variable s = x + iy defined by the integral
$$F(s) = \int_0^\infty e^{-st}f(t)\,dt$$
https://doi.org/10.1515/9783110739701-018
362 | 17 Laplace transforms
Theorem 17.1. If f(t) satisfies conditions (i)–(iii) above, the Laplace transform exists for Re s > γ.
Proof:
$$|F(s)| = \left|\int_0^\infty e^{-st}f(t)\,dt\right| \le M\int_0^\infty\left|e^{-st}\right|e^{\gamma t}\,dt$$
Now,
$$\left|e^{-st}\right| = \left|e^{-(x+iy)t}\right| = e^{-xt}$$
∴
$$|F(s)| \le M\int_0^\infty e^{(\gamma-x)t}\,dt = \frac{M}{x-\gamma},\qquad x > \gamma$$
and for Re s = x ≥ x₀ > γ,
$$|F(s)| \le \frac{M}{x_0-\gamma} \quad (17.4)$$
Theorem 17.2. The Laplace transform F(s) of f(t) is an analytic function of the complex variable s in the domain Re s > γ.
Proof: Writing s = x + iy and F(s) = u(x, y) + iv(x, y), we have
$$F(s) = \int_0^\infty e^{-xt}(\cos yt)f(t)\,dt - i\int_0^\infty e^{-xt}(\sin yt)f(t)\,dt$$
Now,
17.2 Properties of Laplace transform | 363
Figure 17.1: Schematic diagram illustrating the region x > γ in which F (s) is analytic.
$$\frac{\partial u}{\partial x} = -\int_0^\infty te^{-xt}(\cos yt)f(t)\,dt \quad (17.6)$$
and
$$\left|\frac{\partial u}{\partial x}\right| \le \int_0^\infty te^{-xt}Me^{\gamma t}\,dt = \frac{M}{(x-\gamma)^2}. \quad (17.7)$$
Thus, $\frac{\partial u}{\partial x}$ exists and is continuous in the domain x > γ. Also, from the definition,
$$\frac{\partial v}{\partial y} = -\int_0^\infty te^{-xt}(\cos yt)f(t)\,dt = \frac{\partial u}{\partial x}. \quad (17.8)$$
Similarly,
$$\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$
Thus, the Cauchy–Riemann equations are satisfied and the first partial derivatives are
continuous ⇒ F(s) is analytic for Re s > γ. This theorem also implies that any singu-
larities of F(s) must lie to the left of the line Re s = γ. Figure 17.1 shows a schematic
diagram illustrating the region in Laplace domain where F(s) is analytic.
1. Linearity
If ℓ{f1(t)} = F1(s) and ℓ{f2(t)} = F2(s), then for any constants c1 and c2,
$$\ell\{c_1f_1(t) + c_2f_2(t)\} = c_1F_1(s) + c_2F_2(s)$$
2. Shifting property
If ℓ{f(t)} = F(s) and
$$g(t) = \begin{cases} f(t-a), & t \ge a\\ 0, & 0 < t < a,\end{cases}$$
then $\ell\{g(t)\} = e^{-as}F(s)$.
3. Scaling property
If ℓ{f(t)} = F(s), then
$$\ell\{f(at)\} = \frac{1}{a}F\left(\frac{s}{a}\right),\quad\text{where } a \neq 0 \text{ is any complex number.} \quad (17.12)$$
4. Transforms of derivatives
If ℓ{f(t)} = F(s) and f(t) has continuous derivatives, then
$$\ell\{f'(t)\} = sF(s) - f(0)$$
This formula can be established from the definition using integration by parts. Repeated application of the above formula gives
$$\ell\{f^{(n)}(t)\} = s^nF(s) - s^{n-1}f(0) - s^{n-2}f'(0) - \cdots - f^{(n-1)}(0)$$
5. Transforms of integrals
If $\ell\{f(t)\} = F(s) = \int_0^\infty e^{-st}f(t)\,dt$, then
$$\ell\left\{\int_0^tf(t')\,dt'\right\} = \frac{F(s)}{s}, \quad (17.16)$$
6. Multiplication and division by powers of t
$$\ell\{(-1)^nt^nf(t)\} = \frac{d^nF(s)}{ds^n} \quad (17.17)$$
$$\ell\left\{\frac{f(t)}{t}\right\} = \int_s^\infty F(s')\,ds' \quad (17.18)$$
7. Convolution
$$\ell^{-1}\{F_1(s)F_2(s)\} = \int_0^tf_1(t')f_2(t-t')\,dt' = \int_0^tf_1(t-t')f_2(t')\,dt' \quad (17.19)$$
8. Periodic functions
If f(t + T) = f(t), T > 0, then
$$\ell\{f(t)\} = \frac{1}{1-e^{-sT}}\int_0^Te^{-st}f(t)\,dt$$
9. Moments
Expanding the kernel,
$$F(s) = \int_0^\infty e^{-st}f(t)\,dt = \int_0^\infty\left[1 - st + \frac{s^2t^2}{2!} - \frac{s^3t^3}{3!} + \cdots\right]f(t)\,dt$$
it follows that the moments $M_j = \int_0^\infty t^jf(t)\,dt$ satisfy
$$M_j = (-1)^j\left.\frac{d^jF(s)}{ds^j}\right|_{s=0}. \quad (17.25)$$
Equation (17.25) shows that the jth moment of f(t) can be obtained from F(s) by expanding it in powers of s.
Example 17.1. Consider the exponential function f(t) = e^{at}, t > 0 (a is real or complex). We have
$$\ell\{e^{at}\} = \int_0^\infty e^{at}e^{-st}\,dt = \frac{1}{s-a} \quad (17.26)$$
From equation (17.26), we can obtain the Laplace transforms of many elementary functions.
Setting a = 0:
$$\ell\{1\} = \frac1s,\qquad \ell\{t\} = \frac{1}{s^2},\ \ldots,\ \ell\{t^n\} = \frac{n!}{s^{n+1}}$$
Differentiating with respect to a:
$$\ell\{te^{at}\} = \frac{1}{(s-a)^2},\ \ldots,\ \ell\{t^ne^{at}\} = \frac{n!}{(s-a)^{n+1}}$$
Setting a = iw:
$$\ell\{e^{iwt}\} = \frac{1}{s-iw},\qquad \ell\{e^{-iwt}\} = \frac{1}{s+iw}$$
$$\Rightarrow\quad \ell\{\cos wt\} = \frac{s}{s^2+w^2},\qquad \ell\{\sin wt\} = \frac{w}{s^2+w^2}$$
Setting a = α + iβ:
$$\ell\{e^{\alpha t}(\cos\beta t + i\sin\beta t)\} = \frac{1}{s-\alpha-i\beta}\quad\Rightarrow\quad \ell\{e^{\alpha t}\cos\beta t\} = \frac{s-\alpha}{(s-\alpha)^2+\beta^2}$$
Example 17.2 (Unit step function (Heaviside's function)). The Heaviside function, also known as the unit-step function, is shown in Figure 17.2:
$$H(t) = U(t) = \begin{cases}1, & t > 0\\ 0, & t < 0\end{cases}$$
$$\ell\{U(t)\} = \frac1s \quad (17.27)$$
Similarly, if the unit step is located at t = α (α > 0), then we denote the function by U(t − α) and
$$\ell\{U(t-\alpha)\} = \frac{e^{-\alpha s}}{s} \quad (17.28)$$
Example 17.3. Consider the function
$$f(t) = \begin{cases}\frac{1}{\varepsilon}, & 0 < t < \varepsilon\\ 0, & t > \varepsilon\end{cases}$$
Taking the limit ε → 0 (but keeping the area under the curve constant), we get the so-called unit impulse function (also known as the Dirac delta function). Figure 17.3 shows an approximation to the unit impulse for small values of ε. Similarly, for the unit impulse function at time t = t₀, denoted δ(t − t₀), we have $\ell\{\delta(t-t_0)\} = e^{-st_0}$.
Remark. The Dirac delta function may be approached by using many "test functions." The above illustration used a one-sided test function. A test function that is two-sided, or symmetric around the origin, is the Gaussian function
$$f(x,\varepsilon) = \frac{1}{\sqrt{2\pi\varepsilon}}\exp\left(-\frac{x^2}{2\varepsilon}\right).$$
Consider the error function
$$\operatorname{erf}t = \frac{2}{\sqrt{\pi}}\int_0^te^{-u^2}\,du$$
Expanding the integrand and integrating term by term,
$$\operatorname{erf}\sqrt{t} = \frac{2}{\sqrt{\pi}}\left\{t^{1/2} - \frac{t^{3/2}}{3(1!)} + \frac{t^{5/2}}{5(2!)} - \frac{t^{7/2}}{7(3!)} + \cdots\right\}$$
$$\ell\{\operatorname{erf}\sqrt{t}\} = \frac{2}{\sqrt{\pi}}\left\{\frac{\Gamma(3/2)}{s^{3/2}} - \frac{\Gamma(5/2)}{3s^{5/2}} + \cdots\right\} = \frac{1}{s\sqrt{s+1}}$$
Here, we used
$$\ell\{t^\alpha\} = \int_0^\infty e^{-st}t^\alpha\,dt = \int_0^\infty\frac{e^{-u}u^\alpha}{s^{\alpha+1}}\,du = \frac{\Gamma(\alpha+1)}{s^{\alpha+1}}$$
17.3 Inversion of Laplace transform | 369
where, for integer α,
$$\Gamma(\alpha+1) = \int_0^\infty u^\alpha e^{-u}\,du = \alpha!$$
More generally,
$$\Gamma(\alpha+1) = \alpha\,\Gamma(\alpha)$$
and
$$\Gamma\left(\frac12\right) = \int_0^\infty u^{-\frac12}e^{-u}\,du;\qquad u = v^2\ \Rightarrow\ \Gamma\left(\frac12\right) = 2\int_0^\infty e^{-v^2}\,dv = \sqrt{\pi}.$$
Example 17.6 (Laplace transform directly from the D. E.). Consider the Bessel equation of order zero,
$$tu'' + u' + tu = 0;\qquad u(0) = 1,$$
and let F(s) = ℓ{u(t)}. Transforming term by term gives
$$-\frac{d}{ds}\left\{s^2F(s) - s\right\} + sF - 1 - \frac{dF}{ds} = 0$$
$$\Rightarrow\quad \frac{dF}{ds} = -\frac{sF}{1+s^2}\quad\Rightarrow\quad F = \frac{c}{\sqrt{1+s^2}},$$
and $\lim_{s\to\infty}sF = c = 1$, so that
$$\ell\{J_0(t)\} = \frac{1}{\sqrt{1+s^2}}.$$
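This transform pair can be spot-checked numerically using the series for $J_0$ and a truncated trapezoidal quadrature (a Python sketch; the truncation limit and tolerance are ad hoc choices, and the book itself would use Mathematica®):

```python
import math

def J0(t, terms=40):
    # series for the Bessel function of order zero
    return sum((-1)**m / (math.factorial(m)**2) * (t / 2.0)**(2 * m)
               for m in range(terms))

def transform(s, upper=15.0, n=30000):
    # trapezoidal approximation of  integral_0^inf e^{-st} J0(t) dt
    h = upper / n
    total = 0.5 * (J0(0.0) + math.exp(-s * upper) * J0(upper))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * J0(t)
    return total * h

s = 2.0
print(transform(s), 1.0 / math.sqrt(1.0 + s**2))  # both ~0.4472
```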
$$|f(t)| < Me^{\gamma_0t} \quad (17.31)$$
is analytic in the region Re s > γ₀. Assume that F(s) is real for s real. Suppose that s is any point on the real axis and γ > γ₀, as shown in Figure 17.4.
Figure 17.4: Schematic diagram of the contours for Cauchy’s integral formula.
If C is the vertical and Γ is the curved contour shown, Cauchy's integral formula gives
$$F(s) = \frac{1}{2\pi i}\left[\int_C\frac{F(z)}{z-s}\,dz + \int_\Gamma\frac{F(z)}{z-s}\,dz\right] = \frac{1}{2\pi i}\int_{\gamma-iR\sin\theta_1}^{\gamma+iR\sin\theta_1}\frac{F(z)}{z-s}\,dz + \frac{1}{2\pi i}\int_\Gamma\frac{F(z)}{z-s}\,dz \quad (17.32)$$
where Γ = part of semicircular arc with radius R as shown in Figure 17.4. Consider the
integral
$$J = \frac{1}{2\pi i}\int_\Gamma\frac{F(z)}{z-s}\,dz \quad (17.33)$$
If on Γ
$$|F(z)| \le \frac{M}{R^k} \quad (17.35)$$
⇒
$$|J| \le \frac{1}{2\pi}\cdot\frac{M}{R^k}\cdot 2\theta_1 = \frac{M\theta_1}{\pi R^k}. \quad (17.36)$$
Thus, $\lim_{R\to\infty}|J| = 0$ if k > 0.
⇒
$$F(s) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{F(z)}{s-z}\,dz \quad (17.37)$$
where s is real. The above formula is an extended form of Cauchy’s integral formula.
Now, if ℓ{f (t)} = F(s) ⇒ ℓ−1 {F(s)} = f (t), where ℓ−1 is the inverse operator. Applying
ℓ−1 on both sides of equation (17.37) ⇒
$$f(t) = \ell^{-1}\left\{\frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{F(z)}{s-z}\,dz\right\} \quad (17.38)$$
Here, ℓ⁻¹ is w. r. t. s while the integration is w. r. t. z. Thus, we can take ℓ⁻¹ inside the integral:
$$f(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}F(z)\,\ell^{-1}\left(\frac{1}{s-z}\right)dz \quad (17.39)$$
⇒
$$f(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}e^{zt}F(z)\,dz. \quad (17.40)$$
Theorem. Suppose that F(s) is analytic in some right half-plane Re s > γ, that $|F(s)| < \frac{M}{|s|^k}$ for some k > 0, and that F(s) is real for s real. Then the integral
$$f(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}e^{st}F(s)\,ds \quad (17.41)$$
is independent of γ and converges to a real-valued function f(t) whose Laplace transform is F(s).
Thus, we have the Laplace transform pair
$$F(s) = \int_0^\infty e^{-st}f(t)\,dt,\qquad f(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}e^{st}F(s)\,ds \quad (17.42)$$
which may be compared with the Fourier transform pair
$$F(\alpha) = \int_{-\infty}^{\infty}e^{-i\alpha t}f(t)\,dt,\qquad f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{i\alpha t}F(\alpha)\,d\alpha. \quad (17.43)$$
Figure 17.5: Schematic diagram illustrating the contour enclosing all the poles for Bromwich’s inte-
gral formula.
$$\frac{1}{2\pi i}\oint e^{st}F(s)\,ds = \sum\text{Residues of } e^{st}F(s)\text{ at poles of }F(s)$$
The left-hand side can be split into the contributions of the vertical line and the arc Γ:
$$\text{LHS} = \frac{1}{2\pi i}\int_{\gamma-iR}^{\gamma+iR}e^{st}F(s)\,ds + \frac{1}{2\pi i}\int_\Gamma e^{st}F(s)\,ds$$
If F(s) is such that on Γ, $|F(s)| < \frac{M}{R^k}$, k > 0, then it can be shown that for R → ∞ the second integral goes to zero and
$$\text{LHS} = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}e^{st}F(s)\,ds$$
Example. Consider
$$F(s) = \frac{e^{-a\sqrt{s}}}{s}\qquad (a > 0),$$
which has a branch point at s = 0. The appropriate contour to consider is shown in Figure 17.6.
Figure 17.6: Schematic diagram illustrating the contour enclosing the branch point at s = 0 for Bromwich's integral formula.
Thus,
$$\ell^{-1}\left\{\frac{e^{-a\sqrt{s}}}{s}\right\} = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{e^{st-a\sqrt{s}}}{s}\,ds.$$
The function f(t) can be obtained by evaluating the integral along the contour shown in Figure 17.6 and taking the limits ε → 0, R → ∞.
Consider
$$a_0\frac{d^2u}{dt^2} + a_1\frac{du}{dt} + a_2u = 0 \quad (17.44)$$
$$u(0) = d_0 \quad (17.45)$$
$$u'(0) = d_1 \quad (17.46)$$
Taking the Laplace transform and solving for U(s) = ℓ{u(t)} gives
$$U(s) = \frac{a_0(sd_0 + d_1) + a_1d_0}{a_0s^2 + a_1s + a_2} \quad (17.50)$$
More generally, for the nth-order equation
$$a_0\frac{d^nu}{dt^n} + a_1\frac{d^{n-1}u}{dt^{n-1}} + \cdots + a_{n-1}\frac{du}{dt} + a_nu = 0 \quad (17.51)$$
the transform of the solution is a ratio of polynomials,
$$U(s) = \frac{Q_m(s)}{P_n(s)},\qquad m \le n-1 \quad (17.53)$$
⇒ (when the roots $s_i$ of $P_n(s)$ are distinct)
$$u(t) = \sum_{i=1}^n\frac{Q_m(s_i)}{P_n'(s_i)}e^{s_it} \quad (17.55)$$
For the inhomogeneous equation
$$a_0\frac{d^nu}{dt^n} + a_1\frac{d^{n-1}u}{dt^{n-1}} + \cdots + a_{n-1}\frac{du}{dt} + a_nu = g(t) \quad (17.56)$$
the transform of the solution is
$$U(s) = \frac{Q_m(s)}{P_n(s)} + \frac{\hat{g}(s)}{P_n(s)} \quad (17.58)$$
so that
$$u(t) = \ell^{-1}\left\{\frac{Q_m(s)}{P_n(s)}\right\} + \ell^{-1}\left\{\frac{\hat{g}(s)}{P_n(s)}\right\} = u_1(t) + u_2(t), \quad (17.59)$$
where
$$u_2(t) = \ell^{-1}\left\{\frac{\hat{g}(s)}{P_n(s)}\right\} \quad (17.62)$$
Let
$$\ell^{-1}\left\{\frac{1}{P_n(s)}\right\} = \sum_{i=1}^n\frac{e^{s_it}}{P_n'(s_i)} \equiv G(t) \quad (17.63)$$
Then, by the convolution theorem,
$$u_2(t) = \int_0^tG(t-t')g(t')\,dt' \quad (17.65)$$
The function G(t − t ′ ) is also called the Green’s function for the initial value problem.
We note that when g(t) = δ(t − t0 ), equation (17.65) reduces to u2 (t) = G(t − t0 ). Thus,
G(t − t0 ) is the response of the system with homogeneous (zero) initial condition for a
unit impulse at t = t0 .
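Equation (17.63) is easy to exercise on a concrete case. For the illustrative choice $P_2(s) = s^2 + 3s + 2$ (roots −1 and −2, chosen here for the example), $G(t) = e^{-t} - e^{-2t}$, and convolving G with g(t) = 1 must reproduce the direct solution of $u'' + 3u' + 2u = 1$ with zero initial conditions. A Python sketch:

```python
import math

# P_2(s) = s^2 + 3 s + 2 has simple roots -1 and -2, so by (17.63)
# G(t) = e^{-t}/P'(-1) + e^{-2t}/P'(-2) = e^{-t} - e^{-2t}
def G(t):
    return math.exp(-t) - math.exp(-2.0 * t)

# response of u'' + 3u' + 2u = 1 (zero ICs) via u2(t) = int_0^t G(t-t') dt'
def u2(t, n=20000):
    h = t / n
    total = 0.5 * (G(0.0) + G(t))
    for k in range(1, n):
        total += G(k * h)
    return total * h

t = 1.0
exact = 0.5 - math.exp(-t) + 0.5 * math.exp(-2.0 * t)  # direct solution
print(u2(t), exact)
```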
17.4 Solution of linear differential equations by Laplace transform | 377
where L has constant coefficients. Taking Laplace transform and solving for the trans-
form, we get
$$U(s) = \frac{Q_m(s)}{P_n(s)} + \frac{F(s)}{P_n(s)} \quad (17.68)$$
where $Q_m(s)$ is a polynomial of degree m (< n) depending only on the initial conditions, while $P_n(s)$ is a polynomial depending on the coefficients of L. Equation (17.68) is written as
$$U(s) = U_h(s) + U_p(s),$$
or $u(t) = u_h(t) + u_p(t)$.
Homogeneous part
We now derive the Heaviside formula for $u_h(t)$. Let
$$U(s) = \frac{Q_m(s)}{P_n(s)} \quad (17.71)$$
Assuming that the roots $\lambda_1, \ldots, \lambda_n$ of $P_n(s)$ are distinct, partial fractions give
$$\frac{Q_m(s)}{P_n(s)} = \frac{A_1}{s-\lambda_1} + \frac{A_2}{s-\lambda_2} + \cdots + \frac{A_n}{s-\lambda_n} \quad (17.72)$$
$$A_i = \lim_{s\to\lambda_i}\frac{(s-\lambda_i)Q_m(s)}{P_n(s)} = \frac{Q_m(\lambda_i)}{P_n'(\lambda_i)} \quad (17.73)$$
Pole of order 2
Suppose that s = a is a pole of order 2 of
$$U(s) = \frac{Q_m(s)}{P_n(s)} \quad (17.75)$$
and write
$$U(s) = \frac{\alpha_1}{s-a} + \frac{\alpha_2}{(s-a)^2} + \sum_{j=3}^n\frac{\alpha_j}{s-s_j}$$
⇒
$$(s-a)^2U(s) = \alpha_1(s-a) + \alpha_2 + (s-a)^2\sum_{j=3}^n\frac{\alpha_j}{s-s_j}$$
⇒
$$\alpha_2 = \lim_{s\to a}(s-a)^2U(s),\qquad \alpha_1 = \lim_{s\to a}\frac{d}{ds}\left[(s-a)^2U(s)\right]$$
and
$$u(t) = \alpha_1e^{at} + \alpha_2te^{at} + \sum_{j=3}^n\alpha_je^{s_jt} \quad (17.81)$$
Pole of order 3
If s = a is a pole of order 3, we can write
$$U(s) = \frac{\alpha_1}{s-a} + \frac{\alpha_2}{(s-a)^2} + \frac{\alpha_3}{(s-a)^3} + \sum_{j=4}^n\frac{\alpha_j}{s-s_j} \quad (17.82)$$
where
$$\alpha_3 = \lim_{s\to a}(s-a)^3U(s),\qquad \alpha_2 = \lim_{s\to a}\frac{d}{ds}\left[(s-a)^3U(s)\right],$$
$$\alpha_1 = \frac{1}{2!}\lim_{s\to a}\frac{d^2}{ds^2}\left[(s-a)^3U(s)\right] \quad (17.85)$$
$$\alpha_j = \frac{Q_m(s_j)}{P_n'(s_j)},\quad j \ge 4 \quad (17.86)$$
and
$$u(t) = \alpha_1e^{at} + \alpha_2te^{at} + \alpha_3\frac{t^2}{2!}e^{at} + \sum_{j=4}^n\alpha_je^{s_jt} \quad (17.87)$$
The same results follow from the residue theorem. For a pole of order 2, write
$$U(s) = \frac{Q_m(s)}{(s-a)^2R_{n-2}(s)},\qquad\text{where } R_{n-2}(a) \neq 0 \quad (17.88)$$
⇒
$$u(t) = \lim_{s\to a}\frac{d}{ds}\left[\frac{Q_m(s)}{R_{n-2}(s)}e^{st}\right] + \sum_{j=3}^n\frac{Q_m(s_j)}{P_n'(s_j)}e^{s_jt}$$
where
$$\alpha_1 = \lim_{s\to a}\frac{d}{ds}\left[\frac{Q_m(s)}{R_{n-2}(s)}\right] = \lim_{s\to a}\frac{d}{ds}\left[(s-a)^2U(s)\right] \quad (17.90)$$
$$\alpha_2 = \frac{Q_m(a)}{R_{n-2}(a)} = \lim_{s\to a}(s-a)^2U(s) \quad (17.91)$$
$$\alpha_j = \frac{Q_m(s_j)}{P_n'(s_j)},\quad j \ge 3 \quad (17.92)$$
As can be expected, the results from residue theorem (equations (17.89)–(17.92)) are
identical to those obtained from Heaviside’s formula (equations (17.78)–(17.81)).
Inhomogeneous part
We now derive the Heaviside formula for $u_p(t)$. Let
$$U_p(s) = \frac{F(s)}{P_n(s)} \quad (17.93)$$
Writing
$$\ell^{-1}\left[\frac{1}{P_n(s)}\right] = \sum_{j=1}^n\frac{e^{\lambda_jt}}{P_n'(\lambda_j)} \equiv G(t) \quad (17.95)$$
we get, by convolution,
$$u_p(t) = \ell^{-1}\left[\frac{F(s)}{P_n(s)}\right] = \int_0^tG(t-t')f(t')\,dt' \quad (17.96)$$
$$= \int_0^tG(t')f(t-t')\,dt'. \quad (17.97)$$
Here, G(t) is the Green’s function of the IVP and is defined by equation (17.63).
Consider solving
$$\frac{d^2u}{dx^2} + a_1\frac{du}{dx} + a_0u = f(x),\qquad a < x < b \quad (17.98)$$
$$u(a) = \alpha_1,\qquad u(b) = \alpha_2 \quad (17.99)$$
Taking a = 0 (without loss of generality) and denoting the unknown slope $u'(0) = \beta_1$, the transform of the solution is
$$U(s) = \frac{s\alpha_1 + \beta_1 + a_1\alpha_1}{s^2 + a_1s + a_0} + \frac{\hat{f}(s)}{s^2 + a_1s + a_0} \quad (17.102)$$
With $s_1$ and $s_2$ the roots of $s^2 + a_1s + a_0 = 0$, inversion gives
$$u(x) = \frac{(s_1\alpha_1 + \beta_1 + \alpha_1a_1)}{s_1 - s_2}e^{s_1x} + \frac{(s_2\alpha_1 + \beta_1 + \alpha_1a_1)}{s_2 - s_1}e^{s_2x} + \int_0^x\frac{e^{s_1x'} - e^{s_2x'}}{s_1 - s_2}f(x - x')\,dx' \quad (17.103)$$
The unknown $\beta_1$ is determined from the remaining condition $u(b) = \alpha_2$. In particular, for homogeneous BCs ($\alpha_1 = \alpha_2 = 0$), this condition becomes
$$\beta_1\left(\frac{e^{s_1b} - e^{s_2b}}{s_1 - s_2}\right) + \int_0^bG(b,x')f(x')\,dx' = 0.$$
Also, in this case, the solution of the BVP with homogeneous BCs may be expressed as
$$u(x) = \beta_1\left(\frac{e^{s_1x} - e^{s_2x}}{s_1 - s_2}\right) + \int_0^xG(x,x')f(x')\,dx' \quad (17.104)$$
where
$$G(x,x') = \frac{e^{s_1(x-x')} - e^{s_2(x-x')}}{s_1 - s_2}. \quad (17.105)$$
Example. (i) Consider
$$t\frac{d^2u}{dt^2} - \frac{du}{dt} - tu = 0;\qquad u(0) = 0$$
Let U(s) = ℓ{u(t)}. Then
$$\ell\{u''(t)\} = s^2U(s) - 0 - u'(0),\qquad \ell\{u'(t)\} = sU(s)$$
$$\ell\{tu''(t)\} = -\frac{d}{ds}\left\{s^2U(s) - u'(0)\right\}$$
⇒
$$-\frac{d}{ds}\left\{s^2U(s)\right\} - sU(s) + \frac{dU}{ds} = 0$$
$$(s^2-1)\frac{dU}{ds} + 3sU = 0$$
⇒
$$U(s) = \frac{c}{(s^2-1)^{3/2}}$$
$$u(t) = c\,\ell^{-1}\left\{(s^2-1)^{-3/2}\right\} = c\,t\,I_1(t),$$
since
$$(s^2-1)^{-3/2} = s^{-3}\left(1 - \frac{1}{s^2}\right)^{-3/2} = s^{-3}\left[1 + \frac32\cdot\frac{1}{s^2} + \frac{(-\frac32)(-\frac52)}{2!}\cdot\frac{1}{s^4} + \cdots\right] = \frac{1}{s^3} + \frac{3}{2s^5} + \frac{3\cdot5}{2!\,2^2s^7} + \frac{3\cdot5\cdot7}{3!\,2^3s^9} + \cdots$$
and taking the inverse transform term by term gives $u(t) = c\,t\,I_1(t)$, where
$$I_\nu(t) = \sum_{m=0}^\infty\frac{(t/2)^{2m+\nu}}{m!\,\Gamma(m+\nu+1)}$$
Similarly, for
$$t\frac{d^2u}{dt^2} + \frac{du}{dt} + tu = 0;\qquad u(0) = 1$$
⇒
$$U(s) = \frac{c}{\sqrt{1+s^2}},\qquad c = u(0) = \lim_{s\to\infty}sU(s) = 1\quad\Rightarrow\quad u(t) = J_0(t)$$
(ii)
$$t\frac{d^2u}{dt^2} + \frac{du}{dt} - tu = 0;\qquad u(0) = 1\quad\Rightarrow\quad U(s) = \frac{1}{\sqrt{s^2-1}},\qquad u(t) = I_0(t)$$
$$\ddot{x} - x + \dot{y} - y = 0 \quad (17.106)$$
$$-2\dot{x} - 2x + \ddot{y} - y = e^{-t} \quad (17.107)$$
$$x(0) = 0,\qquad y(0) = 1 \quad (17.108)$$
$$\dot{x}(0) = -1,\qquad \dot{y}(0) = 1 \quad (17.109)$$
Taking the Laplace transform of equations (17.106)–(17.107) and solving the resulting algebraic system gives
$$X(s) = -\frac{s^2 + 2s + 2}{(s+1)^2(s^2+1)} \quad (17.112)$$
$$Y(s) = \frac{s^2 + 2s + 2}{(s^2+1)(s+1)} \quad (17.113)$$
Inverting,
$$x(t) = \frac12\cos t - \sin t - \frac12(1+t)e^{-t} \quad (17.114)$$
$$y(t) = \frac12\cos t + \frac34\sin t + \frac12e^{-t}. \quad (17.115)$$
Example 17.11 (Autonomous linear systems). Consider the vector initial value problem
$$\frac{du}{dt} = Au,\qquad u(t=0) = u^0\quad (u\ \text{an}\ n\text{-vector},\ A\ \text{an}\ n\times n\ \text{matrix}) \quad (17.116)$$
With ℓ{u} = û ⇒
$$s\hat{u}(s) - u^0 = A\hat{u}(s)\quad\Rightarrow\quad \hat{u}(s) = (sI-A)^{-1}u^0 \quad (17.117)$$
$$u(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}e^{st}(sI-A)^{-1}u^0\,ds \quad (17.118)$$
Since
$$(sI-A)^{-1} = \frac{\operatorname{adj}(sI-A)}{\det(sI-A)} = \frac{\operatorname{adj}(sI-A)}{P_n(s)} \quad (17.119)$$
evaluation by residues (for distinct eigenvalues $s_i$) gives
$$u(t) = \sum_{i=1}^n\operatorname{Residue}\left[e^{st}(sI-A)^{-1}u^0\right]_{s=s_i} = \sum_{i=1}^ne^{s_it}\frac{\operatorname{adj}(s_iI-A)}{P_n'(s_i)}u^0 = \sum_{i=1}^ne^{s_it}E_iu^0 = e^{At}u^0. \quad (17.120)$$
This example shows the connection between the spectral theorem and residue theo-
rem.
In this section, we illustrate the application of the Laplace transform method to solve
linear differential equations in two independent variables.
17.5 Solution of linear differential/partial differential equations by Laplace transform | 385
$$\frac{\partial\theta}{\partial t} = \frac{\partial^2\theta}{\partial\xi^2};\qquad 0 < \xi < 1,\ t > 0 \quad (17.121)$$
$$\theta(\xi,0) = 0\qquad\text{IC} \quad (17.122)$$
$$\theta(0,t) = 0;\qquad \theta(1,t) = 1\qquad\text{BCs} \quad (17.123)$$
Let $\bar\theta(\xi,s) = \ell\{\theta(\xi,t)\}$ ⇒
$$\frac{d^2\bar\theta}{d\xi^2} - s\bar\theta = 0;\qquad \bar\theta(0,s) = 0\ \ \text{BC1};\qquad \bar\theta(1,s) = \frac1s\ \ \text{BC2}$$
The general solution is $\bar\theta = c_1\cosh(\sqrt{s}\,\xi) + c_2\sinh(\sqrt{s}\,\xi)$, and BC1 ⇒ $c_1 = 0$, BC2 ⇒ $c_2 = \frac{1}{s\sinh\sqrt{s}}$
⇒
$$\bar\theta = \frac{\sinh(\sqrt{s}\,\xi)}{s\sinh\sqrt{s}} \quad (17.124)$$
Inverting,
$$\theta(\xi,t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}e^{st}\,\bar\theta(\xi,s)\,ds \quad (17.125)$$
Evaluating the integral by residues (a simple pole at s = 0 and simple poles at $s_n = -n^2\pi^2$, n = 1, 2, …) gives
$$\theta(\xi,t) = \xi + \frac{2}{\pi}\sum_{n=1}^\infty\frac{(-1)^n}{n}\sin(n\pi\xi)\,e^{-n^2\pi^2t}$$
Note that for t → ∞, θ(ξ, t) → ξ, i. e., the steady-state profile is linear. The results are shown below in Figure 17.7 using 1000 terms in the summation in Mathematica®.
Figure 17.7: Temperature profile at various times in a slab from one-dimensional heat conduction
model given in equations (17.121)–(17.123).
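The residue series can be summed in any environment (the book uses Mathematica® with 1000 terms). The Python sketch below assumes the series $\theta = \xi + \frac{2}{\pi}\sum_n \frac{(-1)^n}{n}\sin(n\pi\xi)e^{-n^2\pi^2t}$ obtained from the residues of (17.124)–(17.125) and checks the initial and steady-state limits:

```python
import math

def theta(xi, t, terms=20000):
    # residue-series solution of (17.121)-(17.123)
    s = xi
    for n in range(1, terms + 1):
        s += (2.0 / math.pi) * (-1)**n * math.sin(n * math.pi * xi) \
             * math.exp(-n**2 * math.pi**2 * t) / n
    return s

print(theta(0.5, 0.0))   # initial condition: ~0 (series converges slowly at t = 0)
print(theta(0.5, 1.0))   # near steady state: ~0.5
```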
Figure 17.8: Temporal profile of flux for TAP reactor (exact solution in blue and short time solution in
red curves).
$$\frac{\partial c}{\partial t} = \frac{\partial^2c}{\partial z^2};\qquad 0 < z < 1,\ t > 0 \quad (17.127)$$
with BC:
$$-\frac{\partial c}{\partial z}(0,t) = \delta(t);\qquad c(1,t) = 0 \quad (17.128)$$
and IC:
$$c(z,0) = 0 \quad (17.129)$$
The quantity of interest is the exit flux
$$J(t) = -\frac{\partial c}{\partial z}(1,t) \quad (17.130)$$
The Laplace transform method can be used to solve for the dimensionless concentration c and flux J. Let $\hat{c}(z,s) = \ell\{c(z,t)\}$ ⇒
$$s\hat{c} = \frac{d^2\hat{c}}{dz^2};\qquad -\frac{d\hat{c}}{dz}(z=0,s) = 1;\qquad \hat{c}(z=1,s) = 0 \quad (17.132)$$
⇒
$$\hat{c}(z,s) = \frac{\sinh[\sqrt{s}(1-z)]}{\sqrt{s}\cosh[\sqrt{s}]} \quad (17.133)$$
⇒ the Laplace transform of the exit flux is
$$\hat{J}(s) = -\frac{d\hat{c}}{dz}(z=1,s) = \frac{1}{\cosh[\sqrt{s}]} \quad (17.134)$$
Expanding the transform in powers of s,
$$\hat{J}(s) = \int_0^\infty\left(1 - st + \frac{s^2t^2}{2!} - \cdots\right)J(t)\,dt = M_0 - sM_1 + \frac{s^2}{2!}M_2 - \cdots \quad (17.135)$$
where M0 , M1 and M2 are the zeroth, first and second moments of J(t). Thus,
$$M_0 = \hat{J}\big|_{s=0} = 1 \quad (17.136)$$
$$M_1 = -\frac{d\hat{J}}{ds}\bigg|_{s=0} = \frac{\sinh\sqrt{s}}{2\sqrt{s}\cosh^2\sqrt{s}}\bigg|_{s=0} = \frac12 \quad (17.137)$$
$$M_2 = \frac{d^2\hat{J}}{ds^2}\bigg|_{s=0} = \frac{5}{12} \quad (17.138)$$
Thus, the dimensionless variance (second central moment) of the response is given by
$$\sigma^2 = \frac{M_2}{M_1^2} - 1 = \frac23. \quad (17.139)$$
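The moments (17.136)–(17.139) can be confirmed numerically by finite differencing $\hat{J}(s) = 1/\cosh\sqrt{s}$ at s = 0. A Python sketch (complex arithmetic handles s < 0, where $\sqrt{s}$ is imaginary):

```python
import cmath

def Jhat(s):
    # transfer function of the exit flux, equation (17.134)
    return 1.0 / cmath.cosh(cmath.sqrt(s))

h = 1e-4
M0 = Jhat(0).real
M1 = -((Jhat(h) - Jhat(-h)) / (2 * h)).real            # first moment
M2 = ((Jhat(h) - 2 * Jhat(0) + Jhat(-h)) / h**2).real  # second moment
sigma2 = M2 / M1**2 - 1
print(M0, M1, M2, sigma2)  # ~1, 0.5, 5/12, 2/3
```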
The same result follows by expanding the transform directly:
$$\frac{1}{\cosh\sqrt{s}} = \frac{1}{1 + \frac{s}{2!} + \frac{s^2}{4!} + \cdots + \frac{s^n}{(2n)!} + \cdots} = 1 - \frac{s}{2} + \frac{s^2}{2!}\cdot\frac{5}{12} + \cdots$$
The poles of $\hat{J}(s)$ are the zeros of the denominator:
$$\cosh\sqrt{s} = 0\quad\Rightarrow\quad e^{2\sqrt{s}} + 1 = 0$$
or
$$s_k = -(2k-1)^2\frac{\pi^2}{4};\qquad k = 1, 2, 3, \ldots \quad (17.140)$$
All these poles are simple. Thus, using Heaviside's formula or the residue theorem, we get
$$J(t) = \sum_{k=1}^\infty\frac{2\sqrt{s_k}}{\sinh\sqrt{s_k}}\,e^{s_kt} = \sum_{k=1}^\infty(-1)^{k-1}(2k-1)\pi\exp\left[-(2k-1)^2\frac{\pi^2}{4}t\right] \quad (17.141)$$
$$= \pi\left[e^{-\frac{\pi^2}{4}t} - 3e^{-\frac{9\pi^2}{4}t} + 5e^{-\frac{25\pi^2}{4}t} - \cdots\right] \quad (17.142)$$
The general solution from equation (17.141) with 100 terms in the summation and the
short time solution from equation (17.143) are shown in Figure 17.8 in blue and red
curves, respectively.
It can be seen from this figure that the flux attains a maximum when
$$\frac{dJ}{dt}\bigg|_{t=t^*} = 0\quad\Rightarrow\quad t^* = \frac16\quad\text{and}\quad J_{\max} = 1.85 \quad (17.144)$$
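Summing (17.141) numerically confirms $t^*$ and $J_{\max}$. A Python sketch (200 terms is far more than needed at t = 1/6):

```python
import math

def J(t, terms=200):
    # residue series (17.141) for the exit flux
    return sum((-1)**(k - 1) * (2 * k - 1) * math.pi
               * math.exp(-(2 * k - 1)**2 * math.pi**2 * t / 4.0)
               for k in range(1, terms + 1))

print(J(1.0 / 6.0))  # J_max ~ 1.85 at t* = 1/6
```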
We can also show from the initial and final value theorems that J(0⁺) = 0 and J(t → ∞) = 0.
Tracer tests are used to determine the flow maldistributions, leaks and performance
of several types of process equipment such as reactors, separation/adsorption and
distillation columns. In a typical tracer test, a pulse of tracer is injected at the inlet
to the device and the exit concentration of the tracer is recorded. The response of the
equipment for a unit impulse input is known as the “residence time distribution” or
RTD curve and is denoted by E(t). The exit response to the unit step input is referred
to as the “breakthrough curve” or cumulative RTD curve, and is denoted by F(t).
Figure 17.9 shows some schematic of RTD curves corresponding to unit-step and
pulse inputs to a tubular reactor.
E(t) = RTD curve = response to a unit impulse function and F(t) = response to a
unit step function
$$E(t) = \frac{dF}{dt}\qquad\text{or}\qquad F(t) = \int_0^tE(t')\,dt'.$$
$$\frac{\partial c}{\partial t'} + \langle u\rangle\frac{\partial c}{\partial x} = D\frac{\partial^2c}{\partial x^2},\qquad 0 < x < L \quad (17.147)$$
Figure 17.9: Schematic of RTD curves corresponding to unit-pulse and unit-step input in a tubular reactor.
with BCs:
$$\left.\begin{aligned}\langle u\rangle C_0(t') &= \langle u\rangle c - D\frac{\partial c}{\partial x}\quad @\,x = 0\\ \frac{\partial c}{\partial x} &= 0\quad @\,x = L\end{aligned}\right\}\ \text{Danckwerts' BCs}, \quad (17.148)$$
and IC:
$$c(x,0) = f(x)$$
Introducing the dimensionless variables
$$t = \frac{t'\langle u\rangle}{L};\quad z = \frac{x}{L};\quad \mathrm{Pe} = \frac{\langle u\rangle L}{D};\quad g(z) = f(Lz);\quad c_0(t) = C_0\left(\frac{Lt}{\langle u\rangle}\right) \quad (17.150)$$
Then the differential equation and the boundary and initial conditions may be written
as
$$\frac{\partial c}{\partial t} + \frac{\partial c}{\partial z} = \frac{1}{\mathrm{Pe}}\frac{\partial^2c}{\partial z^2},\qquad 0 < z < 1,\ t > 0$$
$$\text{BCs:}\qquad c_0(t) = c - \frac{1}{\mathrm{Pe}}\frac{\partial c}{\partial z}\quad @\,z = 0 \quad (17.151)$$
$$\frac{\partial c}{\partial z} = 0\quad @\,z = 1$$
$$\text{IC:}\qquad c(z,0) = g(z)$$
The solution of this more general model will be considered in part V using the Fourier
transform method. Here, we consider the special case of an impulse input with no initial tracer, i.e.,
$$c_0(t) = \delta(t),\qquad g(z) = 0 \quad (17.152)$$
Let
$$\hat{c}(z,s) = \ell\{c(z,t)\} \quad (17.153)$$
Then
$$\frac{1}{\mathrm{Pe}}\frac{d^2\hat{c}}{dz^2} - \frac{d\hat{c}}{dz} - s\hat{c} = 0,\qquad 0 < z < 1$$
$$\text{BCs:}\qquad 1 = \hat{c} - \frac{1}{\mathrm{Pe}}\frac{d\hat{c}}{dz}\quad @\,z = 0 \quad (17.154)$$
$$\frac{d\hat{c}}{dz} = 0\quad @\,z = 1$$
⇒
$$\hat{c}(z,s) = \alpha_1e^{\lambda_1z} + \alpha_2e^{\lambda_2z} \quad (17.155)$$
where $\lambda_{1,2}$ are the roots of
$$\frac{\lambda^2}{\mathrm{Pe}} - \lambda - s = 0\quad\Rightarrow\quad \lambda_{1,2} = \frac{\mathrm{Pe}}{2}\left(1 \pm \sqrt{1 + \frac{4s}{\mathrm{Pe}}}\right) \quad (17.156)$$
The BCs ⇒
$$1 = \alpha_1 + \alpha_2 - \frac{1}{\mathrm{Pe}}\left[\alpha_1\lambda_1 + \alpha_2\lambda_2\right]$$
$$0 = \alpha_1\lambda_1e^{\lambda_1} + \alpha_2\lambda_2e^{\lambda_2}$$
Solving ⇒
$$\alpha_1 = \frac{\mathrm{Pe}\,\lambda_2e^{\lambda_2}}{\lambda_2^2e^{\lambda_2} - \lambda_1^2e^{\lambda_1}},\qquad \alpha_2 = -\frac{\mathrm{Pe}\,\lambda_1e^{\lambda_1}}{\lambda_2^2e^{\lambda_2} - \lambda_1^2e^{\lambda_1}} \quad (17.157)$$
∴
$$\hat{c}(z,s) = \frac{\mathrm{Pe}}{\lambda_2^2e^{\lambda_2} - \lambda_1^2e^{\lambda_1}}\left[\lambda_2e^{\lambda_2+\lambda_1z} - \lambda_1e^{\lambda_1+\lambda_2z}\right] \quad (17.158)$$
where
$$\lambda_1 = \frac{\mathrm{Pe}}{2}\left[1 + \sqrt{1 + \frac{4s}{\mathrm{Pe}}}\right] \quad (17.159)$$
$$\lambda_2 = \frac{\mathrm{Pe}}{2}\left[1 - \sqrt{1 + \frac{4s}{\mathrm{Pe}}}\right] \quad (17.160)$$
Since the tracer response is measured at the exit (z = 1),
$$\hat{E}(s) = \hat{c}(1,s) = \text{Laplace transform of the RTD curve} \quad (17.162)$$
∴ (using $\lambda_1 + \lambda_2 = \mathrm{Pe}$)
$$\hat{E}(s) = \frac{\mathrm{Pe}\,e^{\mathrm{Pe}}\,[\lambda_2 - \lambda_1]}{\lambda_2^2e^{\lambda_2} - \lambda_1^2e^{\lambda_1}} \quad (17.163)$$
Write
$$\lambda_1 = \frac{\mathrm{Pe}}{2}(1+q);\qquad q = \sqrt{1 + \frac{4s}{\mathrm{Pe}}} \quad (17.164)$$
$$\lambda_2 = \frac{\mathrm{Pe}}{2}(1-q)\quad\Rightarrow\quad \lambda_2 - \lambda_1 = -\mathrm{Pe}\,q \quad (17.165)$$
⇒
$$\hat{E}(s) = \frac{4qe^{\mathrm{Pe}/2}}{(1+q)^2e^{\mathrm{Pe}\,q/2} - (1-q)^2e^{-\mathrm{Pe}\,q/2}} = \frac{4qe^{\mathrm{Pe}/2}}{H(q)} \quad (17.166)$$
where
$$H(q) = (1+q)^2e^{\mathrm{Pe}\,q/2} - (1-q)^2e^{-\mathrm{Pe}\,q/2} \quad (17.167)$$
and H(−q) = −H(q). Thus, H(q) is an odd function and may be written as
$$H(q) = q\,h(q^2)$$
⇒ $\hat{E}(s)$ contains only even powers of q, and hence q = 0, or equivalently, $s = -\frac{\mathrm{Pe}}{4}$, is not a branch point. The poles of $\hat{E}(s)$ are given by
$$H(q) = 0\qquad\left(q = 0\text{ is not a pole since }\hat{E}(s) \to \frac{4e^{\mathrm{Pe}/2}}{h(q^2)}\text{ for }q \to 0\right)$$
i.e.,
$$\frac{(1-q)^2}{(1+q)^2} = e^{\mathrm{Pe}\,q}\qquad\text{or}\qquad \frac{1+q^2-2q}{1+q^2+2q} = e^{\mathrm{Pe}\,q}$$
⇒
$$\frac{2q}{1+q^2} = \frac{1-e^{\mathrm{Pe}\,q}}{1+e^{\mathrm{Pe}\,q}} = \tanh\left(-\frac{\mathrm{Pe}\,q}{2}\right) \quad (17.169)$$
Let
$$q = iQ \quad (17.170)$$
Then equation (17.169) becomes
$$\frac{2Q}{1-Q^2} + \tan\left(\frac{\mathrm{Pe}\,Q}{2}\right) = 0 \quad (17.171)$$
or, in terms of $\Lambda = \frac{\mathrm{Pe}\,Q}{2}$,
$$\tan\Lambda = \frac{\Lambda\,\mathrm{Pe}}{\Lambda^2 - \frac{\mathrm{Pe}^2}{4}} \quad (17.172)$$
where
$$\Lambda = \frac{\mathrm{Pe}\,Q}{2} = \frac{\mathrm{Pe}}{2i}\,q = \frac{\mathrm{Pe}}{2i}\sqrt{1 + \frac{4s}{\mathrm{Pe}}} \quad (17.173)$$
$$\Lambda^2 = -\frac{\mathrm{Pe}^2}{4}\left(1 + \frac{4s}{\mathrm{Pe}}\right) = -\frac{\mathrm{Pe}^2}{4} - s\,\mathrm{Pe} \quad (17.174)$$
⇒ the poles are located at
$$s = s_j = -\frac{1}{\mathrm{Pe}}\left(\Lambda_j^2 + \frac{\mathrm{Pe}^2}{4}\right),\qquad j = 1, 2, \ldots\ (\Lambda_j \neq 0) \quad (17.175)$$
where $\Lambda_j$ are the roots of equation (17.172).
Figure 17.10: Schematic diagram showing the location of the roots of characteristic equation in
Laplace domain.
RTD curve
The RTD curve can be obtained from the residue theorem as
$$E(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}e^{st}\hat{E}(s)\,ds = \sum_j\operatorname{Residue}\,e^{st}\hat{E}(s)\big|_{s=s_j} \quad (17.176)$$
Let
$$R_j = e^{s_jt}A_j \quad (17.178)$$
where
$$s_j = -\frac{1}{\mathrm{Pe}}\left(\Lambda_j^2 + \frac{\mathrm{Pe}^2}{4}\right) \quad (17.180)$$
and
$$A_j = \frac{2\,\mathrm{Pe}\left(1 + \frac{4s_j}{\mathrm{Pe}}\right)e^{\mathrm{Pe}/2}}{H'(q_j)} \quad (17.181)$$
⇒
$$E(t) = e^{\mathrm{Pe}/2}\sum_{j=1}^\infty\frac{(-1)^{j+1}\,8\Lambda_j^2}{4\Lambda_j^2 + 4\,\mathrm{Pe} + \mathrm{Pe}^2}\,e^{s_jt} \quad (17.182)$$
A plot of the RTD curve is shown in Figure 17.11 for three different values of the Peclet number.
A plot of the RTD curve is shown in Figure 17.11 for three different values of the Peclet
number.
Similarly, the results from numerical inversion are shown in Figure 17.12 for Pe = 5.0, along with the solution from the residue theorem (equation (17.182)) with 10 terms in the summation.
Figure 17.12: RTD curves for Pe = 5.0 from residue theorem (solid line) and numerical inversion of
Laplace transform (marker points).
$$M_n = (-1)^n\frac{d^nF(s)}{ds^n}\bigg|_{s=0} \quad (17.183)$$
Central moments
$$m_2 = \sigma^2 = \int_0^\infty(t-\bar{t}\,)^2f(t)\,dt = M_2 - M_1^2 \quad (17.184)$$
Expanding
$$q = \sqrt{1 + \frac{4s}{\mathrm{Pe}}} = 1 + \frac{2s}{\mathrm{Pe}} - \frac{2s^2}{\mathrm{Pe}^2} + \cdots$$
and
$$\exp\left(\frac{\mathrm{Pe}}{2}q\right) = e^{\mathrm{Pe}/2}\,e^{s - \frac{s^2}{\mathrm{Pe}} + \cdots}$$
Thus,
$$\hat{E}(s) = \frac{4\left(1 + \frac{2s}{\mathrm{Pe}} - \frac{2s^2}{\mathrm{Pe}^2} + \cdots\right)e^{\mathrm{Pe}/2}}{\left(4 + \frac{8s}{\mathrm{Pe}} - \frac{4s^2}{\mathrm{Pe}^2}\right)e^{\mathrm{Pe}/2}e^{s-\frac{s^2}{\mathrm{Pe}}+\cdots} - \frac{4s^2}{\mathrm{Pe}^2}\,e^{-\mathrm{Pe}/2}e^{-s+\frac{s^2}{\mathrm{Pe}}+\cdots}}$$
$$= 1 - s + \left(\frac12 + \frac{1}{\mathrm{Pe}} - \frac{1}{\mathrm{Pe}^2} + \frac{e^{-\mathrm{Pe}}}{\mathrm{Pe}^2}\right)s^2 + O(s^3) \quad (17.187)$$
But
$$\hat{E}(s) = \ell[E(t)] = \int_0^\infty e^{-st}E(t)\,dt = \sum_{n=0}^\infty\frac{1}{n!}\frac{d^n\hat{E}(s)}{ds^n}\bigg|_{s=0}s^n = \sum_{n=0}^\infty\frac{(-1)^nM_n}{n!}s^n = 1 - M_1s + \frac{M_2}{2}s^2 - \cdots \quad (17.188)$$
Comparing equations (17.187) and (17.188) gives $M_1 = 1$ and
$$\sigma^2 = M_2 - M_1^2 = \frac{2}{\mathrm{Pe}} - \frac{2}{\mathrm{Pe}^2}\left(1 - e^{-\mathrm{Pe}}\right) \quad (17.192)$$
Note that for Pe ≫ 1, $\sigma^2 \approx \frac{2}{\mathrm{Pe}}$. The higher-order moments can be obtained similarly by Taylor series expansion (using Mathematica®) as
$$M_3 = 1 + \frac{6}{\mathrm{Pe}} + \frac{6}{\mathrm{Pe}^2} + \frac{18e^{-\mathrm{Pe}}}{\mathrm{Pe}^2} - \frac{24}{\mathrm{Pe}^3} + \frac{24e^{-\mathrm{Pe}}}{\mathrm{Pe}^3} \quad (17.193)$$
$$M_4 = 1 + \frac{12}{\mathrm{Pe}} - \frac{48}{\mathrm{Pe}^2} + \frac{108e^{-\mathrm{Pe}}}{\mathrm{Pe}^2} + \frac{360e^{-\mathrm{Pe}}}{\mathrm{Pe}^3} - \frac{336e^{-\mathrm{Pe}}}{\mathrm{Pe}^4} + \frac{312e^{-\mathrm{Pe}}}{\mathrm{Pe}^4} + \frac{24e^{-2\mathrm{Pe}}}{\mathrm{Pe}^4}. \quad (17.194)$$
The third central moment is
$$m_3 = M_3 + 2M_1^3 - 3M_1M_2 = \frac{12}{\mathrm{Pe}^2} + \frac{12e^{-\mathrm{Pe}}}{\mathrm{Pe}^2} - \frac{24}{\mathrm{Pe}^3} + \frac{24e^{-\mathrm{Pe}}}{\mathrm{Pe}^3} \quad (17.195)$$
For Pe ≫ 1,
$$m_3 \approx \frac{12}{\mathrm{Pe}^2} \quad (17.196)$$
which means that the RTD curve is positively skewed (i.e., has a longer tail on the right side). This can be observed in the plots shown in Figures 17.11 and 17.12.
$u_0$ = interstitial fluid velocity $\left(= \frac{\text{superficial velocity}}{\varepsilon}\right)$
$T_s$ = solid temperature
$T_f$ = fluid temperature
$T_r$ = reference temperature
$L$ = length of bed
Assuming constant (and average) physical properties, and taking the limit Δx → 0, we get the fluid-phase energy balance
$$\varepsilon\rho_fC_{pf}\frac{\partial T_f}{\partial t} = -ha_v(T_f - T_s) - \varepsilon u_0\rho_fC_{pf}\frac{\partial T_f}{\partial x} \quad (17.198)$$
Similarly, the solid-phase balance
$$ha_vA_c\Delta x(T_f - T_s) = \frac{\partial}{\partial t}\left[A_c\Delta x(1-\varepsilon)\rho_sC_{ps}(T_s - T_r)\right]$$
⇒
$$(1-\varepsilon)\rho_sC_{ps}\frac{\partial T_s}{\partial t} = ha_v(T_f - T_s) \quad (17.199)$$
Assume that at time t = 0, the fluid and solid in the bed are at T = T0 and for
t > 0 the fluid enters the bed at a temperature T = Tin , i. e., the initial and boundary
conditions can be expressed as
Introducing the dimensionless variables
$$z = \frac{x}{L};\quad \tau = t\frac{u_0}{L};\quad \theta_f = \frac{T_f - T_0}{T_0};\quad \theta_s = \frac{T_s - T_0}{T_0}, \quad (17.201)$$
the balances become
$$\frac{\partial\theta_s}{\partial\tau} = \frac{ha_vL}{u_0(1-\varepsilon)\rho_sC_{ps}}(\theta_f - \theta_s) \quad (17.202)$$
$$\frac{\partial\theta_f}{\partial\tau} = -\frac{\partial\theta_f}{\partial z} - \frac{ha_vL}{u_0\varepsilon\rho_fC_{pf}}(\theta_f - \theta_s) \quad (17.203)$$
The dimensionless equations (17.202) and (17.203) contain two dimensionless groups,
namely
$$p_h = \frac{u_0\varepsilon\rho_fC_{pf}}{ha_vL} = \frac{t_h}{t_c} = \text{local (or transverse) heat Peclet number}; \quad (17.206)$$
$$\alpha_h = \frac{(1-\varepsilon)\rho_sC_{ps}}{\varepsilon\rho_fC_{pf}} = \text{heat capacitance ratio of solid to fluid in the bed}. \quad (17.207)$$
Here, $t_c\left(=\frac{L}{u_0}\right)$ is the convection time, while $t_h\left(=\frac{\varepsilon\rho_fC_{pf}}{ha_v}\right)$ is the heat exchange time between solid and fluid. The model in dimensionless form can be expressed as
$$\alpha_hp_h\frac{\partial\theta_s}{\partial\tau} = \theta_f - \theta_s \quad (17.208)$$
$$p_h\left(\frac{\partial\theta_f}{\partial\tau} + \frac{\partial\theta_f}{\partial z}\right) = -(\theta_f - \theta_s) \quad (17.209)$$
with ICs
$$\theta_s(z,0) = \theta_f(z,0) = 0$$
and BC
$$\theta_f(0,\tau) = \theta_{in}(\tau)$$
This model ignores heat conduction in the solid and fluid phases. This is the simplest nontrivial model for unsteady-state heat transfer in a packed bed. As shown later, the same model appears in many other packed-bed operations, such as chromatography and mass transfer operations. We consider here only the special case of a unit-step input, i.e., $\theta_{in}(\tau) = H(\tau)$ = Heaviside's function.
Let
$$\Theta_s(z,s) = \ell\{\theta_s(z,\tau)\} = \int_0^\infty e^{-s\tau}\theta_s(z,\tau)\,d\tau\qquad\text{and}\qquad \Theta_f(z,s) = \ell\{\theta_f(z,\tau)\}$$
Transforming equation (17.208) gives
$$\Theta_s = \frac{1}{1+\alpha_hp_hs}\Theta_f \quad (17.215)$$
and equation (17.209) then becomes
$$\frac{d\Theta_f}{dz} + \Theta_f\left[1 + \frac{\alpha_h}{1+\alpha_hp_hs}\right]s = 0 \quad (17.216)$$
With the unit-step inlet condition $\Theta_f(0,s) = \frac1s$,
$$\Theta_f = \frac1s\exp\left[-s\left(1 + \frac{\alpha_h}{1+\alpha_hp_hs}\right)z\right] = \frac1s\exp[-sz]\exp\left[-\frac{\alpha_hs}{1+\alpha_hp_hs}z\right] = \frac1s\exp[-sz]\exp\left[-\frac{z}{p_h}\left(1 - \frac{1}{1+\alpha_hp_hs}\right)\right]$$
⇒
$$\Theta_f = \exp\left[-\frac{z}{p_h}\right]\frac{\exp[-sz]}{s}\exp\left[\frac{1}{\left(s+\frac{1}{\alpha_hp_h}\right)}\frac{z}{\alpha_hp_h^2}\right] \quad (17.217)$$
Equation (17.215) ⇒
$$\Theta_s = \frac{1}{\alpha_hp_h}\exp\left[-\frac{z}{p_h}\right]\frac{\exp[-sz]}{s}\left(\frac{1}{s+\frac{1}{\alpha_hp_h}}\exp\left[\frac{1}{\left(s+\frac{1}{\alpha_hp_h}\right)}\frac{z}{\alpha_hp_h^2}\right]\right) \quad (17.218)$$
Equations (17.217) and (17.218) are the Laplace transformation of the fluid and solid
temperatures. By inverting them, we get the temperature in time domain.
Using the formulas,
$$\ell^{-1}\left\{\frac1s\right\} = 1\quad\Rightarrow\quad \ell^{-1}\left\{\frac{e^{-sz}}{s}\right\} = H(\tau - z) = \begin{cases}1, & \tau > z\\ 0, & \tau < z,\end{cases} \quad (17.219)$$
and
$$\ell\{f(t)\} = F(s)\ \Rightarrow\ \ell\{e^{-at}f(t)\} = F(s+a)\qquad\text{(shift theorem)}, \quad (17.220)$$
we have
$$\ell^{-1}\left[\frac{1}{s+\beta}\exp\left(\frac{\lambda}{s+\beta}\right)\right] = e^{-\beta\tau}\,\ell^{-1}\left[\frac1s\,e^{\lambda/s}\right] \quad (17.221)$$
Since
$$\frac1s\exp\left(\frac{\lambda}{s}\right) = \frac1s\sum_{n=0}^\infty\frac{\lambda^n}{n!\,s^n} = \sum_{n=0}^\infty\frac{\lambda^n}{n!\,s^{n+1}}$$
and
$$\ell^{-1}\left\{\frac{1}{s^{n+1}}\right\} = \frac{\tau^n}{n!}$$
⇒
$$\ell^{-1}\left\{\frac1s\exp\left(\frac{\lambda}{s}\right)\right\} = \sum_{n=0}^\infty\frac{(\lambda\tau)^n}{(n!)^2} = \sum_{n=0}^\infty\frac{(2\sqrt{\lambda\tau})^{2n}}{2^{2n}(n!)^2} = I_0(2\sqrt{\lambda\tau}) \quad (17.222)$$
where
$$I_0(\xi) = \sum_{k=0}^\infty\frac{\left(\frac{\xi}{2}\right)^{2k}}{(k!)^2} = \sum_{k=0}^\infty\frac{\xi^{2k}}{2^{2k}(k!)^2}.$$
Thus, setting $\beta = \frac{1}{\alpha_hp_h}$ and $\lambda = \frac{z}{\alpha_hp_h^2}$, equations (17.221) and (17.222) ⇒
$$\ell^{-1}\left[\frac{1}{s+\frac{1}{\alpha_hp_h}}\exp\left(\frac{1}{s+\frac{1}{\alpha_hp_h}}\frac{z}{\alpha_hp_h^2}\right)\right] = \exp\left(\frac{-\tau}{\alpha_hp_h}\right)\ell^{-1}\left[\frac1s\exp\left(\frac{z}{\alpha_hp_h^2\,s}\right)\right] = \exp\left(\frac{-\tau}{\alpha_hp_h}\right)I_0\left(2\sqrt{\frac{z\tau}{\alpha_hp_h^2}}\right) \quad (17.223)$$
Thus, using the convolution theorem (17.19)–(17.20) and equations (17.219) and (17.223), we can express the inverse Laplace transform of equation (17.218) as
$$\theta_s(z,\tau) = \begin{cases}\frac{1}{\alpha_hp_h}\exp\left[-\frac{z}{p_h}\right]\int_0^{\tau-z}\exp\left(\frac{-t}{\alpha_hp_h}\right)I_0\left(2\sqrt{\frac{zt}{\alpha_hp_h^2}}\right)dt, & \tau > z\\ 0, & \tau < z\end{cases} \quad (17.225)$$
since
$$H(\tau - t - z) = \begin{cases}1, & 0 < t < \tau - z\\ 0, & t > \tau - z\end{cases}$$
Now we can obtain θf (z, τ) by taking inverse Laplace transform of equation (17.217).
Alternatively, we can use equations (17.208) and (17.225) to obtain θf (z, τ), which is
given as follows:
$$\theta_f = \theta_s + \alpha_hp_h\frac{\partial\theta_s}{\partial\tau}$$
$$= \begin{cases}\frac{1}{\alpha_hp_h}\exp\left[-\frac{z}{p_h}\right]\int_0^{\tau-z}\exp\left(\frac{-t}{\alpha_hp_h}\right)I_0\left(2\sqrt{\frac{zt}{\alpha_hp_h^2}}\right)dt + \exp\left[-\frac{z}{p_h}\right]\exp\left(\frac{z-\tau}{\alpha_hp_h}\right)I_0\left(2\sqrt{\frac{z(\tau-z)}{\alpha_hp_h^2}}\right), & \tau > z\\ 0, & \tau < z\end{cases} \quad (17.226)$$
In particular, at the inlet (z = 0), equation (17.225) reduces to $\theta_s(0,\tau) = 1 - e^{-\beta\tau}$ with $\beta = \frac{1}{\alpha_hp_h}$.
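Equation (17.225) can be evaluated by quadrature. The Python sketch below (series for $I_0$, trapezoidal rule; a stand-in for the book's Mathematica® computations) checks it against the closed-form inlet value $\theta_s(0,\tau) = 1 - e^{-\tau/(\alpha_hp_h)}$ for $\alpha_h = 10$, $p_h = 0.1$:

```python
import math

def I0(x, terms=40):
    # series for the modified Bessel function of order zero
    return sum((x / 2.0)**(2 * k) / math.factorial(k)**2 for k in range(terms))

def theta_s(z, tau, alpha_h=10.0, p_h=0.1, n=20000):
    # trapezoidal evaluation of equation (17.225) for tau > z
    if tau <= z:
        return 0.0
    beta = 1.0 / (alpha_h * p_h)
    lam = z / (alpha_h * p_h**2)
    h = (tau - z) / n
    def integrand(t):
        return math.exp(-beta * t) * I0(2.0 * math.sqrt(lam * t))
    total = 0.5 * (integrand(0.0) + integrand(tau - z))
    for k in range(1, n):
        total += integrand(k * h)
    return beta * math.exp(-z / p_h) * total * h

tau = 2.0
print(theta_s(0.0, tau), 1.0 - math.exp(-tau))  # alpha_h*p_h = 1 here
```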
Figure 17.14 below shows the breakthrough curve, fluid temperature at the exit, i. e.,
θf (1, τ), when αh = 10 and ph = 0.01, 0.05 and 0.1. The 3D and density plots of solid
phase temperatures (θs ) are also shown in this figure (on right) with αh = 10 and ph =
0.1.
Note that for small values of $p_h$, the temperature front moves with a speed of $\frac{1}{1+\alpha_h}$, or the breakthrough time is τ ≈ (1 + αh), and the spread (dispersion) is symmetric. In the limit $p_h \to 0$, the breakthrough curve is a step function with a jump at τ = 1 + αh.
Figure 17.14: Breakthrough curves θf (1, τ) for αh = 10 and different Peclet numbers ph between 0.01
to 0.1 (left); and 3D and density plots of dimensionless temperatures of solid θs (z, τ) for αh = 10 and
ph = 0.1 (right).
Consider a single-input single-output (SISO) first-order system with PI control. The dynamics of such a control system can be described by the linear equations
$$\tau\frac{dx}{dt} + x = u + f(t) \quad (17.227)$$
$$u(t) = -k_P\left[x(t-\tau_D) + k_I\int_0^tx(t')\,dt'\right]$$
where x and u are the state and the control variables, τ is the process time constant, τD is the delay time, kP and kI are the proportional and integral gains, and f(t) is the disturbance function. Let X(s) and F(s) be the Laplace transforms of x(t) and f(t).
17.6 Control system with delayed feedback | 405
⇒
$$\mathcal{L}\left[\int_0^tx(t')\,dt'\right] = \frac{X(s)}{s}; \quad (17.231)$$
$$\mathcal{L}\left[\frac{dx}{dt}\right] = sX(s) - x(0) = sX(s); \quad (17.232)$$
and
$$\mathcal{L}\left[x(t-\tau_D)\right] = \int_0^\infty e^{-st}x(t-\tau_D)\,dt = e^{-s\tau_D}X(s)\qquad(\text{since } x(t) = 0 \text{ for } t \le 0)$$
⇒
$$\mathcal{L}[u] = U(s) = -k_P\left[\exp(-s\tau_D) + \frac{k_I}{s}\right]X(s) \quad (17.234)$$
Thus, taking the Laplace transform of equation (17.227) on both sides, for the unit step disturbance
$$f(t) = H(t) = \begin{cases}1, & t \ge 0\\ 0, & t < 0,\end{cases}\qquad\text{or}\qquad F(s) = \frac1s,$$
the response to the unit step disturbance can be expressed in the Laplace domain as
$$X(s) = \frac{1}{\tau s^2 + [1 + k_P\exp(-s\tau_D)]s + k_Pk_I}. \quad (17.236)$$
In the absence of delay (τD = 0), this reduces to
$$X(s) = \frac{1}{\tau s^2 + (1+k_P)s + k_Pk_I} \quad (17.237)$$
$$X(s) = \frac{1}{\tau(s-s_1)(s-s_2)} \quad (17.238)$$
where $s_1$ and $s_2$ are the roots of the denominator function in equation (17.237), which can be expressed as
$$s_{1,2} = \frac{-(1+k_P) \pm \sqrt{(1+k_P)^2 - 4\tau k_Pk_I}}{2\tau} \quad (17.239)$$
The response function X(s) can be simplified further from equation (17.238) as
$$X(s) = \frac{1}{\tau(s_1-s_2)}\left[\frac{1}{s-s_1} - \frac{1}{s-s_2}\right] \quad (17.240)$$
⇒
$$x(t) = \frac{1}{\tau}\left[\frac{\exp(s_1t) - \exp(s_2t)}{s_1 - s_2}\right] \quad (17.241)$$
Note from equation (17.239) that, since kP > 0 and τ > 0 for a physical system with PI control, the real parts of s1 and s2 are always negative, i.e.,
\[ \lim_{t\to\infty} x(t) = 0. \]
In other words, such a control system leads to no offset. In addition, when s1 and s2 are real (i.e., (1 + kP)² ≥ 4τkP kI), x(t) goes through a maximum at t = ln(s1/s2)/(s1 − s2). But when s1 and s2 are complex, x(t) goes to zero in an oscillatory manner. For example, when s1 and s2 are complex, i.e., s1,2 = −a ± ib, the response to a unit step can be given from equation (17.241) as
\[ x(t) = \frac{e^{-at}\sin(bt)}{\tau b}. \tag{17.242} \]
Further, when kI = 0 (i.e., proportional control only), the roots are s1 = 0 and s2 = −(1 + kP)/τ, and in this case the response to a unit disturbance can be expressed as
\[ x(t) = \frac{1 - e^{-(1+k_P)t/\tau}}{1 + k_P}. \tag{17.243} \]
Similarly, when (1 + kP)² = 4τkP kI, the system has a repeated root, s1 = s2 = −(1+kP)/(2τ). In this case, the response function simplifies from equation (17.241) to
\[ x(t) = \lim_{s_1\to s_2}\frac{1}{\tau}\Big[\frac{e^{s_1t} - e^{s_2t}}{s_1 - s_2}\Big] = \frac{t}{\tau}e^{s_2t} = \frac{t}{\tau}\exp\Big[\frac{-(1+k_P)t}{2\tau}\Big] \tag{17.244} \]
Figure 17.15 shows the response of the PI control system for a few cases discussed below:
1. τ = 1, kI = 0, kP = 1. In this case, the roots are given from equation (17.239) as s1,2 = 0, −2 and
\[ x(t) = \frac{1 - e^{-2t}}{2} \]
2. τ = 1, kI = 1, kP = 1. In this case, the roots are repeated, s1,2 = −1, −1, and equation (17.244) gives
\[ x(t) = t\,e^{-t} \]
3. τ = 1, kI = 1, kP = 10. In this case, the roots are given from equation (17.239) as s1,2 = −1, −10 and
\[ x(t) = \frac{e^{-t} - e^{-10t}}{9} \]
4. τ = 1, kI = 10, kP = 1. In this case, the roots are given from equation (17.239) as
\[ s_{1,2} = \frac{-2 \pm \sqrt{4 - 40}}{2} = \frac{-2 \pm 6i}{2} = -1 + 3i,\ -1 - 3i \]
and
\[ x(t) = \frac{e^{-t}}{3}\sin(3t) \]
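The closed-form responses above can be checked numerically. Below is a minimal Python sketch (rather than the book's Mathematica; the function name is ours) that evaluates equation (17.241) from the roots (17.239) for the no-delay case:

```python
import cmath

def step_response(tau, kP, kI, t):
    # Roots of tau*s**2 + (1 + kP)*s + kP*kI = 0, eq. (17.239) with tau_D = 0
    disc = cmath.sqrt((1 + kP) ** 2 - 4 * tau * kP * kI)
    s1 = (-(1 + kP) + disc) / (2 * tau)
    s2 = (-(1 + kP) - disc) / (2 * tau)
    # Response to a unit-step disturbance, eq. (17.241)
    return ((cmath.exp(s1 * t) - cmath.exp(s2 * t)) / (tau * (s1 - s2))).real

# Case 3: tau = 1, kI = 1, kP = 10 -> roots -1, -10
x3 = step_response(1.0, 10.0, 1.0, 1.0)    # (e^-1 - e^-10)/9
# Case 4: tau = 1, kI = 10, kP = 1 -> roots -1 +/- 3i
x4 = step_response(1.0, 1.0, 10.0, 0.5)    # exp(-0.5)*sin(1.5)/3
```

The same function also reproduces case 1 (kI = 0, where s1 = 0), since equation (17.241) holds for any pair of distinct roots; only the repeated-root case 2 requires the limit (17.244).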
Next, consider proportional control with delayed feedback (kI = 0, τD > 0). In this case, equation (17.236) gives
\[ X(s) = \frac{1}{s[\tau s + 1 + k_P e^{-s\tau_D}]}; \tag{17.245} \]
\[ X(s) = \frac{1}{s\,G(s)} \tag{17.246} \]
where
\[ G(s) = \tau s + 1 + k_P e^{-s\tau_D}. \tag{17.247} \]
Thus, it can be seen that G″(s) = kPτD² e^{−sτD} is always positive for real s, and hence the roots cannot be repeated more than twice. The twice-repeated roots can be obtained by solving G(s) = G′(s) = 0 simultaneously, which leads to
\[ \frac{\tau}{\tau_D} = k_P e^{-s_j\tau_D} = -(1 + \tau s_j) \tag{17.250} \]
⇒
\[ s_j = \frac{-(1+\alpha)}{\tau_D}; \quad \alpha = \frac{\tau_D}{\tau}; \quad \frac{1}{k_P} = \alpha e^{1+\alpha} \quad\text{and}\quad G''(s_j) = \tau_D\tau \tag{17.251} \]
Thus, if kP, τ and τD satisfy the constraint given in equation (17.251), then sj = −(1+α)/τD is a repeated root of G(s), i.e., a second-order pole of X(s); otherwise sj is a simple pole. Also note that since α exp(1+α) is an increasing function of α, the relation 1/kP = α exp(1+α) has a unique solution in α for a given kP. Thus, for a given set of kP, τ and τD that satisfy equation (17.251), only one repeated root exists.
Note that s = 0 is a simple pole of X(s) and the residue at s₀ = 0 is given by
\[ \operatorname*{Res}_{s=0} e^{st}X(s) = \lim_{s\to 0}\frac{e^{st}}{G(s)} = \frac{1}{1+k_P} \tag{17.252} \]
If s* is the twice-repeated root and sᵢ are the other nonzero simple roots of G(s), the response to a unit disturbance can be obtained from the residue theorem as
\[ x(t) = \sum_s \operatorname{Res}\big[e^{st}X(s)\big] = \operatorname*{Res}_{s=0}\big[e^{st}X(s)\big] + \sum_{i=1}^{\infty}\operatorname*{Res}_{s=s_i}\big[e^{st}X(s)\big] + \operatorname*{Res}_{s=s^*}\big[e^{st}X(s)\big] \]
\[ = \frac{1}{1+k_P} + \sum_{i=1}^{\infty}\frac{e^{s_it}}{s_iG'(s_i)} + \frac{2e^{s^*t}}{s^*G''(s^*)} \]
⇒
\[ x(t) = \frac{1}{1+k_P} + \sum_{i=1}^{\infty}\frac{e^{s_it}}{s_i[\tau - k_P\tau_D e^{-s_i\tau_D}]} + \frac{2e^{s^*t}}{s^*\tau_D\tau} \tag{17.253} \]
Stability analysis
Consider again case 2 (i. e., proportional control with delay). The control system is sta-
ble when all the roots of denominator in X(s) lie in the left half-plane (i. e., Re(sj ) < 0).
We can determine the criteria at which these roots cross the y-axis (pure imaginary
line) and go from left to right half-plane. For this, we can substitute s = jω, j = √−1
(i. e., the roots lying on imaginary axis) in the expression of G(s) given in equation
(17.247), which leads to
⇒
\[ \cos(\omega\tau_D) = -\frac{1}{k_P} \tag{17.254} \]
\[ \text{and}\quad \sin(\omega\tau_D) = \frac{\omega\tau}{k_P} \tag{17.255} \]
Relations (17.254)–(17.255) cannot be satisfied if kP < 1. In other words, when kP < 1 the roots never cross the imaginary axis and always lie in the left half-plane, i.e., the control system is stable. When kP > 1, equation (17.254) leads to
\[ \omega\tau_D = \cos^{-1}\Big(\frac{-1}{k_P}\Big) = \sin^{-1}\Big(\frac{\sqrt{k_P^2-1}}{k_P}\Big), \tag{17.256} \]
while equation (17.255) gives
\[ \omega\tau = \sqrt{k_P^2 - 1}. \tag{17.257} \]
Thus, ω can be eliminated from equations (17.256) and (17.257), which leads to
\[ \frac{\tau_D}{\tau} = \frac{\cos^{-1}\big(\frac{-1}{k_P}\big)}{\sqrt{k_P^2-1}} = \frac{\sin^{-1}\big(\frac{\sqrt{k_P^2-1}}{k_P}\big)}{\sqrt{k_P^2-1}} \tag{17.258} \]
Figure 17.16: Stability region of linear control system with delayed proportional control.
To illustrate, consider three examples with kP = 2 (⇒ cos⁻¹(−1/kP)/√(kP² − 1) = 2π/(3√3) = 1.2092) and τ = 1: (i) τD = 1.21, (ii) τD = 0.605 < 1.21 and (iii) τD = 2.418 > 1.21. The responses to a unit-step disturbance corresponding to these cases are shown in Figure 17.17.
Figure 17.17: Response to unit step function for proportional control with delay for stable, unstable
and oscillatory region.
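The marginal-stability condition (17.258) is easy to evaluate; a short Python sketch (the function name is ours):

```python
import math

def critical_delay_ratio(kP):
    # tau_D / tau at which a pair of roots crosses the imaginary axis,
    # eq. (17.258); valid only for kP > 1
    return math.acos(-1.0 / kP) / math.sqrt(kP ** 2 - 1.0)

ratio = critical_delay_ratio(2.0)   # = 2*pi/(3*sqrt(3)) ~ 1.2092
```

For kP = 2 and τ = 1 this reproduces the critical delay τD = 1.2092 used in the three examples above.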
In general, a root of G(s) = 0 can be written as
\[ s = a + ib \tag{17.259} \]
⇒
\[ 1 + \tau a + k_P e^{-\tau_D a}\cos(\tau_D b) = 0 \quad\text{and}\quad \tau b - k_P e^{-\tau_D a}\sin(\tau_D b) = 0 \]
⇒
\[ a = \frac{1}{\tau_D}\ln\Big[\frac{k_P\sin(\tau_D b)}{\tau b}\Big] \tag{17.260} \]
and
\[ 1 + \frac{\tau}{\tau_D}\ln\Big[\frac{k_P\sin(\tau_D b)}{\tau b}\Big] + \tau b\cot(\tau_D b) = 0 \tag{17.261} \]
Thus, for a given set of τ, τD and kP, equations (17.260)–(17.261) can be solved numerically for the real values b and a, and hence the roots. Using these roots, the response curve can be obtained from equation (17.253) with sj = aj ± ibj.
Example 1: τ = τD = kP = 1.
In this case, the response function can be given from equations (17.246)–(17.249) as
\[ X(s) = \frac{1}{sG(s)}; \qquad G(s) = 1 + s + e^{-s}, \]
and equations (17.260)–(17.261) reduce to
\[ 1 + \ln\Big[\frac{\sin b}{b}\Big] + b\cot b = 0 \quad\text{and}\quad a = \ln\Big[\frac{\sin b}{b}\Big] \tag{17.262} \]
Table 17.1 lists the first few roots of G(s) in this example. These roots are also shown in Figure 17.18.
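The first conjugate pair in Table 17.1 can be recovered by solving equation (17.262) with any bracketing root finder; a Python sketch using plain bisection (the bracket (1.5, 2.5) and helper name are ours):

```python
import math

def f(b):
    # Eq. (17.262) with tau = tau_D = kP = 1: 1 + ln(sin b / b) + b*cot(b)
    return 1.0 + math.log(math.sin(b) / b) + b / math.tan(b)

lo, hi = 1.5, 2.5            # f(1.5) > 0 > f(2.5): a sign change brackets b1
for _ in range(60):          # plain bisection
    mid = 0.5 * (lo + hi)
    if f(lo) * f(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
b1 = 0.5 * (lo + hi)
a1 = math.log(math.sin(b1) / b1)   # real part, eq. (17.262)
```

This reproduces the first root a1 ± i b1 ≈ −0.605021 ± 1.78819i of Table 17.1.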
Using these roots, the response function can be expressed from equation (17.253) as
\[ x(t) = \frac{1}{2} + \sum_{i=1}^{\infty}\Big(\frac{e^{(a_i+ib_i)t}}{(a_i+ib_i)\big[\tau - k_P\tau_D e^{-(a_i+ib_i)\tau_D}\big]} + \frac{e^{(a_i-ib_i)t}}{(a_i-ib_i)\big[\tau - k_P\tau_D e^{-(a_i-ib_i)\tau_D}\big]}\Big). \tag{17.263} \]
Using first 16 conjugate roots (listed in Table 17.1), the expression given in equation
(17.263) is plotted in Figure 17.19 in red solid lines, along with the method of steps based
numerical solution (green dashed line).
Table 17.1: First few roots for proportional control with delayed feedback (τ = τD = kp = 1).
j aj bj sj = aj + ibj
1 −0.605021 ±1.78819 −0.605021 ± 1.78819i
2 −2.05283 ±7.71841 −2.05283 ± 7.71841i
3 −2.64736 ±14.0202 −2.64736 ± 14.0202i
4 −3.01658 ±20.3214 −3.01658 ± 20.3214i
5 −3.28526 ±26.6179 −3.28526 ± 26.6179i
6 −3.49668 ±32.911 −3.49668 ± 32.911i
7 −3.67104 ±39.2019 −3.67104 ± 39.2019i
8 −3.81944 ±45.4912 −3.81944 ± 45.4912i
9 −3.94861 ±51.7794 −3.94861 ± 51.7794i
10 −4.06298 ±58.0668 −4.06298 ± 58.0668i
11 −4.1656 ±64.3535 −4.1656 ± 64.3535i
12 −4.25866 ±70.6397 −4.25866 ± 70.6397i
13 −4.34378 ±76.9256 −4.34378 ± 76.9256i
14 −4.42223 ±83.2111 −4.42223 ± 83.2111i
15 −4.49496 ±89.4964 −4.49496 ± 89.4964i
16 −4.56276 ±95.7814 −4.56276 ± 95.7814i
Figure 17.18: First few roots of G(s) = 0 for proportional control with delayed feedback (τ = τD = kp = 1).
Both match very well with each other, as expected. Note that since Re(sj) < 0 ∀j,
\[ \lim_{t\to\infty} x(t) = \frac{1}{1+k_P} = \frac{1}{2}. \]
Figure 17.19: Response to unit step disturbance equipped with proportional control with delayed
feedback for τ = τD = kp = 1.
Figure 17.20: Response to unit-step disturbance equipped with proportional control with delayed
feedback for τ = kp = 1 with τD = 1 (top plot) and τD = 10 (bottom plot) and from residue theorem
(solid lines) and numerical inversion of Laplace transform (marker points).
The two methods agree closely except where the response curve is sharper, i.e., the slope is high, especially for larger delay [see the bottom plot (τD = 10) in Figure 17.20 near time t = 10 or 20].
Problems
1. Use the complex inversion formula to evaluate the inverse Laplace transform of
the following functions:
(i) \(\frac{1}{(s+1)(s^2+1)}\) (ii) \(\frac{1}{s^4+4}\) (iii) \(\frac{s}{(s^2+1)^4}\)
2. Use the complex inversion formula to evaluate the inverse Laplace transform of
the following functions:
(i) \(e^{-\sqrt{s}}\) (ii) \(\frac{1}{s\sqrt{s+1}}\) (iii) \(\frac{1}{s^2\cosh\sqrt{s}}\) (iv) \(\frac{\cosh(x\sqrt{s})}{s\cosh(a\sqrt{s})}\), (0 < x < a)
3. Solve the following linear initial value problems using the Laplace transformation
method:
(i)
\[ \frac{d^4u}{dt^4} - 2\frac{d^2u}{dt^2} + u = 0; \qquad u(0) = 1,\ u'(0) = 0,\ u''(0) = 1,\ u'''(0) = 0 \]
(ii)
\[ t^2\frac{d^2u}{dt^2} + t\frac{du}{dt} + (t^2-1)u = 0; \qquad u(1) = 2,\ u(t) \text{ bounded for all } t \]
4. Solve the following linear integral, integrodifferential and delay (difference) equa-
tions using Laplace transformation
(i)
\[ \int_0^t u(t')\,u(t-t')\,dt' = 2u(t) + \frac{t^3}{6} - 2t \]
(ii)
u(0) = c0 , u′ (0) = c1
(iii)
(iv)
where
\[ u(t) = 0 \ \text{for} \ t < 0 \quad\text{and}\quad f(t) = \begin{cases} e^{-t}, & t > 0 \\ 0, & t < 0 \end{cases} \]
5. The dynamic model for a cascade of N perfectly stirred tank reactors (CSTRs) is
given by
\[ \frac{\tau}{N}\frac{dc_1}{dt} = c_0(t) - c_1(t) \]
\[ \frac{\tau}{N}\frac{dc_i}{dt} = c_{i-1}(t) - c_i(t), \quad i = 2, 3, \ldots, N \]
where τ is the mean residence (space) time in the cascade, N is the number of tanks and ci−1(t) and ci(t) are the concentrations of a tracer in the streams entering and leaving tank i, respectively. The response of the system to a unit impulse defines the residence time distribution (RTD) function, denoted as E(t), i.e.,
E(t) = cN(t). If the moments are defined by
\[ M_i = \int_0^\infty t^i E(t)\,dt, \]
show that
\[ m_2 = M_2 - M_1^2 = \frac{\tau^2}{N}, \qquad m_3 = M_3 - 3M_1M_2 + 2M_1^3 = \frac{2\tau^3}{N^2}. \]
6. The dynamic model for a recycle reactor (tubular plug flow reactor with recycle)
is given by
\[ \frac{\partial c}{\partial t} + \frac{(1+R)}{\tau}\frac{\partial c}{\partial z} = 0, \quad 0 < z < 1,\ t > 0 \]
\[ c(0,t) = \frac{R}{1+R}c(1,t) + \frac{c_{in}(t)}{1+R} \]
\[ c(z,0) = 0, \quad 0 < z < 1, \]
show that
\[ M_1 = \tau, \qquad M_2 = \frac{(1+2R)}{(1+R)}\tau^2 \]
\[ m_1 = 0, \qquad m_2 = \frac{R}{1+R}\tau^2 \]
7. The dynamics of a single input-single output (SISO) first-order system with PI con-
trol is described by the linear equations
\[ \tau\frac{dx}{dt} + x = u + f(t) \]
where x and u are state and control variables, τ is the process time constant, τD
is the delay time, kP and kI are proportional and integral gains and f (t) is the dis-
turbance function. Use the Laplace transform method to determine the response
of the system for a unit step disturbance for the following cases:
(a) τ = 1, τD = 0, kI = 0, kP = 0, 1, 10
(b) τ = 1, τD = 0, kI = 1/2, 1, 2, kP = 1
(c) τ = 1, τD = 1, kI = 0, kP = 1
(d) τ = 1, τD = 0.5, kI = 1, kP = 1
8. Solve the following initial-boundary value problem using the Laplace transforma-
tion and make a schematic plot of the solution at any fixed position (x ≠ 0) as a
function of time
\[ \frac{\partial^2u}{\partial x^2} = \frac{\partial u}{\partial t}; \quad 0 < x < 1,\ t > 0 \]
I.C u(x, 0) = 0; BCs u(0, t) = δ(t), u(1, t) = 0
9. Consider the flow system shown in Figure 17.21. Assume that each tank is well
mixed and species A enters tank 1 at a concentration of cin (t) and leaves at c1 (t).
Assume further that VR1 = 1 m3 , VR2 = 32 m3 and q1 = q2 = 2 m3 / min.
\[ \frac{du}{dt} = u(t) - \beta u(t-\tau) + f(t), \quad t > 0; \qquad u(t) = 0, \ -\tau \le t \le 0. \]
Here, f(t) is the external input disturbance, τ > 0 is the delay time and β > 1 is the strength of the delayed feedback.
(a) Determine the steady-state response for a unit-step input
(b) Write the general form of the transient response for a unit-step input (no need
to compute it)
(c) Can the system go unstable, i. e., any poles/eigenvalues cross the imaginary
axis? If so, determine the smallest value of τ for which the system becomes
unstable.
(d) Show a schematic diagram of the transient response for β = 2 and τ = 0.60 [Hint: π/(3√3) = 0.6046].
11. [Frequency response]
Consider the inhomogeneous n-th order scalar differential equation
\[ Lu = \alpha_0\frac{d^nu}{dt^n} + \alpha_1\frac{d^{n-1}u}{dt^{n-1}} + \cdots + \alpha_{n-1}\frac{du}{dt} + \alpha_n u = f(t), \quad t > 0 \]
I.Cs: u^{[k]}(t = 0) = 0, k = 0, 1, …, (n − 1). Show that
\[ U(s) = \mathcal{L}\{u(t)\} = \frac{F(s)}{P_n(s)}. \]
Let
\[ f(t) = A\sin\omega t \ \Rightarrow\ F(s) = \frac{A\omega}{s^2+\omega^2}, \]
and assume that λ1, λ2, …, λn are n distinct roots of Pn(λ) = 0; then the general form of the solution can be expressed as
\[ u(t) = A\Big[\sum_{j=1}^{n}\frac{\omega e^{\lambda_jt}}{(\omega^2+\lambda_j^2)P_n'(\lambda_j)} + A_R\sin(\omega t - \phi)\Big] \]
where
\[ A_R = \text{amplitude ratio} = \frac{1}{|P_n(i\omega)|}, \qquad \phi = \text{phase lag} = \arg P_n(i\omega) = -\arg\frac{1}{P_n(i\omega)}, \]
which can be obtained by taking the system transfer function \(1/P_n(s)\) and replacing s by iω to get the complex number \(1/P_n(i\omega)\). If we write \(1/P_n(i\omega) = |P_n(i\omega)|^{-1}e^{-i\phi}\), we obtain AR and ϕ.]
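The amplitude ratio and phase lag in the hint can be computed directly by evaluating Pn(iω); a small Python sketch (Horner evaluation; the function name is ours):

```python
import cmath

def amplitude_phase(coeffs, omega):
    # Evaluate P_n(i*omega) by Horner's rule; coeffs = [a0, a1, ..., an]
    P = 0j
    for a in coeffs:
        P = P * (1j * omega) + a
    # AR = 1/|P_n(i w)|, phase lag phi = arg P_n(i w)
    return 1.0 / abs(P), cmath.phase(P)

# First-order process P1(s) = tau*s + 1 with tau = 1, at omega = 1
AR, phi = amplitude_phase([1.0, 1.0], 1.0)   # 1/sqrt(2) and pi/4
```

For this first-order example, AR = 1/√(1 + ω²τ²) and φ = arctan(ωτ), the familiar Bode-plot result.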
Part IV: Linear ordinary differential
equations-boundary value problems
18 Two-point boundary value problems
In this and the next chapter, we discuss the solution of linear differential equations
with prescribed end or boundary conditions. Specifically, we discuss the problem of
determining a function u(x) satisfying an n-th order differential equation in the inde-
pendent variable x (a < x < b) and n boundary conditions involving the function and
its first (n − 1) derivatives at the end points x = a and x = b.
Consider the first-order linear differential operator
\[ Lu = p_0(x)\frac{du}{dx} + p_1(x)u \tag{18.1} \]
In general, the RHS of equation (18.1) is not an exact derivative. We would like to find
a function v such that vLu is an exact derivative. Multiplying equation (18.1) by v(x),
we have
\[ vLu = vp_0(x)\frac{du}{dx} + vp_1(x)u = \frac{d}{dx}[vp_0u] - u(vp_0)' + uvp_1 = \frac{d}{dx}[vp_0u] + \big[-(p_0v)' + p_1v\big]u \tag{18.2} \]
Let us define
L∗ v = −(p0 v)′ + p1 v
= −p0 v′ + (p1 − p′0 )v
where L∗ is also a linear differential operator. Equation (18.2) may now be written as
d
vLu − uL∗ v = [vp0 u] (18.3)
dx
Now suppose that v satisfies the homogeneous equation
L∗ v = 0 (18.4)
https://doi.org/10.1515/9783110739701-019
424 | 18 Two-point boundary value problems
so that
\[ vLu = \frac{d}{dx}[vp_0u]. \tag{18.5} \]
The equation L∗v = 0 may be written as
\[ v' = \frac{p_1 - p_0'}{p_0}\,v \]
⇒
\[ \ln v = \int\Big(\frac{-p_0'}{p_0} + \frac{p_1}{p_0}\Big)dx = -\ln p_0(x) + \int\frac{p_1(x)}{p_0(x)}\,dx \]
⇒
\[ v = \frac{1}{p_0(x)}\exp\Big\{\int\frac{p_1(x)}{p_0(x)}\,dx\Big\}. \tag{18.6} \]
Then, whenever Lu = 0,
\[ \frac{d}{dx}[vp_0u] = 0. \]
Integrating, we obtain
\[ v(x)\,p_0(x)\,u(x) = \text{constant}. \]
Thus, if we know either u(x) or v(x), we can determine the other, or the solutions to
the equations Lu = 0 and L∗ v = 0 are intimately related.
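Equation (18.6) can be spot-checked numerically: with p0 = 1 and p1 = x it gives v = exp(x²/2), and v·Lu should then be an exact derivative for any u. A Python sketch (the test function u = sin x and all helper names are our own choices):

```python
import math

def v(x):
    # Eq. (18.6) with p0 = 1, p1 = x: v = (1/p0) * exp(int p1/p0 dx)
    return math.exp(0.5 * x * x)

def vLu(x):
    # v * Lu = v * (u' + x*u) for the test function u = sin x
    return v(x) * (math.cos(x) + x * math.sin(x))

def d_vu(x, h=1e-5):
    # Centred finite difference of d/dx [v(x) * u(x)]
    return (v(x + h) * math.sin(x + h) - v(x - h) * math.sin(x - h)) / (2 * h)

# v*Lu should equal d/dx [v*p0*u] pointwise (Lagrange identity for n = 1)
err = max(abs(vLu(0.1 * k) - d_vu(0.1 * k)) for k in range(1, 11))
```

The agreement (up to finite-difference error) confirms that multiplying by v makes Lu an exact derivative.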
Now consider the second-order operator
\[ Lu = p_0(x)\frac{d^2u}{dx^2} + p_1(x)\frac{du}{dx} + p_2(x)u \tag{18.8} \]
Multiplying by v(x),
\[ vLu = vp_0\frac{d^2u}{dx^2} + vp_1(x)\frac{du}{dx} + vp_2(x)u = vp_0u'' + vp_1u' + vp_2u \]
18.1 The adjoint differential operator | 425
\[ = (vp_0u')' - u'(vp_0)' + (vp_1u)' - u(vp_1)' + vp_2u \]
\[ = (vp_0u')' - \big[u(vp_0)'\big]' + u(p_0v)'' + (vp_1u)' - u(vp_1)' + vp_2u \]
\[ = u\big[(p_0v)'' - (p_1v)' + p_2v\big] + \frac{d}{dx}\big[vp_0u' - u(vp_0)' + vp_1u\big] \]
⇒
\[ vLu - uL^*v = \frac{d}{dx}\big[vp_0u' - u(vp_0)' + vp_1u\big] = \frac{d}{dx}\big[\pi(u,v)\big] \tag{18.9} \]
Thus, if v satisfies the adjoint equation L∗v = 0, where
\[ L^*v = (p_0v)'' - (p_1v)' + p_2v, \]
then vLu is an exact derivative. The quantity in brackets may be written as
\[ \pi(u,v) = \mathbf{k}^T(v)\,\mathbf{P}(x)\,\mathbf{k}(u), \]
where k is the Wronskian vector. The function π(u, v) is called the bilinear concomitant and P(x) is called the concomitant matrix. Thus,
\[ vLu - uL^*v = \frac{d}{dx}\big[\mathbf{k}^T(v)\mathbf{P}(x)\mathbf{k}(u)\big] \tag{18.12} \]
Equation (18.12) is again the Lagrange identity in terms of the Wronskian vectors and
concomitant matrix. Note that if two solutions of the adjoint equation L∗ v = 0 are
known, then we have
\[ \pi(u,v_1) = v_1(x)p_0(x)u'(x) - u(x)\big(v_1(x)p_0(x)\big)' + v_1(x)p_1(x)u(x) = c_1 \tag{18.13} \]
\[ \pi(u,v_2) = v_2(x)p_0(x)u'(x) - u(x)\big(v_2(x)p_0(x)\big)' + v_2(x)p_1(x)u(x) = c_2 \tag{18.14} \]
These two linear equations can be solved for u(x) and u′ (x) to determine the two lin-
early independent solutions of the equation Lu = 0. Similarly, if two solutions u1 (x)
and u2 (x) of Lu = 0 are known, we can determine the solutions of L∗ v = 0. Thus,
the solutions of the two homogeneous equations Lu = 0 and L∗ v = 0 are closely re-
lated.
18.1.1 The Lagrange identity for an n-th order linear differential operator
\[ Lu \equiv p_0(x)\frac{d^nu}{dx^n} + p_1(x)\frac{d^{n-1}u}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{du}{dx} + p_n(x)u = \sum_{j=0}^{n}p_{n-j}(x)u^{[j]} \tag{18.15} \]
where
\[ u^{[j]} = \frac{d^ju}{dx^j}. \tag{18.16} \]
It may be shown that the adjoint equation is given by
\[ L^*v = \sum_{j=0}^{n}(-1)^j\big[p_{n-j}(x)v\big]^{[j]} \tag{18.17} \]
and the Lagrange identity takes the form
\[ vLu - uL^*v = \frac{d}{dx}\big[\mathbf{k}^T(v)\mathbf{P}\mathbf{k}(u)\big] \tag{18.18} \]
where
\[ \mathbf{k}(u) = \begin{pmatrix} u(x) \\ u'(x) \\ u''(x) \\ \vdots \\ u^{[n-1]}(x) \end{pmatrix} \tag{18.19} \]
is the Wronskian vector of u(x) and the elements of the concomitant matrix P are defined by
\[ p^*_{ij}(x) = \begin{cases} \displaystyle\sum_{h=i}^{n-j+1}(-1)^{h-1}\binom{h-1}{i-1}p^{[h-i]}_{n-h-j+1}(x), & i \le n-j+1 \\ 0, & i > n-j+1 \end{cases} \tag{18.20} \]
The bilinear concomitant for the n-th order case may be expressed as
\[ \pi(u,v) = \sum_{i=1}^{n}\sum_{l=1}^{n-i+1}p^*_{li}\,v^{[l-1]}u^{[i-1]}. \tag{18.21} \]
It is clear that π(u, v) defined by equation (18.21) is a bilinear form, and hence the name
bilinear concomitant. For example, for n=3, we get
\[ \mathbf{P} = \begin{pmatrix} p_0'' - p_1' + p_2 & p_1 - p_0' & p_0 \\ 2p_0' - p_1 & -p_0 & 0 \\ p_0 & 0 & 0 \end{pmatrix} \tag{18.22} \]
Theorem 18.1. The operators L and L∗ are adjoint to each other, i. e., L∗∗ y = Ly (the
adjoint relationship is a reciprocal one).
Theorem 18.2. The concomitant matrix P is nonsingular and its determinant is given by
det P(x) = {p0 (x)}n .
Note that the indices on the antidiagonal sum to n + 1, i.e., if i + j = n + 1, then aij is an antidiagonal element. [Remark: antidiagonal elements are those on the line connecting a1n and an1.] Since p∗ij = 0 for i > n − j + 1, P is triangular with respect to the antidiagonal (all elements below the antidiagonal vanish), so its determinant is, up to sign, the product of the antidiagonal entries ±p0, in accordance with Theorem 18.2. Further, if L is formally self-adjoint, then P = −P^T; and if K(u) and K(v) are the Wronskian matrices of bases for Lu = 0 and L∗v = 0, then
\[ \mathbf{K}^T(v)\mathbf{P}\mathbf{K}(u) = \mathbf{C}, \]
where C is a nonsingular constant matrix. Further, we can choose v(x) and u(x) such that C is the identity matrix (here K is the Wronskian matrix).
is formally self-adjoint.
For algebraic details and proofs of the theorems stated above, we refer to the book
by R. H. Cole [14].
Let
\[ L = p_0(x)\frac{d^n}{dx^n} + p_1(x)\frac{d^{n-1}}{dx^{n-1}} + \cdots + p_{n-1}(x)\frac{d}{dx} + p_n(x) \tag{18.23} \]
be an n-th order differential operator and u(x) ∈ C^n[a, b], p0(x) ≠ 0 in [a, b]. Consider the problem of solving
\[ Lu = f(x), \quad a < x < b, \tag{18.24} \]
subject to the boundary conditions
\[ \alpha_{11}u(a) + \cdots + \alpha_{1n}u^{[n-1]}(a) + \beta_{11}u(b) + \cdots + \beta_{1n}u^{[n-1]}(b) = d_1 \]
\[ \alpha_{21}u(a) + \cdots + \alpha_{2n}u^{[n-1]}(a) + \beta_{21}u(b) + \cdots + \beta_{2n}u^{[n-1]}(b) = d_2 \]
\[ \vdots \]
\[ \alpha_{m1}u(a) + \cdots + \alpha_{mn}u^{[n-1]}(a) + \beta_{m1}u(b) + \cdots + \beta_{mn}u^{[n-1]}(b) = d_m \tag{18.25} \]
Let
18.2 Two-point boundary value problems | 429
\[ \mathbf{k}(u(x)) = \begin{pmatrix} u(x) \\ u'(x) \\ \vdots \\ u^{[n-1]}(x) \end{pmatrix} \ \text{be the Wronskian vector of } u(x). \]
Let Wa = (αij), Wb = (βij) and W = [Wa Wb], so that the boundary conditions (18.25) may be written compactly as
\[ \mathbf{W}\begin{pmatrix}\mathbf{k}(u(a))\\ \mathbf{k}(u(b))\end{pmatrix} = \mathbf{d}. \tag{18.27} \]
In practice and most of our applications, m = n, but, for the present we do not impose
this restriction. We assume that rank W = m, i. e., the boundary conditions are inde-
pendent. The n-th order two-point BVP is defined by equations (18.24) and (18.27). Its
solution requires determining a function u(x) satisfying the differential equation as
well as the end conditions.
We can use the principle of superposition to write the solution of equations (18.24) and (18.27) as u(x) = u1(x) + u2(x), where u1(x) satisfies the inhomogeneous equation with homogeneous boundary conditions,
\[ Lu_1 = f, \qquad \mathbf{W}\begin{pmatrix}\mathbf{k}(u_1(a))\\ \mathbf{k}(u_1(b))\end{pmatrix} = \mathbf{0}, \]
and u2(x) satisfies the homogeneous equation with inhomogeneous boundary conditions,
\[ Lu_2 = 0, \qquad \mathbf{W}\begin{pmatrix}\mathbf{k}(u_2(a))\\ \mathbf{k}(u_2(b))\end{pmatrix} = \mathbf{d}. \tag{18.32} \]
First, consider the fully homogeneous problem:
\[ Lu = 0, \tag{18.33} \]
\[ \mathbf{W}\begin{pmatrix}\mathbf{k}(u(a))\\ \mathbf{k}(u(b))\end{pmatrix} = \mathbf{0}. \tag{18.34} \]
Theorem 18.5. The solutions of the two-point homogeneous boundary value problem
defined by equations (18.33)–(18.34) form a vector space. Let C r [a, b] be the vector space
of all functions that are r-times differentiable. We can think of L as a linear operator,
i. e.,
L : C r [a, b] → C[a, b]
Then the solutions of Lu = 0 form the kernel of L, i.e., they are the elements whose image under L is the zero element.
Figure 18.1 shows schematically the domain, codomain and transformation of kernel
of the operator L. We have already shown that the kernel of L is a vector space V (which
is a subspace of C r [a, b]) of dimension n. Now let ψ1 (x) be any element in C r [a, b].
Define a linear transformation (mapping) from C r [a, b] to ℝ2n by the relation
\[ \mathcal{B}(\psi_1(x)) = \begin{pmatrix} \psi_1(a) \\ \psi_1'(a) \\ \vdots \\ \psi_1^{[n-1]}(a) \\ \psi_1(b) \\ \vdots \\ \psi_1^{[n-1]}(b) \end{pmatrix} = \text{boundary vector} \tag{18.35} \]
where ℬ is a linear mapping (see Figure 18.2). This mapping is obviously not one-to-one. But we can define inverse images of sets in ℝ^{2n} in the familiar fashion: if S is any set of elements, then ℬ⁻¹(S) denotes the set of all elements in V whose images are in S.
Lemma. If S is a subspace of ℝ^{2n}, then ℬ⁻¹(S) is a subspace.
Proof. Let ψ1, ψ2 ∈ ℬ⁻¹(S) and α1, α2 be any constants. Then the boundary vector of α1ψ1 + α2ψ2 is given by ℬ(α1ψ1 + α2ψ2) = α1ℬ(ψ1) + α2ℬ(ψ2), which lies in S since S is a subspace.
⇒ α1ψ1 + α2ψ2 is in ℬ⁻¹(S).
∴ The result.
Proof of Theorem 18.5. Equation (18.34) requires the boundary vector to be orthogonal
to the rows of W. Thus, it defines a subspace S of R2n . Since ℬ is linear, from the above
lemma, the inverse image of this subspace is a subspace of C r [a, b].
Let
\[ \mathbf{K}(\boldsymbol{\psi}(x)) = \text{Wronskian matrix of } \boldsymbol{\psi}, \quad \text{where } \boldsymbol{\psi} = \begin{pmatrix}\psi_1\\ \psi_2\\ \vdots\\ \psi_n\end{pmatrix}. \tag{18.37} \]
Define
\[ \mathbf{D} = \mathbf{W}_a\mathbf{K}(\boldsymbol{\psi}(a)) + \mathbf{W}_b\mathbf{K}(\boldsymbol{\psi}(b)). \]
The matrix D is called the characteristic matrix of the BVP. It plays an important role
in determining the properties of the BVP.
Writing the general solution of Lu = 0 as
\[ u = \boldsymbol{\psi}^T\mathbf{c}, \tag{18.39} \]
substitution into the boundary conditions (18.34) gives
\[ \mathbf{D}\mathbf{c} = \mathbf{0}. \tag{18.41} \]
Example. Consider
\[ u''' = 0, \quad 0 < x < 1; \qquad u(0) = 0, \ u(1) = u'(0), \ u'(1) = u''(0). \]
Solution: A basis for the solution space is
\[ \psi_1(x) = 1, \quad \psi_2(x) = x, \quad \psi_3(x) = x^2. \]
\[ \mathbf{K}(\boldsymbol{\psi}(x)) = \begin{pmatrix} 1 & x & x^2 \\ 0 & 1 & 2x \\ 0 & 0 & 2 \end{pmatrix}, \qquad \mathbf{K}(\boldsymbol{\psi}(0)) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \qquad \mathbf{K}(\boldsymbol{\psi}(1)) = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 2 \end{pmatrix} \]
The boundary conditions give
\[ \mathbf{W}_0 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \qquad \mathbf{W}_1 = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \]
so that
\[ \mathbf{D} = \mathbf{W}_0\mathbf{K}(\boldsymbol{\psi}(0)) + \mathbf{W}_1\mathbf{K}(\boldsymbol{\psi}(1)) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -2 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \]
rank D = 3, so Dc = 0 has only the trivial solution c = 0, i.e., the index of compatibility is zero.
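The rank statement can be checked with a few lines of Python (a sketch; the determinant helper is ours, not from the text):

```python
def det3(m):
    # 3x3 determinant by cofactor expansion along the first row
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Characteristic matrix D of the example (u''' = 0 with the given BCs)
D = [[1, 0, 0],
     [1, 0, 1],
     [0, 1, 0]]
dD = det3(D)   # nonzero, so rank D = 3 and Dc = 0 forces c = 0
```

A nonzero determinant confirms full rank, i.e., only the trivial solution exists.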
Example. Consider the BVP
\[ u'' - 3u' + 2u = 0, \quad 0 < x < 1; \qquad u(0) - u'(0) = 0, \quad u(1) - u'(1) = 0. \]
We note that the two linearly independent solutions of the homogeneous equation are ψ1(x) = eˣ and ψ2(x) = e²ˣ
⇒
\[ \mathbf{K}(\boldsymbol{\psi}(x)) = \begin{pmatrix} e^x & e^{2x} \\ e^x & 2e^{2x} \end{pmatrix} \]
\[ \mathbf{K}(\boldsymbol{\psi}(0)) = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, \qquad \mathbf{K}(\boldsymbol{\psi}(1)) = \begin{pmatrix} e & e^2 \\ e & 2e^2 \end{pmatrix} \]
\[ \mathbf{D} = \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} e & e^2 \\ e & 2e^2 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & -e^2 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 0 & -e^2 \end{pmatrix} \]
\[ \mathbf{D}\mathbf{c} = \mathbf{0} \ \Rightarrow\ \begin{pmatrix} 0 & -1 \\ 0 & -e^2 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \mathbf{0} \ \Rightarrow\ c_2 = 0, \ c_1 = 1 \]
rank D = 1
∴ index of compatibility is one, i. e., the solution space has dimension one.
Consider the n-th order adjoint equation
\[ L^*v = \sum_{j=0}^{n}(-1)^j\big[p_{n-j}(x)v\big]^{[j]} = 0 \tag{18.45} \]
and the Lagrange identity
\[ vLu - uL^*v = \frac{d}{dx}\big[\pi(u,v)\big] = \frac{d}{dx}\big\{\mathbf{k}^T(v(x))\mathbf{P}(x)\mathbf{k}(u(x))\big\} \tag{18.46} \]
Integrating equation (18.46) on both sides from x = a to x = b, we get
\[ \int_a^b (vLu - uL^*v)\,dx = \mathbf{k}^T(v(b))\mathbf{P}(b)\mathbf{k}(u(b)) - \mathbf{k}^T(v(a))\mathbf{P}(a)\mathbf{k}(u(a)) = \big[\,\mathbf{k}^T(v(a)) \ \ \mathbf{k}^T(v(b))\,\big]\begin{bmatrix} -\mathbf{P}(a) & \mathbf{0} \\ \mathbf{0} & \mathbf{P}(b) \end{bmatrix}\begin{bmatrix} \mathbf{k}(u(a)) \\ \mathbf{k}(u(b)) \end{bmatrix} \tag{18.47} \]
Equation (18.47) is called the Green’s formula. Now, consider the homogeneous two-
point BVP,
Lu = 0 (18.48)
Wa k(u(a)) + Wb k(u(b)) = 0 (18.49)
We want to impose boundary conditions on the function v of the adjoint problem such
that when these conditions and the adjoint equation
L∗ v = 0 (18.50)
are satisfied, the right-hand side of Green’s formula is zero, i. e., we want to find a set
of boundary conditions (called the adjoint BCs) such that
\[ \big[\,\mathbf{k}^T(v(a)) \ \ \mathbf{k}^T(v(b))\,\big]\begin{bmatrix} -\mathbf{P}(a) & \mathbf{0} \\ \mathbf{0} & \mathbf{P}(b) \end{bmatrix}\begin{bmatrix} \mathbf{k}(u(a)) \\ \mathbf{k}(u(b)) \end{bmatrix} = 0 \tag{18.51} \]
whenever u satisfies the boundary conditions (18.49), i.e.,
\[ \big[\,\mathbf{W}_a \ \ \mathbf{W}_b\,\big]\begin{bmatrix} \mathbf{k}(u(a)) \\ \mathbf{k}(u(b)) \end{bmatrix} = \mathbf{0} \ \Rightarrow\ \mathbf{W}\begin{bmatrix} \mathbf{k}(u(a)) \\ \mathbf{k}(u(b)) \end{bmatrix} = \mathbf{0}, \tag{18.52} \]
i. e., the boundary vector is orthogonal to the rows of W. Thus, we can satisfy (18.51)
if we require the vector
\[ \big[\,\mathbf{k}^T(v(a)) \ \ \mathbf{k}^T(v(b))\,\big]\begin{bmatrix} -\mathbf{P}(a) & \mathbf{0} \\ \mathbf{0} & \mathbf{P}(b) \end{bmatrix} \]
to be a linear combination of the rows of W, i.e.,
\[ \big[\,\mathbf{k}^T(v(a)) \ \ \mathbf{k}^T(v(b))\,\big]\begin{bmatrix} -\mathbf{P}(a) & \mathbf{0} \\ \mathbf{0} & \mathbf{P}(b) \end{bmatrix} = \mathbf{a}^T\mathbf{W} \tag{18.53} \]
\[ = \big(\,a_1 \ \ a_2 \ \ \ldots \ \ a_n\,\big)\begin{pmatrix} \mathbf{w}_1^T \\ \mathbf{w}_2^T \\ \vdots \\ \mathbf{w}_n^T \end{pmatrix} \tag{18.54} \]
where a is any vector in ℝⁿ. These give the adjoint boundary conditions. Taking the transpose of equation (18.54), we get
\[ \begin{bmatrix} -\mathbf{P}^T(a) & \mathbf{0} \\ \mathbf{0} & \mathbf{P}^T(b) \end{bmatrix}\begin{bmatrix} \mathbf{k}(v(a)) \\ \mathbf{k}(v(b)) \end{bmatrix} = \mathbf{W}^T\mathbf{a}. \tag{18.55} \]
Equation (18.55) defines a set of 2n relations in k(v(a)) and k(v(b)). However, these relations contain the n unknown constants a. By eliminating these constants, we obtain a set of n relations in terms of k(v(a)) and k(v(b)), which are the adjoint boundary conditions. As an example, consider
Lu ≡ u′′ − 3u′ + 2u = 0
u(0) − u′ (0) = 0, u(1) − u′ (1) = 0
L∗ v = v′′ + 3v′ + 2v = 0.
The boundary conditions may be written as
\[ \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix}\begin{bmatrix} u(0) \\ u'(0) \\ u(1) \\ u'(1) \end{bmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]
∴
18.3 The adjoint boundary value problem | 437
\[ \mathbf{W} = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix} \]
and
\[ \mathbf{P} = \begin{pmatrix} -3 & 1 \\ -1 & 0 \end{pmatrix}. \]
Equation (18.55) then gives
\[ \begin{bmatrix} 3 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & -3 & -1 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} v(0) \\ v'(0) \\ v(1) \\ v'(1) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} a_1 \\ -a_1 \\ a_2 \\ -a_2 \end{bmatrix} \]
⇒
\[ \begin{bmatrix} 3v(0) + v'(0) \\ -v(0) \\ -3v(1) - v'(1) \\ v(1) \end{bmatrix} = \begin{bmatrix} a_1 \\ -a_1 \\ a_2 \\ -a_2 \end{bmatrix} \]
Adding rows (1) and (2), and rows (3) and (4), eliminates the constants a1 and a2:
\[ 2v(0) + v'(0) = 0, \qquad -2v(1) - v'(1) = 0. \]
Thus, the original BVP and the adjoint BVP {L∗v = v″ + 3v′ + 2v = 0 with 2v(0) + v′(0) = 0 and 2v(1) + v′(1) = 0}
are adjoint to each other. It may be verified that ϕ1 (x) = e−2x is a basis for the solution
space of the adjoint problem.
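Adjointness can be verified directly: the bilinear concomitant must be constant (here, identically zero) when u solves the original BVP and v the adjoint one. A Python sketch (u = eˣ, which satisfies Lu = 0 and the stated BCs, and v = e⁻²ˣ; the function name is ours):

```python
import math

def concomitant(x):
    # pi(u, v) = v*p0*u' - u*(v*p0)' + v*p1*u with p0 = 1, p1 = -3
    u, up = math.exp(x), math.exp(x)                   # u = e^x, u' = e^x
    v, vp = math.exp(-2 * x), -2 * math.exp(-2 * x)    # v = e^(-2x)
    return v * up - u * vp - 3 * u * v

# pi(u, v) should be constant on [0, 1]; here it is identically zero
vals = [concomitant(0.1 * k) for k in range(11)]
```

The vanishing concomitant at every point confirms that the boundary terms in Green's formula drop out for this adjoint pair.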
Remark. While the above general formalism is useful for higher order BVPs, for the
case of second- and fourth- order BVPs where the concomitant is a simple expression,
we can determine the adjoint BCs by simply using the relation π(u, v)|_{x=a}^{x=b} = 0. The adjoint BCs are obtained by substituting the original BCs and setting the quantities in the brackets to zero.
Example. Consider
\[ Lu = \frac{d^4u}{dx^4} = 0; \qquad u(0) = u''(0) = 0, \quad u(1) = u''(1) = 0. \]
Following the above procedure, it may be verified that this BVP is self-adjoint, i. e., the
adjoint operator and BCs are same.
In general, if the adjoint BCs are expressed as
\[ \mathbf{Q}\begin{bmatrix} \mathbf{k}(v(a)) \\ \mathbf{k}(v(b)) \end{bmatrix} = \mathbf{0}, \tag{18.57} \]
then the two problems
\[ Lu = 0, \qquad \mathbf{W}\begin{bmatrix} \mathbf{k}(u(a)) \\ \mathbf{k}(u(b)) \end{bmatrix} = \mathbf{0} \tag{18.58, 18.59} \]
and
\[ L^*v = 0, \qquad \mathbf{Q}\begin{bmatrix} \mathbf{k}(v(a)) \\ \mathbf{k}(v(b)) \end{bmatrix} = \mathbf{0} \tag{18.60, 18.61} \]
are adjoint to each other. We have seen that the set of solutions to equations (18.58)–
(18.59) forms the vector space, which is a subspace of C n [a, b]. The set of solutions to
equations (18.60)–(18.61) also form a subspace of C n [a, b]. If these two subspaces are
identical, then we say that the BVP is self-adjoint. This is so if L = L∗ and Q = W or
Q = CW, where C is a nonsingular matrix. In terms of the coefficient and concomitant
matrices, the condition for self-adjointness of the BVP may be expressed as
\[ \mathbf{Q}\begin{bmatrix} \mathbf{P}^{-1}(a) & \mathbf{0} \\ \mathbf{0} & -\mathbf{P}^{-1}(b) \end{bmatrix}^T\mathbf{W}^T = \mathbf{0}. \tag{18.62} \]
Further, it may be shown that the index of compatibility of the adjoint system is
the same as that of the original system. If we start with the Lagrange identity
d T
vLu − uL∗ v = [k (v(x))P(x)k(u(x))]
dx
and assume that u satisfies
Lu = 0
and v satisfies
L∗ v = 0
⇒
\[ \frac{d}{dx}\big[\mathbf{k}^T(v(x))\mathbf{P}(x)\mathbf{k}(u(x))\big] = 0 \ \Rightarrow\ \mathbf{k}^T(v(x))\mathbf{P}(x)\mathbf{k}(u(x)) = \text{constant} \tag{18.63} \]
Let u(x) = uᵀc and v(x) = vᵀd, where u and v are vectors of basis solutions of Lu = 0 and L∗v = 0, respectively.
⇒
\[ \mathbf{d}^T\mathbf{K}^T(v(x))\mathbf{P}(x)\mathbf{K}(u(x))\mathbf{c} = \text{constant} \]
Since c and d are arbitrary, the basis functions may be normalized such that
\[ \mathbf{K}^T(v(x))\mathbf{P}(x)\mathbf{K}(u(x)) = \mathbf{I}. \tag{18.64} \]
Now, to determine the index of compatibility of the adjoint system, we use the adjoint BCs,
\[ -\mathbf{k}^T(v(a))\mathbf{P}(a) = \mathbf{a}^T\mathbf{W}_a, \qquad \mathbf{k}^T(v(b))\mathbf{P}(b) = \mathbf{a}^T\mathbf{W}_b, \]
with v = cᵀv for some c ∈ ℝⁿ. Multiply the first of these by K(u(a)) and the second by K(u(b)) and use equation (18.64):
⇒
\[ -\mathbf{c}^T = \mathbf{a}^T\mathbf{W}_a\mathbf{K}(u(a)), \qquad \mathbf{c}^T = \mathbf{a}^T\mathbf{W}_b\mathbf{K}(u(b)) \]
Adding these two relations gives
\[ \mathbf{0} = \mathbf{a}^T\big[\mathbf{W}_a\mathbf{K}(u(a)) + \mathbf{W}_b\mathbf{K}(u(b))\big] = \mathbf{a}^T\mathbf{D}, \]
or
\[ \mathbf{D}^T\mathbf{a} = \mathbf{0}. \tag{18.67} \]
If rank D = r, then there are (n − r) linearly independent solutions aj of equation (18.67). From each of these solutions, we get a solution of the adjoint problem, so the index of compatibility of the adjoint system is also (n − r).
Problems
1. (a) Show that the differential operator
\[ Lu = -\frac{1}{w(x)}\big[(p(x)u')' + q(x)u\big], \quad a < x < b, \]
is formally self-adjoint with respect to the inner product
\[ \langle u, v\rangle = \int_a^b w(x)u(x)v(x)\,dx. \]
(b) Show that any formally self-adjoint operator of order 2m with respect to the usual inner product may be written in the form
\[ Lu = \frac{d^m}{dx^m}\big\{q_0(x)u^{[m]}\big\} + \frac{d^{m-1}}{dx^{m-1}}\big\{q_1(x)u^{[m-1]}\big\} + \cdots + q_m(x)u. \]
2. (a) Find the index of compatibility and the solution space of each of the following
boundary value problems:
(i) u′′′ = 0, 0 < x < 1; u(0) = 0, u′ (1) = 0, u′ (0) − 2u(1) = 0 and (ii) u′′ − 3u′ +
2u = 0, 0 < x < 1; u(0) − u(1) = 0, u′ (0) − u′ (1) = 0
(b) Determine the values of the parameter λ for which the following boundary
value problems are compatible:
(i)
(ii)
u[4] − λ4 u = 0, −1 < x < 1; u(−1) = u(1) = u′′ (−1) = u′′ (1) = 0
3. Given the fourth-order operator Lu = \(\frac{d^4u}{dx^4}\), and boundary conditions as follows:
(a) u(0) = 0, u′′ (0) = 0, u(1) = 0, u′′ (1) = 0
(b) u(0) = 0, u′ (0) = 0, u(1) = 0, u′ (1) = 0
(c) u(0) = 0, u′ (0) = 0, u′′ (1) = 0, u′′′ (1) = 0
(d) u(0) = 0, u′′′ (0) = 0, u(1) = 0, u′ (1) = 0
(e) u(0) = 0, u′′′ (0) = 0, u(1) = 0, u′′ (1) = 0
Determine the adjoint boundary conditions.
4. The following boundary value problem is known as the Orr–Sommerfeld equation
and arises in the stability analysis of parallel shear flows:
Here, k is the wave number (k > 0), Re is the Reynolds number (Re > 0) and c is a
complex number (which is the dimensionless wave speed). Show that the adjoint
system is given by
5. Let V = C[a, b], the vector space of complex valued continuous functions defined
over the real interval [a, b] and T be the linear operator on V defined by
\[ (Tu)(s) = \int_a^b K(s,x)u(x)\,dx, \]
where u ∈ C[a, b] and K(s, x) is continuous in [a, b] × [a, b]. Determine the adjoint operator with respect to the usual inner product.
6. (a) Given the linear operator
\[ Lu = \frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2} + \lambda u, \quad 0 < x < a, \ 0 < y < b \]
\[ u(0,y) = 0, \quad \frac{\partial u}{\partial x}(a,y) = 0, \qquad u(x,0) = 0, \quad \frac{\partial u}{\partial y}(x,b) + \alpha u(x,b) = 0 \]
\[ Lu = \frac{\partial^2u}{\partial x^2} - \frac{\partial u}{\partial t}, \quad 0 < x < 1, \ t > 0 \]
\[ \frac{\partial u}{\partial x} - \alpha u = 0 \ \text{at} \ x = 0; \qquad u = 0 \ \text{at} \ x = 1; \qquad u = 0 \ \text{at} \ t = 0 \]
where the coefficients aij , bi and c are real analytic functions of real variables
x1 , . . . , xn and where the matrix of elements aij is symmetric. (a) Determine the
adjoint operator and the form of the Lagrange identity (b) Use the divergence the-
orem to obtain a formula analogous to Green’s formula.
9. The steady-state conversion (u) in a radial flow reactor (see Figure 18.3) is given
by the boundary value problem
\[ \frac{1}{\text{Pe}}\frac{1}{r}\frac{d}{dr}\Big(r\frac{du}{dr}\Big) - \frac{r_0}{r}\frac{du}{dr} - \text{Da}\,u = -\text{Da}, \quad r_0 < r < 1 \]
\[ \frac{du}{dr}(1) = 0; \qquad \frac{1}{\text{Pe}}\frac{du}{dr}(r_0) - u(r_0) = 0 \]
where Pe is the Peclet number, Da is the Damköhler number and r0 is the dimen-
sionless inner radius.
\[ \langle u, v\rangle = \int_{r_0}^{1} r\,u\,v\,dr \]
10. Let V be the vector space of complex valued functions u(x) defined on the interval
(0, a) satisfying the periodicity condition u(0) = u(a). Let L be a linear operator on
V defined by
\[ Lu(x) = -i\frac{du}{dx}; \qquad i = \sqrt{-1}. \]
Show that this operator is self-adjoint with respect to the usual inner product on V, i.e.,
\[ \langle Lu, v\rangle = \langle u, Lv\rangle, \qquad \langle u, v\rangle = \int_0^a u(x)\overline{v(x)}\,dx. \]
Suppose that we can express the solution of equations (19.1) and (19.2) as
\[ u(x) = \int_a^b G(x,\xi)f(\xi)\,d\xi, \]
where G(x, ξ) is called the Green's function of the linear operator L with homogeneous BCs.
Remark. Some authors define the Green’s function with a positive sign on the RHS of
equation (19.1). However, we shall use the definition above to be consistent with the en-
gineering literature. With this notation, positive (negative) values of f (x) correspond
to source (sink).
Example. Consider
\[ \frac{d^2u}{dx^2} = -f(x), \quad 0 < x < 1; \qquad u'(0) = 0, \ u(1) = 0. \]
Integrating once from 0 to x and using the first boundary condition, we get
\[ \frac{du}{dx} = -\int_0^x f(\eta)\,d\eta \]
Integrating again,
https://doi.org/10.1515/9783110739701-020
19 The nonhomogeneous BVP and Green's function
\[ u(x) = u(0) - \int_0^x\int_0^\xi f(\eta)\,d\eta\,d\xi \]
⇒ (using u(1) = 0 to fix u(0))
\[ u(x) = \int_x^1\int_0^\xi f(\eta)\,d\eta\,d\xi = \int_x^1\int_0^x f(\eta)\,d\eta\,d\xi + \int_x^1\int_x^\xi f(\eta)\,d\eta\,d\xi \]
To evaluate the second double integral, we change the order of integration. Figure 19.1 shows the schematic of the domain and the change in the order of integration. Thus,
\[ u(x) = (1-x)\int_0^x f(\eta)\,d\eta + \int_x^1 (1-\eta)f(\eta)\,d\eta = \int_0^1 G(x,\eta)f(\eta)\,d\eta \]
where
\[ G(x,\eta) = \begin{cases} (1-x), & 0 \le \eta \le x \\ (1-\eta), & x < \eta \le 1 \end{cases} \]
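This representation is easy to verify numerically: with f ≡ 1 the exact solution of u″ = −1, u′(0) = 0, u(1) = 0 is u(x) = (1 − x²)/2. A Python sketch (midpoint quadrature; all helper names are ours):

```python
def G(x, eta):
    # Green's function derived above for u'(0) = 0, u(1) = 0
    return (1.0 - x) if eta <= x else (1.0 - eta)

def u_from_G(x, f, n=2000):
    # u(x) = integral_0^1 G(x, eta) f(eta) d(eta), midpoint rule
    h = 1.0 / n
    return h * sum(G(x, (i + 0.5) * h) * f((i + 0.5) * h) for i in range(n))

u_half = u_from_G(0.5, lambda eta: 1.0)   # exact value (1 - 0.25)/2 = 0.375
```

Since G is piecewise linear in η, the midpoint rule is essentially exact here; for general f the quadrature error decays as O(1/n²).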
Consider the general second-order self-adjoint operator
\[ Lu \equiv \frac{d}{dx}\Big(p(x)\frac{du}{dx}\Big) - q(x)u = -f(x), \quad a < x < b, \tag{19.5} \]
with one of the following sets of homogeneous boundary conditions:
\[ \alpha_1u(a) + \alpha_2u'(a) = 0, \quad \beta_1u(b) + \beta_2u'(b) = 0 \qquad \text{(Robin/mixed/radiation)} \tag{19.6} \]
\[ u(a) = 0, \quad u(b) = 0 \qquad \text{(Dirichlet)} \tag{19.7} \]
\[ u(a)p(a) = u(b)p(b), \quad u'(a) = u'(b) \qquad \text{(Periodic)} \tag{19.8} \]
It may be easily verified that the TPBVP defined by equation (19.5) with one of the above sets of BCs (19.6)–(19.8) is self-adjoint. We want to write the solution of equation (19.5) with Robin/Dirichlet/periodic BCs in the form
\[ u(x) = \int_a^b G(x,s)f(s)\,ds, \]
where G(x, s) is the Green's function. The method for developing Green's functions is based on the method of variation of parameters. Suppose that u1(x) and u2(x) are two linearly independent solutions of the homogeneous equation
\[ Lu = (pu')' - qu = 0. \]
Assume that u1(x) satisfies the boundary condition at x = a and u2(x) satisfies the boundary condition at x = b. Seek a solution of the form u = c1(x)u1 + c2(x)u2, subject to the condition
\[ c_1'u_1 + c_2'u_2 = 0 \]
⇒
\[ u' = c_1u_1' + c_2u_2' \]
⇒
\[ (pu')' = c_1(pu_1')' + c_2(pu_2')' + c_1'pu_1' + c_2'pu_2' \]
and
\[ qu = qc_1u_1 + qc_2u_2 \]
⇒ substituting into Lu = −f(x) and using Lu1 = Lu2 = 0,
19.2 Green’s function for second-order self-adjoint TPBVP | 449
\[ c_1'u_1' + c_2'u_2' = -\frac{f}{p} \tag{19.13} \]
In matrix form,
\[ \begin{bmatrix} u_1 & u_2 \\ u_1' & u_2' \end{bmatrix}\begin{bmatrix} c_1' \\ c_2' \end{bmatrix} = \begin{bmatrix} 0 \\ -f/p \end{bmatrix} = -\frac{f}{p}\mathbf{e}_2 \]
⇒
\[ \mathbf{K}(u(x))\,\mathbf{c}' = -\frac{f}{p}\mathbf{e}_2, \qquad \mathbf{c}' = -\mathbf{K}^{-1}(u(x))\mathbf{e}_2\,\frac{f}{p} \]
⇒
\[ \mathbf{c} = \mathbf{c}_0 - \int_a^x \mathbf{K}^{-1}(u(s))\mathbf{e}_2\,\frac{f(s)}{p(s)}\,ds \tag{19.14} \]
Differentiating u = c1u1 + c2u2 then gives
\[ u'(x) = \big[\,u_1'(x)\ \ u_2'(x)\,\big]\mathbf{c} - \big[\,u_1(x)\ \ u_2(x)\,\big]\mathbf{K}^{-1}(u(x))\mathbf{e}_2\,\frac{f(x)}{p(x)} \tag{19.15} \]
The last term in equation (19.15) is identically zero, as can be seen from the following analysis:
\[ \mathbf{K}(u(x)) = \begin{bmatrix} u_1 & u_2 \\ u_1' & u_2' \end{bmatrix}, \qquad \mathbf{K}^{-1}(u(x)) = \frac{1}{W}\begin{bmatrix} u_2' & -u_2 \\ -u_1' & u_1 \end{bmatrix} \]
\[ \mathbf{u}^T\mathbf{K}^{-1}(u(x)) = \frac{1}{W}\big[\,u_1 \ \ u_2\,\big]\begin{bmatrix} u_2' & -u_2 \\ -u_1' & u_1 \end{bmatrix} = \frac{1}{W}\big[\,u_1u_2' - u_2u_1' \quad -u_1u_2 + u_1u_2\,\big] = \big[\,1 \ \ 0\,\big] \]
\[ \mathbf{u}^T\mathbf{K}^{-1}(u)\mathbf{e}_2\,\frac{f(x)}{p(x)} = \big[\,1 \ \ 0\,\big]\begin{bmatrix} 0 \\ 1 \end{bmatrix}\frac{f(x)}{p(x)} = 0 \]
⇒
\[ u'(x) = \big[\,u_1' \ \ u_2'\,\big]\begin{bmatrix} c_{10} \\ c_{20} \end{bmatrix} - \big[\,u_1' \ \ u_2'\,\big]\int_a^x\mathbf{K}^{-1}(u(s))\mathbf{e}_2\,\frac{f(s)}{p(s)}\,ds \tag{19.16} \]
Thus,
\[ u(a) = \big[\,u_1(a) \ \ u_2(a)\,\big]\begin{bmatrix} c_{10} \\ c_{20} \end{bmatrix}, \qquad u'(a) = \big[\,u_1'(a) \ \ u_2'(a)\,\big]\begin{bmatrix} c_{10} \\ c_{20} \end{bmatrix}, \]
and
\[ \alpha_1u(a) + \alpha_2u'(a) = c_{10}\big[\alpha_1u_1(a) + \alpha_2u_1'(a)\big] + c_{20}\big[\alpha_1u_2(a) + \alpha_2u_2'(a)\big] = c_{20}\big[\alpha_1u_2(a) + \alpha_2u_2'(a)\big] = 0 \]
⇒
\[ c_{20} = 0 \]
Similarly,
\[ u(b) = \mathbf{u}^T(b)\mathbf{c}_0 - \mathbf{u}^T(b)\int_a^b\mathbf{K}^{-1}(u(s))\mathbf{e}_2\,\frac{f(s)}{p(s)}\,ds \]
\[ u'(b) = \mathbf{u}'^T(b)\mathbf{c}_0 - \mathbf{u}'^T(b)\int_a^b\mathbf{K}^{-1}(u(s))\mathbf{e}_2\,\frac{f(s)}{p(s)}\,ds \]
and β1u(b) + β2u'(b) = 0 ⇒
\[ c_{10}\big[\beta_1u_1(b) + \beta_2u_1'(b)\big] = \big[\,\beta_1u_1(b) + \beta_2u_1'(b) \quad \beta_1u_2(b) + \beta_2u_2'(b)\,\big]\int_a^b\mathbf{K}^{-1}\mathbf{e}_2\,\frac{f(s)}{p(s)}\,ds \]
⇒
\[ c_{10} = \big[\,1 \ \ 0\,\big]\int_a^b\mathbf{K}^{-1}(u(s))\mathbf{e}_2\,\frac{f}{p}\,ds \]
∴
b x
f (s) f
u(x) = [u1 (x) 0] ∫ K−1 (u(s))e2 ds − [ u1 u2 ] ∫ K−1 (u(s))e2 ds
p(s) p
a a
x b x x
Now,

K⁻¹(u(s))e2 = (1/W(s))[−u2(s); u1(s)]

∴
u(x) = ∫_x^b [−u1(x)u2(s)f(s)/(p(s)W(s))] ds + ∫_a^x [−u2(x)u1(s)f(s)/(p(s)W(s))] ds
     = ∫_a^b G(x, s)f(s) ds

where

           { −u2(x)u1(s)/(p(s)W(s)),   a ≤ s ≤ x
G(x, s) = {                                          (19.17)
           { −u1(x)u2(s)/(p(s)W(s)),   x ≤ s ≤ b
is the Green’s function. To simplify the expression for the Green’s function further, we
make the following observation:
d/dx {p(x)W(x)} = 0

To prove this, write

d/dx {p(x)(u1u2′ − u2u1′)} = u1(pu2′)′ + u1′pu2′ − u2(pu1′)′ − u2′pu1′
                           = u1[qu2] − u2[qu1] = 0

(using (pui′)′ = qui for i = 1, 2). Therefore, p(x)W(x) = constant. We may choose u1 and u2
such that the constant is equal to minus one (−1). Then

           { u2(x)u1(s),   a ≤ s ≤ x
G(x, s) = {
           { u1(x)u2(s),   x ≤ s ≤ b
Example. Consider

d²u/dx² = −f(x),   u′(0) = 0,   u(1) = 0

Here, u1 = 1 and u2 = 1 − x satisfy the BCs at x = 0 and x = 1, respectively, and
p(x)W(x) = −1
⇒
           { 1 − x,   0 < s < x
G(x, s) = {
           { 1 − s,   x < s < 1

Example. Consider

d²u/dx² = −f(x)
u(0) = 0,   u(1) = 0

Here, u1 = x, u2 = 1 − x, p(x)W(x) = −1
⇒
           { s(1 − x),   0 < s < x
G(x, s) = {
           { x(1 − s),   x < s < 1
Example. Consider

Lu ≡ d/dx (x du/dx) − (n²/x)u = −f(x),   n a positive integer,
u′(0) = 0,   u(1) = 0

Then

u1(x) = xⁿ,   u2(x) = x⁻ⁿ − xⁿ

are linearly independent solutions satisfying the boundary conditions at the ends:

W(x) = det[xⁿ   x⁻ⁿ − xⁿ; nxⁿ⁻¹   −nx⁻ⁿ⁻¹ − nxⁿ⁻¹] = −2n/x

p(x)W(x) = x(−2n/x) = −2n

           { sⁿ(x⁻ⁿ − xⁿ)/2n,   0 < s < x
G(x, s) = {
           { xⁿ(s⁻ⁿ − sⁿ)/2n,   x < s < 1

           { [(s/x)ⁿ − (sx)ⁿ]/2n,   0 < s < x
        = {
           { [(x/s)ⁿ − (sx)ⁿ]/2n,   x < s < 1
In general, for the second-order self-adjoint operator

Lu ≡ d/dx (p(x) du/dx) − q(x)u = −f(x),   a < x < b    (19.19)

the Green's function is

           { −u1(s)u2(x)/(p(s)W(s)),   a < s < x
G(x, s) = {                                          (19.20)
           { −u1(x)u2(s)/(p(s)W(s)),   x < s < b

For fixed s, G satisfies the homogeneous equation Lu = 0 except at x = s, i. e.,

d/dx (p(x) dG/dx) − q(x)G = 0    (19.22)

Writing p(s)W(s) = C (a constant),

           { −u1(x)u2(s)/C,   a < x < s
G(x, s) = {
           { −u2(x)u1(s)/C,   s < x < b
𝜕G/𝜕x |_(x=s⁺) = −u1(s)u2′(x)/C |_(x=s)   (x approaching s from the RHS)
              = −u1(s)u2′(s)/C

𝜕G/𝜕x |_(x=s⁻) = −u1′(x)u2(s)/C |_(x=s)   (x approaching s from the LHS)
              = −u1′(s)u2(s)/C

∴

19.3 Properties of the Green's function for the second-order self-adjoint BVP | 455

𝜕G/𝜕x |_(x=s⁺) − 𝜕G/𝜕x |_(x=s⁻) = jump at x = s
                                = (u1′u2 − u1u2′)/C = −W(s)/(p(s)W(s))

i. e.,

(𝜕G/𝜕x)(s⁺, s) − (𝜕G/𝜕x)(s⁻, s) = −1/p(s)    (19.23)
and, regarded as a function of x for fixed s, G satisfies BC1 at x = a (through u1) and
BC2 at x = b (through u2). We now verify that

u(x) = ∫_a^b G(x, s)f(s) ds    (19.29)

satisfies Lu = −f together with the BCs

α1u(a) + α2u′(a) = 0
β1u(b) + β2u′(b) = 0

Proof. Writing u(x) = ∫_a^x G(x, s)f(s) ds + ∫_x^b G(x, s)f(s) ds and differentiating
(Leibniz rule) ⇒

u′(x) = ∫_a^x (𝜕G(x, s)/𝜕x) f(s) ds + ∫_x^b (𝜕G(x, s)/𝜕x) f(s) ds
        + G(x, x⁻)f(x⁻) − G(x, x⁺)f(x⁺)    (19.30)
Since G and f are continuous, the last two terms cancel and we get
u′(x) = ∫_a^x (𝜕G(x, s)/𝜕x) f(s) ds + ∫_x^b (𝜕G(x, s)/𝜕x) f(s) ds    (19.31)
⇒
α1u(a) + α2u′(a) = ∫_a^b [α1G(a, s) + α2 (𝜕G/𝜕x)(a, s)] f(s) ds = 0
since G satisfies the BC at x = a; similarly, β1u(b) + β2u′(b) = 0 since G satisfies the
BC at x = b. Thus, the solution given by equation (19.29) satisfies the BCs.
Now, equation (19.31) ⇒

(pu′)′ = 𝜕/𝜕x [p(x) ∫_a^x (𝜕G/𝜕x)(x, s)f(s) ds + p(x) ∫_x^b (𝜕G/𝜕x)(x, s)f(s) ds]
       = ∫_a^x (pG′)′f(s) ds + ∫_x^b (pG′)′f(s) ds + p(x)f(x)[(𝜕G/𝜕x)(x, x⁻) − (𝜕G/𝜕x)(x, x⁺)]
⇒
(pu′)′ − qu = ∫_a^x [(pG′)′ − qG]f(s) ds + ∫_x^b [(pG′)′ − qG]f(s) ds
              + p(x)f(x)[(𝜕G/𝜕x)(x, x⁻) − (𝜕G/𝜕x)(x, x⁺)]
            = 0 + 0 + p(x)f(x)[−1/p(x)]
            = −f(x)
Proof. Suppose that there are two Green's functions, say G1(x, s) and G2(x, s), and form
Ḡ = G1 − G2. Then Ḡ satisfies the homogeneous equation and the homogeneous BCs for every
fixed s and has no jump in its derivative at x = s; hence Ḡ ≡ 0, i. e., the Green's
function is unique.

Likewise, the solution of the BVP is unique. The proof is similar to that of (7). Suppose
that there are two solutions û1(x) and û2(x). Then both are given by ∫_a^b G(x, s)f(s) ds,
so that û1(x) = û2(x).
We now extend the construction to the n-th order TPBVP. Let ψ1(x), . . . , ψn(x) be a
fundamental set of solutions of Lu = 0, with ψ(x)ᵀ = [ψ1 . . . ψn]. Proceeding as before
(variation of parameters), the general solution of Lu = −f may be written as

u(x) = ψ(x)ᵀc + e1ᵀK(ψ(x)) ∫_a^x K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds    (19.36)
     = uh + up
uh = ψ(x)ᵀc = c1ψ1 + c2ψ2 + ⋅ ⋅ ⋅ + cnψn
uh′ = c1ψ1′ + c2ψ2′ + ⋅ ⋅ ⋅ + cnψn′
⋮
uh^[n−1] = c1ψ1^[n−1] + c2ψ2^[n−1] + ⋅ ⋅ ⋅ + cnψn^[n−1]
Similarly,

up′′ = e3ᵀK(ψ(x)) ∫_a^x K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds
⋮
up^[n−1] = enᵀK(ψ(x)) ∫_a^x K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds
⇒
k(up(x)) = K(ψ(x)) ∫_a^x K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds    (19.38)
Substituting into the homogeneous BCs gives

Wa K(ψ(a))c + Wb [K(ψ(b))c + K(ψ(b)) ∫_a^b K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds] = 0
⇒
Dc = −Wb K(ψ(b)) ∫_a^b K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds,   D ≡ Wa K(ψ(a)) + Wb K(ψ(b))

c = −D⁻¹Wb K(ψ(b)) ∫_a^b K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds
⇒
u(x) = −ψ(x)ᵀD⁻¹Wb K(ψ(b)) ∫_a^b K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds
       + ψ(x)ᵀ ∫_a^x K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds    (19.39)

Insert the identity I = D⁻¹[Wa K(ψ(a)) + Wb K(ψ(b))] in the second term of equation
(19.39) before the integral sign. Also, split the first term of equation (19.39) into two
terms (∫_a^b = ∫_a^x + ∫_x^b).
19.4 Green's function for the n-th order TPBVP | 461

⇒
u(x) = −ψ(x)ᵀD⁻¹Wb K(ψ(b)) (∫_a^x + ∫_x^b) K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds
       + ψ(x)ᵀD⁻¹[Wa K(ψ(a)) + Wb K(ψ(b))] ∫_a^x K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds
⇒
u(x) = −ψ(x)ᵀD⁻¹Wb K(ψ(b)) ∫_a^x K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds
       − ψ(x)ᵀD⁻¹Wb K(ψ(b)) ∫_x^b K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds
       + ψ(x)ᵀD⁻¹Wa K(ψ(a)) ∫_a^x K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds
       + ψ(x)ᵀD⁻¹Wb K(ψ(b)) ∫_a^x K⁻¹(ψ(s)) ⋅ en [−f(s)/p0(s)] ds

The first and fourth terms cancel, leaving u(x) = ∫_a^b G(x, s)f(s) ds, where

           { −e1ᵀK(ψ(x))D⁻¹Wa K(ψ(a))K⁻¹(ψ(s)) en/p0(s),   a < s < x
G(x, s) = {                                                             (19.41)
           { e1ᵀK(ψ(x))D⁻¹Wb K(ψ(b))K⁻¹(ψ(s)) en/p0(s),    x < s < b
Example. Consider L = d²/dx²;  u(0) = 0, u(1) = 0
⇒
W = [Wa  Wb] = [1 0 0 0; 0 0 1 0],   i. e.,   Wa = [1 0; 0 0],   Wb = [0 0; 1 0]

u1(x) = x and u2(x) = 1 − x

K(u(x)) = [x  1−x; 1  −1]
⇒
K⁻¹(u(x)) = [1  1−x; 1  −x]

K(u(0)) = [0 1; 1 −1]   and   K(u(1)) = [1 0; 1 −1]

D = [1 0; 0 0][0 1; 1 −1] + [0 0; 1 0][1 0; 1 −1]
  = [0 1; 0 0] + [0 0; 1 0]
  = [0 1; 1 0],
⇒
D⁻¹ = [0 1; 1 0]

     { −[x  1−x][0 1; 1 0][1 0; 0 0][0 1; 1 −1][1  1−s; 1  −s][0; 1],   0 < s < x
G = {
     { [x  1−x][0 1; 1 0][0 0; 1 0][1 0; 1 −1][1  1−s; 1  −s][0; 1],    x < s < 1

     { s(1 − x),   0 < s < x
  = {
     { x(1 − s),   x < s < 1

which agrees with the result obtained earlier for this problem.
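The matrix construction above is easy to check numerically. Below is a short Python/NumPy sketch (illustrative only; the book's companion code is in Mathematica) that evaluates formula (19.41) for L = d²/dx² with u(0) = u(1) = 0 and u1 = x, u2 = 1 − x; it should reproduce G(x, s) = s(1 − x) for s < x and the symmetric value for s > x.

```python
import numpy as np

# Green's function from formula (19.41) for u'' = -f, u(0) = u(1) = 0,
# with fundamental solutions u1 = x, u2 = 1 - x (n = 2, p0 = 1).
def K(x):
    # Wronskian matrix: rows are (u1, u2) and (u1', u2')
    return np.array([[x, 1.0 - x],
                     [1.0, -1.0]])

Wa = np.array([[1.0, 0.0], [0.0, 0.0]])   # enforces u(0) = 0
Wb = np.array([[0.0, 0.0], [1.0, 0.0]])   # enforces u(1) = 0
D = Wa @ K(0.0) + Wb @ K(1.0)             # characteristic matrix
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

def G(x, s):
    if s < x:   # a < s < x branch of (19.41)
        M = Wa @ K(0.0)
        return -e1 @ K(x) @ np.linalg.solve(D, M @ np.linalg.solve(K(s), e2))
    M = Wb @ K(1.0)
    return e1 @ K(x) @ np.linalg.solve(D, M @ np.linalg.solve(K(s), e2))

print(G(0.7, 0.3), G(0.3, 0.7))   # both ≈ 0.09 = s(1-x), and symmetric
```

Note that G(x, s) = G(s, x), as expected for a self-adjoint problem.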
Example 19.6. Green's function for the second-order operator with mixed BCs

u′′ = −f(x)
u(0) + u(1) = 0,   u′(0) + u′(1) = 0
⇒
Wa = [1 0; 0 1],   Wb = [1 0; 0 1]

uᵀ(x) = [1  x]

K(u(x)) = [1  x; 0  1]
⇒
K⁻¹(u(s)) = [1  −s; 0  1]

D = Wa K(u(0)) + Wb K(u(1))
  = [1 0; 0 1][1 0; 0 1] + [1 0; 0 1][1 1; 0 1]
  = [2 1; 0 2]

D⁻¹ = [1/2  −1/4; 0  1/2]

           { −[1  x][1/2  −1/4; 0  1/2][1 0; 0 1][1 0; 0 1][1  −s; 0  1][0; 1],   0 < s < x
G(x, s) = {
           { [1  x][1/2  −1/4; 0  1/2][1 0; 0 1][1 1; 0 1][1  −s; 0  1][0; 1],    x < s < 1

           { 1/4 − (x − s)/2,   0 < s < x
        = {
           { 1/4 − (s − x)/2,   x < s < 1
The Green’s function is symmetric. It is easily verified that the given problem is self-
adjoint even though the BCs are mixed.
Example. Consider the third-order operator with mixed BCs:

u′′′ = −f(x)
u(0) = 0,   u(1) = 0,   u′(0) − u′(1) = 0

Wa = [1 0 0; 0 0 0; 0 1 0],   Wb = [0 0 0; 1 0 0; 0 −1 0]

uᵀ(x) = (1  x  x²)

K(u(x)) = [1  x  x²; 0  1  2x; 0  0  2]
⇒
K⁻¹(u(s)) = [1  −s  s²/2; 0  1  −s; 0  0  1/2]

D = [1 0 0; 0 0 0; 0 1 0][1 0 0; 0 1 0; 0 0 2] + [0 0 0; 1 0 0; 0 −1 0][1 1 1; 0 1 2; 0 0 2]
  = [1 0 0; 0 0 0; 0 1 0] + [0 0 0; 1 1 1; 0 −1 −2]
  = [1 0 0; 1 1 1; 0 0 −2]

D⁻¹ = [1 0 0; −1 1 1/2; 0 0 −1/2]

           { −(1  x  x²)[1 0 0; −1 1 1/2; 0 0 −1/2][1 0 0; 0 0 0; 0 1 0]
           {       ⋅ [1 0 0; 0 1 0; 0 0 2][1  −s  s²/2; 0  1  −s; 0  0  1/2][0; 0; 1],   0 < s < x
G(x, s) = {
           { (1  x  x²)[1 0 0; −1 1 1/2; 0 0 −1/2][0 0 0; 1 0 0; 0 −1 0]
           {       ⋅ [1 1 1; 0 1 2; 0 0 2][1  −s  s²/2; 0  1  −s; 0  0  1/2][0; 0; 1],   x < s < 1

           { (s/2)(x − s)(1 − x),   0 < s < x
        = {
           { (x/2)(s − x)(s − 1),   x < s < 1
An alternative form of the Green's function may be obtained using the adjoint problem.
Let v1, . . . , vn be solutions of

L*v = 0

chosen such that

Kᵀ(v(x))P(x)K(u(x)) = I
⇒
K⁻¹(u(s)) = Kᵀ(v(s))P(s)

Recalling the anti-triangular form of P,

P = [ .    .    p0
      .  −p0    0
     p0    0    0 ],

substituting K⁻¹(ψ(s)) = Kᵀ(v(s))P(s) into equation (19.41) expresses the Green's function
in terms of u(x) and v(s). This formula reveals the symmetric nature of the Green's
function in terms of u(x) and v(s).
Theorem 19.1. Regarded as a function of x with s fixed, the Green's function has the
following properties:
1. Together with its first (n − 1) derivatives it is continuous in [a, s) and (s, b]. At the
   point x = s, G and its first (n − 2) derivatives have removable discontinuities while the
   (n − 1)st derivative has a jump of −1/p0(s).
2. G satisfies the differential equation except at x = s. It satisfies the boundary
   conditions.
3. G is the only function with properties (1) and (2).

Theorem 19.2. Regarded as a function of s with x fixed, the Green's function has the
following properties:
1. Together with its first (n − 1) derivatives, it is continuous on [a, x) and (x, b]. At the
   point s = x, G and its first (n − 2) derivatives have removable discontinuities while
   the (n − 1)st derivative has a jump of (−1)^(n−1)/p0(x).
2. G satisfies the adjoint differential equation (L*v = 0) except at s = x. It satisfies the
   adjoint BCs.
Similarly, the solution of the adjoint BVP

L*v(s) = −f(s);   −kᵀ(v(a))P(a) = aᵀWa,   kᵀ(v(b))P(b) = aᵀWb

may also be expressed in terms of the Green's function G(x, s), regarded as a function of
s (cf. Theorem 19.2). Now consider the BVP

Lu = −f    (19.44)
W( k(u(a)); k(u(b)) ) = 0    (19.45)

and let the symbol ℒ stand for the linear differential operator −L and the boundary
conditions (19.45). Then we may write (19.44)–(19.45) as

ℒu = f    (19.46)

with solution

u = 𝔾f    (19.49)

where 𝔾 = ℒ⁻¹ is an integral operator with kernel G(x, s).
Equation (19.49) suggests that G(x, s) must be the response at position x caused by a unit
input at position s. Suppose that this is the case and there is a distribution of inputs
f(s). Then the response at x caused by an input at s of magnitude f(s) ds must be
G(x, s)f(s) ds, as schematically shown in Figure 19.4. Then ∫_a^b G(x, s)f(s) ds is the
total response at x.
Figure 19.4: Schematic of the distributed function f (s) and interpretation of Green’s function.
Thus, the Green's function may also be characterized directly as the solution of

LG = −δ(x − s)    (19.51)
W( k(G(a, s)); k(G(b, s)) ) = 0    (19.52)
Example 19.8 (The deflection of a tightly stretched elastic string). Consider the de-
flection of a tightly stretched elastic string that is fixed at the end points as shown in
Figure 19.5.
Figure 19.5: Deflection of tightly stretched elastic string under a load distribution (top) and a point
load (bottom).
T d²y/dx² = −F(x);   y(0) = 0,   y(L) = 0    (19.53)

where T is the tension in the string and F(x) is the force distribution. In dimensionless
form,

d²y/dz² = −f(z);   0 < z < 1    (19.54)
y(0) = y(1) = 0    (19.55)

The Green's function G(z, s) represents the deflection of the string at position z due to a
unit force acting at position s, and the deflection can be expressed (same as in equation
(19.50)) as

y(z) = ∫_0^1 G(z, s)f(s) ds
Example. Green's function for the convection–diffusion operator (Pe is the Péclet number):

(1/Pe) d²u/dx² − du/dx = −δ(x − s),   0 < x < 1    (19.58)
(1/Pe) du/dx − u = 0   at x = 0    (19.59)
du/dx = 0   at x = 1    (19.60)

For 0 < x < s:

(1/Pe) d²u/dx² − du/dx = 0
⇒
(1/Pe) du/dx − u = Constant

BC at x = 0 ⇒ Constant = 0
⇒
du/dx = Pe ⋅ u
⇒
u = c1 e^(Pe x)    (19.61)

For s < x < 1:

(1/Pe) d²u/dx² − du/dx = 0
⇒
(1/Pe) du/dx − u = Constant = c2
⇒
u′ − Pe u = c2 ⋅ Pe

Multiplying by the integrating factor e^(−Pe x) and integrating,

u e^(−Pe x) = c2 Pe ∫ e^(−Pe x) dx = −c2 e^(−Pe x) + c3
⇒
u = c3 e^(Pe x) − c2

du/dx = c3 Pe e^(Pe x),   u′(1) = 0 ⇒ c3 = 0
⇒
        { c1 e^(Pe x),   0 < x < s
u(x) = {                                 (19.62)
        { −c2,           s < x < 1

         { c1 Pe e^(Pe x),   0 < x < s
u′(x) = {                                (19.63)
         { 0,                s < x < 1

Integrating equation (19.58) once from 0 to x and using the BC at x = 0,

(1/Pe) du/dx − u = −H(x − s)
⇒
c2 = −1

and continuity of u at x = s ⇒ c1 e^(Pe s) = 1 ⇒ c1 = e^(−Pe s)

                     { e^(−Pe s) e^(Pe x),   0 < x < s
u(x, s) = G(x, s) = {                                     (19.64)
                     { 1,                    s < x < 1

i. e.,

           { e^(−Pe (s−x)),   0 < x < s
G(x, s) = {                                 (19.65)
           { 1,               s < x < 1
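As a numerical sanity check on (19.65), the following Python sketch (not from the text; the forcing f and the grids are arbitrary choices) builds u(x) = ∫₀¹ G(x, s)f(s) ds by the trapezoidal rule and verifies that the residual of (1/Pe)u″ − u′ = −f is small at interior grid points.

```python
import numpy as np

# Verify that u(x) = int_0^1 G(x,s) f(s) ds, with G from (19.65),
# satisfies (1/Pe) u'' - u' = -f for a smooth test forcing f.
Pe = 5.0
f = lambda s: np.sin(np.pi * s)           # arbitrary smooth forcing

def G(x, s):
    return np.where(x < s, np.exp(-Pe * (s - x)), 1.0)

def trapezoid(vals, t):
    # composite trapezoidal rule along the last axis
    return np.sum((vals[..., 1:] + vals[..., :-1]) * np.diff(t) / 2.0, axis=-1)

s = np.linspace(0.0, 1.0, 20001)
x = np.linspace(0.0, 1.0, 201)
u = trapezoid(G(x[:, None], s[None, :]) * f(s)[None, :], s)

# residual of the ODE at interior points (central differences)
h = x[1] - x[0]
residual = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / (Pe * h**2) \
           - (u[2:] - u[:-2]) / (2.0 * h) + f(x[1:-1])
print(np.max(np.abs(residual)))           # small (quadrature + FD error only)
```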
Consider now the TPBVP with inhomogeneous BCs:

Lu = −f(x);   W( k(u(a)); k(u(b)) ) = d    (19.66)
As stated in the previous chapter, using the principle of superposition, the solution of
(19.66) may be expressed as
u = u1 + u2    (19.67)

where

Lu1 = −f(x);   W( k(u1(a)); k(u1(b)) ) = 0    (19.68)

and

Lu2 = 0;   W( k(u2(a)); k(u2(b)) ) = d    (19.69)
We have already seen that (19.68) has a unique solution if the homogeneous problem
is incompatible, i. e., G(x, s) exists and is unique. Thus, we only need to solve (19.69)
to get the complete solution to equation (19.66). We now consider equation (19.69) and
present two methods for solving this BVP.
Method 1
Consider the BVP

Lu = 0;   W( k(u(a)); k(u(b)) ) = d    (19.70)
Let u(x)ᵀ = [u1(x)  u2(x)  . . .  un(x)] be a fundamental vector. Then any solution to
equation (19.70) must be of the form u(x) = u(x)ᵀc, and the BCs give

Wa k(u(a)ᵀc) + Wb k(u(b)ᵀc) = d
⇒
[Wa K(u(a)) + Wb K(u(b))]c = d
⇒
Dc = d

c = D⁻¹d.    (19.72)
Example 19.10. Consider

d²u/dx² = 0
u(0) = d1,   u(1) = d2

Here u1(x) = x, u2(x) = 1 − x and

K(u(x)) = [x  1−x; 1  −1]

D = [1 0; 0 0][0 1; 1 −1] + [0 0; 1 0][1 0; 1 −1]
  = [0 1; 0 0] + [0 0; 1 0] = [0 1; 1 0]

D⁻¹ = [0 1; 1 0]

u(x) = [u1  u2][0 1; 1 0][d1; d2]
     = [u1(x)  u2(x)][d2; d1]
     = d2x + d1(1 − x)
19.5 Solution of TPBVP with inhomogeneous boundary conditions | 473
Example 19.11. Consider

u′′ = 0
u(0) + u(1) = d1,   u′(0) + u′(1) = d2

Here u(x)ᵀ = [1  x] and, as in Example 19.6,

D = [2 1; 0 2]
⇒
D⁻¹ = [1/2  −1/4; 0  1/2]
∴
D⁻¹d = [d1/2 − d2/4; d2/2]

u(x) = [1  x][d1/2 − d2/4; d2/2] = (2d1 − d2)/4 + (d2/2)x
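Method 1 reduces to a finite-dimensional linear solve once D is assembled. A minimal Python sketch for Example 19.11 (illustrative; the boundary data d1, d2 are arbitrary test values):

```python
import numpy as np

# Method 1 for u'' = 0 with mixed inhomogeneous BCs (Example 19.11):
# u(0) + u(1) = d1,  u'(0) + u'(1) = d2.
# Fundamental solutions u1 = 1, u2 = x; k(u) = (u, u')^T; Wa = Wb = I.
K0 = np.array([[1.0, 0.0], [0.0, 1.0]])   # K(u(0))
K1 = np.array([[1.0, 1.0], [0.0, 1.0]])   # K(u(1))
D = K0 + K1                               # = [[2, 1], [0, 2]]

d = np.array([3.0, 2.0])                  # arbitrary BC data (d1, d2)
c = np.linalg.solve(D, d)                 # c = D^{-1} d

u = lambda x: c[0] + c[1] * x             # u = c1*1 + c2*x
up = lambda x: c[1]

print(u(0) + u(1), up(0) + up(1))         # prints 3.0 2.0 (the prescribed d1, d2)
```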
Method 2
This method is based on the Lagrange identity

v(s)Lu(s) − u(s)L*v(s) = d/ds [kᵀ(v(s))P(s)k(u(s))]    (19.73)

Let

Lu(s) = 0    (19.74)

where u(s) is the solution of the BVP (19.70). Now, let v(s) = G(x, s) and substitute in
equation (19.73) ⇒

−u(s)L*G(x, s) = d/ds [kᵀ(G(x, s))P(s)k(u(s))]    (19.75)

We know that

L*G(x, s) = 0,   a ≤ s < x

and

L*G(x, s) = 0,   x < s ≤ b

Integrating equation (19.75) in s over [a, x⁻] and [x⁺, b] (P(s) is continuous) ⇒

[kᵀ(G(x, x⁻)) − kᵀ(G(x, x⁺))]P(x)k(u(x))
        = kᵀ(G(x, a))P(a)k(u(a)) − kᵀ(G(x, b))P(b)k(u(b))    (19.78)

Since G(x, s) and its first (n − 2) derivatives are continuous at s = x and the (n − 1)st
derivative has a jump of (−1)^(n−1)/p0(x), the LHS of equation (19.78) may be simplified to

LHS = [(−1)^(n−1)/p0(x)] enᵀP(x)k(u(x))

Since the last row of P(x) is [(−1)^(n−1)p0(x)  0  . . .  0],

LHS = [(−1)^(n−1)/p0(x)] [(−1)^(n−1)p0(x)  0  . . .  0] [u(x); u′(x); . . . ; u^(n−1)(x)]
    = u(x)

RHS = [kᵀ(G(x, a))P(a)   −kᵀ(G(x, b))P(b)] [k(u(a)); k(u(b))]
    = [kᵀ(G(x, a))   kᵀ(G(x, b))] [P(a)  0; 0  −P(b)] [k(u(a)); k(u(b))]
∴
u(x) = [kᵀ(G(x, a))   kᵀ(G(x, b))] [P(a)  0; 0  −P(b)] [k(u(a)); k(u(b))]    (19.79)
But we have shown that G(x, s) satisfies the homogeneous BCs at s = a and s = b. Thus,
the row vector

[kᵀ(G(x, a))   kᵀ(G(x, b))] [P(a)  0; 0  −P(b)] = h(x)ᵀW    (19.80)

must belong to the row space of W, and hence may be written as shown in equation
(19.80). Substituting equation (19.80) in equation (19.79), we get

u(x) = h(x)ᵀW [k(u(a)); k(u(b))]
     = h(x)ᵀd    (19.81)
which is the solution of equation (19.70). The vector function h(x) is determined from
equation (19.80), which constitutes 2n scalar equations: n of these equations determine
h(x) and the remaining n determine the adjoint BCs in terms of the Green's function.
Example 19.12. Consider

d²u/dx² = −f(x)
u(0) = d1,   u(1) = d2.

For the operator −d²/dx² with u(0) = 0, u(1) = 0, we have

           { s(1 − x),   0 < s < x
G(x, s) = {
           { x(1 − s),   x < s < 1

and

P(x) = [0  −1; 1  0],   p0(x) = −1

Equation (19.80) gives

[G(x, 0)   (𝜕G/𝜕s)(x, 0)][0  −1; 1  0] = [h1(x)  h2(x)][1  0; 0  0]
[G(x, 1)   (𝜕G/𝜕s)(x, 1)][0  −1; 1  0] = −[h1(x)  h2(x)][0  0; 1  0]
⇒
G(x, 0) = 0,   G(x, 1) = 0

h1(x) = (𝜕G/𝜕s)(x, 0) = 1 − x,   h2(x) = −(𝜕G/𝜕s)(x, 1) = x
∴
u(x) = ∫_0^1 G(x, s)f(s) ds + d1(1 − x) + d2x
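For a concrete check of the superposition u = ∫₀¹ G(x, s)f(s) ds + d1(1 − x) + d2x, the following Python sketch (the values f = 2, d1 = 1.5, d2 = −0.5 are arbitrary test choices) evaluates the quadrature and verifies the BCs and u″ = −f:

```python
import numpy as np

# Example 19.12 with the constant forcing f(x) = 2:
# u(x) = int_0^1 G(x,s)*2 ds + d1*(1-x) + d2*x, with G = s(1-x) for s < x
# and G = x(1-s) for s > x.
d1, d2 = 1.5, -0.5

def u(x):
    s = np.linspace(0.0, 1.0, 20001)
    G = np.where(s < x, s * (1.0 - x), x * (1.0 - s))
    # trapezoid rule is exact here since G is piecewise linear in s
    part = np.sum((G[1:] + G[:-1]) * (s[1] - s[0])) * 0.5 * 2.0
    return part + d1 * (1.0 - x) + d2 * x

print(u(0.0), u(1.0))        # the prescribed boundary values d1, d2
h = 1e-3
upp = (u(0.5 + h) - 2.0 * u(0.5) + u(0.5 - h)) / h**2
print(upp)                   # ≈ -2 = -f
```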
Problems
1. The deflection of a simply supported beam is described by the fourth-order boundary
   value problem

   EI d⁴u/dx⁴ = F(x),   0 < x < L;   u(0) = u(L) = u′′(0) = u′′(L) = 0

   where EI = flexural rigidity of the beam and F(x) is the intensity of the distributed
   load (force per unit length). (a) Determine the Green's function of this system by
   evaluating the deflection curve for a unit load at x = s. (b) Evaluate the deflection
   curve for a triangularly distributed load, i. e.,

   F(x) = wx/L
2. Show that the Green's function for

   Lu = d/dr (r du/dr),   a < r < b;   u′(a) = 0,   u(b) = 0

   is given by

              { ln(b/r),   a < s < r
   G(r, s) = {
              { ln(b/s),   r < s < b
   Heat is generated in a thin annular disk with insulated faces. The conductivity is
   k in the radial direction. The internal circumference of the disk is insulated while
   the external circumferential area is held at temperature zero. For a given heat
   generation rate, use the Green's function above to determine the temperature
   distribution in the disk.
3. Consider the general TPBVP

   Lu = −f(x),   a < x < b
   Wa k(u(a)) + Wb k(u(b)) = d

   and express its solution in terms of the Green's function.
4. Consider the first-order vector BVP

   du/dx − A(x)u = f(x),   a < x < b
   Wa u(a) + Wb u(b) = 0

   and determine its (matrix) Green's function.
20 Eigenvalue problems for differential operators

Let

L = p0(x) dⁿ/dxⁿ + p1(x) dⁿ⁻¹/dxⁿ⁻¹ + ⋅ ⋅ ⋅ + pn(x)

be a regular linear differential operator, i. e., p0(x) ≠ 0 in [a, b] and pj(x) ∈ C^(n−j)[a, b].
Consider the homogeneous boundary value problem (BVP)

−Ly = λy;   W( k(y(a)); k(y(b)) ) = 0    (20.2)
Definition. A real or complex number λ for which the BVP defined by equation (20.2) is
compatible is called an eigenvalue and any corresponding nontrivial solution is called
an eigenfunction. The set of all eigenvalues is called the spectrum of the BVP.
Remarks.
(1) For consistency with the finite-dimensional case, we should have defined the
eigenvalue in equation (20.2) with a positive sign. However, the eigenvalues of
most of the differential operators we encounter in our applications are negative
and we are following the literature notation here.
(2) Since the BVP given by equation (20.2) is homogeneous, the eigenfunctions asso-
ciated with an eigenvalue form a subspace of C n [a, b].
Next, consider the adjoint eigenvalue problem

−L*v = λ̄v;   Q( k(v(a)); k(v(b)) ) = 0    (20.3)

Let v(x) be any function satisfying the adjoint BCs. Taking the inner product of Ly = −λy
with v and using the definition of the adjoint, we obtain

20.1 Definition of eigenvalue problems | 479

−λ⟨y, v⟩ = ⟨y, L*v⟩
⇒ ⟨y, −λ̄v⟩ = ⟨y, L*v⟩
⇒ ⟨y, L*v + λ̄v⟩ = 0
Thus, we have two possibilities: (i) L∗ v + λv̄ = 0, which implies that λ̄ is an eigenvalue
of L∗ with v(x) being the eigenfunction or (ii) the function L∗ v + λv̄ is orthogonal to
y(x), i. e.,
∫_a^b (L*v + λ̄v)y(x) dx = 0    (20.4)
In case (i), we have already proved the result. Now, consider case (ii) and set
f = −(L* + λ̄)v, so that equation (20.4) reads

⟨f, y⟩ = 0    (20.6)

If −(L* + λ̄)v represented every continuous function f(x), we would have a contradiction,
since equation (20.6) holding for all continuous f ⇒ y(x) = 0. Therefore, the other
alternative must hold, i. e., L*v + λ̄v = 0 for some nonzero v(x).
The above conclusion may also be reached using the following reasoning. Suppose that the
operator (L* + λ̄) is invertible, i. e.,

(L* + λ̄)v = 0 ⇒ v = 0;

then the range of (L* + λ̄) consists of every continuous function. Hence, by choosing
different v(x) we can obtain different f for which ⟨f, y⟩ = 0. Again, we have y(x) = 0,
which is a contradiction. ∴ (L* + λ̄) must be singular and there exists a nonzero v(x)
such that

L*v = −λ̄v
Theorem. The eigenfunctions of the BVP defined by equation (20.2) and the adjoint BVP
(equation (20.3)) corresponding to distinct eigenvalues are orthogonal, i. e., if
−Lym = λmym and −L*vn = λ̄nvn with λm ≠ λn, then

(λm − λn)⟨ym, vn⟩ = ⟨−Lym, vn⟩ − ⟨ym, −L*vn⟩ = 0

Since λm ≠ λn ⇒

⟨ym, vn⟩ = 0

In particular, for a self-adjoint BVP the eigenfunctions corresponding to distinct
eigenvalues are mutually orthogonal and may be normalized:

                  { 1,   m = n
⟨ym, yn⟩ = δmn = {
                  { 0,   m ≠ n
20.2 Determination of the eigenvalues | 481

Consider the BVP

Ly = −λy;   W( k(y(a)); k(y(b)) ) = 0    (20.7)

Let y1(x, λ), y2(x, λ), . . . , yn(x, λ) be a set of linearly independent solutions of equation
(20.7). Then any solution is of the form

y = c1y1 + c2y2 + ⋅ ⋅ ⋅ + cnyn    (20.8)

Substituting into the BCs gives

D(λ)c = 0    (20.9)

where

D(λ) = Wa K(y(a, λ)) + Wb K(y(b, λ))    (20.10)

is the characteristic matrix. If rank D(λ) = n, then c = 0 and the only solution to the
BVP is the trivial one. Thus, to get a nontrivial solution, we require that

h(λ) ≡ det D(λ) = 0    (20.11)
Equation (20.11), which determines the eigenvalues of the BVP, is called the charac-
teristic equation. The zeros of the scalar function h(λ) give the eigenvalues. The corre-
sponding eigenfunctions can be obtained by solving equation (20.9) for c and substi-
tuting in equation (20.8).
20.2.1 Relationship between the n-th order eigenvalue problem and the vector
eigenvalue problem
Ly = −λy;   W( k(y(a)); k(y(b)) ) = 0    (20.12)
Define

y1(x) = y(x)
y2(x) = y′(x) = y1′
y3(x) = y′′(x) = y2′
⋮                                        (20.14)
yn(x) = y^[n−1](x) = y′n−1

⇒

dy1/dx = y2
dy2/dx = y3
⋮
dyn−1/dx = yn
dyn/dx = −(p1/p0)yn − (p2/p0)yn−1 − ⋅ ⋅ ⋅ − (pn/p0)y1 − (λ/p0)y1
482 | 20 Eigenvalue problems for differential operators

or, defining

y = (y1; y2; . . . ; yn)    (20.15)

dy/dx = [   0        1       0     .      0
            0        0       1     .      0
            .        .       .     .      .
            0        0       0     .      1
         −pn/p0   −pn−1/p0   .     .   −p1/p0 ] y
        + λ [   0     0  .  0
                .     .  .  .
                0     0  .  0
             −1/p0    0  .  0 ] y    (20.16)
⇒
dy/dx = A(x)y + λB(x)y
⇒
dy/dx = [A(x) + λB(x)]y    (20.17)

with BCs

Wa y(a) + Wb y(b) = 0,    (20.18)
where the elements of the n × n matrices A(x) and B(x) are continuous functions. Obvi-
ously, this is a more general eigenvalue problem than that defined by equation (20.12).
Theorem. Let A and B be continuous complex valued n × n matrices defined on the real
x-interval [a, b]. Let ξ be any point in (a, b) and w be any constant vector in C n . Consider
the solution of equations (20.17) and (20.18) that passes through the point (ξ , w) and
denote it by y(x, λ, ξ , w). Then:
1. The solution y(x, λ, ξ , w) exists for all x in (a, b) and is continuous in x, λ and w for
x in (a, b) and |w| + |λ| < ∞, for each fixed x in (a, b).
2. It is analytic in w and λ for |w| + |λ| < ∞.
A proof of this theorem may be found in the book by Coddington and Levinson [13]. Another
way of expressing this result is that if we generate a fundamental matrix of equation
(20.17) by using the n conditions

y(ξ) = ej,   ξ ∈ [a, b],   j = 1, 2, 3, . . . , n    (20.19)

then each element of the fundamental matrix is an entire function of λ
⇒ h(λ) = det D(λ) is an entire function of λ. Thus, to study the nature (real or complex
and distribution) of the eigenvalues, we need to study the zeros of entire functions. In
the special case in which A(x) and B(x) are constant matrices, we get
y(x, λ) = e^[(A+λB)(x−ξ)] ej,

which is explicitly entire in λ for each fixed x, and similarly for the n-th order problem

Ly = −λy,   k(y(ξ)) = ej,   a ≤ ξ ≤ b,   j = 1, 2, . . . , n
⇒
h(λ) is an entire function of λ.

Some properties of the zeros of entire functions:
1. The zeros are isolated. Near a zero z = a we may write

   h(z) = (z − a)^k g(z),

   where g(z) is an analytic function with g(a) ≠ 0. If k is not finite, then h(z) ≡ 0.
   If k is finite, then ∃ a region |z − a| < δ such that g(z) has no zero, or equivalently
   h(z) does not vanish there (except at z = a). ∴ The zeros are isolated.
2. The zeros cannot accumulate at a finite point: suppose that z1, z2, . . . , zn, . . . are
   zeros in a bounded region ℛ such that lim_{n→∞} zn = c. Then h vanishes on a set with a
   finite limit point, and hence h(z) ≡ 0. Thus, a nonzero entire function has at most
   countably many zeros, which can cluster only at infinity.
3. Suppose that

   h(z) = a0 + a1z + a2z² + a3z³ + ⋅ ⋅ ⋅ ,

   which converges for all z and is real for z real. Suppose that z = b is a real number
   ⇒

   h(b) = a0 + a1b + a2b² + a3b³ + ⋅ ⋅ ⋅ is real.

   If b = 0, h(0) real ⇒ a0 is real. Similarly, h′(z) = a1 + 2a2z + ⋅ ⋅ ⋅ is real for real
   z ⇒ a1 is real, and so on: all the coefficients are real. Now, let α be a zero,

   h(α) = 0
   ⇒
   0 = a0 + a1α + a2α² + ⋅ ⋅ ⋅ = h(α)

   and, taking complex conjugates (the coefficients being real),

   0 = a0 + a1ᾱ + a2ᾱ² + ⋅ ⋅ ⋅ = h(ᾱ)

20.3 Properties of the characteristic equation | 485

   ∴ ᾱ is also a zero of h(z). If h(z) is real for z real and h(z) is an entire function,
   then its zeros must occur in conjugate pairs.
Returning to the characteristic determinant, differentiating h(λ) = det D(λ) row by row
gives

h′(λ) = det[d11′  d12′  .  d1n′; .  .  .  .; dn1  dn2  .  dnn]
        + ⋅ ⋅ ⋅ + det[d11  d12  .  d1n; .  .  .  .; dn1′  dn2′  .  dnn′]    (20.25)

Each of these determinants is a linear combination of (n − 1) × (n − 1) minors of D(λ).
Thus, if rank D(λ) ≤ n − 2 at an eigenvalue, then

h′(λ) = 0
Thus, if the eigenvalues are to be simple, the rank of D(λ) must be (n − 1). However, the
converse is not true, i. e., h(λ) = h′(λ) = 0 does not imply rank D = n − 2. To illustrate,
consider

D = [λ  1; 0  λ]

which gives

h = λ²,   h′ = 2λ

Thus,

h = h′ = 0   at λ = 0

but D(0) has rank one. We now return to equations (20.24) and (20.25). By Laplace's
expansion, h′′(λ) is a linear combination of determinants of order (n − 2) and (n − 1)
which are minors of D. Thus, if rank D = (n − 3),
⇒ h(λ) = h′(λ) = h′′(λ) = 0,

and the multiplicity of the eigenvalue = 3. Again, the converse is not true, i. e., h(λ),
h′(λ) and h′′(λ) can be zero even if rank D = (n − 1). Thus, the multiplicity of the
eigenvalue is at least k if rank D = n − k.
In general, an eigenvalue λi has multiplicity k if

h(λi) = 0
(dh/dλ)(λi) = 0
⋮                                                            (20.26)
(d^(k−1)h/dλ^(k−1))(λi) = 0   but   (d^k h/dλ^k)(λi) ≠ 0
Example 20.1. Consider

d²y/dx² = −λy;   y(0) = 0;   y(1) = 0.    (20.27)

We take

y1 = cos √λ x,   y2 = sin(√λ x)/√λ,

which satisfy

k(y1(0)) = e1
k(y2(0)) = e2
⇒
D(λ) = Wa K(y(0)) + Wb K(y(1))
     = [1  0; 0  0] + [0  0; cos √λ   sin √λ/√λ]
     = [1  0; cos √λ   sin √λ/√λ]    (20.28)

h(λ) = sin √λ/√λ = 0 ⇒ √λ = nπ,   n = ±1, ±2, . . .

λn = n²π²,   n = 1, 2, . . .

Since

D(λn) = [1  0; cos √λn   0],

D(λn)c = 0 ⇒ c1 = 0
∴ yn(x) = c1y1 + c2y2 = c2 sin nπx   (absorbing the factor 1/(nπ) into c2)
Note that n = +k and n = −k give the same eigenfunction and eigenvalue. Choose c2 such
that

⟨yn, yn⟩ = ∫_0^1 yn² dx = 1 ⇒ c2 = √2
Thus, there are an infinite number of eigenvalues and corresponding eigenfunctions. In
addition,

h(λ) = det D(λ) = sin √λ/√λ

(dh/dλ)(λn) = −sin nπ/(2n³π³) + cos nπ/(2n²π²) = (−1)ⁿ/(2n²π²) ≠ 0    (20.30)

Thus, the eigenvalues are simple. The eigenspace corresponding to each eigenvalue has
dimension one.
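The characteristic function h(λ) can also be used directly for numerical eigenvalue computation. A short Python/SciPy sketch (illustrative; the brackets around each root are hand-chosen from the known form of h):

```python
import numpy as np
from scipy.optimize import brentq

# characteristic function h(lambda) = sin(sqrt(lam))/sqrt(lam) for
# y'' = -lam*y, y(0) = y(1) = 0  (equation (20.28))
h = lambda lam: np.sin(np.sqrt(lam)) / np.sqrt(lam)

# h changes sign between ((n-1/2)pi)^2 and ((n+1/2)pi)^2; bracket and refine
eigs = [brentq(h, ((n - 0.5) * np.pi) ** 2, ((n + 0.5) * np.pi) ** 2)
        for n in range(1, 6)]
print([e / np.pi ** 2 for e in eigs])   # ≈ [1, 4, 9, 16, 25], i.e. lam_n = n^2 pi^2
```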
Example. Consider the same operator with a Robin-type BC at x = 1:

d²y/dx² = −λy;   y(0) = 0,   y(1) − 2y′(1) = 0

Since the operator is the same, we can use the same Wronskian matrix as that in the
previous example (but the boundary conditions are different). We have

Wa = [1  0; 0  0],   Wb = [0  0; 1  −2]
⇒
h(λ) = sin √λ/√λ − 2 cos √λ = 0 ⇒ tan √λ = 2√λ    (20.33)

This is the characteristic equation for determining the eigenvalues. This is a
transcendental equation and has an infinite number of roots, as shown in Figure 20.1,
where the two curves (LHS and RHS of equation (20.33)) intersect at an infinite number of
points.
Figure 20.1: The roots of characteristic equation (intersection of the two curves).
The intersections occur in pairs ±√λ1, ±√λ2, . . . , giving the eigenvalues

λ = λ1, λ2, λ3, . . .

As before, D(λn)c = 0 ⇒ c1 = 0, so yn(x) = c2 sin(√λn x). Normalizing,
⇒
2/c2² = 1 − sin 2√λn/(2√λn)
      = 1 − 2 sin √λn cos √λn/(2√λn)
      = 1 − 2 cos² √λn   (using Eq. (20.33))
      = −cos 2√λn
⇒
c2² = 2/(−cos 2√λn)   or   c2² = 2/(1 − 2 cos² √λn)
⇒
c2 = √[2/(1 − 2 cos² √λn)]
yn(x) = √2 sin(√λn x)/√(1 − 2 cos² √λn),   n = 1, 2, . . .    (20.34)

are the normalized eigenfunctions. In addition, it can be shown that h′(λn) ≠ 0, i. e.,
the eigenvalues are simple.
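The roots of the transcendental equation (20.33) are easily found numerically; the sketch below (Python/SciPy; the bracketing intervals are an assumption based on the geometry of Figure 20.1) also checks the normalization constant in (20.34):

```python
import numpy as np
from scipy.optimize import brentq

# Roots of tan(mu) = 2*mu, where mu = sqrt(lambda) (equation (20.33)).
# One root lies in (0, pi/2) and one in each (n*pi, n*pi + pi/2), n >= 1.
g = lambda mu: np.tan(mu) - 2.0 * mu
mus = [brentq(g, 0.1, np.pi / 2 - 1e-6)]
mus += [brentq(g, n * np.pi + 1e-6, n * np.pi + np.pi / 2 - 1e-6)
        for n in (1, 2, 3)]
print([round(m ** 2, 4) for m in mus])        # the first four eigenvalues

# verify the normalization in (20.34): int_0^1 yn^2 dx = 1
x = np.linspace(0.0, 1.0, 20001)
for mu in mus:
    yn = np.sqrt(2.0 / (1.0 - 2.0 * np.cos(mu) ** 2)) * np.sin(mu * x)
    norm = np.sum((yn[1:] ** 2 + yn[:-1] ** 2) * (x[1] - x[0])) / 2.0
    print(round(norm, 6))                     # ≈ 1.0 for each root
```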
Example. Consider the same operator with periodic BCs:

d²y/dx² = −λy;   y(0) − y(1) = 0,   y′(0) − y′(1) = 0

Again, since the operator is the same, we can use the same Wronskian matrix as that in
Example 20.1. We have

Wa = [1  0; 0  1],   Wb = [−1  0; 0  −1]

D(λ) = [1  0; 0  1] + [−1  0; 0  −1][cos √λ   sin √λ/√λ; −√λ sin √λ   cos √λ]
     = [1 − cos √λ   −sin √λ/√λ; √λ sin √λ   1 − cos √λ]

h(λ) = det D(λ) = (1 − cos √λ)² + sin² √λ = 2(1 − cos √λ) = 0
⇒
√λ = 2nπ
λn = 4n²π²,   n = 0, 1, 2, . . .

D(λn) = [0  0; 0  0],   rank = 0,

so each nonzero eigenvalue has two linearly independent eigenfunctions. Also,

dh/dλ = sin √λ/√λ

dh/dλ |_(λ=λn) = 0,   d²h/dλ² |_(λ=λn) = 1/(2λn) ≠ 0,

i. e., the nonzero eigenvalues have multiplicity two. In summary:

Eigenvalues:

λ0 = 0    (20.38)
λn = 4n²π²,   n = 1, 2, 3, . . .    (20.39)

Eigenfunctions:

y0(x) = 1    (20.40)
yn1(x) = sin 2nπx    (20.41)
yn2(x) = cos 2nπx    (20.42)
Example. Consider the operator with a nonlocal BC:

d²y/dx² = −λy;   y(0) = 0,   y′(0) − αy(1) = 0
⇒
Wa = [1  0; 0  1],   Wb = [0  0; −α  0]

D(λ) = [1  0; 0  1][1  0; 0  1] + [0  0; −α  0][cos √λ   sin √λ/√λ; −√λ sin √λ   cos √λ]
     = [1  0; 0  1] + [0  0; −α cos √λ   −α sin √λ/√λ]
     = [1   0; −α cos √λ   1 − α sin √λ/√λ]    (20.44)

h(λ) = det D(λ) = 1 − α sin √λ/√λ    (20.45)

Characteristic equation:

sin √λ/√λ = 1/α

Different cases

Case A: α = 1
Here λ0 = 0 is an eigenvalue with eigenfunction y0(x) = x, and the characteristic equation
for the remaining eigenvalues is sin √λ = √λ.
Let √λ = a + ib ⇒

sin(a + ib) = a + ib
sin a cosh b + i cos a sinh b = a + ib

a = sin a cosh b    (20.47)
b = cos a sinh b    (20.48)

Equation (20.47) ⇒ b = cosh⁻¹(a/sin a)    (20.49)

Equation (20.48) ⇒ cosh⁻¹(a/sin a) − cos a sinh{cosh⁻¹(a/sin a)} = 0    (20.50)
Asymptotic values
For b ≫ 1, b/sinh b → 0. Therefore, equation (20.48) ⇒ cos a ≈ 0:

a ≈ (2n + 1)π/2,   n = 0, 1, 2, . . . ⇒ sin a = (−1)ⁿ

For n odd, the equation cosh b = a/sin a cannot be satisfied. Thus,

am = (2m + 1)π/2,   m = 0, 2, 4, 6, . . .
   = (4m + 1)π/2,   m = 0, 1, 2, 3, . . .    (20.51)

bm = cosh⁻¹{(4m + 1)π/2}
   = ln{(4m + 1)π/2 + √[(4m + 1)²π²/4 − 1]}    (20.52)
Exact eigenvalues:

λ0 = 0
λ1 = 48.56 ± i41.5
λ2 = 181.97 ± i93.2

λm ≈ (4m + 1)²π²/4 − [ln((4m + 1)π)]²
     ± i(4m + 1)π ln((4m + 1)π),   m ≥ 3    (20.54)
Case D: α = ∞
In this case,

h(λ) = sin √λ/√λ = 0 ⇒ √λ = nπ,   n = 1, 2, . . .    (20.55)
⇒ λn = n²π²;   n = 1, 2, 3, . . . , ∞    (20.56)
For general α, let

√λ = a + ib ⇒ α sin(a + ib) = a + ib
⇒
a = α sin a cosh b    (20.57)
b = α cos a sinh b    (20.58)

Equation (20.57) ⇒

b = cosh⁻¹{a/(α sin a)}    (20.59)

and substituting into equation (20.58),

cosh⁻¹{a/(α sin a)} − α cos a sinh{cosh⁻¹{a/(α sin a)}} = 0    (20.60)

Asymptotic values
For b ≫ 1,

√λm = am + ibm
    = (4m + 1)π/2 + i cosh⁻¹{(4m + 1)π/(2α)}    (20.61)
Summary
(i) For α = ∞, only real eigenvalues:

    λn = n²π²

(ii) For 1 < α < ∞, a finite number of real eigenvalues and an infinite number of complex
     eigenvalues
(iii) For α = 1, one real eigenvalue; all others are complex
(iv) For 0 < α < 1, all eigenvalues are complex and move to ∞ as α → 0.
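The complex eigenvalues for α = 1 can be computed by polishing the asymptotic guesses (20.51)–(20.52) with Newton's method applied to sin z − z = 0, where z = √λ. A Python sketch (illustrative; the fixed number of Newton steps is an arbitrary choice):

```python
import numpy as np

# Complex eigenvalues for alpha = 1: roots of f(z) = sin z - z, z = sqrt(lambda).
# Start Newton from z0 = (4m+1)pi/2 + i*arccosh((4m+1)pi/2), eqs (20.51)-(20.52).
def newton(z, n_iter=50):
    for _ in range(n_iter):
        z = z - (np.sin(z) - z) / (np.cos(z) - 1.0)
    return z

for m in (1, 2):
    a = (4 * m + 1) * np.pi / 2
    z = newton(a + 1j * np.arccosh(a))
    lam = z ** 2
    print(f"m={m}: lambda ~ {lam:.2f}")   # m=1 gives ~48.6 + 41.5j (cf. lambda_1)
```

The computed values agree with the exact eigenvalues λ1 and λ2 quoted above.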
Problems
1. Given the eigenvalue problem

   y′′ = −λy;
   y(0) = y′(1),
   y(1) = y′(0)

   determine the eigenvalues and eigenfunctions.
2. Consider the eigenvalue problem

   d⁴y/dx⁴ = λy,   0 < x < 1
   y′′′(0) + h1y(0) = 0;   y′′′(1) − h3y(1) = 0
   y′′(0) − h2y′(0) = 0;   y′′(1) + h4y′(1) = 0

   (b) Evaluate

   ∫_0^1 y (d⁴y/dx⁴) dx

   by parts twice.
   (c) Show that the eigenfunctions corresponding to different eigenvalues are orthogonal.
3. Consider the eigenvalue problem

   d²u/dx² = −λu;   0 < x < 1
   u(0) − u(1) = 0,   u′(0) − u′(1) = 0
4. The buckling of a column under an axial load is described by

   d²y/dx² = −(P/EI) y
   y(0) = y(L) = 0

   where y = displacement, P = applied load and EI = flexural rigidity of the column.
   What is the critical (smallest) load at which buckling occurs? [Note: the resulting
   formula is known as Euler's column formula.]
Here, k is the wave number, Pr is the Prandtl number and Ra is the Rayleigh num-
ber. (a) Show that the eigenvalues are real (b) Compute the smallest value of Ra for
which an eigenvalue can be zero [this value of Ra defines the critical value beyond
which the conduction state is not stable and leads to convective patterns].
6. Consider the EVP
21 Sturm–Liouville theory and eigenfunction expansions

Consider the eigenvalue problem

−d/dx (p(x) dy/dx) + q(x)y = λρ(x)y,   a < x < b    (21.1)

with one of the following sets of BCs:

Robin BCs

α1y(a) + α2y′(a) = 0    (21.2)
β1y(b) + β2y′(b) = 0    (21.3)

periodic BCs

y(a) = y(b)    (21.4)
y′(a) = y′(b)    (21.5)

Dirichlet BCs

y(a) = 0    (21.6)
y(b) = 0    (21.7)

The BVP defined by equation (21.1) and either BCs given in equations (21.2)–(21.3),
(21.4)–(21.5) or (21.6)–(21.7) is called a regular Sturm–Liouville problem if p(x) ≠ 0 in
[a, b], −∞ < a < b < ∞ (the interval [a, b] is finite). ρ(x) is called the density or weight
21.1 Sturm–Liouville theory | 497
function. It is assumed that ρ(x) > 0 for a < x < b. We discuss here the S–L theory for
Dirichlet BCs. The extension of the theory to the other cases is straightforward. (When
p(x) vanishes in [a, b] or when the interval (a, b) is infinite, the BVP is called
irregular. These cases will be considered in Chapters 24 and 25.)
Remark. The general second-order equation

a0(x)y′′ + a1(x)y′ + a2(x)y = −λa3(x)y    (21.8)

with either of the three sets of BCs may be written in a self-adjoint form using the
transformation

p(x) = exp{ ∫^x [a1(s)/a0(s)] ds }
⇒
Multiplying both sides of equation (21.8) by the positive function p(x)/a0(x) gives an
equation of the self-adjoint form (21.1). (The symbols for the coefficients here are
illustrative.)
Theorem. For the regular S–L problem with Dirichlet BCs:
(i) the eigenvalues are real;
(ii) the eigenvalues are positive if p(x) > 0 and q(x) ≥ 0;
(iii) eigenfunctions corresponding to distinct eigenvalues are orthogonal with respect to
the weight ρ(x);
(iv) there are an infinite number of eigenvalues, and they are isolated.

Proof. (i) We have already proved this for an n-th order self-adjoint BVP.
(ii) Let λn be an eigenvalue with yn(x) as the corresponding eigenfunction:

(p(x)yn′)′ − q(x)yn = −λnρ(x)yn
⇒
∫_a^b yn (d/dx)(p(x)yn′) dx − ∫_a^b q(x)yn² dx = −λn ∫_a^b ρ(x)yn² dx
⇒
[pyn′yn]_a^b − ∫_a^b p(x)(yn′)² dx − ∫_a^b q(x)yn² dx = −λn ∫_a^b ρ(x)yn² dx

Since yn(a) = yn(b) = 0, the boundary term vanishes
⇒
λn = [∫_a^b p(x)(yn′)² dx + ∫_a^b q(x)yn² dx] / ∫_a^b ρ(x)yn² dx    (21.16)

[Remark: The RHS of equation (21.16) is the Rayleigh quotient encountered in Section 7.3.]
Thus, λn > 0 if p(x) > 0, q(x) ≥ 0.
(iii) To prove (iii), let λn and λm be eigenvalues (λn ≠ λm) with eigenfunctions yn(x) and
ym(x), respectively. Then,

(pyn′)′ − qyn = −λnρyn    (21.17)
(pym′)′ − qym = −λmρym    (21.18)

Multiplying (21.17) by ym and (21.18) by yn, subtracting and integrating,

∫_a^b [(pyn′)′ym − (pym′)′yn] dx − ∫_a^b q(ynym − ymyn) dx = (λm − λn) ∫_a^b ρynym dx
⇒
(λm − λn) ∫_a^b ρynym dx = [pyn′ym − pym′yn]_a^b − ∫_a^b pyn′ym′ dx + ∫_a^b pym′yn′ dx
    = p(b)yn′(b)ym(b) − p(b)ym′(b)yn(b) − p(a)yn′(a)ym(a) + p(a)ym′(a)yn(a)
    = 0

(all boundary terms vanish under the Dirichlet BCs)
⇒
(λm − λn) ∫_a^b ρynym dx = 0

Since λm ≠ λn ⇒

∫_a^b ρ(x)yn(x)ym(x) dx = 0
(iv) We have already shown that the eigenvalues are the zeros of an entire function.
Thus, the zeros are isolated. To prove that there are an infinite number of them, we
refer to the book by Coddington and Levinson [13].

We showed that the BVP (equations (21.13)–(21.14)) may be written as an integral
equation:

y(x) = λ ∫_a^b G(x, s)ρ(s)y(s) ds

or

(1/λ) y(x) = ∫_a^b G(x, s)ρ(s)y(s) ds    (21.20)

Defining the integral operator (𝒢y)(x) = ∫_a^b G(x, s)ρ(s)y(s) ds, this reads

𝒢y = μy,   (μ = 1/λ)    (21.22)

i. e., the eigenvalues of 𝒢 are reciprocals of those of equations (21.13)–(21.14) and the
eigenfunctions are the same.
500 | 21 Sturm–Liouville theory and eigenfunction expansions

𝒢: C[a, b] → C[a, b] is a linear operator that is bounded and continuous w. r. t. the
norm induced by the inner product

⟨u, v⟩ = ∫_a^b u(x)v(x)ρ(x) dx

Moreover,

⟨𝒢y, u⟩ = ⟨y, 𝒢u⟩,

i. e., 𝒢 is self-adjoint, since G(x, s) = G(s, x). These properties may be used to prove
the spectral theorem for the operator 𝒢. Again, we refer to the book of Coddington and
Levinson [13] for further details.
Consider the regular S–L problem

−d/dx (p(x) dy/dx) + q(x)y = λρ(x)y(x),   a < x < b    (21.23)
α1y(a) + α2y′(a) = 0   (α1² + α2² ≠ 0)    (21.24)
β1y(b) + β2y′(b) = 0   (β1² + β2² ≠ 0)    (21.25)
Recall that this is a regular S–L problem if (i) −∞ < a < b < ∞, (ii) p(x) ∈ C 1 (a, b) and
q(x), ρ(x) ∈ C 0 (a, b) (iii) ∃p0 > 0 and ρ0 > 0 such that p(x) ≥ p0 and ρ(x) ≥ ρ0 in [a, b].
For this case, the following asymptotic result on the eigenvalues may be obtained (for
details, we refer to the books by Courant and Hilbert [15] and Morse and Feshback
[23]).
Asymptotic results
For n → ∞,

λn^(1/2) ≈ nπ / {∫_a^b |ρ(x)/p(x)|^(1/2) dx} + C/n + O(n⁻²)    (21.26)

yn(x) ≈ [1 + O(n⁻²)] cos{nπ(x − a)/(b − a)} + [1 + O(n⁻²)] sin{nπ(x − a)/(b − a)}    (21.27)

when α2 ≠ 0, and
Example (the Graetz problem). Consider

y′′ = −λρ(x)y;   ρ(x) = (3/2)(1 − x²),   0 < x < 1
y′(0) = 0,   y(1) = 0

Here, p(x) = 1.0, q = 0 and ρ(x) = (3/2)(1 − x²) is the velocity profile. The first few
eigenvalues are listed in Table 21.1.
Table 21.1: The first five Graetz eigenvalues.

n    λn
1    1.88517
2    21.4315
3    62.3166
4    124.537
5    208.091
We note that the Graetz eigenfunctions satisfy the orthogonality relation based on the
weighted inner product:

∫_0^1 ρ(x)yn(x)ym(x) dx = 0,   n ≠ m

A plot of these functions with the normalization condition yn(0) = 1 is shown in
Figure 21.1.
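The Graetz eigenvalues in Table 21.1 can be reproduced by a shooting method: integrate y″ = −λρ(x)y from x = 0 with y(0) = 1, y′(0) = 0 and locate the values of λ for which y(1) = 0. A Python/SciPy sketch (the search grid and tolerances are arbitrary choices):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

rho = lambda x: 1.5 * (1.0 - x ** 2)   # velocity profile / weight function

def y_at_1(lam):
    # shoot: y'' = -lam*rho(x)*y with y(0) = 1, y'(0) = 0 (symmetry at x = 0)
    sol = solve_ivp(lambda x, y: [y[1], -lam * rho(x) * y[0]],
                    (0.0, 1.0), [1.0, 0.0], rtol=1e-9, atol=1e-11)
    return sol.y[0, -1]

# bracket the sign changes of y(1; lam) on a coarse grid, then refine
grid = np.linspace(0.5, 230.0, 300)
vals = np.array([y_at_1(l) for l in grid])
eigs = [brentq(y_at_1, a, b)
        for a, b, fa, fb in zip(grid[:-1], grid[1:], vals[:-1], vals[1:])
        if fa * fb < 0]
print([round(l, 4) for l in eigs])   # ≈ 1.8852, 21.4315, 62.3166, 124.537, 208.091
```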
Example (the Airy eigenfunctions). Consider

y′′ = −λρ(x)y;   ρ(x) = x,   0 < x < 1
y(0) = 0,   y(1) = 0

Here, p = 1, q = 0 and ρ = x. The first few eigenvalues are listed in Table 21.2.
Table 21.2: The first five Airy eigenvalues.

n    λn
1    18.9563
2    81.8866
3    189.221
4    340.967
5    537.126
Figure 21.2 shows first five of these eigenfunctions with constraint yn′ (0) = 1.
Figure 21.2: First five Airy eigenfunctions yn (x) satisfying yn′ (0) = 1.
21.2 Eigenfunction expansions | 503
The normalized eigenfunctions satisfy
$$\langle y_i, y_j\rangle = \int_a^b \rho(x)y_i(x)y_j(x)\,dx = \delta_{ij} = \begin{cases}1 & \text{if } i = j\\ 0 & \text{otherwise}\end{cases} \qquad (21.32)$$
We now consider the expansion of any arbitrary function f (x) in terms of the eigen-
functions {yi (x)}. We write
$$f(x) = \sum_{i=1}^\infty c_i y_i(x) \qquad (21.33)$$
In order to determine ci , multiply both sides of equation (21.33) by ρ(x)yj (x) and inte-
grate from a to b ⇒
$$\int_a^b f(x)y_j(x)\rho(x)\,dx = \int_a^b \rho(x)y_j(x)\left(\sum_{i=1}^\infty c_i y_i(x)\right)dx$$
Assuming that the summation and integration can be interchanged (this is so if the series in (21.33) converges uniformly), we get
$$\int_a^b f(x)y_j(x)\rho(x)\,dx = \sum_{i=1}^\infty c_i\int_a^b \rho(x)y_j(x)y_i(x)\,dx = c_j$$
Thus,
$$c_j = \langle f, y_j\rangle = \int_a^b \rho(x)f(x)y_j(x)\,dx \qquad (21.34)$$
and
$$f(x) = \sum_{i=1}^\infty \langle f, y_i\rangle y_i \qquad (21.35)$$
Definition. Let {yi (x)} be an infinite system of orthonormal functions relative to the
weight function ρ(x) on an interval [a, b]. If f (x) is any function for which the integrals
in equation (21.34) exist, the infinite series in equation (21.33) is called the eigenfunc-
tion expansion or Fourier series of f (x) relative to the system {yi (x), ρ(x)}. The coeffi-
cients ci are called Fourier coefficients of f (x) relative to {yi (x)}.
Remarks.
1. Historically, the term Fourier series is used when the orthonormal functions {yᵢ(x)} are the sine functions, cosine functions, or sine and cosine functions. Each of these families is generated from one of the following self-adjoint problems:
(S)
$$\lambda_n = n^2\pi^2, \quad y_n(x) = \sqrt{2}\sin(n\pi x), \quad n = 1, 2, \ldots$$
(C)
(P)
2. If the eigenfunctions are not normalized, then the expansion (equation (21.35)) is
of the form
21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 505
$$f(x) = \sum_{i=1}^\infty c_i y_i(x), \qquad c_i = \frac{\langle f, y_i\rangle}{\langle y_i, y_i\rangle}. \qquad (21.36)$$
3. When the functions {yᵢ(x)} are not sines or cosines, the expansion (21.33) is called an eigenfunction expansion, or a generalized Fourier series.
From this definition, it is seen that any convergent sequence is necessarily a Cauchy
sequence but the converse is not true.
Example 21.3.
(i) Let V be the set of rational numbers in [0, 1] and, for x, y ∈ V, define the metric/distance function d(x, y) = |x − y|. This space is not complete: sequences such as S_n = (n/(n + 1))ⁿ converge, but not to a point in V. In this case, lim_{n→∞} S_n = e⁻¹, which is irrational.
(ii) Let V = ℝ and d(x, y) = |x − y|. This space is complete.
(iii) Let V = C[a, b] and
$$d_\infty(f, g) = \sup_{a\le x\le b} |f(x) - g(x)|.$$
This space is complete.
(iv) Let V = C[a, b] and
$$d_2(f, g) = \sqrt{\int_a^b [f(x) - g(x)]^2\,dx}.$$
This space is not complete. For example, sequences such as {e^{−nx}}, {tanh(nx)} belong to C[a, b] for finite n, but as n → ∞ the limiting functions are not in C[a, b].
Metric spaces may be completed by appending the missing elements to the space. Thus, to the rational numbers we add all the limits of convergent (Cauchy) sequences, i.e., we add all the irrationals, to get the real number system, which is complete. Similarly, when we add to the space C[a, b] with the metric d₂ the limits of its Cauchy sequences, we get the space ℒ₂[a, b], the space of all Lebesgue square integrable functions on [a, b].
Recall the definition of the Riemann integral for a continuous or piecewise continuous function f(x) over an interval [a, b]. We divide the interval [a, b] into n subintervals and form the upper and lower Riemann sums. If these sums converge to the same value as n → ∞ and the largest subinterval size goes to zero, then the Riemann integral exists. Now, consider the Dirichlet function of Section 10.1, for which the Riemann integral does not exist. However, in the Lebesgue theory of integration, we ignore sets of measure zero (also referred to as null sets) in the integration process. Thus, the Lebesgue integral exists for the Dirichlet function. The set of all Lebesgue integrable functions is denoted by ℒ[a, b], while the set of Riemann integrable functions is denoted by R[a, b]. It is clear that C[a, b] ⊂ R[a, b] ⊂ ℒ[a, b].
Example 21.4.
(i) {C[a, b], d∞ } is a Banach space
(ii) {C[a, b], d2 } is not a Banach space
(iii) {ℝn , dp , 1 ≤ p < ∞} is a Banach space
(iv) {ℂn , dp , 1 ≤ p < ∞} is a Banach space
Every Hilbert space is a Banach space but the converse is not true.
Example 21.5.
(i) {ℝⁿ, ⟨u, v⟩ = Σ_{j=1}ⁿ uⱼvⱼ} is a Hilbert space.
(ii) {ℝⁿ, ⟨u, v⟩ = vᵀGu, G positive definite} is a Hilbert space.
(iii) {ℂⁿ, ⟨u, v⟩ = Σ_{j=1}ⁿ uⱼv̄ⱼ} is a Hilbert space.
(iv) {C[a, b], ⟨f, g⟩ = ∫_a^b f(x)g(x) dx} is not a Hilbert space.
(v) {ℒ₂[a, b], ⟨f, g⟩ = ∫_a^b ρ(x)f(x)g(x) dx, ρ(x) > 0} is a Hilbert space.
(vi) {ℒ₂C[a, b], ⟨f, g⟩ = ∫_a^b f(x)ḡ(x) dx} is a Hilbert space of complex-valued functions of a real variable.
Note that in the Hilbert space of example (v) above, the eigenfunction expansion
$$f(x) = \sum_{n=1}^\infty \langle f, y_n\rangle y_n(x)$$
converges in the mean-square sense, i.e.,
$$\lim_{N\to\infty}\int_a^b \left\{f(x) - \sum_{n=1}^N \langle f, y_n\rangle y_n\right\}^2 \rho(x)\,dx = 0,$$
and when we write f (x) ≗ g(x), the two functions may disagree on a set of points
having measure zero.
Definition. Let V be a normed linear space and U be a subset of V. Then we say that U is a dense subset of V if, given x ∈ V and any ε > 0, ∃x₀ ∈ U such that ‖x − x₀‖ < ε.
Example 21.6.
(i) The set of rational numbers is dense in the real line. Every irrational number can
be approximated as closely as required by a rational number.
(ii) The set of polynomials is dense in C[a, b] with the metric
$$d_\infty(f, g) = \sup_{a\le x\le b}|f(x) - g(x)|.$$
Theorem 21.1. Let f(x) be defined and continuous with two continuous derivatives on [a, b], and let f(x) satisfy the same boundary conditions as the eigenfunctions {ρ(x); ϕₙ(x), n = 1, 2, ...}. Then the eigenfunction expansion
$$f(x) = \sum_{n=1}^\infty c_n\phi_n(x) \qquad (21.37)$$
converges uniformly to f(x) on [a, b].
Theorem 21.2. Let f(x) be piecewise smooth on [a, b]. Then, for each x in [a, b], the eigenfunction expansion converges, and
$$\frac{f(x^+) + f(x^-)}{2} = \sum_{n=1}^\infty c_n\phi_n(x), \quad a < x < b \qquad (21.38)$$
Let Sₙ(x) denote the n-th partial sum; then
$$\lim_{n\to\infty}\int_a^b [f(x) - S_n(x)]^2\rho(x)\,dx = 0$$
Then
$$\|f\|^2 = \int_a^b \rho(x)f(x)^2\,dx = \sum_{n=1}^\infty c_n^2, \qquad (21.41)$$
i.e., given ε > 0, there exists n such that
$$\left(\|f\|^2 - \sum_{j=1}^n c_j^2\right) < \varepsilon \qquad (21.42)$$
Example. Expand f(x) = x(1 − x), 0 ≤ x ≤ 1, in terms of the orthonormal eigenfunctions u_j(x) = √2 sin(jπx):
$$f(x) = \sum_{j=1}^\infty c_j u_j(x)$$
where
$$c_j = \langle f, u_j\rangle = \int_0^1 f(\xi)\sqrt{2}\sin(j\pi\xi)\,d\xi = \frac{2\sqrt{2}}{j^3\pi^3}(1 - \cos j\pi) = \begin{cases}0 & \text{if } j \text{ is even}\\[4pt] \dfrac{4\sqrt{2}}{j^3\pi^3} & \text{if } j \text{ is odd}\end{cases}$$
The remainder after the first N (odd-indexed) terms is
$$R_N(x) = \frac{8}{\pi^3}\sum_{k=N}^\infty \frac{\sin(2k+1)\pi x}{(2k+1)^3}.$$
Note that
$$|R_N(x)| = \frac{8}{\pi^3}\left|\sum_{k=N}^\infty \frac{\sin(2k+1)\pi x}{(2k+1)^3}\right| \le \frac{8}{\pi^3}\sum_{k=N}^\infty \frac{1}{(2k+1)^3} < \frac{8}{\pi^3}\int_{N-1}^\infty \frac{dN}{(2N+1)^3}$$
$$\therefore\quad |R_N(x)| < \frac{8}{\pi^3}\cdot\frac{1}{4(2N-1)^2}$$
For N = 3 ⇒
$$|R_N(x)| < 0.00258$$
Thus, the three-term partial sum
$$\frac{8}{\pi^3}\left[\sin\pi x + \frac{\sin 3\pi x}{27} + \frac{\sin 5\pi x}{125}\right]$$
approximates f(x) = x(1 − x) within an accuracy of 0.0026 for all x. Figure 21.3 shows plots of the exact function f(x) along with the Fourier series expansion using the first few terms.
Figure 21.3: Plot of the function f(x) = x(1 − x) and its representation by the Fourier series expansion using only the first term and the first two terms.
In this case, the function f (x) is twice differentiable and satisfies the same BCs as the
eigenfunctions. Thus, the convergence is uniform.
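The remainder bound can be confirmed by direct evaluation; a short Python sketch (the grid resolution is our choice for illustration):

```python
import math

def f(x):
    return x * (1 - x)

def partial_sum(x, n_terms):
    # (8/pi^3) * sum over odd j = 2k+1, k = 0..n_terms-1, of sin(j*pi*x)/j^3
    return (8 / math.pi**3) * sum(
        math.sin((2*k + 1) * math.pi * x) / (2*k + 1)**3 for k in range(n_terms))

# three terms (j = 1, 3, 5) correspond to N = 3 in the remainder bound
max_err = max(abs(f(i / 200) - partial_sum(i / 200, 3)) for i in range(201))
```

The observed maximum error stays comfortably below the bound 0.00258, which is conservative because it discards the sign cancellation in the remainder.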
Let us expand the discontinuous function
$$f(x) = \begin{cases}1, & 0 < x < \pi/2\\ 0, & \pi/2 < x < \pi\end{cases}$$
in terms of the eigenfunctions of the S–L problem y″ = −λy, y′(0) = 0 = y′(π), whose eigenvalues and normalized eigenfunctions are
$$\lambda_n = \begin{cases}0, & n = 0\\ n^2, & n = 1, 2, 3, \ldots\end{cases} \qquad y_0(x) = \frac{1}{\sqrt{\pi}}, \quad y_n(x) = \sqrt{\frac{2}{\pi}}\cos nx$$
⇒
$$c_0 = \langle y_0, f\rangle = \frac{1}{\sqrt{\pi}}\int_0^\pi f(x)\,dx = \frac{1}{\sqrt{\pi}}\cdot\frac{\pi}{2} = \frac{\sqrt{\pi}}{2}$$
and
$$c_j = \langle y_j, f\rangle = \sqrt{\frac{2}{\pi}}\int_0^{\pi/2}\cos jx\,dx = \sqrt{\frac{2}{\pi}}\,\frac{\sin(j\pi/2)}{j}$$
Thus,
$$c_0 = \frac{\sqrt{\pi}}{2}, \qquad c_{2k} = 0 \quad\text{and}\quad c_{2k+1} = (-1)^k\sqrt{\frac{2}{(2k+1)^2\pi}}, \quad k = 0, 1, 2, 3, \ldots$$
⇒
$$f(x) = \frac{\sqrt{\pi}}{2}\cdot\frac{1}{\sqrt{\pi}} + \sum_{k=0}^\infty \sqrt{\frac{2}{(2k+1)^2\pi}}\,(-1)^k\sqrt{\frac{2}{\pi}}\cos(2k+1)x = \frac{1}{2} + \frac{2}{\pi}\sum_{k=0}^\infty \frac{(-1)^k}{2k+1}\cos(2k+1)x. \qquad (21.44)$$
Figure 21.4 shows the plot of this function and its representation by the Fourier series expansion with one, three, five and one hundred terms. Note the Gibbs phenomenon of over- and undershoot at the point of discontinuity. In this example, the function f(x) is not differentiable at x = π/2. Hence, the eigenfunction expansion converges in the mean-square norm.
Figure 21.4: Representation by the Fourier series expansion using only the first 1, 3, 5 and 100 terms, demonstrating the Gibbs phenomenon of over- and undershoot at the point of discontinuity.
Applying Parseval's relation (21.41) to this expansion
⇒
$$\frac{\pi}{4} + \sum_{k=0}^\infty \frac{2}{\pi}\cdot\frac{1}{(2k+1)^2} = \int_0^{\pi/2} dx = \frac{\pi}{2}$$
⇒
$$\frac{2}{\pi}\sum_{k=0}^\infty \frac{1}{(2k+1)^2} = \frac{\pi}{4}$$
or
$$\sum_{k=0}^\infty \frac{1}{(2k+1)^2} = \frac{\pi^2}{8} \qquad (21.45)$$
Similarly, evaluating the expansion (21.44) at x = 0 gives
$$1 = \frac{1}{2} + \frac{2}{\pi}\sum_{k=0}^\infty \frac{(-1)^k}{2k+1}$$
⇒
$$\frac{\pi}{4} = \sum_{k=0}^\infty \frac{(-1)^k}{2k+1} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \cdots \qquad (21.46)$$
Many such relations (equations (21.45) and (21.46)) can be derived using eigenfunction
expansions.
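Both identities are easy to check numerically; a quick Python sketch (the truncation levels are arbitrary, and for the alternating series the error is bounded by the first omitted term):

```python
import math

# sum of 1/(2k+1)^2  ->  pi^2/8
s = sum(1.0 / (2*k + 1)**2 for k in range(200000))

# Leibniz series: sum of (-1)^k/(2k+1)  ->  pi/4
p = sum((-1)**k / (2*k + 1) for k in range(200000))
```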
As a two-dimensional example, consider the eigenvalue problem
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = -\lambda u, \quad 0 < x < 1, \; 0 < y < 1$$
$$u(0, y) = u(1, y) = u(x, 0) = u(x, 1) = 0$$
The normalized eigenfunctions are w_{ij}(x, y) = 2 sin(iπx) sin(jπy) with eigenvalues λ_{ij} = (i² + j²)π². Expanding f(x, y) = x(1 − x)y(1 − y) in these eigenfunctions, the coefficients are
$$c_{ij} = \left[\frac{2\sqrt{2}}{i^3\pi^3}(1 - \cos i\pi)\right]\cdot\left[\frac{2\sqrt{2}}{j^3\pi^3}(1 - \cos j\pi)\right] = \begin{cases}0 & \text{if } i \text{ or } j \text{ is even}\\[4pt] \dfrac{32}{i^3 j^3\pi^6} & \text{if } i \text{ and } j \text{ are odd}\end{cases}$$
$$x(1 - x)y(1 - y) = \frac{64}{\pi^6}\sum_{k=0}^\infty\sum_{l=0}^\infty \frac{\sin(2k+1)\pi x\,\sin(2l+1)\pi y}{(2k+1)^3(2l+1)^3}$$
$$= \frac{64}{\pi^6}\left[\sum_{k=0}^\infty \frac{\sin(2k+1)\pi x}{(2k+1)^3}\right]\left[\sum_{l=0}^\infty \frac{\sin(2l+1)\pi y}{(2l+1)^3}\right]$$
$$= \frac{64}{\pi^6}\left[\sin\pi x + \frac{\sin 3\pi x}{27} + \frac{\sin 5\pi x}{125} + \cdots\right]\left[\sin\pi y + \frac{\sin 3\pi y}{27} + \frac{\sin 5\pi y}{125} + \cdots\right] \qquad (21.47)$$
Figure 21.5 shows plots of the exact function f(x, y) along with its 2D Fourier series expansion using the first few terms. The convergence here is uniform.
Figure 21.5: Plot of function f (x, y) = x(1 − x)y(1 − y) and its representation with Fourier series
expansion using 1 × 1 and 2 × 2 terms.
Let {λᵢ} and {yᵢ(x)}, i = 1, 2, 3, ..., be the eigenvalues and normalized eigenfunctions, respectively, and let G(x, s) be the Green's function. We have seen that equations (21.48)–(21.49) may be written as
$$y(x) = \lambda\int_a^b G(x, s)\rho(s)y(s)\,ds$$
Thus,
21.3 Convergence in function spaces and introduction to Banach and Hilbert spaces | 515
$$\frac{y_i(x)}{\lambda_i} = \int_a^b G(x, s)\rho(s)y_i(s)\,ds \qquad (21.51)$$
Expanding G(x, s), for fixed x, in the eigenfunctions,
$$G(x, s) = \sum_{i=1}^\infty c_i y_i(s)$$
⇒
$$c_i = \langle G(x, \cdot), y_i\rangle = \int_a^b \rho(s)G(x, s)y_i(s)\,ds = \frac{y_i(x)}{\lambda_i}$$
⇒
$$\int_a^b G(x, s)^2\rho(s)\,ds = \sum_{i=1}^\infty \frac{y_i(x)^2}{\lambda_i^2}$$
Multiply both sides by ρ(x), integrate from a to b, and use the relation
$$\int_a^b \rho(x)y_i^2(x)\,dx = 1$$
⇒
$$\sum_{i=1}^\infty \frac{1}{\lambda_i^2} = \int_a^b\left[\int_a^b G(x, s)^2\rho(s)\,ds\right]\rho(x)\,dx \qquad (21.53)$$
Since G(x, s) is continuous, the integral on the RHS of equation (21.53) is finite. Therefore, Σ_{i=1}^∞ 1/λᵢ² converges
⇒
$$\frac{1}{\lambda_i^2} \to 0 \quad\text{for } i\to\infty \;\Rightarrow\; \lambda_i^2 \to\infty \text{ as } i\to\infty$$
Example 21.10. Consider −y″ = λy, 0 < x < 1, y(0) = y(1) = 0, whose Green's function is G(x, s) = s(1 − x) for s ≤ x and x(1 − s) for s ≥ x.
⇒
$$\int_0^1\int_0^1 G(x, s)^2\,ds\,dx = \int_0^1\left[\int_0^x s^2(1-x)^2\,ds + \int_x^1 x^2(1-s)^2\,ds\right]dx$$
$$= \int_0^1\left[\frac{x^3(1-x)^2}{3} + \frac{x^2(1-x)^3}{3}\right]dx = \frac{1}{3}\int_0^1 x^2(1-x)^2\,dx = \frac{1}{90}$$
The eigenvalues are
$$\lambda_n = n^2\pi^2, \quad n = 1, 2, 3, \ldots$$
⇒
$$\sum_{i=1}^\infty \frac{1}{\lambda_i^2} = \int_0^1\int_0^1 G(x, s)^2\,ds\,dx \;\Rightarrow\; \sum_{n=1}^\infty \frac{1}{n^4\pi^4} = \frac{1}{90} \;\Rightarrow\; \sum_{n=1}^\infty \frac{1}{n^4} = \frac{\pi^4}{90}.$$
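Both sides of the trace identity can be checked numerically; a Python sketch using the Green's function G(x, s) = s(1 − x) for s ≤ x (and x(1 − s) otherwise):

```python
import math

def G(x, s):
    return s * (1 - x) if s <= x else x * (1 - s)

# double trapezoid rule for the integral of G(x,s)^2 over the unit square
n = 400
h = 1.0 / n
I = 0.0
for i in range(n + 1):
    wi = 0.5 if i in (0, n) else 1.0
    for j in range(n + 1):
        wj = 0.5 if j in (0, n) else 1.0
        I += wi * wj * G(i * h, j * h)**2
I *= h * h

# partial sum of 1/lambda_n^2 = 1/(n^4 pi^4)
S = sum(1.0 / (k * math.pi)**4 for k in range(1, 1000))
```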
Problems
1. Consider the eigenvalue problem
which arises in solving the unsteady-state heat (mass) diffusion problem in a flat
plate.
(a) Show that the eigenvalue problem is self-adjoint.
(b) Determine the first eigenvalue as a function of the Biot number, i. e., compute
λ1 for Bi = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0. Iden-
tify the two asymptotes.
(c) Determine the eigenvalues and eigenfunctions for the two limiting cases of no
external resistance (Bi → ∞) and no internal resistance (Bi → 0).
2. Consider the eigenvalue problem
$$\frac{1}{\text{Pe}}y'' - y' = -\lambda y, \quad 0 < x < 1$$
$$\frac{1}{\text{Pe}}y'(0) - y(0) = 0, \qquad y'(1) = 0 \quad (\text{Pe} = \text{Peclet number}),$$
which arises in solving unsteady-state diffusion–convection–reaction problems with Danckwerts' boundary conditions.
(a) Show that the substitutions
$$y = w\exp\left(\frac{x\,\text{Pe}}{2}\right), \qquad \Lambda = \lambda\,\text{Pe} - \frac{\text{Pe}^2}{4}$$
transform the problem to
$$w'' = -\Lambda w; \qquad w'(0) - \frac{\text{Pe}}{2}w(0) = 0, \quad w'(1) + \frac{\text{Pe}}{2}w(1) = 0$$
(b) Determine the first eigenvalue as a function of the Peclet number, i. e., com-
pute λ1 for Pe = 0.01, 0.1, 1, 10, 100. Identify the two asymptotes.
(c) Determine and sketch the first eigenfunction for different values of the Peclet
number.
3. Consider the Schrödinger equation in one spatial dimension
$$-\frac{h^2}{8\pi^2 m}\frac{d^2\psi}{dx^2} + V(x)\psi(x) = E\psi(x)$$
where ψ(x) is the wave function. E is the total energy of the particle (electron) and
V(x) is the potential energy of the particle at position x. For a square well potential,
V(x) is zero for 0 < x < L and is infinity outside this region. The appropriate
boundary conditions for the wave function are
ψ(0) = ψ(L) = 0
(a) Determine the permitted energy levels for the particle. What is the separation
between neighboring quantum levels?
(b) Sketch the first five wave functions.
4. The following fourth-order eigenvalue problem arises in the stability analysis of a
fluid filled porous medium confined between two parallel plates kept at different
temperatures (Lapwood convection):
ϕ′′ − k 2 ϕ + ψ − λϕ = 0
ψ′′ − k 2 ψ + Ra k 2 ϕ = 0
ϕ(0) = ϕ(1) = 0
ψ(0) = ψ(1) = 0
Here, k is the wave number and Ra is the Rayleigh number. Show that the eigen-
values are real.
5. Consider the eigenvalue problem
$$\lambda_1 = \min_y \frac{\langle y, y\rangle}{\langle Gy, y\rangle},$$
$$J_{1/3}\left(\frac{2}{3}\sqrt{\lambda}\right) = 0$$
Find the eigenvalues from appropriate tables (Abramowitz and Stegun [2]).
8. (a) Let C 1 [a, b] be the space of continuously differentiable functions on [a, b] with
the norm
$$\|f\| = \sup_{a\le x\le b}\{|f(x)| + |f'(x)|\}$$
(b) Consider the same space with the norm
$$\|f\| = \sqrt{\int_a^b \left(f(x)^2 + f'(x)^2\right)dx}$$
Identify the missing elements that need to be added to C 1 [a, b] to make it a Hilbert
space (the resulting Hilbert space is called a Sobolev space).
22 Introduction to the solution of linear integral
equations
22.1 Introduction
An integral equation (IE) is an equation in which the unknown function appears under
one or more integrals. IEs appear in applications that deal with particulate processes
or population balances. In addition, as we have seen in previous chapters, the solution
of many initial and boundary value problems may be expressed in terms of integral
equations. This chapter is a brief introduction to the theory of linear integral equations
in one dependent and one independent variable.
The general form of a linear IE is
$$h(x)u(x) = f(x) + \lambda\int_a^{b(x)} K(x, s)u(s)\,ds, \qquad (22.1)$$
with upper limit b(x) = b (fixed) or b(x) = x,
where u(x) is the unknown function; h(x), f (x) and K(x, s) are known functions and
λ(≠ 0) is a real or complex parameter. If f (x) = 0, the IE is called homogeneous. If the
upper limit of integration is fixed (e. g., x = b), then it is called a Fredholm equation.
Otherwise, it is called a Volterra equation. If the unknown function u(x) appears only
under the integral sign (e. g., h(x) = 0), it is called the integral equation of the first
kind. If it appears both inside and outside of the integral sign, it is called the integral
equation of the second kind.
Examples.
1.
https://doi.org/10.1515/9783110739701-023
22.2 Transformation of an IVP into an IE of Volterra type | 521
In these equations, K(x, s) is called the kernel. When one or both limits of integration
become infinity or when the kernel becomes infinity within the range of integration,
the IE is called a singular integral equation.
For example, the Laplace transform defined by
$$\hat{u}(s) = \int_0^\infty e^{-st}u(t)\,dt \qquad (22.6)$$
is a singular IE of the first kind with kernel K(t, s) = e−st . The Fourier transform defined
by
$$\hat{f}(\alpha) = \int_{-\infty}^\infty e^{-i\alpha x}f(x)\,dx \qquad (22.7)$$
is also a singular integral equation of the first kind with kernel K(x, α) = e^{−iαx}. The other types
of kernels that are of interest are
(i) Separable kernel: K(x, s) = ∑nj=1 aj (x)bj (s).
(ii) Symmetric kernel: K(x, s) = K(s, x).
(iii) Convolution kernel: K(x, s) = h(x − s).
In the next two sections, we review and outline the procedure for converting initial
and boundary value problems into integral equations.
Consider the linear second-order IVP
$$\frac{d^2u}{dt^2} + a_1(t)\frac{du}{dt} + a_2(t)u = g(t), \quad t > 0 \qquad (22.8)$$
$$u(0) = \alpha_0, \qquad u'(0) = \alpha_1 \qquad (22.9)$$
Let
$$\frac{d^2u}{dt^2} = h(t) \qquad (22.10)$$
Then
$$\frac{du}{dt} = \alpha_1 + \int_0^t h(s)\,ds \qquad (22.11)$$
and
$$u(t) = \alpha_0 + \alpha_1 t + \int_0^t\int_0^{t'} h(s)\,ds\,dt' \qquad (22.12)$$
Changing the order of integration in the double integral (in equation (22.12)) and simplifying gives
$$u(t) = \alpha_0 + \alpha_1 t + \int_0^t (t - s)h(s)\,ds \qquad (22.13)$$
Now, multiplying d^j u/dt^j by a_{2−j}(t) and summing from j = 0 to 2 with a₀(t) = 1 gives
$$h(t) = f(t) + \int_0^t K(t, s)h(s)\,ds \qquad (22.14)$$
where
$$f(t) = g(t) - \alpha_1 a_1(t) - (\alpha_0 + \alpha_1 t)a_2(t), \qquad K(t, s) = -[a_1(t) + (t - s)a_2(t)].$$
Once h(t) is known, we can determine u(t) from equation (22.13). Thus, the IVP is re-
duced to a Volterra integral equation of the second kind.
The above procedure can be extended to the n-th order IVP. In this case, equation
(22.13) becomes
$$u(t) = \alpha_0 + \alpha_1 t + \cdots + \alpha_{n-1}\frac{t^{n-1}}{(n-1)!} + \int_0^t \frac{(t-s)^{n-1}}{(n-1)!}h(s)\,ds \qquad (22.15)$$
$$f(t) = g(t) - \alpha_{n-1}a_1(t) - \cdots - \left[\alpha_0 + \alpha_1 t + \cdots + \alpha_{n-1}\frac{t^{n-1}}{(n-1)!}\right]a_n(t) \qquad (22.16)$$
22.3 Transformation of TPBVP into an IE of Fredholm type | 523
$$K(t, s) = -\sum_{j=1}^n \frac{(t-s)^{j-1}}{(j-1)!}a_j(t) \qquad (22.17)$$
We note that for the special case of homogeneous initial conditions (αj = 0), f (t) =
g(t) and the kernel depends only on the coefficient functions aj (t).
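The reduction is easy to exercise numerically. For the test problem u″ + u = 0, u(0) = 0, u′(0) = 1 (so a₁ = 0, a₂ = 1, g = 0, with exact solution sin t), equations (22.16)–(22.17) give f(t) = −t and K(t, s) = −(t − s); the resulting Volterra equation for h = u″ can be marched forward with the trapezoid rule. A Python sketch (the test problem and step size are our choices for illustration):

```python
import math

n, T = 1000, 1.0
dt = T / n
t = [i * dt for i in range(n + 1)]
h = [0.0] * (n + 1)          # h(t) = u''(t), unknown of the Volterra equation
h[0] = -t[0]                 # h(0) = f(0) = 0
for i in range(1, n + 1):
    # trapezoid rule for integral_0^{t_i} K(t_i, s) h(s) ds with K = -(t - s);
    # K(t_i, t_i) = 0, so the end weight drops and h[i] is explicit
    acc = 0.5 * (-(t[i] - t[0])) * h[0]
    for j in range(1, i):
        acc += -(t[i] - t[j]) * h[j]
    h[i] = -t[i] + dt * acc

# reconstruct u(1) from u(t) = t + integral_0^t (t - s) h(s) ds
acc = 0.5 * (t[n] - t[0]) * h[0]
for j in range(1, n):
    acc += (t[n] - t[j]) * h[j]
u_end = t[n] + dt * acc
```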
where G(x, s) is the Green’s function. We note that equation (22.20) is valid when f (x) is
replaced by a more general (and possibly nonlinear) source term of the form h(x, u(x)),
in which case equation (22.20) becomes a nonlinear IE of the form
Thus, two-point BVPs can be transformed into Fredholm integral equations with the
kernel being the Green’s function. We have also seen that for the special case in which
the homogeneous two-point BVP is self-adjoint, the Green’s function (kernel) is sym-
metric. The kernel can also be made symmetric for the more general case in which the
weight function in the inner product is not unity. For example, the Sturm–Liouville
eigenvalue problem
$$\frac{d}{dx}\left(p(x)\frac{du}{dx}\right) - q(x)u(x) = -\lambda\rho(x)u(x), \quad a < x < b \qquad (22.22)$$
$$u(a) = 0, \qquad u(b) = 0 \qquad (22.23)$$
may be written as the integral equation
$$u(x) = \lambda\int_a^b G(x, s)\rho(s)u(s)\,ds.$$
However, the kernel G(x, s)ρ(s) is not symmetric when ρ(s) is not unity. By defining
$$v(x) = \sqrt{\rho(x)}\,u(x),$$
the equation becomes
$$v(x) = \lambda\int_a^b \tilde{K}(x, s)v(s)\,ds, \qquad \tilde{K}(x, s) = \sqrt{\rho(x)}\,G(x, s)\sqrt{\rho(s)},$$
which has a symmetric kernel. This is possible since the density function ρ(x) is strictly positive in (a, b).
We consider the case of separable kernel as the solution procedure for this case may
be related to that of linear algebraic equations. Expressing the kernel as
$$K(x, s) = \sum_{i=1}^N a_i(x)b_i(s). \qquad (22.29)$$
Without loss of generality, we assume that the functions ai (x) and bi (s) are linearly
independent. If they are not, we can combine the terms and reduce the number of
terms in the summation.
We first consider the homogeneous Fredholm IE, i. e., equation (22.28) with f (x) = 0,
with separable kernel (equation (22.29)), which can be expressed as
$$u(x) = \lambda\int_a^b \left(\sum_{i=1}^N a_i(x)b_i(s)\right)u(s)\,ds. \qquad (22.30)$$
⇒
$$u(x) = \lambda\sum_{i=1}^N a_i(x)\int_a^b b_i(s)u(s)\,ds.$$
Let
$$c_i = \int_a^b b_i(s)u(s)\,ds \qquad (22.32)$$
⇒
$$u(x) = \lambda\sum_{j=1}^N c_j a_j(x). \qquad (22.33)$$
Substituting equation (22.33) into equation (22.32) gives
$$c_i = \int_a^b b_i(s)\,\lambda\sum_{j=1}^N c_j a_j(s)\,ds = \lambda\sum_{j=1}^N A_{ij}c_j$$
where
$$A_{ij} = \int_a^b b_i(s)a_j(s)\,ds.$$
In vector form,
$$\mathbf{c} = \lambda A\mathbf{c}$$
or
$$(I - \lambda A)\mathbf{c} = 0. \qquad (22.36)$$
Let
$$D(\lambda) = \det(I - \lambda A).$$
If D(λ) ≠ 0, then the only solution to equation (22.36) is the trivial one, i.e., c = 0, which implies u(x) ≡ 0 is the only solution to the homogeneous equation (22.30). The λ-values for which D(λ) = 0 are called the eigenvalues of the kernel. There are at most N of them. The nontrivial solution c corresponding to an eigenvalue gives a nontrivial u(x) = λΣ_{j=1}^N cⱼaⱼ(x). These are the eigenfunctions of the kernel.
Now, consider the inhomogeneous Fredholm IE (equation (22.28)) with separable ker-
nel (equation (22.29)), which can be expressed as
$$u(x) = f(x) + \lambda\int_a^b\left[\sum_{i=1}^N a_i(x)b_i(s)\right]u(s)\,ds \qquad (22.38)$$
⇒
$$u(x) = f(x) + \lambda\sum_{i=1}^N a_i(x)c_i \qquad (22.39)$$
where
$$c_i = \int_a^b b_i(s)u(s)\,ds = \int_a^b b_i(s)\left[f(s) + \lambda\sum_{j=1}^N a_j(s)c_j\right]ds = f_i + \lambda\sum_{j=1}^N A_{ij}c_j$$
In vector form,
$$\mathbf{c} = \mathbf{f} + \lambda A\mathbf{c} \qquad (22.40)$$
where
$$f_i = \int_a^b b_i(s)f(s)\,ds$$
⇒
$$(I - \lambda A)\mathbf{c} = \mathbf{f}. \qquad (22.41)$$
If D(λ) ≠ 0,
$$\mathbf{c} = (I - \lambda A)^{-1}\mathbf{f}$$
⇒
22.4 Solution of Fredholm integral equations with separable kernels | 527
$$c_i = \frac{1}{D(\lambda)}\sum_{j=1}^N D_{ij}(\lambda)f_j \qquad (22.42)$$
where Dij (λ) = (i, j)th element of the classical adjoint of (I−λA), i. e., matrix of cofactors.
Substituting equation (22.42) into equation (22.39) gives
$$u(x) = f(x) + \lambda\sum_{i=1}^N \frac{a_i(x)}{D(\lambda)}\left(\sum_{j=1}^N D_{ij}(\lambda)f_j\right) = f(x) + \lambda\sum_{i=1}^N\sum_{j=1}^N \frac{a_i(x)}{D(\lambda)}D_{ij}(\lambda)\int_a^b b_j(s)f(s)\,ds$$
⇒
$$u(x) = f(x) + \lambda\int_a^b \Gamma(x, s, \lambda)f(s)\,ds \qquad (22.43)$$
where
$$\Gamma(x, s, \lambda) = \sum_{i=1}^N\sum_{j=1}^N \frac{a_i(x)D_{ij}(\lambda)b_j(s)}{D(\lambda)} \qquad (22.44)$$
is called the resolvent kernel. Thus, when the kernel is separable and D(λ) ≠ 0, the
solution of the Fredholm equation of the second kind is given by equations (22.43)
and (22.44). Further, it can be shown that the solution in this case is unique.
Example. Consider u(x) = f(x) + λ∫₀¹(x + s)u(s) ds. Here,
$$K(x, s) = (x + s) = x\cdot 1 + 1\cdot s = a_1(x)b_1(s) + a_2(x)b_2(s)$$
with a₁(x) = x, b₁(s) = 1, a₂(x) = 1, b₂(s) = s. Thus, A_{ij} = ∫₀¹ bᵢ(s)aⱼ(s) ds leads to
$$A_{11} = \int_0^1 s\cdot 1\,ds = \frac{1}{2}, \quad A_{12} = \int_0^1 1\cdot 1\,ds = 1, \quad A_{21} = \int_0^1 s\cdot s\,ds = \frac{1}{3}, \quad A_{22} = \int_0^1 1\cdot s\,ds = \frac{1}{2}$$
⇒
$$A = \begin{pmatrix} 1/2 & 1 \\ 1/3 & 1/2 \end{pmatrix} \;\Rightarrow\; (I - \lambda A) = \begin{pmatrix} 1 - \frac{\lambda}{2} & -\lambda \\ -\frac{\lambda}{3} & 1 - \frac{\lambda}{2} \end{pmatrix}$$
$$\Rightarrow\; D(\lambda) = 1 - \lambda - \frac{\lambda^2}{12}$$
⇒
$$\mathbf{c} = \frac{1}{1 - \lambda - \frac{\lambda^2}{12}}\begin{pmatrix} 1 - \frac{\lambda}{2} & \lambda \\ \frac{\lambda}{3} & 1 - \frac{\lambda}{2} \end{pmatrix}\begin{pmatrix} f_1 \\ f_2 \end{pmatrix}$$
where
$$f_1 = \int_0^1 f(s)\,ds, \qquad f_2 = \int_0^1 s f(s)\,ds,$$
so that u(x) = f(x) + λ∫₀¹ Γ(x, s, λ)f(s) ds, where
$$\Gamma(x, s, \lambda) = \frac{6(\lambda - 2)(x + s) - 12\lambda xs - 4\lambda}{\lambda^2 + 12\lambda - 12}.$$
We note that D(λ) = 0 ⇒ λ1,2 = −6 ± 4√3. For these values of λ, the homogeneous
equation has nontrivial solutions. Hence, the above solution is valid only if λ ≠ λ1 , λ2 .
To determine the nontrivial solutions (or eigenfunctions) of the homogeneous equation for λ = λ₁, λ₂, we have
$$(I - \lambda_1 A)\mathbf{c} = 0 \;\Rightarrow\; c_1 = \sqrt{3},\; c_2 = 1$$
$$(I - \lambda_2 A)\mathbf{c} = 0 \;\Rightarrow\; c_1 = -\sqrt{3},\; c_2 = 1$$
⇒
$$u_1(x) = \lambda_1\sum_{j=1}^2 c_j a_j(x) = \lambda_1(\sqrt{3}x + 1)$$
$$u_2(x) = \lambda_2(-\sqrt{3}x + 1)$$
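The eigenvalue computation above takes only a few lines to verify; a Python sketch:

```python
import math

# A_ij = integral_0^1 b_i(s) a_j(s) ds for a = (x, 1), b = (1, s)
A = [[0.5, 1.0],
     [1.0 / 3.0, 0.5]]

def D(lam):
    # det(I - lam*A); expands to 1 - lam - lam^2/12
    return (1 - lam * A[0][0]) * (1 - lam * A[1][1]) - (lam * A[0][1]) * (lam * A[1][0])

lam1 = -6 + 4 * math.sqrt(3)
lam2 = -6 - 4 * math.sqrt(3)

# eigenvector of (I - lam1*A)c = 0: (1 - lam/2) c1 = lam c2, so c1/c2 = lam/(1 - lam/2)
ratio = lam1 / (1 - lam1 / 2)
```

The ratio c₁/c₂ comes out as √3, matching the eigenfunction u₁ ∝ √3x + 1.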
Example. Consider u(x) = f(x) + λ∫₋₁¹(xs + x²s²)u(s) ds. Here,
$$K(x, s) = xs + x^2s^2 = a_1(x)b_1(s) + a_2(x)b_2(s), \qquad a_1 = x,\; b_1 = s,\; a_2 = x^2,\; b_2 = s^2$$
⇒
$$A = \begin{pmatrix} 2/3 & 0 \\ 0 & 2/5 \end{pmatrix} \;\Rightarrow\; c_1 = \frac{f_1}{1 - \frac{2}{3}\lambda}, \quad c_2 = \frac{f_2}{1 - \frac{2}{5}\lambda}$$
where
$$f_1 = \int_{-1}^1 s f(s)\,ds, \qquad f_2 = \int_{-1}^1 s^2 f(s)\,ds$$
⇒
$$u(x) = f(x) + \lambda\int_{-1}^1 \Gamma(x, s, \lambda)f(s)\,ds$$
where
$$\Gamma(x, s, \lambda) = \frac{sx}{1 - \frac{2\lambda}{3}} + \frac{s^2x^2}{1 - \frac{2\lambda}{5}}$$
is the resolvent kernel. Thus, the solution is unique for any λ except λ = 3/2 or λ = 5/2. For λ = 3/2, u₁(x) = x and, for λ = 5/2, u₂(x) = x² is a solution to the homogeneous equation.
Consider the Volterra integral equation (VIE) of the second kind
$$u(t) = f(t) + \lambda\int_0^t K(t, s)u(s)\,ds \qquad (22.45)$$
where K(t, s) is the kernel of the equation, f(t) is a continuous function and λ is a parameter. There exist various methods for solving equation (22.45). We discuss two of these here.
Let u₀(t) be the initial guess; then the VIE of the second kind (equation (22.45)) generates the sequence u₀(t), u₁(t), ..., uₙ(t) defined by
$$u_{n+1}(t) = f(t) + \lambda\int_0^t K(t, s)u_n(s)\,ds.$$
Let
$$u(t) = \lim_{n\to\infty} u_n(t)$$
when it exists. In the so-called Picard's method, u₀(t) = f(t), and the recurrence relation leads to
$$u_1 - u_0 = \lambda\int_0^t K(t, s)f(s)\,ds$$
$$u_2 - u_1 = \lambda^2\int_0^t\int_0^s K(t, s)K(s, s')f(s')\,ds'\,ds$$
which, after interchanging the order of integration and simplifying further, gives
$$u_2 - u_1 = \lambda^2\int_0^t K_2(t, s')f(s')\,ds'$$
where
$$K_2(t, s') = \int_{s'}^t K(t, s)K(s, s')\,ds$$
Similarly,
$$u_3 - u_2 = \lambda^3\int_0^t K_3(t, s')f(s')\,ds'; \qquad K_3(t, s') = \int_{s'}^t K(t, s)K_2(s, s')\,ds$$
and, in general,
$$u_n - u_{n-1} = \lambda^n\int_0^t K_n(t, s')f(s')\,ds' \qquad (22.52)$$
with
$$K_n(t, s') = \int_{s'}^t K(t, s)K_{n-1}(s, s')\,ds, \qquad K_1 = K,$$
or
$$u_n(t) = f(t) + \lambda\sum_{i=1}^n \lambda^{i-1}\int_0^t K_i(t, s')f(s')\,ds' \qquad (22.54)$$
Example. Consider the kernel
$$K(t, s') = e^{t-s'}$$
⇒
$$K_2(t, s') = \int_{s'}^t e^{t-s}e^{s-s'}\,ds = (t - s')e^{t-s'}$$
$$K_3(t, s') = \int_{s'}^t e^{t-s}(s - s')e^{s-s'}\,ds = e^{t-s'}\frac{(t-s')^2}{2!}$$
⇒
$$\Gamma(t, s', \lambda) = e^{t-s'} + \lambda(t - s')e^{t-s'} + \lambda^2 e^{t-s'}\frac{(t-s')^2}{2!} + \cdots = e^{t-s'}e^{\lambda(t-s')} = \exp[(1+\lambda)(t-s')]$$
and the solution is
$$u(t) = f(t) + \lambda\int_0^t e^{(1+\lambda)(t-s')}f(s')\,ds'.$$
In general, the solution is
$$u(t) = f(t) + \sum_{i=1}^\infty \lambda^i\int_0^t K_i(t, s')f(s')\,ds'$$
Remarks. (a) The above series is called the Neumann series. (b) When the kernel is bounded, the Neumann series converges, since repeated integration leads to terms of the form (t − s′)ⁱ/i! for the iterated kernel.
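The resolvent exp[(1 + λ)(t − s′)] can be cross-checked against a direct numerical solution of the Volterra equation. With f(t) = 1 and λ = 1 (our illustrative choices), the resolvent predicts u(t) = 1 + ∫₀ᵗ e^{2(t−s)} ds = (e^{2t} + 1)/2. A Python sketch:

```python
import math

lam, n, T = 1.0, 1000, 1.0
dt = T / n
t = [i * dt for i in range(n + 1)]
u = [0.0] * (n + 1)
u[0] = 1.0                       # u(0) = f(0) = 1
for i in range(1, n + 1):
    # trapezoid rule for integral_0^{t_i} e^{t_i - s} u(s) ds;
    # the end point carries weight dt/2 and K(t_i, t_i) = 1, so solve for u[i]
    acc = 0.5 * math.exp(t[i] - t[0]) * u[0]
    for j in range(1, i):
        acc += math.exp(t[i] - t[j]) * u[j]
    u[i] = (1.0 + lam * dt * acc) / (1.0 - lam * dt * 0.5)

exact = 0.5 * (math.exp(2 * T) + 1)   # from the resolvent formula
```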
For the VIE (equation (22.45)), the solution can also be expressed in the form
$$u(t) = \sum_{n=0}^\infty u_n(t) \qquad (22.57)$$
where
$$u_0(t) = f(t), \qquad u_{n+1}(t) = \lambda\int_0^t K(t, s)u_n(s)\,ds, \quad n = 0, 1, 2, \ldots \qquad (22.58)$$
In the Adomian decomposition method, the kernel remains the same at every step, and successive terms add higher powers of λ acting on f(t) to the solution.
Example. Consider
$$u(t) = t + \int_0^t (s - t)u(s)\,ds$$
⇒
$$u_0(t) = t$$
$$u_1(t) = \int_0^t (s - t)s\,ds = -\frac{t^3}{3!}$$
$$u_2(t) = -\int_0^t (s - t)\frac{s^3}{3!}\,ds = \frac{t^5}{5!}$$
⇒
$$u(t) = \frac{t}{1!} - \frac{t^3}{3!} + \frac{t^5}{5!} - \cdots = \sin t$$
Consider the Volterra integral equation of the first kind
$$f(t) = \lambda\int_0^t K(t, s)u(s)\,ds \qquad (22.59)$$
There are two ways to convert equation (22.59) into a VIE of the second kind:
(i) by differentiating equation (22.59);
(ii) by integrating equation (22.59) by parts.
(i) Differentiating equation (22.59) with respect to t gives
$$\lambda K(t, t)u(t) + \lambda\int_0^t \frac{\partial K}{\partial t}(t, s)u(s)\,ds = f'(t) \qquad (22.60)$$
Dividing by λK(t, t) (assuming K(t, t) ≠ 0),
$$u(t) + \int_0^t \frac{\partial K}{\partial t}(t, s)\frac{1}{K(t, t)}u(s)\,ds = \frac{f'(t)}{\lambda K(t, t)}.$$
Thus, defining
22.6 Solution procedure for Volterra integral equations of the first kind | 535
$$K^*(t, s) = \frac{-1}{\lambda K(t, t)}\frac{\partial K(t, s)}{\partial t}; \quad\text{and}\quad f^*(t) = \frac{f'(t)}{\lambda K(t, t)} \qquad (22.61)$$
we get
$$u(t) = f^*(t) + \lambda\int_0^t K^*(t, s)u(s)\,ds \qquad (22.62)$$
which is a VIE of the second kind and can be solved by methods discussed previously.
(ii) Alternatively, defining
$$\phi(t) = \int_0^t u(s)\,ds$$
and integrating equation (22.59) by parts gives
$$f(t) = \lambda\left[K(t, t)\phi(t) - \int_0^t \frac{\partial K(t, s)}{\partial s}\phi(s)\,ds\right]$$
⇒
$$\phi(t) = \frac{f(t)}{\lambda K(t, t)} + \int_0^t \frac{\partial K(t, s)/\partial s}{K(t, t)}\phi(s)\,ds$$
Thus, defining
$$\hat{f}(t) = \frac{f(t)}{\lambda K(t, t)}; \quad\text{and}\quad \hat{K}(t, s) = \frac{\partial K(t, s)/\partial s}{K(t, t)} \qquad (22.64)$$
we get
$$\phi(t) = \hat{f}(t) + \int_0^t \hat{K}(t, s)\phi(s)\,ds,$$
which is a VIE of the second kind. Once ϕ(t) is known, u(t) can be obtained from the
relation u(t) = ϕ′ (t). Note that this method does not require the function f (t) to be
differentiable.
Consider the VIE of the second kind with a convolution kernel K(t, s) = g(t − s):
$$u(t) = f(t) + \lambda\int_0^t g(t - t')u(t')\,dt' \qquad (22.68)$$
This equation can be solved by the Laplace transform method using the convolution
property of LT. Let
$$\mathcal{L}[u(t)] = \int_0^\infty e^{-st}u(t)\,dt = \hat{u}(s) = \text{L.T. of } u(t). \qquad (22.69)$$
⇒
$$\mathcal{L}\left[\int_0^t g(t - t')u(t')\,dt'\right] = \mathcal{L}[g(t)*u(t)] = \hat{g}(s)\hat{u}(s).$$
Applying the Laplace transform to equation (22.68),
$$\hat{u}(s) = \hat{f}(s) + \lambda\hat{g}(s)\hat{u}(s) \;\Rightarrow\; \hat{u}(s) = \frac{\hat{f}(s)}{1 - \lambda\hat{g}(s)} \qquad (22.70)$$
Let
$$\mathcal{L}^{-1}\left[\frac{1}{1 - \lambda\hat{g}(s)}\right] = G(t) \qquad (22.71)$$
⇒
$$u(t) = \int_0^t G(t - t')f(t')\,dt' \qquad (22.72)$$
If we can expand
22.7 Volterra integral equations with convolution kernel | 537
$$\frac{1}{1 - \lambda\hat{g}(s)} = 1 + \lambda\hat{g}(s) + \lambda^2\hat{g}(s)^2 + \cdots$$
then
$$\mathcal{L}^{-1}\left[\frac{1}{1 - \lambda\hat{g}(s)}\right] = G(t) = \delta(t) + \lambda g_1(t) + \lambda^2 g_2(t) + \cdots; \qquad (22.73)$$
$$g_i(t) = \mathcal{L}^{-1}[\hat{g}(s)^i], \quad i = 1, 2, \ldots \qquad (22.74)$$
$$u(t) = \int_0^t G(t - t')f(t')\,dt' = f(t) + \lambda\int_0^t \Gamma(t, t', \lambda)f(t')\,dt'$$
where
$$\Gamma(t, t', \lambda) = g_1(t - t') + \lambda g_2(t - t') + \cdots$$
Example (Abel's integral equation). Consider
$$f(t) = \int_0^t \frac{u(s)}{\sqrt{t - s}}\,ds = \int_0^t \frac{u(t')}{\sqrt{t - t'}}\,dt'.$$
Taking the Laplace transform,
$$\hat{f}(s) = \hat{u}(s)\sqrt{\frac{\pi}{s}}, \qquad \left(\because\; \mathcal{L}\left[\frac{1}{\sqrt{t}}\right] = \sqrt{\frac{\pi}{s}}\right)$$
⇒
$$\hat{u}(s) = \frac{\sqrt{s}\,\hat{f}(s)}{\sqrt{\pi}} = \frac{s}{\pi}\sqrt{\frac{\pi}{s}}\,\hat{f}(s)$$
$$\Rightarrow\; u(t) = \frac{1}{\pi}\frac{d}{dt}\mathcal{L}^{-1}\left[\sqrt{\frac{\pi}{s}}\,\hat{f}(s)\right] = \frac{1}{\pi}\frac{d}{dt}\left[\int_0^t \frac{f(t')}{\sqrt{t - t'}}\,dt'\right]$$
Diffusion–reaction problem
Consider a diffusion–reaction problem given by
$$\frac{d^2c}{dx^2} = \phi^2 R(c), \quad 0 < x < 1 \qquad (22.75)$$
$$c'(0) = 0, \quad\text{and}\quad c(1) = 1 \qquad (22.76)$$
where c(x) is concentration, R(c) is rate of reaction and ϕ is the Thiele modulus. We
can express equations (22.75)–(22.76) as an integral equation by integrating twice and
changing the order of integration, which leads to the integral equation
$$c(x) = 1 - \phi^2\int_0^1 K(x, s)R(c(s))\,ds \qquad (22.77)$$
where
$$K(x, s) = \begin{cases} 1 - x, & 0 < s < x \\ 1 - s, & x < s < 1\end{cases} \qquad (22.78)$$
For linear kinetics, R(c) = c, successive substitution gives the iteration scheme
$$c_j(x) = 1 - \phi^2\int_0^x (1 - x)c_{j-1}(s)\,ds - \phi^2\int_x^1 (1 - s)c_{j-1}(s)\,ds \qquad (22.81)$$
Defining the effectiveness factor
$$\eta = \int_0^1 c(s)\,ds \qquad (22.82)$$
we get
$$c_0(x) = 1 \;\Rightarrow\; \eta_0 = 1$$
$$c_1(x) = 1 - \frac{\phi^2}{2}(1 - x^2) \;\Rightarrow\; \eta_1 = 1 - \frac{\phi^2}{3}$$
$$c_2(x) = 1 - \frac{\phi^2}{2}(1 - x^2) + \frac{\phi^4}{24}(1 - x^2)(5 - x^2) \;\Rightarrow\; \eta_2 = 1 - \frac{\phi^2}{3} + \frac{2}{15}\phi^4$$
and so on.
The higher-order terms can be obtained by following the sequence, where it can be shown that the solutions for the concentration profile and effectiveness factor are the Taylor series expansions (in ϕ²) of the functions
$$c = c_\infty = \frac{\cosh\phi x}{\cosh\phi}; \quad\text{and}\quad \eta = \eta_\infty = \frac{\tanh\phi}{\phi}. \qquad (22.83)$$
In this case, the solution converges for all values of ϕ2 , though the convergence may
be slow for ϕ2 > 1.
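A quick numerical comparison of the truncated series with the closed form illustrates both the accuracy at small ϕ and the breakdown for ϕ² > 1; a Python sketch (the sample values of ϕ are our choices):

```python
import math

def eta_series(phi):
    # first three terms of the expansion in phi^2
    return 1 - phi**2 / 3 + 2 * phi**4 / 15

def eta_exact(phi):
    return math.tanh(phi) / phi

err_small = abs(eta_series(0.5) - eta_exact(0.5))   # good agreement
err_large = abs(eta_series(2.0) - eta_exact(2.0))   # truncated series fails here
```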
In the Adomian decomposition method, we write c(x) = Σ_{j=0}^∞ cⱼ(x) with c₀(x) = 1
⇒
$$c_1(x) = -\phi^2\int_0^1 K(x, s)c_0(s)\,ds \qquad (22.86)$$
and, in general, c_{j+1}(x) = −ϕ²∫₀¹ K(x, s)cⱼ(s) ds.
$$c_0(x) = 1 \;\Rightarrow\; \eta_0 = 1 \qquad (22.89)$$
$$c_1(x) = -\frac{\phi^2}{2}(1 - x^2) \;\Rightarrow\; \eta_1 = -\frac{\phi^2}{3} \qquad (22.90)$$
$$c_2(x) = \frac{\phi^4}{24}(5 - 6x^2 + x^4) \;\Rightarrow\; \eta_2 = \frac{2\phi^4}{15} \qquad (22.91)$$
$$c_3(x) = -\frac{\phi^6}{720}(61 - 75x^2 + 15x^4 - x^6) \;\Rightarrow\; \eta_3 = -\frac{17\phi^6}{315}, \quad\text{and so on} \qquad (22.92)$$
Solving the above recurrence relation sequentially, we can obtain the solution to any order.
Theorem.
1. The eigenvalues of a symmetric kernel are real.
2. The eigenfunctions of a symmetric kernel corresponding to distinct eigenvalues are
orthogonal.
22.9 Fredholm integral equations with symmetric kernels | 541
3. The eigenfunctions may be normalized so that
$$\int_a^b \phi_i(x)\phi_j(x)\,dx = \delta_{ij} = \begin{cases}1, & i = j\\ 0, & i \neq j.\end{cases}$$
Further, the eigenvalues satisfy
$$\sum_{n=1}^\infty \frac{1}{\lambda_n^2} \le \int_a^b\int_a^b K(x, s)^2\,dx\,ds.$$
Equivalently, 1/λₙ → 0 as n → ∞. When the equality sign holds, the kernel is said to be closed.
7. The set of eigenvalues of the second iterated kernel
$$\frac{1}{|\lambda_1|} = \max\{\langle K\phi, \phi\rangle : \|\phi\| = 1\}$$
and the maximum on the RHS is attained when ϕ(x) is an eigenfunction of the sym-
metric ℒ2 -kernel corresponding to the smallest eigenvalue.
Proofs of the various statements in the above theorem may be found in the book
by Courant and Hilbert [15].
Mercer’s theorem. If the kernel K(x, s) is symmetric and square integrable on the square
{(x, s) : a ≤ x ≤ b, a ≤ s ≤ b}, continuous and has only positive eigenvalues or at most a
finite number of negative eigenvalues, then the series
$$\sum_{n=1}^\infty \frac{\phi_n(x)\phi_n(s)}{\lambda_n}$$
converges absolutely and uniformly, and
$$K(x, s) = \sum_{n=1}^\infty \frac{\phi_n(x)\phi_n(s)}{\lambda_n}.$$
Defining the integral operator
$$Tu(x) = \int_a^b K(x, s)u(s)\,ds,$$
it is easily seen that the adjoint operator with respect to the usual inner product ⟨u, v⟩ = ∫_a^b u(x)v(x) dx is given by
$$T^*v(x) = \int_a^b K(s, x)v(s)\,ds. \qquad (22.97)$$
Fredholm alternative. Either the integral equation
$$u(x) = f(x) + \lambda\int_a^b K(x, s)u(s)\,ds \qquad (22.100)$$
with fixed λ has one and only one solution u(x) for arbitrary ℒ₂-functions f(x) and K(x, s), in particular, the solution u(x) ≡ 0 for f(x) = 0; or the homogeneous equation
$$u_h(x) = \lambda\int_a^b K(x, s)u_h(s)\,ds$$
has r linearly independent solutions u_{hi}(x), i = 1, 2, ..., r. In the second case, the adjoint homogeneous equation
$$v_h(x) = \lambda\int_a^b K(s, x)v_h(s)\,ds$$
also has r linearly independent solutions v_{hi}(x), i = 1, 2, ..., r. The inhomogeneous equation has a solution if and only if the function f(x) satisfies
$$\langle f, v_{hi}\rangle = 0, \quad i = 1, 2, \ldots, r.$$
In this case, the solution to equation (22.100) is determined only up to an additive linear
combination ∑ri=1 ci uhi (x); it may be determined uniquely by the additional requirements
⟨u, uhi ⟩ = 0, i = 1, 2, . . . , r.
[Remark: Compare this theorem with the version for algebraic equations discussed in
Section 4.5.2.]
For a symmetric kernel, the resolvent kernel is
$$\Gamma(x, s, \lambda) = \sum_{n=1}^\infty \frac{\phi_n(x)\phi_n(s)}{\lambda_n - \lambda} \qquad (22.103)$$
and
$$u(x) = f(x) + \lambda\sum_{n=1}^\infty \frac{a_n\phi_n(x)}{\lambda_n - \lambda}; \qquad a_n = \int_a^b \phi_n(s)f(s)\,ds \qquad (22.104)$$
where λn and ϕn are eigenvalues and eigenfunctions of the kernel K(x, s) as defined in
equation (22.95).
A solution exists and is unique if λ ≠ λn (n = 1, 2, . . .). If λ = λn for some n, a solution
may exist but it is not unique.
Let
$$a_n = \int_a^b \phi_n(s)f(s)\,ds, \quad\text{so that}\quad f(x) = \sum_{n=1}^\infty a_n\phi_n(x). \qquad (22.105)$$
Write
$$u(x) = \sum_{n=1}^\infty b_n\phi_n(x) \qquad (22.106)$$
This can be done since u(x) ∈ ℒ2 and eigenfunctions form a basis for ℒ2 . To deter-
mine bn , substitute equation (22.106) into equation (22.101) to get
$$\sum_{n=1}^\infty b_n\phi_n(x) = \sum_{n=1}^\infty a_n\phi_n(x) + \lambda\int_a^b K(x, s)\sum_{n=1}^\infty b_n\phi_n(s)\,ds$$
Multiply both sides by ϕj (x) and integrate and use the orthogonal property of the
eigenfunctions, which leads to
$$b_j = \frac{a_j\lambda_j}{\lambda_j - \lambda} \qquad (22.107)$$
⇒
$$u(x) = \sum_{j=1}^\infty \frac{a_j\lambda_j}{\lambda_j - \lambda}\phi_j(x) \qquad (22.108)$$
$$u(x) = \sum_{j=1}^\infty \frac{\lambda_j\phi_j(x)}{\lambda_j - \lambda}\int_a^b \phi_j(s)f(s)\,ds = \int_a^b \sum_{j=1}^\infty \frac{\lambda_j\phi_j(x)\phi_j(s)}{\lambda_j - \lambda}f(s)\,ds$$
$$= \int_a^b \sum_{j=1}^\infty\left(1 + \frac{\lambda}{\lambda_j - \lambda}\right)\phi_j(x)\phi_j(s)f(s)\,ds$$
$$= \sum_{j=1}^\infty \phi_j(x)\int_a^b \phi_j(s)f(s)\,ds + \lambda\int_a^b \sum_{j=1}^\infty \frac{\phi_j(x)\phi_j(s)}{\lambda_j - \lambda}f(s)\,ds \qquad (22.109)$$
⇒
$$u(x) = f(x) + \lambda\int_a^b \Gamma(x, s, \lambda)f(s)\,ds \qquad (22.110)$$
where
$$\Gamma(x, s, \lambda) = \sum_{j=1}^\infty \frac{\phi_j(x)\phi_j(s)}{\lambda_j - \lambda} \qquad (22.111)$$
Since
$$\int_0^1 x\cdot\sqrt{2}\sin(2\pi x)\,dx = \frac{-1}{\sqrt{2}\,\pi} \neq 0,$$
the solvability condition is violated. On the other hand, the equation with kernel K(x, s) the same as in Example 22.6 is solvable, but the solution is not unique. It may be shown that
$$u = \frac{9}{5}\sin(3\pi x) + c_2\sin(2\pi x)$$
with c₂ arbitrary.
Problems
1. Apply the IE method and the Adomian decomposition method to solve the vector form of the one-dimensional diffusion–reaction model with linear kinetics.
2. Consider the Fredholm integral equation of the first kind
$$f(x) = \int_a^b K(x, s)u(s)\,ds \qquad (1)$$
with separable kernel
$$K(x, s) = \sum_{i=1}^N a_i(x)b_i(s)$$
where {ai (x), i = 1, 2 . . . , N} and {bi (s), i = 1, 2, . . . , N} are linearly independent sets.
(a) Reason that the equation does not have solution unless the function f (x) can be
expressed as a linear combination of ai (x), (b) Reason that when equation (1) has
a solution, it is not unique, i. e., there could be infinitely many solutions, (c) Con-
sider equation (1) with a continuous kernel but not separable and continuous
f (x). Is the solution also continuous? Comment on the possible types of solutions,
(d) How do the results in (b) and (c) change if the kernel is also symmetric?
3. (a) Determine the eigenvalues, eigenfunctions and the resolvent kernel for the
Fredholm equation
22.11 Solution of FIE of the second kind with symmetric kernels | 547
4. Consider the boundary value problem
$$\frac{1}{\text{Pe}}\frac{d^2c}{dx^2} - \frac{dc}{dx} - \text{Da}\,R(c) = 0; \quad 0 < x < 1$$
$$\frac{1}{\text{Pe}}\frac{dc}{dx} = c - 1 \;\text{ at } x = 0$$
$$\frac{dc}{dx} = 0 \;\text{ at } x = 1$$
where R(c) is the dimensionless reaction rate and Pe and Da are the Peclet and
Damkohler numbers, respectively. (a) Convert the boundary value problem into a
Fredholm integral equation and (b) Solve the equation in (a) for the case of linear
kinetics using the Neumann series method and determine the exit concentration
as a function of Da up to quadratic terms.
5. Consider the Volterra integral equation for human population N(t) at time t:
where f (t) is the survival function and k is a constant describing the rate of popu-
lation variation per capita [or the birth rate is k times N(t)] (a) Solve the equation
assuming a survival function of the form
$$f(t) = \exp\left[-\frac{t}{T}\right]$$
where T is the average life span of a person (b) Use the result in (a) to show that
the population increases exponentially if kT > 1 and decreases exponentially if
kT < 1.
Part V: Fourier transforms and solution of boundary
and initial-boundary value problems
23 Finite Fourier transforms
The finite Fourier transform (FFT) and its various extensions are among the most important tools available to scientists and engineers for solving many practical problems. The concepts of the FFT appear in the analysis of time series, spatial profiles, length and time scales, data analysis and compression, the development of numerical algorithms, and so forth. In this chapter, we discuss mainly one application of the FFT, namely, the solution of linear boundary and initial–boundary value problems (partial differential equations).
The eigenfunctions {wj(x)} form a basis for ℒ²[a, b], the Hilbert space of Lebesgue square integrable (real valued) functions defined on the interval [a, b]. If f(x) ∈ ℒ²[a, b], we have
f(x) = ∑_{i=1}^∞ ci wi(x) (23.3)
where
ci = ⟨f, wi⟩ = ∫_a^b f(x) wi(x) dx (23.4)
and
∑_{i=1}^∞ ci² = ∫_a^b f(x)² dx
and the integrals are all defined in the Lebesgue sense. The expansion given by equation (23.3) converges in ℒ²[a, b], i. e., the LHS and RHS of equation (23.3) are equal almost everywhere (except perhaps on a set of measure zero in [a, b]). Equations (23.3) and (23.4) define the finite Fourier transform, i. e., given any f(x) ∈ ℒ²[a, b] we define the Finite Fourier Transform (FFT) of f(x) to be the infinite sequence of constants {ci}, which give the coordinates of f(x) in the Hilbert space ℒ²[a, b]. We write
ℱ{f(x)} = {ci}
The inverse transform uses the coordinates {ci } to reconstruct the function (vector)
f (x). Thus,
ℱ⁻¹{ci} = ∑_{i=1}^∞ ci wi(x) = f(x) (23.7)
Thus,
ℱℱ⁻¹ = ℱ⁻¹ℱ = identity. (23.8)
The finite Fourier transform may be used to simplify and solve many linear differential equations in which a spatial self-adjoint operator 𝕃 (whose eigenvalues and eigenfunctions are λi and wi(x), respectively) appears. We outline below the general procedure and illustrate it with several examples.
𝕃u = f (23.9)
Remarks. (a) If 𝕃 is an operator in two or three spatial dimensions, the sum may be a double or triple sum. (b) If the BCs are inhomogeneous, the solution will have additional terms. (c) In most of our applications, 𝕃 is an elliptic differential operator such as the Laplacian, i. e., 𝕃 = −∇².
𝜕u/𝜕t = −𝕃u, t > 0; u = f @ t = 0 (23.12)
𝜕/𝜕t ⟨u, wn⟩ = −λn⟨u, wn⟩, t > 0; (23.13)
⟨u, wn⟩ = ⟨f, wn⟩ @ t = 0 (23.14)
u = ∑_{n=1}^∞ ⟨u, wn⟩wn = ∑_{n=1}^∞ exp(−λn t)⟨f, wn⟩wn (23.15)
is the formal solution. In addition, when the operator 𝕃 is a 2/3D spatial operator, the
sum may be a double or triple sum.
𝜕²u/𝜕t² = −𝕃u, t > 0; (23.16)
u = f @ t = 0 (initial position) (23.17)
𝜕u/𝜕t = g @ t = 0 (initial velocity) (23.18)
𝜕²/𝜕t² ⟨u, wn⟩ = −λn⟨u, wn⟩, t > 0;
⟨u, wn⟩ = ⟨f, wn⟩ @ t = 0
𝜕/𝜕t ⟨u, wn⟩ = ⟨g, wn⟩ @ t = 0
⇒
⟨u, wn⟩ = ⟨f, wn⟩ cos[√λn t] + (1/√λn)⟨g, wn⟩ sin[√λn t] (23.19)
⇒
u = ∑_{n=1}^∞ ⟨f, wn⟩ cos[√λn t] wn + ∑_{n=1}^∞ (1/√λn)⟨g, wn⟩ sin[√λn t] wn (23.20)
d²u/dx² = −f(x), 0 < x < 1 (23.21)
u(0) = 0, u(1) = 0 (23.22)
𝕃w = −d²w/dx² = λw, 0 < x < 1 (23.23)
w(0) = w(1) = 0 (23.24)
Taking the FFT of equations (23.21)–(23.22), i. e., multiplying by wj(x) and integrating from 0 to 1, gives
⟨d²u/dx², wj⟩ = ⟨−f, wj⟩
⇒
(wj du/dx − u dwj/dx)|_{x=0}^{x=1} + ⟨u, d²wj/dx²⟩ = ⟨−f, wj⟩
The first term (the bilinear concomitant) vanishes as both u(x) and wj(x) satisfy the homogeneous boundary conditions. Thus, we have
⟨u, d²wj/dx²⟩ = ⟨−f, wj⟩
⇒
−λj⟨u, wj⟩ = −⟨f, wj⟩ (23.25)
⇒
⟨u, wj⟩ = ⟨f, wj⟩/λj. (23.26)
For f(x) = 1,
∫_0^1 f(ξ)√2 sin(jπξ) dξ = { 2√2/(jπ), j odd; 0, j even }
⇒
u(x) = (4/π³) ∑_{k=1}^∞ sin[(2k − 1)πx]/(2k − 1)³. (23.28)
For f(x) = 1, equations (23.21)–(23.22) can also be solved by integrating twice, which leads to another form of the solution
u(x) = (1/2) x(1 − x). (23.29)
In other words, equation (23.28) is the Fourier series expansion of equation (23.29). The exact solution (equation (23.29)) and the Fourier series expansion (equation (23.28)) with only two terms are plotted in Figure 23.1.
Figure 23.1: Solution of Poisson’s equation with f (x) = 1: exact solution compared with the two
terms of the Fourier series solution.
It can be seen from this figure that the Fourier series solution with only two terms represents the solution accurately in this example. The maximum value predicted by the Fourier series solution with only two terms is 0.124, as compared to 0.125 predicted by the exact solution.
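The two quoted values are easy to reproduce; a minimal Python sketch (an illustration only; the text itself works in Mathematica) comparing the truncated series (23.28) with the exact solution (23.29):

```python
import math

def u_exact(x):
    # Exact solution (23.29) of -u'' = 1, u(0) = u(1) = 0
    return 0.5 * x * (1.0 - x)

def u_series(x, terms):
    # Fourier series solution (23.28)
    return (4.0 / math.pi**3) * sum(
        math.sin((2*k - 1) * math.pi * x) / (2*k - 1)**3
        for k in range(1, terms + 1))

# Maximum at x = 1/2: the two-term series gives ~0.124 vs the exact 0.125
print(round(u_series(0.5, 2), 3), u_exact(0.5))
```

The cubic decay of the coefficients explains why two terms already capture the maximum to three decimal places.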
Consider again the solution given by equation (23.27):
u = ∑_{j=1}^∞ [√2 sin(jπx)/(j²π²)] ∫_0^1 f(ξ)√2 sin(jπξ) dξ = ∫_0^1 G(x, ξ)f(ξ) dξ,
where
G(x, ξ) = ∑_{j=1}^∞ 2 sin(jπx) sin(jπξ)/(j²π²) = ∑_{j=1}^∞ wj(x)wj(ξ)/λj (23.30)
For f(x) = δ(x − 1/2), the exact solution is
u(x) = { x/2, 0 ≤ x ≤ 1/2; (1 − x)/2, 1/2 ≤ x ≤ 1. } (23.32)
Alternatively, the solution (equation (23.27)) obtained from the eigenfunction expansion method simplifies as follows:
u = ∑_{j=1}^∞ [√2 sin(jπx)][√2 sin(jπ/2)]/(j²π²)
= (2/π²) ∑_{k=1}^∞ (−1)^{k−1} sin[(2k − 1)πx]/(2k − 1)². (23.33)
Again, both expressions are equivalent. The exact solution and the two-term FFT solution are plotted in Figure 23.2.
Figure 23.2: Solution of Poisson's equation with f(x) = δ(x − 1/2): exact solution compared with the two-term Fourier series solution.
It can be seen from Figure 23.2 again that only two terms predict the solution with good accuracy. The largest error occurs in this case at the maximum (x = 1/2), where the exact value is u_exact = 1/4. This error can be reduced further by including more terms in the FFT solution, as shown in Figure 23.3.
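The convergence of the partial sums of equation (23.33) toward the exact peak value u(1/2) = 1/4 can be checked directly; a minimal Python sketch:

```python
import math

def u_series(x, terms):
    # Eigenfunction expansion (23.33) for the point source at x = 1/2
    return (2.0 / math.pi**2) * sum(
        (-1)**(k - 1) * math.sin((2*k - 1) * math.pi * x) / (2*k - 1)**2
        for k in range(1, terms + 1))

# Partial sums at the peak x = 1/2 approach the exact value 1/4
for terms in (2, 10, 100):
    print(terms, round(u_series(0.5, terms), 4))
```

Because the coefficients now decay only like 1/j², more terms are needed near the kink at x = 1/2 than in the smooth f(x) = 1 case.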
Remarks.
(1) If the interval is (0, a), the eigenvalues and eigenfunctions are modified to
λn = n²π²/a² (23.34)
wn(x) = √(2/a) sin(nπx/a) (23.35)
and the solution becomes
u(x) = (2a/π²) ∑_{n=1}^∞ [sin(nπx/a)/n²] ∫_0^a f(ξ) sin(nπξ/a) dξ (23.36)
Figure 23.3: Demonstration of the convergence of the Fourier series expansion of the solution of Poisson's equation with f(x) = δ(x − 1/2).
d²u/dx² = −f(x), 0 < x < 1 (23.37)
u(0) = α1, u(1) = α2 (23.38)
where
u = u1 + u2 (23.39)
with
d²u1/dx² = −f(x) (23.40)
u1(0) = 0, u1(1) = 0 (23.41)
d²u2/dx² = 0 (23.42)
u2(0) = α1, u2(1) = α2 (23.43)
⇒
u2 = (α2 − α1)x + α1 (23.44)
d²u/dx² − Au = −f(x); u(0) = 0 = u(1) (23.45)
where u and f(x) are vectors with n components and A is a constant n × n matrix. For n = 2, let aij (i, j = 1, 2) be real constants and consider the BVP in two variables defined by
d²u1/dx² − a11u1 − a12u2 = −f1(x)
d²u2/dx² − a21u1 − a22u2 = −f2(x)
0 < x < 1 (23.46)
Let cm = ⟨u, wm⟩ and fm = ⟨f, wm⟩, where wm(x) = √2 sin(mπx).
⇒
(a11 + m²π²)c1m + a12 c2m = f1m
a21 c1m + (a22 + m²π²)c2m = f2m
i. e.,
Am cm = fm, Am = A + m²π²I. (23.49)
cm = ∑_{k=1}^2 [y*km fm/(y*km xkm)] (1/λkm) xkm (23.50)
where λkm, xkm and y*km are the eigenvalues, eigenvectors and eigenrows of Am. Taking the inverse Fourier transform of equation (23.50), we get
u(x) = ∑_{m=1}^∞ cm wm(x)
= ∑_{m=1}^∞ ∑_{k=1}^2 √2 sin(mπx) [y*km fm/(y*km xkm)] (1/λkm) xkm (23.51)
where
fm = ∫_0^1 [f1(ξ); f2(ξ)] √2 sin(mπξ) dξ. (23.52)
Various special cases of this solution may be examined as in the previous example.
𝜕u/𝜕t = 𝜕²u/𝜕x²; 0 < x < 1, t > 0 (23.53)
with homogeneous boundary conditions
𝕃w = −d²w/dx², w(0) = 0, w(1) = 0, (23.56)
which has eigenvalues
λn = n²π² (23.57)
d/dt ⟨u, wn⟩ = ⟨𝜕²u/𝜕x², wn⟩
= ⟨u, 𝜕²wn/𝜕x²⟩ + (wn 𝜕u/𝜕x − u dwn/dx)|_{x=0}^{x=1}
= ⟨u, −λn wn⟩
= −λn⟨u, wn⟩
(Remark: the concomitant vanishes again as both u(x, t) and wn(x) satisfy the homogeneous boundary conditions.) Integrating this equation using the initial condition, we get
Thus,
u(x, t) = ∑_{n=1}^∞ ⟨f, wn⟩e^{−λn t} wn(x) (23.60)
u(x, t) = ∑_{n=1}^∞ (√2 sin nπx) e^{−n²π²t} ∫_0^1 f(ξ)√2 sin(nπξ) dξ
⇒
u(x, t) = 2 ∑_{n=1}^∞ e^{−n²π²t} sin(nπx) ∫_0^1 f(ξ) sin(nπξ) dξ (23.61)
For f(x) = 1,
∫_0^1 f(ξ) sin(nπξ) dξ = [−cos(nπξ)/(nπ)]_0^1
= (1 − cos nπ)/(nπ)
= { 0 if n = 2k; 2/((2k − 1)π) if n = 2k − 1, k = 1, 2, . . . }
⇒ u(x, t) = (4/π) ∑_{k=1}^∞ e^{−(2k−1)²π²t} sin[(2k − 1)πx]/(2k − 1) (23.62)
The maximum value of u (umax) occurs at x = 1/2 and is given by
umax = u(1/2, t) = (4/π) ∑_{k=1}^∞ (−1)^{k−1} e^{−(2k−1)²π²t}/(2k − 1)
= (4/π)[e^{−π²t} − (1/3)e^{−9π²t} + (1/5)e^{−25π²t} − ⋅⋅⋅] ≈ (4/π)e^{−π²t} for t → ∞ (23.63)
⟨u⟩ = ∫_0^1 u dx = (4/π²) ∑_{k=1}^∞ [e^{−(2k−1)²π²t}/(2k − 1)²] [−cos((2k − 1)πx)]_0^1
= (8/π²) ∑_{k=1}^∞ e^{−(2k−1)²π²t}/(2k − 1)² = (8/π²)[e^{−π²t} + (1/9)e^{−9π²t} + (1/25)e^{−25π²t} + ⋅⋅⋅] (23.64)
The profiles for different times are shown in Figure 23.4. From equation (23.64), it may be observed that for t > 0.016, one term is sufficient to estimate the average value of u(x, t) to within 5 %.
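The rate at which the leading term of (23.64) becomes sufficient can be examined numerically; a minimal Python sketch:

```python
import math

def avg_u(t, terms=200):
    # Average value <u> from equation (23.64)
    return (8.0 / math.pi**2) * sum(
        math.exp(-(2*k - 1)**2 * math.pi**2 * t) / (2*k - 1)**2
        for k in range(1, terms + 1))

def one_term(t):
    # Leading term of (23.64)
    return (8.0 / math.pi**2) * math.exp(-math.pi**2 * t)

for t in (0.005, 0.02, 0.05):
    rel_err = abs(avg_u(t) - one_term(t)) / avg_u(t)
    print(f"t = {t}: one-term relative error = {100 * rel_err:.2f} %")
```

The higher modes decay like e^{−9π²t} or faster, so the one-term error falls off very rapidly with t.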
Figure 23.4: Dimensionless temperature distribution and its evolution with time with uniform initial
source.
For f(x) = sin(mπx),
2 ∫_0^1 f(ξ) sin(nπξ) dξ = δnm = { 1, n = m; 0, n ≠ m }
Equation (23.61) ⇒
u(x, t) = ∑_{n=1}^∞ e^{−n²π²t} sin(nπx) δnm = e^{−m²π²t} sin[mπx] (23.65)
Figure 23.5: Solution of heat equation with initial conditions given by f (x) = sin[mπx], m = 1, 2.
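That the single-mode solution (23.65) indeed satisfies the heat equation can be spot-checked with finite differences; a minimal Python sketch:

```python
import math

m = 2

def u(x, t):
    # Single-mode solution (23.65): u = exp(-m^2 pi^2 t) sin(m pi x)
    return math.exp(-(m * math.pi)**2 * t) * math.sin(m * math.pi * x)

# Check u_t = u_xx by centred finite differences at an interior point
x, t, h = 0.3, 0.05, 1e-5
u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
print(abs(u_t - u_xx))
```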
For f(x) = δ(x − s),
u(x, t) = 2 ∑_{n=1}^∞ e^{−n²π²t} sin(nπx) sin(nπs) (23.66)
Take s = 1/2 (the mid-point of the interval)
⇒
sin(nπ/2) = { 0, n even; ±1, n odd }
Take n = 2k + 1, k = 0, 1, 2, . . .
⇒
sin[(2k + 1)π/2] = sin(kπ + π/2) = (−1)^k
∴
u(x, t) = 2 ∑_{k=0}^∞ (−1)^k e^{−(2k+1)²π²t} sin[(2k + 1)πx]. (23.67)
⟨u⟩ = (4/π) ∑_{k=0}^∞ [(−1)^k/(2k + 1)] e^{−(2k+1)²π²t}. (23.68)
𝜕u/𝜕x = 2π ∑_{k=0}^∞ (−1)^k (2k + 1) e^{−(2k+1)²π²t} cos[(2k + 1)πx]. (23.69)
𝜕u/𝜕x|_{x=0} = 2π ∑_{k=0}^∞ (−1)^k (2k + 1) e^{−(2k+1)²π²t}
= 2π[e^{−π²t} − 3e^{−9π²t} + 5e^{−25π²t} − ⋅⋅⋅] (23.70)
u(1/2, t) = 2 ∑_{k=0}^∞ e^{−(2k+1)²π²t}
= 2[e^{−π²t} + e^{−9π²t} + e^{−25π²t} + ⋅⋅⋅] (23.71)
This series converges for all t > 0. However, the convergence may be very slow for t
values close to zero. Some schematic profiles are shown in Figure 23.6.
Figure 23.6: Dimensionless temperature distribution and its evolution with time with point initial
source.
23.3 FFT for parabolic, hyperbolic and elliptic PDEs (two independent variables)
𝜕u/𝜕t − 𝜕²u/𝜕x² = f(x, t), 0 < x < 1, t > 0 (23.72)
IC:
u(x, 0) = 0 (23.73)
BCs:
u(0, t) = 0, u(1, t) = 0 (23.74)
Let λn = n²π² and wn(x) = √2 sin(nπx) be the eigenvalues and normalized eigenfunctions of the operator −d²w/dx² with w(0) = w(1) = 0. Let
un = ⟨u, wn⟩, fn = ⟨f, wn⟩
⇒
dun/dt + n²π² un = fn(t) (23.75)
un = 0 @ t = 0 (23.76)
⇒
un e^{n²π²t} = ∫_0^t e^{n²π²t′} fn(t′) dt′ + cn
With the initial condition, cn = 0, so
un = ∫_0^t e^{n²π²(t′−t)} fn(t′) dt′
⇒
u = ∑_{n=1}^∞ √2 sin(nπx) ∫_0^t e^{n²π²(t′−t)} [∫_0^1 √2 f(s, t′) sin(nπs) ds] dt′
⇒
u(x, t) = 2 ∑_{n=1}^∞ sin(nπx) ∫_0^t ∫_0^1 e^{n²π²(t′−t)} f(s, t′) sin(nπs) ds dt′ (23.77)
= ∫_0^t ∫_0^1 G(x, s, t, t′) f(s, t′) ds dt′ (23.78)
where
G(x, s, t, t′) = 2 ∑_{n=1}^∞ e^{n²π²(t′−t)} sin(nπx) sin(nπs) (23.79)
If f(s, t′) = δ(s − ξ)δ(t′ − τ), then u(x, t) = G(x, ξ, t, τ). Thus, G(x, ξ, t, τ) is the temperature (or concentration) at position x and time t due to a unit source at position ξ at time τ (t > τ). We now consider some special cases of the solution given above.
(i) f (x, t) = g1 (x)δ(t)
For this case, the solution is given by
u(x, t) = ∑_{n=1}^∞ e^{−n²π²t} (2 sin nπx) ∫_0^1 g1(s) sin(nπs) ds (23.80)
This solution is identical to the solution obtained when we take the initial condition to be u(x, 0) = g1(x).
(ii) f(x, t) = g2(t)δ(x − x0)
This corresponds to a point source at x = x0 whose magnitude g2(t) varies with time. For this case, the solution simplifies to
u(x, t) = 2 ∑_{n=1}^∞ (sin nπx)(sin nπx0) ∫_0^t e^{n²π²(t′−t)} g2(t′) dt′. (23.81)
Consider
𝜕u/𝜕t = 𝜕²u/𝜕x², 0 < x < 1, t > 0 (23.82)
IC:
u(x, 0) = 0 (23.83)
BCs:
u(0, t) = f(t), u(1, t) = 0 (23.84)
With wn = √2 sin(nπx), wn′ = √2 nπ cos(nπx) ⇒ wn′(0) = √2 nπ (23.85)
∴
dun/dt = −n²π² un + √2 nπ f(t) (23.86)
un = 0 @ t = 0 (23.87)
⇒
un = ∫_0^t e^{−n²π²(t−t′)} √2 nπ f(t′) dt′ (23.88)
⇒
u(x, t) = ∑_{n=1}^∞ √2 sin(nπx) ∫_0^t e^{−n²π²(t−t′)} √2 nπ f(t′) dt′
= ∫_0^t ∑_{n=1}^∞ (2nπ sin nπx) e^{−n²π²(t−t′)} f(t′) dt′ (23.89)
Special case:
f(t) = H(t) = { 1, t > 0; 0, t < 0 }
∫_0^t e^{−n²π²(t−t′)} dt′ = [e^{−n²π²t} e^{n²π²t′}/(n²π²)]_0^t = (1/(n²π²))[1 − e^{−n²π²t}] (23.90)
⇒
u(x, t) = ∑_{n=1}^∞ [2 sin(nπx)/(nπ)][1 − e^{−n²π²t}]
⇒
u(x, ∞) = ∑_{n=1}^∞ 2 sin(nπx)/(nπ) = 1 − x
⇒
u(x, t) = 1 − x − ∑_{n=1}^∞ [2 sin(nπx)/(nπ)] e^{−n²π²t} (23.91)
= us(x) − (2/π) ∑_{n=1}^∞ e^{−n²π²t} sin(nπx)/n (23.92)
where us(x) is the steady-state profile. The profiles at various times are shown in Figure 23.7.
Figure 23.7: Dimensionless temperature distribution and steady-state profile for sudden rise of
temperature to unity at the left boundary.
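The steady-state limit u(x, ∞) = 1 − x of the series in (23.91) can be verified numerically (convergence is slow, since the coefficients decay only like 1/n); a minimal Python sketch:

```python
import math

def series(x, terms):
    # Partial sum of sum 2 sin(n pi x)/(n pi); its limit is 1 - x on (0, 1)
    return sum(2.0 * math.sin(n * math.pi * x) / (n * math.pi)
               for n in range(1, terms + 1))

x = 0.3
print(series(x, 100000), 1 - x)
```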
Consider
𝜕u/𝜕t = 𝜕²u/𝜕x², 0 < x < 1 (23.93)
BCs:
IC:
u = u1 + u2 + u3 (23.96)
where
𝜕²u/𝜕t² = c² 𝜕²u/𝜕x², 0 < x < 1, t > 0 (23.100)
BCs:
u(0, t) = 0, u(1, t) = 0 (23.101)
ICs:
u(x, 0) = f(x) (initial position) (23.102)
𝜕u/𝜕t (x, 0) = g(x) (initial velocity) (23.103)
The operator
d²w/dx², 0 < x < 1, w(0) = 0, w(1) = 0 (23.104)
has eigenvalues
λn = −n²π², n = 1, 2, . . . (23.105)
and eigenfunctions wn = √2 sin(nπx). Taking the FFT gives
d²/dt² ⟨u, wn⟩ = −c²n²π² ⟨u, wn⟩ ⇒ ⟨u, wn⟩ = c1n cos(nπct) + c2n sin(nπct) (23.107)
IC1:
c1n = ⟨f, wn⟩ (23.108)
IC2:
c2n = ⟨g, wn⟩/(nπc) (23.109)
∴
⟨u, wn⟩ = ⟨f, wn⟩ cos(nπct) + [⟨g, wn⟩/(nπc)] sin(nπct) (23.110)
⇒
u(x, t) = ∑_{n=1}^∞ √2 sin(nπx)[cos(nπct) ∫_0^1 f(ξ)√2 sin(nπξ) dξ + (sin(nπct)/(nπc)) ∫_0^1 g(ξ)√2 sin(nπξ) dξ]
⇒
u(x, t) = 2 ∑_{n=1}^∞ sin(nπx)[cos(nπct) ∫_0^1 f(ξ) sin(nπξ) dξ + (sin(nπct)/(nπc)) ∫_0^1 g(ξ) sin(nπξ) dξ] (23.111)
For f(x) = α sin(jπx) and g(x) = 0,
u(x, t) = 2 sin(jπx) cos(jπct) (α/2)
= (cos jπct)(α sin jπx) (23.114)
The period of oscillation is
T = 2π/(jπc) = 2/(jc) (23.115)
and
cyclic frequency = jc/2 (23.116)
The solution profile for the specific case of c = 1 and j = 10 is shown in Figure 23.8.
Figure 23.8: Solution of the wave equation for c = 1 and initial displacement f(x) = sin[10πx].
[Remark: The cyclic frequency of the mode j = 1 is called the fundamental frequency, while those with j ≥ 2 are referred to as overtones or harmonics.]
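The period (23.115) can be confirmed directly from the mode solution (23.114); a minimal Python sketch with c = 1, j = 10:

```python
import math

c, j, alpha = 1.0, 10, 1.0

def u(x, t):
    # Standing-wave mode (23.114): u = alpha cos(j pi c t) sin(j pi x)
    return alpha * math.cos(j * math.pi * c * t) * math.sin(j * math.pi * x)

T = 2.0 / (j * c)   # period from equation (23.115)
x0, t0 = 0.23, 0.41
print(abs(u(x0, t0 + T) - u(x0, t0)) < 1e-9)
```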
𝜕²u/𝜕x² + 𝜕²u/𝜕y² = −f(x, y); 0 < x < a, 0 < y < b (23.117)
u = 0 on 𝜕Ω (23.118)
This is the Poisson equation in a rectangle (Ω = {(x, y) : 0 < x < a, 0 < y < b}) and represents the temperature (at steady-state) due to a source f(x, y). Equations (23.117)–(23.118) contain two operators:
𝕃1 = −d²/dx², w(0) = 0, w(a) = 0, with eigenvalues λn = n²π²/a² and eigenfunctions wn(x) = √(2/a) sin(nπx/a)
𝕃2 = −d²/dy², w(0) = 0, w(b) = 0, with eigenvalues λm = m²π²/b² and eigenfunctions wm(y) = √(2/b) sin(mπy/b)
Consider the eigenvalue problem
𝜕²w/𝜕x² + 𝜕²w/𝜕y² = −λw (23.119)
w(0, y) = 0, w(a, y) = 0 (23.120)
w(x, 0) = 0, w(x, b) = 0 (23.121)
Taking the FFT with respect to x gives
−λn⟨w(x, y), wn(x)⟩ + d²/dy² ⟨w(x, y), wn(x)⟩ = −λ⟨w(x, y), wn(x)⟩
⇒
λnm = λn + λm = π²(n²/a² + m²/b²) (23.123)
wnm(x, y) = wn(x)wm(y)
= √(4/(ab)) sin(nπx/a) sin(mπy/b) (23.124)
These are the eigenvalues and eigenfunctions of
𝕃w = −(𝜕²w/𝜕x² + 𝜕²w/𝜕y²), { w(0, y) = 0, w(a, y) = 0; w(x, 0) = 0, w(x, b) = 0 } (23.125)
in the domain Ω. Now, to solve equations (23.117)–(23.118), take the inner product with wnm:
⟨u, wnm⟩ = ⟨f, wnm⟩/λnm
⇒
u = ∑_{n=1}^∞ ∑_{m=1}^∞ ⟨u, wnm⟩ wnm
= ∑_{n=1}^∞ ∑_{m=1}^∞ [1/(π²(n²/a² + m²/b²))] [2/√(ab)] sin(nπx/a) sin(mπy/b) ⟨f, wnm⟩
with
⟨f, wnm⟩ = [2/√(ab)] ∫_0^a ∫_0^b f(ξ, η) sin(nπξ/a) sin(mπη/b) dξ dη
⇒
u(x, y) = [4/(abπ²)] ∑_{n=1}^∞ ∑_{m=1}^∞ [sin(nπx/a) sin(mπy/b)/(n²/a² + m²/b²)] ∫_0^a ∫_0^b f(ξ, η) sin(nπξ/a) sin(mπη/b) dξ dη. (23.126)
For f(x, y) = 1,
I = ∫_0^a ∫_0^b f(ξ, η) sin(nπξ/a) sin(mπη/b) dξ dη
= ∫_0^a sin(nπξ/a) dξ ∫_0^b sin(mπη/b) dη
= { 4ab/(nmπ²), n and m odd; 0, if n or m even }
⇒
u(x, y) = (16/π⁴) ∑_{i=1}^∞ ∑_{j=1}^∞ sin[(2i − 1)πx/a] sin[(2j − 1)πy/b] / {[(2i − 1)²/a² + (2j − 1)²/b²](2i − 1)(2j − 1)} (23.127)
The top diagram shows a 3D plot of the solution in the xy domain with 50 × 50 = 2500 terms in the summation, while the bottom diagram corresponds to u(x, y = 1/2) versus x using the Fourier series expansion with 2 × 2 = 4 terms (green dashed line) and 50 × 50 terms (blue solid lines). It can be seen from the bottom plot that the Fourier series solution with only 2 × 2 terms is sufficient to predict the solution with good accuracy in this case. The maximum value umax = u(1/2, 1/2) is 0.07219 using the 2 × 2 terms, as compared to 0.07367 using the 50 × 50 terms in the summation.
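The quoted values 0.07219 (2 × 2 terms) and 0.07367 (50 × 50 terms) can be reproduced from equation (23.127) with a = b = 1; a minimal Python sketch:

```python
import math

def u_center(n_terms):
    # Eq. (23.127) with a = b = 1, evaluated at the centre (1/2, 1/2)
    total = 0.0
    for i in range(1, n_terms + 1):
        for j in range(1, n_terms + 1):
            p, q = 2 * i - 1, 2 * j - 1
            total += (math.sin(p * math.pi / 2) * math.sin(q * math.pi / 2)
                      / ((p * p + q * q) * p * q))
    return 16.0 / math.pi**4 * total

print(round(u_center(2), 5), round(u_center(50), 5))
```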
f(x, y) = δ(x − a/2) δ(y − b/2).
The solution (equation (23.126)) can be simplified for this case, and further simplified for the case a = 1 and b = 1 (unit square).
Figure 23.10: Solution of 2D Poisson's equation in a square domain with f(x, y) = δ(x − 1/2)δ(y − 1/2).
The left diagram shows a 3D plot of the solution and the top-right diagram shows a contour plot of the solution in the xy domain with 50 × 50 = 2500 terms in the summation, while the bottom-right diagram corresponds to u(x, y = 1/2) versus x using the Fourier series expansion with 2 × 2 = 4 terms (green dashed line) and 50 × 50 terms (blue solid lines). It can be seen from the bottom plot that the Fourier series solution with only 2 × 2 terms is sufficient to predict the solution in most of the domain. However, the largest error occurs at the center (x = y = 1/2), where the exact solution is unbounded and the value of the truncated series grows with the number of terms retained.
𝜕²Y/𝜕ξ² + 𝜕²Y/𝜕η² + 𝜕²Y/𝜕ρ² − Λ²Y = 0 (23.131)
𝜕Y/𝜕ξ = 0 @ ξ = 0; Y = 1 @ ξ = 1 (23.132)
𝜕Y/𝜕η = 0 @ η = 0; Y = 1 @ η = 1 (23.133)
𝜕Y/𝜕ρ = 0 @ ρ = 0; Y = 1 @ ρ = 1 (23.134)
where Λ is a parameter known as the Thiele modulus. [Remark: the above model applies to 1/8th of a cube. If we define the normalized Thiele modulus as Φ² = (k/D)(Vp/Sp)², then Φ² = Λ²/9. Here, Vp is the volume of the particle and Sp is the external surface area.]
Consider the operator
𝕃1 = −𝜕²/𝜕ξ², Y′(0) = 0, Y(1) = 0 (23.135)
Eigenvalues
λn = (2n − 1)²π²/4 (23.136)
Eigenfunctions (normalized)
√2 cos[(2n − 1)πξ/2] (23.137)
Substituting Y = 1 − w gives
𝜕²w/𝜕ξ² + 𝜕²w/𝜕η² + 𝜕²w/𝜕ρ² + Λ² − Λ²w = 0 (23.138)
𝜕w/𝜕ξ = 0 @ ξ = 0; w = 0 @ ξ = 1 (23.139)
𝜕w/𝜕η = 0 @ η = 0; w = 0 @ η = 1 (23.140)
𝜕w/𝜕ρ = 0 @ ρ = 0; w = 0 @ ρ = 1 (23.141)
Eigenvalue problem
𝜕²w/𝜕ξ² + 𝜕²w/𝜕η² + 𝜕²w/𝜕ρ² = −λw (23.142)
λnml = (π²/4)[(2n − 1)² + (2m − 1)² + (2l − 1)²] (23.143)
Eigenfunctions (normalized)
wnml = 2√2 cos[(2n − 1)πξ/2] cos[(2m − 1)πη/2] cos[(2l − 1)πρ/2] (23.144)
⇒
⟨w, wnml⟩ = Λ²⟨1, wnml⟩/(Λ² + λnml)
⟨1, wnml⟩ = (∫_0^1 √2 cos[(2n − 1)πξ/2] dξ)(∫_0^1 √2 cos[(2m − 1)πη/2] dη)(∫_0^1 √2 cos[(2l − 1)πρ/2] dρ)
= [2√2(−1)^{n−1}/((2n − 1)π)] · [2√2(−1)^{m−1}/((2m − 1)π)] · [2√2(−1)^{l−1}/((2l − 1)π)]
= (2√2)³(−1)^{n+m+l−3}/[(2n − 1)(2m − 1)(2l − 1)π³]
∴
w(ξ, η, ρ) = (64/π³) ∑_{n=1}^∞ ∑_{m=1}^∞ ∑_{l=1}^∞ Λ²(−1)^{n+m+l−3} cos[(2n − 1)πξ/2] cos[(2m − 1)πη/2] cos[(2l − 1)πρ/2] / {(2n − 1)(2m − 1)(2l − 1)[Λ² + λnml]} (23.145)
η̂ = ∫_0^1 ∫_0^1 ∫_0^1 Y dξ dη dρ = 1 − ∫_0^1 ∫_0^1 ∫_0^1 w dξ dη dρ (23.146)
⇒
η̂ = 1 − (512Λ²/π⁶) ∑_{n=1}^∞ ∑_{m=1}^∞ ∑_{l=1}^∞ 1/[(2n − 1)²(2m − 1)²(2l − 1)²(Λ² + λnml)] (23.147)
Remarks.
1. Solution of the same problem in the square geometry (the two-dimensional case, for which Φ² = Λ²/4) gives
η̂ = 1 − (64Λ²/π⁴) ∑_{n=1}^∞ ∑_{m=1}^∞ 1/{(2n − 1)²(2m − 1)²[Λ² + (π²/4)((2n − 1)² + (2m − 1)²)]} (23.148)
For the one-dimensional (slab) case,
η̂ = 1 − (8Λ²/π²) ∑_{n=1}^∞ 1/{(2n − 1)²[Λ² + (π²/4)(2n − 1)²]} = tanh Λ/Λ (23.149)
A plot of the effectiveness factor for the 1D, 2D and 3D solutions is shown in Figure 23.11.
Figure 23.11: Effectiveness factor for 1D, 2D and 3D diffusion–reaction problems in rectangular coor-
dinates.
2. The above formulae may be used to obtain the small and large Φ asymptotes for
all three cases. These asymptotes can also be visualized from Figure 23.11.
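The series form of the 1D result (23.149) can be checked numerically against the closed form tanh Λ/Λ; a minimal Python sketch (Λ is written as `lam`):

```python
import math

def eta_series(lam, terms=2000):
    # Fourier-series form of the 1D effectiveness factor, equation (23.149)
    s = sum(1.0 / ((2*n - 1)**2 * (lam*lam + (math.pi**2 / 4) * (2*n - 1)**2))
            for n in range(1, terms + 1))
    return 1.0 - (8.0 * lam * lam / math.pi**2) * s

for lam in (0.5, 1.0, 5.0):
    print(lam, eta_series(lam), math.tanh(lam) / lam)
```

The agreement follows from the standard partial-fraction expansion of tanh.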
23.4 Additional applications of FFT in rectangular coordinates
Consider the 1D transient diffusion–convection model with a position- and time-dependent source term:
𝜕c/𝜕t + ⟨u⟩ 𝜕c/𝜕x = D 𝜕²c/𝜕x² + S(x, t), 0 < x < L, t > 0 (23.150)
−D 𝜕c/𝜕x = ⟨u⟩[c0(t) − c] @ x = 0, (23.151)
exit condition
𝜕c/𝜕x = 0 @ x = L
Define the dimensionless variables
z = x/L, τ = ⟨u⟩t/L (23.153)
C = c(x, t)/c*, Pe = ⟨u⟩L/D (23.154)
⇒
𝜕C/𝜕τ + 𝜕C/𝜕z = (1/Pe) 𝜕²C/𝜕z² + [L/(c*⟨u⟩)] S(Lz, Lτ/⟨u⟩)
Let
s(z, τ) = [L/(⟨u⟩c*)] S(Lz, Lτ/⟨u⟩) = dimensionless source term (23.155)
⇒
𝜕C/𝜕τ + 𝜕C/𝜕z = (1/Pe) 𝜕²C/𝜕z² + s(z, τ) (23.156)
BC1 ⇒
−(D/L) 𝜕C/𝜕z = ⟨u⟩[c0(Lτ/⟨u⟩)/c* − C]
⇒
−(1/Pe) 𝜕C/𝜕z = c0(Lτ/⟨u⟩)/c* − C
⇒
−(1/Pe) 𝜕C/𝜕z + C = ĉ0(τ) @ z = 0 (23.157)
where
ĉ0(τ) = c0(Lτ/⟨u⟩)/c* (23.158)
BC2 ⇒
𝜕C/𝜕z = 0 @ z = 1 (23.159)
IC ⇒
C = ci(Lz)/c* = ĉi(z) @ τ = 0 (23.160)
Thus, the dimensionless form of the model is
𝜕C/𝜕τ + 𝜕C/𝜕z = (1/Pe) 𝜕²C/𝜕z² + s(z, τ) (23.161)
(1/Pe) 𝜕C/𝜕z − C = −ĉ0(τ) @ z = 0 (23.162)
𝜕C/𝜕z = 0 @ z = 1 (23.163)
C = ĉi(z) @ τ = 0 (23.164)
The dimensionless group Pe is the Peclet number, which is the ratio of the diffusion time scale to the convection time scale. The problem defined by equations (23.161)–(23.164) cannot be solved by the Laplace transform method except for some special cases of the source function s(z, τ) and initial condition ĉi(z). We obtain a formal solution for the general case using FFT, and examine various special cases.
The spatial operator appearing in equation (23.161) is not formally self-adjoint. To put it in a self-adjoint form, we define
C = w(z, τ) exp[Pe z/2] (23.165)
⇒
𝜕C/𝜕τ = (𝜕w/𝜕τ) exp[Pe z/2]
𝜕C/𝜕z = (𝜕w/𝜕z) exp[Pe z/2] + (Pe/2) w exp[Pe z/2]
𝜕²C/𝜕z² = (𝜕²w/𝜕z²) exp[Pe z/2] + Pe (𝜕w/𝜕z) exp[Pe z/2] + (Pe²/4) w exp[Pe z/2]
⇒
𝜕w/𝜕τ + 𝜕w/𝜕z + (Pe/2)w = (1/Pe)[𝜕²w/𝜕z² + Pe 𝜕w/𝜕z + (Pe²/4)w] + s(z, τ)e^{−Pe z/2}
⇒
𝜕w/𝜕τ = (1/Pe) 𝜕²w/𝜕z² − (Pe/4)w + s*(z, τ) (23.166)
where s*(z, τ) = s(z, τ)e^{−Pe z/2}, with boundary and initial conditions
(1/Pe) 𝜕w/𝜕z − w/2 = −ĉ0(τ) @ z = 0 (23.167)
(1/Pe) 𝜕w/𝜕z + w/2 = 0 @ z = 1 (23.168)
w = ĉi(z) exp(−Pe z/2) ≜ ŵi(z) @ τ = 0 (23.169)
d²ψ/dz² = −Λ²ψ, 0 < z < 1 (23.170)
ψ′(0) − (Pe/2)ψ(0) = 0, ψ′(1) + (Pe/2)ψ(1) = 0 (23.171)
Let Λn² be the eigenvalues and ψn(z) the normalized eigenfunctions. Taking the inner product of equation (23.166) with ψn(z) gives
⇒
d/dτ ⟨w, ψn⟩ = (1/Pe) ⟨𝜕²w/𝜕z², ψn⟩ − (Pe/4)⟨w, ψn⟩ + ⟨s*, ψn⟩ (23.172)
Now,
⟨𝜕²w/𝜕z², ψn⟩ = [ψn 𝜕w/𝜕z − w ψn′]_0^1 + ⟨w, −Λn²ψn⟩
= (𝜕w/𝜕z)(1, τ)ψn(1) − w(1, τ)ψn′(1) − (𝜕w/𝜕z)(0, τ)ψn(0) + w(0, τ)ψn′(0) − Λn²⟨w, ψn⟩
Using the boundary conditions (23.167)–(23.168) and (23.171), this gives
d/dτ ⟨w, ψn⟩ = ψn(0)ĉ0(τ) − (Λn²/Pe)⟨w, ψn⟩ − (Pe/4)⟨w, ψn⟩ + ⟨s*, ψn⟩
⇒
d/dτ ⟨w, ψn⟩ + (Λn²/Pe + Pe/4)⟨w, ψn⟩ = ψn(0)ĉ0(τ) + ⟨s*, ψn⟩ (23.173)
IC
⟨w, ψn⟩ = ⟨ŵi, ψn⟩ @ τ = 0 (23.174)
Let
μn = Λn²/Pe + Pe/4 (23.175)
Then
⟨w, ψn⟩ = e^{−μn τ}⟨ŵi, ψn⟩ + ∫_0^τ e^{−μn(τ−τ′)} ψn(0)ĉ0(τ′) dτ′ + ∫_0^τ e^{−μn(τ−τ′)} ⟨s*(z, τ′), ψn⟩ dτ′ (23.176)
⇒
w(z, τ) = ∑_{n=1}^∞ ⟨w, ψn⟩ψn(z) (23.177)
C(z, τ) = e^{Pe z/2} ∑_{n=1}^∞ ψn(z)[e^{−μn τ} ∫_0^1 ŵi(z′)ψn(z′) dz′ + ∫_0^τ e^{−μn(τ−τ′)} ψn(0)ĉ0(τ′) dτ′ + ∫_0^τ e^{−μn(τ−τ′)} (∫_0^1 s*(z′, τ′)ψn(z′) dz′) dτ′] (23.178)
Equation (23.178) gives the general solution to the axial dispersion model. We note that
the first term is due to the initial condition, the second term is due to the inlet condition
and the third term is due to the source term. To evaluate this solution, we need to
determine the eigenvalues and normalized eigenfunctions and their dependence on
the Peclet number.
Eigenvalue problem
d²ψ/dz² = −Λ²ψ, 0 < z < 1 (23.179)
ψ′(0) = (Pe/2)ψ(0), ψ′(1) = −(Pe/2)ψ(1) (23.180)
The general solution is
ψ = c1 sin(Λz) + c2 cos(Λz)
ψ′ = c1Λ cos(Λz) − c2Λ sin(Λz)
ψ′(0) = c1Λ, ψ(0) = c2, BC1 ⇒ c1Λ = (Pe/2)c2
BC2 ⇒
c1Λ cos Λ − c2Λ sin Λ + (Pe/2)[c1 sin Λ + c2 cos Λ] = 0
⇒
c2[(Pe/2) cos Λ − Λ sin Λ + (Pe/2)(Pe/(2Λ)) sin Λ + (Pe/2) cos Λ] = 0
c2 ≠ 0 ⇒
Pe cos Λ + (Pe²/(4Λ) − Λ) sin Λ = 0
⇒
cot Λ = Λ/Pe − Pe/(4Λ) (characteristic equation), (23.181)
and the eigenfunctions are
ψn(z) = c2[(Pe/2)(sin Λn z)/Λn + cos Λn z]. (23.182)
Normalized eigenfunctions
The eigenfunctions can be normalized and the constant c2 obtained by setting
∫_0^1 ψn(z)² dz = 1
⇒
1/c2² = ∫_0^1 cos²(Λn z) dz + (Pe²/(4Λn²)) ∫_0^1 sin²(Λn z) dz + (Pe/Λn) ∫_0^1 cos(Λn z) sin(Λn z) dz
= [1/2 + sin 2Λn/(4Λn)] + (Pe²/(4Λn²))[1/2 − sin 2Λn/(4Λn)] − (Pe/Λn)(cos 2Λn − 1)/(4Λn)
= [sin Λn cos Λn/(2Λn)](1 − Pe²/(4Λn²)) + (Pe/(2Λn²)) sin²Λn + Pe²/(8Λn²) + 1/2
Using the characteristic equation (23.181),
1/c2² = 1/2 + Pe²/(8Λn²) + (Pe/(2Λn²))(sin²Λn + cos²Λn) = 1/2 + Pe²/(8Λn²) + Pe/(2Λn²)
⇒ c2 = √[8Λn²/(Pe² + 4 Pe + 4Λn²)] (23.183)
The characteristic equation (23.181) may also be written as
Pe/4 = { (Λ/2) tan(Λ/2); −(Λ/2) cot(Λ/2) } (23.184)
which can be used to determine the eigenvalues Λn for a given Peclet number Pe, as shown in Figure 23.12.
It can be seen from this figure that for any Pe, the n-th root Λn lies in the interval ((n − 1)π, nπ). For small Pe,
Λ1 ≈ √Pe and
Λn ≈ (n − 1)π + Pe/((n − 1)π) + ⋅⋅⋅, n = 2, 3, . . .
The numerical values of the first six roots Λn of the characteristic equation (23.181) are tabulated for some Pe values in Table 23.1.
Table 23.1: First six roots Λn of the characteristic equation for some Pe values.
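The roots Λn can be computed for any Pe by bisection on the brackets ((n − 1)π, nπ), since the left-hand side of Pe cos Λ + (Pe²/(4Λ) − Λ) sin Λ = 0 alternates in sign at successive multiples of π; a minimal Python sketch:

```python
import math

def char_fn(lam, pe):
    # Characteristic equation (23.181) in the form
    # Pe cos(L) + (Pe^2/(4L) - L) sin(L) = 0
    return pe * math.cos(lam) + (pe * pe / (4 * lam) - lam) * math.sin(lam)

def eigenvalue(n, pe):
    # The n-th root lies in ((n-1)pi, n pi); bisect on that bracket
    lo = max((n - 1) * math.pi, 1e-9)
    hi = n * math.pi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if char_fn(lo, pe) * char_fn(mid, pe) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

roots = [eigenvalue(n, 10.0) for n in range(1, 7)]
print([round(r, 4) for r in roots])
```

For small Pe the first root approaches √Pe, consistent with the asymptote quoted above.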
Similarly, for any value of Pe, the normalized eigenfunctions corresponding to these
eigenvalues can be determined easily. As an example, Figure 23.13 shows the plots of
normalized eigenfunctions for Pe = 10 corresponding to first six eigenvalues.
Figure 23.13: Normalized eigenfunctions corresponding to the first six eigenvalues of the self-adjoint form of the axial dispersion operator for Pe = 10.
Specific solutions
We consider some specific cases of the general solution of the axial dispersion model:
C(z, τ) = e^{Pe z/2} ∑_{n=1}^∞ e^{−μn τ} ψn(z)[∫_0^1 ŵi(z′)ψn(z′) dz′ + ∫_0^τ e^{μn τ′} ψn(0)ĉ0(τ′) dτ′ + ∫_0^τ e^{μn τ′} ∫_0^1 s*(z′, τ′)ψn(z′) dz′ dτ′]
(1) Special case 1
For a pulse source at the inlet at time zero, S(x, t) = c*L δ(x)δ(t), so that
s(z, τ) = [L/(⟨u⟩c*)] c*L δ(Lz)δ(Lτ/⟨u⟩) = (L²/⟨u⟩) δ(Lz)δ(Lτ/⟨u⟩)
Using the identity
δ(αz) = δ(z)/|α|
this reduces to s(z, τ) = δ(z)δ(τ).
where
αn = ∫_0^τ e^{−μn(τ−τ′)}[∫_0^1 e^{−Pe z′/2} ψn(z′)δ(z′) dz′]δ(τ′) dτ′
= e^{−μn τ} ψn(0). (23.192)
Also,
ψn(0)ψn(1) = c2²[(Pe/2)(sin Λn/Λn) + cos Λn]
= [4 Pe Λn²/(Pe² + 4 Pe + 4Λn²)](sin Λn/Λn)[1 + (2Λn/Pe)(Λn/Pe − Pe/(4Λn))]
= [Λn sin Λn/(Pe² + 4 Pe + 4Λn²)][2 Pe + 8Λn²/Pe]
The exit concentration E(τ) = C(z = 1, τ) (the residence time distribution of the axial dispersion model) is then
E(τ) = e^{Pe/2} ∑_{n=1}^∞ [2Λn sin Λn (Pe² + 4Λn²)/(Pe (Pe² + 4 Pe + 4Λn²))] e^{−[(Pe²+4Λn²)/(4 Pe)]τ}
⇒
E(τ) = 8e^{Pe/2} ∑_{n=1}^∞ [(−1)^{n−1}Λn²/(Pe² + 4 Pe + 4Λn²)] e^{−[(Pe²+4Λn²)/(4 Pe)]τ} (23.194)
where we used
sin Λn = (−1)^{n−1} Pe Λn/(Λn² + Pe²/4)
and noted that the sign of sin Λn depends on n, since (n − 1)π ≤ Λn ≤ nπ [for n odd the sign is positive, and for n even negative]. A plot of the solution given by equation (23.194) is shown in Figure 17.11 for Pe = 0.5, 2.0 and 5.0.
(2) Special case 2
IC:
s∗ (z ′ , τ′ ) = 0, ŵ i (z ′ ) = 0 (23.197)
ĉ0 (τ′ ) = δ(τ′ ) (23.198)
As our next example, we consider the historical problem solved by Fourier, i. e., determining the temperature distribution in a circular ring. The problem in dimensionless form is described by
𝜕u/𝜕t = 𝜕²u/𝜕θ², 0 < θ ≤ 2π, t > 0 (23.199)
with BCs (periodicity):
u(θ, t) = u(θ + 2π, t), 𝜕u/𝜕θ (θ, t) = 𝜕u/𝜕θ (θ + 2π, t) (23.200)
and IC:
u(θ, 0) = f(θ) (23.201)
The operator
−𝜕²w/𝜕θ², 0 < θ ≤ 2π (23.202)
with periodic boundary conditions has eigenvalues
λn = n², n = 0, 1, 2, . . . (23.204)
and normalized eigenfunctions
w0 = 1/√(2π)
wn = { wns = sin(nθ)/√π; wnc = cos(nθ)/√π }, n = 1, 2, . . . (23.205)
⇒
u(θ, t) = (1/2π) ∫_0^{2π} f(θ′) dθ′ + ∑_{n=1}^∞ (e^{−n²t}/π) ∫_0^{2π} [sin nθ sin nθ′ + cos nθ cos nθ′] f(θ′) dθ′
= (1/2π) ∫_0^{2π} f(θ′) dθ′ + ∑_{n=1}^∞ (e^{−n²t}/π) ∫_0^{2π} cos[n(θ − θ′)] f(θ′) dθ′ (23.206)
⇒
u(θ, 0) = f(θ) = (1/2π) ∫_0^{2π} f(θ′) dθ′ + (1/π) ∑_{n=1}^∞ ∫_0^{2π} cos[n(θ − θ′)] f(θ′) dθ′ (23.207)
and
u(θ, ∞) = (1/2π) ∫_0^{2π} f(θ′) dθ′. (23.208)
𝜕u1/𝜕t = 𝜕²u1/𝜕x² + a11u1 + a12u2
𝜕u2/𝜕t = 𝜕²u2/𝜕x² + a21u1 + a22u2
0 < x < 1, t > 0 (23.209)
The operator defined by
−𝜕²w/𝜕x² = λw, w(0) = 0, w(1) = 0 (23.212)
has eigenvalues
λn = n²π² (23.213)
and normalized eigenfunctions wn(x) = √2 sin(nπx). Let
Vin = ⟨ui, wn⟩, i = 1, 2.
Equations (23.209)–(23.211) ⇒
d/dt [V1n; V2n] = [a11 − n²π², a12; a21, a22 − n²π²][V1n; V2n] (23.216)
[V1n; V2n] = [g⁰1n; g⁰2n] @ t = 0, g⁰in = ⟨fi, wn⟩, i = 1, 2 (23.217)
Let
Bn = [a11 − n²π², a12; a21, a22 − n²π²] (23.218)
μ1n, μ2n = eigenvalues of Bn (23.219)
x1n, x2n = eigenvectors of Bn (23.220)
y*1n, y*2n = eigenrows of Bn (23.221)
Then
Vn = ∑_{j=1}^2 [y*jn g⁰n/(y*jn xjn)] e^{μjn t} xjn (23.222)
and
u = (u1(x, t); u2(x, t)) = ∑_{n=1}^∞ √2 sin(nπx) ∑_{j=1}^2 [y*jn g⁰n/(y*jn xjn)] e^{μjn t} xjn (23.223)
The growth or decay of the solution is determined by the eigenvalues of the matrix
Bn (n = 1, 2, . . .). If all μjn have a negative real part, the initial perturbations decay to the
trivial solution, while spatial or spatiotemporal patterns may be formed when μjn cross
the imaginary axis.
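A quick way to examine the sign of Re(μjn) for the first few modes is to compute the eigenvalues of Bn directly; a minimal Python sketch (the coefficients aij below are hypothetical, chosen only for illustration):

```python
import math

# Hypothetical coefficients a_ij, for illustration only
a11, a12, a21, a22 = 1.0, 2.0, -3.0, 0.5

def mu(n):
    # Eigenvalues of Bn = [[a11 - n^2 pi^2, a12], [a21, a22 - n^2 pi^2]]
    b11 = a11 - (n * math.pi)**2
    b22 = a22 - (n * math.pi)**2
    tr, det = b11 + b22, b11 * b22 - a12 * a21
    disc = tr * tr - 4.0 * det
    if disc >= 0:
        r = math.sqrt(disc)
        return ((tr - r) / 2, (tr + r) / 2)            # real pair
    r = math.sqrt(-disc)
    return (complex(tr / 2, -r / 2), complex(tr / 2, r / 2))  # complex pair

# For these sample values every mode decays: max Re(mu_jn) < 0
print([max(z.real for z in mu(n)) for n in (1, 2, 3)])
```

Since the diagonal entries contain −n²π², higher modes are increasingly stable; only the lowest modes can cross the imaginary axis.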
Problems
1. Consider the problem of unsteady-state heat/mass transfer in a flat plate:
𝜕²θ/𝜕x² = 𝜕θ/𝜕τ; 0 < x < 1, τ > 0
θ(x, 0) = f(x)
𝜕θ/𝜕x (0, τ) = 0
𝜕θ/𝜕x (1, τ) + Bi θ(1, τ) = 0
(a) Determine the solution using the finite Fourier transformation. Compare your result with that in Carslaw and Jaeger [10].
(b) Consider the case in which f(x) = 1. What is the limiting form of the solution for the cases of no external resistance (Bi → ∞) and no internal resistance (Bi → 0)?
2. (a) Obtain the solution of the Poisson’s equation
∇2 u = −f in Ω
u=0 on 𝜕Ω
in two and three dimensions when Ω is a rectangular region. Identify the Green's function and give a physical interpretation.
(b) The velocity profile for slow viscous flow of a fluid in a rectangular channel is given by
𝜕²u/𝜕x² + 𝜕²u/𝜕y² = Δp/(μL)
u = 0 @ x = 0 and x = a
u = 0 @ y = 0 and y = b
3. Consider heat transfer in laminar channel flow (the Graetz problem):
kf 𝜕²T/𝜕y² = ⟨u⟩ρCp (3/2)(1 − y²/a²) 𝜕T/𝜕x
T = Tw @ y = ±a, T = Tin F(y) @ x = 0
(a) Cast into dimensionless form and find the formal solution without determining
the eigenvalues and eigenfunctions (Graetz functions) explicitly. (b) Determine an
expression for the cup-mixing (velocity weighted) temperature (Tm ) for the case
of uniform inlet temperature, i. e., F(y) = 1. (c) If the heat transfer coefficient (h) is defined by
h(x) = −kf (𝜕T/𝜕y)(x, y = a)/(Tm − Tw),
where Tw is the wall temperature, obtain an expression for the dimensionless heat transfer coefficient (or the local Nusselt number)
23.4 Additional applications of FFT in rectangular coordinates | 593
Nu(x) = h(x)a/kf
(d) Determine the two asymptotes (short and long distance) of the Nusselt number
as a function of position
4. (a) Given the operator
Lw = −d²w/dx², 0 < x < 1; w′(0) = 0, w′(1) = 0
𝜕²u/𝜕x² = 𝜕u/𝜕t; 0 < x < 1, t > 0
u′(0, t) = 0, u′(1, t) = 0, u(x, 0) = δ(x − 1/2)
5. (a) Solve Laplace's equation in the unit square:
𝜕²u/𝜕x² + 𝜕²u/𝜕y² = 0, 0 < x < 1, 0 < y < 1
u(x, 0) = f(x), u(x, 1) = 0; u(0, y) = 0, u(1, y) = 0
(b) Simplify the solution for the special case of f (x) = 1 and plot a few isotherms
(corresponding to constant values of u).
6. Solve the problem
D 𝜕²C/𝜕x² = 𝜕C/𝜕t; 0 < x < L, t > 0
𝜕C/𝜕x = 0 @ x = 0, −D 𝜕C/𝜕x = kg[C − C0(t)] @ x = L
C = 0 @ t = 0
7. Consider the adsorption model
ε[−D 𝜕²C/𝜕x² + u 𝜕C/𝜕x + 𝜕C/𝜕t] + (1 − ε) 𝜕n/𝜕t = 0, 0 < x < L
(1 − ε) 𝜕n/𝜕t = kg a(C − C*), n = kC* (equilibrium relation)
−D 𝜕C/𝜕x = u(C0 − C) @ x = 0 (C0 is a constant), 𝜕C/𝜕x = 0 @ x = L
C = f(x), n = g(x) @ t = 0
Determine the solution. This is adsorption for the case in which adsorption is mass
transfer limited. C(x, t) is the concentration in the interstitial fluid and n(x, t) is the
concentration in the solid phase.
8. Consider the steady-state problem of diffusion and surface reaction in a rectangular pore. The relevant equations are given by
𝜕²C/𝜕y² + 𝜕²C/𝜕z² = 0, −H < y < H, 0 < z < L
C = C0 @ z = 0; 𝜕C/𝜕z = 0 @ z = L
±D 𝜕C/𝜕y + ks C = 0 @ y = ±H
Cast into dimensionless form and determine the solution. Use the solution to determine the effectiveness factor (the ratio of the actual reaction rate in the pore to that if the concentration at all points inside were equal to C0).
9. Transient convection–reaction problems in one spatial dimension are described by hyperbolic systems of equations of the form
C1(x) 𝜕y/𝜕t + C2(x) 𝜕y/𝜕x = C3(x)y; a < x < b, t > 0
BC: Wa y(a, t) + Wb y(b, t) = 0, t > 0
IC: y(x, 0) = f(x)
10. Consider the axial dispersion model for a tracer:
𝜕c/𝜕t + u 𝜕c/𝜕x = D 𝜕²c/𝜕x² + S(x, t); 0 < x < L, t > 0
BC: −D 𝜕c/𝜕x = u[c0(t) − c] @ x = 0, 𝜕c/𝜕x = 0 @ x = L, t > 0
IC: c = ci(x) @ t = 0, 0 < x < L.
Here, c is the concentration of the tracer, u is the average velocity of the stream,
D is the effective axial dispersion coefficient, S is the sources/sinks of tracer, c0 is
the inlet concentration of tracer and ci is the initial distribution of tracer.
(a) Cast the equations into dimensionless form.
(b) Obtain a formal solution to the model in (a).
(c) Simplify the solution for the special case of c0 (t) = 0, ci (x) = 0 and unit pulse
at the inlet at time zero.
(Note: The solution in case (c) gives the residence time distribution function for
the axial dispersion model.)
11. Transient diffusion–convection–reaction problems for N chemical species are described by coupled parabolic equations of the form
D 𝜕²c/𝜕x² − u 𝜕c/𝜕x − Kc = 𝜕c/𝜕t, 0 < x < L, t > 0
B.C.: −D 𝜕c/𝜕x = u(c0 − c) @ x = 0; 𝜕c/𝜕x = 0 @ x = L, t > 0
I.C.: c = f(x), t = 0
where D is the dispersion coefficient, u is the velocity, K is the matrix of rate con-
stants and c is the concentration vector.
(a) Cast the equations into dimensionless form and identify the linear operators
of interest.
(b) Indicate how the equations may be decoupled into N scalar equations.
(c) Obtain a formal solution to each scalar equation. Write down the form of the
solution to the complete system of equations.
12. Obtain a formal solution to the system of partial differential equations
𝜕²u1/𝜕x² + 𝜕²u1/𝜕z² + a1 𝜕u2/𝜕x = f1(x, z)
𝜕²u2/𝜕x² + 𝜕²u2/𝜕z² + a1 𝜕u1/𝜕x = f2(x, z)
0 < x < 1, 0 < z < 1
13. Obtain a formal solution of the coupled system
𝜕u/𝜕t = D∇²u + Au in Ω
with either
u = 0 on 𝜕Ω (Dirichlet boundary conditions)
or
∇u·n = 0 on 𝜕Ω
14. Consider the operator
Lu(x) = −i du/dx; i = √−1
(a) Show that this operator is self-adjoint w. r. t. the usual inner product on V, i. e.,
⟨u, v⟩ = ∫_{0}^{a} u(x) v̄(x) dx
15. Consider the initial-boundary value problem
𝜕²u/𝜕x² = 𝜕u/𝜕t; 0 < x < 1, t > 0
u(0, t) = 0, u(1, t) = 0
u(x, 0) = f(x)
Simplify the solution for the special case of f (x) = 1 and show that for short times
it reduces to the error function solution.
16. Consider the solution of Laplace’s equation in the rectangle
𝜕²u/𝜕x² + 𝜕²u/𝜕y² = 0; −a < x < a, 0 < y < b (a, b > 0)
u(−a, y) = 0, u(a, y) = 0, u(x, b) = 0, u(x, 0) = f(x)
d²u/dx² = −λu, −a < x < a (24.1)
u(−a) = u(a), u′(−a) = u′(a) (periodic BCs) (24.2)
Eigenvalues:
λn = n²π²/a², n = 0, 1, 2, . . . (24.3)
Normalized eigenfunctions:
y0(x) = 1/√(2a) (24.4)
un(x) = (1/√a) sin(nπx/a) (24.5)
yn(x) = (1/√a) cos(nπx/a); each λn (n > 0) is double (24.6)
Note that
∫_{−a}^{a} y0 un(x) dx = 0, n ≠ 0; ∫_{−a}^{a} y0 yn(x) dx = 0, n ≠ 0 (orthogonality relations) (24.7)
∫_{−a}^{a} ym(x) yn(x) dx = { 0, m ≠ n; 1, m = n } ⇒ ∫_{−a}^{a} yn(x) ym(x) dx = δmn (24.8)
∫_{−a}^{a} um(x) un(x) dx = { 0, m ≠ n; 1, m = n } ⇒ ∫_{−a}^{a} un(x) um(x) dx = δmn (24.9)
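These orthonormality relations are easy to verify numerically. A minimal Python sketch (the value a = 2 and the quadrature grid are arbitrary choices, not from the text):

```python
import numpy as np

a = 2.0
x = np.linspace(-a, a, 20001)

def y(n):
    # cosine eigenfunctions (24.4), (24.6); n = 0 gives the constant mode y0
    return np.full_like(x, 1/np.sqrt(2*a)) if n == 0 else np.cos(n*np.pi*x/a)/np.sqrt(a)

def u(n):
    # sine eigenfunctions (24.5)
    return np.sin(n*np.pi*x/a)/np.sqrt(a)

def inner(f, g):
    # trapezoid rule for the inner product on [-a, a]
    h = f*g
    return float(np.sum((h[:-1] + h[1:])/2)*(x[1] - x[0]))

print(inner(y(2), y(2)), inner(y(1), y(3)), inner(u(1), y(1)))
```

The first value is ≈ 1 and the other two vanish, in agreement with (24.7)–(24.9).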
https://doi.org/10.1515/9783110739701-025
24.1 Fourier transform on (−∞, ∞)
{λn, un(x), yn(x)} form a basis for ℒ²[−a, a], the Hilbert space of periodic functions with the standard inner product
⟨u, y⟩ = ∫_{−a}^{a} u y dx (24.11)
⇒
f(x) = (1/2a) ∫_{−a}^{a} f(ξ) dξ + Σ_{n=1}^{∞} [(1/a) sin(nπx/a) ∫_{−a}^{a} sin(nπξ/a) f(ξ) dξ + (1/a) cos(nπx/a) ∫_{−a}^{a} cos(nπξ/a) f(ξ) dξ].
Thus,
f(x) = (1/2a) ∫_{−a}^{a} f(ξ) dξ + (1/a) Σ_{n=1}^{∞} ∫_{−a}^{a} cos[nπ(x − ξ)/a] f(ξ) dξ (24.16)
This is an identity for any f ∈ ℒ2 [−a, a]. We now obtain the Fourier integral formula
from this equation.
Then, as we let a → ∞, the first term on the RHS of equation (24.16) goes to zero.
Define
αn = nπ/a ⇒ Δαn = αn+1 − αn = π/a (24.18)
The sum
Σ_{n=1}^{∞} F(αn) Δαn, F(αn) = (1/π) ∫_{−a}^{a} cos[αn(x − ξ)] f(ξ) dξ,
is a Riemann sum that, as a → ∞, approaches the integral
∫_{0}^{∞} F(α) dα
so that
f(x) = (1/π) ∫_{0}^{∞} ∫_{−∞}^{∞} cos[α(x − ξ)] f(ξ) dξ dα (24.20)
This formula is also an identity for any f ∈ ℒ2 (−∞, ∞). This is known as the Fourier
integral formula. Since cosine is an even function and sine is an odd function, equation
(24.20) may be written as
f(x) = (1/2π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ) cos[α(x − ξ)] dξ dα + (i/2π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ) sin[α(x − ξ)] dξ dα
(the second term may be added since its integrand is odd in α and hence vanishes)
⇒
f(x) = (1/2π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ) e^{iα(x−ξ)} dξ dα
= (1/2π) ∫_{−∞}^{∞} (∫_{−∞}^{∞} f(ξ) e^{−iαξ} dξ) e^{iαx} dα. (24.22)
If we define
F(α) = ∫_{−∞}^{∞} f(ξ) e^{−iαξ} dξ (24.23)
then
f(x) = (1/2π) ∫_{−∞}^{∞} F(α) e^{iαx} dα. (24.24)
This transform is useful in solving equations with a second derivative operator (as well
as derivatives of other orders) on an infinite domain:
−d²/dx², −∞ < x < ∞
Examples include the heat equation
𝜕²u/𝜕x² = 𝜕u/𝜕t, −∞ < x < ∞; u(x, 0) = f(x)
the wave equation
c² 𝜕²u/𝜕x² = 𝜕²u/𝜕t², −∞ < x < ∞; u(x, 0) = f(x), 𝜕u/𝜕t(x, 0) = g(x)
and Laplace's equation
𝜕²u/𝜕x² + 𝜕²u/𝜕y² = 0, −a < x < a, −∞ < y < ∞
Consider the eigenvalue problem
d²u/dx² = −λu, −∞ < x < ∞ (24.25)
We note that u = e^{iαx} (with α = ±√λ) satisfies the equation and is bounded if α is real. Now, whereas on a finite interval the operator generates a countably infinite sequence of eigenvalues, as the size of the interval is increased to infinity the spectrum becomes continuous and the finite Fourier transform becomes a continuous function, "the Fourier transform."
Remark. Since the cosine is an even function, the Fourier integral formula, given in
equation (24.20), may also be written as
f(x) = (1/π) ∫_{0}^{∞} ∫_{−∞}^{∞} f(ξ) cos α(x − ξ) dξ dα
= (1/π) ∫_{0}^{∞} ∫_{−∞}^{∞} f(ξ) cos α(ξ − x) dξ dα (24.29)
⇒
24.2 Finite Fourier transform and the Fourier transform
f(x) = (1/2π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ) cos[α(ξ − x)] dξ dα + (i/2π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ) sin[α(ξ − x)] dξ dα
⇒
f(x) = (1/2π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ) e^{iα(ξ−x)} dξ dα
= (1/2π) ∫_{−∞}^{∞} (∫_{−∞}^{∞} f(ξ) e^{iαξ} dξ) e^{−iαx} dα (24.30)
If we define
F(α) = ∫_{−∞}^{∞} e^{iαξ} f(ξ) dξ ⇒ f(x) = (1/2π) ∫_{−∞}^{∞} e^{−iαx} F(α) dα (24.31)
Many authors (Carslaw and Jaeger [10]; Sneddon [28]) use these (equation (24.31)) as the transform pair. However, we shall follow the notation used by Churchill [12] and use the pair defined in equations (24.23) and (24.24) as
F(α) = ℱ{f(x)} = ∫_{−∞}^{∞} e^{−iαx} f(x) dx (24.32)
f(x) = ℱ^{−1}{F(α)} = (1/2π) ∫_{−∞}^{∞} e^{iαx} F(α) dα (24.33)
The transforms given in equations (24.31) and (24.32)–(24.33) differ only by a change
of sign in α. Since α and ξ vary from −∞ to ∞, they are equivalent.
Example 24.1. Consider the function (also known as the decaying pulse)
f(x) = { e^{−cx}, x ≥ 0; 0, x < 0 (24.34)
Its Fourier transform is
F(α) = ∫_{−∞}^{∞} e^{−iαξ} f(ξ) dξ = ∫_{0}^{∞} e^{−iαξ} · e^{−cξ} dξ
= [e^{−(iα+c)ξ}/(−(iα + c))]_{0}^{∞}
= 1/(c + iα) = (c − iα)/(c² + α²) (24.35)
⇒
f(x) = (1/2π) ∫_{−∞}^{∞} e^{iαx} (1/(c + iα)) dα
= (1/2π) ∫_{−∞}^{∞} ((c − iα)/(c² + α²)) e^{iαx} dα; (substituting c + iα = s)
= (1/2πi) ∫_{c−i∞}^{c+i∞} e^{(s−c)x} (1/s) ds
= e^{−cx} · ℓ^{−1}{1/s}
= { e^{−cx}, x ≥ 0; 0, x < 0
For an extensive table of Fourier transforms, see the book by Churchill [11].
We note that
‖f‖² = ∫_{0}^{∞} e^{−2cx} dx = 1/(2c) = (1/2π) ∫_{−∞}^{∞} |F(α)|² dα.
This relation is the analogue of Parseval's theorem and is known as Plancherel's theorem. We discuss it in more general form below.
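This identity is easy to confirm numerically for the decaying pulse. A small Python sketch (c = 2 is an arbitrary choice):

```python
import numpy as np
from scipy.integrate import quad

c = 2.0
# left side: integral of f(x)^2 with f(x) = e^{-cx} for x >= 0
lhs = quad(lambda x: np.exp(-2*c*x), 0, np.inf)[0]
# right side: (1/2pi) * integral of |F(alpha)|^2, with |F|^2 = 1/(c^2 + alpha^2) from (24.35)
rhs = quad(lambda al: 1/(c**2 + al**2), -np.inf, np.inf)[0]/(2*np.pi)
print(lhs, rhs)  # both ≈ 1/(2c) = 0.25
```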
d²u/dx² = −α²u, −∞ < x < ∞ (24.36)
Every α² > 0 is an eigenvalue with eigenfunction u_α(x) = e^{−iαx}. Note that e^{iαx} is also an eigenfunction corresponding to eigenvalue α². However, if we let α vary from −∞ to ∞ we need to consider only one of the eigenfunctions. Thus, we have a continuous spectrum. To show that the eigenvalue problem is self-adjoint, we consider two functions u, v ∈ ℒ²(−∞, ∞), a Hilbert space with the usual inner product. Now,
⟨Lu, v⟩ = ∫_{−∞}^{∞} (𝜕²u/𝜕x²) v(x) dx
= (v 𝜕u/𝜕x − u 𝜕v/𝜕x)|_{−∞}^{∞} + ∫_{−∞}^{∞} u (𝜕²v/𝜕x²) dx
= ⟨u, Lv⟩ + (v 𝜕u/𝜕x − u 𝜕v/𝜕x)|_{−∞}^{∞} (24.37)
Thus, we may use the formalism we had before. Consider again the Hilbert space ℒ²(−∞, ∞) with the usual inner product
⟨u, v⟩ = ∫_{−∞}^{∞} u(x) v̄(x) dx (24.38)
then
F(α) = ⟨f(x), e^{iαx}⟩ = ∫_{−∞}^{∞} e^{−iαξ} f(ξ) dξ (24.39)
Thus, F(α) plays the role of the coefficients in the finite Fourier transform, as already
mentioned earlier.
1. Differentiation:
ℱ{d^m f/dx^m} = (iα)^m F(α), m = 1, 2, 3, . . . (24.42)
2. Multiplication by x:
ℱ{x f(x)} = i dF/dα (24.43)
3. Shift in x:
ℱ{f(x + c)} = e^{iαc} F(α), c real (24.44)
4. Shift in α:
ℱ{f(x) e^{icx}} = F(α − c), c real (24.45)
5. Scaling in x:
ℱ{(1/|c|) f(x/c)} = F(αc), c real, c ≠ 0 (24.46)
6. Reflection in x:
ℱ{f(−x)} = F(−α)
8. Convolution:
ℱ{∫_{−∞}^{∞} f(x − s) g(s) ds} = F(α) G(α)
9. Dirac delta function:
ℱ{δ(x − s)} = e^{−iαs}, δ(x − s) = (1/2π) ∫_{−∞}^{∞} e^{iα(x−s)} dα (24.50)
10. Moment expansion:
F(α) = Σ_{k=0}^{∞} M_k (−iα)^k / k! (24.51)
where M_k is the kth spatial moment of f(x) [see further explanation in the next section].
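Properties such as the differentiation rule (24.42) can be spot-checked numerically. A Python sketch using a Gaussian f(x) = e^{−x²/2}, whose transform under (24.32) is √(2π) e^{−α²/2} (the grid and the value α = 1.3 are arbitrary choices):

```python
import numpy as np

x = np.linspace(-20, 20, 40001)
dx = x[1] - x[0]
f  = np.exp(-x**2/2)
fp = -x*np.exp(-x**2/2)          # f'(x)

def ft(g, alpha):
    # F(alpha) = integral of g(x) e^{-i alpha x}, trapezoid rule
    h = g*np.exp(-1j*alpha*x)
    return complex(np.sum((h[:-1] + h[1:])/2)*dx)

alpha = 1.3
lhs = ft(fp, alpha)              # transform of f'
rhs = 1j*alpha*ft(f, alpha)      # (i alpha) F(alpha): property (24.42) with m = 1
print(abs(lhs - rhs), abs(ft(f, alpha) - np.sqrt(2*np.pi)*np.exp(-alpha**2/2)))
```

Both residuals are at the quadrature-error level, confirming the rule and the analytic transform.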
‖f‖² = ∫_{−∞}^{∞} |f(x)|² dx = (1/2π) ∫_{−∞}^{∞} |F(α)|² dα. (24.53)
The LHS of equation (24.53) may be interpreted as the total energy content of f(x) while the RHS represents the same in the frequency domain.
To derive the moment expansion, write
F(α) = ∫_{−∞}^{∞} f(ξ) e^{−iαξ} dξ = ∫_{−∞}^{∞} f(ξ) Σ_{k=0}^{∞} (−iαξ)^k/k! dξ
⇒
F(α) = Σ_{k=0}^{∞} ((−iα)^k/k!) ∫_{−∞}^{∞} ξ^k f(ξ) dξ = Σ_{k=0}^{∞} M_k (−iα)^k/k!, (24.54)
where the spatial moments are
M_k = ∫_{−∞}^{∞} ξ^k f(ξ) dξ, k = 0, 1, 2, . . . (24.55)
and the central moments are
m_k = ∫_{−∞}^{∞} (ξ − M₁)^k f(ξ) dξ, k = 1, 2, 3, . . . (24.56)
Thus, if the Fourier transform of a function f(x) is known, we can determine the kth spatial moment [or temporal moment for a function of time] without inverting the Fourier transform; from equation (24.51),
M₀ = F|_{α=0}
M₁ = i (dF/dα)|_{α=0}
M₂ = i² (d²F/dα²)|_{α=0}
⋮
M_k = i^k (d^k F/dα^k)|_{α=0}, k = 0, 1, 2, . . . (24.57)
Example: Consider the dispersion of a tracer pulse,
𝜕c/𝜕t + u 𝜕c/𝜕x = Dm 𝜕²c/𝜕x², −∞ < x < ∞, t > 0
with initial condition
c(x, 0) = δ(x);
⇒
dĉ/dt + u(iα)ĉ = Dm(iα)²ĉ; ĉ(t = 0) = 1
⇒
ĉ = exp[−(iαu + α²Dm)t]
= 1 − (iαu + α²Dm)t + (iαu + α²Dm)²t²/2! − ⋯
= 1 − iαut − α²(Dm t + u²t²/2) + ⋯
⇒
M₀ = 1,
M₁ = ut,
M₂ = 2Dm t + u²t²,
M₃ = 6Dm t²u + u³t³,
m₂ = σ² = M₂ − M₁² = 2Dm t,
m₃ = 0.
It can be shown that all the odd central moments are zero. Thus, the dispersion is
symmetric around the centroid located at ut. [Remark: This result is not valid in a
finite domain due to inlet and exit boundary conditions.]
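These moment results can be reproduced numerically from the Gaussian solution c(x, t) = e^{−(x−ut)²/4Dm t}/√(4πDm t). A Python sketch (the parameter values are arbitrary choices):

```python
import numpy as np

u_, Dm, t = 1.5, 0.1, 2.0
x = np.linspace(-20.0, 30.0, 200001)
dx = x[1] - x[0]
# Gaussian solution of the pulse-dispersion problem
c = np.exp(-(x - u_*t)**2/(4*Dm*t))/np.sqrt(4*np.pi*Dm*t)

def M(k):
    # spatial moments (24.55) by the rectangle rule
    return float(np.sum(x**k*c)*dx)

print(M(0), M(1), M(2) - M(1)**2)  # ≈ 1, u t, and 2 Dm t
```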
The transform pair in the spatial frequency α (rad/cm) is
F(α) = ℱ{f(x)} = ∫_{−∞}^{∞} e^{−iαx} f(x) dx (24.58)
f(x) = ℱ^{−1}{F(α)} = (1/2π) ∫_{−∞}^{∞} e^{iαx} F(α) dα
If we use the cyclic frequency ω = α/2π (cycle/cm), the transform pair is defined by
F(2πω) = ∫_{−∞}^{∞} e^{−2πiωx} f(x) dx, f(x) = ∫_{−∞}^{∞} e^{2πiωx} F(2πω) dω (24.59)
Note that the constant multiplier 2π in the inverse FT in spatial frequency has disappeared when cyclic frequency is used. The cyclic frequency definition is used in analyzing signals in time or spatially periodic structures.
In 2D, with spatial frequencies α₁ and α₂ in rad/cm, the pair is
F(α₁, α₂) = ∫∫ e^{−i(α₁x+α₂y)} f(x, y) dx dy, f(x, y) = (1/(2π)²) ∫∫ e^{i(α₁x+α₂y)} F(α₁, α₂) dα₁ dα₂
If we use cyclic frequencies ω₁ = α₁/2π and ω₂ = α₂/2π in cycle/cm, the transform pair can be defined by
F(2πω₁, 2πω₂) = ∫∫ e^{−2πi(ω₁x+ω₂y)} f dx dy, f = ∫∫ e^{2πi(ω₁x+ω₂y)} F dω₁ dω₂
Similarly, in 3D, we can define the pair using the spatial frequency vector α = (α₁, α₂, α₃) as
F(α) = ∫∫∫ e^{−iα·x} f(x) dx, f(x) = (1/(2π)³) ∫∫∫ e^{iα·x} F(α) dα
or in the cyclic frequency vector ω = (ω₁, ω₂, ω₃) as
F(2πω) = ∫∫∫ e^{−2πiω·x} f(x) dx, f(x) = ∫∫∫ e^{2πiω·x} F(2πω) dω
where x = (x, y, z) is the vector of spatial coordinates, and α·x = α₁x + α₂y + α₃z and ω·x = ω₁x + ω₂y + ω₃z represent the usual dot product in ℝ³.
Let F(α) = ℱ{f(x)} and G(α) = ℱ{g(x)}; then
∫_{−∞}^{∞} f(x) ḡ(x) dx = (1/2π) ∫_{−∞}^{∞} F(α) Ḡ(α) dα (24.65)
[As shown below, the 2π factor disappears if we use cyclic frequencies.] As stated earlier, this is known as Plancherel's theorem. A proof of this theorem uses the integral representation of the Dirac delta function:
δ(α − α′) = (1/2π) ∫_{−∞}^{∞} e^{ix(α−α′)} dx (24.66)
Since we have
f(x) = (1/2π) ∫_{−∞}^{∞} e^{iαx} F(α) dα, ḡ(x) = (1/2π) ∫_{−∞}^{∞} e^{−iα′x} Ḡ(α′) dα′,
we can write
∫_{−∞}^{∞} f(x) ḡ(x) dx = (1/(2π)²) ∫∫∫ e^{iαx} F(α) e^{−iα′x} Ḡ(α′) dα′ dα dx
= (1/2π) ∫∫ F(α) Ḡ(α′) ((1/2π) ∫ e^{iαx} e^{−iα′x} dx) dα′ dα
= (1/2π) ∫∫ F(α) Ḡ(α′) δ(α − α′) dα′ dα
= (1/2π) ∫_{−∞}^{∞} F(α) Ḡ(α) dα.
Setting g = f,
∫_{−∞}^{∞} |f(x)|² dx = (1/2π) ∫_{−∞}^{∞} |F(α)|² dα (24.67)
= ∫_{−∞}^{∞} |F(2πω)|² dω, ω = α/2π. (24.68)
24.3 Solution of BVPs and IBVPs in infinite intervals using the FT
Consider the heat equation on the line:
𝜕²u/𝜕x² = 𝜕u/𝜕t, −∞ < x < ∞ (24.69)
u(x, 0) = f(x) (24.70)
Let
û(α, t) = ℱ{u(x, t)} = ⟨u, e^{iαx}⟩ = ∫_{−∞}^{∞} u(x, t) e^{−iαx} dx
Taking the transform of the equation,
⟨𝜕²u/𝜕x², e^{iαx}⟩ = ⟨𝜕u/𝜕t, e^{iαx}⟩
⇒
−α² ⟨u, e^{iαx}⟩ = (d/dt)⟨u, e^{iαx}⟩ = dû/dt
⇒
⟨u, e^{iαx}⟩ = K e^{−α²t}
⇒
lim_{t→0} ⟨u, e^{iαx}⟩ = K lim_{t→0} e^{−α²t} or ⟨lim_{t→0} u, e^{iαx}⟩ = K, so K = ℱ{f(x)} = F(α).
∴
⟨u, e^{iαx}⟩ = F(α) e^{−α²t} (24.73)
⇒
u(x, t) = (1/2π) ∫_{−∞}^{∞} e^{iαx} F(α) e^{−α²t} dα
= (1/2π) ∫_{−∞}^{∞} e^{iαx} e^{−α²t} (∫_{−∞}^{∞} e^{−iαξ} f(ξ) dξ) dα
= (1/2π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{iα(x−ξ)−α²t} f(ξ) dξ dα (24.74)
Direct method
Let
I = ∫_{−∞}^{∞} e^{−α²t+iα(x−ξ)} dα
= ∫_{−∞}^{∞} e^{−α²t} [cos α(x − ξ) + i sin α(x − ξ)] dα
= ∫_{−∞}^{∞} e^{−α²t} cos α(x − ξ) dα + i ∫_{−∞}^{∞} e^{−α²t} sin α(x − ξ) dα
= 2 ∫_{0}^{∞} e^{−α²t} cos α(x − ξ) dα
= 2 · √(π/4t) · e^{−(x−ξ)²/4t}
∴ Equation (24.74) ⇒
u(x, t) = (1/2π) ∫_{−∞}^{∞} 2 · √(π/4t) · e^{−(x−ξ)²/4t} f(ξ) dξ
= (1/√(4πt)) ∫_{−∞}^{∞} e^{−(x−ξ)²/4t} f(ξ) dξ (24.75)
Alternatively, completing the square in the exponent,
−α²t + iα(x − ξ) = −t(α − iδ)² − (x − ξ)²/4t,
where
δ = (x − ξ)/2t (24.77)
∴
I = e^{−(x−ξ)²/4t} J, (24.78)
J = ∫_{−∞}^{∞} e^{−t(α−iδ)²} dα (24.79)
To evaluate J, we use Cauchy's theorem around the contour shown in Figure 24.1.
Figure 24.1: Schematic of the contour for Cauchy theorem to evaluate integral.
Since g(z) = e^{−t(z−iδ)²} is analytic inside and on the boundary of C,
∮_C e^{−t(z−iδ)²} dz = 0,
⇒
∫_{−R}^{R} e^{−t(x−iδ)²} dx + ∫_{Γ₂} e^{−t(z−iδ)²} dz + ∫_{R}^{−R} e^{−tx²} dx + ∫_{Γ₄} e^{−t(z−iδ)²} dz = 0 (24.80)
On Γ₂ (z = R + iu, 0 ≤ u ≤ δ),
∫_{Γ₂} e^{−t(z−iδ)²} dz = i ∫_{0}^{δ} e^{−t[R+iu−iδ]²} du = i e^{−tR²} ∫_{0}^{δ} e^{t(u−δ)²} · e^{−2iRt(u−δ)} du → 0 for R → ∞
Similarly,
lim_{R→∞} ∫_{Γ₄} = 0
Equation (24.80) ⇒
∫_{−∞}^{∞} e^{−t(x−iδ)²} dx = ∫_{−∞}^{∞} e^{−tx²} dx = √(π/t) (24.81)
∴
u(x, t) = (1/√(4πt)) ∫_{−∞}^{∞} e^{−(x−ξ)²/4t} f(ξ) dξ (24.82)
The integral on the RHS of equation (24.82), as well as all of its derivatives w. r. t. x and t, converges absolutely and uniformly in both x and t for t > 0. Thus, differentiation under the integral sign is valid. To verify the initial condition, let
(x − ξ)/√(4t) = y ⇒ ξ = x − y√(4t) ⇒ dξ = −√(4t) dy (24.83)
⇒
u(x, t) = (1/√(4πt)) ∫_{∞}^{−∞} e^{−y²} f(x − 2y√t) · (−√(4t)) dy
= (1/√π) ∫_{−∞}^{∞} e^{−y²} f(x − 2y√t) dy (24.84)
If f(x) is sectionally smooth or piecewise continuous, then the integral converges uniformly and absolutely in x and t. Hence, we can take the limit under the integral sign,
lim_{t→0} u(x, t) = (1/√π) ∫_{−∞}^{∞} e^{−y²} lim_{t→0} f(x − 2y√t) dy = f(x) (24.85)
∴ For all f(x) for which ∫_{−∞}^{∞} |f(x)| dx exists, the solution of equations (24.69)–(24.70) is
u(x, t) = (1/√(4πt)) ∫_{−∞}^{∞} e^{−(x−ξ)²/4t} f(ξ) dξ (24.91)
Physical interpretation
Let
f(x) = δ(x − s) = unit source of heat at position x = s at time t = 0 (24.86)
Then
u(x, t) = (1/√(4πt)) e^{−(x−s)²/4t}
= temperature at position x and time t due to a unit source at position s at t = 0 (24.87)
More generally,
W(x, t, s, τ) = (1/√(4π(t − τ))) e^{−(x−s)²/4(t−τ)}
= temperature at position x at time t due to a unit source at position s at time τ (t > τ) (24.88)
W is called a fundamental solution (or the Green's function) of the heat equation. It satisfies the equation
𝜕W/𝜕t − 𝜕²W/𝜕x² = δ(x − s)δ(t − τ), (24.89)
as well as the adjoint equation
−𝜕W/𝜕τ − 𝜕²W/𝜕s² = δ(x − s)δ(t − τ) (24.90)
This solution is valid for any piecewise continuous f (ξ ) for which the integral in equa-
tion (24.91) exists. We now consider some special cases:
Special case
Consider
f(ξ) = { 1, ξ ≤ 0; 0, ξ > 0 (24.92)
⇒
u = (1/√(4πt)) ∫_{−∞}^{0} e^{−(x−ξ)²/4t} dξ (24.93)
Let
(x − ξ)/√(4t) = η
⇒
u = (1/√(4πt)) ∫_{∞}^{x/√(4t)} e^{−η²} (−√(4t)) dη = (1/√π) ∫_{x/√(4t)}^{∞} e^{−η²} dη
= (1/√π) [∫_{0}^{∞} e^{−η²} dη − ∫_{0}^{x/√(4t)} e^{−η²} dη]
= (1/2)[1 − (2/√π) ∫_{0}^{x/√(4t)} e^{−η²} dη]
∴
u(x, t) = (1/2)[1 − erf(x/√(4t))] (24.94)
Figure 24.2: Temporal evolution of dimensionless temperature distribution for 1D transient diffusion
in infinite domain.
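The closed form (24.94) can be checked against direct quadrature of the kernel representation (24.93). A minimal Python sketch (the values of t and the sample points are arbitrary choices):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erf

t = 0.5

def u_kernel(x):
    # (24.93): heat kernel integrated against the step initial data f = 1 on (-inf, 0]
    val, _ = quad(lambda xi: np.exp(-(x - xi)**2/(4*t)), -np.inf, 0)
    return val/np.sqrt(4*np.pi*t)

def u_closed(x):
    # (24.94)
    return 0.5*(1 - erf(x/np.sqrt(4*t)))

for xv in (-1.0, 0.0, 0.7, 2.0):
    print(xv, u_kernel(xv), u_closed(xv))
```

The two columns agree to quadrature accuracy, and u(0, t) = 1/2 for all t > 0, as the symmetry of the problem requires.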
If the diffusivity is retained, i.e.,
D 𝜕²u/𝜕x² = 𝜕u/𝜕t (24.95)
then
u = (1/2)[1 − erf(x/√(4Dt))]. (24.96)
Consider now the semi-infinite domain:
𝜕²u/𝜕x² = 𝜕u/𝜕t, 0 < x < ∞, t > 0 (24.97)
u = 0 @ x = 0 (24.98)
u = f(x) @ t = 0 (24.99)
In the solution (equation (24.91)), suppose that we assume that f(ξ) is odd, i. e.,
f(−ξ) = −f(ξ) (24.100)
Then
u(x, t) = (1/√(4πt)) [∫_{−∞}^{0} e^{−(x−ξ)²/4t} f(ξ) dξ + ∫_{0}^{∞} e^{−(x−ξ)²/4t} f(ξ) dξ] (24.101)
In the first integral, substitute ξ = −s:
∫_{−∞}^{0} e^{−(x−ξ)²/4t} f(ξ) dξ = ∫_{0}^{∞} e^{−(x+s)²/4t} f(−s) ds = −∫_{0}^{∞} e^{−(x+ξ)²/4t} f(ξ) dξ
Thus,
u(x, t) = (1/√(4πt)) ∫_{0}^{∞} (e^{−(x−ξ)²/4t} − e^{−(x+ξ)²/4t}) f(ξ) dξ. (24.102)
Since the integral in equation (24.102) converges uniformly and absolutely, we can take the limit under the integral sign:
⇒
u(0, t) = 0
Example: f(ξ) = 1
⇒
u(x, t) = (1/√(4πt)) ∫_{0}^{∞} e^{−(x−ξ)²/4t} dξ − (1/√(4πt)) ∫_{0}^{∞} e^{−(x+ξ)²/4t} dξ
= (1/2)[1 + erf(x/√(4t))] − (1/2)[1 − erf(x/√(4t))]
= erf(x/√(4t)). (24.103)
This is the solution to the heat equation in a semi-infinite domain with initial temper-
ature of unity and boundary (x = 0) temperature of zero (for t > 0). Figure 24.3 shows
the spatial profiles of the solution at various times.
Figure 24.3: Spatial profiles of solution u(x, t) at various times for heat equation in semi-infinite
domain.
Nonhomogeneous problem
Consider the heat/diffusion equation in a semi-infinite domain
𝜕²u/𝜕x² = 𝜕u/𝜕t, 0 < x < ∞ (24.104)
with boundary and initial conditions
u(0, t) = 1, u(x, 0) = f(x). (24.105)
Define
w = u − 1 (24.106)
⇒
𝜕²w/𝜕x² = 𝜕w/𝜕t, (24.107)
w(0, t) = 0, w(x, 0) = f(x) − 1 = F(x) (24.108)
⇒
u = 1 + (1/√(4πt)) ∫_{0}^{∞} [e^{−(x−ξ)²/4t} − e^{−(x+ξ)²/4t}][f(ξ) − 1] dξ (24.109)
For the special case of f(x) = 0, the solution given in equation (24.109) reduces to
u(x, t)|_{f=0} = u₀(x, t) = 1 − erf(x/√(4t)) = erfc(x/√(4t)). (24.110)
This is the solution of
𝜕²u/𝜕x² = 𝜕u/𝜕t, u(0, t) = 1, u(x, 0) = 0 (24.111)
Consider the more general case of a nonhomogeneous problem given by
𝜕²u/𝜕x² = 𝜕u/𝜕t, 0 < x < ∞ (24.112)
u = g(t) @ x = 0, u = f(x) @ t = 0 (24.113)
We have seen how to solve this problem for g(t) = 1. Call this solution U(x, t), i. e.,
U(x, t) = 1 + (1/√(4πt)) ∫_{0}^{∞} [e^{−(x−ξ)²/4t} − e^{−(x+ξ)²/4t}][f(ξ) − 1] dξ (24.114)
and let
W(x, t) = (𝜕U/𝜕t)(x, t) (24.115)
Then, by Duhamel's principle,
u(x, t) = ∫_{0}^{t} W(x, t − τ) g(τ) dτ ⇒ u(x, t) = ∫_{0}^{t} (𝜕U/𝜕t)(x, t − τ) g(τ) dτ (24.116)
For f = 0, carrying out the differentiation and a change of variable, this may be written as
u(x, t) = (2/√π) ∫_{x/√(4t)}^{∞} g(t − x²/4λ²) e^{−λ²} dλ (24.117)
and, for general initial data f(x),
u(x, t) = (2/√π) ∫_{x/√(4t)}^{∞} g(t − x²/4λ²) e^{−λ²} dλ + (1/√(4πt)) ∫_{0}^{∞} [e^{−(x−ξ)²/4t} − e^{−(x+ξ)²/4t}] f(ξ) dξ. (24.118)
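For g ≡ 1, formula (24.117) must collapse back to the erfc solution (24.110); a quick Python sketch (the values of x and t are arbitrary choices):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erfc

x, t = 0.8, 0.3
g = lambda tau: 1.0                       # constant boundary signal g(t) = 1
lo = x/np.sqrt(4*t)
val, _ = quad(lambda lam: g(t - x**2/(4*lam**2))*np.exp(-lam**2), lo, np.inf)
u = (2/np.sqrt(np.pi))*val
print(u, erfc(lo))  # the two values agree
```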
Similarly, to treat a no-flux condition at x = 0, suppose that in the solution (24.91) f(ξ) is extended as an even function, i. e.,
f(−x) = f(x)
⇒
u(x, t) = (1/√(4πt)) [∫_{−∞}^{0} e^{−(x−ξ)²/4t} f(ξ) dξ + ∫_{0}^{∞} e^{−(x−ξ)²/4t} f(ξ) dξ]
= (1/√(4πt)) [∫_{0}^{∞} e^{−(x+s)²/4t} f(−s) ds + ∫_{0}^{∞} e^{−(x−ξ)²/4t} f(ξ) dξ]
= (1/√(4πt)) ∫_{0}^{∞} [e^{−(x+ξ)²/4t} + e^{−(x−ξ)²/4t}] f(ξ) dξ
⇒
𝜕u/𝜕x = (1/√(4πt)) ∫_{0}^{∞} [−(2(x + ξ)/4t) e^{−(x+ξ)²/4t} − (2(x − ξ)/4t) e^{−(x−ξ)²/4t}] f(ξ) dξ
⇒
𝜕u/𝜕x|_{x=0} = 0
∴ The solution of
𝜕²u/𝜕x² = 𝜕u/𝜕t, 0 < x < ∞
𝜕u/𝜕x(0, t) = 0 (24.119)
u(x, 0) = f(x)
is given by
u(x, t) = (1/√(4πt)) ∫_{0}^{∞} [e^{−(x+ξ)²/4t} + e^{−(x−ξ)²/4t}] f(ξ) dξ (24.120)
To obtain the solution of
𝜕u/𝜕t = 𝜕²u/𝜕x², 0 < x < ∞
u(x, 0) = f(x),
u(0, t) = 0,
by a limiting process, we first consider the same problem in a finite domain of length a (with u(a, t) = 0). The solution for this case is given by
u(x, t) = (2/a) Σ_{n=1}^{∞} e^{−n²π²t/a²} sin(nπx/a) ∫_{0}^{a} f(s) sin(nπs/a) ds
Let
αn = nπ/a ⇒ Δαn = π/a and a → ∞
⇒
u(x, t) = (2/π) ∫_{0}^{∞} e^{−α²t} sin αx (∫_{0}^{∞} f(s) sin αs ds) dα
= (2/π) ∫_{0}^{∞} ∫_{0}^{∞} e^{−α²t} f(s) [(e^{iαx} − e^{−iαx})/2i][(e^{iαs} − e^{−iαs})/2i] ds dα
= (1/π) ∫_{0}^{∞} ∫_{0}^{∞} e^{−α²t} [cos α(x − s) − cos α(x + s)] f(s) dα ds
⇒
u(x, t) = (1/√(4πt)) ∫_{0}^{∞} [e^{−(x−s)²/4t} − e^{−(x+s)²/4t}] f(s) ds.
Thus, the solution in a semi-infinite domain may be obtained from that of finite do-
main by taking the limiting process. Other problems of infinite and semi-infinite do-
mains may also be solved in a similar way.
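The limiting process can be illustrated numerically: for a ≫ √t the truncated sine series is indistinguishable from the image-kernel solution. A Python sketch with f(s) = e^{−s} (the choices of a, t, x and the truncation level are arbitrary):

```python
import numpy as np
from scipy.integrate import quad

a, t, x = 25.0, 0.5, 1.0

def sine_coeff(k):
    # integral of e^{-s} sin(ks) over (0, a), in closed form
    return (k - np.exp(-a)*(np.sin(k*a) + k*np.cos(k*a)))/(1 + k**2)

# finite-domain sine series, truncated
series = sum((2/a)*np.exp(-(n*np.pi/a)**2*t)*np.sin(n*np.pi*x/a)*sine_coeff(n*np.pi/a)
             for n in range(1, 400))

# semi-infinite image-kernel solution
kernel, _ = quad(lambda s: (np.exp(-(x - s)**2/(4*t))
                            - np.exp(-(x + s)**2/(4*t)))*np.exp(-s), 0, np.inf)
kernel /= np.sqrt(4*np.pi*t)
print(series, kernel)
```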
The sine-transform pair on (0, ∞),
Fs(α) = √(2/π) ∫_{0}^{∞} f(ξ) sin αξ dξ, f(x) = √(2/π) ∫_{0}^{∞} Fs(α) sin αx dα,
is the continuous analogue of the finite sine transform
F(n) = √(2/l) ∫_{0}^{l} f(ξ) sin(nπξ/l) dξ (24.124)
f(x) = √(2/l) Σ_{n=1}^{∞} F(n) sin(nπx/l) (24.125)
and is suited to the operator
−d²/dx², 0 < x < ∞ (24.126)
u(0) = 0. (24.127)
Similarly, the cosine-transform pair,
Fc(α) = √(2/π) ∫_{0}^{∞} f(ξ) cos αξ dξ, f(x) = √(2/π) ∫_{0}^{∞} Fc(α) cos αx dα,
is suited to the operator
−d²/dx², 0 < x < ∞ (24.130)
u′(0) = 0 (24.131)
Consider now the heat equation with a radiation (Robin) boundary condition:
𝜕²u/𝜕x² = 𝜕u/𝜕t, 0 < x < ∞, t > 0 (24.132)
𝜕u/𝜕x(0, t) − Bi u(0, t) = 0, (24.133)
and initial condition
u(x, 0) = f(x).
On a finite interval (0, l), the corresponding eigenvalue problem is
d²w/dx² = −λw (24.135)
w′ − Bi w = 0, x = 0 (24.136)
w′ + Bi w = 0, x = l (24.137)
and the eigenfunction expansion of f(x) is
f(x) = 2 Σ_{n=1}^{∞} [(αn cos αn x + Bi sin αn x)/((αn² + Bi²)l + 2 Bi)] ∫_{0}^{l} f(ξ)[αn cos αn ξ + Bi sin αn ξ] dξ (24.138)
where the αn are the roots of
cot αn l = αn/(2 Bi) − Bi/(2αn) (24.139)
with
Δαn = αn+1 − αn → π/l
Let l → ∞ and replace the Riemann sum by an integral
⇒
f(x) = (2/π) ∫_{0}^{∞} [(α cos αx + Bi sin αx)/(α² + Bi²)] (∫_{0}^{∞} (α cos αξ + Bi sin αξ) f(ξ) dξ) dα (24.140)
For some class of functions, this is an identity and is another type of Fourier transform.
We define the transform pair by
F(α) = √(2/π) ∫_{0}^{∞} f(ξ) [(α cos αξ + Bi sin αξ)/√(α² + Bi²)] dξ (24.141)
f(x) = √(2/π) ∫_{0}^{∞} F(α) [(α cos αx + Bi sin αx)/√(α² + Bi²)] dα (24.142)
This transform pair is useful for solving the heat equation with radiation BCs, i. e., for the operator
−d²/dx², 0 < x < ∞ (24.143)
u′ − Bi·u = 0 @ x = 0 (24.144)
Let
û(α, t) = ℱ(u(x, t)) = √(2/π) ∫_{0}^{∞} u(x, t) [(α cos αx + Bi sin αx)/√(α² + Bi²)] dx (24.145)
Then
ℱ{(𝜕²u/𝜕x²)(x, t)} = −α² û(α, t), û(α, 0) = ℱ(f(x)) = F(α)
⇒
û = e^{−α²t} · F(α) (24.146)
⇒
u(x, t) = √(2/π) ∫_{0}^{∞} e^{−α²t} [(α cos αx + Bi sin αx)/√(α² + Bi²)] (√(2/π) ∫_{0}^{∞} [(α cos αξ + Bi sin αξ)/√(α² + Bi²)] f(ξ) dξ) dα
= (2/π) ∫_{0}^{∞} ∫_{0}^{∞} e^{−α²t} f(ξ)[α cos αx + Bi sin αx][α cos αξ + Bi sin αξ]/(α² + Bi²) dξ dα (24.147)
Let cos θ = α/√(α² + Bi²) ⇒ sin θ = Bi/√(α² + Bi²). Then the second integral may be written as
I = ∫_{0}^{∞} e^{−α²t} cos(αx − θ) cos(αξ − θ) dα (24.149)
= (1/2) ∫_{0}^{∞} e^{−α²t} [cos(α(x + ξ) − 2θ) + cos α(x − ξ)] dα (24.150)
= (1/2) ∫_{0}^{∞} e^{−α²t} cos(αx + αξ − 2θ) dα + (1/2) ∫_{0}^{∞} e^{−α²t} cos α(x − ξ) dα. (24.151)
Thus, it may be shown after algebraic simplifications and evaluation of the integrals that
u(x, t) = (1/√(4πt)) ∫_{0}^{∞} [e^{−(x+ξ)²/4t} + e^{−(x−ξ)²/4t}] f(ξ) dξ
− Bi ∫_{0}^{∞} e^{Bi²t+Bi(x+ξ)} erfc[(x + ξ)/√(4t) + Bi √t] f(ξ) dξ (24.152)
Remarks.
(1) The solution of the corresponding nonhomogeneous problem, in which the radiation condition (24.133) is replaced by 𝜕u/𝜕x(0, t) − Bi[u(0, t) − ϕ(t)] = 0, is given by
u = (1/√(4πt)) ∫_{0}^{∞} [e^{−(x+ξ)²/4t} + e^{−(x−ξ)²/4t}] f(ξ) dξ
− Bi ∫_{0}^{∞} e^{Bi²t+Bi(x+ξ)} erfc[(x + ξ)/√(4t) + Bi √t] f(ξ) dξ
+ Bi ∫_{0}^{t} {e^{−x²/4(t−τ)}/√(π(t − τ)) − Bi e^{Bi²(t−τ)+Bi x} erfc[x/√(4(t − τ)) + Bi √(t − τ)]} ϕ(τ) dτ
(24.156)
(2) Many problems in the infinite and semi-infinite regions can also be solved by using
the Laplace transformation. We refer to the books (Carslaw and Jaeger [10]; Crank
[16]) for further examples.
−d²/dx², −∞ < x < ∞, (24.157)
which has continuous spectrum {α² > 0} with eigenfunctions {e^{±iαx}}. Let
F(α) = ℱ{f(x)} = ∫_{−∞}^{∞} f(x) e^{−iαx} dx (24.158)
Inversion formula
f(x) = (1/2π) ∫_{−∞}^{∞} F(α) e^{iαx} dα (24.159)
Wave equation
𝜕²u/𝜕t² = c² 𝜕²u/𝜕x², −∞ < x < ∞, t > 0 (24.160)
IC:
u(x, 0) = f(x), (𝜕u/𝜕t)(x, 0) = g(x), −∞ < x < ∞ (24.161)
Let
û(α, t) = ℱ[u(x, t)] = ∫_{−∞}^{∞} u(x, t) e^{−iαx} dx (24.162)
Now
ℱ{𝜕ⁿu/𝜕xⁿ} = (iα)ⁿ · ℱ{u(x, t)}
Let F(α) = ℱ{f(x)} and G(α) = ℱ{g(x)}; then
d²û/dt² = −c²α²û
û(0) = F(α), (dû/dt)(0) = G(α)
⇒
û(α, t) = F(α) cos[αct] + (G(α)/αc) sin[αct] (24.164)
∴
u(x, t) = (1/2π) ∫_{−∞}^{∞} û(α, t) e^{+iαx} dα
= (1/2π) ∫_{−∞}^{∞} F(α) cos(αct) e^{+iαx} dα + (1/2π) ∫_{−∞}^{∞} (G(α)/αc) sin(αct) e^{+iαx} dα
≡ u₁(x, t) + u₂(x, t) (24.165)
Since cos(αct) e^{iαx} = (e^{iα(x+ct)} + e^{iα(x−ct)})/2,
u₁(x, t) = (1/2)[f(x + ct) + f(x − ct)] (24.167)
For the second term,
u₂(x, t) = (1/2πc) ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{iαx−iαξ} g(ξ) ((e^{iαct} − e^{−iαct})/2iα) dξ dα
If h(x) = ∫_{0}^{x} f(x) dx, then ℱ{h(x)} = (1/iα) ℱ{f(x)} ⇒ ℱ^{−1}{F(α)/iα} = ∫_{0}^{x} f(x) dx,
∴
u₂(x, t) = −(1/2c) ∫_{0}^{x−ct} g(λ) dλ + (1/2c) ∫_{0}^{x+ct} g(λ) dλ
= (1/2c) ∫_{x−ct}^{x+ct} g(λ) dλ (24.168)
∴
u(x, t) = (f(x + ct) + f(x − ct))/2 + (1/2c) ∫_{x−ct}^{x+ct} g(λ) dλ (24.169)
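Formula (24.169) is easy to sanity-check numerically: the initial data are recovered and the PDE is satisfied to discretization accuracy. A Python sketch with arbitrary smooth f and g (and c = 2, also an arbitrary choice):

```python
import numpy as np
from scipy.integrate import quad

cw = 2.0
f = lambda s: np.exp(-s**2)
g = lambda s: s*np.exp(-s**2)

def u(xv, tv):
    # d'Alembert formula (24.169)
    trav = 0.5*(f(xv + cw*tv) + f(xv - cw*tv))
    integ, _ = quad(g, xv - cw*tv, xv + cw*tv)
    return trav + integ/(2*cw)

x0, t0, h = 0.4, 0.7, 1e-2
ut  = (u(x0, h) - u(x0, -h))/(2*h)                        # u_t(x0, 0), should be g(x0)
utt = (u(x0, t0 + h) - 2*u(x0, t0) + u(x0, t0 - h))/h**2
uxx = (u(x0 + h, t0) - 2*u(x0, t0) + u(x0 - h, t0))/h**2
print(u(x0, 0.0) - f(x0), ut - g(x0), utt - cw**2*uxx)    # all small
```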
Consider next semi-infinite domains.
(i)
𝜕²u/𝜕t² = c² 𝜕²u/𝜕x², 0 < x < ∞, t > 0 (24.170)
BC:
u(0, t) = g(t) (24.171)
IC:
u(x, 0) = 0, (𝜕u/𝜕t)(x, 0) = 0 (24.172)
(ii)
𝜕²u/𝜕t² = 𝜕²u/𝜕x², 0 < x < ∞, t > 0 (24.175)
BC:
ux(0, t) = 0 (24.176)
IC:
u(x, 0) = f(x), (𝜕u/𝜕t)(x, 0) = 0 (24.177)
Using the cosine transform, one gets
u(x, t) = { [f(x + t) + f(t − x)]/2, 0 < x < t; [f(x + t) + f(x − t)]/2, x > t (24.178)
The solution of the nonhomogeneous wave equation
𝜕²u/𝜕t² = c² 𝜕²u/𝜕x² + q(x, t), −∞ < x < ∞, t > 0 (24.179)
with IC
u(x, 0) = 0, (𝜕u/𝜕t)(x, 0) = 0 (24.180)
may be written in the form
u(x, t) = (1/2c) ∫_{0}^{t} ∫_{x−c(t−τ)}^{x+c(t−τ)} q(ξ, τ) dξ dτ (24.181)
In this section, we consider the solution of Laplace’s equation using the Fourier trans-
form method.
Consider first Laplace's equation in an infinite strip:
𝜕²u/𝜕x² + 𝜕²u/𝜕y² = 0; −∞ < x < ∞, 0 < y < a (24.182)
u(x, 0) = f(x), u(x, a) = 0
Let û(α, y) = ℱ{u(x, y)} (transform in x); then
d²û/dy² − α²û = 0
û(a) = 0, û(0) = F(α)
Writing the general solution as û = c₁ sinh αy + c₂ sinh α(a − y), û(a) = 0 ⇒ c₁ = 0 and û(0) = F(α) ⇒ c₂ = F(α)/sinh αa. Therefore,
û = F(α) sinh α(a − y)/sinh αa, u(x, y) = (1/2π) ∫_{−∞}^{∞} F(α) [sinh α(a − y)/sinh αa] e^{iαx} dα.
Next, consider the half-plane problem
𝜕²u/𝜕x² + 𝜕²u/𝜕y² = 0, −∞ < x < ∞, y > 0 (24.189)
u(x, 0) = f(x) (24.190)
With u bounded as y → ∞, take the sine transform in y, ûs(x, α) = √(2/π) ∫_{0}^{∞} u(x, y) sin αy dy. Then
ℱs{𝜕²u/𝜕y²} = −α² ûs + √(2/π) α u(x, 0) (24.192)
so that
d²ûs/dx² − α²ûs = −√(2/π) α f(x), −∞ < x < ∞ (24.193)
Solving this equation by the Fourier transform in x,
ûs = √(2/π) (1/2π) ∫_{−∞}^{∞} [α e^{iβx}/(α² + β²)] (∫_{−∞}^{∞} f(ξ) e^{−iβξ} dξ) dβ (24.194)
and
u(x, y) = √(2/π) ∫_{0}^{∞} ûs(x, α) sin αy dα
= (2/π)(1/2π) ∫_{0}^{∞} [∫_{−∞}^{∞} (∫_{−∞}^{∞} [α e^{iβ(x−ξ)}/(α² + β²)] dβ) f(ξ) dξ] sin αy dα
but, closing the contour in the appropriate half-plane,
∫_{−∞}^{∞} [α e^{iβ(x−ξ)}/(α² + β²)] dβ = 2πi [α e^{−α|x−ξ|}/(2αi)] = π e^{−α|x−ξ|}
⇒
u(x, y) = (1/π) ∫_{0}^{∞} ∫_{−∞}^{∞} f(ξ) e^{−α|x−ξ|} sin αy dξ dα
= (y/π) ∫_{−∞}^{∞} f(ξ)/[(x − ξ)² + y²] dξ (24.195)
(using ∫_{0}^{∞} e^{−αs} sin αy dα = y/(s² + y²)). For the boundary data
f(x) = { 1, |x| ≤ 1; 0, |x| > 1,
this gives
u(x, y) = (1/π) {tan^{−1}[(1 − x)/y] + tan^{−1}[(1 + x)/y]}
Figure 24.4: Solution profile for 2D Laplace equation in a half-plane with u(x, y = 0) = {1, |x| ≤ 1; 0, |x| > 1}.
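As a check, the classical Poisson integral for the half-plane, u(x, y) = (y/π) ∫ f(ξ)/((x − ξ)² + y²) dξ, can be compared with the closed arctangent form for the indicator boundary data. A minimal Python sketch (the sample points are arbitrary choices):

```python
import numpy as np
from scipy.integrate import quad

def u_closed(x, y):
    # closed form for boundary data f = 1 on |xi| <= 1, 0 otherwise
    return (np.arctan((1 - x)/y) + np.arctan((1 + x)/y))/np.pi

def u_poisson(x, y):
    # Poisson integral for the half-plane with the same boundary data
    val, _ = quad(lambda xi: 1/((x - xi)**2 + y**2), -1, 1)
    return (y/np.pi)*val

print(u_closed(0.3, 0.5), u_poisson(0.3, 0.5))
print(u_closed(0.0, 1e-4))  # approaches the boundary value f(0) = 1 as y -> 0+
```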
Finally, consider Laplace's equation in a semi-infinite strip:
𝜕²u/𝜕x² + 𝜕²u/𝜕y² = 0, 0 < x < ∞, 0 < y < 1 (24.196)
u = 0 @ x = 0; u = f(x) @ y = 0, u = 0 @ y = 1
Taking the sine transform in x, let
û = √(2/π) ∫_{0}^{∞} u(ξ, y) sin αξ dξ (24.199)
f̂ = √(2/π) ∫_{0}^{∞} f(ξ) sin αξ dξ. (24.200)
Then
d²û/dy² − α²û = 0
û = f̂ @ y = 0, û = 0 @ y = 1
Writing the general solution as û = c₁ sinh αy + c₂ sinh α(1 − y), the BCs ⇒
c₁ = 0, c₂ = f̂/sinh α
∴
û = f̂ sinh α(1 − y)/sinh α (24.201)
u(x, y) = (2/π) ∫_{0}^{∞} sin αx [sinh α(1 − y)/sinh α] (∫_{0}^{∞} f(ξ) sin αξ dξ) dα (24.202)
Remark. If the BC at x = 0 is 𝜕u
𝜕x
= 0 then we can use the cosine transform.
Consider the heat equation in three dimensions:
𝜕u/𝜕t = 𝜕²u/𝜕x² + 𝜕²u/𝜕y² + 𝜕²u/𝜕z², −∞ < x, y, z < ∞, t > 0 (24.203)
u(x, y, z, 0) = f(x, y, z)
Eigenvalues: λ = α₁² + α₂² + α₃², −∞ < αᵢ < ∞;
eigenfunctions: e^{±i(α₁x+α₂y+α₃z)}.
Define
ℱ{f(x, y, z)} = ∫∫∫ e^{−i(α₁x+α₂y+α₃z)} f(x, y, z) dx dy dz = F(α₁, α₂, α₃) (24.207)
Then we get
dû/dt = −(α₁² + α₂² + α₃²)û
û(0) = F(α₁, α₂, α₃)
⇒
û = F(α₁, α₂, α₃) e^{−(α₁²+α₂²+α₃²)t}
= ∫∫∫ f(ξ, η, λ) e^{−(α₁²+α₂²+α₃²)t−i(α₁ξ+α₂η+α₃λ)} dξ dη dλ (24.209)
⇒
u(x, y, z, t) = (1/(2π)³) ∫∫∫ û e^{i(α₁x+α₂y+α₃z)} dα₁ dα₂ dα₃
This can be simplified in the same way as the one-dimensional problem. The final result is
u(x, y, z, t) = (1/(√(4πt))³) ∫∫∫ e^{−((x−ξ)²+(y−η)²+(z−ζ)²)/4t} f(ξ, η, ζ) dξ dη dζ (24.210)
where
G = (1/[4π(t − τ)]^{3/2}) exp{−[(x − ξ)² + (y − η)² + (z − ζ)²]/4(t − τ)} (24.212)
is the temperature at position (x, y, z) and time t due to a unit point source at (ξ , η, ζ )
at time τ(t > τ).
G satisfies the heat equation
𝜕G/𝜕t − 𝜕²G/𝜕x² − 𝜕²G/𝜕y² − 𝜕²G/𝜕z² = δ(x − ξ)δ(y − η)δ(z − ζ)δ(t − τ) (24.213)
as well as the adjoint equation
−𝜕G/𝜕τ − 𝜕²G/𝜕ξ² − 𝜕²G/𝜕η² − 𝜕²G/𝜕ζ² = δ(x − ξ)δ(y − η)δ(z − ζ)δ(t − τ) (24.214)
In analogous fashion, one can define double Fourier transforms, double sine trans-
forms, double cosine transforms, triple cosine transforms, and so on.
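Since the 3D kernel in (24.210) factorizes into a product of three 1D heat kernels, its unit normalization (a unit source releases unit heat) is easy to verify numerically. A short Python sketch (t = 0.3 is an arbitrary choice):

```python
import numpy as np

t = 0.3
s = np.linspace(-12.0, 12.0, 4001)
ds = s[1] - s[0]
g1 = np.exp(-s**2/(4*t))/np.sqrt(4*np.pi*t)  # one 1D factor of the kernel
total = (np.sum(g1)*ds)**3                   # triple integral factorizes
print(total)  # ≈ 1
```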
24.4 Relationship between Fourier and Laplace transforms
Consider a function ϕ(x) defined on (0, ∞), choose γ large enough that f(x) = e^{−γx} ϕ(x) (with f = 0 for x < 0) satisfies the Fourier integral identity
f(x) = (1/2π) ∫_{−∞}^{∞} e^{iαx} (∫_{0}^{∞} e^{−iαξ} f(ξ) dξ) dα, (24.215)
and assume that f(x) is absolutely integrable. Then f(x) is bounded, i. e., ∃ a constant M such that
|f(x)| < M. (24.216)
Equation (24.215) ⇒
e^{−γx} ϕ(x) = (1/2π) ∫_{−∞}^{∞} e^{iαx} (∫_{0}^{∞} e^{−γξ} · e^{−iαξ} ϕ(ξ) dξ) dα
⇒
ϕ(x) = (1/2π) ∫_{−∞}^{∞} e^{(γ+iα)x} (∫_{0}^{∞} e^{−(γ+iα)ξ} ϕ(ξ) dξ) dα (24.219)
Let s = γ + iα ⇒ dα = (1/i) ds ⇒
ϕ(x) = (1/2πi) ∫_{γ−i∞}^{γ+i∞} e^{sx} (∫_{0}^{∞} e^{−sξ} ϕ(ξ) dξ) ds (24.220)
Defining
Φ(s) = ∫_{0}^{∞} e^{−sξ} ϕ(ξ) dξ, (24.221)
ϕ(x) = (1/2πi) ∫_{γ−i∞}^{γ+i∞} e^{sx} Φ(s) ds. (24.222)
We showed that Φ(s) is analytic in the right half-plane Re s > γ. Thus, equations (24.221) and (24.222) define the Laplace transform and the inversion formula.
Consider again the Fourier transform pair
F(α) = ∫_{−∞}^{∞} f(ξ) e^{−iαξ} dξ, f(x) = (1/2π) ∫_{−∞}^{∞} F(α) e^{iαx} dα
If f(x) = 0 for x < 0, this becomes
F(α) = ∫_{0}^{∞} f(ξ) e^{−iαξ} dξ, f(x) = (1/2π) ∫_{−∞}^{∞} F(α) e^{iαx} dα
or
ℱ{f(x)} = ∫_{0}^{∞} e^{−iαx} f(x) dx
Thus, if in the Laplace transform we replace s by iα, we get the Fourier transform, provided both exist.
Example 24.3.
f(x) = { e^{−x}, x > 0; 0, x < 0
ℓ{f(x)} = 1/(1 + s)
F(α) = ℱ{f(x)} = 1/(1 + iα) = (1 − iα)/(1 + α²)
⇒ |F(α)|² = 1/(1 + α²).
We note that
∫_{−∞}^{∞} |f(x)|² dx = 1/2 = (1/2π) ∫_{−∞}^{∞} |F(α)|² dα,
in agreement with Plancherel's theorem.
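The substitution s → iα can be checked directly for this example. A small Python sketch (α = 0.7 is an arbitrary choice):

```python
import numpy as np
from scipy.integrate import quad

alpha = 0.7
# Fourier transform of f(x) = e^{-x} (x > 0): real and imaginary parts by quadrature
re, _ = quad(lambda x: np.exp(-x)*np.cos(alpha*x), 0, np.inf)
im, _ = quad(lambda x: np.exp(-x)*np.sin(alpha*x), 0, np.inf)
Fa = re - 1j*im
print(Fa, 1/(1 + 1j*alpha))  # equals the Laplace transform 1/(1+s) at s = i*alpha
```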
Problems
1. The evolution of small amplitude waves on a vertically falling film is described by
the linear partial differential equation,
𝜕h/𝜕t + 3 𝜕h/𝜕x + 3 𝜕³h/𝜕x³ − (5/32) Re 𝜕²h/𝜕x𝜕t − (27/160) Re 𝜕²h/𝜕x² + (Re We/12) 𝜕⁴h/𝜕x⁴ = 0, −∞ < x < ∞,
where h is the film height, Re is the Reynolds number and We is the Weber number.
(a) Use the separation of variables method, i. e., seek solutions of the form h_α(x, t) = e^{iαx} e^{λt}, to determine the condition on the eigenvalue λ so that h_α(x, t) is a solution. (b) Use the result in (a) to determine the range of unstable wave numbers (for which the real part of the eigenvalue is positive). (c) Plot the neutral stability curve (that demarcates between the stable and unstable wave numbers) in the (α, We) plane.
2. The stability of the conduction state in a porous layer is determined by the homo-
geneous partial differential equations
𝜕²ψ/𝜕x² + 𝜕²ψ/𝜕z² − Ra 𝜕θ/𝜕x = 0
𝜕²θ/𝜕x² + 𝜕²θ/𝜕z² + 𝜕ψ/𝜕x = 0, −∞ < x < ∞, 0 < z < 1
(b) What is the relationship between the Fourier transforms of f (x) and f (x − ξ )?
(c) Use the above results to solve the boundary value problem
−d²u/dx² + a²u = δ(x − ξ); −∞ < x < ∞
(d) Use the result in (c) to solve the boundary value problem
−d²u/dx² + a²u = h(x); −∞ < x < ∞
𝜕c/𝜕t + p 𝜕c/𝜕z + λp 𝜕²c/𝜕z𝜕t − 𝜕²c/𝜕z² = 0, −∞ < z < ∞, t > 0; c(z, 0) = δ(z)
Here, p and λ are positive constants and δ(z) is the Dirac’s delta function. (a) Use
the Fourier transform (or any other method) to solve the equation and determine
c(z, t) for λ = 0 (b) If the k-th spatial moment (k ≥ 0) of c(z, t) is defined as
m_k(t) = ∫_{−∞}^{∞} z^k c(z, t) dz
determine the first three spatial moments (k = 0, 1, 2) for any λ, and hence the
variance without solving for c(z, t).
D 𝜕²u/𝜕x² = 𝜕u/𝜕t; 0 < x < ∞, t > 0
u = f(t); @ x = 0, t > 0
u = g(x); @ t = 0, x > 0
Apply your result to the special case g = 0 and f = A cos ωt to determine how a
periodic signal is attenuated.
6. (a) Consider the problem of a very long empty tubular reactor
D 𝜕²c/𝜕x² − u 𝜕c/𝜕x − kc = 𝜕c/𝜕t, 0 < x < ∞, t > 0
−D 𝜕c/𝜕x = u(c0 − c), x = 0, t > 0
c = f(x), t = 0, x > 0
Put it into self-adjoint form. Consider what transform on the semi-infinite interval might solve it.
(b) Consider the above problem with c0 = c0(t). Cast the above equations into dimensionless form but leave c(x, t) dimensional. Make successive substitutions of the following form to put the equation in its simplest form:
c = exp(Pe x/2) v; v = exp{−(Pe²/4 + k)τ} w; ϕ = w − (2/Pe) 𝜕w/𝜕x
What is now the form for the problem and what is the solution? Having found ϕ,
how does one find w?
7. A very long slab with two insulated opposite faces has arbitrary temperatures im-
posed on the other two faces so that the system is described by
𝜕²u/𝜕x² + 𝜕²u/𝜕y² = 0, −∞ < x < ∞, 0 < y < L
u(x, 0) = f(x), −∞ < x < ∞
u(x, L) = g(x), −∞ < x < ∞
Find a formal solution and show that it is identical to Poisson's integral formula in the limit L → ∞.
8. Use multiple Fourier transform to solve the problem of heat flow with heat production:
𝜕u/𝜕t − div grad u = q(x, y, z, t), t > 0, −∞ < x, y, z < ∞
u(x, y, z, 0) = 0, u(x, y, z, t) bounded
25 Fourier transforms in cylindrical and spherical geometries
Recall that for a regular differential operator, the leading coefficient did not vanish in-
side or at the end points of the interval. We now consider problems in which this con-
dition is violated. These problems arise mostly in cylindrical and spherical domains.
2. Poisson’s equation
3. Heat equation
𝜕u/𝜕t = ∇²u in Ω, t > 0; (25.3)
u = u₀ @ t = 0, in Ω (25.4)
αu + β n·∇u = γ on 𝜕Ω, t > 0 (25.5)
4. Wave equation
𝜕²u/𝜕t² = c²∇²u in Ω, t > 0; (25.6)
u = 0 on 𝜕Ω, t > 0 (25.7)
u = g₁ and 𝜕u/𝜕t = g₂ in Ω @ t = 0 (25.8)
5. Helmholtz/diffusion–reaction equation
https://doi.org/10.1515/9783110739701-026
25.1 BVP and IBVP in cylindrical and spherical geometries
Problems posed in hollow cylinders or spheres (annular domains) lead to regular BVPs and can be treated by the standard FFT method. Thus, the treatment below is mostly confined to a solid cylinder and sphere.
The domain of interest may include either the inside or outside of a cylindrical domain
or the annular region or their combinations as shown schematically in Figure 25.1.
∇²u = (1/r) 𝜕/𝜕r(r 𝜕u/𝜕r) + (1/r²) 𝜕²u/𝜕θ² + 𝜕²u/𝜕z², (25.10)
Ω ≡ { 0 < θ < 2π, 0 < r < a, 0 < z < L }, (25.11)
𝜕Ω ≡ { r = a, 0 < z < L; z = 0, L, 0 < r < a }. (25.12)
and so forth. Similarly, the Laplacian operator in 1D (in r only) can be simplified to
∇²u = (1/r) d/dr(r du/dr) (25.14)
or in 2D as
(r, θ): ∇²u = (1/r) 𝜕/𝜕r(r 𝜕u/𝜕r) + (1/r²) 𝜕²u/𝜕θ² (25.15)
(r, z): ∇²u = (1/r) 𝜕/𝜕r(r 𝜕u/𝜕r) + 𝜕²u/𝜕z² (25.16)
and so forth.
Similar to the cylindrical case, the BVPs and initial BVPs in spherical geometries usu-
ally involve solid or hollow spheres. The domain of interest may include either the
inside or outside of the spherical domain or the annular region or their combinations.
Figure 25.3 shows the spherical coordinate system, where the relationship between the Cartesian and spherical coordinate systems is x = r sin θ cos ϕ, y = r sin θ sin ϕ, z = r cos θ. In these coordinates, the Laplacian is
∇²u = (1/r²) ∂/∂r(r² ∂u/∂r) + (1/(r² sin θ)) ∂/∂θ(sin θ ∂u/∂θ) + (1/(r² sin²θ)) ∂²u/∂ϕ². (25.17)
Ω ≡ { 0 < r < a, 0 < θ < π, 0 < ϕ < 2π } (25.18)
and so forth.
[Note: The polar angle θ is also referred to as latitude, where 0 < θ < π, with θ = π/2 denoting the equator, θ = 0 the north pole and θ = π the south pole. Similarly, the azimuthal angle ϕ is also referred to as longitude, where 0 < ϕ < 2π.]
Similarly, the Laplacian operator in 1D (in r only) can be simplified to
∇²u = (1/r²) ∂/∂r(r² ∂u/∂r) (25.21)
or in 2D as
(r, θ): ∇²u = (1/r²) ∂/∂r(r² ∂u/∂r) + (1/(r² sin θ)) ∂/∂θ(sin θ ∂u/∂θ) (25.22)
and so forth.
∇²ψ = (1/r) ∂/∂r(r ∂ψ/∂r) + (1/r²) ∂²ψ/∂θ² + ∂²ψ/∂z² = −λψ,
0 < r < a, 0 < θ < 2π, 0 < z < L (25.23)
ψ = 0 on r = a and z = 0, L; ψ = finite (or ∂ψ/∂r = 0) @ r = 0. (25.24)
ψ = R(r)Θ(θ)Z(z) (25.25)
(1/R)(1/r) d/dr(r dR/dr) + (1/Θ)(1/r²) d²Θ/dθ² + λ = −(1/Z) d²Z/dz², (25.26)
R = 0 on r = a; R = finite (or dR/dr = 0) @ r = 0 (25.27)
Z = 0 at z = 0, L; (25.28)
Note that the LHS in equation (25.26) is a function of (r, θ) while the RHS is a function of z only. Thus, we get
d²Z/dz² = −λ_z Z; Z = 0 @ z = 0, L (25.29)
⇒ the z-eigenvalues and eigenfunctions are given by
λ_z = n²π²/L² = λn and Zn = √2 sin[nπz/L], n = 1, 2, 3, . . . (25.30)
⟨Zi, Zj⟩ = (1/L) ∫₀^L Zi Zj dz = δij (25.31)
Thus, equation (25.26) can be rewritten by using equations (25.29) and (25.30) as fol-
lows:
(r/R) d/dr(r dR/dr) + r²(λ − λn) = −(1/Θ) d²Θ/dθ² (25.32)
d²Θ/dθ² = −λθ Θ; 0 < θ < 2π (25.33)
Θ and Θ′ periodic in θ (25.34)
⇒ λθ = m², m = 0, 1, 2, . . . , with eigenfunctions Θ₀ = 1, Θᶜm = √2 cos mθ, Θˢm = √2 sin mθ, (25.35)
⟨Θi, Θj⟩ = (1/2π) ∫₀^{2π} Θi Θj dθ = δij (25.36)
Thus, equation (25.32) can be rewritten by using equations (25.33) and (25.35) as fol-
lows:
(1/r) d/dr(r dR/dr) − (m²/r²)R = −(λ − λn)R (25.37)
R = 0 on r = a; R = finite (or dR/dr = 0) @ r = 0 (25.38)
Rmnk = Jm(√(λmnk − λn) r) / Jm+1(√(λmnk − λn) a) (25.39)
Jm(√(λmnk − λn) a) = 0; λmnk > λn; (25.40)
m = 0, 1, 2, . . . ; n = 1, 2, 3, . . . ; and k = 1, 2, 3, . . .
Thus, the 3D Laplacian operator with Dirichlet condition in cylindrical coordinate sys-
tem has the eigenvalues and eigenfunctions
λmnk = μ²mk/a² + λn,
ψ = ψmnk(r, θ, z) = Rmnk(r)Θm(θ)Zn(z), (25.42)
m = 0, 1, 2, . . . ; n = 1, 2, 3, . . . ; and k = 1, 2, 3, . . .
where Rmnk, Θm and Zn are given in equations (25.39), (25.35) and (25.30), and μmk is the kth zero of Jm. Similarly, eigenvalues and eigenfunctions with other BCs (compatible with separation of variables) can also be obtained by the same procedure.
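The orthonormality of the radial eigenfunctions under the 2r-weighted cylindrical inner product can be spot-checked numerically. The following Python/SciPy sketch is our own check (not from the text), taking a = 1 so that Rmk(r) = Jm(μmk r)/Jm+1(μmk):

```python
import numpy as np
from scipy.special import jn_zeros, jv

# Radial eigenfunctions (a = 1): R_mk(r) = J_m(mu_mk r)/J_{m+1}(mu_mk), J_m(mu_mk) = 0.
# Check orthonormality under the inner product <f, g> = int_0^1 2 r f(r) g(r) dr.
m = 2
mu = jn_zeros(m, 3)                    # first three positive zeros of J_m
r = np.linspace(0.0, 1.0, 200001)
M = np.empty((3, 3))
for i in range(3):
    for j in range(3):
        y = 2.0 * r * (jv(m, mu[i] * r) / jv(m + 1, mu[i])) * (jv(m, mu[j] * r) / jv(m + 1, mu[j]))
        # composite trapezoidal rule
        M[i, j] = float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))
print(np.round(M, 4))   # should be close to the 3x3 identity matrix
```

The Gram matrix comes out as the identity to quadrature accuracy, confirming the normalization by Jm+1(μmk).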
∇²ψ = (1/r²) ∂/∂r(r² ∂ψ/∂r) + (1/(r² sin θ)) ∂/∂θ(sin θ ∂ψ/∂θ) + (1/(r² sin²θ)) ∂²ψ/∂ϕ² = −λψ, (25.43)
0 < r < a, 0 < θ < π, 0 < ϕ < 2π
ψ = 0 on r = a; ψ = finite (or ∂ψ/∂r = 0) @ r = 0. (25.44)
ψ = R(r)Θ(θ)Φ(ϕ) (25.45)
and substituting equation (25.45) into equations (25.43)–(25.44) and dividing by ψ/r², we get
(1/R) d/dr(r² dR/dr) + r²λ = −(1/(Θ sin θ)) d/dθ(sin θ dΘ/dθ) − (1/(Φ sin²θ)) d²Φ/dϕ², (25.46)
R = 0 on r = a; R = finite (or dR/dr = 0) @ r = 0 (25.47)
Since the LHS in equation (25.46) is a function of r while the RHS is a function of (θ, ϕ) only, both sides must equal a constant, and thus we can write
(1/(Θ sin θ)) d/dθ(sin θ dΘ/dθ) + (1/(Φ sin²θ)) d²Φ/dϕ² = −λθϕ.
(sin θ/Θ) d/dθ(sin θ dΘ/dθ) + λθϕ sin²θ = −(1/Φ) d²Φ/dϕ² = λϕ (25.48)
0 < θ < π, 0 < ϕ < 2π
⇒ λϕ = m², m = 0, 1, 2, . . . , with eigenfunctions Φ₀ = 1, Φᶜm = √2 cos mϕ, Φˢm = √2 sin mϕ, (25.49)
⟨Φi, Φj⟩ = (1/2π) ∫₀^{2π} Φi Φj dϕ = δij. (25.50)
(1/sin θ) d/dθ(sin θ dΘ/dθ) − (m²/sin²θ)Θ = −λθϕ Θ, 0 < θ < π (25.51)
z = cos θ (25.52)
(1 − z²) d²Θ/dz² − 2z dΘ/dz + (λθϕ − m²/(1 − z²))Θ = 0, −1 < z < 1 (25.53)
with Θ being finite in the domain (−1 ≤ z ≤ 1). Equation (25.53) is referred to as “Asso-
ciated Legendre equation” [see Chapter 16, Sections 16.4.5 and 16.4.6 for a discussion
of this equation]. It can be shown that the eigenvalues are
λθϕ = n(n + 1), n = m, m + 1, m + 2, . . . (25.54)
with normalized eigenfunctions
Θnm(z) = √[(n − m)!(2n + 1)/(n + m)!] Pⁿm(z), (25.55)
where
Pⁿm(z) = [(−1)^{n+m}/(2ⁿ n!)] (1 − z²)^{m/2} d^{n+m}/dz^{n+m} [(1 − z²)ⁿ]. (25.56)
Note that the constant multiplier in equations (25.54)–(25.56) makes the eigenfunctions Θnm(z) orthonormal with respect to the inner product:
⟨Θnm, Θkm⟩ = (1/2) ∫₋₁¹ Θnm Θkm dz = δnk. (25.57)
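This normalization can be spot-checked with SciPy's associated Legendre routine `lpmv` (a Python check of our own; note that `lpmv` carries the Condon–Shortley phase, which does not affect the norms):

```python
import numpy as np
from math import factorial
from scipy.special import lpmv

def Theta(n, m, z):
    # normalized associated Legendre eigenfunctions, eq. (25.55)
    c = np.sqrt(factorial(n - m) * (2 * n + 1) / factorial(n + m))
    return c * lpmv(m, n, z)

def inner(n, k, m, npts=200001):
    # inner product of eq. (25.57): (1/2) int_{-1}^{1} Theta_nm Theta_km dz
    z = np.linspace(-1.0, 1.0, npts)
    y = 0.5 * Theta(n, m, z) * Theta(k, m, z)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(z)))  # trapezoidal rule

for n, k, m in [(2, 2, 1), (3, 2, 1), (4, 4, 2)]:
    print(n, k, m, round(inner(n, k, m), 4))
```

Equal indices give 1 and distinct indices give 0 to quadrature accuracy, confirming equation (25.57).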
(1/R) d/dr(r² dR/dr) + r²λ = λθϕ = n(n + 1), n = m, m + 1, . . . (25.58)
R = 0 on r = a; R = finite (or dR/dr = 0) @ r = 0 (25.59)
ξ = r√λ (25.60)
ξ² d²R/dξ² + 2ξ dR/dξ + [ξ² − n(n + 1)]R = 0, n = m, m + 1, . . . (25.61)
which is the spherical Bessel equation (see the discussion in Chapter 16, Section 16.4.4).
Thus, the eigenfunctions can be expressed in terms of spherical Bessel functions of the first kind, jn(ξ) = jn(r√λnk) (due to the BC that R is finite at r = 0, the spherical Bessel function of the second kind is omitted). Thus, using the boundary condition (equation (25.59)), the eigenvalues and normalized eigenfunctions are given by
eigenvalues: jn(a√λnk) = 0, k = 1, 2, 3, . . . (25.62)
eigenfunctions: Rnk(r) = √(2/3) jn(r√λnk)/jn+1(a√λnk), (25.63)
which are orthonormal with respect to the standard spherical inner product:
25.2 FFT method for 1D problems in spherical and cylindrical geometries
⟨Rnk, Rnj⟩ = (1/a³) ∫₀^a 3r² Rnk Rnj dr = δkj (25.64)
Thus, the 3D Laplacian operator with Dirichlet boundary condition on the surface and with domain being the interior of a sphere has the eigenvalues λnk and eigenfunctions
ψmnk(r, θ, ϕ) = Rnk(r)Θnm(cos θ)Φm(ϕ), (25.65)
where Rnk, Θnm and Φm are given in equations (25.49)–(25.50), (25.54)–(25.57) and (25.62)–(25.64).
De∇²c = De (1/r) d/dr(r dc/dr) = kc, in Ω ≡ 0 < r < a (25.66)
c = c₀ on ∂Ω ≡ r = a; c = finite @ r = 0 (25.67)
Nondimensionalization
Define
u = c/c₀; ξ = r/a; ϕ² = a²k/De (25.69)
⇒
(1/ξ) d/dξ(ξ du/dξ) = ϕ²u, 0 < ξ < 1; u(1) = 1; u(0) finite (25.70)
and
η = ∫₀¹ 2ξ u(ξ) dξ (25.71)
Direct solution
Equation (25.70) can be solved directly in terms of modified Bessel functions, which
with given BCs can be expressed as
u(ξ) = I₀(ϕξ)/I₀(ϕ) (25.72)
η = (2/ϕ²) u′(1) = (2/ϕ) I₁(ϕ)/I₀(ϕ) (25.73)
Here, I₀ and I₁ are the modified Bessel functions of the first kind of order zero and order one, respectively.
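Equations (25.72)–(25.73) can be evaluated directly; the book's computations use Mathematica, but an equivalent Python/SciPy check of our own is:

```python
from scipy.special import iv  # modified Bessel functions I_nu of the first kind

def u_direct(xi, phi):
    # concentration profile, eq. (25.72): u(xi) = I0(phi*xi)/I0(phi)
    return iv(0, phi * xi) / iv(0, phi)

def eta_direct(phi):
    # effectiveness factor, eq. (25.73): eta = (2/phi) I1(phi)/I0(phi)
    return (2.0 / phi) * iv(1, phi) / iv(0, phi)

print(u_direct(0.0, 2.0))   # center concentration for phi = 2
print(eta_direct(0.1))      # kinetic limit: eta -> 1 as phi -> 0
print(eta_direct(50.0))     # diffusion limit: eta -> 2/phi
```

As expected, η → 1 for small ϕ and η → 2/ϕ for large ϕ.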
FFT method
The model equation (25.70) can be rewritten by substituting
v =1−u (25.74)
as
(1/ξ) d/dξ(ξ dv/dξ) − ϕ²v = −ϕ², 0 < ξ < 1; v(1) = 0; v(0) finite (25.75)
The associated EVP is
(1/ξ) d/dξ(ξ dw/dξ) = −λw, 0 < ξ < 1; w(1) = 0; w(0) finite (25.76)
whose eigenvalues and normalized eigenfunctions are
J₀(√λk) = 0, and wk = J₀(√λk ξ)/J₁(√λk), k = 1, 2, 3, . . . (25.77)
which are orthonormal with respect to the standard cylindrical inner product:
⟨wi, wj⟩ = ∫₀¹ 2ξ wi wj dξ = δij (25.78)
−λj⟨v, wj⟩ − ϕ²⟨v, wj⟩ = −ϕ²⟨1, wj⟩ ⇒ ⟨v, wj⟩ = ϕ²⟨1, wj⟩/(ϕ² + λj) (25.79)
⇒
v(ξ) = Σ_{j=1}^∞ ⟨v, wj⟩wj = Σ_{j=1}^∞ [ϕ²⟨1, wj⟩/(ϕ² + λj)] wj(ξ) (25.80)
⇒
η = 1 − ⟨1, v⟩
= 1 − Σ_{j=1}^∞ [ϕ²/(ϕ² + λj)] ⟨1, wj⟩²
= 1 − Σ_{j=1}^∞ [ϕ²/(ϕ² + λj)] {∫₀¹ 2ξ J₀(√λj ξ)/J₁(√λj) dξ}²
= 1 − Σ_{j=1}^∞ 4ϕ²/[λj(ϕ² + λj)] (25.81)
where the eigenvalues λj are the roots of J₀(√λj) = 0. Figure 25.4 shows a comparison of the effectiveness factor evaluated from the direct solution (equation (25.73)) and the FFT solution (equation (25.81)) with a few terms included in the summation. Table 25.1 lists the eigenvalues of the cylindrical Laplacian operator that are used in the summation. [Remark: √λn is the nth zero of J₀(x).]
Figure 25.4: Effectiveness factor for a cylindrical catalyst from direct solution and FFT solution.
Table 25.1: First few eigenvalues λj of the Laplacian operator in 1D cylindrical coordinates: J₀(√λj) = 0.
j √λj λj
1 2.40483 5.78319
2 5.52008 30.4713
3 8.65373 74.887
4 11.7915 139.04
5 14.9309 222.932
6 18.0711 326.563
7 21.2116 449.934
8 24.3525 593.043
9 27.4935 755.891
10 30.6346 938.479
11 33.7758 1140.81
12 36.9171 1362.87
13 40.0584 1604.68
14 43.1998 1866.22
15 46.3412 2147.51
16 49.4826 2448.53
17 52.6241 2769.29
18 55.7655 3109.79
19 58.907 3470.03
20 62.0485 3850.01
As expected, the high-ϕ asymptote of η approaches 2/ϕ, and the FFT solution becomes accurate as more terms are included in the summation. The number of terms needed to obtain η accurately is about equal to ϕ, which is an indication of the boundary (reaction) layer thickness being of the order 1/ϕ.
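The truncation behavior can be checked numerically. The following Python/SciPy sketch of our own compares the series (25.81) against the exact result (25.73):

```python
import numpy as np
from scipy.special import jn_zeros, iv

def eta_direct(phi):
    # exact effectiveness factor, eq. (25.73)
    return (2.0 / phi) * iv(1, phi) / iv(0, phi)

def eta_fft(phi, n_terms):
    # FFT series, eq. (25.81); lam_j = (j-th positive zero of J0)^2
    lam = jn_zeros(0, n_terms) ** 2
    return 1.0 - float(np.sum(4.0 * phi**2 / (lam * (phi**2 + lam))))

phi = 5.0
for n in (1, 5, 20, 100):
    print(n, eta_fft(phi, n), eta_direct(phi))
```

The partial sums decrease monotonically toward the exact value, and roughly ϕ terms suffice for a few-digit answer, consistent with the boundary-layer argument above.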
Remarks.
1. When the effective diffusion length RΩ = VΩ/AΩ = a/2 is used as the length scale to define the Thiele modulus as
Φ² = R²Ω k/De,
then the effectiveness factor η approaches 1/Φ for Φ ≫ 1.
Next, consider transient cooling of an infinitely long solid cylinder whose surface is held at temperature Ts and whose initial temperature profile is T₀(r):
ρcp ∂T/∂t = k∇²T = (k/r) ∂/∂r(r ∂T/∂r), 0 < r < a, t > 0 (25.82)
T = Ts @ r = a, t > 0; T = T₀(r) @ t = 0 (25.83)
ξ = r/a; u = (T − Ts)/Ts; τ = kt/(ρcp a²); f(ξ) = (T₀(aξ) − Ts)/Ts; (25.84)
the dimensionless temperature u is given by the following initial boundary value prob-
lem:
(1/ξ) ∂/∂ξ(ξ ∂u/∂ξ) = ∂u/∂τ, 0 < ξ < 1, τ > 0 (25.85)
u(ξ = 1, τ) = 0; u(ξ = 0, τ) finite; u(ξ , τ = 0) = f (ξ ) (25.86)
The relevant EVP in this case is the same as described previously in equations (25.76)–(25.78), i.e.,
J₀(√λk) = 0, and wk = J₀(√λk ξ)/J₁(√λk), k = 1, 2, 3, . . .
which are orthonormal with respect to the standard cylindrical inner product (equation (25.78)). Thus, taking the inner product of model equations (25.85)–(25.86) with wj, we get
with wj , we get
d/dτ ⟨u, wj⟩ = −λj⟨u, wj⟩, τ > 0; ⟨u, wj⟩ = ⟨f, wj⟩ @ τ = 0
⇒
⟨u, wj⟩ = ⟨f, wj⟩e^{−λj τ} ⇒ u(ξ, τ) = Σ_{j=1}^∞ ⟨u, wj⟩wj = Σ_{j=1}^∞ e^{−λj τ}⟨f, wj⟩wj (25.87)
⇒
u(ξ, τ) = Σ_{j=1}^∞ e^{−λj τ} [J₀(√λj ξ)/J₁(√λj)] ∫₀¹ 2x f(x) [J₀(√λj x)/J₁(√λj)] dx
⇒
u(ξ, τ) = 2 Σ_{j=1}^∞ e^{−λj τ} [J₀(√λj ξ)/J₁²(√λj)] ∫₀¹ x f(x) J₀(√λj x) dx (25.88)
For the special case of uniform initial temperature, f(ξ) = 1, using
∫₀¹ x J₀(√λj x) dx = J₁(√λj)/√λj
we get
u(ξ, τ) = 2 Σ_{j=1}^∞ e^{−λj τ} J₀(√λj ξ)/[√λj J₁(√λj)], (25.89)
⟨u⟩(τ) = ⟨1, u(ξ, τ)⟩ = ∫₀¹ 2ξ u(ξ, τ) dξ = 4 Σ_{j=1}^∞ e^{−λj τ}/λj. (25.90)
Figure 25.5: 2D density plot of u(ξ , τ) along with the temporal profile of average temperature ⟨u⟩(τ)
and center temperature u(ξ = 0, τ).
The numerical solution at τ = 0 in the density plot shows oscillations instead of a constant value of unity. This is due to the Gibbs phenomenon, which arises from the discontinuity in the initial condition at τ = 0 (as discussed in Chapter 21).
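The series (25.89)–(25.90) are straightforward to evaluate numerically; a Python/SciPy sketch of our own, truncating the sums:

```python
import numpy as np
from scipy.special import jn_zeros, j0, j1

MU = jn_zeros(0, 500)   # sqrt(lam_j): positive zeros of J0
LAM = MU ** 2

def u_series(xi, tau):
    # eq. (25.89): dimensionless temperature for f(xi) = 1
    return 2.0 * float(np.sum(np.exp(-LAM * tau) * j0(MU * xi) / (MU * j1(MU))))

def avg_u(tau):
    # eq. (25.90): average temperature <u>(tau)
    return 4.0 * float(np.sum(np.exp(-LAM * tau) / LAM))

print(avg_u(1e-4))         # close to the initial value 1 (since 4*sum(1/lam_j) = 1)
print(avg_u(0.5))          # dominated by the first eigenvalue term
print(u_series(0.0, 0.1))  # center temperature at tau = 0.1
```

The boundary value u(1, τ) vanishes identically since J₀(√λj) = 0, and more terms are needed as τ → 0 because of the Gibbs oscillations.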
As another example, consider a concentrated (line-source) initial condition at the center,
f(ξ) = δ₂(ξ) = lim_{ε→0} (1/ε) e^{−ξ²/ε},
which satisfies
∫₀^∞ 2ξ δ₂(ξ) dξ = 1
and leads to
u(ξ, τ) = Σ_{j=1}^∞ e^{−λj τ} J₀(√λj ξ)/J₁²(√λj).
We note that more terms are needed in the summation as τ → 0.
Consider again the 1D diffusion–reaction model with Dirichlet boundary condition but
in a spherical catalyst particle:
De∇²c = De (1/r²) d/dr(r² dc/dr) = kc, in Ω ≡ 0 < r < a (25.91)
c = c₀ on ∂Ω ≡ r = a; c = finite @ r = 0 (25.92)
η = (actual reaction rate)/(rate if c = c₀ in Ω) = [∫₀^a k c(r) 4πr² dr] / [(4π/3)a³ k c₀] (25.93)
Nondimensionalization
Define
u = c/c₀; ξ = r/a; ϕ² = a²k/De; RΩ = VΩ/AΩ = a/3 (25.94)
⇒
(1/ξ²) d/dξ(ξ² du/dξ) = ϕ²u, 0 < ξ < 1; u(1) = 1; u(0) finite (25.95)
and
η = ∫₀¹ 3ξ² u(ξ) dξ (25.96)
Direct solution
Equation (25.95) can be solved exactly in terms of spherical Bessel functions of order zero, which with the given BCs can be expressed as
u(ξ) = sinh(ϕξ)/[ξ sinh(ϕ)] (25.97)
η = (3/ϕ²) u′(1) = (3/ϕ²)(ϕ coth ϕ − 1). (25.98)
FFT method
The model equation (25.95) can be rewritten by substituting
v =1−u (25.99)
as
(1/ξ²) d/dξ(ξ² dv/dξ) − ϕ²v = −ϕ², 0 < ξ < 1; v(1) = 0; v(0) finite (25.100)
The associated EVP is
(1/ξ²) d/dξ(ξ² dw/dξ) = −λw, 0 < ξ < 1; w(1) = 0; w(0) finite (25.101)
whose eigenvalues and normalized eigenfunctions are
j₀(√λk) = sin(√λk)/√λk = 0 ⇒ λk = k²π², k = 1, 2, 3, . . . (25.102)
and wk = √(2/3) (sin kπξ)/ξ, (25.103)
which are orthonormal with respect to the standard spherical inner product:
⟨wi, wj⟩ = ∫₀¹ 3ξ² wi wj dξ = δij (25.104)
−λj⟨v, wj⟩ − ϕ²⟨v, wj⟩ = −ϕ²⟨1, wj⟩ ⇒ ⟨v, wj⟩ = ϕ²⟨1, wj⟩/(ϕ² + λj) (25.105)
⇒
v(ξ) = Σ_{j=1}^∞ ⟨v, wj⟩wj = Σ_{j=1}^∞ [ϕ²⟨1, wj⟩/(ϕ² + λj)] wj(ξ) (25.106)
⇒
η = 1 − ⟨1, v⟩
= 1 − Σ_{j=1}^∞ [ϕ²/(ϕ² + λj)] ⟨1, wj⟩²
= 1 − Σ_{j=1}^∞ [ϕ²/(ϕ² + j²π²)] {∫₀¹ 3ξ² √(2/3) (sin jπξ)/ξ dξ}²
= 1 − Σ_{j=1}^∞ 6ϕ²/[j²π²(ϕ² + j²π²)] (25.107)
Figure 25.7 shows a comparison of the effectiveness factors evaluated from the direct solution (equation (25.98)) and the FFT solution (equation (25.107)) with a few terms included in the summation.
Figure 25.7: Effectiveness factor for a spherical catalyst from exact solution and FFT solution.
As expected, the high-ϕ asymptote approaches 3/ϕ, and the FFT solution becomes accurate as more terms are included in the summation.
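Both expressions are elementary to evaluate; the following Python sketch (ours, not from the text) compares them:

```python
import math

def eta_direct(phi):
    # exact effectiveness factor for a sphere, eq. (25.98)
    return 3.0 * (phi / math.tanh(phi) - 1.0) / phi**2

def eta_fft(phi, n_terms):
    # FFT series, eq. (25.107), with lam_j = j^2 pi^2
    s = sum(6.0 * phi**2 / (j**2 * math.pi**2 * (phi**2 + j**2 * math.pi**2))
            for j in range(1, n_terms + 1))
    return 1.0 - s

for phi in (0.5, 2.0, 10.0):
    print(phi, eta_direct(phi), eta_fft(phi, 200))
```

The series terms decay like 1/j⁴, so the sum converges quickly for moderate ϕ.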
Remark. If the Thiele modulus is defined based on the effective diffusion length RΩ = a/3 instead of the radius a, i.e.,
Φ² = R²Ω k/De,
then the effectiveness factor η approaches 1/Φ for Φ ≫ 1.
Next, consider the transient cooling problem for a sphere (the spherical analog of equations (25.85)–(25.86)):
(1/ξ²) ∂/∂ξ(ξ² ∂u/∂ξ) = ∂u/∂τ, 0 < ξ < 1, τ > 0 (25.108)
u(ξ = 1, τ) = 0; u(ξ = 0, τ) finite; u(ξ, τ = 0) = f(ξ) (25.109)
The relevant EVP in this case is same as that described previously in equations (25.101)
and equations (25.102)–(25.104), i. e.
λk = k²π² and wk = √(2/3) (sin kπξ)/ξ; k = 1, 2, 3, . . .
which are orthonormal w. r. t. the standard spherical inner product (equation (25.104)).
Thus, taking the inner product of model equations (25.108)–(25.109) with wj , we get
d/dτ ⟨u, wj⟩ = −λj⟨u, wj⟩, τ > 0; ⟨u, wj⟩ = ⟨f, wj⟩ @ τ = 0
⇒
⟨u, wj⟩ = ⟨f, wj⟩e^{−λj τ} ⇒ u(ξ, τ) = Σ_{j=1}^∞ ⟨u, wj⟩wj = Σ_{j=1}^∞ e^{−λj τ}⟨f, wj⟩wj (25.110)
⇒
u(ξ, τ) = Σ_{j=1}^∞ e^{−j²π²τ} √(2/3) [(sin jπξ)/ξ] ∫₀¹ 3x² f(x) √(2/3) [(sin jπx)/x] dx
⇒
u(ξ, τ) = 2 Σ_{j=1}^∞ e^{−j²π²τ} [(sin jπξ)/ξ] ∫₀¹ x f(x) sin(jπx) dx (25.111)
For the special case of uniform initial temperature, f(ξ) = 1, using
∫₀¹ x sin(jπx) dx = (−1)^{j−1}/(jπ)
we get
u(ξ, τ) = 2 Σ_{j=1}^∞ (−1)^{j−1} e^{−j²π²τ} (sin jπξ)/(jπξ) (25.112)
Figure 25.8 shows the 2D density plot of the dimensionless temperature u(ξ , τ) in (ξ −τ)
space, as well as its average temperature ⟨u⟩(τ) and center temperature u(0, τ) as a
function of time τ.
Figure 25.8: 2D density plot of u(ξ , τ) along with the temporal profile of average temperature ⟨u⟩(τ)
and center temperature u(ξ = 0, τ).
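The series (25.112) is elementary to evaluate; a Python sketch of our own, truncating the sum (the ξ → 0 limit of sin(jπξ)/(jπξ) is 1, giving the center temperature):

```python
import math

def u_sphere(xi, tau, n_terms=200):
    # eq. (25.112): dimensionless temperature in a sphere for f(xi) = 1
    total = 0.0
    for j in range(1, n_terms + 1):
        if xi == 0.0:
            mode = 1.0  # limit of sin(j*pi*xi)/(j*pi*xi) as xi -> 0
        else:
            mode = math.sin(j * math.pi * xi) / (j * math.pi * xi)
        total += (-1) ** (j - 1) * math.exp(-(j * math.pi) ** 2 * tau) * mode
    return 2.0 * total

print(u_sphere(0.0, 0.05))  # center still close to 1 at small tau
print(u_sphere(0.0, 0.5))   # nearly equilibrated with the surface
```

The boundary value u(1, τ) vanishes identically since sin(jπ) = 0, and for τ ≳ 0.1 a single term already dominates.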
Consider the 2D Laplace equation inside a unit circle with Dirichlet boundary condition, i.e.,
∇²u = (1/r) ∂/∂r(r ∂u/∂r) + (1/r²) ∂²u/∂θ² = 0, 0 < r < 1, 0 < θ < 2π (25.114)
u(1, θ) = f(θ); u(r, θ) finite @ r = 0 (25.115)
These eigenfunctions are complete. Thus, if f(θ) ∈ ℒ²[0, 2π], the Hilbert space of 2π-periodic functions with the standard inner product, then taking the inner product of equations (25.114)–(25.115) with wj, we can write
r² d²Uj/dr² + r dUj/dr − j²Uj = 0, Uj ≡ ⟨u, wj⟩; (25.118)
Uj(1) = ⟨f, wj⟩, Uj(0) finite (25.119)
Equation (25.118) is Euler’s equation. The two linearly independent solutions are r j
and r −j . For bounded solution at r → 0, we can only take r j . Thus, the solution to
equations (25.118)–(25.119) can be expressed as
⟨u, wj ⟩ = ⟨f , wj ⟩r j (25.120)
⇒
u(r, θ) = Σ_{j=0}^∞ ⟨u, wj⟩wj = Σ_{j=0}^∞ r^j ⟨f, wj⟩wj (25.121)
= (1/2π) ∫₀^{2π} f(θ′) dθ′ + Σ_{j=1}^∞ (r^j/π) cos[jθ] ∫₀^{2π} f(θ′) cos[jθ′] dθ′
+ Σ_{j=1}^∞ (r^j/π) sin[jθ] ∫₀^{2π} f(θ′) sin[jθ′] dθ′
= (1/2π) ∫₀^{2π} f(θ′)[1 + 2 Σ_{j=1}^∞ r^j cos[j(θ − θ′)]] dθ′ (25.122)
The summation can be evaluated in closed form:
Σ_{j=1}^∞ r^j cos[j(θ − θ′)] = Re[Σ_{j=1}^∞ {re^{i(θ−θ′)}}^j]
= Re[re^{i(θ−θ′)}/(1 − re^{i(θ−θ′)})] (∵ Σ_{j=1}^∞ z^j = z + z² + z³ + ⋅⋅⋅ = z/(1 − z), |z| < 1)
= (r cos[θ − θ′] − r²)/(1 + r² − 2r cos[θ − θ′])
⇒
u(r, θ) = (1/2π) ∫₀^{2π} f(θ′)[1 + 2(r cos[θ − θ′] − r²)/(1 + r² − 2r cos[θ − θ′])] dθ′
= (1/2π) ∫₀^{2π} (1 − r²)f(θ′)/(1 + r² − 2r cos[θ − θ′]) dθ′ (25.123)
which is also referred to as Poisson's integral formula (for the interior of a circle).
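Poisson's integral formula is easy to verify numerically. The sketch below (Python/NumPy, our own check rather than anything from the text) evaluates (25.123) on a uniform θ′-grid and compares with the exact interior solution r^p cos[pθ] for the boundary value cos[pθ] used in Figure 25.9:

```python
import numpy as np

def u_poisson(r, theta, f, n=2000):
    # Poisson's integral formula, eq. (25.123), for the unit circle:
    # u(r,theta) = (1/2pi) int_0^{2pi} (1-r^2) f(t') / (1 + r^2 - 2 r cos(theta-t')) dt'
    tp = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    kernel = (1.0 - r**2) / (1.0 + r**2 - 2.0 * r * np.cos(theta - tp))
    return float(np.mean(f(tp) * kernel))  # uniform-grid average = (1/2pi) * integral

# boundary value f(theta) = cos(5 theta); the exact interior solution is r^5 cos(5 theta)
f = lambda t: np.cos(5.0 * t)
print(u_poisson(0.5, 0.3, f))
print(0.5**5 * np.cos(1.5))
```

Because the integrand is smooth and periodic, the uniform-grid rule converges spectrally, so even modest grids reproduce the exact value to near machine precision.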
Special cases
1. f(θ) = 1: In this case, ⟨f, wj⟩ = δj0 and
u(r, θ) = Σ_{j=0}^∞ r^j δj0 wj = 1
2. f(θ) = A cos[pθ] (p ≥ 1): In this case, the only nonzero coefficient is
⟨f, wj⟩ = (A/√2) δjp (taken with respect to wj = √2 cos[jθ]),
so that
u(r, θ) = Σ_{j} r^j (A/√2) δjp wj = A r^p cos[pθ]
(and analogously u = A r^p sin[pθ] for a sine input).
As an example, taking cosine input with p = 5, i. e., f (θ) = cos 5θ, the 3D plot and
the density plot are shown in Figure 25.9.
Consider the vibrational motion of a circular membrane described by the partial dif-
ferential equation
∂²u/∂t² = c²(∂²u/∂r² + (1/r) ∂u/∂r + (1/r²) ∂²u/∂θ²), 0 < r < 1, 0 < θ < 2π, t > 0 (25.124)
25.3 2D and 3D problems in cylindrical geometry
Figure 25.9: 3D plot (top) and density plot (bottom) for solution of Laplace equation with boundary
value f (θ) = cos 5θ.
with BC u = 0 @ r = 1 (t > 0) and ICs u = f(r, θ), ∂u/∂t = 0 @ t = 0. Due to the circular geometry, the BCs for the first (θ-)operator are periodicity in θ with period 2π.
values and eigenfunctions for θ-operators are given in equations (25.35)–(25.36) or
(25.116)–(25.117):
⟨wi, wj⟩ = (1/2π) ∫₀^{2π} wi wj dθ = δij (25.129)
These eigenfunctions are complete. Thus, if f(r, θ) ∈ ℒ²[0, 2π] (as a function of θ), then defining um(r, t) = ⟨u, wm⟩ and taking the inner product of equation (25.124) with the eigenfunctions wm, we get
∂²um/∂t² = c²(∂²um/∂r² + (1/r) ∂um/∂r − (m²/r²)um), 0 < r < 1, t > 0 (25.131)
BC: um = 0 @ r = 1 (25.132)
ICs: um = fm(r) ≡ ⟨f, wm⟩ and ∂um/∂t = 0 @ t = 0
The associated EVP in r is
(1/r) d/dr(r dψ/dr) − (m²/r²)ψ = −λψ,
ψ(1) = 0, ψ(0) finite,
which is a special case of equation (25.37) (with λn = 0, i.e., 3D reduced to 2D). Thus, the eigenvalues and eigenfunctions are given by equations (25.39)–(25.40) (with a = 1) as
Jm(√λmk) = 0, wmk(r) = Jm(√λmk r)/Jm+1(√λmk), k = 1, 2, 3, . . .
which are orthonormal with respect to the standard cylindrical inner product:
⟨wmk, wmj⟩ = ∫₀¹ 2r wmk wmj dr = δkj,
and form a complete set. Thus, taking the inner product of equation (25.131) with wmk(r) gives
d²Umk/dt² = −c²λmk Umk; Umk = ⟨um, wmk⟩ (25.138)
⇒ Umk(t) = ⟨fm, wmk⟩ cos[c√λmk t] (using the ICs), so that
um(r, t) = Σ_{k=1}^∞ ⟨um, wmk⟩wmk = Σ_{k=1}^∞ Umk(t)wmk(r)
= Σ_{k=1}^∞ cos[c√λmk t] [Jm(√λmk r)/Jm+1(√λmk)] ∫₀¹ 2ξ fm(ξ) [Jm(√λmk ξ)/Jm+1(√λmk)] dξ
= 2 Σ_{k=1}^∞ cos[c√λmk t] [Jm(√λmk r)/J²m+1(√λmk)] ∫₀¹ ξ fm(ξ)Jm(√λmk ξ) dξ (25.141)
⇒
u(r, θ, t) = Σ_{m=0}^∞ um(r, t)wm(θ)
= u₀(r, t) + Σ_{m=1}^∞ umc √2 cos mθ + Σ_{m=1}^∞ ums √2 sin mθ
= Σ_{k=1}^∞ cos[c√λ0k t] J₀(√λ0k r) A0k
+ Σ_{k=1}^∞ Σ_{m=1}^∞ [Ackm cos mθ + Askm sin mθ] Jm(√λmk r) cos[c√λmk t] (25.142)
where
A0k = [1/(πJ₁²(√λ0k))] ∫₀¹∫₀^{2π} ξ J₀(√λ0k ξ) f(ξ, θ) dθ dξ (25.143)
Ackm = [2/(πJ²m+1(√λmk))] ∫₀¹∫₀^{2π} ξ Jm(√λmk ξ) f(ξ, θ) cos mθ dθ dξ (25.144)
Askm = [2/(πJ²m+1(√λmk))] ∫₀¹∫₀^{2π} ξ Jm(√λmk ξ) f(ξ, θ) sin mθ dθ dξ (25.145)
m = 1, 2, . . . ; k = 1, 2, 3, . . .
Special case: f(r, θ) = Jm(√λmk r) cos mθ or Jm(√λmk r) sin mθ (i.e., pure eigenmodes)
For the special case when f(r, θ) is a pure mode of vibration, the solution remains in the same mode for all times. For example,
(i) when f(r, θ) = J₀(√λ0k r), equations (25.143)–(25.145) imply that A0k = 1 and all other coefficients vanish;
(ii) when f(r, θ) = Jm(√λmk r) sin mθ (m > 0), equations (25.143)–(25.145) imply that Askm = 1 and all other coefficients vanish;
(iii) when f(r, θ) = Jm(√λmk r) cos mθ (m > 0), equations (25.143)–(25.145) imply that Ackm = 1 and all other coefficients (in particular all sine coefficients) vanish.
Thus, the solution remains in the same eigenmode, with the amplitude ratio A.R. varying with time as A.R. = cos[c√λmk t].
The contour profiles of the first few modes of vibration (i.e., eigenmodes) are shown in Figure 25.10, while the eigenvalues corresponding to these modes are listed in Table 25.2.
Similarly, the 2D problems in cylindrical coordinate system in (r, z) or (z, θ) space can
be solved. For additional chemical engineering applications, we refer to the articles
by Balakotaiah and Gupta [6], Ratnakar and Balakotaiah [26], Balakotaiah [5] and Aris
and Balakotaiah [4].
Figure 25.10: Contour profiles of first few eigenmodes: cos[mθ]Jm [√λmk r].
Table 25.2: Eigenvalues λmk = μ²mk corresponding to the first few eigenmodes (μmk is the kth zero of Jm).
m k μmk λmk
0 1 2.40483 5.78319
0 2 5.52008 30.4713
1 1 3.83171 14.682
1 2 7.01559 49.2185
2 1 5.13562 26.3746
2 2 8.41724 70.8499
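The entries of Tables 25.1 and 25.2 can be regenerated with SciPy's Bessel-zero routine (a Python equivalent of the Mathematica computation used in the book):

```python
from scipy.special import jn_zeros

# Table 25.2: mu_mk is the k-th positive zero of J_m, and lam_mk = mu_mk^2
for m in (0, 1, 2):
    for k, mu in enumerate(jn_zeros(m, 2), start=1):
        print(m, k, round(mu, 5), round(mu * mu, 4))
```

The printed values match the table rows above.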
3D Poisson’s equation
Consider the Poisson equation
∇²u = (1/r) ∂/∂r(r ∂u/∂r) + (1/r²) ∂²u/∂θ² + ∂²u/∂z² = −f(r, θ, z) in Ω, (25.146)
Ω ≡ 0 < r < a, 0 < θ < 2π, 0 < z < L
with BC: u = 0 on ∂Ω. (25.147)
The eigenvalues and eigenfunctions of the Laplacian operator with Dirichlet condition in the cylindrical geometry were obtained in Section 25.1.3 (see equations (25.23)–(25.24) to (25.42)). Using these eigenfunctions and the corresponding inner product, the solution can be obtained by taking the inner product of equations (25.146)–(25.147) with ψmnk as follows:
−λmnk⟨u, ψmnk⟩ = −⟨f, ψmnk⟩ ⇒ ⟨u, ψmnk⟩ = ⟨f, ψmnk⟩/λmnk (25.148)
⇒
u = Σ_{m,n,k} ⟨u, ψmnk⟩ψmnk = Σ_{n=1}^∞ Σ_{m=0}^∞ Σ_{k=1}^∞ [⟨f, ψmnk⟩/λmnk] ψmnk (25.149)
m,n,k n=1 m=0 k=1 λmnk
where
λmnk = μ²mk/a² + n²π²/L²; Jm(μmk) = 0 (25.150)
ψmnk = { ψ0nk = √2 [J₀(μ0k r/a)/J₁(μ0k)] sin(nπz/L)
{ ψᶜmnk = 2 [Jm(μmk r/a)/Jm+1(μmk)] cos mθ sin(nπz/L) (25.151)
{ ψˢmnk = 2 [Jm(μmk r/a)/Jm+1(μmk)] sin mθ sin(nπz/L)
and
⟨f, ψmnk⟩ = [1/(πa²L)] ∫₀^L ∫₀^a ∫₀^{2π} r f(r, θ, z) ψmnk(r, θ, z) dθ dr dz (25.152)
3D Heat/Diffusion equation
Consider the transient heat/diffusion equation
∂u/∂t = ∇²u = (1/r) ∂/∂r(r ∂u/∂r) + (1/r²) ∂²u/∂θ² + ∂²u/∂z² in Ω (25.153)
Ω ≡ 0 < r < a, 0 < θ < 2π, 0 < z < L
with BC: u = 0 on ∂Ω, t > 0, and IC: u = f(r, θ, z) @ t = 0 in Ω.
Again, using the same eigenvalues and eigenfunctions (obtained in Section 25.1.3 for
the Laplacian operator with Dirichlet condition in the cylindrical geometries), and tak-
ing the inner product of equation (25.153) with ψmnk , we get:
d/dt ⟨u, ψmnk⟩ = −λmnk⟨u, ψmnk⟩, t > 0;
⟨u, ψmnk⟩ = ⟨f, ψmnk⟩ @ t = 0
⇒
⟨u, ψmnk⟩ = e^{−λmnk t}⟨f, ψmnk⟩
⇒
u = Σ_{m,n,k} ⟨u, ψmnk⟩ψmnk = Σ_{n=1}^∞ Σ_{m=0}^∞ Σ_{k=1}^∞ e^{−λmnk t}⟨f, ψmnk⟩ψmnk (25.157)
Consider the Poisson’s equation in the interior of a sphere with Dirichlet boundary
condition, i. e.,
∇²u = (1/r²) ∂/∂r(r² ∂u/∂r) + (1/(r² sin θ)) ∂/∂θ(sin θ ∂u/∂θ) + (1/(r² sin²θ)) ∂²u/∂ϕ² = −f(r, θ, ϕ) in Ω, (25.158)
Ω ≡ 0 < r < a, 0 < θ < π, 0 < ϕ < 2π
with BC: u = 0 on ∂Ω ≡ r = a. (25.159)
FFT method
The eigenvalues and eigenfunctions of Laplacian operator in spherical coordinate are
described in Section 25.1.4 (see equations (25.43)–(25.44) to (25.65)). Taking the inner
product of equations (25.158)–(25.159) with eigenfunctions ψmnk , we get
−λnk⟨u, ψmnk⟩ = −⟨f, ψmnk⟩ ⇒ ⟨u, ψmnk⟩ = ⟨f, ψmnk⟩/λnk (25.160)
where the eigenvalues λnk are given by the roots of jn(a√λnk) = 0 (equation (25.161)), and the eigenfunctions by ψmnk = Rnk(r)Θnm(cos θ)Φm(ϕ), with
Φm(ϕ) = { Φ₀ = 1; Φᶜm = √2 cos mϕ; Φˢm = √2 sin mϕ }, n = m, m + 1, . . . (25.163)
⇒
⇒
u = Σ_{m,n,k} ⟨u, ψmnk⟩ψmnk = Σ_{m=0}^∞ Σ_{n=m}^∞ Σ_{k=1}^∞ [⟨f, ψmnk⟩/λnk] ψmnk (25.164)
For the special case of azimuthal/longitudinal symmetry (i. e., symmetry w. r. t. ϕ),
i. e., for the 2D case in (r, θ), we can disregard the m-operator, and only the m = 0
mode remains. In this case, the formal solution (equation (25.164)) simplifies to
u = Σ_{n,k} ⟨u, ψnk⟩ψnk = Σ_{n=0}^∞ Σ_{k=1}^∞ [⟨f, ψnk⟩/λnk] ψnk (25.165)
where λnk is obtained the same way (from equation (25.161)) and the eigenfunctions simplify to
ψnk(r, θ) = √(2(2n + 1)/3) [jn(√λnk r)/jn+1(√λnk a)] Pn(cos θ) (25.166)
The solution of equations (25.158)–(25.159) can also be obtained by using the FFT
method in combination of direct or Green’s function method, where FFT is used to
reduce the problem to 1D (i. e., 3D to 1D or 2D to 1D), and then solving the 1D problem
directly. We demonstrate this approach below.
25.4 2D and 3D problems in spherical geometry
(1/r²) ∂/∂r(r² ∂u/∂r) + (1/(r² sin θ)) ∂/∂θ(sin θ ∂u/∂θ) = −f(r, θ), (25.167)
0 < r < a, 0 < θ < π
u(a, θ) = 0, u(r, θ) finite (25.168)
(1/sin θ) d/dθ(sin θ dy/dθ) = −λy, 0 < θ < π (25.169)
y(θ) is finite for 0 < θ < π (25.170)
which is the special case of the θ-operator in the general case (see equation (25.51)) with m = 0. Thus, the eigenvalues and eigenfunctions can be obtained from equations (25.54)–(25.56) with m = 0, i.e., eigenvalues λ:
λn = n(n + 1), n = 0, 1, 2, . . .
and eigenfunctions yn:
yn(θ) = √(2n + 1) Pn(cos θ).
It can be shown that they form a basis for the Hilbert space ℒ²[0, π] with the inner product ⟨f, g⟩ = (1/2) ∫₀^π f g sin θ dθ (equation (25.173)), i.e., for any function f(θ), we can express
f(θ) = Σ_{n=0}^∞ fn yn(θ); fn = ⟨f, yn⟩ = (1/2) ∫₀^π f(θ)yn(θ) sin θ dθ (25.174)
Now consider the solution of the Poisson equation (25.167). Taking the inner product
with eigenfunctions, we get
(1/r²) d/dr(r² dun/dr) − [n(n + 1)/r²] un = −fn(r); n = 0, 1, 2, . . . (25.175)
un(a) = 0, un(r) finite (25.176)
where
un = ⟨u, yn⟩ = (1/2) ∫₀^π u(r, θ)yn(θ) sin θ dθ (25.177)
fn = ⟨f, yn⟩ = (1/2) ∫₀^π f(r, θ)yn(θ) sin θ dθ (25.178)
Equations (25.175)–(25.176) may be solved using the Green’s function method. We note
that
u₁ = (r/a)ⁿ and u₂ = (r/a)^{−n−1} − (r/a)ⁿ (25.179)
are two linearly independent solutions of the homogeneous equation satisfying the
BCs at r = 0 and r = a, respectively. Thus,
un(r) = ∫₀^a Gn(r, s) fn(s) s² ds (25.180)
where
Gn(r, s) = [1/((2n + 1)a)] { (s/a)ⁿ [(r/a)^{−n−1} − (r/a)ⁿ], s ≤ r
{ (r/a)ⁿ [(s/a)^{−n−1} − (s/a)ⁿ], s ≥ r (25.181)
where G is the Green’s function for the operator on the LHS of equations (25.175)–
(25.176), and can be expressed as
∞
G(r, s, θ, α) = ∑ Gn (r, s)Kn (θ, α) (25.183)
n=0
Kn(θ, α) = [(2n + 1)/2] Pn(cos θ)Pn(cos α) (25.184)
Three-dimensional problem
Similar to the 2D problem, the solution to the 3D problem can be obtained by consid-
ering the eigenvalue problem in (θ, ϕ) and converting the 3D model to 1D model using
FFT, and then using the direct solution. For example, equations (25.49)–(25.50) and
(25.54)–(25.56) give the eigenvalues and eigenfunctions:
ynm(θ, ϕ) = √[(n − m)!(2n + 1)/(n + m)!] Pⁿm(cos θ)Φm(ϕ) (25.185)
Φm(ϕ) = { Φ₀ = 1; Φᶜm = √2 cos mϕ; Φˢm = √2 sin mϕ }, n = m, m + 1, . . . (25.186)
with eigenvalues λnm = n(n + 1). Thus, taking the inner product of equations (25.158)–
(25.159) with ynm , we get
(1/r²) d/dr(r² dunm/dr) − [n(n + 1)/r²] unm = −fnm(r); (25.187)
unm (a) = 0; unm (r) is finite; m = 0, 1, 2, . . . ; n = m, m + 1, . . . (25.188)
where
unm = ⟨u, ynm⟩ = (1/4π) ∫₀^π ∫₀^{2π} u(r, θ, ϕ)ynm(θ, ϕ) sin θ dθ dϕ (25.189)
fnm = ⟨f, ynm⟩ = (1/4π) ∫₀^π ∫₀^{2π} f(r, θ, ϕ)ynm(θ, ϕ) sin θ dθ dϕ. (25.190)
Thus,
u(r, θ, ϕ) = Σ_{m=0}^∞ Σ_{n=m}^∞ unm(r)ynm(θ, ϕ)
= Σ_{m=0}^∞ Σ_{n=m}^∞ ynm(θ, ϕ) ∫₀^a Gn(r, s)s² [(1/4π) ∫₀^π ∫₀^{2π} f(s, θ′, ϕ′)ynm(θ′, ϕ′) sin θ′ dθ′ dϕ′] ds
Knm(θ, θ′, ϕ, ϕ′) = (1/4π) ynm(θ, ϕ)ynm(θ′, ϕ′) (25.194)
Similarly, other problems in spherical geometry with other types of boundary con-
ditions can also be solved using this approach. For further example of problems in
cylindrical and spherical geometries, we refer to the books by Carslaw and Jaeger [10]
and Crank [16].
Problems
1. Consider the problem of cooling a circular cylinder of length 10 cm and diameter 2 cm made of copper. Its initial temperature is 500 ∘C and it is suddenly plunged
into an agitated bath at 100 ∘ C temperature. Assume that the agitation is high
enough so that the surface of the cylinder is at 100 ∘ C.
(a) Derive a mathematical model for the temperature history of the rod and cast
into dimensionless form.
(b) Find the solution of the model. Determine an expression for the maximum
(center) temperature of the rod as a function of time. Compute this value after
0.05, 0.1, 1.0, 5.0 and 10 seconds.
(c) Repeat part (b) by assuming the cylinder to be of infinite length and compare
your result.
2. Consider the solution of the Laplace’s equation in a circle of radius unity with
Dirichlet condition u = f (θ) on the boundary (r = 1, 0 < θ ≤ 2π). Obtain a solution
to this problem using finite Fourier transform. Show that the solution obtained
may be simplified to the Poisson’s integral formula:
2π
1 (1 − r 2 )f (ϕ)
u(r, θ) = ∫ dϕ
2π 1 − 2r cos(θ − ϕ) + r 2
0
3. Consider the problem of tracer dispersion in laminar pipe flow, described by
∂C/∂t + 2û[1 − r²/R²] ∂C/∂z = Dm∇²C + f(r, t); 0 < r < R, 0 < z < L, t > 0
with no flux boundary conditions at the pipe wall and C = C0 (r, t) at z = 0. Cast
into dimensionless form and solve using finite Fourier transform. Determine an
expression for the convected mean concentration at any axial position and time.
4. Consider the problem of tracer dispersion described in problem 3 above. Assume
that the pipe is of infinite length. Cast into dimensionless form and solve using
finite Fourier transform.
5. Consider the problem of unsteady-state diffusion and reaction in a porous spher-
ical catalyst
εD 𝜕 2 𝜕C 𝜕C
(r ) − kC = ε ; 0 < r < R, t>0
r 2 𝜕r 𝜕r 𝜕t
C = C0 (t), @ r = R, t > 0; C = F(r), @ t = 0, 0<r<R
Cast into dimensionless form and solve using finite Fourier transform.
6. Solve the problem of vibration of a sphere:
∂²u/∂t² = ∇²u for r < 1, t > 0
u(r, θ, ϕ, 0) = f(r, θ, ϕ); ∂u/∂t (r, θ, ϕ, 0) = 0
7. Consider the slow (creeping) flow of a viscous fluid past a sphere of radius R, described by
(1/r²) ∂/∂r(r²vr) + (1/(r sin θ)) ∂/∂θ(vθ sin θ) = 0 (continuity)
−∂p/∂r + μ[(1/r²) ∂/∂r(r² ∂vr/∂r) + (1/(r² sin θ)) ∂/∂θ(sin θ ∂vr/∂θ) − 2vr/r² − (2/r²) ∂vθ/∂θ − (2/r²)vθ cot θ] = 0 (r-momentum)
−(1/r) ∂p/∂θ + μ[(1/r²) ∂/∂r(r² ∂vθ/∂r) + (1/(r² sin θ)) ∂/∂θ(sin θ ∂vθ/∂θ) + (2/r²) ∂vr/∂θ − vθ/(r² sin²θ)] = 0 (θ-momentum)
with BCs
@ r = R: vr = vθ = 0
r → ∞: vr = U cos θ, vθ = −U sin θ, p = p₀
where Ω is (i) a circular disk or (ii) a sphere. Here, u is the vector of concentrations, D is the matrix of diffusion coefficients and A is a constant n × n matrix. (b) Obtain a formal solution to (1) with Neumann boundary conditions
∇u.n = 0 on 𝜕Ω
where n is the outward normal to 𝜕Ω. (c) Show schematic diagrams of contour
plots of the first few eigenfunctions for case a(i).
9. A first-order chemical reaction is carried out in a tubular reactor that discharges into
a stirred reactor. The entire assembly is held at a constant temperature. Reaction
occurs in the tube as well as the stirred vessel.
(a) Assuming that the axial dispersion model applies in the tube, develop a math-
ematical model for the system.
(b) Devise a suitable self-adjoint formalism and determine a formal solution to
the model.
(c) Determine the residence time distribution function (response of the system
for a pulse input when there is no reaction) of the model developed in (a).
(d) How does your result change if the axial dispersion model is replaced by a
two-dimensional model that accounts for axial as well as radial dispersion of
the reacting species?
10. (a) Consider the problem of heat loss from a very long heated pipe buried vertically
in the earth. Suppose that the pipe outside radius is R and its surface temperature
is raised and kept constant at Ts . Assume that the initial earth temperature is Ta .
(i) Set up an appropriate mathematical model and determine an expression for
the heat flux at the surface of the pipe as a function of time.
(ii) Use the result in (i) to determine the effective heat transfer coefficient as a
function of time.
(b) A circular cylinder of radius R and infinite length is immersed in a fluid at
rest everywhere, and is suddenly made to move with steady velocity U parallel to
its length. Determine the frictional force per unit length of the cylinder at time t
after the motion has begun. With appropriate change of notation, show that this
expression is identical to that determined in (a) above.
(c) Determine the asymptotic form of the solutions in (a) and (b) for small and
large times.
Part VI: Formulation and solution of some classical
chemical engineering problems
Introduction
The aim of this last part is to demonstrate the use of mathematical tools, developed
in earlier parts, in the solution of some classical problems encountered by Chemi-
cal Engineers. It is hoped that these representative problems combined with practical
knowledge gained from experience can help the student in the mathematical analysis
of other such problems.
https://doi.org/10.1515/9783110739701-027
26 The classical Graetz–Nusselt problem
The classical Graetz–Nusselt problem deals with the determination of the heat or mass
transfer from a fluid in a duct to the wall in steady laminar flow. Here, we formulate
the model for heat transfer, and show how the tools of linear analysis may be used to
determine the heat transfer coefficient.
Figure 26.1: Schematics of (a) a laminar flow in a duct of an arbitrary cross-section, and (b) typical
cross-sections used in various applications.
ρf Cpf uf(y′, z′) ∂T/∂x′ = kf ∇′⊥² T = kf(∂²T/∂y′² + ∂²T/∂z′²), x′ > 0, (y′, z′) ∈ Ωf (26.1)
https://doi.org/10.1515/9783110739701-028
Here, T, ρf , Cpf and kf are temperature, density, specific heat capacity and thermal
conductivity of the fluid; x ′ is the coordinate along flow direction, (y′ , z ′ ) are transverse
coordinates; Ωf and 𝜕Ωf are transverse domain and its boundary, respectively; Tin and
Tw are fluid temperatures at the inlet and at the transverse boundary (wall); qw is the
heat flux at the transverse boundary entering the domain Ωf ; uf (y′ , z ′ ) is the velocity
profile; and n is unit outward normal to 𝜕Ωf . In the general case, there are two types of
boundary conditions: (i) constant wall temperature and (ii) constant heat flux at the
wall. Here, we discuss the first case and leave the second case for exercises.
$$T = T_{in}\quad @\ x' = 0 \tag{26.2}$$
$$T = T_w\quad @\ \partial\Omega_f \tag{26.3}$$
and define the following quantities to nondimensionalize the governing model (equations (26.1)–(26.3)):
$$\theta = \frac{T-T_w}{T_{in}-T_w};\quad (y,z) = \frac{(y',z')}{R_{\Omega_f}};\quad u = \frac{u_f}{\bar u_f};\quad x = \frac{x'}{R_{\Omega_f}\,Pe};\quad Pe = \frac{\bar u_f R_{\Omega_f}}{\alpha_f} \tag{26.4}$$
Here, AΩf, PΩf and RΩf are the cross-section area, wetted perimeter and hydraulic radius [hydraulic diameter dh = 4RΩf], respectively; ūf = (1/AΩf) ∫Ωf uf(y′, z′) dy′ dz′ is the average velocity in the fluid phase; αf is the thermal diffusivity. This leads to the following dimensionless linear model:
$$L\theta = \frac{1}{u(y,z)}\left(\frac{\partial^2\theta}{\partial y^2}+\frac{\partial^2\theta}{\partial z^2}\right) = \frac{\partial\theta}{\partial x},\quad x>0,\ (y,z)\in\Omega \tag{26.5}$$
$$\theta = 1\quad @\ x = 0;\qquad \theta = 0\quad @\ \partial\Omega \tag{26.6}$$
The same model is obtained for mass transfer from the duct interior to the wall with
an infinitely fast wall reaction (so that the species concentration at the wall is zero).
The dimensionless model given by equations (26.5) and (26.6) can be solved by
considering the eigenvalue problem (EVP) defined by
26.1 Model formulations and formal solution | 685
$$L\psi = \frac{1}{u(y,z)}\nabla_\perp^2\psi = \frac{1}{u(y,z)}\left(\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}\right) = -\lambda\psi\ \text{in}\ \Omega;\qquad \psi|_{\partial\Omega} = 0. \tag{26.7}$$
Note that the operator appearing in equation (26.7) is a self-adjoint operator (i. e.,
L∗ = L) w. r. t. the inner product defined by
$$\langle\psi_i,\psi_j\rangle = \frac{1}{A_\Omega}\int_\Omega u(y,z)\,\psi_i\psi_j\,dy\,dz = \delta_{ij};\qquad A_\Omega = \int_\Omega dy\,dz \tag{26.8}$$
where
$$\delta_{ij} = \begin{cases}1, & i = j\\ 0, & i\neq j\end{cases}$$
is the Kronecker delta. Hence, equation (26.7) defines a Sturm–Liouville EVP with
eigenvalues λi and eigenfunctions ψi .
Thus, taking the inner product with ψi, equation (26.5) leads to
$$\frac{d}{dx}\langle\theta,\psi_i\rangle = \langle L\theta,\psi_i\rangle = \langle\theta,L^*\psi_i\rangle = \langle\theta,L\psi_i\rangle = -\lambda_i\langle\theta,\psi_i\rangle;\qquad \langle\theta,\psi_i\rangle|_{x=0} = \langle 1,\psi_i\rangle$$
$$\Rightarrow\quad \langle\theta,\psi_i\rangle = \exp(-\lambda_i x)\,\langle 1,\psi_i\rangle. \tag{26.9}$$
The local heat transfer coefficient h is defined by
$$h = \frac{q_w}{T_w-T_m} = \frac{1}{(T_w-T_m)\,P_{\Omega_f}}\int_{\partial\Omega_f}\left(\mathbf{n}\cdot k_f\nabla_\perp' T\right)dP_{\Omega_f}, \tag{26.10}$$
which leads to the Nusselt number, Nu (or the dimensionless heat transfer coefficient)
as
$$Nu_\Omega = \frac{hR_{\Omega_f}}{k_f} = \frac{-1}{\theta_m P_\Omega}\int_{\partial\Omega}(\mathbf{n}\cdot\nabla_\perp\theta)\,dP_\Omega. \tag{26.11}$$
where θm(x) is the dimensionless mixing-cup (mean) temperature,
$$\theta_m(x) = \langle 1,\theta\rangle = \frac{1}{A_\Omega}\int_\Omega u(y,z)\,\theta(x,y,z)\,dy\,dz = \sum_i\exp(-\lambda_i x)\,\langle 1,\psi_i\rangle^2. \tag{26.12}$$
Thus, the Nusselt number can be expressed in terms of eigenvalues and eigen-
functions from equations (26.11)–(26.12) as follows:
$$Nu_\Omega = \frac{\frac{1}{P_\Omega}\sum_i\exp(-\lambda_i x)\langle 1,\psi_i\rangle\int_{\partial\Omega}(-\mathbf{n}\cdot\nabla_\perp\psi_i)\,dP_\Omega}{\sum_i\exp(-\lambda_i x)\langle 1,\psi_i\rangle^2}. \tag{26.13}$$
The above expression can be further simplified using the EVP (equation (26.7)), which gives
$$-\lambda\langle 1,\psi\rangle = \langle 1,L\psi\rangle = \frac{1}{A_\Omega}\int_\Omega\left(\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}\right)dA_\Omega = \frac{1}{A_\Omega}\int_{\partial\Omega}(\mathbf{n}\cdot\nabla_\perp\psi)\,dP_\Omega = \frac{1}{P_\Omega}\int_{\partial\Omega}(\mathbf{n}\cdot\nabla_\perp\psi)\,dP_\Omega \tag{26.14}$$
Note that due to the nondimensionalization of the spatial coordinates with the hydraulic radius, the dimensionless domain Ω has the property AΩ = PΩ. Thus, the general expression for the Nusselt number can be simplified from equations (26.13) and (26.14) to
$$Nu_\Omega = \frac{\sum_i\lambda_i\beta_i\exp(-\lambda_i x)}{\sum_i\beta_i\exp(-\lambda_i x)}, \tag{26.15}$$
where βi = ⟨1, ψi⟩² is the Fourier weight. Equation (26.15) shows that the local Nusselt number can be expressed in terms of the eigenvalues and Fourier weights.
[Remark: In the literature, the Nusselt number Nu is defined using hydraulic di-
ameter dh as the length scale. It is clear that Nu = 4 NuΩ .]
To see that the eigenvalues are positive, note that
$$-\lambda\langle\psi,\psi\rangle = \langle L\psi,\psi\rangle = \frac{1}{A_\Omega}\int_\Omega\psi\,\nabla_\perp^2\psi\,dy\,dz = \frac{1}{A_\Omega}\int_{\partial\Omega}\psi(\mathbf{n}\cdot\nabla_\perp\psi)\,dP_\Omega - \frac{1}{A_\Omega}\int_\Omega(\nabla_\perp\psi\cdot\nabla_\perp\psi)\,dy\,dz\quad\text{(Green's identity)}$$
$$= 0 - \frac{1}{A_\Omega}\int_\Omega(\nabla_\perp\psi\cdot\nabla_\perp\psi)\,dy\,dz$$
$$\Rightarrow\quad \lambda_i = \frac{\int_\Omega(\nabla_\perp\psi_i\cdot\nabla_\perp\psi_i)\,dy\,dz}{\int_\Omega u(y,z)\,\psi_i^2\,dy\,dz} > 0\quad\forall i\quad (\text{since } u \text{ does not change sign})$$
Further, by expanding unity in terms of the eigenfunctions ψi, we get Parseval's relation:
$$\sum_i\beta_i = \sum_i\langle 1,\psi_i\rangle^2 = \langle 1,1\rangle = 1,$$
where the energy content βi of the ith mode is less than unity. Thus, the long-distance asymptote (x ≫ 1) can be obtained from equation (26.15) as follows:
$$Nu_{\Omega\infty} = \lim_{x\to\infty}Nu_\Omega(x) = \lambda_1. \tag{26.16}$$
Thus, only the first eigenvalue determines the dimensionless heat transfer coefficient (NuΩ) at long distances from the inlet.
For x ≪ 1, the temperature changes are confined to a thin layer near the wall. Denoting by ξ the dimensionless distance from the wall, the velocity profile near the wall may be linearized as
$$u(\xi) = u_0\,\xi \tag{26.17}$$
and the governing model (equations (26.5) and (26.6)) simplifies to
$$\frac{1}{u_0\xi}\frac{\partial^2\theta}{\partial\xi^2} = \frac{\partial\theta}{\partial x};\qquad \theta|_{x=0} = 1;\quad \theta|_{\xi=0} = 0;\quad \theta|_{\xi\to\infty}\to 1 \tag{26.18}$$
Introducing the similarity variable
$$z = \frac{\gamma\xi}{x^{1/3}}, \tag{26.19}$$
we get
$$\frac{\partial\theta}{\partial x} = \frac{\partial\theta}{\partial z}\frac{\partial z}{\partial x} = -\frac{\gamma\xi}{3x^{4/3}}\frac{\partial\theta}{\partial z}$$
and
$$\frac{\partial\theta}{\partial\xi} = \frac{\partial\theta}{\partial z}\frac{\partial z}{\partial\xi} = \frac{\gamma}{x^{1/3}}\frac{\partial\theta}{\partial z}\quad\text{or}\quad \frac{\partial^2\theta}{\partial\xi^2} = \frac{\gamma^2}{x^{2/3}}\frac{\partial^2\theta}{\partial z^2},$$
so that equation (26.18) becomes
$$\frac{1}{u_0\xi}\,\frac{\gamma^2}{x^{2/3}}\frac{\partial^2\theta}{\partial z^2} = -\frac{\gamma\xi}{3x^{4/3}}\frac{\partial\theta}{\partial z}$$
$$\frac{\partial^2\theta}{\partial z^2} = -\frac{u_0}{3\gamma^3}z^2\frac{\partial\theta}{\partial z} = -3z^2\frac{\partial\theta}{\partial z}\qquad\text{if } \gamma = \sqrt[3]{\frac{u_0}{9}} \tag{26.20}$$
Integrating equation (26.20) twice with the boundary conditions θ(z = 0) = 0 and θ(z → ∞) → 1 gives
$$\theta(x,\xi) = \theta(z) = \frac{1}{\Gamma(4/3)}\int_0^z\exp(-t^3)\,dt \tag{26.22}$$
$$Nu_{\Omega 0} = \lim_{x\ll 1}Nu_\Omega(x) = \lim_{x\ll 1}\frac{1}{\theta_m}\left.\frac{\partial\theta}{\partial\xi}\right|_{\xi=0} = \left.\frac{\gamma}{x^{1/3}}\frac{\partial\theta}{\partial z}\right|_{z=0} = \frac{\gamma}{\Gamma(4/3)\,x^{1/3}} \tag{26.23}$$
$$\Rightarrow\quad Nu_{\Omega 0} = \frac{\gamma}{\Gamma(4/3)\,x^{1/3}} = \frac{1}{\Gamma(4/3)}\left(\frac{u_0}{9x}\right)^{1/3}. \tag{26.24}$$
The two asymptotes can also be combined to obtain a simpler approximate expression for the Nusselt number, as discussed in Gundlapally and Balakotaiah [19].
For fully developed laminar flow between parallel plates, with y the transverse coordinate scaled by the half-width, the velocity profile is
$$u(y) = \frac{3}{2}(1-y^2) \tag{26.25}$$
and the model (26.5) becomes
$$\frac{3}{2}(1-y^2)\frac{\partial\theta}{\partial x} = \frac{\partial^2\theta}{\partial y^2},\quad x>0 \text{ and } 0<y<1 \tag{26.26}$$
The EVP can be defined for the parallel-plate geometry from equation (26.7) as follows:
$$L\psi = \frac{2}{3(1-y^2)}\frac{\partial^2\psi}{\partial y^2} = -\lambda\psi,\quad 0<y<1; \tag{26.28}$$
$$\psi|_{y=1} = 0;\qquad \left.\frac{\partial\psi}{\partial y}\right|_{y=0} = 0, \tag{26.29}$$
with the inner product
$$\langle\psi_i,\psi_j\rangle = \int_0^1\frac{3}{2}(1-y^2)\,\psi_i\psi_j\,dy = \delta_{ij} \tag{26.30}$$
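The EVP (26.28)–(26.30) is straightforward to solve numerically. The sketch below (illustrative Python/NumPy/SciPy, not from the text) discretizes it with second-order finite differences on a cell-centered grid and recovers the leading eigenvalue and Fourier weight quoted in Table 26.1:

```python
import numpy as np
from scipy.linalg import eigh

N = 1000
h = 1.0 / N
y = (np.arange(N) + 0.5) * h      # cell-centered grid on (0, 1)
w = 1.5 * (1.0 - y**2)            # weight u(y) = (3/2)(1 - y^2)

# -d2/dy2 with symmetry (zero slope) at y = 0 and Dirichlet at y = 1 via ghost cells
A = (np.diag(2.0*np.ones(N)) - np.diag(np.ones(N-1), 1) - np.diag(np.ones(N-1), -1)) / h**2
A[0, 0] = 1.0 / h**2              # ghost cell: psi(-h/2) = psi(h/2)
A[-1, -1] = 3.0 / h**2            # ghost cell: psi interpolates to 0 at the wall

# generalized symmetric EVP: A psi = lambda W psi
lam, V = eigh(A, np.diag(w))
psi1 = V[:, 0] / np.sqrt(h)       # normalize so <psi1, psi1> = h*sum(w*psi1**2) = 1
beta1 = (h * np.sum(w * psi1))**2 # beta_1 = <1, psi1>^2

print(lam[0], beta1)              # ~ 1.8852 and ~ 0.91035, cf. Table 26.1
```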
The first few eigenvalues and Fourier coefficients βi = ⟨1, ψi⟩² are listed in Table 26.1.
The corresponding Graetz eigenfunctions are shown in Figure 21.1.
Table 26.1: First few eigenvalues and Fourier coefficients of the Graetz problem for flow between
parallel plates and constant wall temperature.
i λi βi = ⟨1, ψi ⟩2
1 1.8852 0.91035
2 21.431 0.05314
3 62.317 0.01528
4 124.54 0.00681
5 208.09 0.00374
6 312.98 0.00232
Using these values and equation (26.15), the Nusselt number can be plotted against
the dimensionless axial coordinate x, which is shown in Figure 26.2.
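The curve in Figure 26.2 can be reproduced directly from equation (26.15) using the tabulated eigenvalues and weights (an illustrative Python sketch; with only six terms the series is accurate away from the immediate vicinity of the inlet):

```python
import numpy as np

# first six eigenvalues and Fourier weights from Table 26.1
lam  = np.array([1.8852, 21.431, 62.317, 124.54, 208.09, 312.98])
beta = np.array([0.91035, 0.05314, 0.01528, 0.00681, 0.00374, 0.00232])

def Nu(x):
    """Local Nusselt number from the truncated series (26.15)."""
    e = np.exp(-lam * x)
    return np.sum(lam * beta * e) / np.sum(beta * e)

print(Nu(3.0))   # long-distance asymptote: -> lambda_1 = 1.8852
print(Nu(0.05))  # larger near the inlet, decreasing monotonically with x
```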
It can be seen from this figure that it exhibits the two asymptotes in the limits of x ≫ 1 and x ≪ 1. The long-distance asymptote (x ≫ 1) can be obtained from equation (26.16), which simplifies to
$$Nu_{\Omega\infty} = \lambda_1 = 1.8852. \tag{26.31}$$
Similarly, the short distance asymptote can be obtained from equation (26.15), where
velocity profile is simplified as
$$\lim_{y\to 1}u(y) = \lim_{y\to 1}\frac{3}{2}(1-y^2)\to 3(1-y) = 3\xi\quad\Rightarrow\quad u_0 = 3,$$
Figure 26.2: Local or position dependent Nusselt number (dimensionless heat transfer coefficient)
for fully developed laminar flow through parallel plates for the case of constant wall temperature.
and hence the short distance asymptote of Nusselt number can be given from equation
(26.24) as
$$Nu_{\Omega 0} = \lim_{x\ll 1}Nu_\Omega(x) = \frac{1}{\Gamma(4/3)}\sqrt[3]{\frac{1}{3}}\,x^{-1/3} = 0.7765\,x^{-1/3}. \tag{26.32}$$
For fully developed laminar flow in a circular duct, with y the radial coordinate scaled by the duct radius, the velocity profile is u(y) = 2(1 − y²) and the model becomes
$$2(1-y^2)\frac{\partial\theta}{\partial x} = \frac{1}{y}\frac{\partial}{\partial y}\left(y\frac{\partial\theta}{\partial y}\right),\quad x>0 \text{ and } 0<y<1 \tag{26.34}$$
$$L\psi = \frac{1}{2(1-y^2)}\frac{1}{y}\frac{\partial}{\partial y}\left(y\frac{\partial\psi}{\partial y}\right) = -\lambda\psi,\quad 0<y<1; \tag{26.36}$$
$$\psi|_{y=1} = 0;\qquad \left.\frac{\partial\psi}{\partial y}\right|_{y=0} = 0, \tag{26.37}$$
with the inner product
$$\langle\psi_i,\psi_j\rangle = 4\int_0^1 y\,(1-y^2)\,\psi_i\psi_j\,dy = \delta_{ij}. \tag{26.38}$$
The first six eigenvalues and Fourier coefficients βi = ⟨1, ψi⟩² are listed in Table 26.2.
The corresponding Graetz eigenfunctions are shown in Figure 26.3.
Table 26.2: First few eigenvalues and Fourier coefficients of the Graetz problem for flow through a
circular duct and constant wall temperature.
i λi βi = ⟨1, ψi ⟩2
1 3.65679 0.819050
2 22.3047 0.097527
3 56.9605 0.032504
4 107.620 0.015440
5 174.282 0.008788
6 256.945 0.005584
Figure 26.3: First six eigenfunctions of the Graetz–Nusselt problem for fully developed laminar flow
through a circular duct with constant wall temperature.
Note that in this case, since the duct radius is used to nondimensionalize the transverse coordinate, AΩ ≠ PΩ (AΩ = 1 and PΩ = 2), and hence we rewrite equation (26.14) as
$$-\lambda\langle 1,\psi\rangle = \langle 1,L\psi\rangle = \frac{1}{A_\Omega}\int_{\partial\Omega}(\mathbf{n}\cdot\nabla_\perp\psi)\,dP_\Omega = \frac{2}{P_\Omega}\int_{\partial\Omega}(\mathbf{n}\cdot\nabla_\perp\psi)\,dP_\Omega$$
$$\Rightarrow\quad \frac{1}{P_\Omega}\int_{\partial\Omega}(\mathbf{n}\cdot\nabla_\perp\psi_j)\,dP_\Omega = -\frac{\lambda_j}{2}\langle 1,\psi_j\rangle.$$
This simplifies the Nusselt number based on the hydraulic diameter from equations (26.13) and (26.14) to
$$Nu = \frac{\sum_i\lambda_i\beta_i\exp(-\lambda_i x)}{\sum_i\beta_i\exp(-\lambda_i x)}. \tag{26.39}$$
Figure 26.4: Local or position dependent Nusselt number (dimensionless heat transfer coefficient)
based on hydraulic diameter for fully developed laminar flow through a circular duct for the case of
constant wall temperature.
As expected, it exhibits the two asymptotes in the limits of x ≫ 1 and x ≪ 1. The long-distance asymptote (x ≫ 1) can be obtained from equation (26.39), which simplifies to
$$Nu_\infty = \lambda_1 = 3.6568. \tag{26.40}$$
Similarly, the short-distance asymptote can be obtained from equation (26.15), where the velocity profile is simplified as
$$\lim_{y\to 1}u(y) = \lim_{y\to 1}2(1-y^2)\to 4(1-y) = 4\xi\quad\Rightarrow\quad u_0 = 4,$$
and hence the short-distance asymptote of the Nusselt number can be given from equation (26.24) as
$$Nu_0 = \frac{2}{\Gamma(4/3)}\sqrt[3]{\frac{4}{9}}\,x^{-1/3} = 1.7092\,x^{-1/3}. \tag{26.41}$$
Other duct geometries as well as flux and mixed boundary conditions at the wall can
be analyzed in a similar way.
Problems
1. Consider the Graetz–Nusselt problem in a triangular duct with constant wall tem-
perature boundary condition:
(a) Identify the EVP and obtain the formal solution.
(b) Determine the long and short distance asymptotes for the Nusselt number.
2. Consider the Graetz–Nusselt problem in a circular tube with constant wall flux
boundary condition. Formulate the model equations and identify the EVP. Deter-
mine an expression for the Nusselt number and identify the short and long dis-
tance asymptotes.
3. Repeat problem 2 for parallel plate geometry.
4. The concentration of a reactant in a tubular catalytic reactor in which the flow
is laminar and fully developed is given by (assuming centerline symmetry and
neglecting axial diffusion)
$$2\langle u\rangle\left(1-\frac{r^2}{a^2}\right)\frac{\partial C}{\partial x} = D_m\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial C}{\partial r}\right);\qquad x>0,\ 0<r<a$$
$$\frac{\partial C}{\partial r} = 0\ \text{at}\ r = 0;\qquad -D_m\frac{\partial C}{\partial r} = k_sC\ \text{at}\ r = a$$
C(r, x = 0) = C0
(a) Cast the model equations in dimensionless form and solve using finite Fourier
transform.
(b) Use the solution in (a) to determine the local Sherwood number (or dimen-
sionless mass transfer coefficient) defined by
$$Sh = \left(\frac{2a}{D_m}\right)k_c = \left(\frac{2a}{D_m}\right)\frac{-D_m\frac{\partial C}{\partial r}(r=a)}{C_m-C(r=a)},$$
where Cm is the mixing-cup concentration.
(d) Simplify the result in (b) for the two limiting cases of infinitely fast (ks → ∞)
and slow reaction (ks → 0) and comment on your results.
5. The Graetz–Nusselt formulation assumes that the duct length is much larger than
the hydraulic diameter and neglects the axial diffusion term (or takes axial Peclet
number to be infinity). At the other extreme of the large hydraulic diameter com-
pared to length for heat/mass transfer or reaction, for which the axial Peclet num-
ber goes to zero, the appropriate model is the so-called “short tube model.” For
the case of fixed temperature boundary condition, this model may be expressed
as
$$k_f\nabla_\perp^2T = \rho_fC_{pf}\,u_f(y',z')\left(\frac{T-T_{in}}{L}\right),\quad (y',z')\in\Omega$$
$$T|_{\partial\Omega} = T_w$$
27 Friction factors for steady-state laminar flow in ducts

For steady, fully developed laminar flow in a duct of arbitrary cross-section, the axial momentum balance reduces to
$$\mu\nabla'^2u_x = \left(\frac{\Delta P}{L}\right)\ \text{in}\ \Omega';\qquad u_x = 0\ \text{on}\ \partial\Omega'\ \text{(no-slip boundary condition)} \tag{27.1}$$
Here, μ is the fluid viscosity; ux is the axial component of the velocity; −ΔP = P1 −P2 > 0
is the pressure drop; P1 and P2 are pressures at the entrance (x ′ = 0) and the exit
(x ′ = L), respectively; L is the length of the duct; Ω′ and 𝜕Ω′ are the transverse (cross-
sectional) domain and its boundary.
Defining the effective transverse length (i.e., hydraulic radius) RΩ and the dimensionless transverse coordinates and velocity as
$$R_\Omega = \frac{A_{\Omega'}}{P_{\Omega'}};\qquad (y,z) = \frac{(y',z')}{R_\Omega};\qquad U = \frac{u_x}{u^*},\quad u^* = \frac{R_\Omega^2}{\mu}\left(\frac{-\Delta P}{L}\right),$$
the momentum balance (equation (27.1)) and no-slip wall boundary condition can be expressed in dimensionless form as follows:
$$\nabla^2U = -1\ \text{in}\ \Omega;\qquad U = 0\ \text{on}\ \partial\Omega \tag{27.4}$$
https://doi.org/10.1515/9783110739701-029
696 | 27 Friction factors for steady-state laminar flow in ducts
$$\tau_wP_{\Omega'}L = (-\Delta P)A_{\Omega'}\quad\Rightarrow\quad \tau_w = \frac{(-\Delta P)}{L}R_\Omega. \tag{27.5}$$
Here, τw is the average wall shear stress. The friction factor f (or the dimensionless
pressure drop) is defined by
$$f = \frac{\tau_w}{\frac{1}{2}\rho\langle u_x\rangle^2}, \tag{27.6}$$
where ρ is the fluid density and ⟨ux ⟩ is the average velocity. Defining the Reynolds
number Re as
$$Re = \frac{(4R_\Omega)\langle u_x\rangle\rho}{\mu}, \tag{27.7}$$
the dimensionless pressure drop can be expressed as
$$f\,Re = \frac{8}{\langle U\rangle}. \tag{27.8}$$
Thus, to obtain the relationship between pressure drop (or f ) and flow rate (or ⟨U⟩),
we need to solve the Poisson equation (27.4) to determine U(y, z) and then ⟨U⟩ can be
determined by
$$\langle U\rangle = \frac{1}{A_\Omega}\int_\Omega U(y,z)\,dy\,dz. \tag{27.9}$$
Equation (27.8) gives the required relationship. Equation (27.4) can be solved either by
direct method (e. g., 1D boundary value problem for symmetric geometries) or by finite
Fourier transform.
FFT method
Let λi be the eigenvalue and ψi be the eigenfunction of the EVP:
Lψ = ∇2 ψ = −λψ in Ω; ψ = 0 on 𝜕Ω (27.10)
with the inner product
$$\langle\psi_i,\psi_j\rangle = \frac{1}{A_\Omega}\int_\Omega\psi_i\psi_j\,dy\,dz = \delta_{ij} \tag{27.11}$$
Taking the inner product of equation (27.4) with ψi gives
$$\langle U,\psi_i\rangle = \frac{1}{\lambda_i}\langle 1,\psi_i\rangle, \tag{27.12}$$
$$U(y,z) = \sum_i\langle U,\psi_i\rangle\psi_i(y,z) = \sum_i\frac{1}{\lambda_i}\langle 1,\psi_i\rangle\psi_i(y,z), \tag{27.13}$$
$$\langle U\rangle = \langle 1,U\rangle = \sum_i\frac{1}{\lambda_i}\langle 1,\psi_i\rangle^2 = \sum_i\frac{\beta_i}{\lambda_i};\qquad \beta_i = \langle 1,\psi_i\rangle^2 \tag{27.14}$$
where βi are the Fourier weights, which satisfy Parseval's relation ∑i βi = 1. Thus, the friction factor can be expressed in terms of the eigenvalues of the Laplacian operator (with Dirichlet boundary condition) and the Fourier weights as follows:
$$f\,Re = \frac{8}{\langle U\rangle} = \frac{8}{\sum_i\frac{1}{\lambda_i}\langle 1,\psi_i\rangle^2} = \frac{8}{\sum_i\frac{\beta_i}{\lambda_i}}. \tag{27.15}$$
For flow between parallel plates (with y scaled by the half-width), equation (27.4) reduces to
$$\frac{d^2U}{dy^2} = -1;\qquad U(y = \pm 1) = 0 \tag{27.16}$$
The direct solution of equation (27.16) can be obtained by integrating it twice, which with the Dirichlet boundary condition leads to
$$U(y) = \frac{1}{2}(1-y^2) \tag{27.17}$$
$$\Rightarrow\quad \langle U\rangle = \int_0^1U(y)\,dy = \int_0^1\frac{1}{2}(1-y^2)\,dy = \frac{1}{3} \tag{27.18}$$
$$\Rightarrow\quad f\,Re = \frac{8}{\langle U\rangle} = 24 \tag{27.19}$$
For the parallel-plate case, the EVP (27.10) becomes
$$L\psi = \frac{d^2\psi}{dy^2} = -\lambda\psi,\qquad \psi(y = \pm 1) = 0, \tag{27.20}$$
with the inner product
$$\langle\psi_i,\psi_j\rangle = \frac{1}{2}\int_{-1}^{1}\psi_i\psi_j\,dy = \delta_{ij} \tag{27.21}$$
The eigenvalues and normalized eigenfunctions are
$$\lambda_i = \frac{(2i-1)^2\pi^2}{4},\qquad \psi_i = \sqrt{2}\cos\left[(2i-1)\frac{\pi y}{2}\right],\quad i = 1,2,3,\ldots \tag{27.22}$$
so that
$$\langle 1,\psi_i\rangle = \frac{2\sqrt{2}\,(-1)^{i-1}}{(2i-1)\pi}, \tag{27.23}$$
$$\beta_i = \langle 1,\psi_i\rangle^2 = \frac{8}{(2i-1)^2\pi^2} \tag{27.24}$$
Thus, the formal solution can be expressed in terms of eigenvalues and eigenfunctions
from equation (27.13) as
$$U(y) = \sum_i\frac{1}{\lambda_i}\langle 1,\psi_i\rangle\psi_i(y) = \sum_{i=1}^\infty\frac{16\,(-1)^{i-1}}{(2i-1)^3\pi^3}\cos\left[(2i-1)\frac{\pi y}{2}\right]. \tag{27.25}$$
$$\langle U\rangle = \sum_i\frac{\beta_i}{\lambda_i} = \sum_{i=1}^\infty\frac{32}{\pi^4(2i-1)^4} = \frac{32}{\pi^4}\cdot\frac{\pi^4}{96} = \frac{1}{3}, \tag{27.26}$$
which matches the result from direct solution (equation (27.18)). Similarly, the friction
factor can be obtained from equation (27.8) or (27.15) as
27.3 Specific case: elliptical ducts | 699
$$f\,Re = \frac{8}{\langle U\rangle} = 24 \tag{27.27}$$
Consider a duct of elliptical cross-section:
$$\Omega'\equiv \frac{y'^2}{a^2}+\frac{z'^2}{b^2}-1 < 0;\qquad -a\le y'\le a,\ -b\le z'\le b;\qquad \sigma = \frac{a}{b} \tag{27.28}$$
where a and b are the lengths of semimajor and semiminor axes, and σ is the aspect
ratio of the ellipse (σ = 1 for a circle). The momentum equation (27.1) can be solved in
many ways. Here, we utilize a single trial function that vanishes at the boundary 𝜕Ω′
and express the flow rate as follows:
$$u_x = \beta\left(1-\frac{y'^2}{a^2}-\frac{z'^2}{b^2}\right) \tag{27.29}$$
Substituting into equation (27.1),
$$\mu\nabla'^2u_x = \left(\frac{\Delta P}{L}\right)\quad\Rightarrow\quad \left(\frac{\Delta P}{\mu L}\right) = -2\beta\left(\frac{1}{a^2}+\frac{1}{b^2}\right) \tag{27.30}$$
$$u^* = R_\Omega^2\left(\frac{-\Delta P}{\mu L}\right) = 2\beta R_\Omega^2\left(\frac{1}{a^2}+\frac{1}{b^2}\right). \tag{27.31}$$
The average velocity is
$$\langle u_x\rangle = \frac{1}{A_{\Omega'}}\int_{\Omega'}u_x(y',z')\,dy'dz' = \frac{4\beta}{\pi ab}\int_0^a\left[\int_0^{b\sqrt{1-y'^2/a^2}}\left(1-\frac{y'^2}{a^2}-\frac{z'^2}{b^2}\right)dz'\right]dy'$$
$$= \frac{4\beta}{\pi ab}\,\frac{2b}{3}\int_0^a\left(1-\frac{y'^2}{a^2}\right)^{3/2}dy' = \frac{8\beta}{3\pi}\int_0^{\pi/2}\cos^4\theta\,d\theta\quad(\text{taking } y' = a\sin\theta)$$
$$= \frac{8\beta}{3\pi}\cdot\frac{3\pi}{16} = \frac{\beta}{2} \tag{27.32}$$
Thus, the friction factor can be expressed from equations (27.8), (27.31) and (27.32) as
$$f\,Re = \frac{8}{\langle U\rangle} = \frac{8u^*}{\langle u_x\rangle} = 32R_\Omega^2\left(\frac{1}{a^2}+\frac{1}{b^2}\right) \tag{27.33}$$
where RΩ = AΩ′/PΩ′ is the hydraulic radius of the duct. The cross-section area of the duct is AΩ′ = πab, while the wetted perimeter PΩ′ is the boundary integral over ∂Ω′ ≡ {y′²/a² + z′²/b² = 1}. Taking y′ = a sin θ and z′ = b cos θ gives PΩ′ = 4aE(σ), so that
$$R_\Omega = \frac{A_{\Omega'}}{P_{\Omega'}} = \frac{\pi ab}{4aE(\sigma)} = \frac{\pi b}{4E(\sigma)};\qquad E(\sigma) = \int_0^{\pi/2}\sqrt{1-\left(1-\frac{1}{\sigma^2}\right)\sin^2\theta}\,d\theta. \tag{27.34}$$
Thus, equation (27.33) can be further simplified to express the friction factor as
$$f\,Re = 32\frac{R_\Omega^2}{b^2}\left(1+\frac{1}{\sigma^2}\right) = 2\left(1+\frac{1}{\sigma^2}\right)\left[\frac{\pi}{E(\sigma)}\right]^2. \tag{27.35}$$
Figure 27.2 plots the friction factor for elliptical ducts against the aspect ratio, and a few of these values are listed in Table 27.1. Note from Figure 27.2 that the friction-factor curve is symmetric around σ = 1 on a log-linear plot. This is because aspect ratios σ and σ⁻¹ correspond to the same cross-sectional shape of the duct, and hence the same friction factor. It can also be seen from equation (27.35) that in the limit of σ → ∞, E(σ) → ∫₀^{π/2} cos θ dθ = 1, and hence limσ→∞ f Re = 2π² = 19.74. Since f Re is symmetric around σ = 1 (on a log-linear plot), limσ→0 f Re = 2π² as well; equation (27.35) then implies that in the limit σ → 0, E(σ)² → (2π²/f Re)(1 + 1/σ²) → 1/σ², i.e., E(σ) → 1/σ.

Table 27.1: Friction factor and E-function for elliptical ducts of aspect ratio σ.

σ        E(σ)            f Re
σ ≪ 1    1/σ             2π² = 19.74
1.0      π/2 = 1.571     16.0

Figure 27.2: Friction factor and E-function for elliptical ducts against the aspect ratio σ.
Problems
1. Consider the problem of laminar flow in a duct of equilateral triangular cross-section with side a and height H = √3a/2 as shown in Figure 27.3:
(a) Show that the velocity profile
$$u_z(x',y') = c_0\left(\frac{y'}{H}-1\right)\left(\frac{y'^2-3x'^2}{H^2}\right)$$
satisfies the no-slip condition, and hence represents a possible velocity profile. Determine the constant c0 so that equation (27.1) is satisfied.
2. For fully developed laminar flow in a rectangular duct, the friction factor is given by
$$f\,Re = \frac{24}{(1+\lambda)^2\left[1-\frac{192\lambda}{\pi^5}\sum_{k=1}^{\infty}\frac{\tanh\left(\frac{(2k-1)\pi}{2\lambda}\right)}{(2k-1)^5}\right]}$$
where λ(= a/b) is the aspect ratio. Use the above formula to evaluate and plot f Re as a function of λ for 1 ≤ λ ≤ ∞. Determine the numerical value of f Re for a square duct.
28 Multicomponent diffusion and reaction
Problems of multicomponent diffusion and reaction are of fundamental importance
in chemical engineering. They arise in the design of catalysts and catalytic reactors,
adsorption and separation processes as well as many other applications. We have al-
ready illustrated the application of linear analysis to these problems in Sections 4.5.6,
5.3 and Chapters 23 and 25. In this chapter, we show some further application to deter-
mine catalyst effectiveness factors and calculation of effluent concentrations in mono-
lith reactors.
In the next section, we examine the problem of catalyst effectiveness factor. We
also introduce the concept of internal mass transfer coefficient and its calculation for
an arbitrary geometry. This is followed by a discussion of the multicomponent case
and illustration of the calculations.
Assuming a single-step reaction A → B with linear kinetics, the steady-state reactant concentration profile C(x′, y′, z′) satisfies the following diffusion–reaction problem:
$$D_e\nabla'^2C = k_0\,a(x',y',z')\,C\ \text{in}\ \Omega';\qquad C = C_0\ \text{on}\ \partial\Omega', \tag{28.1}$$
where De is the effective diffusivity of the reactant in the porous particle; k0 is the first-order rate constant (based on unit volume); and a(x′, y′, z′) is the normalized activity profile, i.e.,
$$\frac{1}{V_{\Omega'}}\int_{\Omega'}a(x',y',z')\,dV_{\Omega'} = 1 \tag{28.2}$$
https://doi.org/10.1515/9783110739701-030
28.1 Generalized effectiveness factor problem | 705
Here, VΩ′ is the volume of the catalyst. Let SΩ′ be the external surface area of the catalyst particle; then the effective diffusion length RΩ, Thiele modulus ϕ and other quantities can be defined as
$$R_\Omega = \frac{V_{\Omega'}}{S_{\Omega'}};\quad (x,y,z) = \frac{(x',y',z')}{R_\Omega};\quad c(x,y,z) = \frac{C(x',y',z')}{C_0};\quad \phi^2 = \frac{R_\Omega^2k_0}{D_e};\quad g(x,y,z) = a(R_\Omega x,R_\Omega y,R_\Omega z). \tag{28.3}$$
Thus, the model equation (28.1) can be expressed in dimensionless form as follows:
$$\nabla^2c = \phi^2g(x,y,z)\,c\ \text{in}\ \Omega;\qquad c = 1\ \text{on}\ \partial\Omega, \tag{28.4}$$
with
$$\frac{1}{V_\Omega}\int_\Omega g(x,y,z)\,d\Omega = \frac{1}{V_\Omega}\int_\Omega g(x,y,z)\,dx\,dy\,dz = 1, \tag{28.5}$$
where VΩ is the volume of particle in dimensionless units (i. e., of scaled domain). For
the case of uniform catalyst activity, g(x, y, z) = 1. In the general case, g > 0.
To solve equation (28.4), we consider the EVP:
$$L\psi = \frac{1}{g(x,y,z)}\nabla^2\psi = -\lambda\psi\ \text{in}\ \Omega;\qquad \psi = 0\ \text{on}\ \partial\Omega \tag{28.6}$$
with the inner product
$$\langle\psi_i,\psi_j\rangle = \frac{1}{V_\Omega}\int_\Omega g(x,y,z)\,\psi_i(x,y,z)\,\psi_j(x,y,z)\,dx\,dy\,dz = \delta_{ij} \tag{28.7}$$
As shown in earlier chapters, the eigenvalues λj are real and positive. In addition, Parseval's relation suggests that the normalized eigenfunctions and Fourier weights satisfy
$$\sum_i\beta_i = \sum_i\langle 1,\psi_i\rangle^2 = \langle 1,1\rangle = 1. \tag{28.8}$$
Defining u = 1 − c, equation (28.4) becomes
$$Lu = \frac{1}{g(x,y,z)}\nabla^2u = \phi^2(u-1)\ \text{in}\ \Omega;\qquad u = 1-c;\qquad u = 0\ \text{on}\ \partial\Omega \tag{28.9}$$
Taking inner products with ψi,
$$\langle u,\psi_i\rangle = \frac{\phi^2\langle 1,\psi_i\rangle}{\phi^2+\lambda_i}$$
$$\Rightarrow\quad u = \sum_i\langle u,\psi_i\rangle\psi_i = \sum_i\frac{\phi^2\langle 1,\psi_i\rangle}{\phi^2+\lambda_i}\psi_i$$
$$\Rightarrow\quad c = 1-u = 1-\sum_i\frac{\phi^2\langle 1,\psi_i\rangle}{\phi^2+\lambda_i}\psi_i(x,y,z) \tag{28.10}$$
The effectiveness factor can be determined using the concentration profile. The effectiveness factor η is defined by
$$\eta = \frac{1}{V_\Omega}\int_\Omega g(x,y,z)\,c(x,y,z)\,dx\,dy\,dz \tag{28.11}$$
$$\Rightarrow\quad \eta = \langle 1,c\rangle = 1-\sum_i\frac{\phi^2\beta_i}{\phi^2+\lambda_i} = \sum_i\frac{\lambda_i\beta_i}{\phi^2+\lambda_i} \tag{28.12}$$
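As a consistency check, for a slab (with RΩ the half-thickness) the spectrum λi = (2i−1)²π²/4, βi = 8/((2i−1)²π²) from Chapter 27 should make the series (28.12) reproduce the classical slab result η = tanh ϕ/ϕ (an illustrative Python/NumPy sketch, not from the text):

```python
import numpy as np

i = np.arange(1, 200001)
n = 2*i - 1
lam  = n**2 * np.pi**2 / 4.0     # slab eigenvalues
beta = 8.0 / (n**2 * np.pi**2)   # slab Fourier weights

def eta_series(phi):
    """Effectiveness factor from the eigenvalue series, equation (28.12)."""
    return np.sum(lam * beta / (phi**2 + lam))

for phi in (0.1, 1.0, 5.0):
    print(phi, eta_series(phi), np.tanh(phi)/phi)   # the two columns agree
```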
where βi are the Fourier weights defined in equation (28.8). The above expression for the effectiveness factor can be expanded in a power series of the Thiele modulus ϕ² as
$$\eta = 1+\sum_i\sum_{j=1}^\infty(-1)^j\frac{\beta_i}{\lambda_i^j}\left(\phi^2\right)^j, \tag{28.13}$$
where the coefficients in the power series are termed Aris numbers (Balakotaiah [5]):
$$\eta = 1+\sum_{j=1}^\infty(-1)^jAr_j\left(\phi^2\right)^j;\qquad Ar_j = \sum_i\frac{\beta_i}{\lambda_i^j} \tag{28.14}$$
The Aris numbers depend only on the geometric shape of the catalyst particle. These can also be obtained by using a perturbation method, expanding the concentration in powers of ϕ² as
$$c = \sum_{j=0}^\infty(-1)^jc_j(x,y,z)\left(\phi^2\right)^j; \tag{28.15}$$
where
Equation (28.12) gives the effectiveness factor for a catalyst particle (or layer) in terms
of eigenvalues, Fourier weights and ϕ2 .
The internal mass-transfer coefficient kci for species exchange between the boundary and interior of the catalyst particle can be defined as follows:
$$k_{ci} = \frac{\frac{1}{S_{\Omega'}}\int_{\partial\Omega'}D_e\,\mathbf{n}\cdot\nabla C\,dS_{\Omega'}}{C_0-C_m} \tag{28.17}$$
where
$$C_m = \frac{1}{V_{\Omega'}}\int_{\Omega'}a(x',y',z')\,C(x',y',z')\,dV_{\Omega'}. \tag{28.18}$$
Noting that
$$\int_{\Omega'}\nabla\cdot(\nabla C)\,dV_{\Omega'} = \int_{\partial\Omega'}\mathbf{n}\cdot\nabla C\,dS_{\Omega'} = \frac{k_0V_{\Omega'}C_m}{D_e}, \tag{28.19}$$
the internal Sherwood number Shi (or dimensionless mass transfer coefficient) can be expressed as
$$Sh_i = \frac{C_m}{C_0-C_m}\,\frac{k_0R_\Omega^2}{D_e} = \frac{\eta\phi^2}{1-\eta} = \frac{\sum_{i=1}^\infty\frac{\lambda_i\beta_i}{\phi^2+\lambda_i}}{\sum_{i=1}^\infty\frac{\beta_i}{\phi^2+\lambda_i}} \tag{28.20}$$
or
$$\eta = \frac{1}{1+\dfrac{\phi^2}{Sh_i}}. \tag{28.21}$$
The exact expressions for the internal Sherwood number for a slab, an infinite cylinder and a sphere with uniform activity can be obtained as (see Chapters 23 and 25)
$$Sh_i = \left(\frac{1}{\phi\tanh\phi}-\frac{1}{\phi^2}\right)^{-1}\qquad\text{(slab)} \tag{28.22}$$
$$Sh_i = \frac{\phi^2I_1(2\phi)}{\phi I_0(2\phi)-I_1(2\phi)}\qquad\text{(infinite cylinder)} \tag{28.23}$$
$$Sh_i = \frac{3\phi^3\coth(3\phi)-\phi^2}{3\phi^2-3\phi\coth(3\phi)+1}\qquad\text{(sphere)}. \tag{28.24}$$
3ϕ2 − 3ϕ coth(3ϕ) + 1
Several limiting cases of equations (28.12) and (28.20) for the general case are of inter-
est. For ϕ2 → 0, we have
and
−1
1 ∞
β
Shi ≜ Shi∞ = = (∑ i ) . (28.26)
Ar1 λ
i=1 i
For ϕ² ≫ 1, the sum defining η may be replaced by an integral. In this limit, it can be shown that η → 1/ϕ while Shi → ϕ. Using these limits, Shi for any ϕ may be approximated by
where the constants Shi∞ and Λ∗ depend only on the geometric shape of the cata-
lyst particle. These constants can be related to Arj (j = 1, 2, . . .) or λi and βi . Values of
these constants for the case of uniform activity and various common geometries can
be found in the literature (Tu et al. [30]; Sarkar et al. [27]). A plot of Shi for infinite slab
(Shi∞ = 3 and Λ∗ = 0.2), infinite cylinder (Shi∞ = 2 and Λ∗ = 0.33) and sphere (Shi∞ = 5/3 and Λ∗ = 0.43) is shown in Figure 28.2. The shape of the Shi versus ϕ curve for any
arbitrary geometry is similar to the curves shown in Figure 28.2.
28.2 Multicomponent diffusion and reaction in the washcoat layer of a monolith reactor | 709
Figure 28.2: Internal Sherwood number versus Thiele modulus for slab, infinite cylinder and a
sphere.
Consider R reactions among S species Aj:
$$\sum_{j=1}^{S}\nu_{ij}A_j = 0;\qquad i = 1,2,\ldots,R \tag{28.28}$$
Let ri be the rate of reaction i; then the steady-state species balance for Aj gives
$$D_{ej}\frac{d^2c_{wj}}{dy^2}+\sum_{i=1}^{R}\nu_{ij}r_i(c_w) = 0 \tag{28.29}$$
In vector form,
$$D_e\frac{d^2c_w}{dy^2}+\nu^Tr(c_w) = 0;\qquad c_w = c_s\ @\ y = 0;\qquad \frac{dc_w}{dy} = 0\ @\ y = \delta_c, \tag{28.32}$$
For linear kinetics,
$$r(c_w) = \widehat Kc_w,\qquad \widehat K = R\times S\ \text{matrix of rate constants}\quad\Rightarrow\quad \nu^Tr(c_w) = -K_vc_w,$$
where Kv = S × S matrix of effective rate constants (for species consumption). Equation (28.32) may be written as
$$D_e\frac{d^2c_w}{dy^2} = K_vc_w;\qquad 0<y<\delta_c.$$
Defining
$$\xi = \frac{y}{\delta_c},\qquad \Phi_w^2 = \delta_c^2D_e^{-1}K_v$$
$$\Rightarrow\quad \frac{d^2c_w}{d\xi^2} = \Phi_w^2c_w,\quad 0<\xi<1;\qquad \frac{dc_w}{d\xi} = 0\ @\ \xi = 1;\qquad c_w = c_s\ @\ \xi = 0. \tag{28.33}$$
Here, Φw is the Thiele matrix, which may be shown to have real and nonnegative
eigenvalues for the case of monomolecular kinetics (Wei and Prater [31]).
Equation (28.33) may be solved using the standard matrix methods (as discussed in Chapter 5):
$$c_w(\xi) = \cosh[\Phi_w(1-\xi)]\,(\cosh\Phi_w)^{-1}c_s \tag{28.34}$$
The observed (washcoat-averaged) rate is then
$$r_{obs} = \int_0^1K_vc_w(\xi)\,d\xi = K_v(\tanh\Phi_w)\Phi_w^{-1}c_s = K_v^*c_s \tag{28.35}$$
while we have, in the two limits,
$$K_v^* \approx K_v,\qquad \|\Phi_w\|\ll 1, \tag{28.36}$$
$$K_v^* = \frac{1}{\delta_c}K_v\left(D_e^{-1}K_v\right)^{-1/2},\qquad \|\Phi_w\|\gg 1. \tag{28.37}$$
The internal mass transfer coefficient matrix kci for the washcoat is defined via
$$k_{ci}(c_s-\bar c_w) = -D_e\left.\frac{\partial c_w}{\partial y}\right|_{y=0} = j_{f-wc} \tag{28.38}$$
where the average concentration vector in the washcoat c̄w and the species flux vector at the fluid–washcoat interface jf−wc are given by
$$\bar c_w = \int_0^1c_w(\xi)\,d\xi = \Phi_w^{-1}(\tanh\Phi_w)\,c_s, \tag{28.39}$$
$$j_{f-wc} = \frac{D_e}{\delta_c}\Phi_w(\tanh\Phi_w)\,c_s. \tag{28.40}$$
Defining
$$Sh_i = \delta_cD_e^{-1}k_{ci}, \tag{28.41}$$
we obtain
$$Sh_i = \left[\Phi_w^{-1}(\tanh\Phi_w)^{-1}-\Phi_w^{-2}\right]^{-1}, \tag{28.42}$$
which is a generalization of the expression for the scalar case (slab or parallel-plate geometry).
[Remark: For an arbitrarily shaped catalyst particle, the internal Sherwood number matrix can be shown to take a form that generalizes the scalar case (equation (28.27)). Here, the scalars Shi∞ and Λ∗ depend only on the shape/geometry of the washcoat/catalyst.]
The kci or Shi matrix can be calculated using the spectral theorem or the Cayley–
Hamilton theorem for calculating functions of a matrix. A sample calculation is shown
in the next section.
28.3 Isothermal monolith reactor model for multiple reactions

Considering the isothermal case, the steady-state reactor model that couples convection in the channel to transverse diffusion and reaction in the catalyst layer can be expressed as
$$u\frac{dc_f}{dx} = -a_vk_{ce}(c_f-c_s);\qquad c_f = c_{f,in}\ @\ x = 0 \tag{28.44}$$
$$j_{f-wc} = k_{ce}(c_f-c_s) = k_{ci}(c_s-\bar c_w) \tag{28.45}$$
$$= k_{c0}(c_f-\bar c_w) = \delta_cR_v(\bar c_w) \tag{28.46}$$
where
$$k_{c0}^{-1} = k_{ce}^{-1}+k_{ci}^{-1}, \tag{28.47}$$
is the overall mass transfer coefficient matrix, kce and kci are the external and internal
mass transfer coefficient matrices, respectively. For linear kinetics,
Rv (cw ) = Kv cw (28.48)
and we can write equation (28.44) with z = x/L as
$$\frac{dc_f}{dz} = -\frac{\tau}{R_\Omega}\delta_cK_v(k_{c0}+\delta_cK_v)^{-1}k_{c0}\,c_f = -Da\,c_f; \tag{28.49}$$
$$c_f = c_{f,in}\ @\ z = 0. \tag{28.50}$$
Here, L is the channel length and τ = L/ū is the space time. Thus, the exit concentration vector is given by
$$c_{fe} = \exp(-Da)\,c_{f,in}. \tag{28.51}$$
Here, Da is the Damköhler matrix, and it can be computed from Kv and kc0 as
$$Da = \frac{\varepsilon_f\tau}{R_\Omega}\delta_cK_v(k_{c0}+\delta_cK_v)^{-1}k_{c0} \tag{28.52}$$
with
$$k_{ce} = \frac{D_m}{4R_\Omega}Sh_e, \tag{28.53}$$
$$k_{ci} = \delta_cK_v\left[\Phi_w(\tanh\Phi_w)^{-1}-I\right]^{-1}. \tag{28.54}$$
Note that in dilute mixture, She may be assumed to be a diagonal matrix of external
Sherwood numbers.
Using these expressions, we can determine the impact of external mass trans-
fer as well as that of pore diffusion on the yield (or selectivity) of intermediate prod-
ucts.
Consider the reaction scheme shown in Figure 28.4 among four components A, B, C
and D, with rate constants as shown.
For simplicity, assume that the diffusivities of all species (in gas phase and washcoat)
are equal.
$$\Rightarrow\quad K_{ce} = k_{ce}I,\qquad k_{ce} = \frac{D_m}{R_\Omega}Sh_{e\Omega}; \tag{28.55}$$
$$k_{ci} = \frac{D_e}{\delta_c}Sh_i;\qquad K_v = kA,\qquad A = \begin{pmatrix}1 & -\tfrac{1}{2} & 0 & 0\\[2pt] -1 & 1 & -\tfrac{1}{4} & 0\\[2pt] 0 & -\tfrac{1}{2} & \tfrac{1}{2} & -\tfrac{1}{8}\\[2pt] 0 & 0 & -\tfrac{1}{4} & \tfrac{1}{8}\end{pmatrix}, \tag{28.56}$$
so that
$$Da = \frac{\delta_ck\tau}{R_\Omega}A(k_{c0}+\delta_ckA)^{-1}k_{c0} \tag{28.57}$$
with
$$k_{c0}^{-1} = \frac{1}{k_{ce}}I+\frac{\delta_c}{D_e}Sh_i^{-1} = \frac{R_\Omega}{D_m}Sh_{e\Omega}^{-1}I+\frac{\delta_c}{D_e}Sh_i^{-1}$$
$$\Rightarrow\quad \frac{\tau}{R_\Omega}k_{c0} = \frac{D_m\tau}{R_\Omega^2}\left[Sh_{e\Omega}^{-1}I+\mu\,Sh_i^{-1}\right]^{-1};\qquad \mu = \frac{D_m}{D_e}\frac{\delta_c}{R_\Omega}, \tag{28.58}$$
where μ is the ratio of the diffusion velocity in the fluid to that in the washcoat. This expresses the Damköhler matrix (equation (28.57)) as
$$Da = A\left(\frac{1}{k\delta_c}k_{c0}+A\right)^{-1}\frac{D_m\tau}{R_\Omega^2}\left(Sh_{e\Omega}^{-1}I+\mu\,Sh_i^{-1}\right)^{-1}$$
$$= A\left(\frac{D_m}{k\delta_cR_\Omega}\left[Sh_{e\Omega}^{-1}I+\mu\,Sh_i^{-1}\right]^{-1}+A\right)^{-1}\frac{D_m\tau}{R_\Omega^2}\left(Sh_{e\Omega}^{-1}I+\mu\,Sh_i^{-1}\right)^{-1}$$
$$= \frac{D_m\tau}{R_\Omega^2}A\left\{\frac{D_m}{k\delta_cR_\Omega}I+\left(Sh_{e\Omega}^{-1}I+\mu\,Sh_i^{-1}\right)A\right\}^{-1}$$
$$= \left[\frac{R_\Omega}{k\tau\delta_c}A^{-1}+\frac{R_\Omega^2}{D_m\tau}\left(Sh_{e\Omega}^{-1}I+\mu\,Sh_i^{-1}\right)\right]^{-1}. \tag{28.59}$$
Limiting cases
1. No external and internal resistances to mass transfer:
In this case, the Damköhler matrix (equation (28.59)) reduces to
$$Da = \frac{k\tau\delta_c}{R_\Omega}A \tag{28.60}$$
2. No internal resistance to mass transfer: in this case,
$$k_{c0} = k_{ce} = \frac{D_m}{R_\Omega}Sh_{e\Omega}I$$
and
$$Da = \left\{\frac{R_\Omega}{\delta_ck\tau}A^{-1}+\frac{R_\Omega^2}{D_m\tau}\frac{1}{Sh_{e\Omega}}I\right\}^{-1}. \tag{28.61}$$
Note that in the limit of fast kinetics, k → ∞ (or ϕ2 ≫ 1), the Damköhler matrix
simplifies to diagonal form
28.3 Isothermal monolith reactor model for multiple reactions | 715
$$Da = \frac{D_m\tau}{R_\Omega^2}Sh_{e\Omega}I, \tag{28.62}$$
3. No external resistance to mass transfer: in this case,
$$k_{c0} = k_{ci} = \frac{D_e}{\delta_c}Sh_i$$
and
$$Da = \left[\frac{R_\Omega}{k\tau\delta_c}A^{-1}+\frac{R_\Omega^2\mu}{D_m\tau}Sh_i^{-1}\right]^{-1} = \frac{\delta_ck\tau}{R_\Omega}\left(A^{-1}+\phi^2Sh_i^{-1}\right)^{-1}, \tag{28.63}$$
where ϕ² = kδc²/De is the Thiele modulus based on the washcoat thickness.
RΩ
If Shi = Shi∞ I (i. e., asymptotic approximation), the Damköhler matrix reduces
(equation (28.63)) to
δc kτ ϕ2
−1
Da = A(I + A) .
RΩ Shi∞
For ϕ² ≫ 1 (i.e., at high temperature or fast kinetics), the asymptotic approximation of the internal Sherwood number reduces the above expression for the Damköhler matrix to the diagonal form
$$Da = \frac{D_e\tau}{R_\Omega\delta_c}Sh_{i\infty}I,$$
When both external and internal resistances are present (with Shi ≈ Shi∞I),
$$k_{c0} = \left(\frac{R_\Omega}{D_mSh_{e\Omega}}+\frac{\delta_c}{D_eSh_{i\infty}}\right)^{-1}I = \frac{D_m}{R_\Omega}\left(\frac{1}{Sh_{e\Omega}}+\frac{\mu}{Sh_{i\infty}}\right)^{-1}I$$
and
$$Da = \left[\frac{R_\Omega}{k\tau\delta_c}A^{-1}+\frac{R_\Omega^2}{D_m\tau}\left(\frac{1}{Sh_{e\Omega}}+\frac{\mu}{Sh_{i\infty}}\right)I\right]^{-1}. \tag{28.64}$$
If the first term is negligible or k → ∞, the Damköhler matrix can be further simplified from equation (28.64) to the diagonal form
$$Da = \frac{D_m\tau}{R_\Omega^2}\left(\frac{1}{Sh_{e\Omega}}+\frac{\mu}{Sh_{i\infty}}\right)^{-1}I,$$
Finally, when pore diffusion controls (‖Φw‖ ≫ 1 with negligible external resistance),
$$Sh_i \to \Phi_w = \phi\sqrt{A} = \delta_c\sqrt{\frac{k}{D_e}}\,\sqrt{A},$$
which is the fast-kinetics asymptote. Thus, in this limit, the Damköhler matrix (equation (28.59)) reduces to
$$Da = \frac{D_e\tau}{\delta_cR_\Omega}Sh_i = \frac{D_e\tau}{\delta_cR_\Omega}\phi\sqrt{A} = \frac{\sqrt{kD_e}}{R_\Omega}\tau\sqrt{A}. \tag{28.65}$$
Note that for the relative reaction rate constant matrix shown in Figure 28.4, the matrix A is given in equation (28.56). The positive square root of the matrix A (when it has all nonnegative eigenvalues) can be obtained either using the spectral method or the Cayley–Hamilton theorem (see Chapter 5). Thus, while the original reaction rate constant matrix is Kv = kA, the observed (pore-diffusion disguised) reaction rate constant matrix becomes Kobs = (√(kDe)/RΩ)√A. Figure 28.5 shows the effect of pore diffusion on the reaction network, suggesting the total number of observed reactions to be 12 in contrast to the 6 original reactions (i.e., 6 new reactions appear: three reversible reactions between species A and D, A and C, and B and D).
In this case, the exit concentration vector can be expressed from equations (28.51)–(28.54) as
$$c_{fe} = \exp\left[-\frac{\sqrt{kD_e}}{R_\Omega}\tau\sqrt{A}\right]c_{f,in}. \tag{28.67}$$
Taking the temperature dependence of the rate constant and the other parameter values as
$$k = 10^{12}\exp\left(\frac{-12000}{T}\right)\,\text{s}^{-1}\ (T\ \text{in K});\qquad R_\Omega = 100\ \text{µm};\quad D_e = 10^{-7}\ \text{m}^2/\text{s};\quad \tau = 1\ \text{ms},$$
the exit concentration of each of the four components is calculated (using equa-
tion (28.51)) and plotted against temperature in Figure 28.6.
As expected, the concentration of A decreases and the concentration of D increases monotonically, while the concentrations of the intermediate components can vary nonmonotonically with temperature and may exhibit maxima. In this specific case, the concentration of component C exhibits a maximum near T = 915.6 K.
The above calculations may be repeated for the case of negligible pore diffusional re-
sistance or external mass transfer to show that the presence of either external or in-
ternal mass transfer reduces the yield (or maximum attainable concentration) of in-
termediate products.
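The curves in Figure 28.6 can be regenerated from equation (28.67) with the stated parameters (illustrative Python/SciPy sketch; the pure-A feed is an assumption, as the text does not specify cf,in). Two exact properties serve as checks: the total concentration is conserved (the columns of A sum to zero), and at high temperature the composition approaches the equilibrium vector (1, 2, 4, 8)/15 of the reversible chain. Scanning c_exit over T also locates the interior maximum of component C that the text reports near 915.6 K.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[ 1.0, -0.5 ,  0.0 ,  0.0  ],
              [-1.0,  1.0 , -0.25,  0.0  ],
              [ 0.0, -0.5 ,  0.5 , -0.125],
              [ 0.0,  0.0 , -0.25,  0.125]])
w, V = np.linalg.eig(A)
sqrtA = (V @ np.diag(np.sqrt(np.clip(w.real, 0.0, None))) @ np.linalg.inv(V)).real

R_omega, De, tau = 100e-6, 1e-7, 1e-3     # m, m^2/s, s
cf_in = np.array([1.0, 0.0, 0.0, 0.0])    # assumed pure-A feed (normalized)

def c_exit(T):
    k = 1e12 * np.exp(-12000.0 / T)       # s^-1
    s = np.sqrt(k * De) * tau / R_omega   # scalar multiplying sqrt(A) in (28.67)
    return expm(-s * sqrtA) @ cf_in

print(c_exit(915.6))    # components sum to 1 (conservation)
print(c_exit(2000.0))   # -> equilibrium composition (1, 2, 4, 8)/15
```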
Figure 28.6: Exit concentrations of the four components in pore diffusion controlled limit.
Problems
1. Consider the three regular geometries of a sphere of diameter 2a, cylinder of height
and diameter 2a and a cube of side 2a.
(a) Show that the effective diffusion length (RΩ ) is the same for all three geome-
tries
(b) Formulate the diffusion–reaction problem with linear kinetics and uniform
activity for the three geometries and obtain the solution.
(c) Determine and plot the effectiveness factor and comment on the shape of the
curves.
(d) Determine and plot the internal Sherwood number for the three cases.
2. Consider a porous catalyst particle in the form of a hollow cylinder of length 2L,
inside radius μa (0 ≤ μ < 1) and outside radius a as shown in Figure 28.7.
(a) Formulate the diffusion–reaction problem for the case of Dirichlet BC at the
interior and exterior surface.
(b) Solve the model for the case of linear kinetics and determine an expression
for the effectiveness factor η.
(c) Determine the internal Sherwood number Shi and the various limits of η and
Shi for μ → 0, μ → 1, aL → ∞ and aL → 0.
3. The concentration of a reactant in a cylindrical pore with a first-order wall reaction satisfies
$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial C}{\partial r}\right)+\frac{\partial^2C}{\partial x^2} = 0;\qquad 0<r<R,\ 0<x<L,$$
$$\frac{\partial C}{\partial x} = 0\ @\ x = 0;\quad C = C_0\ @\ x = L;\quad C = \text{finite}\ @\ r = 0;\quad -D\frac{\partial C}{\partial r} = kC\ @\ r = R.$$
Here, R is the radius of pore, L is the length, k is the reaction rate constant, D is the
molecular diffusivity of the reactant and C0 is the concentration of the reactant in
the gas at the pore mouth.
(a) Cast the equations in dimensionless form and obtain a formal solution to the
concentration profile
(b) The quantity of interest is the pore effectiveness factor defined by
$$\eta = \frac{D}{2\pi RLkC_0}\int_0^R\frac{\partial C}{\partial x}(L,r)\,2\pi r\,dr.$$
4. Consider the multicomponent diffusion–reaction problem
$$D\frac{d^2c}{dx^2} = r(c),\qquad 0<x<L,$$
$$c = c_s\ @\ x = L;\qquad \frac{dc}{dx} = 0\ @\ x = 0.$$
Show that linearization of the rate vector about c = cs leads to
$$\frac{d^2u}{dx^2} = \Phi^2u-\alpha,\qquad u'(0) = 0,\quad u(1) = 0,$$
where u, Φ² and α are suitably defined,
and J is the Jacobian of the rate vector r(c) evaluated at c = cs . Solve the linearized
problem and show that the flux vector
$$j_s = D\left.\frac{dc}{dx}\right|_{x=L}$$
may be approximated by
29 Packed-bed chromatography

Figure 29.1: Schematic of a packed-bed of porous particles for adsorption of solute from a fluid.
In developing a model, the following assumptions are made: (i) plug flow of fluid,
(ii) uniform and constant transport properties, (iii) small particles (i. e., no internal
gradients), (iv) dilute solution and (v) negligible axial dispersion in fluid phase. The
species (solute) balance can be expressed in fluid phase for the small volume element
(between x and x + Δx) as follows:
$$\frac{\partial}{\partial t}[A_c\,\Delta x\,\varepsilon\,c_f]=A_c\varepsilon u_0 c_f|_x-A_c\varepsilon u_0 c_f|_{x+\Delta x}-A_c\,\Delta x\,a_v k_c(c_f-c_{fi})\qquad(29.1)$$
⇒
$$\varepsilon\frac{\partial c_f}{\partial t}=-\varepsilon u_0\frac{\partial c_f}{\partial x}-k_c a_v(c_f-c_{fi})\qquad(29.2)$$
where ε is the bed porosity, Ac is the bed cross-section area, u0 is fluid interstitial veloc-
ity, cf is the solute concentration, kc is the mass transfer coefficient, av is the fluid-solid
interfacial area per unit bed volume, cfi is the solute concentration in fluid phase at
https://doi.org/10.1515/9783110739701-031
722 | 29 Packed-bed chromatography
the solid-fluid interface and cs is the solute concentration in the solid phase. Similarly,
the solid phase species balance can be expressed as
$$\frac{\partial}{\partial t}[A_c\,\Delta x\,(1-\varepsilon)c_s]=A_c\,\Delta x\,a_v k_c(c_f-c_{fi})\qquad(29.3)$$
⇒
$$(1-\varepsilon)\frac{\partial c_s}{\partial t}=k_c a_v(c_f-c_{fi}).\qquad(29.4)$$
The above model (equations (29.2) and (29.4)) is not closed until cfi is represented in
terms of cs or cf . For this, we can utilize the adsorption isotherm as described below.
where cs0 is the saturation concentration in the solid phase (adsorption capacity of
the solid). Defining the fractional adsorbed concentration θ as
$$\theta=\frac{c_s}{c_{s0}}=\text{fraction of adsorbed sites},\qquad(29.6)$$
$$r_d=k_d c_s=k_d c_{s0}\theta\qquad(29.8)$$
or
$$\frac{\theta}{1-\theta}=\frac{k_a}{k_d}c_{fi}=K_{eq}c_{fi};\qquad K_{eq}=\frac{k_a}{k_d}=\text{adsorption equilibrium constant}\qquad(29.9)$$
⇒
29.1 Model formulation | 723
$$\theta=\frac{c_s}{c_{s0}}=\frac{K_{eq}c_{fi}}{1+K_{eq}c_{fi}}\qquad(29.10)$$
Equation (29.10) gives the so-called Langmuir isotherm, which is also plotted in Fig-
ure 29.2.
Figure 29.2: Langmuir isotherm demonstrating the linear regime (Keq cfi ≪ 1) and saturation regime
(Keq cfi ≫ 1).
which shows that the isotherm can be linearized in this regime (see Figure 29.2). Here,
K is the dimensionless adsorption equilibrium constant. In this linear regime, the
model (equations (29.2) and (29.4)) becomes closed and linear, and can be expressed
as follows:
$$\varepsilon\left(\frac{\partial c_f}{\partial t}+u_0\frac{\partial c_f}{\partial x}\right)=-k_c a_v(c_f-c_{fi})\qquad(29.12)$$
$$(1-\varepsilon)\frac{\partial c_s}{\partial t}=k_c a_v(c_f-c_{fi})\qquad(29.13)$$
$$c_s=Kc_{fi}\qquad(29.14)$$
For consistency, we should have initial conditions related as cs0 (x) = Kcf 0 (x). For a
column that is free from adsorbate (solute) initially, we can take cf 0 (x) = cs0 (x) = 0.
Below we consider only this specific case.
$$z=\frac{x}{L};\qquad \tau=\frac{u_0 t}{L};\qquad L=\text{column length}\qquad(29.17)$$
$$\alpha=\frac{K(1-\varepsilon)}{\varepsilon}=\text{capacitance ratio}\qquad(29.18)$$
$$p=\frac{\varepsilon u_0}{k_c a_v L}=\frac{1/(k_c a_v)}{L/(\varepsilon u_0)}=\text{local (or transverse) Peclet number}\qquad(29.19)$$
where 1/(kc av ) = tm represents the characteristic time for external mass-transfer; εu0 =
⟨u⟩ is the superficial velocity and L/⟨u⟩ = L/(εu0 ) = tc represents the convection (or
space) time; the local Peclet number p is the ratio of external mass-transfer time to
space time. Thus, the model (equations (29.12)–(29.16)) can be expressed in nondi-
mensional form as follows:
$$p\left(\frac{\partial c_f}{\partial\tau}+\frac{\partial c_f}{\partial z}\right)=-(c_f-c_{fi})\qquad(29.20)$$
$$\alpha p\frac{\partial c_{fi}}{\partial\tau}=(c_f-c_{fi}),\qquad \tau>0,\quad 0<z<1\qquad(29.21)$$
Note that cs is eliminated by using the local equilibrium relation equation (29.14). Sim-
ilarly, cfi can be eliminated from equation (29.20) and can be expressed in terms of cf
as
$$c_{fi}=c_f+p\left(\frac{\partial c_f}{\partial\tau}+\frac{\partial c_f}{\partial z}\right)\qquad(29.24)$$
⇒
$$\frac{\partial c_f}{\partial\tau}+\frac{1}{1+\alpha}\frac{\partial c_f}{\partial z}+\frac{\alpha p}{1+\alpha}\frac{\partial^2 c_f}{\partial\tau\partial z}+\frac{\alpha p}{1+\alpha}\frac{\partial^2 c_f}{\partial\tau^2}=0\qquad(29.25)$$
cf (z, τ = 0) = 0; cf (z = 0, τ) = cin (τ) (29.26)
The above model (equations (29.25)–(29.26)) is a single second-order PDE (of hyper-
bolic type) for cf (z, τ). The exit concentration, i. e., cf (z = 1, τ), can be solved and
plotted as a function of time for any given input (or initial condition). The response
to a unit step input is referred to as the breakthrough curve while response to a Dirac
delta function (or pulse) input is referred to as the dispersion curve.
In the limiting case of p → 0, the external mass transfer resistance is neglected and
cfi = cf . In this limit, the model equations (29.25)–(29.26) reduces to a single hyperbolic
equation (plug flow) as
$$\frac{\partial c_f}{\partial\tau}+\frac{1}{1+\alpha}\frac{\partial c_f}{\partial z}=0,\qquad \tau>0,\quad 0<z<1\qquad(29.27)$$
cf (z, τ = 0) = 0; cf (z = 0, τ) = cin (τ) (29.28)
$$\frac{d\hat c_f}{dz}=-(1+\alpha)s\,\hat c_f;\qquad \hat c_f(z=0)=\hat c_{in}(s)$$
⇒
$$c_{in}(\tau)=H(\tau)=\begin{cases}1,&\tau>0\\0,&\tau<0,\end{cases}\qquad(29.32)$$
$$c_f(z,\tau)=H(\tau-(1+\alpha)z)=\begin{cases}1,&\tau>(1+\alpha)z\\0,&\tau<(1+\alpha)z\end{cases}\qquad(29.33)$$
Thus, the step input or discontinuity moves with a dimensionless speed $\frac{dz}{d\tau}=\frac{1}{1+\alpha}$, or
the adsorption front moves with a velocity
$$\frac{dx}{dt}=\frac{u_0}{1+\frac{K(1-\varepsilon)}{\varepsilon}}.\qquad(29.34)$$
This can also be seen from Figure 29.3, which shows the profiles of the unit-step input and
the concentration. However, for finite but small values of p, the front velocity remains
the same but the front is not sharp due to dispersion or mass transfer effects.
Figure 29.3: Solution profile with unit step input for packed-bed chromatography in the plug flow
regime.
show below, when mass transfer and dispersion are present, the breakthrough curves
(and separation of solutes) are not sharp.
$$p_h\left(\frac{\partial\theta_f}{\partial\tau}+\frac{\partial\theta_f}{\partial z}\right)=-(\theta_f-\theta_s);\qquad(29.35)$$
$$\alpha_h p_h\frac{\partial\theta_s}{\partial\tau}=(\theta_f-\theta_s),\qquad \tau>0,\quad 0<z<1\qquad(29.36)$$
$$\theta_f(z,0)=0;\qquad \theta_s(z,0)=0\qquad(29.37)$$
Here, the local Peclet number ph is the ratio of heat transfer time th to the space time tc .
We note that the model (equations (29.35)–(29.38)) for heat transfer in packed-bed is
identical to that for chromatography in packed-bed, where θf , θs , αh , ph and θin can be
replaced by cf , cfi , α, p and cin , respectively.
$$\frac{\partial c_f}{\partial\tau}\approx-\frac{1}{1+\alpha}\frac{\partial c_f}{\partial z}+O(p).\qquad(29.39)$$
Using the above approximation in equation (29.25), we get
$$(1+\alpha)\frac{\partial c_f}{\partial\tau}+\frac{\partial c_f}{\partial z}+\alpha p\left(\frac{\partial^2 c_f}{\partial\tau\partial z}-\frac{1}{1+\alpha}\frac{\partial^2 c_f}{\partial\tau\partial z}+O(p)\right)=0$$
⇒
$$(1+\alpha)\frac{\partial c_f}{\partial\tau}+\frac{\partial c_f}{\partial z}+\frac{\alpha^2 p}{1+\alpha}\frac{\partial^2 c_f}{\partial\tau\partial z}=0+O(p^2),\qquad(29.40)$$
which is the hyperbolic form of the model. The same approximation (equation (29.39))
can be used again in equation (29.40) that leads to
$$(1+\alpha)\frac{\partial c_f}{\partial\tau}+\frac{\partial c_f}{\partial z}+\frac{\alpha^2 p}{1+\alpha}\left(\frac{-1}{1+\alpha}\frac{\partial^2 c_f}{\partial z^2}+O(p)\right)=0+O(p^2)$$
⇒
$$(1+\alpha)\frac{\partial c_f}{\partial\tau}+\frac{\partial c_f}{\partial z}-\frac{\alpha^2 p}{(1+\alpha)^2}\frac{\partial^2 c_f}{\partial z^2}=0+O(p^2)$$
⇒
$$\frac{\partial c_f}{\partial\tau}+\frac{1}{1+\alpha}\frac{\partial c_f}{\partial z}-\frac{\alpha^2 p}{(1+\alpha)^3}\frac{\partial^2 c_f}{\partial z^2}=0+O(p^2)\qquad(29.41)$$
which is the parabolic form of the model, where the dimensionless dispersion coefficient is $\frac{\alpha^2 p}{(1+\alpha)^3}$.
The model equation can also be expressed in interfacial concentration mode (cfi )
for chromatography or solid temperature (θs ) for heat transfer. For this, let us con-
sider the two-mode model (equations (29.20)–(29.23)). We can use equation (29.21) to
express cf in terms of cfi as
$$c_f=c_{fi}+\alpha p\frac{\partial c_{fi}}{\partial\tau}\qquad(29.42)$$
which is the same equation as that satisfied by $c_f$ as given in equation (29.25). Therefore, using equation (29.11), $c_{fi}=c_s/K$ can be substituted in the above equation (29.43), which leads to an equation in the solid-phase concentration $c_s$:
$$\frac{\partial c_s}{\partial\tau}+\frac{1}{1+\alpha}\frac{\partial c_s}{\partial z}+\frac{\alpha p}{1+\alpha}\frac{\partial^2 c_s}{\partial\tau\partial z}+\frac{\alpha p}{1+\alpha}\frac{\partial^2 c_s}{\partial\tau^2}=0\qquad(29.44)$$
Thus, all concentration modes cf , cfi and cs satisfy the same equation (see equations
(29.25), (29.43) and (29.44)).
Thus, in the limit of p → 0, the front velocity is the same in solid and fluid phases.
In addition, dispersion of the front in the two phases is also the same. In general, this
is not true when axial conduction or intraparticle gradients are included.
[Remark: Though the differential equations satisfied by cf (z, τ), cfi (z, τ) and cs (z, τ)
are the same, the initial and inlet conditions are different for these variables.]
29.4 Solution of the hyperbolic model by Laplace transform | 729
$$(1+\alpha)\frac{\partial c_m}{\partial\tau}+\frac{\partial c_m}{\partial z}+\alpha p\frac{\partial^2 c_m}{\partial\tau\partial z}+\alpha p\frac{\partial^2 c_m}{\partial\tau^2}=0\qquad(29.46)$$
For small values of p, the above hyperbolic model can be expressed in parabolic form
as
$$\frac{\partial c_m}{\partial\tau}+\frac{1}{1+\alpha}\frac{\partial c_m}{\partial z}-\frac{\alpha^2 p}{(1+\alpha)^3}\frac{\partial^2 c_m}{\partial z^2}=0$$
⇒ in dimensional form as
$$\frac{\partial c_m}{\partial t}+\frac{u_0}{1+\alpha}\frac{\partial c_m}{\partial x}-\frac{\alpha^2}{(1+\alpha)^3}\,u_0^2\,\frac{\varepsilon}{k_c a_v}\frac{\partial^2 c_m}{\partial x^2}=0$$
⇒
$$\frac{\partial c_m}{\partial t}+u_{\text{eff}}\frac{\partial c_m}{\partial x}-D_{\text{eff}}\frac{\partial^2 c_m}{\partial x^2}=0;\qquad(29.47)$$
$$u_{\text{eff}}=\frac{u_0}{1+\alpha};\qquad D_{\text{eff}}=\frac{\varepsilon u_0^2}{k_c a_v}\frac{\alpha^2}{(1+\alpha)^3}\qquad(29.48)$$
where ueff and Deff are the effective velocity and effective dispersion coefficient, re-
spectively. This leads to the effective axial Peclet number Peeff as
$$\alpha p s\,\hat c_{fi}=(\hat c_f-\hat c_{fi})\;\Rightarrow\;\hat c_{fi}(z,s)=\frac{\hat c_f(z,s)}{1+\alpha p s}\qquad(29.50)$$
$$\frac{d\hat c_f}{dz}=-s\hat c_f-\frac{\alpha s}{1+\alpha p s}\hat c_f;\qquad \hat c_f(z=0,s)=\hat c_{in}(s)$$
⇒
$$\hat c_f(z,s)=\exp\left[-sz-\frac{\alpha s z}{1+\alpha p s}\right]\hat c_{in}(s).\qquad(29.51)$$
$$E(\tau)=\begin{cases}\exp\left[-\frac{1}{p}-\frac{\tau-1}{\alpha p}\right]\left[\sqrt{\frac{1}{\alpha p^2(\tau-1)}}\,I_1\!\left(2\sqrt{\frac{\tau-1}{\alpha p^2}}\right)+\delta(\tau-1)\right],&\tau>1\\[4pt] 0,&\tau<1\end{cases}\qquad(29.52)$$
$$E(\tau)\big|_{\tau\gg1+\alpha p^2}=\frac{1}{\sqrt{4\pi p}}\,\frac{1}{[\alpha(\tau-1)^3]^{1/4}}\exp\left[-\frac{1}{p}-\frac{\tau-1}{\alpha p}+\frac{2}{p}\sqrt{\frac{\tau-1}{\alpha}}\right]$$
$$=\left(\frac{1}{16\pi^2p^2\alpha(\tau-1)^3}\right)^{1/4}\exp\left[-\frac{(\sqrt{\tau-1}-\sqrt{\alpha})^2}{\alpha p}\right].\qquad(29.53)$$
The above expression can be simplified to evaluate the E-value at the effective resi-
dence time τ = 1 + α as
$$E(\tau)\big|_{\tau=1+\alpha}=\frac{1}{\alpha\sqrt{4\pi p}}.\qquad(29.54)$$
Figure 29.4 shows the dispersion curves E(τ) for specific value of α = 5 and
p-values from 0.01 to 0.1. It can be seen from this figure that for small p-values (i. e.,
p → 0), the dispersion curve is symmetric around τ = 1 + α with maximum value given
in equation (29.54). For large p-values, the dispersion curve is asymmetric and may
have long tail.
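The closed-form dispersion curve (29.52) and its peak value (29.54) can be cross-checked numerically. A small sketch (Python/SciPy assumed; not from the book) evaluates the regular part of E(τ) at the effective residence time τ = 1 + α, using the exponentially scaled Bessel function to avoid overflow at small p:

```python
# A numerical check (Python/SciPy assumed; not from the book) of the dispersion
# curve (29.52): evaluate the regular part of E(tau) at tau = 1 + alpha and
# compare with the peak value 1/(alpha*sqrt(4*pi*p)) from (29.54).
import numpy as np
from scipy.special import i1e  # exponentially scaled I1, avoids overflow

def E_curve(tau, alpha, p):
    """Regular part of E(tau) for tau > 1 (delta contribution omitted)."""
    x = 2.0 * np.sqrt((tau - 1.0) / (alpha * p**2))   # Bessel argument
    pref = np.sqrt(1.0 / (alpha * p**2 * (tau - 1.0)))
    # combine exp(-1/p - (tau-1)/(alpha*p)) with the e^{x} hidden in i1e
    return pref * i1e(x) * np.exp(x - 1.0/p - (tau - 1.0)/(alpha*p))

alpha, p = 5.0, 0.01
peak_exact = E_curve(1.0 + alpha, alpha, p)
peak_asymp = 1.0 / (alpha * np.sqrt(4.0 * np.pi * p))
print(peak_exact, peak_asymp)   # agree to O(p)
```

The two values agree to a fraction of a percent at p = 0.01, consistent with the asymptotic expansion used to obtain (29.53)–(29.54).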
For a unit-step input, $c_{in}(\tau)=1$ (or $\hat c_{in}(s)=\frac{1}{s}$), the breakthrough curve is given by
$$F(\tau)=\begin{cases}0,&\tau<1\\ c_f^*(\tau,1),&\tau>1\end{cases}\qquad(29.55)$$
τ−1
1 [ exp[− αp ].I0 (2√ αp2 )
(τ−1)
The breakthrough curve (i. e., exit concentration versus time plot) at various times is
shown in Figure 29.5 for α = 5 and p-values in the range of 0.01 to 0.2. Here, break-
29.5 Chromatography model with dispersion in fluid phase | 731
Figure 29.4: Dispersion curves E(τ) for packed-bed chromatography for α = 5 and p-values varying
from 0.01 to 0.1.
Figure 29.5: Breakthrough curve for packed-bed chromatography for various transverse Peclet num-
ber p.
through occurs at τ = 1 + α = 6 (as expected), where all the curves (for small p) pass
through the point (6, 0.5). However, as p increases, the breakthrough curves are not
symmetric.
$$p\left(\frac{\partial c_f}{\partial\tau}+\frac{\partial c_f}{\partial z}\right)=-(c_f-c_{fi})+\frac{p}{Pe_{mf}}\frac{\partial^2 c_f}{\partial z^2}\qquad(29.57)$$
$$\alpha p\frac{\partial c_{fi}}{\partial\tau}=(c_f-c_{fi}),\qquad\tau>0,\quad 0<z<1\qquad(29.58)$$
𝜕τ
inlet condition:
$$\frac{1}{Pe_{mf}}\frac{\partial c_f}{\partial z}=c_f-c_{in}(\tau)\ @\ z=0\qquad(29.60)$$
$$\frac{\partial c_f}{\partial z}=0\ @\ z=1.\qquad(29.61)$$
Here, we have three dimensionless parameters: α, p and Pemf. The new parameter is
the axial Peclet number $Pe_{mf}=\frac{u_0L}{D_{xe}}$, where $D_{xe}$ is the effective axial dispersion coefficient.
1. Pemf → ∞ (i. e., negligible dispersion in fluid phase). In this case, the model
reduces to hyperbolic form and is discussed earlier.
2. Pemf → 0 (i. e., fluid phase is well mixed or no axial gradient in fluid or solid
phase). In this case, the model reduces to a set of ODEs:
$$p\frac{dc_f}{d\tau}=-(c_f-c_{fi})+p[c_{in}(\tau)-c_f]\qquad(29.62)$$
$$\alpha p\frac{dc_{fi}}{d\tau}=(c_f-c_{fi}),\qquad\tau>0,\qquad(29.63)$$
cf = cfi = 0 @ τ = 0. (29.64)
The lumped model (equations (29.62)–(29.64)) in the limit of p → 0 (i. e., neglecting
difference between cf and cfi ) reduces to
$$p\frac{dc_f}{d\tau}+\alpha p\frac{dc_f}{d\tau}=p[c_{in}(\tau)-c_f];\qquad c_f(\tau=0)=0$$
⇒
$$(1+\alpha)\frac{dc_f}{d\tau}=c_{in}(\tau)-c_f,\quad\tau>0;\qquad c_f=0\ @\ \tau=0\qquad(29.65)$$
LT (τ → s) gives
$$\hat c_f=\frac{\hat c_{in}(s)}{(1+\alpha)[s+\frac{1}{1+\alpha}]}\qquad(29.66)$$
For a unit step input, $\hat c_{in}(s)=\frac{1}{s}$, we get
$$\hat c_f=\frac{1}{(1+\alpha)s[s+\frac{1}{1+\alpha}]}=\frac{1}{s}-\frac{1}{s+\frac{1}{1+\alpha}}\qquad(29.67)$$
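Inverting (29.67) gives the first-order step response cf(τ) = 1 − exp(−τ/(1+α)). A short sketch (Python/SciPy assumed; not from the book) cross-checks this inverse by integrating the lumped ODE (29.65) directly:

```python
# Sketch (Python/SciPy assumed): inverting (29.67) gives
# cf(tau) = 1 - exp(-tau/(1+alpha)); here we cross-check by integrating the
# lumped ODE (29.65) numerically for a unit step input.
import numpy as np
from scipy.integrate import solve_ivp

alpha = 5.0
rhs = lambda tau, cf: (1.0 - cf) / (1.0 + alpha)   # (29.65) with cin = 1
sol = solve_ivp(rhs, [0.0, 20.0], [0.0], dense_output=True,
                rtol=1e-8, atol=1e-10)

tau = np.linspace(0.0, 20.0, 50)
cf_numeric = sol.sol(tau)[0]
cf_exact = 1.0 - np.exp(-tau / (1.0 + alpha))      # inverse LT of (29.67)
print(np.max(np.abs(cf_numeric - cf_exact)))       # small
```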
Consider the lumped model for p > 0 as given in equations (29.62)–(29.64). Taking LT
gives
$$\hat c_f=\frac{(1+\alpha p s)\,\hat c_{in}(s)}{[1+\alpha s+\alpha p s+s+\alpha p s^2]}.\qquad(29.71)$$
Thus, the breakthrough curves can be obtained by considering a unit step input
($\hat c_{in}(s)=\frac{1}{s}$) as
$$\hat F(s)=\frac{(1+\alpha p s)}{s[1+\alpha s+\alpha p s+s+\alpha p s^2]},\qquad(29.72)$$
while the dispersion curve can be obtained by considering a unit impulse input
($\hat c_{in}(s)=1$) as
$$\hat E(s)=\frac{(1+\alpha p s)}{[1+\alpha s+\alpha p s+s+\alpha p s^2]}.\qquad(29.73)$$
Note that the LT of the dispersion curve (equation (29.73)) has two poles for p ≠ 0 given
by
$$s_1,s_2=\frac{-(1+\alpha+\alpha p)\pm\sqrt{\triangle}}{2\alpha p};\qquad(29.74)$$
$$\triangle=(1+\alpha+\alpha p)^2-4\alpha p=(1+\alpha)^2+\alpha^2p^2+2\alpha(\alpha-1)p.\qquad(29.75)$$
It can be shown that △ > 0 for all α > 0 and p > 0, and s1 , s2 are always real and
negative.
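This claim is easy to verify numerically. A brief sketch (Python/NumPy assumed; not from the book) checks the discriminant (29.75) and the poles (29.74) over a range of parameters, together with the root product used later in the residue (29.77):

```python
# Quick numerical confirmation (Python/NumPy assumed) that the poles (29.74)
# of E_hat(s) are real and negative, for sample parameter values.
import numpy as np

def poles(alpha, p):
    """Roots of alpha*p*s^2 + (1+alpha+alpha*p)*s + 1 = 0, cf. (29.74)."""
    disc = (1 + alpha + alpha*p)**2 - 4*alpha*p          # (29.75)
    assert disc > 0
    s1 = (-(1 + alpha + alpha*p) + np.sqrt(disc)) / (2*alpha*p)
    s2 = (-(1 + alpha + alpha*p) - np.sqrt(disc)) / (2*alpha*p)
    return s1, s2

for alpha in (0.5, 5.0, 50.0):
    for p in (0.01, 0.1, 1.0):
        s1, s2 = poles(alpha, p)
        assert s1 < 0 and s2 < 0                         # always stable
        # product of roots = 1/(alpha*p), used in residue (29.77)
        assert abs(s1*s2 - 1/(alpha*p)) < 1e-9 / (alpha*p)
print("all pole checks passed")
```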
Write
$$\hat F(s)=\frac{1}{s}\hat E(s)=\frac{1+\alpha p s}{s\,\alpha p\,(s-s_1)(s-s_2)}=\frac{s+\frac{1}{\alpha p}}{s(s-s_1)(s-s_2)}.\qquad(29.76)$$
⇒
$$\text{Residue } e^{s\tau}\hat F(s)\big|_{s=0}=\frac{1}{\alpha p\,s_1 s_2}=1\qquad(29.77)$$
$$\text{Residue } e^{s\tau}\hat F(s)\big|_{s=s_1}=\frac{e^{s_1\tau}\left(s_1+\frac{1}{\alpha p}\right)}{s_1(s_1-s_2)}=\beta_1 e^{s_1\tau}\ \text{(let)}\qquad(29.78)$$
$$\text{Residue } e^{s\tau}\hat F(s)\big|_{s=s_2}=\frac{e^{s_2\tau}\left(s_2+\frac{1}{\alpha p}\right)}{s_2(s_2-s_1)}=\beta_2 e^{s_2\tau}\ \text{(let)}\qquad(29.79)$$
⇒
where β1 and β2 are given by equations (29.78) and (29.79). An example of the break-
through curve is plotted in Figure 29.6 corresponding to p = 0.01 and α = 5.0.
Figure 29.6: A plot of the breakthrough curve from lumped model with α = 5 and p = 0.01.
29.5.4 Chromatography model with dispersion in fluid phase for unit impulse input
Consider the chromatography model with dispersion in fluid phase (equations (29.57)–
(29.60)) with unit impulse input cin (τ) = δ(τ). LT of equations (29.57)–(29.60) gives
$$p\left(s\hat c_f+\frac{d\hat c_f}{dz}\right)=-(\hat c_f-\hat c_{fi})+\frac{p}{Pe_{mf}}\frac{d^2\hat c_f}{dz^2}\qquad(29.80)$$
$$\alpha p s\,\hat c_{fi}=(\hat c_f-\hat c_{fi})\;\Rightarrow\;\hat c_{fi}=\frac{\hat c_f}{1+\alpha p s}\qquad(29.81)$$
⇒
$$\frac{1}{Pe_{mf}}\frac{d^2\hat c_f}{dz^2}-\frac{d\hat c_f}{dz}-s\hat c_f-\frac{\alpha s}{1+\alpha p s}\hat c_f=0,\qquad(29.82)$$
$$\frac{1}{Pe_{mf}}\frac{d\hat c_f}{dz}=\hat c_f-1\ @\ z=0;\qquad \frac{d\hat c_f}{dz}=0\ @\ z=1.\qquad(29.83)$$
Note that for p = 0 or α = 0, the above model reduces to the axial dispersion model, while
in the limit of Pemf → ∞, it reduces to the hyperbolic model. Here, we consider the
mixed case where α, p and Pemf are finite. In this case, the LT of the dispersion curve
can be expressed by solving equations (29.82)–(29.83) as
$$\hat E(s)=\hat c_f(z=1,s)=\frac{4q\,e^{\frac{Pe_{mf}}{2}}}{\left[(1+q)^2e^{\frac{Pe_{mf}q}{2}}-(1-q)^2e^{-\frac{Pe_{mf}q}{2}}\right]}\qquad(29.84)$$
where
$$q=\sqrt{1+\frac{4s}{Pe_{mf}}\left(1+\frac{\alpha}{1+\alpha p s}\right)}.\qquad(29.85)$$
The above equations (29.84)–(29.85) can be numerically inverted. Figure 29.7 shows
a plot of dispersion curves from numerical LT inversion for α = 5, p = 0.01 and for
various values of Pemf in the range of 1 to 1000.
The LT solution (equations (29.84)–(29.85)) can also be expanded as a power series
in s to obtain the temporal moments without Laplace inversion. Expanding Ê(s) gives
$$\hat E(s)=1-(1+\alpha)s+\frac{s^2}{2}M_2+O(s^3)\qquad(29.86)$$
Figure 29.7: Dispersion curves for chromatography model with dispersion in fluid phase for α = 5,
p = 0.01 and various values of Pemf in the range of 1 to 1000.
where
$$M_2=1+2\alpha+\alpha^2+2p\alpha^2+\frac{(1+\alpha)^2}{Pe_{mf}^2}\left[2Pe_{mf}+2e^{-Pe_{mf}}-2\right]\qquad(29.87)$$
$$M_0=1;\qquad M_1=(1+\alpha)\qquad(29.88)$$
$$\sigma_\theta^2=\frac{M_2-M_1^2}{M_1^2}=\frac{2p\alpha^2}{(1+\alpha)^2}+\left[\frac{2}{Pe_{mf}}-\frac{2}{Pe_{mf}^2}\left(1-e^{-Pe_{mf}}\right)\right],\qquad(29.89)$$
where the first term (in equation (29.89)) is the dispersion contribution due to external
mass transfer and the second term (in square brackets) is the dispersion contribution due
to mixing in the fluid phase.
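The moment expansion can be checked without performing any symbolic algebra. A sketch (Python/NumPy assumed; not from the book) differentiates Ê(s) from (29.84)–(29.85) near s = 0 by central differences and compares the result with M1 = 1 + α and the variance formula (29.89):

```python
# Numerical cross-check (Python/NumPy assumed) of the moment expansion
# (29.86)-(29.89): differentiate E_hat(s) from (29.84)-(29.85) at s -> 0 by
# central differences and compare with M1 = 1 + alpha and the variance (29.89).
import numpy as np

alpha, p, Pe = 5.0, 0.01, 10.0

def E_hat(s):
    q = np.sqrt(1.0 + 4.0*s/Pe * (1.0 + alpha/(1.0 + alpha*p*s)))
    num = 4.0*q*np.exp(Pe/2.0)
    den = (1.0+q)**2*np.exp(Pe*q/2.0) - (1.0-q)**2*np.exp(-Pe*q/2.0)
    return num/den

h = 1e-4
M1 = -(E_hat(h) - E_hat(-h)) / (2*h)                 # first moment
M2 = (E_hat(h) - 2*E_hat(0.0) + E_hat(-h)) / h**2    # second moment
var_series = (M2 - M1**2) / M1**2
var_formula = 2*p*alpha**2/(1+alpha)**2 + (2/Pe - 2/Pe**2*(1 - np.exp(-Pe)))
print(M1, var_series, var_formula)
```

For these parameter values the finite-difference moments match the closed-form expressions to several digits, confirming the series coefficients in (29.86).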
Consider the hyperbolic model (equations (29.20)–(29.23)). When the upwinding first-
order discretization scheme is used for the convection term, the discretized model can
be expressed as
Taking LT gives
29.6 Impact of intraparticle gradients | 737
$$ps\,\hat c_{f,j}+(\hat c_{f,j}-\hat c_{f,j-1})Np=-(\hat c_{f,j}-\hat c_{fi,j});\qquad \hat c_{f,0}=\hat c_{in}(s);\qquad(29.92)$$
$$\alpha p s\,\hat c_{fi,j}=(\hat c_{f,j}-\hat c_{fi,j})\;\Rightarrow\;\hat c_{fi,j}=\frac{\hat c_{f,j}}{1+\alpha p s}\qquad(29.93)$$
where $\Delta z=\frac{1}{N}$. Thus, the LT solution can be expressed as
$$\hat c_{f,j}=\frac{\hat c_{f,j-1}}{1+\frac{s}{N}+\frac{\alpha s}{(1+\alpha p s)N}};\qquad \hat c_{f,0}=\hat c_{in}(s)\qquad(29.94)$$
⇒
$$\hat c_{f,N}=\frac{\hat c_{in}(s)}{\left[1+\frac{s}{N}+\frac{\alpha s}{(1+\alpha p s)N}\right]^N}\qquad(29.95)$$
$$\Lambda=\frac{a^2u_0}{LD_e};\qquad Sh^*=\frac{k_c a}{D_e};\qquad Sh=\frac{Sh^*}{5}.$$
$$\bar c_s=\int_0^1 3\xi^2 c_s(\xi,z,\tau)\,d\xi.\qquad(29.103)$$
It may be seen that for (a2 /De ) → 0, the gradient inside the particle becomes negligible
and cs = csi = Kcfi and the model reduces to the hyperbolic two-phase model. When
the gradient inside the particle is small, by simplifying equations (29.99)–(29.101) and
(29.103), it may be shown that
$$\bar c_s=\left(1+\frac{Sh}{K}\right)c_{si}-Sh\,c_f.\qquad(29.104)$$
[Remark: In the literature, this approximation is known as the linear driving force
or parabolic profile approximation.] Using equation (29.104), the model (equation
(29.98)) may be expressed as
Equations (29.97) and (29.105) along with the initial and inlet conditions:
$$(1+\alpha)\frac{\partial c_f}{\partial\tau}+\frac{\partial c_f}{\partial z}+\alpha p\left(1+\frac{Sh}{K}\right)\left(\frac{\partial^2 c_f}{\partial\tau^2}+\frac{\partial^2 c_f}{\partial z\partial\tau}\right)=0.\qquad(29.107)$$
Thus, as can be expected, inclusion of intraparticle gradients does not change the
speed of the adsorption front, but the effective axial Peclet number becomes
$$\frac{1}{Pe_{\text{eff}}}=\frac{\alpha^2}{(1+\alpha)^3}\,p\left(1+\frac{Sh}{K}\right)\qquad(29.109)$$
or in dimensional form
$$D_{\text{eff}}=\frac{\varepsilon u_0^2\,\alpha^2}{(1+\alpha)^3}\left[\frac{1}{k_c a_v}+\frac{a^2}{15D_e}\right].\qquad(29.110)$$
We note that the first term in the brackets of equation (29.110) is the external mass
transfer time while the second term represents the intraparticle diffusion time.
Problems
1. Consider the problem of solute uptake by a spherical particle (of radius a), which
is initially solute free and whose surface is exposed to a time-varying concentration csi (t).
Solve the intraparticle diffusion problem and show that the concentration within
the particle is given by
$$c_s(r,t)=\frac{2D_e}{a}\sum_{n=1}^\infty\frac{(-1)^{n+1}n\pi}{r}\sin\left[\frac{n\pi r}{a}\right]\int_0^t e^{-\frac{D_e n^2\pi^2}{a^2}(t-t')}c_{si}(t')\,dt'$$
2. Obtain the solution, and hence the breakthrough curves for the chromatogra-
phy model with intraparticle gradients defined by equations (29.97), (29.105) and
(29.106) of Section 29.6.
3. Determine the dimensionless second central moment of the dispersion curve for
the chromatographic model that accounts for external mass transfer, dispersion
in the fluid phase and intraparticle gradients.
30 Stability of transport and reaction processes
In this chapter, we discuss two problems in which the stability of a base solution is de-
termined by examining the eigenvalues of the linearized system of differential equa-
tions.
Figure 30.1: Schematic diagram illustrating Lapwood convection in a porous rectangular box.
https://doi.org/10.1515/9783110739701-032
30.1 Lapwood convection in a porous rectangular box | 741
$$T=T_1\ @\ z'=0;\qquad T=T_0\ @\ z'=H\qquad(30.5)$$
$$\frac{\partial T}{\partial x'}=0\ @\ x'=0,L;\qquad T=T^*(x',z')\ @\ t=0\qquad(30.6)$$
$$\mathbf{n}\cdot\mathbf{u}'=0\ @\ z'=0,H\ \text{and}\ x'=0,L\qquad(30.7)$$
Dimensionless form
We define following dimensionless variables:
$$z=\frac{z'}{H},\quad x=\frac{x'}{H},\quad \alpha=\frac{L}{H},\quad \theta=\frac{T-T_0}{T_1-T_0},\quad \mathbf{u}=\frac{H}{\lambda_e}\mathbf{u}',\quad \nabla'=\frac{1}{H}\nabla,\qquad(30.8)$$
$$\tau=\frac{\lambda_e t}{H^2},\quad Ra_d=\frac{\kappa g\rho_0 H\beta(T_1-T_0)}{\mu\lambda_e},\quad p=\frac{Ra_d}{\beta(T_1-T_0)}\left[\frac{p'}{\rho_0 gH}+z\right]\qquad(30.9)$$
∇⋅u=0 (30.10)
∇p = Rad θez − u (30.11)
$$\frac{\partial\theta}{\partial\tau}=-\mathbf{u}\cdot\nabla\theta+\nabla^2\theta\qquad(30.12)$$
with boundary conditions,
$$\theta=1\ @\ z=0;\qquad \theta=0\ @\ z=1;\qquad(30.13)$$
$$\frac{\partial\theta}{\partial x}=0\ @\ x=0,\alpha;\qquad \theta=\theta^*\ @\ \tau=0\qquad(30.14)$$
u ⋅ n = 0 @ z = 0, 1 and x = 0, α (30.15)
where Rad is known as the Darcy–Rayleigh number. Note that Darcy’s law does not
permit the specification of tangential velocity at a boundary. We can only specify the
normal component of the velocity to be zero. In component form, the model may be
written as
$$\frac{\partial u_x}{\partial x}+\frac{\partial u_z}{\partial z}=0\qquad(30.16)$$
$$\frac{\partial p}{\partial x}=-u_x\qquad(30.17)$$
$$\frac{\partial p}{\partial z}=Ra_d\theta-u_z\qquad(30.18)$$
$$\frac{\partial\theta}{\partial\tau}=-u_x\frac{\partial\theta}{\partial x}-u_z\frac{\partial\theta}{\partial z}+\frac{\partial^2\theta}{\partial x^2}+\frac{\partial^2\theta}{\partial z^2}.\qquad(30.19)$$
𝜕τ 𝜕x 𝜕z 𝜕x 2 𝜕z 2
742 | 30 Stability of transport and reaction processes
We can satisfy the continuity equation and remove the pressure and velocity variables
from these equations by introducing the stream function ψ(x, z) as follows:
$$u_x=-\frac{\partial\psi}{\partial z},\qquad u_z=\frac{\partial\psi}{\partial x}\qquad(30.20)$$
⇒
$$\frac{\partial p}{\partial x}=-u_x=\frac{\partial\psi}{\partial z}\;\Rightarrow\;\frac{\partial^2 p}{\partial x\partial z}=\frac{\partial^2\psi}{\partial z^2}\qquad(30.21)$$
$$\frac{\partial p}{\partial z}=Ra_d\theta-\frac{\partial\psi}{\partial x}\;\Rightarrow\;\frac{\partial^2 p}{\partial z\partial x}=Ra_d\frac{\partial\theta}{\partial x}-\frac{\partial^2\psi}{\partial x^2}\qquad(30.22)$$
⇒
$$\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial z^2}=Ra_d\frac{\partial\theta}{\partial x}\qquad(30.23)$$
Therefore, the model equations may be expressed in terms of two scalar variables
ψ(x, z) and θ(x, z) as
$$\nabla^2\psi=Ra_d\frac{\partial\theta}{\partial x}\qquad(30.24)$$
$$\frac{\partial\theta}{\partial\tau}=\frac{\partial\psi}{\partial z}\frac{\partial\theta}{\partial x}-\frac{\partial\psi}{\partial x}\frac{\partial\theta}{\partial z}+\nabla^2\theta,\qquad 0<x<\alpha,\quad 0<z<1\qquad(30.25)$$
with the boundary conditions
$$\frac{\partial\psi}{\partial z}\bigg|_{x=0,\alpha}=0\quad\text{and}\quad\frac{\partial\psi}{\partial x}\bigg|_{z=0,1}=0\qquad(30.26)$$
$$\frac{\partial\theta}{\partial x}\bigg|_{x=0,\alpha}=0;\qquad \theta|_{z=0}=1;\qquad \theta|_{z=1}=0,\qquad(30.27)$$
and appropriate initial conditions for the time-dependent case. The boundary condi-
tions on ψ can also be taken as
ψ = 0 @ x = 0, α and z = 0, 1 (30.28)
instead of equation (30.26). Both sets of boundary conditions give the same solution,
and in what follows we use the second set given by equation (30.28).
$$F\begin{pmatrix}\psi\\\theta\end{pmatrix}=\begin{pmatrix}\nabla^2\psi-Ra_d\frac{\partial\theta}{\partial x}\\[4pt]\nabla^2\theta+\frac{\partial\psi}{\partial z}\frac{\partial\theta}{\partial x}-\frac{\partial\psi}{\partial x}\frac{\partial\theta}{\partial z}\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}\qquad(30.29)$$
$$\psi=0\ @\ x=0,\alpha\ \text{and}\ z=0,1\qquad(30.30)$$
$$\frac{\partial\theta}{\partial x}\bigg|_{x=0,\alpha}=0;\qquad\theta|_{z=0}=1;\qquad\theta|_{z=1}=0;\qquad(30.31)$$
𝜕x x=0,α
Note that the only nonlinear terms in the model equations are the quadratic convec-
tion terms in equation (30.29). The base state or conduction solution that exists for all
values of Rad is given by
As the Rayleigh number Rad increases, the buoyancy force overcomes the viscous
force and fluid begins to move or convection sets in. Our aim is to determine the crit-
ical value of Rad at which the conduction state loses stability leading to convection
states. We also note that if ( ψ(x,z)
θ(x,z)
) is a solution of equations (30.29)–(30.31) then so is
( −ψ(α−x,z)
θ(α−x,z)
). Thus, the convective solutions appear in pairs having reflectional symme-
try in the domain. Let u0 and v given by
$$v=\begin{pmatrix}v_1(x,z)\\v_2(x,z)\end{pmatrix},\qquad u_0=\begin{pmatrix}\psi_0(x,z)\\\theta_0(x,z)\end{pmatrix}=\begin{pmatrix}0\\1-z\end{pmatrix},\qquad(30.33)$$
denote the base state and perturbation to the base state. To determine the stability of
the base state u0 (equation (30.33)), we linearize the model equations:
$$DF(\psi_0,\theta_0,Ra_d)\cdot v=\lim_{s\to0}\frac{\partial}{\partial s}F(u_0+sv)=\lim_{s\to0}\frac{\partial}{\partial s}\begin{pmatrix}F_1(\psi_0+sv_1,\theta_0+sv_2)\\F_2(\psi_0+sv_1,\theta_0+sv_2)\end{pmatrix}$$
$$=\lim_{s\to0}\frac{\partial}{\partial s}\begin{pmatrix}\nabla^2\psi_0+s\nabla^2v_1-Ra_d\frac{\partial\theta_0}{\partial x}-sRa_d\frac{\partial v_2}{\partial x}\\[4pt]\nabla^2\theta_0+s\nabla^2v_2+\left(\frac{\partial\psi_0}{\partial z}+s\frac{\partial v_1}{\partial z}\right)\left(\frac{\partial\theta_0}{\partial x}+s\frac{\partial v_2}{\partial x}\right)-\left(\frac{\partial\psi_0}{\partial x}+s\frac{\partial v_1}{\partial x}\right)\left(\frac{\partial\theta_0}{\partial z}+s\frac{\partial v_2}{\partial z}\right)\end{pmatrix}$$
$$=\begin{pmatrix}\nabla^2v_1-Ra_d\frac{\partial v_2}{\partial x}\\[4pt]\nabla^2v_2+\frac{\partial\theta_0}{\partial x}\frac{\partial v_1}{\partial z}+\frac{\partial\psi_0}{\partial z}\frac{\partial v_2}{\partial x}-\frac{\partial\theta_0}{\partial z}\frac{\partial v_1}{\partial x}-\frac{\partial\psi_0}{\partial x}\frac{\partial v_2}{\partial z}\end{pmatrix}=\begin{pmatrix}\nabla^2v_1-Ra_d\frac{\partial v_2}{\partial x}\\[4pt]\nabla^2v_2+\frac{\partial v_1}{\partial x}\end{pmatrix}$$
Thus,
$$L\cdot v=\begin{pmatrix}\nabla^2v_1-Ra_d\frac{\partial v_2}{\partial x}\\[4pt]\nabla^2v_2+\frac{\partial v_1}{\partial x}\end{pmatrix}=DF(\psi_0,\theta_0,Ra_d)\cdot v\qquad(30.34)$$
The boundary conditions on v1 and v2 are obtained in a similar way from equations
(30.30) and (30.31) by setting
v1 = ψ − ψ0 and v2 = θ − θ0 , (30.35)
Thus, new steady-state solutions (or bifurcation from trivial solution) can occur
only if the equation
Lv = 0
$$\frac{\partial^2v_1}{\partial x^2}+\frac{\partial^2v_1}{\partial z^2}-Ra_d\frac{\partial v_2}{\partial x}=0\qquad(30.38)$$
$$\frac{\partial^2v_2}{\partial x^2}+\frac{\partial^2v_2}{\partial z^2}+\frac{\partial v_1}{\partial x}=0\qquad(30.39)$$
d2 ϕ
− , ϕ(0) = ϕ(1) = 0,
dz 2
These equations may be combined to give a single fourth-order boundary value prob-
lem:
$$\frac{d^4w_1}{dx^4}+(Ra_d-2n^2\pi^2)\frac{d^2w_1}{dx^2}+n^4\pi^4w_1=0\qquad(30.45)$$
$$w_1(0)=w_1(\alpha)=\frac{d^2w_1}{dx^2}(0)=\frac{d^2w_1}{dx^2}(\alpha)=0\qquad(30.46)$$
This is a linear equation with constant coefficients and can be solved easily. By inspec-
tion, we see that
$$w_1(x)=\sin\left(\frac{m\pi x}{\alpha}\right),\qquad m=1,2,\ldots\qquad(30.47)$$
$$\frac{m^4\pi^4}{\alpha^4}+(Ra_d-2n^2\pi^2)\left(\frac{-m^2\pi^2}{\alpha^2}\right)+n^4\pi^4=0$$
$$\Rightarrow\;Ra_d=\frac{\pi^2(m^2+n^2\alpha^2)^2}{m^2\alpha^2}\qquad(30.48)$$
We are interested in determining the smallest value of the Darcy–Rayleigh number for
which there is a nontrivial solution. Since Rad is monotonically increasing with n but
nonmonotonic with m, we take n = 1 (in physical terms, this implies that it is the first
vertical mode that is always destabilized in this specific problem):
$$\Rightarrow\;Ra_d=\frac{\pi^2(m^2+\alpha^2)^2}{m^2\alpha^2}\qquad(30.49)$$
Equation (30.49) is plotted below for different values of m (= 1, 2 and 3) in Figure 30.2.
Figure 30.2: Neutral stability curves (bifurcation set) for the Lapwood convection problem: First verti-
cal mode and different horizontal modes.
$$Ra_d\ge Ra_{dc}=\frac{\pi^2(\alpha^2+\alpha^2)^2}{\alpha^4}=4\pi^2.\qquad(30.51)$$
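The minimization over the horizontal mode number can be carried out numerically. A small sketch (Python assumed; not from the book) evaluates the neutral curve (30.49) over integer m and confirms that for integer aspect ratio α the minimum is 4π², attained at m = α:

```python
# Sketch (Python assumed): minimize the neutral curve (30.49),
# Ra_d = pi^2 (m^2 + alpha^2)^2 / (m^2 alpha^2), over the integer horizontal
# mode number m. For integer aspect ratio alpha the minimum is 4*pi^2 at m = alpha.
import math

def critical_Rad(alpha, m_max=20):
    best = min(range(1, m_max + 1),
               key=lambda m: (m**2 + alpha**2)**2 / (m**2 * alpha**2))
    Rad = math.pi**2 * (best**2 + alpha**2)**2 / (best**2 * alpha**2)
    return best, Rad

for alpha in (1.0, 2.0, 3.0):
    m, Rad = critical_Rad(alpha)
    print(alpha, m, Rad / math.pi**2)   # -> m = alpha, Rad/pi^2 = 4
```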
Thus, we can determine the streamlines and isotherms using equations (30.52)–(30.53)
and (30.54). The eigenfunctions from equation (30.54) are plotted in Figure 30.3 for
α = 2.
Note that there are two circulation cells (the symmetric pair has cells rotating in
opposite direction). For α = n, the solution has n circulation cells.
Remark. The above solution can be modified easily for an infinite layer (α → ∞). In
this case, mπ/α becomes a continuous variable (often called the wave number and
denoted by k) and equation (30.49) modifies to
$$\Rightarrow\;Ra_d=\frac{(n^2\pi^2+k^2)^2}{k^2}\qquad(30.55)$$
Thus,
$$\frac{dRa_d}{dk}=0\;\Rightarrow\;k^2=n^2\pi^2\;\Rightarrow\;Ra_d\ge Ra_{dc}=\frac{4n^4\pi^4}{n^2\pi^2}=4n^2\pi^2\qquad(30.56)$$
Once again, the smallest Rad occurs for n = 1 and is given by
Figure 30.3: Contour plots of the eigenfunctions (streamlines and isotherms) for the Lapwood prob-
lem.
The corresponding flow pattern is similar to that shown in Figure 30.3 except now the
cells are square-shaped.
The temperature θ for the convective solutions is of the form (from equations
(30.52)–(30.53) and (30.54)),
$$\theta(x,z)=1-z\pm\varepsilon\,\frac{2\pi}{Ra_{dc}}\sin(\pi z)\cos(\pi x)\qquad(30.59)$$
where ε is the amplitude of the convective branch. The isotherms of these convective
branches
$$\theta_1=1-z+\varepsilon\,\frac{2\pi}{Ra_{dc}}\sin(\pi z)\cos(\pi x)$$
$$\theta_2=1-z-\varepsilon\,\frac{2\pi}{Ra_{dc}}\sin(\pi z)\cos(\pi x)$$
Figure 30.4: Isotherms of the bifurcating convective branches for the Lapwood convection problem.
$$C\frac{\partial u}{\partial t}=F(x,u,\nabla u,\nabla^2u,p)\ \text{in}\ \Omega$$
$$\text{BCs:}\quad \beta(x,u,\nabla u)=0\ \text{on}\ \partial\Omega,\ t>0\qquad(30.60)$$
$$\text{I.C.:}\quad \Gamma(x,u,\nabla u)=0\ \text{in}\ \Omega\ @\ t=0$$
When spatial dependence of the state variables is ignored, we get the so-called
“lumped resistance” models or simply lumped models. In this case, equations (30.60)
are of the form:
$$C\frac{du}{dt}=F(u,p),\quad t>0;\qquad u=u_0\ @\ t=0\qquad(30.61)$$
30.2 Chemical reactor stability and dynamics | 749
$$\sum_{j=1}^{N_S}\nu_{ij}A_j=0;\qquad i=1,2,\ldots,N_R\qquad(30.62)$$
are taking place. Assuming (a) constant physical properties and density, and (b) constant
reactor volume and volumetric flow rate, the species and energy balances are given by
$$\frac{dc_j}{dt}=\frac{c_{j,in}(t)-c_j}{\tau_c}+\sum_{i=1}^{N_R}\nu_{ij}r_i(c,T),\qquad j=1,2,3,\ldots,N_s\qquad(30.63)$$
$$Le_R\frac{dT}{dt}=\frac{T_{in}(t)-T}{\tau_c}+\sum_{i=1}^{N_R}\frac{(-\Delta H_{R,i})}{\rho_f C_{pf}}r_i(c,T)-\frac{UA_h}{V_R\rho_f C_{pf}}(T-T_c(t)),\qquad(30.64)$$
where
$$Le_R=1+\frac{(MC_p)_{\text{wall}}}{V_R\rho_f C_{pf}}=\text{reactor Lewis number}$$
and U is the overall heat transfer coefficient for heat exchange between the reactor contents
and coolant; Ah is the heat transfer area; VR is the volume of the reactor; $\tau_c=\frac{V_R}{q_0}$ is the space
(residence or convection) time and q0 is the volumetric flow rate.
Equations (30.63) and (30.64) represent (Ns + 1) nonlinear ODEs that describe the
variation of reactor composition and temperature with time. These equations have to
be integrated numerically with appropriate initial conditions:
Denoting
$$C=\begin{pmatrix}1&0&\cdots&0&0\\0&1&\cdots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&0\\0&0&\cdots&0&Le_R\end{pmatrix},\qquad u=\begin{pmatrix}c_1\\c_2\\\vdots\\c_{N_s}\\T\end{pmatrix}\qquad(30.66)$$
and considering only the special case in which the inputs cj,in (t), Tin (t) and Tc (t) are
independent of time, we can write equations (30.63), (30.64) and (30.65) in the au-
tonomous form given by equation (30.61). If inputs vary with time, equation (30.61)
can be modified to
$$C\frac{du}{dt}=F(t,u,p),\quad t>0;\qquad u=u_0\ @\ t=0\qquad(30.67)$$
The above form (equation (30.67)) of the lumped model is known as the forced or
nonautonomous system. Here, we consider only the autonomous case.
Assuming a single step exothermic reaction of the form A → B with linear kinetics, we
can express rate of reaction (for disappearance) of the species A as
$$r=k(T)c_A=k_0\exp\left(-\frac{E_a}{RT}\right)c_A\qquad(30.68)$$
$$\tau=k(T_{in})t;\quad \chi=1-\frac{c_A}{c_{A,in}};\quad y=\frac{T-T_{in}}{T_{in}};\quad Da=k(T_{in})\tau_c;$$
$$\Delta T_{ad}=\frac{(-\Delta H_R)c_{A,in}}{\rho_f C_{pf}};\quad \gamma=\frac{E_a}{RT_{in}};\quad \beta=\frac{\Delta T_{ad}}{T_{in}};\quad \tau_h=\frac{\rho_f C_{pf}V_R}{UA_h};\qquad(30.69)$$
where τ is the dimensionless time (scaled with reaction time at the inlet temperature);
χ is the conversion; y is the dimensionless temperature of fluid; Da is Damköhler num-
ber at inlet temperature; γ is dimensionless activation energy; ΔTad is the adiabatic
temperature rise; β is dimensionless adiabatic temperature rise and τh is the heat ex-
change time with the coolant (or cooling time); α is the ratio of characteristic reaction
time at the inlet temperature to the cooling time; y0 is the initial fluid temperature and
χ0 is the conversion corresponding to initial concentration. With these dimensionless
quantities, the model equations (30.63) and (30.64) in dimensionless form reduce to
two nonlinear ODEs:
$$\frac{d\chi}{d\tau}=-\frac{\chi}{Da}+(1-\chi)\exp\left(\frac{\gamma y}{1+y}\right);\qquad(30.70)$$
$$Le_R\frac{dy}{d\tau}=-\frac{y}{Da}+\beta(1-\chi)\exp\left(\frac{\gamma y}{1+y}\right)-\alpha(y-y_c);\qquad(30.71)$$
$$\chi=\chi_0\ \text{and}\ y=y_0\ @\ \tau=0\qquad(30.72)$$
This more general model has three additional parameters LeR , α and yc compared to
the simpler adiabatic case (α = 0) with LeR = 1 (or negligible reactor wall thermal
capacitance).
In what follows, we consider special case of yc = 0 (i. e., coolant and feed temperature
are equal). For this case, the steady-state model reduces to
$$y_s=\frac{\beta\chi_s}{1+\alpha Da}\qquad(30.73)$$
$$\chi_s=Da(1-\chi_s)\exp\left(\frac{\gamma\beta\chi_s}{1+\beta\chi_s+\alpha Da}\right)\qquad(30.74)$$
Figure 30.7: Different types of χ versus Da diagrams in each of the five regions denoted in Fig-
ure 30.6.
The solid and dashed lines in the phase diagram shown in Figure 30.6 are referred to as
the isola and hysteresis locus, respectively. These loci divide the (α, β) plane into five
regions in each of which a different type of χ versus Da diagram is obtained. Regions c
and e can exhibit isola (or an isolated solution branch) as shown in Figure 30.7(c) and
Figure 30.7(e).
To determine the stability of the steady state, we write equations (30.70)–(30.72)
for the case of yc = 0, as
$$\frac{d\chi}{d\tau}=F_1(\chi,y)=-\frac{\chi}{Da}+(1-\chi)\exp\left(\frac{\gamma y}{1+y}\right);\qquad(30.75)$$
$$\frac{dy}{d\tau}=F_2(\chi,y)=\frac{1}{Le_R}\left[\frac{-y}{Da}+\beta(1-\chi)\exp\left(\frac{\gamma y}{1+y}\right)-\alpha y\right]\qquad(30.76)$$
We linearize these equations around the steady state and determine the eigenvalues
of the linearized matrix:
$$A=\begin{pmatrix}\frac{\partial F_1}{\partial\chi}&\frac{\partial F_1}{\partial y}\\[4pt]\frac{\partial F_2}{\partial\chi}&\frac{\partial F_2}{\partial y}\end{pmatrix}_{(\chi_s,y_s)}\qquad(30.77)$$
In this case, the eigenvalues can be complex for LeR ≥ 1 and the trace of the matrix can
change sign leading to periodic solutions in time. Usually this occurs when the reactor
is cooled strongly. In such cases, even when a single steady state exists, it could be lo-
cally unstable leading to sustained oscillations of the exit conversion and temperature
(though the inlet concentration and temperature remain constant).
For example, we consider the following parameters: β = 1, γ = 30, yc = 0 and
α = 35, where the steady-state diagram can be obtained by solving F1 = 0 and F2 = 0
and is shown in Figure 30.8 (these parameters lie in region b in Figure 30.6). The top
plot shows the conversion while the bottom plot shows the dimensionless temperature
at steady state. At each point of the steady-state curve in Figure 30.8, the eigenvalues
of the matrix A defined in equation (30.77) can be obtained. When the real part of
any of the two eigenvalues is positive, the solution becomes unstable (such states are
shown in Figure 30.8 by the dashed lines). The system has a stable solution only when
real parts of both eigenvalues are negative (shown in Figure 30.8 by solid lines). The
singular points where solution transitions from stable to unstable regimes are shown
by diamond marker points in Figure 30.8.
To be specific, if we choose a point denoted by the black circle in Figure 30.8 as an
example case, which corresponds to the cooled region with Da = 0.2 and the steady-
state conversion and temperature
χs = 0.6705; ys = 0.0838
Figure 30.8: Steady-state diagram of (a) conversion (χ) versus Damkohler number (Da) and (b) di-
mensionless temperature (y) versus Damkohler number (Da) for the cooled CSTR corresponding to
β = 1, γ = 30, yc = 0 and α = 35. Dashed curve corresponds to the unstable region while the solid
curve corresponds to the stable region.
Taking LeR = 1.5 and computing the eigenvalues of the linearized matrix, we find that
they are complex with a positive real part, i. e.,
Thus, the steady state is unstable. Integrating the full nonlinear equations numeri-
cally, we find that the conversion and temperature oscillate with time, as shown in
Figure 30.9. In this figure, the top plot shows the transient oscillation of conversion
and temperature, while the bottom plot shows the periodic orbit (i. e., limit cycle) in
the χ − y phase plane.
Similar analysis can be performed in other regions and for the more general case
when the feed temperature and coolant temperature are different.
Problems
1. Bifurcation set for discrete model of thermohaline convection
Consider the following discrete model of thermohaline convection (where the den-
sity is assumed to vary both with temperature and salt concentration):
$$\frac{dx}{dt}=\Pr(y-x+u)=f_1$$
$$\frac{dy}{dt}=-xz+\frac{4}{27\pi^4}Ra_T\,x-y=f_2$$
$$\frac{dz}{dt}=xy-\frac{8}{3}z=f_3$$
$$\frac{du}{dt}=-xv-\mathrm{Le}\,\frac{4}{27\pi^4}Ra_c\,x-\mathrm{Le}\,u=f_4$$
$$\frac{dv}{dt}=xu-\frac{8}{3}\mathrm{Le}\,v=f_5$$
$$\frac{d\psi}{dt}=f(\psi)=(f_1,f_2,f_3,f_4,f_5)^T\qquad(30.78)$$
dt
(a) Show that equation (30.78) has a trivial steady-state solution, i. e., ψs =
(x, y, z, u, v)T = 0 is one of the steady-state solutions.
(b) Determine the Jacobian $J=\{\frac{\partial f_i}{\partial\psi_j}\}$ of the function f and show that it is given by
$$J=\begin{pmatrix}-\Pr&\Pr&0&\Pr&0\\ \frac{4}{27\pi^4}Ra_T-z&-1&-x&0&0\\ y&x&-\frac{8}{3}&0&0\\ -\mathrm{Le}\frac{4}{27\pi^4}Ra_c-v&0&0&-\mathrm{Le}&-x\\ u&0&0&x&-\frac{8}{3}\mathrm{Le}\end{pmatrix}\qquad(30.79)$$
(c) Neutral curve near trivial solution: Determine the neutral curve near the trivial
solution ψs (by setting the determinant of the Jacobian J at ψs to zero) and
show that it is given by
\[
|J|_{\psi_s} = -\frac{64\,\mathrm{Le}^2\,\mathrm{Pr}}{243\pi^4}\left(27\pi^4 + 4\,\mathrm{Ra}_c - 4\,\mathrm{Ra}_T\right) = 0
\quad\Rightarrow\quad
\mathrm{Ra}_T = \mathrm{Ra}_c + \frac{27\pi^4}{4}
\tag{30.80}
\]
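Equation (30.80) can be checked numerically by evaluating the Jacobian (30.79) at the trivial state ψs = 0, where it is singular exactly on the neutral curve. The values of Pr, Le and Rac below are illustrative choices, not values from the text.

```python
import numpy as np

# Check of the neutral curve (30.80): the Jacobian (30.79) at
# psi_s = 0 is singular exactly when Ra_T = Ra_c + 27*pi^4/4.
Pr, Le, Rac = 10.0, 0.8, 500.0   # illustrative parameter values
c = 4.0 / (27 * np.pi**4)

def jacobian_trivial(RaT):
    # J from (30.79) evaluated at x = y = z = u = v = 0
    return np.array([
        [-Pr,            Pr,   0.0,      Pr,   0.0],
        [c * RaT,        -1.0, 0.0,      0.0,  0.0],
        [0.0,            0.0,  -8.0 / 3, 0.0,  0.0],
        [-Le * c * Rac,  0.0,  0.0,      -Le,  0.0],
        [0.0,            0.0,  0.0,      0.0,  -8.0 * Le / 3],
    ])

RaT_neutral = Rac + 27 * np.pi**4 / 4
det_on_curve = np.linalg.det(jacobian_trivial(RaT_neutral))
det_off_curve = np.linalg.det(jacobian_trivial(RaT_neutral + 100.0))
print(det_on_curve, det_off_curve)  # ~0 on the neutral curve, nonzero off it
```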
\[
\begin{aligned}
&\frac{d^2\phi}{dz^2} - k^2\phi + \psi = 0 \\
&\frac{d^2\psi}{dz^2} - k^2\psi + \mathrm{Ra}_d\,k^2\phi = 0 \\
&\phi(0) = \phi(1) = 0; \quad \psi(0) = \frac{d\psi}{dz}(1) = 0; \quad k^2 = \frac{m^2\pi^2}{\alpha^2}
\end{aligned}
\]
where α is the aspect ratio (width to height of the box) and Rad is the Darcy–Rayleigh number.
(c) Determine the marginal/neutral stability boundary and compare the critical
Rad to that of the box closed at the top.
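For the comparison case in part (c), a box closed at the top has ψ(0) = ψ(1) = 0, so the neutral modes are ϕ, ψ ∝ sin(nπz); substituting into the two equations gives Rad(k) = (n²π² + k²)²/k² for mode n. The sketch below minimizes this over the wavenumber k to recover the classical critical value 4π²; the open-top problem with dψ/dz(1) = 0 must be solved separately.

```python
import numpy as np

# Closed-top comparison case: neutral modes phi, psi ~ sin(n*pi*z) give
#   Ra_d(k) = (n^2*pi^2 + k^2)^2 / k^2
# for mode n. Minimizing over k (at n = 1) yields the classical critical
# Darcy-Rayleigh number 4*pi^2 ~ 39.48, attained at k = pi.
def Rad_neutral(k, n=1):
    return (n**2 * np.pi**2 + k**2)**2 / k**2

k = np.linspace(0.5, 10.0, 20001)
vals = Rad_neutral(k)
Rad_crit = vals.min()
k_crit = k[vals.argmin()]
print(Rad_crit, k_crit)  # ~39.478 at k ~ pi
```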
\[
\frac{d^2 w}{dx^2} = -\lambda w, \quad 0 < x < 1; \qquad w'(0) = 0 = w(1)
\]
(a) What is the smallest value of λ for which the BVP is compatible?
(b) If λ = λ1 is the value determined in (a), show that for −∞ < λ < λ1 , the only
solution to the BVP is the trivial one.
(c) Now, consider the nonlinear boundary value problem
\[
\frac{d^2 w}{dx^2} = -f(w), \quad 0 < x < 1; \qquad w'(0) = 0 = w(1)
\]
and reason that it has only one solution if the maximum value of f ′ (w) < λ1 .
[Hint: Consider the case that it has two solutions and take their difference and
use the mean value theorem of calculus.]
(d) Use the result in (c) to determine the maximum value of ϕ2 for which the fol-
lowing nonlinear BVP has only one solution:
\[
\frac{d^2 w}{dx^2} = -\phi^2 (B - w)\,e^{w}, \quad 0 < x < 1; \qquad w'(0) = 0 = w(1)
\]
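For part (a), the eigenfunctions of w″ = −λw with w′(0) = 0 are cos(√λ x), and w(1) = 0 forces √λ = (2n + 1)π/2, so λ1 = π²/4. The sketch below confirms this with a finite-difference discretization; the mirror-point treatment of the Neumann condition is a standard choice, not taken from the text.

```python
import numpy as np

# Smallest eigenvalue of w'' = -lambda*w, w'(0) = 0 = w(1), which is
# lambda_1 = (pi/2)^2 with eigenfunction cos(pi*x/2). Second-order
# finite differences; a mirror (ghost) point w_{-1} = w_1 imposes w'(0) = 0.
N = 400                        # number of grid intervals on [0, 1]
h = 1.0 / N
# unknowns w_0..w_{N-1}; the Dirichlet condition w(1) = 0 drops w_N
A = np.zeros((N, N))
for i in range(N):
    A[i, i] = -2.0
    if i > 0:
        A[i, i - 1] = 1.0
    if i < N - 1:
        A[i, i + 1] = 1.0
A[0, 1] = 2.0                  # mirror point at x = 0 (Neumann condition)
lam = np.sort(np.linalg.eigvals(-A / h**2).real)
lam1 = lam[0]
print(lam1, np.pi**2 / 4)      # lam1 ~ 2.4674
```

With λ1 = π²/4 in hand, part (d) reduces to bounding f′(w) = ϕ²(B − 1 − w)e^w: the uniqueness criterion from (c) requires the maximum of this quantity to stay below π²/4.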
\[
\frac{dz}{dt} = \beta - \frac{z(t-\tau)^3}{1 + z(t-\tau)^3}\,z(t), \quad t > 0; \qquad z(t) = z_0 \text{ for } -\tau \le t \le 0
\]
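The delay equation can be explored with a simple fixed-step Euler scheme that keeps the solution history in an array; the values of β, τ, z0 and the step size below are illustrative choices, not values from the text.

```python
import numpy as np

# Explicit-Euler time stepping for the delay equation above,
#   dz/dt = beta - z(t - tau)^3 / (1 + z(t - tau)^3) * z(t),
# with constant history z(t) = z0 on [-tau, 0]. The delayed value
# z(t - tau) is read back from the stored trajectory.
beta, tau, z0 = 1.0, 2.0, 0.5    # illustrative parameters
h = 0.001                        # time step
t_end = 50.0
n_delay = int(round(tau / h))    # steps spanning one delay interval
n_steps = int(round(t_end / h))
z = np.empty(n_delay + n_steps + 1)
z[:n_delay + 1] = z0             # history on [-tau, 0]
for i in range(n_delay, n_delay + n_steps):
    zd = z[i - n_delay]          # z(t - tau)
    z[i + 1] = z[i] + h * (beta - zd**3 / (1 + zd**3) * z[i])
print(z[-1])
```

Varying τ in such a sketch shows how increasing the delay can destabilize the steady state and produce sustained oscillations, the typical question posed for delay equations of this type.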
https://doi.org/10.1515/9783110739701-033
Index

addition 207
adjoint 239
adjoint eigenvalue problem 86
adjoint equation 307, 424
algebraic multiplicity 166, 486
alien cofactor 50
analytic 322
analytic part 335
atomic matrix 73
augmented matrix 3
basic states 103
basis 67
bijection 214
bilinear concomitant 309, 425
biorthogonality property 89
branch point 325
breakthrough curve 389, 725
canonical 112
canonical variables 113
Cauchy sequence 505
Cauchy–Goursat theorem 328
Cayley–Hamilton 122
characteristic equation 84, 481
characteristic exponents 297
characteristic multipliers 297
cofactor 49
commutative 5
commute 5
concomitant 309
concomitant matrix 309
conjugate harmonic 323
continuous spectrum 602
diagonal matrix 6
difference equation 190
dimension 67, 210
Dirac delta function 367
dispersion curve 725
Duhammel's formula 621
eigenvalue 82
eigenvalues of the kernel 526
eigenvector 82
eigenvector expansions 82
elementary row operations 8
equidimensional 302
essential singularity 326
Euler's equation 302
Fibonacci equation 194
Finite Fourier Transform 551, 552
Floquet exponents 297
Fourier coefficients 504
Fourier integral formula 600
Fourier series 504
Fredholm alternative 14
Fredholm equation 520
fundamental frequency 571
fundamental matrix 278
fundamental modes 103
fundamental set of solutions 71, 278
fundamental solution 616
fundamental vector 281
Gaussian elimination 15
generalized eigenvector 160
generalized Fourier series 505
generalized inverse 197
geometric multiplicity 167, 486
Gibb's phenomena 512
Gram–Schmidt procedure 70
Green's formula 435
Green's function 445
harmonic 323
heat/diffusion equation 560
Heaviside's expansion formula 375
Heaviside's function 367
Hermitian 6
Hessian matrix 179
holomorphic 322
homogeneous system 3
hysteresis locus 753
https://doi.org/10.1515/9783110739701-034