
Mathematical Methods in Engineering and Science

Mathematical Methods in Engineering and


Science
[http://home.iitk.ac.in/ dasgupta/MathCourse]

Bhaskar Dasgupta
dasgupta@iitk.ac.in
An Applied Mathematics course for
graduate and senior undergraduate students
and also for
rising researchers.

1,

Mathematical Methods in Engineering and Science

Textbook: Dasgupta B., App. Math. Meth. (Pearson Education, 2006, 2007).
http://home.iitk.ac.in/ dasgupta/MathBook

2,

Mathematical Methods in Engineering and Science

Contents I
Preliminary Background
Matrices and Linear Transformations
Operational Fundamentals of Linear Algebra
Systems of Linear Equations
Gauss Elimination Family of Methods
Special Systems and Special Methods
Numerical Aspects in Linear Systems

3,

Mathematical Methods in Engineering and Science

Contents II
Eigenvalues and Eigenvectors
Diagonalization and Similarity Transformations
Jacobi and Givens Rotation Methods
Householder Transformation and Tridiagonal Matrices
QR Decomposition Method
Eigenvalue Problem of General Matrices
Singular Value Decomposition
Vector Spaces: Fundamental Concepts*

4,

Mathematical Methods in Engineering and Science

Contents III
Topics in Multivariate Calculus
Vector Analysis: Curves and Surfaces
Scalar and Vector Fields
Polynomial Equations
Solution of Nonlinear Equations and Systems
Optimization: Introduction
Multivariate Optimization
Methods of Nonlinear Optimization*

5,

Mathematical Methods in Engineering and Science

Contents IV
Constrained Optimization
Linear and Quadratic Programming Problems*
Interpolation and Approximation
Basic Methods of Numerical Integration
Advanced Topics in Numerical Integration*
Numerical Solution of Ordinary Differential Equations
ODE Solutions: Advanced Issues
Existence and Uniqueness Theory

6,

Mathematical Methods in Engineering and Science

Contents V
First Order Ordinary Differential Equations
Second Order Linear Homogeneous ODEs
Second Order Linear Non-Homogeneous ODEs
Higher Order Linear ODEs
Laplace Transforms
ODE Systems
Stability of Dynamic Systems
Series Solutions and Special Functions

7,

Mathematical Methods in Engineering and Science

Contents VI
Sturm-Liouville Theory
Fourier Series and Integrals
Fourier Transforms
Minimax Approximation*
Partial Differential Equations
Analytic Functions
Integrals in the Complex Plane
Singularities of Complex Functions

8,

Mathematical Methods in Engineering and Science

Contents VII
Variational Calculus*
Epilogue
Selected References

9,

Mathematical Methods in Engineering and Science

Outline

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

10,

Mathematical Methods in Engineering and Science

Theme of the Course

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

To develop a firm mathematical background necessary for graduate


studies and research

a fast-paced recapitulation of UG mathematics

extension with supplementary advanced ideas for a mature


and forward orientation

exposure and highlighting of interconnections

To pre-empt needs of the future challenges

trade-off between sufficient and reasonable

target mid-spectrum majority of students

Notable beneficiaries (at two ends)

would-be researchers in analytical/computational areas

students who are till now somewhat afraid of mathematics

11,

Mathematical Methods in Engineering and Science

Course Contents

Applied linear algebra

Multivariate calculus and vector calculus

Numerical methods
Differential equations + +

Complex analysis

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

12,

Mathematical Methods in Engineering and Science

Sources for More Detailed Study

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

If you have the time, need and interest, then you may consult

individual books on individual topics;

another umbrella volume, like Kreyszig, McQuarrie, O'Neil
or Wylie and Barrett;

a good book of numerical analysis or scientific computing, like


Acton, Heath, Hildebrand, Krishnamurthy and Sen, Press et
al, Stoer and Bulirsch;

friends, in joint-study groups.

13,

Mathematical Methods in Engineering and Science

Logistic Strategy

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

Study in the given sequence, to the extent possible.

Do not read mathematics.

14,

Mathematical Methods in Engineering and Science

Logistic Strategy

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

Study in the given sequence, to the extent possible.

Do not read mathematics.

Use lots of pen and paper.


Read mathematics books and do mathematics.

15,

Mathematical Methods in Engineering and Science

Logistic Strategy

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

Study in the given sequence, to the extent possible.

Do not read mathematics.

Use lots of pen and paper.


Read mathematics books and do mathematics.
Exercises are a must.

Use as many methods as you can think of, certainly including


the one which is recommended.
Consult the Appendix after you work out the solution. Follow
the comments, interpretations and suggested extensions.
Think. Get excited. Discuss. Bore everybody in your known
circles.
Not enough time to attempt all? Want a selection ?

16,

Mathematical Methods in Engineering and Science

Logistic Strategy

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

Study in the given sequence, to the extent possible.

Do not read mathematics.

Use lots of pen and paper.


Read mathematics books and do mathematics.
Exercises are a must.

Use as many methods as you can think of, certainly including


the one which is recommended.
Consult the Appendix after you work out the solution. Follow
the comments, interpretations and suggested extensions.
Think. Get excited. Discuss. Bore everybody in your known
circles.
Not enough time to attempt all? Want a selection ?

Program implementation is needed in algorithmic exercises.

Master a programming environment.


Use mathematical/numerical library/software.
Take a MATLAB tutorial session?

17,

Mathematical Methods in Engineering and Science

Preliminary Background

Logistic Strategy

Theme of the Course


Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

Tutorial Plan

Chapter  Selection        Tutorial     Chapter  Selection          Tutorial
2        2,3              3            26       1,2,4,6            4
3        2,4,5,6          4,5          27       1,2,3,4            3,4
4        1,2,4,5,7        4,5          28       2,5,6              6
5        1,4,5            4            29       1,2,5,6            6
6        1,2,4,7          4            30       1,2,3,4,5          4
7        1,2,3,4          2            31       1,2                1(d)
8        1,2,3,4,6        4            32       1,3,5,7            7
9        1,2,4            4            33       1,2,3,7,8          8
10       2,3,4            4            34       1,3,5,6            5
11       2,4,5            5            35       1,3,4              3
12       1,3              3            36       1,2,4              4
13       1,2              1            37       1                  1(c)
14       2,4,5,6,7        4            38       1,2,3,4,5          5
15       6,7              7            39       2,3,4,5            4
16       2,3,4,8          8            40       1,2,4,5            4
17       1,2,3,6          6            41       1,3,6,8            8
18       1,2,3,6,7        3            42       1,3,6              6
19       1,3,4,6          6            43       2,3,4              3
20       1,2,3            2            44       1,2,4,7,9,10       7,10
21       1,2,5,7,8        7            45       1,2,3,4,7,9        4,9
22       1,2,3,4,5,6      3,4          46       1,2,5,7            7
23       1,2,3            3            47       1,2,3,5,8,9,10     9,10
24       1,2,3,4,5,6      1            48       1,2,4,5            5
25       1,2,3,4,5        5

18,

Mathematical Methods in Engineering and Science

Expected Background

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

moderate background of undergraduate mathematics

firm understanding of school mathematics and undergraduate


calculus

Take the preliminary test.

[p 3, App. Math. Meth.]

Grade yourself sincerely.

[p 4, App. Math. Meth.]

19,

Mathematical Methods in Engineering and Science

Expected Background

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

moderate background of undergraduate mathematics

firm understanding of school mathematics and undergraduate


calculus

Take the preliminary test.

[p 3, App. Math. Meth.]

Grade yourself sincerely.

[p 4, App. Math. Meth.]

Prerequisite Problem Sets*

[p 48, App. Math. Meth.]

20,

Mathematical Methods in Engineering and Science

Points to note

Preliminary Background
Theme of the Course
Course Contents
Sources for More Detailed Study
Logistic Strategy
Expected Background

Put in effort, keep pace.

Stress concept as well as problem-solving.

Follow methods diligently.

Ensure background skills.

Necessary Exercises: Prerequisite problem sets

21,

Mathematical Methods in Engineering and Science

Outline

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

22,

Mathematical Methods in Engineering and Science

Matrices
Question: What is a matrix?

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

23,

Mathematical Methods in Engineering and Science

Matrices

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Question: What is a matrix?


Answers:

a rectangular array of numbers/elements

24,

Mathematical Methods in Engineering and Science

Matrices

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Question: What is a matrix?


Answers:

a rectangular array of numbers/elements

a mapping f : M × N → F, where M = {1, 2, 3, …, m},
N = {1, 2, 3, …, n} and F is the set of real numbers or
complex numbers?

25,

Mathematical Methods in Engineering and Science

Matrices

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Question: What is a matrix?


Answers:

a rectangular array of numbers/elements

a mapping f : M × N → F, where M = {1, 2, 3, …, m},
N = {1, 2, 3, …, n} and F is the set of real numbers or
complex numbers?

Question: What does a matrix do?

26,

Mathematical Methods in Engineering and Science

Matrices

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Question: What is a matrix?


Answers:

a rectangular array of numbers/elements

a mapping f : M × N → F, where M = {1, 2, 3, …, m},
N = {1, 2, 3, …, n} and F is the set of real numbers or
complex numbers?

Question: What does a matrix do?


Explore: With an m × n matrix A,

y1 = a11 x1 + a12 x2 + ⋯ + a1n xn
y2 = a21 x1 + a22 x2 + ⋯ + a2n xn
  ⋮
ym = am1 x1 + am2 x2 + ⋯ + amn xn

or   Ax = y

27,

Mathematical Methods in Engineering and Science

Matrices
Consider these definitions:
y = f (x)
y = f (x) = f (x1, x2, …, xn)
yk = fk (x) = fk (x1, x2, …, xn),   k = 1, 2, …, m
y = f(x)

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology


28,

Mathematical Methods in Engineering and Science

Matrices
Consider these definitions:
y = f (x)
y = f (x) = f (x1, x2, …, xn)
yk = fk (x) = fk (x1, x2, …, xn),   k = 1, 2, …, m
y = f(x)
y = Ax

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology


29,

Mathematical Methods in Engineering and Science

Matrices

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Consider these definitions:


y = f (x)
y = f (x) = f (x1, x2, …, xn)
yk = fk (x) = fk (x1, x2, …, xn),   k = 1, 2, …, m
y = f(x)
y = Ax
Further Answer:
A matrix is the definition of a linear vector function of a
vector variable.

30,

Mathematical Methods in Engineering and Science

Matrices

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Consider these definitions:


y = f (x)
y = f (x) = f (x1, x2, …, xn)
yk = fk (x) = fk (x1, x2, …, xn),   k = 1, 2, …, m
y = f(x)
y = Ax
Further Answer:
A matrix is the definition of a linear vector function of a
vector variable.

Caution: Matrices do not define vector functions whose components are


of the form

yk = ak0 + ak1 x1 + ak2 x2 + ⋯ + akn xn.

31,

Mathematical Methods in Engineering and Science

Matrices and Linear Transformations

Matrices

Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Consider these definitions:


y = f (x)
y = f (x) = f (x1, x2, …, xn)
yk = fk (x) = fk (x1, x2, …, xn),   k = 1, 2, …, m
y = f(x)
y = Ax
Further Answer:
A matrix is the definition of a linear vector function of a
vector variable.
Anything deeper?

Caution: Matrices do not define vector functions whose components are


of the form

yk = ak0 + ak1 x1 + ak2 x2 + ⋯ + akn xn.

32,

Mathematical Methods in Engineering and Science

Geometry and Algebra

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Let vector x = [x1 x2 x3 ]T denote a point (x1 , x2 , x3 ) in


3-dimensional space in frame of reference OX1 X2 X3 .
Example: With m = 2 and n = 3,

y1 = a11 x1 + a12 x2 + a13 x3
.
y2 = a21 x1 + a22 x2 + a23 x3
Plot y1 and y2 in the OY1 Y2 plane.
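A minimal NumPy sketch of this example (the entries of A and the point x below are made up for illustration): the 2 × 3 matrix maps a point of R^3 to a point of the OY1Y2 plane.

import numpy as np

A = np.array([[1.0, 2.0, 0.5],    # a11 a12 a13 (made-up values)
              [0.0, 1.0, 3.0]])   # a21 a22 a23
x = np.array([1.0, -1.0, 2.0])    # a point (x1, x2, x3) in R^3
y = A @ x                         # its image (y1, y2) in R^2
print(y)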

33,

Mathematical Methods in Engineering and Science

Matrices and Linear Transformations

Geometry and Algebra

Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Let vector x = [x1 x2 x3 ]T denote a point (x1 , x2 , x3 ) in


3-dimensional space in frame of reference OX1 X2 X3 .
Example: With m = 2 and n = 3,

y1 = a11 x1 + a12 x2 + a13 x3
.
y2 = a21 x1 + a22 x2 + a23 x3
Plot y1 and y2 in the OY1 Y2 plane.
[Schematic: matrix A maps a point x in the domain (frame OX1X2X3, space R^3) to its image y in the codomain (frame OY1Y2, space R^2).]

Figure: Linear transformation: schematic illustration

What is matrix A doing?

34,

Mathematical Methods in Engineering and Science

Geometry and Algebra

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Operating on point x in R 3 , matrix A transforms it to y in R 2 .


Point y is the image of point x under the mapping defined by
matrix A.

35,

Mathematical Methods in Engineering and Science

Geometry and Algebra

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Operating on point x in R 3 , matrix A transforms it to y in R 2 .


Point y is the image of point x under the mapping defined by
matrix A.
Note domain R^3, co-domain R^2 with reference to the figure and
verify that A : R^3 → R^2 fulfils the requirements of a mapping, by
definition.

36,

Mathematical Methods in Engineering and Science

Geometry and Algebra

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Operating on point x in R 3 , matrix A transforms it to y in R 2 .


Point y is the image of point x under the mapping defined by
matrix A.
Note domain R^3, co-domain R^2 with reference to the figure and
verify that A : R^3 → R^2 fulfils the requirements of a mapping, by
definition.
A matrix gives a definition of a linear transformation
from one vector space to another.

37,

Mathematical Methods in Engineering and Science

Linear Transformations

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Operate A on a large number of points xi ∈ R^3.


Obtain corresponding images yi ∈ R^2.
The linear transformation represented by A implies the totality of
these correspondences.

38,

Mathematical Methods in Engineering and Science

Linear Transformations

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Operate A on a large number of points xi ∈ R^3.


Obtain corresponding images yi ∈ R^2.
The linear transformation represented by A implies the totality of
these correspondences.
We decide to use a different frame of reference OX̄1X̄2X̄3 for R^3.
[And, possibly OȲ1Ȳ2 for R^2 at the same time.]
Coordinates change, i.e. xi changes to x̄i (and possibly yi to ȳi).
Now, we need a different matrix, say Ā, to get back the
correspondence as ȳ = Āx̄.

39,

Mathematical Methods in Engineering and Science

Linear Transformations

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Operate A on a large number of points xi ∈ R^3.


Obtain corresponding images yi ∈ R^2.
The linear transformation represented by A implies the totality of
these correspondences.
We decide to use a different frame of reference OX̄1X̄2X̄3 for R^3.
[And, possibly OȲ1Ȳ2 for R^2 at the same time.]
Coordinates change, i.e. xi changes to x̄i (and possibly yi to ȳi).
Now, we need a different matrix, say Ā, to get back the
correspondence as ȳ = Āx̄.
A matrix: just one description.
Question: How to get the new matrix Ā?

40,

Mathematical Methods in Engineering and Science

Matrix Terminology

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

Matrix product

Transpose

Conjugate transpose

Symmetric and skew-symmetric matrices

Hermitian and skew-Hermitian matrices

Determinant of a square matrix

Inverse of a square matrix

Adjoint of a square matrix

41,

Mathematical Methods in Engineering and Science

Points to note

Matrices and Linear Transformations


Matrices
Geometry and Algebra
Linear Transformations
Matrix Terminology

A matrix defines a linear transformation from one vector space


to another.

Matrix representation of a linear transformation depends on


the selected bases (or frames of reference) of the source and
target spaces.

Important: Revise matrix algebra basics as necessary tools.

Necessary Exercises: 2,3

42,

Mathematical Methods in Engineering and Science

Outline

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

43,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Range and Null Space: Rank and Nullity


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Consider A ∈ R^{m×n} as a mapping

A : R^n → R^m,   Ax = y,   x ∈ R^n,   y ∈ R^m.

44,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Range and Null Space: Rank and Nullity


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Consider A ∈ R^{m×n} as a mapping

A : R^n → R^m,   Ax = y,   x ∈ R^n,   y ∈ R^m.

Observations
1. Every x ∈ R^n has an image y ∈ R^m, but every y ∈ R^m need
not have a pre-image in R^n.
Range (or range space) as subset/subspace of
co-domain: containing images of all x ∈ R^n.

45,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Range and Null Space: Rank and Nullity


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Consider A ∈ R^{m×n} as a mapping

A : R^n → R^m,   Ax = y,   x ∈ R^n,   y ∈ R^m.

Observations
1. Every x ∈ R^n has an image y ∈ R^m, but every y ∈ R^m need
not have a pre-image in R^n.
Range (or range space) as subset/subspace of
co-domain: containing images of all x ∈ R^n.
2. Image of x ∈ R^n in R^m is unique, but pre-image of y ∈ R^m
need not be.
It may be non-existent, unique or infinitely many.
Null space as subset/subspace of domain:
containing pre-images of only 0 ∈ R^m.

46,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Range and Null Space: Rank and Nullity


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

[Schematic: Null(A), the set mapping to 0, shown as a subspace of the domain; Range(A) shown as a subspace of the codomain.]
Figure: Range and null space: schematic representation

47,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Range and Null Space: Rank and Nullity


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

[Schematic: Null(A), the set mapping to 0, shown as a subspace of the domain; Range(A) shown as a subspace of the codomain.]
Figure: Range and null space: schematic representation

Question: What is the dimension of a vector space?

48,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Range and Null Space: Rank and Nullity


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

[Schematic: Null(A), the set mapping to 0, shown as a subspace of the domain; Range(A) shown as a subspace of the codomain.]
Figure: Range and null space: schematic representation

Question: What is the dimension of a vector space?


Linear dependence and independence: Vectors x1, x2, …, xr
in a vector space are called linearly independent if

k1 x1 + k2 x2 + ⋯ + kr xr = 0   ⟹   k1 = k2 = ⋯ = kr = 0.

49,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Range and Null Space: Rank and Nullity


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

[Schematic: Null(A), the set mapping to 0, shown as a subspace of the domain; Range(A) shown as a subspace of the codomain.]
Figure: Range and null space: schematic representation

Question: What is the dimension of a vector space?


Linear dependence and independence: Vectors x1, x2, …, xr
in a vector space are called linearly independent if

k1 x1 + k2 x2 + ⋯ + kr xr = 0   ⟹   k1 = k2 = ⋯ = kr = 0.

Range(A) = {y : y = Ax, x ∈ R^n}

Null(A) = {x : x ∈ R^n, Ax = 0}

Rank(A) = dim Range(A)

Nullity(A) = dim Null(A)

50,

Mathematical Methods in Engineering and Science

Basis

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Take a set of vectors v1, v2, …, vr in a vector space.

Question: Given a vector v in the vector space, can we describe it
as
v = k1 v1 + k2 v2 + ⋯ + kr vr = Vk,

where V = [v1 v2 ⋯ vr] and k = [k1 k2 ⋯ kr]^T ?

51,

Mathematical Methods in Engineering and Science

Basis

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Take a set of vectors v1, v2, …, vr in a vector space.

Question: Given a vector v in the vector space, can we describe it
as
v = k1 v1 + k2 v2 + ⋯ + kr vr = Vk,

where V = [v1 v2 ⋯ vr] and k = [k1 k2 ⋯ kr]^T ?


Answer: Not necessarily.

52,

Mathematical Methods in Engineering and Science

Basis

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Take a set of vectors v1, v2, …, vr in a vector space.

Question: Given a vector v in the vector space, can we describe it
as
v = k1 v1 + k2 v2 + ⋯ + kr vr = Vk,

where V = [v1 v2 ⋯ vr] and k = [k1 k2 ⋯ kr]^T ?


Answer: Not necessarily.
Span, denoted as <v1, v2, …, vr>: the subspace
described/generated by a set of vectors.
Basis:
A basis of a vector space is composed of an ordered
minimal set of vectors spanning the entire space.

The basis for an n-dimensional space will have exactly n


members, all linearly independent.

53,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Basis
Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Orthogonal basis: {v1, v2, …, vn} with v_j^T v_k = 0 for all j ≠ k.

Orthonormal basis: v_j^T v_k = δ_{jk} = { 0 if j ≠ k;  1 if j = k }.

54,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Basis
Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Orthogonal basis: {v1, v2, …, vn} with v_j^T v_k = 0 for all j ≠ k.

Orthonormal basis: v_j^T v_k = δ_{jk} = { 0 if j ≠ k;  1 if j = k }.

Members of an orthonormal basis form an orthogonal matrix.

Properties of an orthogonal matrix:
V^{-1} = V^T or VV^T = I, and det V = +1 or −1.

55,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Basis

Range and Null Space: Rank and Nullity


Basis
Change of Basis
Elementary Transformations

Orthogonal basis: {v1, v2, …, vn} with v_j^T v_k = 0 for all j ≠ k.

Orthonormal basis: v_j^T v_k = δ_{jk} = { 0 if j ≠ k;  1 if j = k }.

Members of an orthonormal basis form an orthogonal matrix.

Properties of an orthogonal matrix:
V^{-1} = V^T or VV^T = I, and det V = +1 or −1.

Natural basis:
e1 = [1 0 0 ⋯ 0]^T,   e2 = [0 1 0 ⋯ 0]^T,   …,   en = [0 0 0 ⋯ 1]^T

56,

Mathematical Methods in Engineering and Science

Change of Basis

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Suppose x represents a vector (point) in R^n in some basis.

Question: If we change over to a new basis {c1, c2, …, cn}, how
does the representation of a vector change?

57,

Mathematical Methods in Engineering and Science

Change of Basis

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Suppose x represents a vector (point) in R^n in some basis.

Question: If we change over to a new basis {c1, c2, …, cn}, how
does the representation of a vector change?

x = x̄1 c1 + x̄2 c2 + ⋯ + x̄n cn = [c1 c2 ⋯ cn] [x̄1 x̄2 ⋯ x̄n]^T

58,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Change of Basis

Range and Null Space: Rank and Nullity


Basis
Change of Basis
Elementary Transformations

Suppose x represents a vector (point) in R^n in some basis.

Question: If we change over to a new basis {c1, c2, …, cn}, how
does the representation of a vector change?

x = x̄1 c1 + x̄2 c2 + ⋯ + x̄n cn = [c1 c2 ⋯ cn] [x̄1 x̄2 ⋯ x̄n]^T

With C = [c1 c2 ⋯ cn],

new to old coordinates: Cx̄ = x, and
old to new coordinates: x̄ = C^{-1}x.
Note: Matrix C is invertible. How?

59,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Change of Basis

Range and Null Space: Rank and Nullity


Basis
Change of Basis
Elementary Transformations

Suppose x represents a vector (point) in R^n in some basis.

Question: If we change over to a new basis {c1, c2, …, cn}, how
does the representation of a vector change?

x = x̄1 c1 + x̄2 c2 + ⋯ + x̄n cn = [c1 c2 ⋯ cn] [x̄1 x̄2 ⋯ x̄n]^T

With C = [c1 c2 ⋯ cn],

new to old coordinates: Cx̄ = x, and
old to new coordinates: x̄ = C^{-1}x.
Note: Matrix C is invertible. How?
Special case with C orthogonal:
orthogonal coordinate transformation.
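A small NumPy sketch of these relations (the basis vectors c1, c2, c3 below are arbitrary, assumed only to be linearly independent): columns of C hold the new basis vectors, and the two conversions invert each other.

import numpy as np

C = np.array([[1.0, 1.0, 0.0],   # c1, c2, c3 as columns (assumed basis)
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
x = np.array([2.0, 3.0, 4.0])    # coordinates in the old basis
x_new = np.linalg.solve(C, x)    # old to new: x_bar = C^{-1} x
x_old = C @ x_new                # new to old: C x_bar = x (recovers x)
print(x_new, x_old)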

60,

Mathematical Methods in Engineering and Science

Change of Basis

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Question: And, how does basis change affect the representation of


a linear transformation?

61,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Change of Basis

Range and Null Space: Rank and Nullity


Basis
Change of Basis
Elementary Transformations

Question: And, how does basis change affect the representation of


a linear transformation?
Consider the mapping

A : R^n → R^m,   Ax = y.

Change the basis of the domain through P ∈ R^{n×n} and that of the
co-domain through Q ∈ R^{m×m}.

62,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Change of Basis

Range and Null Space: Rank and Nullity


Basis
Change of Basis
Elementary Transformations

Question: And, how does basis change affect the representation of


a linear transformation?
Consider the mapping

A : R n R m,

Ax = y.

Change the basis of the domain through P R nn and that of the


co-domain through Q R mm .
New and old vector representations are related as
P
x=x

and

Q
y = y.

x = y, with
Then, Ax = y A
= Q1 AP
A

63,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Change of Basis

Range and Null Space: Rank and Nullity


Basis
Change of Basis
Elementary Transformations

Question: And, how does basis change affect the representation of


a linear transformation?
A : R n R m,

Consider the mapping

Ax = y.

Change the basis of the domain through P R nn and that of the


co-domain through Q R mm .
New and old vector representations are related as
P
x=x

and

Q
y = y.

x = y, with
Then, Ax = y A
= Q1 AP
A
Special case: m = n and P = Q gives a similarity transformation
= P1 AP
A
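A hedged NumPy check of this relation, with arbitrary made-up P, Q and A: representing x and y in the new bases and applying Ā = Q^{-1}AP reproduces the original correspondence.

import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 3
A = rng.standard_normal((m, n))
P = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # basis change in the domain
Q = np.eye(m) + 0.1 * rng.standard_normal((m, m))   # basis change in the co-domain
A_bar = np.linalg.inv(Q) @ A @ P                    # A_bar = Q^{-1} A P

x_bar = rng.standard_normal(n)        # coordinates in the new basis
x = P @ x_bar                         # P x_bar = x
y = A @ x
y_bar = np.linalg.solve(Q, y)         # Q y_bar = y
print(np.allclose(A_bar @ x_bar, y_bar))   # True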

64,

Mathematical Methods in Engineering and Science

Elementary Transformations

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Observation: Certain reorganizations of equations in a system


have no effect on the solution(s).

65,

Mathematical Methods in Engineering and Science

Elementary Transformations

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Observation: Certain reorganizations of equations in a system


have no effect on the solution(s).
Elementary Row Transformations:
1. interchange of two rows,
2. scaling of a row, and
3. addition of a scalar multiple of a row to another.

66,

Mathematical Methods in Engineering and Science

Elementary Transformations

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Observation: Certain reorganizations of equations in a system


have no effect on the solution(s).
Elementary Row Transformations:
1. interchange of two rows,
2. scaling of a row, and
3. addition of a scalar multiple of a row to another.

Elementary Column Transformations: Similar operations with


columns, equivalent to a corresponding shuffling of the variables
(unknowns).

67,

Mathematical Methods in Engineering and Science

Operational Fundamentals of Linear Algebra

Elementary Transformations

Range and Null Space: Rank and Nullity


Basis
Change of Basis
Elementary Transformations

Equivalence of matrices: An elementary transformation defines


an equivalence relation between two matrices.
Reduction to normal form:

A_N = [ I_r  0 ]
      [ 0    0 ]

Rank invariance: Elementary transformations do not alter the


rank of a matrix.
Elementary transformation as matrix multiplication:
an elementary row transformation on a matrix is
equivalent to a pre-multiplication with an elementary
matrix, obtained through the same row transformation on
the identity matrix (of appropriate size).
Similarly, an elementary column transformation is equivalent to
post-multiplication with the corresponding elementary matrix.
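A minimal NumPy sketch of this fact (matrix entries made up): the elementary matrix obtained by performing a row operation on the identity, pre-multiplied into A, performs the same row operation on A.

import numpy as np

A = np.array([[2.0, 1.0, 3.0],
              [4.0, 5.0, 6.0],
              [1.0, 0.0, 2.0]])
E = np.eye(3)
E[1, :] -= 2.0 * E[0, :]      # row 2 <- row 2 - 2 * row 1, applied to I
B = E @ A                     # the same row operation applied to A
print(B)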

68,

Mathematical Methods in Engineering and Science

Points to note

Operational Fundamentals of Linear Algebra


Range and Null Space: Rank and Nullity
Basis
Change of Basis
Elementary Transformations

Concepts of range and null space of a linear transformation.

Effects of change of basis on representations of vectors and


linear transformations.

Elementary transformations as tools to modify (simplify)


systems of (simultaneous) linear equations.

Necessary Exercises: 2,4,5,6

69,

Mathematical Methods in Engineering and Science

Outline

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

70,

Mathematical Methods in Engineering and Science

Systems of Linear Equations

Nature of Solutions

Ax = b

Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

71,

Mathematical Methods in Engineering and Science

Systems of Linear Equations

Nature of Solutions

Ax = b

Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Coefficient matrix: A, augmented matrix: [A | b].

72,

Mathematical Methods in Engineering and Science

Systems of Linear Equations

Nature of Solutions

Ax = b

Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Coefficient matrix: A, augmented matrix: [A | b].


Existence of solutions or consistency:

Ax = b has a solution
⟺ b ∈ Range(A)
⟺ Rank(A) = Rank([A | b])

73,

Mathematical Methods in Engineering and Science

Systems of Linear Equations

Nature of Solutions

Ax = b

Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Coefficient matrix: A, augmented matrix: [A | b].


Existence of solutions or consistency:

Ax = b has a solution
⟺ b ∈ Range(A)
⟺ Rank(A) = Rank([A | b])

Uniqueness of solutions:

Rank(A) = Rank([A | b]) = n
⟺ solution of Ax = b is unique
⟺ Ax = 0 has only the trivial (zero) solution.

74,

Mathematical Methods in Engineering and Science

Systems of Linear Equations

Nature of Solutions

Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Ax = b

Coefficient matrix: A, augmented matrix: [A | b].


Existence of solutions or consistency:

Ax = b has a solution
⟺ b ∈ Range(A)
⟺ Rank(A) = Rank([A | b])

Uniqueness of solutions:

Rank(A) = Rank([A | b]) = n
⟺ solution of Ax = b is unique
⟺ Ax = 0 has only the trivial (zero) solution.


Infinite solutions: For Rank(A) = Rank([A | b]) = k < n, solution

x = x̄ + x_N,   with   Ax̄ = b   and   x_N ∈ Null(A)
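These rank conditions can be tested numerically; a minimal sketch using NumPy's matrix_rank (the example A and b are made up):

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
b = np.array([6.0, 12.0, 2.0])
Ab = np.column_stack([A, b])           # augmented matrix [A | b]
rA, rAb = np.linalg.matrix_rank(A), np.linalg.matrix_rank(Ab)
n = A.shape[1]
if rA < rAb:
    print("inconsistent: no solution")
elif rA == n:
    print("unique solution")
else:
    print("infinitely many solutions; nullity =", n - rA)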

75,

Mathematical Methods in Engineering and Science

Basic Idea of Solution Methodology


To diagnose

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

the non-existence of a solution,

To determine the unique solution, or


To describe

infinite solutions;

decouple the equations using elementary transformations.

76,

Mathematical Methods in Engineering and Science

Systems of Linear Equations

Basic Idea of Solution Methodology


To diagnose

Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

the non-existence of a solution,

To determine the unique solution, or


To describe

infinite solutions;

decouple the equations using elementary transformations.


For solving Ax = b, apply suitable elementary row transformations
on both sides, leading to
R_q R_{q-1} ⋯ R_2 R_1 A x = R_q R_{q-1} ⋯ R_2 R_1 b,
or,   [RA]x = Rb;

such that matrix [RA] is greatly simplified.


In the best case, with complete reduction, RA = In , and
components of x can be read off from Rb.

77,

Mathematical Methods in Engineering and Science

Systems of Linear Equations

Basic Idea of Solution Methodology


To diagnose

Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

the non-existence of a solution,

To determine the unique solution, or


To describe

infinite solutions;

decouple the equations using elementary transformations.


For solving Ax = b, apply suitable elementary row transformations
on both sides, leading to
R_q R_{q-1} ⋯ R_2 R_1 A x = R_q R_{q-1} ⋯ R_2 R_1 b,
or,   [RA]x = Rb;

such that matrix [RA] is greatly simplified.


In the best case, with complete reduction, RA = In , and
components of x can be read off from Rb.
For inverting matrix A, treat AA1 = In similarly.

78,

Mathematical Methods in Engineering and Science

Homogeneous Systems

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

To solve Ax = 0 or to describe Null (A),


apply a series of elementary row transformations on A to reduce it
to Ā, the row-reduced echelon form or RREF.

79,

Mathematical Methods in Engineering and Science

Homogeneous Systems

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

To solve Ax = 0 or to describe Null (A),


apply a series of elementary row transformations on A to reduce it
to Ā, the row-reduced echelon form or RREF.
Features of RREF:
1. The first non-zero entry in any row is a 1, the leading 1.
2. In the same column as the leading 1, other entries are zero.
3. Non-zero entries in a lower row appear later.

80,

Mathematical Methods in Engineering and Science

Homogeneous Systems

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

To solve Ax = 0 or to describe Null (A),


apply a series of elementary row transformations on A to reduce it
to Ā, the row-reduced echelon form or RREF.
Features of RREF:
1. The first non-zero entry in any row is a 1, the leading 1.
2. In the same column as the leading 1, other entries are zero.
3. Non-zero entries in a lower row appear later.
Variables corresponding to columns having leading 1s
are expressed in terms of the remaining variables.

81,

Mathematical Methods in Engineering and Science

Homogeneous Systems

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

To solve Ax = 0 or to describe Null (A),


apply a series of elementary row transformations on A to reduce it
to Ā, the row-reduced echelon form or RREF.
Features of RREF:
1. The first non-zero entry in any row is a 1, the leading 1.
2. In the same column as the leading 1, other entries are zero.
3. Non-zero entries in a lower row appear later.
Variables corresponding to columns having leading 1s
are expressed in terms of the remaining variables.

Solution of Ax = 0:   x = [z1 z2 ⋯ z_{n-k}] [u1 u2 ⋯ u_{n-k}]^T

Basis of Null(A): {z1, z2, …, z_{n-k}}
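For small examples the RREF and the null space basis can be checked symbolically; a minimal SymPy sketch (the matrix is made up):

from sympy import Matrix

A = Matrix([[1, 2, 1, 0],
            [2, 4, 0, 2],
            [3, 6, 1, 2]])
R, pivot_cols = A.rref()      # row-reduced echelon form and pivot columns
print(R)
print(pivot_cols)             # variables carrying the leading 1s
print(A.nullspace())          # basis vectors z1, z2, ... of Null(A)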

82,

Mathematical Methods in Engineering and Science

Pivoting

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Attempt:
To get 1 at diagonal (or leading) position, with 0 elsewhere.
Key step: division by the diagonal (or leading) entry.

83,

Mathematical Methods in Engineering and Science

Pivoting

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Attempt:
To get 1 at diagonal (or leading) position, with 0 elsewhere.
Key step: division by the diagonal (or leading) entry.
Consider

A = [ I_k   ⋯            ]
    [  0    ε   ⋯  BIG ⋯ ]
    [  0   big           ]
    [  ⋮    ⋮            ]

Cannot divide by zero. Should not divide by ε.

84,

Mathematical Methods in Engineering and Science

Pivoting

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Attempt:
To get 1 at diagonal (or leading) position, with 0 elsewhere.
Key step: division by the diagonal (or leading) entry.
Consider

A = [ I_k   ⋯            ]
    [  0    ε   ⋯  BIG ⋯ ]
    [  0   big           ]
    [  ⋮    ⋮            ]

Cannot divide by zero. Should not divide by ε.

partial pivoting: row interchange to get big in place of ε

complete pivoting: row and column interchanges to get
BIG in place of ε

85,

Mathematical Methods in Engineering and Science

Pivoting

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Attempt:
To get 1 at diagonal (or leading) position, with 0 elsewhere.
Key step: division by the diagonal (or leading) entry.
Consider

A = [ I_k   ⋯            ]
    [  0    ε   ⋯  BIG ⋯ ]
    [  0   big           ]
    [  ⋮    ⋮            ]

Cannot divide by zero. Should not divide by ε.

partial pivoting: row interchange to get big in place of ε

complete pivoting: row and column interchanges to get
BIG in place of ε

Complete pivoting does not give a huge advantage over partial pivoting,
but requires maintaining the variable permutation for later unscrambling.

86,

Mathematical Methods in Engineering and Science

Partitioning and Block Operations

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Equation Ax = y can be written as

[ A11  A12  A13 ] [ x1 ]   [ y1 ]
[ A21  A22  A23 ] [ x2 ] = [ y2 ] ,
                  [ x3 ]

with x1, x2, etc. being themselves vectors (or matrices).

For a valid partitioning, block sizes should be consistent.

Elementary transformations can be applied over blocks.

Block operations can be computationally economical at times.

Conceptually, different blocks of contributions/equations can


be assembled for mathematical modelling of complicated
coupled systems.

87,

Mathematical Methods in Engineering and Science

Points to note

Systems of Linear Equations


Nature of Solutions
Basic Idea of Solution Methodology
Homogeneous Systems
Pivoting
Partitioning and Block Operations

Solution(s) of Ax = b may be non-existent, unique or


infinitely many.

Complete solution can be described by composing a particular


solution with the null space of A.

Null space basis can be obtained conveniently from the


row-reduced echelon form of A.

For a strategy of solution, pivoting is an important step.

Necessary Exercises: 1,2,4,5,7

88,

Mathematical Methods in Engineering and Science

Outline

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

89,

Mathematical Methods in Engineering and Science

Gauss-Jordan Elimination

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Task: Solve Ax = b1, Ax = b2 and Ax = b3; find A^{-1} and
evaluate A^{-1}B, where A ∈ R^{n×n} and B ∈ R^{n×p}.

90,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Task: Solve Ax = b1, Ax = b2 and Ax = b3; find A^{-1} and
evaluate A^{-1}B, where A ∈ R^{n×n} and B ∈ R^{n×p}.

Assemble C = [A  b1  b2  b3  I_n  B] ∈ R^{n×(2n+3+p)}
and follow the algorithm.

91,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Task: Solve Ax = b1, Ax = b2 and Ax = b3; find A^{-1} and
evaluate A^{-1}B, where A ∈ R^{n×n} and B ∈ R^{n×p}.

Assemble C = [A  b1  b2  b3  I_n  B] ∈ R^{n×(2n+3+p)}
and follow the algorithm.

Collect solutions from the result

C → C* = [I_n   A^{-1}b1   A^{-1}b2   A^{-1}b3   A^{-1}   A^{-1}B].

92,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Task: Solve Ax = b1, Ax = b2 and Ax = b3; find A^{-1} and
evaluate A^{-1}B, where A ∈ R^{n×n} and B ∈ R^{n×p}.

Assemble C = [A  b1  b2  b3  I_n  B] ∈ R^{n×(2n+3+p)}
and follow the algorithm.

Collect solutions from the result

C → C* = [I_n   A^{-1}b1   A^{-1}b2   A^{-1}b3   A^{-1}   A^{-1}B].

Remarks:

Premature termination: matrix A singular ⟹ decision?

If you use complete pivoting, unscramble the permutation.

Identity matrix in both C and C*? Store A^{-1} in place.

For evaluating A^{-1}b, do not develop A^{-1}.

Gauss-Jordan elimination an overkill? Want something cheaper?

93,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Gauss-Jordan Algorithm

Δ = 1
For k = 1, 2, 3, …, (n−1)

  1. Pivot: identify l such that |c_lk| = max |c_jk| for k ≤ j ≤ n.
     If c_lk = 0, then Δ = 0 and exit.
     Else, interchange row k and row l.
  2. Δ ← Δ c_kk.
     Divide row k by c_kk.
  3. Subtract c_jk times row k from row j, ∀ j ≠ k.

Δ ← Δ c_nn.
If c_nn = 0, then exit.
Else, divide row n by c_nn.

In case of non-singular A, default termination.

This outline is for partial pivoting.
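A compact Python/NumPy sketch of this outline (partial pivoting; the Δ bookkeeping is omitted and the last row is handled inside the same loop), not the book's own code:

import numpy as np

def gauss_jordan(C):
    # Reduce C = [A | b1 b2 ... | I | B] in place (sketch).
    C = C.astype(float).copy()
    n = C.shape[0]
    for k in range(n):
        l = k + np.argmax(np.abs(C[k:, k]))   # pivot row
        if C[l, k] == 0.0:
            raise ValueError("matrix is singular")
        if l != k:
            C[[k, l], :] = C[[l, k], :]       # interchange rows k and l
        C[k, :] /= C[k, k]                    # divide row k by c_kk
        for j in range(n):                    # subtract c_jk * (row k) from row j
            if j != k:
                C[j, :] -= C[j, k] * C[k, :]
    return C

A = np.array([[3.0, 1.0, 2.0], [2.0, 1.0, 3.0], [0.0, 1.0, 2.0]])
b = np.array([[1.0], [2.0], [3.0]])
out = gauss_jordan(np.hstack([A, b, np.eye(3)]))
print(out[:, 3:4])   # A^{-1} b
print(out[:, 4:])    # A^{-1}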

94,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
Gaussian Elimination with Back-Substitution
LU Decomposition

Gaussian elimination reduces Ax = b to an upper triangular system

[ a11  a12  ⋯  a1n ] [ x1 ]   [ b1 ]
[      a22  ⋯  a2n ] [ x2 ] = [ b2 ]
[            ⋱  ⋮  ] [ ⋮  ]   [ ⋮  ]
[              ann ] [ xn ]   [ bn ]

(with the coefficients aij and bi updated by the elimination steps).

95,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
Gaussian Elimination with Back-Substitution
LU Decomposition

Gaussian elimination reduces Ax = b to an upper triangular system

[ a11  a12  ⋯  a1n ] [ x1 ]   [ b1 ]
[      a22  ⋯  a2n ] [ x2 ] = [ b2 ]
[            ⋱  ⋮  ] [ ⋮  ]   [ ⋮  ]
[              ann ] [ xn ]   [ bn ]

Back-substitutions:

x_n = b_n / a_nn,
x_i = (1/a_ii) [ b_i − Σ_{j=i+1}^{n} a_ij x_j ]   for i = n−1, n−2, …, 2, 1

96,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
Gaussian Elimination with Back-Substitution
LU Decomposition

Gaussian elimination reduces Ax = b to an upper triangular system

[ a11  a12  ⋯  a1n ] [ x1 ]   [ b1 ]
[      a22  ⋯  a2n ] [ x2 ] = [ b2 ]
[            ⋱  ⋮  ] [ ⋮  ]   [ ⋮  ]
[              ann ] [ xn ]   [ bn ]

Back-substitutions:

x_n = b_n / a_nn,
x_i = (1/a_ii) [ b_i − Σ_{j=i+1}^{n} a_ij x_j ]   for i = n−1, n−2, …, 2, 1

Remarks
Computational cost is half that of G-J elimination.
Like G-J elimination, prior knowledge of the RHS is needed.
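A minimal NumPy sketch of Gaussian elimination with back-substitution (partial pivoting added as a safeguard; the test system is made up):

import numpy as np

def gauss_solve(A, b):
    # Solve Ax = b by forward elimination and back-substitution (sketch).
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):                     # forward elimination
        p = k + np.argmax(np.abs(A[k:, k]))    # partial pivoting
        A[[k, p]], b[[k, p]] = A[[p, k]], b[[p, k]]
        for j in range(k + 1, n):
            m = A[j, k] / A[k, k]
            A[j, k:] -= m * A[k, k:]
            b[j] -= m * b[k]
    x = np.zeros(n)                            # back-substitutions
    x[n - 1] = b[n - 1] / A[n - 1, n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[3.0, 1.0, 2.0], [2.0, 1.0, 3.0], [0.0, 1.0, 2.0]])
b = np.array([6.0, 6.0, 3.0])
print(gauss_solve(A, b), np.linalg.solve(A, b))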

97,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
Gaussian Elimination with Back-Substitution
LU Decomposition

Anatomy of the Gaussian elimination:


The process of Gaussian elimination (with no pivoting) leads to
U = R_q R_{q-1} ⋯ R_2 R_1 A = RA.

98,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
Gaussian Elimination with Back-Substitution
LU Decomposition

Anatomy of the Gaussian elimination:


The process of Gaussian elimination (with no pivoting) leads to
U = R_q R_{q-1} ⋯ R_2 R_1 A = RA.

The steps, given by

for k = 1, 2, 3, …, (n−1):
    j-th row ← j-th row − (a_jk / a_kk) × (k-th row)   for j = k+1, k+2, …, n,

involve elementary matrices

R_k|_{k=1} = [ 1         0  0  ⋯  0 ]
             [ −a21/a11  1  0  ⋯  0 ]
             [ −a31/a11  0  1  ⋯  0 ]
             [    ⋮      ⋮  ⋮  ⋱  ⋮ ]
             [ −an1/a11  0  0  ⋯  1 ]    etc.

99,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
Gaussian Elimination with Back-Substitution
LU Decomposition

Anatomy of the Gaussian elimination:


The process of Gaussian elimination (with no pivoting) leads to
U = R_q R_{q-1} ⋯ R_2 R_1 A = RA.

The steps, given by

for k = 1, 2, 3, …, (n−1):
    j-th row ← j-th row − (a_jk / a_kk) × (k-th row)   for j = k+1, k+2, …, n,

involve elementary matrices

R_k|_{k=1} = [ 1         0  0  ⋯  0 ]
             [ −a21/a11  1  0  ⋯  0 ]
             [ −a31/a11  0  1  ⋯  0 ]
             [    ⋮      ⋮  ⋮  ⋱  ⋮ ]
             [ −an1/a11  0  0  ⋯  1 ]    etc.

With L = R^{-1},   A = LU.

100,

Mathematical Methods in Engineering and Science

LU Decomposition

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

A square matrix with non-zero leading minors is LU-decomposable.

101,

Mathematical Methods in Engineering and Science

LU Decomposition

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

A square matrix with non-zero leading minors is LU-decomposable.


No reference to a right-hand-side (RHS) vector!

102,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

LU Decomposition

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

A square matrix with non-zero leading minors is LU-decomposable.


No reference to a right-hand-side (RHS) vector!
To solve Ax = b, denote y = Ux and split as
Ax = b  ⟹  LUx = b  ⟹  Ly = b   and   Ux = y.

103,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

LU Decomposition

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

A square matrix with non-zero leading minors is LU-decomposable.


No reference to a right-hand-side (RHS) vector!
To solve Ax = b, denote y = Ux and split as
Ax = b  ⟹  LUx = b  ⟹  Ly = b   and   Ux = y.

Forward substitutions:
y_i = (1/l_ii) [ b_i − Σ_{j=1}^{i−1} l_ij y_j ]   for i = 1, 2, 3, …, n;

Back-substitutions:
x_i = (1/u_ii) [ y_i − Σ_{j=i+1}^{n} u_ij x_j ]   for i = n, n−1, n−2, …, 1.
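Given the factors L and U, the two triangular solves look like this in NumPy (a sketch; L and U are assumed non-singular triangular, example values made up):

import numpy as np

def lu_solve(L, U, b):
    # Solve LUx = b by forward and back substitution (sketch).
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                         # forward: Ly = b
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):             # backward: Ux = y
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

L = np.array([[1.0, 0.0], [0.5, 1.0]])
U = np.array([[4.0, 2.0], [0.0, 3.0]])
b = np.array([6.0, 9.0])
print(lu_solve(L, U, b), np.linalg.solve(L @ U, b))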

104,

Mathematical Methods in Engineering and Science

LU Decomposition

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Question: How to LU-decompose a given matrix?

105,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

LU Decomposition

106,

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Question: How to LU-decompose a given matrix?

L = [ l11  0    0    ⋯  0   ]      U = [ u11  u12  u13  ⋯  u1n ]
    [ l21  l22  0    ⋯  0   ]          [ 0    u22  u23  ⋯  u2n ]
    [ l31  l32  l33  ⋯  0   ]          [ 0    0    u33  ⋯  u3n ]
    [ ⋮    ⋮    ⋮    ⋱  ⋮   ]          [ ⋮    ⋮    ⋮    ⋱  ⋮   ]
    [ ln1  ln2  ln3  ⋯  lnn ]          [ 0    0    0    ⋯  unn ]

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

LU Decomposition

107,

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Question: How to LU-decompose a given matrix?

L = [ l11  0    0    ⋯  0   ]      U = [ u11  u12  u13  ⋯  u1n ]
    [ l21  l22  0    ⋯  0   ]          [ 0    u22  u23  ⋯  u2n ]
    [ l31  l32  l33  ⋯  0   ]          [ 0    0    u33  ⋯  u3n ]
    [ ⋮    ⋮    ⋮    ⋱  ⋮   ]          [ ⋮    ⋮    ⋮    ⋱  ⋮   ]
    [ ln1  ln2  ln3  ⋯  lnn ]          [ 0    0    0    ⋯  unn ]

Elements of the product give

Σ_{k=1}^{i} l_ik u_kj = a_ij   for i ≤ j,   and
Σ_{k=1}^{j} l_ik u_kj = a_ij   for i > j.

n² equations in n² + n unknowns: choice of n unknowns

Mathematical Methods in Engineering and Science

LU Decomposition

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Doolittle's algorithm

Choose l_ii = 1.
For j = 1, 2, 3, …, n
  1. u_ij = a_ij − Σ_{k=1}^{i−1} l_ik u_kj   for 1 ≤ i ≤ j
  2. l_ij = (1/u_jj) ( a_ij − Σ_{k=1}^{j−1} l_ik u_kj )   for i > j

108,

Mathematical Methods in Engineering and Science

LU Decomposition

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Doolittle's algorithm

Choose l_ii = 1.
For j = 1, 2, 3, …, n
  1. u_ij = a_ij − Σ_{k=1}^{i−1} l_ik u_kj   for 1 ≤ i ≤ j
  2. l_ij = (1/u_jj) ( a_ij − Σ_{k=1}^{j−1} l_ik u_kj )   for i > j

Evaluation proceeds in column order of the matrix (for storage)

A = [ u11  u12  u13  ⋯  u1n ]
    [ l21  u22  u23  ⋯  u2n ]
    [ l31  l32  u33  ⋯  u3n ]
    [ ⋮    ⋮    ⋮    ⋱  ⋮   ]
    [ ln1  ln2  ln3  ⋯  unn ]
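A direct transcription of Doolittle's algorithm into Python/NumPy (a sketch assuming all leading minors of A are non-zero, so no pivoting):

import numpy as np

def doolittle(A):
    # LU-decompose A with unit diagonal in L (no pivoting; sketch).
    A = A.astype(float)
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for j in range(n):                         # column order, as above
        for i in range(j + 1):                 # 1 <= i <= j
            U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
        for i in range(j + 1, n):              # i > j
            L[i, j] = (A[i, j] - L[i, :j] @ U[:j, j]) / U[j, j]
    return L, U

A = np.array([[4.0, 3.0, 2.0], [8.0, 7.0, 9.0], [4.0, 6.0, 5.0]])
L, U = doolittle(A)
print(np.allclose(L @ U, A))   # True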

109,

Mathematical Methods in Engineering and Science

LU Decomposition

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Question: What about matrices which are not LU-decomposable?


Question: What about pivoting?

110,

Mathematical Methods in Engineering and Science

LU Decomposition

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Question: What about matrices which are not LU-decomposable?


Question: What about pivoting?
Consider the non-singular matrix

[ 0  1  2 ]   [ 1        0    0 ] [ u11 = 0  u12  u13 ]
[ 3  1  2 ] = [ l21 = ?  1    0 ] [ 0        u22  u23 ] .
[ 2  1  3 ]   [ l31      l32  1 ] [ 0        0    u33 ]

111,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

LU Decomposition

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Question: What about matrices which are not LU-decomposable?


Question: What about pivoting?
Consider the non-singular matrix

[ 0  1  2 ]   [ 1        0    0 ] [ u11 = 0  u12  u13 ]
[ 3  1  2 ] = [ l21 = ?  1    0 ] [ 0        u22  u23 ] .
[ 2  1  3 ]   [ l31      l32  1 ] [ 0        0    u33 ]

LU-decompose a permutation of its rows:

[ 0  1  2 ]   [ 0  1  0 ] [ 3  1  2 ]
[ 3  1  2 ] = [ 1  0  0 ] [ 0  1  2 ]
[ 2  1  3 ]   [ 0  0  1 ] [ 2  1  3 ]

              [ 0  1  0 ] [ 1    0    0 ] [ 3  1  2 ]
            = [ 1  0  0 ] [ 0    1    0 ] [ 0  1  2 ] .
              [ 0  0  1 ] [ 2/3  1/3  1 ] [ 0  0  1 ]
112,

Mathematical Methods in Engineering and Science

Gauss Elimination Family of Methods

LU Decomposition

Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

Question: What about matrices which are not LU-decomposable?


Question: What about pivoting?
Consider the non-singular matrix

[ 0  1  2 ]   [ 1        0    0 ] [ u11 = 0  u12  u13 ]
[ 3  1  2 ] = [ l21 = ?  1    0 ] [ 0        u22  u23 ] .
[ 2  1  3 ]   [ l31      l32  1 ] [ 0        0    u33 ]

LU-decompose a permutation of its rows:

[ 0  1  2 ]   [ 0  1  0 ] [ 3  1  2 ]
[ 3  1  2 ] = [ 1  0  0 ] [ 0  1  2 ]
[ 2  1  3 ]   [ 0  0  1 ] [ 2  1  3 ]

              [ 0  1  0 ] [ 1    0    0 ] [ 3  1  2 ]
            = [ 1  0  0 ] [ 0    1    0 ] [ 0  1  2 ] .
              [ 0  0  1 ] [ 2/3  1/3  1 ] [ 0  0  1 ]
In this PLU decomposition, permutation P is recorded in a vector.
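For comparison, SciPy's built-in factorization with partial pivoting can be applied to the same matrix; it should return an equivalent P, L, U triple (a sketch, not part of the course material):

import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 1.0, 2.0],
              [3.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
P, L, U = lu(A)          # A = P @ L @ U, with row permutation P
print(P)
print(L)
print(U)
print(np.allclose(P @ L @ U, A))   # True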

113,

Mathematical Methods in Engineering and Science

Points to note

Gauss Elimination Family of Methods


Gauss-Jordan Elimination
Gaussian Elimination with Back-Substitution
LU Decomposition

For invertible coefficient matrices, use

Gauss-Jordan elimination for large number of RHS vectors


available all together and also for matrix inversion,

Gaussian elimination with back-substitution for small number


of RHS vectors available together,

LU decomposition method to develop and maintain factors to


be used as and when RHS vectors are available.

Pivoting is almost necessary (without further special structure).


Necessary Exercises: 1,4,5

114,

Mathematical Methods in Engineering and Science

Outline

Special Systems and Special Methods

115,

Quadratic Forms, Symmetry and Positive Definiteness


Cholesky Decomposition
Sparse Systems*

Special Systems and Special Methods


Quadratic Forms, Symmetry and Positive Definiteness
Cholesky Decomposition
Sparse Systems*

Mathematical Methods in Engineering and Science

Special Systems and Special Methods

116,

Quadratic Forms, Symmetry and Positive Definiteness


Quadratic Forms, Symmetry and Positive
Definiteness
Cholesky Decomposition
Sparse Systems*

Quadratic form

q(x) = x^T A x = Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j

Mathematical Methods in Engineering and Science

Special Systems and Special Methods

117,

Quadratic Forms, Symmetry and Positive Definiteness


Quadratic Forms, Symmetry and Positive
Definiteness
Cholesky Decomposition
Sparse Systems*

Quadratic form

q(x) = x^T A x = Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j

defined with respect to a symmetric matrix.

Mathematical Methods in Engineering and Science

Special Systems and Special Methods

118,

Quadratic Forms, Symmetry and Positive Definiteness


Quadratic Forms, Symmetry and Positive
Definiteness
Cholesky Decomposition
Sparse Systems*

Quadratic form

q(x) = x^T A x = Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j

defined with respect to a symmetric matrix.

Quadratic form q(x), equivalently matrix A, is called positive
definite (p.d.) when

x^T A x > 0   ∀ x ≠ 0

Mathematical Methods in Engineering and Science

Special Systems and Special Methods

119,

Quadratic Forms, Symmetry and Positive Definiteness


Quadratic Forms, Symmetry and Positive
Definiteness
Cholesky Decomposition
Sparse Systems*

Quadratic form

q(x) = x^T A x = Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j

defined with respect to a symmetric matrix.

Quadratic form q(x), equivalently matrix A, is called positive
definite (p.d.) when

x^T A x > 0   ∀ x ≠ 0

and positive semi-definite (p.s.d.) when

x^T A x ≥ 0   ∀ x ≠ 0.

Mathematical Methods in Engineering and Science

Special Systems and Special Methods

120,

Quadratic Forms, Symmetry and Positive Definiteness


Quadratic Forms, Symmetry and Positive
Definiteness
Cholesky Decomposition
Sparse Systems*

Quadratic form

q(x) = x^T A x = Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j

defined with respect to a symmetric matrix.

Quadratic form q(x), equivalently matrix A, is called positive
definite (p.d.) when

x^T A x > 0   ∀ x ≠ 0

and positive semi-definite (p.s.d.) when

x^T A x ≥ 0   ∀ x ≠ 0.

Sylvester's criteria:

a11 ≥ 0,   | a11  a12 |
           | a21  a22 | ≥ 0,   …,   det A ≥ 0;

i.e. all leading minors non-negative, for p.s.d.
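Numerically, definiteness of a symmetric matrix is commonly checked through its eigenvalues or by attempting a Cholesky factorization; a hedged NumPy sketch (the test matrix is made up):

import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # symmetric test matrix
eig = np.linalg.eigvalsh(A)              # real eigenvalues of a symmetric matrix
print("positive definite:", np.all(eig > 0))
print("positive semi-definite:", np.all(eig >= 0))
try:
    np.linalg.cholesky(A)                # succeeds only for p.d. matrices
    print("Cholesky factorization exists")
except np.linalg.LinAlgError:
    print("not positive definite")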

Mathematical Methods in Engineering and Science

Special Systems and Special Methods

Cholesky Decomposition

121,

Quadratic Forms, Symmetry and Positive Definiteness


Cholesky Decomposition
Sparse Systems*

If A ∈ R^{n×n} is symmetric and positive definite, then there exists a
non-singular lower triangular matrix L ∈ R^{n×n} such that
A = LL^T.

Mathematical Methods in Engineering and Science

Special Systems and Special Methods

Cholesky Decomposition

122,

Quadratic Forms, Symmetry and Positive Definiteness


Cholesky Decomposition
Sparse Systems*

If A ∈ R^{n×n} is symmetric and positive definite, then there exists a
non-singular lower triangular matrix L ∈ R^{n×n} such that
A = LL^T.

Algorithm: For i = 1, 2, 3, …, n

  L_ii = sqrt( a_ii − Σ_{k=1}^{i−1} L_ik² )
  L_ji = (1/L_ii) ( a_ji − Σ_{k=1}^{i−1} L_jk L_ik )   for i < j ≤ n

Mathematical Methods in Engineering and Science

Special Systems and Special Methods

Cholesky Decomposition

123,

Quadratic Forms, Symmetry and Positive Definiteness


Cholesky Decomposition
Sparse Systems*

If A ∈ R^{n×n} is symmetric and positive definite, then there exists a
non-singular lower triangular matrix L ∈ R^{n×n} such that
A = L L^T.

Algorithm: For i = 1, 2, 3, ..., n

L_ii = √( a_ii - Σ_{k=1}^{i-1} L_ik^2 )

L_ji = (1 / L_ii) ( a_ji - Σ_{k=1}^{i-1} L_jk L_ik )   for i < j ≤ n

For solving Ax = b,

Forward substitutions: L y = b
Back-substitutions: L^T x = y

Mathematical Methods in Engineering and Science

Special Systems and Special Methods

Cholesky Decomposition

124,

Quadratic Forms, Symmetry and Positive Definiteness


Cholesky Decomposition
Sparse Systems*

If A ∈ R^{n×n} is symmetric and positive definite, then there exists a
non-singular lower triangular matrix L ∈ R^{n×n} such that
A = L L^T.

Algorithm: For i = 1, 2, 3, ..., n

L_ii = √( a_ii - Σ_{k=1}^{i-1} L_ik^2 )

L_ji = (1 / L_ii) ( a_ji - Σ_{k=1}^{i-1} L_jk L_ik )   for i < j ≤ n

For solving Ax = b,

Forward substitutions: L y = b
Back-substitutions: L^T x = y
Remarks

Test of positive definiteness.

Stable algorithm: no pivoting necessary!

Economy of space and time.
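A minimal sketch of the algorithm and the two triangular solves above, in Python/NumPy; the function name and the test matrix are illustrative, not from the textbook.

```python
import numpy as np

def cholesky_solve(A, b):
    """Solve Ax = b for symmetric positive definite A via A = L L^T."""
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for i in range(n):
        # L_ii = sqrt(a_ii - sum_k L_ik^2); a non-positive radicand flags "not p.d."
        s = A[i, i] - np.dot(L[i, :i], L[i, :i])
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        L[i, i] = np.sqrt(s)
        for j in range(i + 1, n):
            L[j, i] = (A[j, i] - np.dot(L[j, :i], L[i, :i])) / L[i, i]
    # Forward substitution: L y = b
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - np.dot(L[i, :i], y[:i])) / L[i, i]
    # Back-substitution: L^T x = y
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - np.dot(L[i + 1:, i], x[i + 1:])) / L[i, i]
    return x

A = np.array([[4.0, 2.0], [2.0, 3.0]])
b = np.array([1.0, 2.0])
print(cholesky_solve(A, b))   # compare with np.linalg.solve(A, b)
```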

Mathematical Methods in Engineering and Science

Sparse Systems*

What is a sparse matrix?

Bandedness and bandwidth

Efficient storage and processing


Updates

125,

Special Systems and Special Methods

Quadratic Forms, Symmetry and Positive Definiteness
Cholesky Decomposition
Sparse Systems*

Sherman-Morrison formula

(A + u v^T)^{-1} = A^{-1} - (A^{-1} u)(v^T A^{-1}) / (1 + v^T A^{-1} u)

Woodbury formula

Conjugate gradient method

efficiently implemented matrix-vector products
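A small sketch of how the Sherman-Morrison formula updates a known inverse after a rank-one change, avoiding a fresh factorization; the helper name and test data are illustrative.

```python
import numpy as np

def sherman_morrison(A_inv, u, v):
    """Return (A + u v^T)^{-1} given A^{-1}, via the Sherman-Morrison formula."""
    Au = A_inv @ u                 # A^{-1} u
    vA = v @ A_inv                 # v^T A^{-1}
    denom = 1.0 + v @ Au           # 1 + v^T A^{-1} u (must be nonzero)
    return A_inv - np.outer(Au, vA) / denom

A = np.array([[3.0, 1.0], [1.0, 2.0]])
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
updated = sherman_morrison(np.linalg.inv(A), u, v)
print(np.allclose(updated, np.linalg.inv(A + np.outer(u, v))))  # True
```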

Mathematical Methods in Engineering and Science

Points to note

Special Systems and Special Methods

Concepts and criteria of positive definiteness and positive


semi-definiteness

Cholesky decomposition method in symmetric positive definite


systems

Nature of sparsity and its exploitation

Necessary Exercises: 1,2,4,7

126,

Quadratic Forms, Symmetry and Positive Definiteness


Cholesky Decomposition
Sparse Systems*

Mathematical Methods in Engineering and Science

Outline

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

127,

Mathematical Methods in Engineering and Science

Norms and Condition Numbers


Norm of a vector: a measure of size

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Euclidean norm or 2-norm

||x|| = ||x||_2 = (x_1^2 + x_2^2 + ... + x_n^2)^{1/2} = √(x^T x)

128,

Mathematical Methods in Engineering and Science

Norms and Condition Numbers


Norm of a vector: a measure of size

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Euclidean norm or 2-norm

||x|| = ||x||_2 = (x_1^2 + x_2^2 + ... + x_n^2)^{1/2} = √(x^T x)

The p-norm

||x||_p = [ |x_1|^p + |x_2|^p + ... + |x_n|^p ]^{1/p}

129,

Mathematical Methods in Engineering and Science

Norms and Condition Numbers


Norm of a vector: a measure of size

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Euclidean norm or 2-norm

||x|| = ||x||_2 = (x_1^2 + x_2^2 + ... + x_n^2)^{1/2} = √(x^T x)

The p-norm

||x||_p = [ |x_1|^p + |x_2|^p + ... + |x_n|^p ]^{1/p}

The 1-norm: ||x||_1 = |x_1| + |x_2| + ... + |x_n|

The ∞-norm:

||x||_∞ = lim_{p→∞} [ |x_1|^p + |x_2|^p + ... + |x_n|^p ]^{1/p} = max_j |x_j|

130,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Norms and Condition Numbers


Norm of a vector: a measure of size

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Euclidean norm or 2-norm

||x|| = ||x||_2 = (x_1^2 + x_2^2 + ... + x_n^2)^{1/2} = √(x^T x)

The p-norm

||x||_p = [ |x_1|^p + |x_2|^p + ... + |x_n|^p ]^{1/p}

The 1-norm: ||x||_1 = |x_1| + |x_2| + ... + |x_n|

The ∞-norm:

||x||_∞ = lim_{p→∞} [ |x_1|^p + |x_2|^p + ... + |x_n|^p ]^{1/p} = max_j |x_j|

Weighted norm

||x||_w = √( x^T W x )

where weight matrix W is symmetric and positive definite.
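The norms above, computed directly in NumPy as a quick sketch (np.linalg.norm returns the same values for p = 1, 2, ∞); the sample vector and weight matrix are illustrative.

```python
import numpy as np

x = np.array([3.0, -4.0, 12.0])
W = np.diag([1.0, 2.0, 0.5])             # any symmetric positive definite weight

norm_2   = np.sqrt(x @ x)                          # Euclidean norm
norm_p   = (np.abs(x) ** 3).sum() ** (1 / 3)       # p-norm with p = 3
norm_1   = np.abs(x).sum()                         # 1-norm
norm_inf = np.abs(x).max()                         # infinity-norm (limit of p-norms)
norm_w   = np.sqrt(x @ W @ x)                      # weighted norm

print(norm_2, norm_p, norm_1, norm_inf, norm_w)
```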

131,

Mathematical Methods in Engineering and Science

Norms and Condition Numbers

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Norm of a matrix: magnitude or scale of the transformation

132,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Norms and Condition Numbers

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Norm of a matrix: magnitude or scale of the transformation


Matrix norm (induced by a vector norm) is given by the largest
magnification it can produce on a vector:

||A|| = max_{x≠0} ||Ax|| / ||x|| = max_{||x||=1} ||Ax||

133,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Norms and Condition Numbers

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Norm of a matrix: magnitude or scale of the transformation


Matrix norm (induced by a vector norm) is given by the largest
magnification it can produce on a vector:

||A|| = max_{x≠0} ||Ax|| / ||x|| = max_{||x||=1} ||Ax||

Direct consequence: ||Ax|| ≤ ||A|| ||x||

134,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Norms and Condition Numbers

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Norm of a matrix: magnitude or scale of the transformation


Matrix norm (induced by a vector norm) is given by the largest
magnification it can produce on a vector:

||A|| = max_{x≠0} ||Ax|| / ||x|| = max_{||x||=1} ||Ax||

Direct consequence: ||Ax|| ≤ ||A|| ||x||

Index of closeness to singularity: Condition number

κ(A) = ||A|| ||A^{-1}||,   1 ≤ κ(A) ≤ ∞

135,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Norms and Condition Numbers

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Norm of a matrix: magnitude or scale of the transformation


Matrix norm (induced by a vector norm) is given by the largest
magnification it can produce on a vector:

||A|| = max_{x≠0} ||Ax|| / ||x|| = max_{||x||=1} ||Ax||

Direct consequence: ||Ax|| ≤ ||A|| ||x||

Index of closeness to singularity: Condition number

κ(A) = ||A|| ||A^{-1}||,   1 ≤ κ(A) ≤ ∞

** Isotropic, well-conditioned, ill-conditioned and singular matrices
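A quick illustrative sketch of the induced 2-norm and the condition number: for the 2-norm these equal the largest singular value and the ratio of extreme singular values.

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.001]])   # nearly singular

sigma = np.linalg.svd(A, compute_uv=False) # singular values
norm_A = sigma.max()                       # equals np.linalg.norm(A, 2)
kappa = sigma.max() / sigma.min()          # equals np.linalg.cond(A)

print(norm_A, kappa)                       # kappa >> 1 signals ill-conditioning
```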

136,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Ill-conditioning and Sensitivity

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

0.9999 x_1 - 1.0001 x_2 = 1
x_1 - x_2 = 1 + ε

Solution: x_1 = (10001 ε + 1) / 2,   x_2 = (9999 ε - 1) / 2

137,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Ill-conditioning and Sensitivity

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

0.9999 x_1 - 1.0001 x_2 = 1
x_1 - x_2 = 1 + ε

Solution: x_1 = (10001 ε + 1) / 2,   x_2 = (9999 ε - 1) / 2

sensitive to small changes in the RHS


insensitive to error in a guess

See illustration

138,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Ill-conditioning and Sensitivity

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

0.9999 x_1 - 1.0001 x_2 = 1
x_1 - x_2 = 1 + ε

Solution: x_1 = (10001 ε + 1) / 2,   x_2 = (9999 ε - 1) / 2

sensitive to small changes in the RHS

insensitive to error in a guess

For the system Ax = b, the solution is x = A^{-1} b and

δx = A^{-1} δb - A^{-1} δA x

See illustration

139,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Ill-conditioning and Sensitivity

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

0.9999 x_1 - 1.0001 x_2 = 1
x_1 - x_2 = 1 + ε

Solution: x_1 = (10001 ε + 1) / 2,   x_2 = (9999 ε - 1) / 2

sensitive to small changes in the RHS

insensitive to error in a guess

See illustration

For the system Ax = b, the solution is x = A^{-1} b and

δx = A^{-1} δb - A^{-1} δA x

If the matrix A is exactly known, then

||δx|| / ||x|| ≤ ||A|| ||A^{-1}|| ||δb|| / ||b|| = κ(A) ||δb|| / ||b||

140,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Ill-conditioning and Sensitivity

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

0.9999 x_1 - 1.0001 x_2 = 1
x_1 - x_2 = 1 + ε

Solution: x_1 = (10001 ε + 1) / 2,   x_2 = (9999 ε - 1) / 2

sensitive to small changes in the RHS

insensitive to error in a guess

See illustration

For the system Ax = b, the solution is x = A^{-1} b and

δx = A^{-1} δb - A^{-1} δA x

If the matrix A is exactly known, then

||δx|| / ||x|| ≤ ||A|| ||A^{-1}|| ||δb|| / ||b|| = κ(A) ||δb|| / ||b||

If the RHS is known exactly, then

||δx|| / ||x|| ≤ ||A|| ||A^{-1}|| ||δA|| / ||A|| = κ(A) ||δA|| / ||A||
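A numerical check (sketch) of the example above: a tiny perturbation ε of the right-hand side shifts the solution by roughly κ(A) times the relative perturbation, with κ(A) of the order of 10^4 here.

```python
import numpy as np

A = np.array([[0.9999, -1.0001], [1.0, -1.0]])
b = np.array([1.0, 1.0])

x0 = np.linalg.solve(A, b)                 # case eps = 0: x = (0.5, -0.5)
for eps in (1e-4, 1e-3):
    x = np.linalg.solve(A, b + np.array([0.0, eps]))
    print(eps, x, np.linalg.norm(x - x0))  # large shifts for tiny eps

print("condition number:", np.linalg.cond(A))
```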

141,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Ill-conditioning and Sensitivity


X2
(2)

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
X2
Iterative Methods

(2)

(1)

o
X

(a) Reference system

X2

(2)

(2)
(1)

(c)

(c) Guess validation

X1

(a)

(b) Parallel shift

X2

(b)

X1

(a)

(2b)
(1)

(2d)
(1)

X1

(d) Singularity

Figure: Ill-conditioning: a geometric perspective

X1

142,

Mathematical Methods in Engineering and Science

Rectangular Systems

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Consider Ax = b with A ∈ R^{m×n} and Rank(A) = n < m.

A^T A x = A^T b   ⇒   x = (A^T A)^{-1} A^T b

143,

Mathematical Methods in Engineering and Science

Rectangular Systems

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Consider Ax = b with A ∈ R^{m×n} and Rank(A) = n < m.

A^T A x = A^T b   ⇒   x = (A^T A)^{-1} A^T b

Square of error norm

U(x) = (1/2) ||Ax - b||^2 = (1/2) (Ax - b)^T (Ax - b)
     = (1/2) x^T A^T A x - x^T A^T b + (1/2) b^T b

Least square error solution:

∂U/∂x = A^T A x - A^T b = 0

144,

Mathematical Methods in Engineering and Science

Rectangular Systems

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Consider Ax = b with A ∈ R^{m×n} and Rank(A) = n < m.

A^T A x = A^T b   ⇒   x = (A^T A)^{-1} A^T b

Square of error norm

U(x) = (1/2) ||Ax - b||^2 = (1/2) (Ax - b)^T (Ax - b)
     = (1/2) x^T A^T A x - x^T A^T b + (1/2) b^T b

Least square error solution:

∂U/∂x = A^T A x - A^T b = 0

Pseudoinverse or Moore-Penrose inverse or left-inverse
A# = (A^T A)^{-1} A^T

145,

Mathematical Methods in Engineering and Science

Rectangular Systems

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Consider Ax = b with A ∈ R^{m×n} and Rank(A) = m < n.

146,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Rectangular Systems

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Consider Ax = b with A ∈ R^{m×n} and Rank(A) = m < n.

Look for λ ∈ R^m that satisfies A^T λ = x and
A A^T λ = b

Solution
x = A^T λ = A^T (A A^T)^{-1} b

147,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Rectangular Systems

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Consider Ax = b with A ∈ R^{m×n} and Rank(A) = m < n.

Look for λ ∈ R^m that satisfies A^T λ = x and
A A^T λ = b

Solution
x = A^T λ = A^T (A A^T)^{-1} b

Consider the problem
minimize U(x) = (1/2) x^T x    subject to Ax = b.

148,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Rectangular Systems

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Consider Ax = b with A ∈ R^{m×n} and Rank(A) = m < n.

Look for λ ∈ R^m that satisfies A^T λ = x and
A A^T λ = b

Solution
x = A^T λ = A^T (A A^T)^{-1} b

Consider the problem
minimize U(x) = (1/2) x^T x    subject to Ax = b.

Extremum of the Lagrangian L(x, λ) = (1/2) x^T x - λ^T (Ax - b) is
given by

∂L/∂x = 0, ∂L/∂λ = 0   ⇒   x = A^T λ,  Ax = b.

Solution x = A^T (A A^T)^{-1} b gives the foot of the perpendicular on the
solution plane and the pseudoinverse
A# = A^T (A A^T)^{-1}
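A sketch of both pseudoinverses: the left-inverse for the over-determined full-column-rank case and the right-inverse giving the minimum-norm solution in the under-determined case; the matrices here are illustrative, and np.linalg.pinv returns the same results for them.

```python
import numpy as np

# Over-determined: m > n, Rank(A) = n  ->  least-square-error solution
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 2.9])
A_pinv_left = np.linalg.inv(A.T @ A) @ A.T        # (A^T A)^{-1} A^T
x_ls = A_pinv_left @ b

# Under-determined: m < n, Rank(A) = m  ->  minimum-norm solution
B = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
c = np.array([1.0, 2.0])
B_pinv_right = B.T @ np.linalg.inv(B @ B.T)       # A^T (A A^T)^{-1}
x_min = B_pinv_right @ c

print(x_ls, x_min)
print(np.allclose(B @ x_min, c))                  # constraints satisfied exactly
```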

149,

Mathematical Methods in Engineering and Science

Singularity-Robust Solutions

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Ill-posed problems: Tikhonov regularization

recipe for any linear system (m > n, m = n or m < n), with


any condition!

150,

Mathematical Methods in Engineering and Science

Singularity-Robust Solutions

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Ill-posed problems: Tikhonov regularization

recipe for any linear system (m > n, m = n or m < n), with


any condition!

Ax = b may have conflict: form A^T A x = A^T b.

151,

Mathematical Methods in Engineering and Science

Singularity-Robust Solutions

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Ill-posed problems: Tikhonov regularization

recipe for any linear system (m > n, m = n or m < n), with


any condition!

Ax = b may have conflict: form A^T A x = A^T b.

A^T A may be ill-conditioned: rig the system as
(A^T A + ν^2 I_n) x = A^T b
Coefficient matrix: symmetric and positive definite!

152,

Mathematical Methods in Engineering and Science

Singularity-Robust Solutions

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Ill-posed problems: Tikhonov regularization

recipe for any linear system (m > n, m = n or m < n), with


any condition!

Ax = b may have conflict: form A^T A x = A^T b.

A^T A may be ill-conditioned: rig the system as
(A^T A + ν^2 I_n) x = A^T b
Coefficient matrix: symmetric and positive definite!
The idea: Immunize the system, paying a small price.

153,

Mathematical Methods in Engineering and Science

Singularity-Robust Solutions

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Ill-posed problems: Tikhonov regularization

recipe for any linear system (m > n, m = n or m < n), with


any condition!

Ax = b may have conflict: form A^T A x = A^T b.

A^T A may be ill-conditioned: rig the system as
(A^T A + ν^2 I_n) x = A^T b
Coefficient matrix: symmetric and positive definite!
The idea: Immunize the system, paying a small price.
Issues:

The choice of ?

154,

Mathematical Methods in Engineering and Science

Singularity-Robust Solutions

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Ill-posed problems: Tikhonov regularization

recipe for any linear system (m > n, m = n or m < n), with


any condition!

Ax = b may have conflict: form A^T A x = A^T b.

A^T A may be ill-conditioned: rig the system as
(A^T A + ν^2 I_n) x = A^T b
Coefficient matrix: symmetric and positive definite!
The idea: Immunize the system, paying a small price.
Issues:

The choice of ν?

When m < n, computational advantage by

(A A^T + ν^2 I_m) λ = b,   x = A^T λ
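A sketch of the regularized solve (ν denotes the regularization parameter, whose choice is problem-dependent; function name and test data are illustrative):

```python
import numpy as np

def tikhonov_solve(A, b, nu):
    """Singularity-robust solution of Ax = b via (A^T A + nu^2 I) x = A^T b."""
    m, n = A.shape
    if m >= n:
        return np.linalg.solve(A.T @ A + nu**2 * np.eye(n), A.T @ b)
    # m < n: cheaper to work with the smaller m x m system, then x = A^T lam
    lam = np.linalg.solve(A @ A.T + nu**2 * np.eye(m), b)
    return A.T @ lam

A = np.array([[1.0, 1.0], [1.0, 1.0000001], [2.0, 2.0]])   # nearly rank-deficient
b = np.array([2.0, 2.0, 4.0])
print(tikhonov_solve(A, b, nu=1e-3))
```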

155,

Mathematical Methods in Engineering and Science

Iterative Methods

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Jacobi's iteration method:

x_i^{(k+1)} = (1 / a_ii) [ b_i - Σ_{j=1, j≠i}^{n} a_ij x_j^{(k)} ]   for i = 1, 2, 3, ..., n.

156,

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Iterative Methods

157,

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Jacobi's iteration method:

x_i^{(k+1)} = (1 / a_ii) [ b_i - Σ_{j=1, j≠i}^{n} a_ij x_j^{(k)} ]   for i = 1, 2, 3, ..., n.

Gauss-Seidel method:

x_i^{(k+1)} = (1 / a_ii) [ b_i - Σ_{j=1}^{i-1} a_ij x_j^{(k+1)} - Σ_{j=i+1}^{n} a_ij x_j^{(k)} ]   for i = 1, 2, 3, ..., n.

Mathematical Methods in Engineering and Science

Numerical Aspects in Linear Systems

Iterative Methods

158,

Norms and Condition Numbers


Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Jacobi's iteration method:

x_i^{(k+1)} = (1 / a_ii) [ b_i - Σ_{j=1, j≠i}^{n} a_ij x_j^{(k)} ]   for i = 1, 2, 3, ..., n.

Gauss-Seidel method:

x_i^{(k+1)} = (1 / a_ii) [ b_i - Σ_{j=1}^{i-1} a_ij x_j^{(k+1)} - Σ_{j=i+1}^{n} a_ij x_j^{(k)} ]   for i = 1, 2, 3, ..., n.

The category of relaxation methods:


diagonal dominance and availability of good initial
approximations
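Minimal sketches of both iterations; the diagonally dominant test system is illustrative and a fixed iteration count stands in for a proper convergence test.

```python
import numpy as np

def jacobi(A, b, x0, iters=50):
    n = len(b)
    x = x0.copy()
    for _ in range(iters):
        x_new = np.empty(n)
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]   # old values only
            x_new[i] = (b[i] - s) / A[i, i]
        x = x_new
    return x

def gauss_seidel(A, b, x0, iters=50):
    n = len(b)
    x = x0.copy()
    for _ in range(iters):
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]   # x[:i] already updated
            x[i] = (b[i] - s) / A[i, i]
    return x

A = np.array([[4.0, 1.0, 1.0], [1.0, 5.0, 2.0], [0.0, 2.0, 6.0]])  # diagonally dominant
b = np.array([6.0, 8.0, 8.0])
x0 = np.zeros(3)
print(jacobi(A, b, x0), gauss_seidel(A, b, x0))   # both approach np.linalg.solve(A, b)
```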

Mathematical Methods in Engineering and Science

Points to note

Numerical Aspects in Linear Systems


Norms and Condition Numbers
Ill-conditioning and Sensitivity
Rectangular Systems
Singularity-Robust Solutions
Iterative Methods

Solutions are unreliable when the coefficient matrix is


ill-conditioned.

Finding pseudoinverse of a full-rank matrix is easy.

Tikhonov regularization provides singularity-robust solutions.

Iterative methods may have an edge in certain situations!

Necessary Exercises: 1,2,3,4

159,

Mathematical Methods in Engineering and Science

Outline

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

160,

Mathematical Methods in Engineering and Science

Eigenvalue Problem

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method
In mapping A : R^n → R^n, special vectors of matrix A ∈ R^{n×n} are
mapped to scalar multiples, i.e. undergo pure scaling

161,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Eigenvalue Problem

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method
In mapping A : R^n → R^n, special vectors of matrix A ∈ R^{n×n} are
mapped to scalar multiples, i.e. undergo pure scaling:

Av = λv

Eigenvector (v) and eigenvalue (λ): eigenpair (λ, v)

162,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Eigenvalue Problem

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method
In mapping A : R^n → R^n, special vectors of matrix A ∈ R^{n×n} are
mapped to scalar multiples, i.e. undergo pure scaling:

Av = λv

Eigenvector (v) and eigenvalue (λ): eigenpair (λ, v)

algebraic eigenvalue problem

163,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Eigenvalue Problem

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method
In mapping A : R^n → R^n, special vectors of matrix A ∈ R^{n×n} are
mapped to scalar multiples, i.e. undergo pure scaling:

Av = λv

Eigenvector (v) and eigenvalue (λ): eigenpair (λ, v)

algebraic eigenvalue problem
(λI - A)v = 0
For non-trivial (non-zero) solution v,
det(λI - A) = 0
Characteristic equation: characteristic polynomial: n roots

n eigenvalues; for each, find eigenvector(s)

164,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Eigenvalue Problem

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method
In mapping A : R^n → R^n, special vectors of matrix A ∈ R^{n×n} are
mapped to scalar multiples, i.e. undergo pure scaling:

Av = λv

Eigenvector (v) and eigenvalue (λ): eigenpair (λ, v)

algebraic eigenvalue problem
(λI - A)v = 0
For non-trivial (non-zero) solution v,
det(λI - A) = 0
Characteristic equation: characteristic polynomial: n roots

n eigenvalues; for each, find eigenvector(s)

Multiplicity of an eigenvalue: algebraic and geometric

165,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Eigenvalue Problem

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method
In mapping A : R^n → R^n, special vectors of matrix A ∈ R^{n×n} are
mapped to scalar multiples, i.e. undergo pure scaling:

Av = λv

Eigenvector (v) and eigenvalue (λ): eigenpair (λ, v)

algebraic eigenvalue problem
(λI - A)v = 0
For non-trivial (non-zero) solution v,
det(λI - A) = 0
Characteristic equation: characteristic polynomial: n roots

n eigenvalues; for each, find eigenvector(s)

Multiplicity of an eigenvalue: algebraic and geometric

Multiplicity mismatch: diagonalizable and defective matrices

166,

Mathematical Methods in Engineering and Science

Generalized Eigenvalue Problem


1-dof mass-spring system: m
x + kx = 0
Natural frequency of vibration: n =

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

k
m

167,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Generalized Eigenvalue Problem


1-dof mass-spring system: m
x + kx = 0
Natural frequency of vibration: n =
Free vibration of n-dof system:

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

k
m

M
x + Kx = 0,
Natural frequencies and corresponding modes?

168,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Generalized Eigenvalue Problem

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

1-dof mass-spring system: m


x + kx = 0
Natural frequency of vibration: n =
Free vibration of n-dof system:

k
m

M
x + Kx = 0,
Natural frequencies and corresponding modes?
Assuming a vibration mode x = sin(t + ),
( 2 M + K) sin(t + ) = 0

K = 2 M

169,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Generalized Eigenvalue Problem


1-dof mass-spring system: m
x + kx = 0
Natural frequency of vibration: n =
Free vibration of n-dof system:

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

k
m

M
x + Kx = 0,
Natural frequencies and corresponding modes?
Assuming a vibration mode x = sin(t + ),
( 2 M + K) sin(t + ) = 0 K = 2 M

Reduce as M1 K = 2 ? Why is it not a good idea?

170,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Generalized Eigenvalue Problem


1-dof mass-spring system: m ẍ + k x = 0
Natural frequency of vibration: ω_n = √(k/m)
Free vibration of n-dof system:

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

M ẍ + K x = 0
Natural frequencies and corresponding modes?
Assuming a vibration mode x = φ sin(ωt + ψ),
(-ω^2 M + K) φ sin(ωt + ψ) = 0   ⇒   K φ = ω^2 M φ

Reduce as M^{-1} K φ = ω^2 φ? Why is it not a good idea?
K symmetric, M symmetric and positive definite!!

With M = L L^T,  φ̃ = L^T φ  and  K̃ = L^{-1} K L^{-T},

K̃ φ̃ = ω^2 φ̃
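A sketch of this symmetry-preserving reduction in NumPy: the generalized problem Kφ = ω²Mφ becomes a standard symmetric problem for L⁻¹KL⁻ᵀ. The stiffness and mass matrices here are illustrative.

```python
import numpy as np

K = np.array([[4.0, -2.0], [-2.0, 4.0]])     # stiffness: symmetric
M = np.array([[2.0, 0.0], [0.0, 1.0]])       # mass: symmetric positive definite

L = np.linalg.cholesky(M)                    # M = L L^T
Linv = np.linalg.inv(L)
K_tilde = Linv @ K @ Linv.T                  # symmetric standard-form matrix

w2, phi_tilde = np.linalg.eigh(K_tilde)      # K_tilde phi_tilde = omega^2 phi_tilde
omega = np.sqrt(w2)                          # natural frequencies
phi = Linv.T @ phi_tilde                     # modes of the original problem

print(omega)
print(np.allclose(K @ phi, M @ phi * w2))    # K phi = omega^2 M phi
```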

171,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Eigenvalues of transpose
Eigenvalues of AT are the same as those of A.

Caution: Eigenvectors of A and AT need not be same.

172,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Eigenvalues of transpose
Eigenvalues of AT are the same as those of A.

Caution: Eigenvectors of A and AT need not be same.


Diagonal and block diagonal matrices
Eigenvalues of a diagonal matrix are its diagonal entries.
Corresponding eigenvectors: natural basis members (e1 , e2 etc).

173,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Eigenvalues of transpose
Eigenvalues of AT are the same as those of A.

Caution: Eigenvectors of A and AT need not be same.


Diagonal and block diagonal matrices
Eigenvalues of a diagonal matrix are its diagonal entries.
Corresponding eigenvectors: natural basis members (e1 , e2 etc).
Eigenvalues of a block diagonal matrix: those of diagonal blocks.
Eigenvectors: coordinate extensions of individual eigenvectors.
With (λ_2, v_2) as an eigenpair of block A_2,

A [0; v_2; 0] = [ A_1  0  0 ; 0  A_2  0 ; 0  0  A_3 ] [0; v_2; 0] = [0; A_2 v_2; 0] = λ_2 [0; v_2; 0]

174,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Triangular and block triangular matrices


Eigenvalues of a triangular matrix are its diagonal entries.

175,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Triangular and block triangular matrices


Eigenvalues of a triangular matrix are its diagonal entries.

Eigenvalues of a block triangular matrix are the collection of


eigenvalues of its diagonal blocks.

176,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Some Basic Theoretical Results

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Triangular and block triangular matrices


Eigenvalues of a triangular matrix are its diagonal entries.

Eigenvalues of a block triangular matrix are the collection of


eigenvalues of its diagonal blocks.
Take
H=

A B
0 C

A R r r and C R ss

177,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Some Basic Theoretical Results

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Triangular and block triangular matrices


Eigenvalues of a triangular matrix are its diagonal entries.

Eigenvalues of a block triangular matrix are the collection of


eigenvalues of its diagonal blocks.
Take
H=

A B
0 C

A R r r and C R ss

If Av = v, then

 

 
 



v
A B
v
Av
v
v
H
=
=
=
=
0
0 C
0
0
0
0

178,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Some Basic Theoretical Results

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Triangular and block triangular matrices


Eigenvalues of a triangular matrix are its diagonal entries.

Eigenvalues of a block triangular matrix are the collection of


eigenvalues of its diagonal blocks.
Take

H = [ A  B ; 0  C ],   A ∈ R^{r×r} and C ∈ R^{s×s}

If Av = λv, then

H [v; 0] = [ A  B ; 0  C ] [v; 0] = [Av; 0] = [λv; 0] = λ [v; 0]

If μ is an eigenvalue of C, then it is also an eigenvalue of C^T, and

C^T w = μw   ⇒   H^T [0; w] = [ A^T  0 ; B^T  C^T ] [0; w] = [0; μw] = μ [0; w]

179,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Shift theorem
Eigenvectors of A + I are the same as those of A.
Eigenvalues: shifted by .

180,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Some Basic Theoretical Results

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Shift theorem
Eigenvectors of A + μI are the same as those of A.
Eigenvalues: shifted by μ.
Deflation
For a symmetric matrix A, with mutually orthogonal eigenvectors,
having (λ_j, v_j) as an eigenpair,

B = A - λ_j v_j v_j^T / (v_j^T v_j)

has the same eigenstructure as A, except that the eigenvalue
corresponding to v_j is zero.
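A numeric sketch of deflation for a symmetric matrix (illustrative 2 × 2 example): subtracting λ_j v_j v_jᵀ/(v_jᵀ v_j) zeroes out that eigenvalue while leaving the rest of the eigenstructure intact.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])       # symmetric; eigenvalues 1 and 3
lam, V = np.linalg.eigh(A)
j = 1                                        # deflate the eigenvalue 3
vj = V[:, j]

B = A - lam[j] * np.outer(vj, vj) / (vj @ vj)
print(np.linalg.eigvalsh(B))                 # ~ [0, 1]: deflated eigenvalue is now 0
```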

181,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Eigenspace
If v1 , v2 , , vk are eigenvectors of A corresponding to the same
eigenvalue , then
eigenspace: < v1 , v2 , , vk >

182,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Eigenspace
If v1 , v2 , , vk are eigenvectors of A corresponding to the same
eigenvalue , then
eigenspace: < v1 , v2 , , vk >
Similarity transformation
B = S1 AS: same transformation expressed in new basis.

183,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Eigenspace
If v1 , v2 , , vk are eigenvectors of A corresponding to the same
eigenvalue , then
eigenspace: < v1 , v2 , , vk >
Similarity transformation
B = S1 AS: same transformation expressed in new basis.
det(I A) = det S1 det(I A) det S = det(I B)
Same characteristic polynomial!

184,

Mathematical Methods in Engineering and Science

Some Basic Theoretical Results

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Eigenspace
If v_1, v_2, ..., v_k are eigenvectors of A corresponding to the same
eigenvalue λ, then
eigenspace: < v_1, v_2, ..., v_k >
Similarity transformation
B = S^{-1} A S: same transformation expressed in new basis.
det(λI - A) = det S^{-1} det(λI - A) det S = det(λI - B)
Same characteristic polynomial!
Eigenvalues are the property of a linear transformation,
not of the basis.
An eigenvector v of A transforms to S^{-1} v, as the corresponding
eigenvector of B.

185,

Mathematical Methods in Engineering and Science

Power Method
Consider matrix A with

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

|1 | > |2 | |3 | |n1 | > |n |


and a full set of n eigenvectors v1 , v2 , , vn .

186,

Mathematical Methods in Engineering and Science

Power Method
Consider matrix A with

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

|1 | > |2 | |3 | |n1 | > |n |


and a full set of n eigenvectors v1 , v2 , , vn .

For vector x = 1 v1 + 2 v2 + + n vn ,
 p
 p


 p
3
n
2
p
p
2 v2 +
3 v3 + +
n vn
A x = 1 1 v1 +
1
1
1

187,

Mathematical Methods in Engineering and Science

Power Method
Consider matrix A with

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

|1 | > |2 | |3 | |n1 | > |n |


and a full set of n eigenvectors v1 , v2 , , vn .

For vector x = 1 v1 + 2 v2 + + n vn ,
 p
 p


 p
3
n
2
p
p
2 v2 +
3 v3 + +
n vn
A x = 1 1 v1 +
1
1
1
As p , Ap x p1 1 v1 , and

188,

Mathematical Methods in Engineering and Science

Power Method
Consider matrix A with

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

|1 | > |2 | |3 | |n1 | > |n |


and a full set of n eigenvectors v1 , v2 , , vn .

For vector x = 1 v1 + 2 v2 + + n vn ,
 p
 p


 p
3
n
2
p
p
2 v2 +
3 v3 + +
n vn
A x = 1 1 v1 +
1
1
1
As p , Ap x p1 1 v1 , and
(Ap x)r
,
p (Ap1 x)r

1 = lim

r = 1, 2, 3, , n.

189,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Power Method

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Consider matrix A with

|1 | > |2 | |3 | |n1 | > |n |


and a full set of n eigenvectors v1 , v2 , , vn .

For vector x = 1 v1 + 2 v2 + + n vn ,
 p
 p


 p
3
n
2
p
p
2 v2 +
3 v3 + +
n vn
A x = 1 1 v1 +
1
1
1
As p , Ap x p1 1 v1 , and
(Ap x)r
,
p (Ap1 x)r

1 = lim

r = 1, 2, 3, , n.

At convergence, n ratios will be the same.

190,

Mathematical Methods in Engineering and Science

Eigenvalues and Eigenvectors

Power Method

Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Consider matrix A with

|λ_1| > |λ_2| ≥ |λ_3| ≥ ... ≥ |λ_{n-1}| > |λ_n|

and a full set of n eigenvectors v_1, v_2, ..., v_n.

For vector x = α_1 v_1 + α_2 v_2 + ... + α_n v_n,

A^p x = λ_1^p [ α_1 v_1 + α_2 (λ_2/λ_1)^p v_2 + α_3 (λ_3/λ_1)^p v_3 + ... + α_n (λ_n/λ_1)^p v_n ]

As p → ∞, A^p x → λ_1^p α_1 v_1, and

λ_1 = lim_{p→∞} (A^p x)_r / (A^{p-1} x)_r,   r = 1, 2, 3, ..., n.

At convergence, n ratios will be the same.


Question: How to find the least magnitude eigenvalue?
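A sketch of the power method with normalization at every step (equivalent to the ratio formula above); applying the same iteration to A⁻¹, i.e. solving Ay = x at each step, is the usual route to the least magnitude eigenvalue. Function name, tolerances and test matrix are illustrative.

```python
import numpy as np

def power_method(A, iters=200, tol=1e-12):
    """Dominant eigenpair of A by repeated multiplication and normalization."""
    x = np.ones(A.shape[0])
    lam = 0.0
    for _ in range(iters):
        y = A @ x
        lam_new = np.linalg.norm(y) / np.linalg.norm(x)   # magnitude of lambda_1
        x = y / np.linalg.norm(y)
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    lam = x @ A @ x / (x @ x)        # Rayleigh quotient fixes the sign
    return lam, x

A = np.array([[2.0, 1.0], [1.0, 3.0]])
lam1, v1 = power_method(A)
print(lam1, v1)                      # compare with np.linalg.eigh(A)
```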

191,

Mathematical Methods in Engineering and Science

Points to note

Eigenvalues and Eigenvectors


Eigenvalue Problem
Generalized Eigenvalue Problem
Some Basic Theoretical Results
Power Method

Meaning and context of the algebraic eigenvalue problem

Fundamental deductions and vital relationships

Power method as an inexpensive procedure to determine


extremal magnitude eigenvalues

Necessary Exercises: 1,2,3,4,6

192,

Mathematical Methods in Engineering and Science

Outline

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

193,

Mathematical Methods in Engineering and Science

Diagonalizability

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Consider A R nn , having n eigenvectors v1 , v2 , , vn ;


with corresponding eigenvalues 1 , 2 , , n .

194,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Diagonalizability

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Consider A R nn , having n eigenvectors v1 , v2 , , vn ;


with corresponding eigenvalues 1 , 2 , , n .

AS = A[v1

= [v1

v2

v2

vn ] = [1 v1 2 v2 n vn ]

1 0 0
0 2 0

vn ] ..
.. . .
.. = S
.
. .
.
0 0 n

A = SS1

and

S1 AS =

195,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Diagonalizability

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Consider A ∈ R^{n×n}, having n eigenvectors v_1, v_2, ..., v_n;
with corresponding eigenvalues λ_1, λ_2, ..., λ_n.

AS = A [v_1  v_2  ...  v_n] = [λ_1 v_1  λ_2 v_2  ...  λ_n v_n]

   = [v_1  v_2  ...  v_n] diag(λ_1, λ_2, ..., λ_n) = SΛ

⇒   A = S Λ S^{-1}   and   S^{-1} A S = Λ

Diagonalization: The process of changing the basis of a linear


transformation so that its new matrix representation is diagonal,
i.e. so that it is decoupled among its coordinates.
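A quick check (sketch, with an illustrative matrix) of the relations A = SΛS⁻¹ and S⁻¹AS = Λ using NumPy's eigen-decomposition:

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])        # distinct eigenvalues -> diagonalizable
lam, S = np.linalg.eig(A)                     # columns of S are eigenvectors
Lam = np.diag(lam)

print(np.allclose(S @ Lam @ np.linalg.inv(S), A))     # A = S Lambda S^{-1}
print(np.allclose(np.linalg.inv(S) @ A @ S, Lam))     # S^{-1} A S = Lambda
```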

196,

Mathematical Methods in Engineering and Science

Diagonalizability

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Diagonalizability:
A matrix having a complete set of n linearly independent
eigenvectors is diagonalizable.

197,

Mathematical Methods in Engineering and Science

Diagonalizability

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Diagonalizability:
A matrix having a complete set of n linearly independent
eigenvectors is diagonalizable.
Existence of a complete set of eigenvectors:
A diagonalizable matrix possesses a complete set of n
linearly independent eigenvectors.

198,

Mathematical Methods in Engineering and Science

Diagonalizability

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Diagonalizability:
A matrix having a complete set of n linearly independent
eigenvectors is diagonalizable.
Existence of a complete set of eigenvectors:
A diagonalizable matrix possesses a complete set of n
linearly independent eigenvectors.

All distinct eigenvalues implies diagonalizability.

But, diagonalizability does not imply distinct eigenvalues!

However, a lack of diagonalizability certainly implies a


multiplicity mismatch.

199,

Mathematical Methods in Engineering and Science

Canonical Forms

Jordan canonical form (JCF)


Diagonal (canonical) form
Triangular (canonical) form

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

200,

Mathematical Methods in Engineering and Science

Canonical Forms

Jordan canonical form (JCF)


Diagonal (canonical) form
Triangular (canonical) form

Other convenient forms


Tridiagonal form
Hessenberg form

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

201,

Mathematical Methods in Engineering and Science

Canonical Forms

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Jordan canonical form (JCF): composed of Jordan blocks

J1

J2

J=
..
, Jr =
.

..

. 1
Jk

202,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Canonical Forms

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Jordan canonical form (JCF): composed of Jordan blocks

J = diag(J_1, J_2, ..., J_k),   where each Jordan block J_r carries the
eigenvalue λ on its diagonal and 1's on the super-diagonal.

The key equation AS = SJ in extended form gives

A [ ...  S_r  ... ] = [ ...  S_r  ... ] diag(..., J_r, ...)

where Jordan block J_r is associated with the subspace of

S_r = [v  w_2  w_3  ...]

203,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Canonical Forms

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Equating blocks as ASr = Sr Jr gives

[Av

Aw2

Aw3

] = [v

w2

w3

..
]
.

..
.

204,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Canonical Forms

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Equating blocks as ASr = Sr Jr gives

[Av

Aw2

Aw3

] = [v

Columnwise equality leads to

w2

w3

..
]
.

..
.

Av = v, Aw2 = v + w2 , Aw3 = w2 + w3 ,

205,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Canonical Forms

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Equating blocks as A S_r = S_r J_r gives

[Av  Aw_2  Aw_3  ...] = [v  w_2  w_3  ...] J_r

Columnwise equality leads to

Av = λv,  Aw_2 = v + λw_2,  Aw_3 = w_2 + λw_3,  ...

Generalized eigenvectors w_2, w_3 etc:

(A - λI)v = 0,
(A - λI)w_2 = v    and    (A - λI)^2 w_2 = 0,
(A - λI)w_3 = w_2    and    (A - λI)^3 w_3 = 0,
...

206,

Mathematical Methods in Engineering and Science

Canonical Forms

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Diagonal form

Special case of Jordan form, with each Jordan block of 1 1


size

Matrix is diagonalizable

Similarity transformation matrix S is composed of n linearly


independent eigenvectors as columns

None of the eigenvectors admits any generalized eigenvector

Equal geometric and algebraic multiplicities for every


eigenvalue

207,

Mathematical Methods in Engineering and Science

Canonical Forms

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Triangular form
Triangularization: Change of basis of a linear tranformation so as
to get its matrix in the triangular form

208,

Mathematical Methods in Engineering and Science

Canonical Forms

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Triangular form
Triangularization: Change of basis of a linear tranformation so as
to get its matrix in the triangular form

For real eigenvalues, always possible to accomplish with


orthogonal similarity transformation

Always possible to accomplish with unitary similarity


transformation, with complex arithmetic

Determination of eigenvalues

209,

Mathematical Methods in Engineering and Science

Canonical Forms

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Triangular form
Triangularization: Change of basis of a linear tranformation so as
to get its matrix in the triangular form

For real eigenvalues, always possible to accomplish with


orthogonal similarity transformation

Always possible to accomplish with unitary similarity


transformation, with complex arithmetic

Determination of eigenvalues

Note: The case of complex eigenvalues: a 2 × 2 real diagonal block
takes the place of the complex conjugate pair

diag(α + iβ,  α - iβ)

210,

Mathematical Methods in Engineering and Science

Canonical Forms

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Forms that can be obtained with pre-determined number of


arithmetic operations (without iteration):
Tridiagonal form: non-zero entries only in the (leading) diagonal,
sub-diagonal and super-diagonal
useful for symmetric matrices

211,

Mathematical Methods in Engineering and Science

Canonical Forms

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Forms that can be obtained with pre-determined number of


arithmetic operations (without iteration):
Tridiagonal form: non-zero entries only in the (leading) diagonal,
sub-diagonal and super-diagonal
useful for symmetric matrices
Hessenberg form: A slight generalization of a triangular matrix

.
.
.
.
Hu =
.
.
.
.

.
. . .

..
.
.
.
.

.
. .

212,

Mathematical Methods in Engineering and Science

Canonical Forms

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Forms that can be obtained with pre-determined number of


arithmetic operations (without iteration):
Tridiagonal form: non-zero entries only in the (leading) diagonal,
sub-diagonal and super-diagonal
useful for symmetric matrices
Hessenberg form: A slight generalization of a triangular matrix:
non-zero entries are allowed in the first sub-diagonal as well, i.e.
(H_u)_ij = 0 for i > j + 1.

Note: Tridiagonal and Hessenberg forms do not fall in the
category of canonical forms.

213,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

A real symmetric matrix has all real eigenvalues and


is diagonalizable through an orthogonal similarity
transformation.

Similar results for complex Hermitian matrices.

214,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

A real symmetric matrix has all real eigenvalues and


is diagonalizable through an orthogonal similarity
transformation.
Eigenvalues must be real.

Similar results for complex Hermitian matrices.

215,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

A real symmetric matrix has all real eigenvalues and


is diagonalizable through an orthogonal similarity
transformation.
Eigenvalues must be real.
A complete set of eigenvectors exists.

Similar results for complex Hermitian matrices.

216,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

A real symmetric matrix has all real eigenvalues and


is diagonalizable through an orthogonal similarity
transformation.
Eigenvalues must be real.
A complete set of eigenvectors exists.
Eigenvectors corresponding to distinct eigenvalues are
necessarily orthogonal.

Similar results for complex Hermitian matrices.

217,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

A real symmetric matrix has all real eigenvalues and


is diagonalizable through an orthogonal similarity
transformation.
Eigenvalues must be real.
A complete set of eigenvectors exists.
Eigenvectors corresponding to distinct eigenvalues are
necessarily orthogonal.
Corresponding to repeated eigenvalues, orthogonal eigenvectors
are available.

Similar results for complex Hermitian matrices.

218,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

A real symmetric matrix has all real eigenvalues and


is diagonalizable through an orthogonal similarity
transformation.
Eigenvalues must be real.
A complete set of eigenvectors exists.
Eigenvectors corresponding to distinct eigenvalues are
necessarily orthogonal.
Corresponding to repeated eigenvalues, orthogonal eigenvectors
are available.
In all cases of a symmetric matrix, we can form an
orthogonal matrix V, such that V^T A V = Λ is a real
diagonal matrix.

Further, A = V Λ V^T.
Similar results for complex Hermitian matrices.

219,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvalues of a real symmetric matrix must be real.

220,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvalues of a real symmetric matrix must be real.


Take A R nn such that A = AT , with eigenvalue = h + ik.

k = 0 and = h

221,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvalues of a real symmetric matrix must be real.


Take A R nn such that A = AT , with eigenvalue = h + ik.
Since I A is singular, so is
A) = (hI A + ikI)(hI A ikI)
B = (I A) (I
= (hI A)2 + k 2 I

k = 0 and = h

222,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvalues of a real symmetric matrix must be real.


Take A R nn such that A = AT , with eigenvalue = h + ik.
Since I A is singular, so is
A) = (hI A + ikI)(hI A ikI)
B = (I A) (I
= (hI A)2 + k 2 I

For some x 6= 0, Bx = 0, and


xT Bx = 0 xT (hI A)T (hI A)x + k 2 xT x = 0

k = 0 and = h

223,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvalues of a real symmetric matrix must be real.


Take A ∈ R^{n×n} such that A = A^T, with eigenvalue λ = h + ik.
Since λI - A is singular, so is

B = (λI - A)(λ̄I - A) = (hI - A + ikI)(hI - A - ikI)
  = (hI - A)^2 + k^2 I

For some x ≠ 0, Bx = 0, and

x^T B x = 0   ⇒   x^T (hI - A)^T (hI - A) x + k^2 x^T x = 0

Thus, ||(hI - A)x||^2 + k^2 ||x||^2 = 0
⇒  k = 0 and λ = h

224,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: A symmetric matrix possesses a complete set of


eigenvectors.

225,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: A symmetric matrix possesses a complete set of


eigenvectors.
Consider a repeated real eigenvalue of A and examine its Jordan
block(s).

All Jordan blocks will be of 1 1 size.

226,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: A symmetric matrix possesses a complete set of


eigenvectors.
Consider a repeated real eigenvalue of A and examine its Jordan
block(s).
Suppose Av = v.
The first generalized eigenvector w satisfies (A I)w = v, giving
vT (A I)w = vT v vT AT w vT w = vT v

(Av)T w vT w = kvk2
kvk2 = 0

which is absurd.
All Jordan blocks will be of 1 1 size.

227,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: A symmetric matrix possesses a complete set of


eigenvectors.
Consider a repeated real eigenvalue of A and examine its Jordan
block(s).
Suppose Av = λv.
The first generalized eigenvector w satisfies (A - λI)w = v, giving

v^T (A - λI) w = v^T v   ⇒   v^T A^T w - λ v^T w = v^T v
⇒  (Av)^T w - λ v^T w = ||v||^2
⇒  ||v||^2 = 0,

which is absurd.
An eigenvector will not admit a generalized eigenvector.
All Jordan blocks will be of 1 × 1 size.

228,

Mathematical Methods in Engineering and Science

Symmetric Matrices

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvectors of a symmetric matrix corresponding to


distinct eigenvalues are necessarily orthogonal.

229,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvectors of a symmetric matrix corresponding to


distinct eigenvalues are necessarily orthogonal.
Take two eigenpairs (1 , v1 ) and (2 , v2 ), with 1 6= 2 .

v1T v2 = 0

230,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvectors of a symmetric matrix corresponding to


distinct eigenvalues are necessarily orthogonal.
Take two eigenpairs (1 , v1 ) and (2 , v2 ), with 1 6= 2 .
v1T Av2 = v1T (2 v2 ) = 2 v1T v2
v1T Av2 = v1T AT v2 = (Av1 )T v2 = (1 v1 )T v2 = 1 v1T v2
v1T v2 = 0

231,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvectors of a symmetric matrix corresponding to


distinct eigenvalues are necessarily orthogonal.
Take two eigenpairs (1 , v1 ) and (2 , v2 ), with 1 6= 2 .
v1T Av2 = v1T (2 v2 ) = 2 v1T v2
v1T Av2 = v1T AT v2 = (Av1 )T v2 = (1 v1 )T v2 = 1 v1T v2
From the two expressions, (1 2 )v1T v2 = 0
v1T v2 = 0

232,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvectors of a symmetric matrix corresponding to


distinct eigenvalues are necessarily orthogonal.
Take two eigenpairs (1 , v1 ) and (2 , v2 ), with 1 6= 2 .
v1T Av2 = v1T (2 v2 ) = 2 v1T v2
v1T Av2 = v1T AT v2 = (Av1 )T v2 = (1 v1 )T v2 = 1 v1T v2
From the two expressions, (1 2 )v1T v2 = 0
v1T v2 = 0

Proposition: Corresponding to a repeated eigenvalue of a


symmetric matrix, an appropriate number of orthogonal
eigenvectors can be selected.

233,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Proposition: Eigenvectors of a symmetric matrix corresponding to


distinct eigenvalues are necessarily orthogonal.
Take two eigenpairs (λ_1, v_1) and (λ_2, v_2), with λ_1 ≠ λ_2.
v_1^T A v_2 = v_1^T (λ_2 v_2) = λ_2 v_1^T v_2
v_1^T A v_2 = v_1^T A^T v_2 = (A v_1)^T v_2 = (λ_1 v_1)^T v_2 = λ_1 v_1^T v_2
From the two expressions, (λ_1 - λ_2) v_1^T v_2 = 0
⇒  v_1^T v_2 = 0

Proposition: Corresponding to a repeated eigenvalue of a
symmetric matrix, an appropriate number of orthogonal
eigenvectors can be selected.
If λ_1 = λ_2, then the entire subspace < v_1, v_2 > is an eigenspace.
Select any two mutually orthogonal eigenvectors for the basis.

234,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Facilities with the omnipresent symmetric matrices:


Expression
A = VVT

= [v1

v2

vn ]

1
2
..

v1T
v2T
..
.

vnT
n
n
X
i vi viT
= 1 v1 v1T + 2 v2 v2T + + n vn vnT =
i =1

235,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Facilities with the omnipresent symmetric matrices:


Expression
A = VVT

= [v1

v2

vn ]

1
2
..

v1T
v2T
..
.

vnT
n
n
X
i vi viT
= 1 v1 v1T + 2 v2 v2T + + n vn vnT =

i =1

Reconstruction from a sum of rank-one components


Efficient storage with only large eigenvalues and corresponding
eigenvectors

236,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Symmetric Matrices

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Facilities with the omnipresent symmetric matrices:


Expression

A = V Λ V^T = [v_1  v_2  ...  v_n] diag(λ_1, λ_2, ..., λ_n) [v_1^T; v_2^T; ...; v_n^T]

  = λ_1 v_1 v_1^T + λ_2 v_2 v_2^T + ... + λ_n v_n v_n^T = Σ_{i=1}^n λ_i v_i v_i^T

Reconstruction from a sum of rank-one components


Efficient storage with only large eigenvalues and corresponding
eigenvectors
Deflation technique
Stable and effective methods: easier to solve the eigenvalue
problem
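A sketch of the rank-one expansion A = Σ λ_i v_i v_iᵀ and of the truncated (largest-eigenvalue) reconstruction mentioned above, for an illustrative symmetric matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])   # symmetric
lam, V = np.linalg.eigh(A)                       # A = V diag(lam) V^T

# Full reconstruction from rank-one components
A_full = sum(lam[i] * np.outer(V[:, i], V[:, i]) for i in range(len(lam)))
print(np.allclose(A_full, A))                    # True

# Keep only the largest-magnitude eigenvalues (economical storage)
idx = np.argsort(np.abs(lam))[::-1][:2]
A_approx = sum(lam[i] * np.outer(V[:, i], V[:, i]) for i in idx)
print(np.linalg.norm(A - A_approx))              # small residual
```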

237,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Similarity Transformations
General

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Hessenberg

Symmetric

Tridiagonal

Triangular

Symmetric Tridiagonal
Diagonal

Figure: Eigenvalue problem: forms and steps

238,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Similarity Transformations
General

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Hessenberg

Symmetric

Tridiagonal

Triangular

Symmetric Tridiagonal
Diagonal

Figure: Eigenvalue problem: forms and steps

How to find suitable similarity transformations?

239,

Mathematical Methods in Engineering and Science

Diagonalization and Similarity Transformations

Similarity Transformations
General

Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Hessenberg

Symmetric

Tridiagonal

Triangular

Symmetric Tridiagonal
Diagonal

Figure: Eigenvalue problem: forms and steps

How to find suitable similarity transformations?


1. rotation
2. reflection
3. matrix decomposition or factorization
4. elementary transformation

240,

Mathematical Methods in Engineering and Science

Points to note

Diagonalization and Similarity Transformations


Diagonalizability
Canonical Forms
Symmetric Matrices
Similarity Transformations

Generally possible reduction: Jordan canonical form

Condition of diagonalizability and the diagonal form

Possible with orthogonal similarity transformations: triangular


form

Useful non-canonical forms: tridiagonal and Hessenberg

Orthogonal diagonalization of symmetric matrices

Caution: Each step in this context to be effected through


similarity transformations
Necessary Exercises: 1,2,4

241,

Mathematical Methods in Engineering and Science

Outline

Jacobi and Givens Rotation Methods


Plane Rotations
Jacobi Rotation Method
Givens Rotation Method

Jacobi and Givens Rotation Methods


(for symmetric matrices)
Plane Rotations
Jacobi Rotation Method
Givens Rotation Method

242,

Mathematical Methods in Engineering and Science

Jacobi and Givens Rotation Methods

Plane Rotations

Plane Rotations
Jacobi Rotation Method
Givens Rotation Method

Figure: Rotation of axes and change of basis
[Point P(x, y); original axes X, Y and rotated axes X′, Y′]

243,

Mathematical Methods in Engineering and Science

Jacobi and Givens Rotation Methods

Plane Rotations

Plane Rotations
Jacobi Rotation Method
Givens Rotation Method

Figure: Rotation of axes and change of basis
[Point P(x, y); original axes X, Y and rotated axes X′, Y′]

x′ = OL + LM = OL + KN = x cos θ + y sin θ

y′ = PN - MN = PN - LK = y cos θ - x sin θ

244,

Orthogonal change of basis:

  \begin{bmatrix} x' \\ y' \end{bmatrix}
  = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
    \begin{bmatrix} x \\ y \end{bmatrix}

Mapping of position vectors with the inverse (equal to the transpose) of this matrix:

  \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}

In three-dimensional (ambient) space, the corresponding rotations in the xy- and xz-planes are

  \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
  \quad \text{and} \quad
  \begin{bmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{bmatrix}, \; \text{etc.}

Generalizing to n-dimensional Euclidean space (R^n), the plane rotation P_{pq} is the identity matrix with only its p-th and q-th rows and columns modified (c = \cos\theta, s = \sin\theta):

  P_{pq} =
  \begin{bmatrix}
  1 &        &    &        &    &        &   \\
    & \ddots &    &        &    &        &   \\
    &        & c  & \cdots & s  &        &   \\
    &        & \vdots & \ddots & \vdots &  & \\
    &        & -s & \cdots & c  &        &   \\
    &        &    &        &    & \ddots &   \\
    &        &    &        &    &        & 1
  \end{bmatrix}
  \quad \text{(entries } c \text{ at } (p,p), (q,q); \; s \text{ at } (p,q); \; -s \text{ at } (q,p)\text{)}

Matrix A is transformed as

  A' = P_{pq}^{-1} A P_{pq} = P_{pq}^T A P_{pq},

only the p-th and q-th rows and columns being affected.

Jacobi Rotation Method

Under A' = P_{pq}^T A P_{pq} (with c = \cos\theta, s = \sin\theta), the affected entries become

  a'_{pr} = a'_{rp} = c\, a_{rp} - s\, a_{rq}   for p \ne r \ne q,
  a'_{qr} = a'_{rq} = c\, a_{rq} + s\, a_{rp}   for p \ne r \ne q,
  a'_{pp} = c^2 a_{pp} + s^2 a_{qq} - 2sc\, a_{pq},
  a'_{qq} = s^2 a_{pp} + c^2 a_{qq} + 2sc\, a_{pq},  and
  a'_{pq} = a'_{qp} = (c^2 - s^2)\, a_{pq} + sc\, (a_{pp} - a_{qq}).

In a Jacobi rotation,

  a'_{pq} = 0 \quad \Rightarrow \quad \frac{c^2 - s^2}{2sc} = \frac{a_{qq} - a_{pp}}{2 a_{pq}} = k \; \text{(say)}.

Left side is \cot 2\theta: solve this equation for \theta.

Jacobi rotation transformations P_{12}, P_{13}, \cdots, P_{1n}; P_{23}, \cdots, P_{2n}; \cdots; P_{n-1,n} complete a full sweep.

Note: The resulting matrix is far from diagonal!
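A minimal sketch of one Jacobi sweep (illustrative code, not the book's): the angle comes from the cot 2\theta condition above, written here via arctan2 for convenience; the function name, tolerance and test matrix are my own choices.

```python
import numpy as np

def jacobi_sweep(A):
    """One full sweep of Jacobi rotations P_12, ..., P_{n-1,n} on symmetric A.

    Each rotation annihilates the targeted off-diagonal entry at the moment it
    is applied; later rotations re-introduce small values, which is why sweeps
    are repeated.
    """
    A = A.copy()
    n = A.shape[0]
    for p in range(n - 1):
        for q in range(p + 1, n):
            if abs(A[p, q]) < 1e-15:
                continue
            # Angle satisfying (c^2 - s^2) a_pq + s c (a_pp - a_qq) = 0.
            theta = 0.5 * np.arctan2(2.0 * A[p, q], A[q, q] - A[p, p])
            c, s = np.cos(theta), np.sin(theta)
            P = np.eye(n)
            P[p, p] = P[q, q] = c
            P[p, q], P[q, p] = s, -s
            A = P.T @ A @ P          # similarity transformation
    return A

# Example: repeated sweeps drive the off-diagonal sum of squares towards zero.
A = np.array([[4.0, 1.0, 0.5], [1.0, 3.0, 0.2], [0.5, 0.2, 1.0]])
for _ in range(5):
    A = jacobi_sweep(A)
print(np.round(A, 8))   # diagonal entries approximate the eigenvalues
```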

Sum of squares of off-diagonal terms before the transformation:

  S = \sum_{r \ne s} |a_{rs}|^2
    = 2 \left[ \sum_{p \ne r \ne q} (a_{rp}^2 + a_{rq}^2) + a_{pq}^2 \right] + (\text{terms not involving rows/columns } p, q),

and that afterwards:

  S' = 2 \left[ \sum_{p \ne r \ne q} (a'^2_{rp} + a'^2_{rq}) + a'^2_{pq} \right] + (\text{the same terms})
     = 2 \sum_{p \ne r \ne q} (a_{rp}^2 + a_{rq}^2) + (\text{the same terms}),

since a'^2_{rp} + a'^2_{rq} = a_{rp}^2 + a_{rq}^2 for p \ne r \ne q and a'_{pq} = 0.

These differ by

  \Delta S = S - S' = 2 a_{pq}^2 \ge 0;   and S \to 0.

Givens Rotation Method

While applying the rotation P_{pq}, demand a'_{rq} = 0:  \tan\theta = -a_{rq}/a_{rp}  (so that c\, a_{rq} + s\, a_{rp} = 0)

r = p - 1: Givens rotation

Once a_{p-1,q} is annihilated, it is never updated again!

Sweep P_{23}, P_{24}, \cdots, P_{2n}; P_{34}, \cdots, P_{3n}; \cdots; P_{n-1,n} to annihilate a_{13}, a_{14}, \cdots, a_{1n}; a_{24}, \cdots, a_{2n}; \cdots; a_{n-2,n}.

Result: a symmetric tridiagonal matrix.

How do eigenvectors transform through Jacobi/Givens rotation steps?

  A' = \cdots P_{(2)}^T P_{(1)}^T A\, P_{(1)} P_{(2)} \cdots

Product matrix P_{(1)} P_{(2)} \cdots gives the basis.

To record it, initialize V by identity and keep multiplying new rotation matrices on the right side.
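A sketch of the Givens sweep reducing a symmetric matrix to tridiagonal form, with the rotation product accumulated in V exactly as described above (initialize V to identity, multiply new rotations on the right). The helper name, the sign convention for the rotation entries and the test matrix are illustrative assumptions.

```python
import numpy as np

def givens_tridiagonalize(A):
    """One sweep of Givens rotations reducing symmetric A to tridiagonal form.

    Also accumulates the product of rotation matrices in V, so that V^T A V
    equals the returned tridiagonal matrix.
    """
    A = A.copy()
    n = A.shape[0]
    V = np.eye(n)
    for r in range(n - 2):          # row whose trailing entries are annihilated
        p = r + 1                   # Givens choice: rotate in the (r+1, q) plane
        for q in range(r + 2, n):
            h = np.hypot(A[r, p], A[r, q])
            if h < 1e-15:
                continue
            c, s = A[r, p] / h, -A[r, q] / h   # chosen so that c*a_rq + s*a_rp = 0
            P = np.eye(n)
            P[p, p] = P[q, q] = c
            P[p, q], P[q, p] = s, -s
            A = P.T @ A @ P
            V = V @ P
    return A, V

A = np.array([[4.0, 1.0, 0.5, 0.3],
              [1.0, 3.0, 0.2, 0.1],
              [0.5, 0.2, 1.0, 0.4],
              [0.3, 0.1, 0.4, 2.0]])
T, V = givens_tridiagonalize(A)
print(np.round(T, 8))                       # symmetric tridiagonal
print(np.allclose(V.T @ A @ V, T))          # True
```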

Contrast between Jacobi and Givens rotation methods

What happens to intermediate zeros?

What do we get after a complete sweep?

How many sweeps are to be applied?

What is the intended final form of the matrix?

How is size of the matrix relevant in the choice of the method?

Fast forward ...

Householder method accomplishes tridiagonalization more efficiently than Givens rotation method.

But, with a half-processed matrix, there come situations in which Givens rotation method turns out to be more efficient!

Points to note

Rotation transformation on symmetric matrices

Plane rotations provide orthogonal change of basis that can be used for diagonalization of matrices.

For small matrices (say 4 \le n \le 8), Jacobi rotation sweeps are competitive enough for diagonalization up to a reasonable tolerance.

For large matrices, one sweep of Givens rotations can be applied to get a symmetric tridiagonal matrix, for efficient further processing.

Necessary Exercises: 2, 3, 4

Outline

Householder Transformation and Tridiagonal Matrices
Householder Reflection Transformation
Householder Method
Eigenvalues of Symmetric Tridiagonal Matrices

Householder Reflection Transformation

Figure: Vectors in Householder reflection
(vectors u and v of equal norm, the direction w = (u - v)/||u - v||, and the plane of reflection through O normal to w)

Consider u, v \in R^k, \|u\| = \|v\| and w = \frac{u - v}{\|u - v\|}.

Householder reflection matrix

  H_k = I_k - 2 w w^T

is symmetric and orthogonal.

For any vector x orthogonal to w,

  H_k x = (I_k - 2 w w^T) x = x,

and

  H_k w = (I_k - 2 w w^T) w = -w.

Hence, H_k y = H_k (y_w + y_\perp) = -y_w + y_\perp,  H_k u = v  and  H_k v = u.
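A tiny numerical check of these statements, as a sketch (assuming only that u and v have equal norms; the helper name and the vectors are arbitrary):

```python
import numpy as np

def householder_reflector(u, v):
    """H = I - 2 w w^T with w = (u - v)/||u - v||; maps u to v (requires ||u|| = ||v||)."""
    w = (u - v) / np.linalg.norm(u - v)
    return np.eye(len(u)) - 2.0 * np.outer(w, w)

u = np.array([3.0, 4.0, 0.0])
v = np.array([5.0, 0.0, 0.0])          # same norm as u
H = householder_reflector(u, v)
print(np.allclose(H, H.T), np.allclose(H @ H, np.eye(3)))  # symmetric, orthogonal
print(H @ u)                           # approximately v
print(H @ v)                           # approximately u
```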

Householder Method

Consider an n \times n symmetric matrix A.

Let u = [a_{21} \; a_{31} \; \cdots \; a_{n1}]^T \in R^{n-1} and v = \|u\| e_1 \in R^{n-1}.

Construct P_1 = \begin{bmatrix} 1 & 0 \\ 0 & H_{n-1} \end{bmatrix} and operate as

  A^{(1)} = P_1 A P_1
          = \begin{bmatrix} 1 & 0 \\ 0 & H_{n-1} \end{bmatrix}
            \begin{bmatrix} a_{11} & u^T \\ u & A_1 \end{bmatrix}
            \begin{bmatrix} 1 & 0 \\ 0 & H_{n-1} \end{bmatrix}
          = \begin{bmatrix} a_{11} & v^T \\ v & H_{n-1} A_1 H_{n-1} \end{bmatrix}.

Reorganizing and re-naming,

  A^{(1)} = \begin{bmatrix} d_1 & e_2 & 0 \\ e_2 & d_2 & u_2^T \\ 0 & u_2 & A_2 \end{bmatrix}.

Next, with v_2 = \|u_2\| e_1, we form

  P_2 = \begin{bmatrix} I_2 & 0 \\ 0 & H_{n-2} \end{bmatrix}

and operate as A^{(2)} = P_2 A^{(1)} P_2.

After j steps,

  A^{(j)} = \begin{bmatrix}
    d_1 & e_2    &         &         &           \\
    e_2 & d_2    & \ddots  &         &           \\
        & \ddots & \ddots  & e_{j+1} &           \\
        &        & e_{j+1} & d_{j+1} & u_{j+1}^T \\
        &        &         & u_{j+1} & A_{j+1}
  \end{bmatrix}.

By n-2 steps, with P = P_1 P_2 P_3 \cdots P_{n-2},

  A^{(n-2)} = P^T A P

is symmetric tridiagonal.
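The whole n-2 step reduction, as a hedged numpy sketch (the function name, tolerances and test matrix are my choices; the reflector at step j acts on the trailing block exactly as P_j above):

```python
import numpy as np

def householder_tridiagonalize(A):
    """Reduce symmetric A to tridiagonal form by n-2 Householder similarity steps."""
    A = A.copy().astype(float)
    n = A.shape[0]
    Ptotal = np.eye(n)
    for j in range(n - 2):
        u = A[j + 1:, j].copy()               # sub-column below the diagonal
        normu = np.linalg.norm(u)
        if normu < 1e-15:
            continue
        v = np.zeros_like(u)
        v[0] = normu                          # target: ||u|| e_1
        if np.linalg.norm(u - v) < 1e-15:     # u already along e_1
            continue
        w = (u - v) / np.linalg.norm(u - v)
        H = np.eye(n - j - 1) - 2.0 * np.outer(w, w)
        P = np.eye(n)
        P[j + 1:, j + 1:] = H                 # P_j = diag(I_{j+1}, H)
        A = P @ A @ P                         # P is symmetric and orthogonal
        Ptotal = Ptotal @ P
    return A, Ptotal

A = np.array([[4.0, 1.0, 0.5, 0.3],
              [1.0, 3.0, 0.2, 0.1],
              [0.5, 0.2, 1.0, 0.4],
              [0.3, 0.1, 0.4, 2.0]])
T, P = householder_tridiagonalize(A)
print(np.round(T, 8))
print(np.allclose(P.T @ A @ P, T))
```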

Eigenvalues of Symmetric Tridiagonal Matrices

  T = \begin{bmatrix}
    d_1 & e_2    &         &         &      \\
    e_2 & d_2    & e_3     &         &      \\
        & \ddots & \ddots  & \ddots  &      \\
        &        & e_{n-1} & d_{n-1} & e_n  \\
        &        &         & e_n     & d_n
  \end{bmatrix}

Characteristic polynomial

  p(\lambda) = \det \begin{bmatrix}
    \lambda - d_1 & -e_2          &          &                    &               \\
    -e_2          & \lambda - d_2 & \ddots   &                    &               \\
                  & \ddots        & \ddots   & -e_{n-1}           &               \\
                  &               & -e_{n-1} & \lambda - d_{n-1}  & -e_n          \\
                  &               &          & -e_n               & \lambda - d_n
  \end{bmatrix}
Characteristic polynomial of the leading k \times k sub-matrix: p_k(\lambda)

  p_0(\lambda) = 1,
  p_1(\lambda) = \lambda - d_1,
  p_2(\lambda) = (\lambda - d_2)(\lambda - d_1) - e_2^2,
  \cdots
  p_{k+1}(\lambda) = (\lambda - d_{k+1})\, p_k(\lambda) - e_{k+1}^2\, p_{k-1}(\lambda).

P(\lambda) = \{p_0(\lambda), p_1(\lambda), \cdots, p_n(\lambda)\}:  a Sturmian sequence if e_j \ne 0 \;\; \forall j

Question: What if e_j = 0 for some j?!

Answer: That is good news. Split the matrix.

Sturmian sequence property of P(\lambda) with e_j \ne 0:

Interlacing property: Roots of p_{k+1}(\lambda) interlace the roots of p_k(\lambda). That is, if the roots of p_{k+1}(\lambda) are \mu_1 > \mu_2 > \cdots > \mu_{k+1} and those of p_k(\lambda) are \lambda_1 > \lambda_2 > \cdots > \lambda_k; then

  \mu_1 > \lambda_1 > \mu_2 > \lambda_2 > \cdots > \mu_k > \lambda_k > \mu_{k+1}.

This property leads to a convenient procedure.

Proof

p_1(\lambda) has a single root, d_1.

p_2(d_1) = -e_2^2 < 0.

Since p_2(\pm\infty) = \infty > 0, roots t_1 and t_2 of p_2(\lambda) are separated as \infty > t_1 > d_1 > t_2 > -\infty.

The statement is true for k = 1.

Next, we assume that the statement is true for k = i.

  Roots of p_i(\lambda):     \lambda_1 > \lambda_2 > \cdots > \lambda_i
  Roots of p_{i+1}(\lambda): \mu_1 > \mu_2 > \cdots > \mu_i > \mu_{i+1}
  Roots of p_{i+2}(\lambda): \nu_1 > \nu_2 > \cdots > \nu_i > \nu_{i+1} > \nu_{i+2}

Figure: Interlacing of roots of characteristic polynomials
(panel (a): roots of p_i(\lambda) and p_{i+1}(\lambda) on the real line; panel (b): sign of p_i p_{i+2} over the intervals between the \mu_j)

Assumption: \mu_1 > \lambda_1 > \mu_2 > \lambda_2 > \cdots > \mu_i > \lambda_i > \mu_{i+1}

To show: \nu_1 > \mu_1 > \nu_2 > \mu_2 > \cdots > \nu_{i+1} > \mu_{i+1} > \nu_{i+2}

Since \mu_1 > \lambda_1, p_i(\mu_1) is of the same sign as p_i(\infty), i.e. positive.

Therefore, p_{i+2}(\mu_1) = -e_{i+2}^2\, p_i(\mu_1) is negative.

But, p_{i+2}(\infty) is clearly positive.

Hence, \nu_1 \in (\mu_1, \infty).

Similarly, \nu_{i+2} \in (-\infty, \mu_{i+1}).

Question: Where are the rest of the i roots of p_{i+2}(\lambda)?

  p_{i+2}(\mu_j) = (\mu_j - d_{i+2})\, p_{i+1}(\mu_j) - e_{i+2}^2\, p_i(\mu_j) = -e_{i+2}^2\, p_i(\mu_j)

  p_{i+2}(\mu_{j+1}) = -e_{i+2}^2\, p_i(\mu_{j+1})

That is, p_i and p_{i+2} are of opposite signs at each \mu. (Refer figure.)

Over [\mu_{i+1}, \mu_1], p_{i+2}(\lambda) changes sign over each sub-interval [\mu_{j+1}, \mu_j], along with p_i(\lambda), to maintain opposite signs at each \mu.

Conclusion: p_{i+2}(\lambda) has exactly one root in (\mu_{j+1}, \mu_j).

Examine the sequence P(w) = \{p_0(w), p_1(w), p_2(w), \cdots, p_n(w)\}.

If p_k(w) and p_{k+1}(w) have opposite signs, then p_{k+1}(\lambda) has one root more than p_k(\lambda) in the interval (w, \infty).

Number of roots of p_n(\lambda) above w = number of sign changes in the sequence P(w).

Consequence: Number of roots of p_n(\lambda) in (a, b) = difference between numbers of sign changes in P(a) and P(b).

Bisection method: Examine the sequence at (a + b)/2.

Separate roots, bracket each of them and then squeeze the interval!

Any way to start with an interval to include all eigenvalues?

  |\lambda_i| \le bnd = \max_{1 \le j \le n} \{|e_j| + |d_j| + |e_{j+1}|\}   (taking e_1 = e_{n+1} = 0)
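A sketch of the counting rule (not the book's code): the sequence P(w) is generated by the three-term recursion, sign changes are counted, and the bound above supplies a starting interval. The array layout (d of length n, e with e[0] unused) and the handling of exact zeros are my own conventions.

```python
import numpy as np

def sturm_count_above(d, e, w):
    """Number of eigenvalues of the symmetric tridiagonal matrix (diagonal d,
    off-diagonals e[1:], e[0] unused) lying above w: the number of sign changes
    in the sequence {p_0(w), p_1(w), ..., p_n(w)}."""
    n = len(d)
    count = 0
    sign_p0 = 1                          # p_0(w) = 1
    p_prev, p = 1.0, w - d[0]            # p_0, p_1
    sign_p = 1 if p > 0 else (-1 if p < 0 else -sign_p0)   # zero: opposite sign (convention)
    if sign_p != sign_p0:
        count += 1
    for k in range(1, n):                # build p_{k+1}(w) by the recursion
        p_next = (w - d[k]) * p - e[k] ** 2 * p_prev
        sign_next = 1 if p_next > 0 else (-1 if p_next < 0 else -sign_p)
        if sign_next != sign_p:
            count += 1
        p_prev, p = p, p_next
        sign_p = sign_next
    return count

# Example data (arbitrary); the bound brackets all eigenvalues.
d = np.array([2.0, 3.0, 1.0, 4.0])
e = np.array([0.0, 1.0, 0.5, 0.8])       # e[0] is a placeholder
bnd = max(abs(e[j]) + abs(d[j]) + (abs(e[j + 1]) if j + 1 < len(d) else 0.0)
          for j in range(len(d)))
print(sturm_count_above(d, e, -bnd) - sturm_count_above(d, e, bnd))   # all n eigenvalues
```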

Algorithm

Identify the interval [a, b] of interest.

For a degenerate case (some e_j = 0), split the given matrix.

For each of the non-degenerate matrices,
  by repeated use of bisection and study of the sequence P(\lambda), bracket individual eigenvalues within small sub-intervals, and
  by further use of the bisection method (or a substitute) within each such sub-interval, determine the individual eigenvalues to the desired accuracy.

Note: The algorithm is based on the Sturmian sequence property.
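A bisection sketch built on the counting function from the previous sketch (that helper is assumed to be available); the tolerance and example matrix are arbitrary, and a library routine is used only as a cross-check.

```python
import numpy as np

def bisection_eigenvalues(d, e, tol=1e-10):
    """All eigenvalues of a non-degenerate symmetric tridiagonal matrix by
    bisection on the Sturm-sequence count (uses sturm_count_above above)."""
    n = len(d)
    bnd = max(abs(e[j]) + abs(d[j]) + (abs(e[j + 1]) if j + 1 < n else 0.0)
              for j in range(n))
    eigs = []
    for i in range(1, n + 1):            # i-th largest eigenvalue
        lo, hi = -bnd, bnd
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if sturm_count_above(d, e, mid) >= i:   # lambda_i still above mid
                lo = mid
            else:
                hi = mid
        eigs.append(0.5 * (lo + hi))
    return np.array(eigs)

d = np.array([2.0, 3.0, 1.0, 4.0])
e = np.array([0.0, 1.0, 0.5, 0.8])
print(np.round(bisection_eigenvalues(d, e), 6))
T = np.diag(d) + np.diag(e[1:], 1) + np.diag(e[1:], -1)
print(np.round(np.sort(np.linalg.eigvalsh(T))[::-1], 6))   # cross-check
```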

Points to note

A Householder matrix is symmetric and orthogonal. It effects a reflection transformation.

A sequence of Householder transformations can be used to convert a symmetric matrix into a symmetric tridiagonal form.

Eigenvalues of the leading square sub-matrices of a symmetric tridiagonal matrix exhibit a useful interlacing structure.

This property can be used to separate and bracket eigenvalues.

Method of bisection is useful in the separation as well as subsequent determination of the eigenvalues.

Necessary Exercises: 2, 4, 5

Outline

QR Decomposition Method
QR Decomposition
QR Iterations
Conceptual Basis of QR Method*
QR Algorithm with Shift*

QR Decomposition

Decomposition (or factorization) A = QR into two factors, orthogonal Q and upper-triangular R:

(a) It always exists.
(b) Performing this decomposition is pretty straightforward.
(c) It has a number of properties useful in the solution of the eigenvalue problem.

  [a_1 \; \cdots \; a_n] = [q_1 \; \cdots \; q_n]
    \begin{bmatrix} r_{11} & \cdots & r_{1n} \\ & \ddots & \vdots \\ & & r_{nn} \end{bmatrix}

A simple method based on Gram-Schmidt orthogonalization:
Considering the columnwise equality a_j = \sum_{i=1}^{j} r_{ij} q_i, for j = 1, 2, 3, \cdots, n;

  r_{ij} = q_i^T a_j,  for i < j;
  a_j' = a_j - \sum_{i=1}^{j-1} r_{ij} q_i,   r_{jj} = \|a_j'\|;
  q_j = a_j'/r_{jj},  if r_{jj} \ne 0;
        any vector satisfying q_i^T q_j = \delta_{ij} for 1 \le i \le j, if r_{jj} = 0.
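A direct transcription of the Gram-Schmidt construction as a sketch (classical Gram-Schmidt, hence not numerically robust; the rank-deficient branch is only indicated). The function name and the test matrix are my choices.

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR decomposition by classical Gram-Schmidt (illustrative sketch)."""
    A = A.astype(float)
    n = A.shape[1]
    Q = np.zeros_like(A)
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]      # r_ij = q_i^T a_j
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        if R[j, j] > 1e-14:
            Q[:, j] = v / R[j, j]
        else:
            # rank-deficient column: any unit vector orthogonal to q_1..q_{j-1} would do
            Q[:, j] = 0.0

    return Q, R

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)))
```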

Practical method: one-sided Householder transformations, starting with

  u_0 = a_1,  \quad v_0 = \|u_0\| e_1 \in R^n,  \quad w_0 = \frac{u_0 - v_0}{\|u_0 - v_0\|}

and P_0 = H_n = I_n - 2 w_0 w_0^T.

With

  P_{n-2} P_{n-3} \cdots P_2 P_1 P_0 A
    = P_{n-2} P_{n-3} \cdots P_2 P_1 \begin{bmatrix} \|a_1\| & * \\ 0 & A_0 \end{bmatrix}
    = P_{n-2} P_{n-3} \cdots P_2 \begin{bmatrix} r_{11} & * & * \\ 0 & r_{22} & * \\ 0 & 0 & A_1 \end{bmatrix}
    = \cdots = R,

  Q = (P_{n-2} P_{n-3} \cdots P_2 P_1 P_0)^T = P_0 P_1 P_2 \cdots P_{n-3} P_{n-2},

we have Q^T A = R \;\Rightarrow\; A = QR.
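The one-sided Householder construction as a sketch (illustrative only, with v chosen as ||u|| e_1 exactly as above; practical codes flip the sign of v to avoid cancellation). Names and the example matrix are assumptions.

```python
import numpy as np

def householder_qr(A):
    """QR decomposition by one-sided Householder reflections (sketch)."""
    A = A.astype(float)
    m, n = A.shape
    R = A.copy()
    Q = np.eye(m)
    for j in range(min(m - 1, n)):
        u = R[j:, j].copy()
        v = np.zeros_like(u)
        v[0] = np.linalg.norm(u)           # target: ||u|| e_1 (sign flip improves robustness)
        if np.linalg.norm(u - v) < 1e-14:
            continue
        w = (u - v) / np.linalg.norm(u - v)
        H = np.eye(m - j) - 2.0 * np.outer(w, w)
        P = np.eye(m)
        P[j:, j:] = H
        R = P @ R                          # premultiply only: not a similarity transformation
        Q = Q @ P                          # P symmetric orthogonal, so Q = P_0 P_1 ...
    return Q, R

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(np.triu(R), R, atol=1e-12))
```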

Alternative method, useful for tridiagonal and Hessenberg matrices: one-sided plane rotations

  rotations P_{12}, P_{23} etc. to annihilate a_{21}, a_{32} etc. in that sequence

  Givens rotation matrices!

Application in solution of a linear system: Q and R factors of a matrix A come handy in the solution of Ax = b:

  QRx = b \;\Rightarrow\; Rx = Q^T b

needs only a sequence of back-substitutions.
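A small sketch of this use: factor once (here with numpy's library QR, though any of the constructions above would serve) and back-substitute on the upper-triangular R.

```python
import numpy as np

def solve_via_qr(A, b):
    """Solve Ax = b using A = QR: Rx = Q^T b by back-substitution."""
    Q, R = np.linalg.qr(A)
    y = Q.T @ b
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):        # back-substitution
        x[i] = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x = solve_via_qr(A, b)
print(np.allclose(A @ x, b))
```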

QR Iterations

Multiplying Q and R factors in reverse,

  A' = RQ = Q^T A Q,

an orthogonal similarity transformation.

1. If A is symmetric, then so is A'.
2. If A is in upper Hessenberg form, then so is A'.
3. If A is symmetric tridiagonal, then so is A'.

Complexity of a QR iteration: O(n) operations for a symmetric tridiagonal matrix, O(n^2) for an upper Hessenberg matrix and O(n^3) for the general case.

Algorithm: Set A_1 = A and for k = 1, 2, 3, \cdots,
  decompose A_k = Q_k R_k,
  reassemble A_{k+1} = R_k Q_k.

As k \to \infty, A_k approaches the quasi-upper-triangular form.
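The plain iteration, as a sketch (the step count and test matrix are arbitrary; numpy's QR is used for the decomposition step).

```python
import numpy as np

def qr_iterations(A, nsteps=200):
    """Plain QR iterations: A_{k+1} = R_k Q_k = Q_k^T A_k Q_k."""
    Ak = A.astype(float).copy()
    for _ in range(nsteps):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return Ak

A = np.array([[4.0, 1.0, 0.5], [1.0, 3.0, 0.2], [0.5, 0.2, 1.0]])
Ak = qr_iterations(A)
print(np.round(Ak, 6))                                     # nearly diagonal (A symmetric)
print(np.round(np.sort(np.diag(Ak))[::-1], 6))
print(np.round(np.sort(np.linalg.eigvalsh(A))[::-1], 6))   # cross-check
```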

Quasi-upper-triangular form:

  \begin{bmatrix}
    B_1 & *   & \cdots & *      \\
        & B_2 & \cdots & *      \\
        &     & \ddots & \vdots \\
        &     &        & B_k
  \end{bmatrix},  with |\lambda_1| > |\lambda_2| > \cdots.

Diagonal blocks B_k correspond to eigenspaces of equal/close (magnitude) eigenvalues.

2 \times 2 diagonal blocks often correspond to pairs of complex eigenvalues (for non-symmetric matrices).

For symmetric matrices, the quasi-upper-triangular form reduces to quasi-diagonal form.

Conceptual Basis of QR Method*

QR decomposition algorithm operates on the basis of the relative magnitudes of eigenvalues and segregates subspaces.

With k \to \infty,

  A^k\, Range\{e_1\} = Range\{q_1\} \to Range\{v_1\},

and

  (a_1)_k \to Q_k^T A q_1 = \lambda_1 Q_k^T q_1 = \lambda_1 e_1.

Further,

  A^k\, Range\{e_1, e_2\} = Range\{q_1, q_2\} \to Range\{v_1, v_2\},

and

  (a_2)_k \to Q_k^T A q_2 = \begin{bmatrix} (\lambda_1 - \lambda_2)\,\alpha_1 \\ \lambda_2 \\ 0 \\ \vdots \end{bmatrix}.

And, so on ...

QR Algorithm with Shift*

For i > j, entry a_{ij} decays through iterations as \left(\frac{\lambda_i}{\lambda_j}\right)^k.

With shift,

  \bar A_k = A_k - \mu_k I;  \quad \bar A_k = Q_k R_k,  \quad \bar A_{k+1} = R_k Q_k;  \quad A_{k+1} = \bar A_{k+1} + \mu_k I.

Resulting transformation is

  A_{k+1} = R_k Q_k + \mu_k I = Q_k^T \bar A_k Q_k + \mu_k I = Q_k^T (A_k - \mu_k I) Q_k + \mu_k I = Q_k^T A_k Q_k.

For the iteration,

  convergence ratio = \left| \frac{\lambda_i - \mu_k}{\lambda_j - \mu_k} \right|.

Question: How to find a suitable value for \mu_k?
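A sketch of the shifted iteration; the choice mu_k = (A_k)_{nn} is only an illustrative answer to the question above (practical codes use Wilkinson or double shifts).

```python
import numpy as np

def qr_iterations_with_shift(A, nsteps=50):
    """QR iterations with the simple shift mu_k = (A_k)_{nn} (illustrative choice)."""
    Ak = A.astype(float).copy()
    n = Ak.shape[0]
    I = np.eye(n)
    for _ in range(nsteps):
        mu = Ak[-1, -1]
        Q, R = np.linalg.qr(Ak - mu * I)
        Ak = R @ Q + mu * I          # equals Q^T A_k Q
    return Ak

A = np.array([[4.0, 1.0, 0.5], [1.0, 3.0, 0.2], [0.5, 0.2, 1.0]])
print(np.round(qr_iterations_with_shift(A), 8))
```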

Points to note

QR decomposition can be effected on any square matrix.

Practical methods of QR decomposition use Householder transformations or Givens rotations.

A QR iteration effects a similarity transformation on a matrix, preserving symmetry, Hessenberg structure and also a symmetric tridiagonal form.

A sequence of QR iterations converges to an almost upper-triangular form.

Operations on symmetric tridiagonal and Hessenberg forms are computationally efficient.

QR iterations tend to order subspaces according to the relative magnitudes of eigenvalues.

Eigenvalue shifting is useful as an expediting strategy.

Necessary Exercises: 1, 3

Outline

Eigenvalue Problem of General Matrices
Introductory Remarks
Reduction to Hessenberg Form*
QR Algorithm on Hessenberg Matrices*
Inverse Iteration
Recommendation

Introductory Remarks

A general (non-symmetric) matrix may not be diagonalizable. We attempt to triangularize it.

With real arithmetic, 2 \times 2 diagonal blocks are inevitable, signifying complex pairs of eigenvalues.

Higher computational complexity, slow convergence and lack of numerical stability.

A non-symmetric matrix is usually unbalanced and is prone to higher round-off errors.
Balancing as a pre-processing step: multiplication of a row and division of the corresponding column with the same number, ensuring similarity.

Note: A balanced matrix may get unbalanced again through similarity transformations that are not orthogonal!

Reduction to Hessenberg Form*

Methods to find appropriate similarity transformations:

1. a full sweep of Givens rotations,
2. a sequence of n-2 steps of Householder transformations, and
3. a cycle of coordinated Gaussian elimination.

Method based on Gaussian elimination or elementary transformations:
The pre-multiplying matrix corresponding to the elementary row transformation and the post-multiplying matrix corresponding to the matching column transformation must be inverses of each other.

Two kinds of steps: pivoting and elimination.

Pivoting step: \bar A = P_{rs} A P_{rs} = P_{rs}^{-1} A P_{rs}.

  Permutation P_{rs}: interchange of r-th and s-th columns.
  P_{rs}^{-1} = P_{rs}: interchange of r-th and s-th rows.
  Pivot locations: a_{21}, a_{32}, \cdots, a_{n-1,n-2}.

Elimination step: \bar A = G_r^{-1} A G_r with elimination matrix

  G_r = \begin{bmatrix} I_r & 0 & 0 \\ 0 & 1 & 0 \\ 0 & k & I_{n-r-1} \end{bmatrix}
  \quad \text{and} \quad
  G_r^{-1} = \begin{bmatrix} I_r & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -k & I_{n-r-1} \end{bmatrix}.

  G_r^{-1}: Row (r+1+i) \leftarrow Row (r+1+i) - k_i\, Row (r+1),  for i = 1, 2, 3, \cdots, n-r-1
  G_r: Column (r+1) \leftarrow Column (r+1) + \sum_{i=1}^{n-r-1} [k_i\, Column (r+1+i)]

QR Algorithm on Hessenberg Matrices*

QR iterations: O(n^2) operations for upper Hessenberg form.

Whenever a sub-diagonal zero appears, the matrix is split into two smaller upper Hessenberg blocks, and they are processed separately, thereby reducing the cost drastically.

Particular cases:

  a_{n,n-1} \to 0: Accept a_{nn} = \lambda_n as an eigenvalue, continue with the leading (n-1) \times (n-1) sub-matrix.

  a_{n-1,n-2} \to 0: Separately find the eigenvalues \lambda_{n-1} and \lambda_n from
    \begin{bmatrix} a_{n-1,n-1} & a_{n-1,n} \\ a_{n,n-1} & a_{n,n} \end{bmatrix},
  and continue with the leading (n-2) \times (n-2) sub-matrix.

Shift strategy: Double QR steps.

Inverse Iteration

Assumption: Matrix A has a complete set of eigenvectors.

(\lambda_i)_0: a good estimate of an eigenvalue \lambda_i of A.

Purpose: To find \lambda_i precisely and also to find v_i.

Step: Select a random vector y_0 (with \|y_0\| = 1) and solve

  [A - (\lambda_i)_0 I]\, y = y_0.

Result: y is a good estimate of v_i and

  (\lambda_i)_1 = (\lambda_i)_0 + \frac{1}{y_0^T y}

is an improvement in the estimate of the eigenvalue.

How to establish the result and work out an algorithm?

With y_0 = \sum_{j=1}^{n} \alpha_j v_j and y = \sum_{j=1}^{n} \beta_j v_j, the equation [A - (\lambda_i)_0 I]\, y = y_0 gives

  \sum_{j=1}^{n} \beta_j [A - (\lambda_i)_0 I]\, v_j = \sum_{j=1}^{n} \alpha_j v_j
  \;\Rightarrow\; \beta_j [\lambda_j - (\lambda_i)_0] = \alpha_j
  \;\Rightarrow\; \beta_j = \frac{\alpha_j}{\lambda_j - (\lambda_i)_0}.

\beta_i is typically large and eigenvector v_i dominates y.

Av_i = \lambda_i v_i gives [A - (\lambda_i)_0 I]\, v_i = [\lambda_i - (\lambda_i)_0]\, v_i. Hence,

  [\lambda_i - (\lambda_i)_0]\, y \approx [A - (\lambda_i)_0 I]\, y = y_0.

Inner product with y_0 gives

  [\lambda_i - (\lambda_i)_0]\, y_0^T y \approx 1 \;\Rightarrow\; \lambda_i \approx (\lambda_i)_0 + \frac{1}{y_0^T y}.

Algorithm:

Start with estimate (\lambda_i)_0, guess y_0 (normalized).

For k = 0, 1, 2, \cdots
  Solve [A - (\lambda_i)_k I]\, y = y_k.
  Normalize y_{k+1} = y/\|y\|.
  Improve (\lambda_i)_{k+1} = (\lambda_i)_k + \frac{1}{y_k^T y}.
  If \|y_{k+1} - y_k\| < \epsilon, terminate.

Important issues

  Update eigenvalue once in a while, not at every iteration.
  Use some acceptable small number as artificial pivot.
  The method may not converge for a defective matrix or for one having complex eigenvalues.
  Repeated eigenvalues may inhibit the process.
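A compact sketch of the algorithm (for brevity it updates the eigenvalue at every step, whereas the note above suggests doing so only occasionally; names, random seed and tolerances are my choices).

```python
import numpy as np

def inverse_iteration(A, lam0, tol=1e-10, maxit=50):
    """Inverse iteration: polish an eigenvalue estimate lam0 and find the eigenvector."""
    n = A.shape[0]
    rng = np.random.default_rng(0)
    y = rng.standard_normal(n)
    y /= np.linalg.norm(y)
    lam = lam0
    for _ in range(maxit):
        z = np.linalg.solve(A - lam * np.eye(n), y)   # ill-conditioned solve is expected here
        lam = lam + 1.0 / (y @ z)                     # eigenvalue improvement
        y_new = z / np.linalg.norm(z)
        if np.linalg.norm(y_new - np.sign(y_new @ y) * y) < tol:
            y = y_new
            break
        y = y_new
    return lam, y

A = np.array([[4.0, 1.0, 0.5], [1.0, 3.0, 0.2], [0.5, 0.2, 1.0]])
lam, v = inverse_iteration(A, lam0=4.3)
print(lam, np.linalg.norm(A @ v - lam * v))
```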

Recommendation

Table: Eigenvalue problem: summary of methods

Type         | Size                               | Reduction                                                                       | Algorithm                                                                                              | Post-processing
General      | Small (up to 4)                    | Definition: characteristic polynomial                                           | Polynomial root finding (eigenvalues)                                                                  | Solution of linear systems (eigenvectors)
Symmetric    | Intermediate (say, 4-12)           | Jacobi sweeps                                                                   | Selective Jacobi rotations                                                                             | -
Symmetric    | Large                              | Tridiagonalization (Givens rotation or Householder method; usually Householder) | Sturm sequence property: bracketing and bisection (rough eigenvalues), or QR decomposition iterations  | Inverse iteration (eigenvalue improvement and eigenvectors)
Nonsymmetric | Intermediate / Large               | Balancing, and then reduction to Hessenberg form (above methods or Gaussian elimination) | QR decomposition iterations (eigenvalues)                                                      | Inverse iteration (eigenvectors)
General      | Very large (selective requirement) | -                                                                               | Power method, shift and deflation                                                                      | Inverse iteration (eigenvalue improvement and eigenvectors)

Points to note

Eigenvalue problem of a non-symmetric matrix is difficult!

Balancing and reduction to Hessenberg form are desirable pre-processing steps.

QR decomposition algorithm is typically used for reduction to an upper-triangular form.

Use inverse iteration to polish eigenvalues and find eigenvectors.

In algebraic eigenvalue problems, different methods or combinations are suitable for different cases, depending on matrix size, symmetry and the requirements.

Necessary Exercises: 1, 2

Outline

Singular Value Decomposition
SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

SVD Theorem and Construction

Eigenvalue problem: A = U \Lambda V^{-1} where U = V.

Do not ask for similarity. Focus on the form of the decomposition.

Guaranteed decomposition, with orthogonal U, V and non-negative diagonal entries in \Sigma, by allowing U \ne V:

  A = U \Sigma V^T  such that  U^T A V = \Sigma

SVD Theorem: For any real matrix A \in R^{m \times n}, there exist orthogonal matrices U \in R^{m \times m} and V \in R^{n \times n} such that

  U^T A V = \Sigma \in R^{m \times n}

is a diagonal matrix, with diagonal entries \sigma_1, \sigma_2, \cdots \ge 0, obtained by appending the square diagonal matrix diag(\sigma_1, \sigma_2, \cdots, \sigma_p) with (m - p) zero rows or (n - p) zero columns, where p = \min(m, n).

Singular values: \sigma_1, \sigma_2, \cdots, \sigma_p.

Similar result holds for complex matrices.

Question: How to construct U, V and \Sigma?

For A \in R^{m \times n},

  A^T A = (V \Sigma^T U^T)(U \Sigma V^T) = V \Sigma^T \Sigma V^T = V \Lambda V^T,

where \Lambda = \Sigma^T \Sigma = diag(\sigma_1^2, \cdots, \sigma_p^2, 0, \cdots, 0) is an n \times n diagonal matrix.

Determine V and \Lambda. Work out \Sigma and we have

  A = U \Sigma V^T \;\Rightarrow\; AV = U\Sigma.

This provides a proof as well!

Mathematical Methods in Engineering and Science

SVD Theorem and Construction


From AV = U, determine columns of U.

Singular Value Decomposition


SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

1. Column Avk = k uk , with k 6= 0: determine column uk .


Columns developed are bound to be mutually
orthonormal!
T 


1
1
= ij .
Av
Av
Verify uT
u
=
i
j
j
i
i
j

2. Column Avk = k uk , with k = 0: uk is left indeterminate


(free).
3. In the case of m < n, identically zero columns Avk = 0 for
k > m: no corresponding columns of U to determine.
4. In the case of m > n, there will be (m n) columns of U left
indeterminate.
Extend columns of U to an orthonormal basis.

351,

Mathematical Methods in Engineering and Science

SVD Theorem and Construction


From AV = U, determine columns of U.

Singular Value Decomposition


SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

All three factors in the decomposition are constructed, as desired.

352,
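The construction above can be exercised numerically. The following Python/NumPy sketch (the 4 × 3 random matrix is an arbitrary stand-in) builds V and the singular values from the eigen-decomposition of AᵀA and recovers the determinable columns of U from AV = UΣ; it illustrates the proof, and is not meant as a production SVD algorithm.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))            # example matrix, m = 4, n = 3

# Eigen-decomposition of A^T A = V Lambda V^T gives V and sigma_k^2.
lam, V = np.linalg.eigh(A.T @ A)           # eigenvalues in ascending order
idx = np.argsort(lam)[::-1]                # reorder: decreasing singular values
lam, V = lam[idx], V[:, idx]
sigma = np.sqrt(np.clip(lam, 0.0, None))

# Columns of U from A v_k = sigma_k u_k (all sigma_k nonzero here).
U = (A @ V) / sigma                        # divides column k by sigma_k
print(np.allclose(U.T @ U, np.eye(3)))     # constructed columns are orthonormal
print(np.allclose(A, U @ np.diag(sigma) @ V.T))   # A = U Sigma V^T

# Compare the singular values with the library routine.
s_ref = np.linalg.svd(A, compute_uv=False)
print(np.allclose(sigma, s_ref))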

Mathematical Methods in Engineering and Science

Properties of SVD

Singular Value Decomposition


SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

For a given matrix, the SVD is unique up to

(a) the same permutations of columns of U, columns of V and


diagonal elements of ;
(b) the same orthonormal linear combinations among columns of
U and columns of V, corresponding to equal singular values;
and
(c) arbitrary orthonormal linear combinations among columns of
U or columns of V, corresponding to zero or non-existent
singular values.

353,

Mathematical Methods in Engineering and Science

Properties of SVD

Singular Value Decomposition


SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

For a given matrix, the SVD is unique up to

(a) the same permutations of columns of U, columns of V and


diagonal elements of ;
(b) the same orthonormal linear combinations among columns of
U and columns of V, corresponding to equal singular values;
and
(c) arbitrary orthonormal linear combinations among columns of
U or columns of V, corresponding to zero or non-existent
singular values.
Ordering of the singular values:
    σ_1 ≥ σ_2 ≥ ⋯ ≥ σ_r > 0, and σ_{r+1} = σ_{r+2} = ⋯ = σ_p = 0.

354,

Mathematical Methods in Engineering and Science

Properties of SVD

Singular Value Decomposition


SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

For a given matrix, the SVD is unique up to

(a) the same permutations of columns of U, columns of V and


diagonal elements of ;
(b) the same orthonormal linear combinations among columns of
U and columns of V, corresponding to equal singular values;
and
(c) arbitrary orthonormal linear combinations among columns of
U or columns of V, corresponding to zero or non-existent
singular values.
Ordering of the singular values:
    σ_1 ≥ σ_2 ≥ ⋯ ≥ σ_r > 0, and σ_{r+1} = σ_{r+2} = ⋯ = σ_p = 0.

Rank(A) = Rank(Σ) = r
Rank of a matrix is the same as the number of its
non-zero singular values.

355,
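A small numerical illustration of this property (the rank-2 test matrix below is made up for the purpose):

import numpy as np

# Build a 5x4 matrix of rank 2 from two rank-one terms.
rng = np.random.default_rng(1)
A = np.outer(rng.standard_normal(5), rng.standard_normal(4)) \
  + np.outer(rng.standard_normal(5), rng.standard_normal(4))

s = np.linalg.svd(A, compute_uv=False)
tol = max(A.shape) * np.finfo(float).eps * s[0]   # a common tolerance choice
r = int(np.sum(s > tol))

print(s)                                  # two significant values, rest ~ 0
print(r, np.linalg.matrix_rank(A))        # both report rank 2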

Mathematical Methods in Engineering and Science

Singular Value Decomposition

Properties of SVD

356,

SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

    Ax = UΣVᵀx = UΣy
       = [u_1 ⋯ u_r  u_{r+1} ⋯ u_m] [σ_1 y_1 ⋯ σ_r y_r  0 ⋯ 0]ᵀ
       = σ_1 y_1 u_1 + σ_2 y_2 u_2 + ⋯ + σ_r y_r u_r

has non-zero components along only the first r columns of U.

U gives an orthonormal basis for the co-domain such that
    Range(A) = < u_1, u_2, ⋯, u_r >.

Mathematical Methods in Engineering and Science

Singular Value Decomposition

Properties of SVD

357,

SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

    Ax = UΣVᵀx = UΣy = σ_1 y_1 u_1 + σ_2 y_2 u_2 + ⋯ + σ_r y_r u_r

has non-zero components along only the first r columns of U.

U gives an orthonormal basis for the co-domain such that
    Range(A) = < u_1, u_2, ⋯, u_r >.

With Vᵀx = y, v_kᵀx = y_k, and
    x = y_1 v_1 + y_2 v_2 + ⋯ + y_r v_r + y_{r+1} v_{r+1} + ⋯ + y_n v_n.

V gives an orthonormal basis for the domain such that
    Null(A) = < v_{r+1}, v_{r+2}, ⋯, v_n >.


Mathematical Methods in Engineering and Science

Singular Value Decomposition

Properties of SVD

SVD Theorem and Construction


Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

In basis V, v = c_1 v_1 + c_2 v_2 + ⋯ + c_n v_n = Vc, and the norm is
given by

    ‖A‖² = max_v ‖Av‖²/‖v‖² = max_v (vᵀAᵀAv)/(vᵀv)
         = max_c (cᵀVᵀAᵀAVc)/(cᵀVᵀVc) = max_c (cᵀΛc)/(cᵀc)
         = max_c (∑_k σ_k² c_k²)/(∑_k c_k²).

    ‖A‖ = √[ max_c (∑_k σ_k² c_k²)/(∑_k c_k²) ] = σ_max

For a non-singular square matrix,

    A⁻¹ = (UΣVᵀ)⁻¹ = VΣ⁻¹Uᵀ = V diag(1/σ_1, 1/σ_2, ⋯, 1/σ_n) Uᵀ.

Then, ‖A⁻¹‖ = 1/σ_min, and the condition number is

    κ(A) = ‖A‖ ‖A⁻¹‖ = σ_max/σ_min.

361,
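These two identities are easy to check numerically; a minimal sketch, assuming nothing beyond NumPy's standard linalg routines:

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
s = np.linalg.svd(A, compute_uv=False)    # sigma_1 >= ... >= sigma_n

print(np.isclose(s[0], np.linalg.norm(A, 2)))          # ||A|| = sigma_max
print(np.isclose(s[0] / s[-1], np.linalg.cond(A, 2)))  # kappa = sigma_max / sigma_min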

Mathematical Methods in Engineering and Science

Properties of SVD

Singular Value Decomposition


SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

Revision of definition of norm and condition number:


The norm of a matrix is the same as its largest singular
value, while its condition number is given by the ratio of
the largest singular value to the least.

362,


Mathematical Methods in Engineering and Science

Singular Value Decomposition

Properties of SVD

SVD Theorem and Construction


Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

Revision of definition of norm and condition number:


The norm of a matrix is the same as its largest singular
value, while its condition number is given by the ratio of
the largest singular value to the least.

Arranging singular values in decreasing order, with Rank(A) = r,
U = [U_r  Ū] and V = [V_r  V̄],

    A = UΣVᵀ = [U_r  Ū] [ Σ_r 0 ; 0 0 ] [ V_rᵀ ; V̄ᵀ ],

or,

    A = U_r Σ_r V_rᵀ = ∑_{k=1}^{r} σ_k u_k v_kᵀ.

Efficient storage and reconstruction!

364,
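A sketch of the storage and reconstruction idea, with sizes chosen only for illustration; by the well-known Eckart-Young property, the 2-norm error of the truncated sum equals the first discarded singular value.

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((100, 80))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 10                                    # keep the k largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

full_storage = A.size
truncated_storage = U[:, :k].size + k + Vt[:k, :].size
print(truncated_storage, "numbers instead of", full_storage)
print(np.linalg.norm(A - A_k, 2), s[k])   # truncation error = next singular value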

Mathematical Methods in Engineering and Science

Singular Value Decomposition

Pseudoinverse and Solution of Linear Systems

SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

Generalized inverse: G is called a generalized inverse or g-inverse
of A if, for b ∈ Range(A), Gb is a solution of Ax = b.

365,


Mathematical Methods in Engineering and Science

Singular Value Decomposition

Pseudoinverse and Solution of Linear Systems

SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

Generalized inverse: G is called a generalized inverse or g-inverse
of A if, for b ∈ Range(A), Gb is a solution of Ax = b.

The Moore-Penrose inverse or the pseudoinverse:

    A# = (UΣVᵀ)# = (Vᵀ)# Σ# U# = V Σ# Uᵀ

With Σ = [ Σ_r 0 ; 0 0 ],   Σ# = [ Σ_r⁻¹ 0 ; 0 0 ].

Or, Σ# is the diagonal matrix with diagonal entries
σ_1*, σ_2*, ⋯, σ_p* (appended with zero rows or columns), where

    σ_k* = 1/σ_k   for σ_k ≠ 0 (or for |σ_k| > ε);
    σ_k* = 0       for σ_k = 0 (or for |σ_k| ≤ ε).

368,
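The definition of Σ# translates directly into code. A minimal sketch follows; the threshold 1e-10 is an assumed value standing in for the ε above, and the rank-3 test matrix is made up for the example.

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))   # 5x4, rank 3

U, s, Vt = np.linalg.svd(A)               # full U (5x5), V^T (4x4), 4 singular values
eps = 1e-10                               # assumed threshold playing the role of epsilon
s_star = np.array([1.0 / sk if sk > eps else 0.0 for sk in s])

Sigma_pinv = np.zeros((A.shape[1], A.shape[0]))   # n x m
Sigma_pinv[:len(s), :len(s)] = np.diag(s_star)
A_pinv = Vt.T @ Sigma_pinv @ U.T          # A# = V Sigma# U^T

print(np.allclose(A_pinv, np.linalg.pinv(A)))     # agrees with numpy's pseudoinverse here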


Mathematical Methods in Engineering and Science

Singular Value Decomposition

SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

Inverse-like facets and beyond

    (A#)# = A.
    If A is invertible, then A# = A⁻¹.
        A#b gives the correct unique solution.
    If Ax = b is an under-determined consistent system, then
        A#b selects the solution x with the minimum norm.
    If the system is inconsistent, then A#b minimizes the least
        square error ‖Ax − b‖.
    If the minimizer of ‖Ax − b‖ is not unique, then it picks up
        that minimizer which has the minimum norm ‖x‖ among such
        minimizers.

Contrast with Tikhonov regularization:

    Pseudoinverse solution for precision and diagnosis.
    Tikhonov's solution for continuity of the solution over
        variable A and computational efficiency.

370,
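The optimality claims can be checked against NumPy's own routines; a small sketch with made-up systems:

import numpy as np

rng = np.random.default_rng(5)

# Under-determined consistent system: infinitely many solutions.
A = rng.standard_normal((2, 4))
b = A @ rng.standard_normal(4)            # consistent by construction
x_min = np.linalg.pinv(A) @ b
x_other = x_min + np.linalg.svd(A)[2][-1] # add a null-space direction
print(np.allclose(A @ x_min, b), np.allclose(A @ x_other, b))
print(np.linalg.norm(x_min) < np.linalg.norm(x_other))   # pinv picks the minimum norm

# Inconsistent system: pinv gives the least-squares solution.
A2 = rng.standard_normal((5, 2))
b2 = rng.standard_normal(5)
x_ls = np.linalg.pinv(A2) @ b2
x_ref, *_ = np.linalg.lstsq(A2, b2, rcond=None)
print(np.allclose(x_ls, x_ref))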


Mathematical Methods in Engineering and Science

Singular Value Decomposition

SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

Pseudoinverse solution of Ax = b:

    x̄ = V Σ# Uᵀ b = ∑_{k=1}^{r} (1/σ_k) v_k u_kᵀ b = ∑_{k=1}^{r} (u_kᵀb/σ_k) v_k

Minimize

    E(x) = ½ (Ax − b)ᵀ(Ax − b) = ½ xᵀAᵀAx − xᵀAᵀb + ½ bᵀb

Condition of vanishing gradient:

    ∂E/∂x = 0  ⇒  AᵀAx = Aᵀb
           ⇒  V(ΣᵀΣ)Vᵀx = VΣᵀUᵀb
           ⇒  (ΣᵀΣ)Vᵀx = ΣᵀUᵀb
           ⇒  σ_k² v_kᵀx = σ_k u_kᵀb
           ⇒  v_kᵀx = u_kᵀb/σ_k   for k = 1, 2, 3, ⋯, r.

373,


Mathematical Methods in Engineering and Science

Singular Value Decomposition

SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

With V̄ = [v_{r+1}  v_{r+2}  ⋯  v_n], then

    x = ∑_{k=1}^{r} (u_kᵀb/σ_k) v_k + V̄y = x̄ + V̄y.

How to minimize ‖x‖² subject to E(x) minimum?

    Minimize E_1(y) = ‖x̄ + V̄y‖².

Since x̄ and V̄y are mutually orthogonal,

    E_1(y) = ‖x̄ + V̄y‖² = ‖x̄‖² + ‖V̄y‖²

is minimum when V̄y = 0, i.e. y = 0.

377,


Mathematical Methods in Engineering and Science

Singular Value Decomposition

SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

Anatomy of the optimization through SVD

Using basis V for the domain and U for the co-domain, the variables
are transformed as
    Vᵀx = y and Uᵀb = c.
Then,
    Ax = b  ⇔  UΣVᵀx = b  ⇔  ΣVᵀx = Uᵀb  ⇔  Σy = c.
A completely decoupled system!

Usable components: y_k = c_k/σ_k for k = 1, 2, 3, ⋯, r.
For k > r,
    completely redundant information (c_k = 0)
    purely unresolvable conflict (c_k ≠ 0)

SVD extracts this pure redundancy/inconsistency.
Setting y_k = 0 for k > r rejects it wholesale!
At the same time, ‖y‖ is minimized, and hence ‖x‖ too.

381,
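The decoupling Σy = c is easy to see numerically; the sketch below (rank-deficient example made up for illustration) reproduces the pseudoinverse solution from the usable components alone:

import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))   # 4x3, rank 2
b = rng.standard_normal(4)                                      # generally inconsistent

U, s, Vt = np.linalg.svd(A)
c = U.T @ b                               # coordinates of b in basis U
r = int(np.sum(s > 1e-10))

y = np.zeros(3)
y[:r] = c[:r] / s[:r]                     # usable components y_k = c_k / sigma_k
x_bar = Vt.T @ y                          # back to original coordinates

print(np.allclose(x_bar, np.linalg.pinv(A) @ b))   # same as the pseudoinverse solution
print(c[r:])                              # the rejected (redundant/conflicting) components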

Mathematical Methods in Engineering and Science

Points to note

Singular Value Decomposition


SVD Theorem and Construction
Properties of SVD
Pseudoinverse and Solution of Linear Systems
Optimality of Pseudoinverse Solution
SVD Algorithm

SVD provides a complete orthogonal decomposition of the


domain and co-domain of a linear transformation, separating
out functionally distinct subspaces.

It offers a complete diagnosis of the pathologies of systems of


linear equations.

Pseudoinverse solutions of linear systems satisfy meaningful


optimality requirements in several contexts.

With the existence of SVD guaranteed, many important


results can be established in a straightforward manner.

Necessary Exercises: 2,4,5,6,7

382,

Mathematical Methods in Engineering and Science

Outline

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

383,

Mathematical Methods in Engineering and Science

Group

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

A set G and a binary operation, say +, fulfilling


Closure: a + b ∈ G  ∀ a, b ∈ G

Associativity: a + (b + c) = (a + b) + c, ∀ a, b, c ∈ G

Existence of identity: ∃ 0 ∈ G such that ∀ a ∈ G, a + 0 = a = 0 + a

Existence of inverse: ∀ a ∈ G, ∃ (−a) ∈ G such that
a + (−a) = 0 = (−a) + a

Examples: (Z, +), (R, +), (Q − {0}, ·), 2 × 5 real matrices,
rotations etc.

384,

Mathematical Methods in Engineering and Science

Group

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

A set G and a binary operation, say +, fulfilling


Closure:

a + b G a, b G

Associativity: a + (b + c) = (a + b) + c, a, b, c G

Existence of identity: 0 G such that a G , a + 0 = a = 0 + a


Existence of inverse: a G , (a) G such that
a + (a) = 0 = (a) + a

Examples: (Z , +), (R, +), (Q {0}, ), 2 5 real matrices,


Rotations etc.

Commutative group
Examples:(Z , +), (R, +), (Q {0}, ),

(F, +).

385,

Mathematical Methods in Engineering and Science

Group

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

A set G and a binary operation, say +, fulfilling


Closure:

a + b G a, b G

Associativity: a + (b + c) = (a + b) + c, a, b, c G

Existence of identity: 0 G such that a G , a + 0 = a = 0 + a


Existence of inverse: a G , (a) G such that
a + (a) = 0 = (a) + a

Examples: (Z , +), (R, +), (Q {0}, ), 2 5 real matrices,


Rotations etc.

Commutative group
Examples:(Z , +), (R, +), (Q {0}, ),

Subgroup

(F, +).

386,

Mathematical Methods in Engineering and Science

Field

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

A set F and two binary operations, say + and ·, satisfying

Group property for addition: (F, +) is a commutative group.
    (Denote the identity element of this group as 0.)
Group property for multiplication: (F − {0}, ·) is a commutative
    group. (Denote the identity element of this group as 1.)
Distributivity: a · (b + c) = a · b + a · c, ∀ a, b, c ∈ F.

Concept of field: abstraction of a number system

387,

Mathematical Methods in Engineering and Science

Field

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

A set F and two binary operations, say + and , satisfying


Group property for addition: (F , +) is a commutative group.
(Denote the identity element of this group as 0.)
Group property for multiplication: (F {0}, ) is a commutative
group. (Denote the identity element of this group as
1.)
Distributivity: a (b + c) = a b + a c, a, b, c F .
Concept of field: abstraction of a number system
Examples: (Q, +, ), (R, +, ), (C , +, ) etc.

388,

Mathematical Methods in Engineering and Science

Field

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

A set F and two binary operations, say + and , satisfying


Group property for addition: (F , +) is a commutative group.
(Denote the identity element of this group as 0.)
Group property for multiplication: (F {0}, ) is a commutative
group. (Denote the identity element of this group as
1.)
Distributivity: a (b + c) = a b + a c, a, b, c F .
Concept of field: abstraction of a number system
Examples: (Q, +, ), (R, +, ), (C , +, ) etc.

Subfield

389,

Mathematical Methods in Engineering and Science

Vector Space
A vector space is defined by

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

a field F of scalars,

a commutative group V of vectors, and

a binary operation between F and V, that may be called
scalar multiplication, such that ∀ α, β ∈ F, ∀ a, b ∈ V the
following conditions hold.
    Closure: αa ∈ V.
    Identity: 1a = a.
    Associativity: (αβ)a = α(βa).
    Scalar distributivity: α(a + b) = αa + αb.
    Vector distributivity: (α + β)a = αa + βa.

Examples: Rⁿ, Cⁿ, m × n real matrices etc.

390,

Mathematical Methods in Engineering and Science

Vector Space
A vector space is defined by

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

a field F of scalars,

a commutative group V of vectors, and

a binary operation between F and V, that may be called


scalar multiplication, such that , F , a, b V; the
following conditions hold.
Closure:
a V.
Identity:
1a = a.
Associativity: ()a = (a).
Scalar distributivity: (a + b) = a + b.
Vector distributivity: ( + )a = a + a.

Examples: R n , C n , m n real matrices etc.


Field Number system
Vector space Space

391,

Mathematical Methods in Engineering and Science

Vector Space
Suppose V is a vector space.
Take a vector 1 6= 0 in it.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

392,

Mathematical Methods in Engineering and Science

Vector Space
Suppose V is a vector space.
Take a vector 1 6= 0 in it.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Then, vectors linearly dependent on 1 :


1 1 V 1 F .

393,

Mathematical Methods in Engineering and Science

Vector Space
Suppose V is a vector space.
Take a vector 1 6= 0 in it.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Then, vectors linearly dependent on 1 :


1 1 V 1 F .

Question: Are the elements of V exhausted?

394,

Mathematical Methods in Engineering and Science

Vector Space
Suppose V is a vector space.
Take a vector 1 6= 0 in it.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Then, vectors linearly dependent on 1 :


1 1 V 1 F .

Question: Are the elements of V exhausted?


If not, then take 2 V: linearly independent from 1 .
Then, 1 1 + 2 2 V 1 , 2 F .

395,

Mathematical Methods in Engineering and Science

Vector Space
Suppose V is a vector space.
Take a vector 1 6= 0 in it.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Then, vectors linearly dependent on 1 :


1 1 V 1 F .

Question: Are the elements of V exhausted?


If not, then take 2 V: linearly independent from 1 .
Then, 1 1 + 2 2 V 1 , 2 F .

Question: Are the elements of V exhausted now?

396,

Mathematical Methods in Engineering and Science

Vector Space
Suppose V is a vector space.
Take a vector 1 6= 0 in it.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Then, vectors linearly dependent on 1 :


1 1 V 1 F .

Question: Are the elements of V exhausted?


If not, then take 2 V: linearly independent from 1 .
Then, 1 1 + 2 2 V 1 , 2 F .

Question: Are the elements of V exhausted now?


397,

Mathematical Methods in Engineering and Science

Vector Space
Suppose V is a vector space.
Take a vector 1 6= 0 in it.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Then, vectors linearly dependent on 1 :


1 1 V 1 F .

Question: Are the elements of V exhausted?


If not, then take 2 V: linearly independent from 1 .
Then, 1 1 + 2 2 V 1 , 2 F .

Question: Are the elements of V exhausted now?



Question: Will this process ever end?

398,

Mathematical Methods in Engineering and Science

Vector Space
Suppose V is a vector space.
Take a vector 1 6= 0 in it.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Then, vectors linearly dependent on 1 :


1 1 V 1 F .

Question: Are the elements of V exhausted?


If not, then take 2 V: linearly independent from 1 .
Then, 1 1 + 2 2 V 1 , 2 F .

Question: Are the elements of V exhausted now?



Question: Will this process ever end?
Suppose it does.

399,

Mathematical Methods in Engineering and Science

Vector Space
Suppose V is a vector space.
Take a vector 1 6= 0 in it.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Then, vectors linearly dependent on 1 :


1 1 V 1 F .

Question: Are the elements of V exhausted?


If not, then take 2 V: linearly independent from 1 .
Then, 1 1 + 2 2 V 1 , 2 F .

Question: Are the elements of V exhausted now?



Question: Will this process ever end?
Suppose it does.
finite dimensional vector space

400,

Mathematical Methods in Engineering and Science

Vector Space
Finite dimensional vector space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Suppose the above process ends after n choices of linearly


independent vectors.
= 1 1 + 2 2 + + n n

401,

Mathematical Methods in Engineering and Science

Vector Space
Finite dimensional vector space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Suppose the above process ends after n choices of linearly


independent vectors.
= 1 1 + 2 2 + + n n
Then,

n: dimension of the vector space

ordered set 1 , 2 , , n : a basis

1 , 2 , , n F : coordinates of in that basis

402,

Mathematical Methods in Engineering and Science

Vector Space
Finite dimensional vector space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Suppose the above process ends after n choices of linearly


independent vectors.
= 1 1 + 2 2 + + n n
Then,

n: dimension of the vector space

ordered set 1 , 2 , , n : a basis

1 , 2 , , n F : coordinates of in that basis

R n , R m etc: vector spaces over the field of real numbers

403,

Mathematical Methods in Engineering and Science

Vector Space
Finite dimensional vector space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Suppose the above process ends after n choices of linearly


independent vectors.
= 1 1 + 2 2 + + n n
Then,

n: dimension of the vector space

ordered set 1 , 2 , , n : a basis

1 , 2 , , n F : coordinates of in that basis

R n , R m etc: vector spaces over the field of real numbers

Subspace

404,

Mathematical Methods in Engineering and Science

Linear Transformation
A mapping T : V W satisfying

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

T(a + b) = T(a) + T(b) , F and a, b V


where V and W are vector spaces over the field F .

405,

Mathematical Methods in Engineering and Science

Linear Transformation
A mapping T : V W satisfying

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

T(a + b) = T(a) + T(b) , F and a, b V


where V and W are vector spaces over the field F .
Question: How to describe the linear transformation T?

406,

Mathematical Methods in Engineering and Science

Linear Transformation
A mapping T : V W satisfying

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

T(a + b) = T(a) + T(b) , F and a, b V


where V and W are vector spaces over the field F .
Question: How to describe the linear transformation T?

For V, basis 1 , 2 , , n

For W, basis 1 , 2 , , m

1 V gets mapped to T(1 ) W.


T(1 ) = a11 1 + a21 2 + + am1 m
P
Similarly, enumerate T(j ) = m
i =1 aij i .

407,

Mathematical Methods in Engineering and Science

Vector Spaces: Fundamental Concepts*

Linear Transformation
A mapping T : V W satisfying

Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

T(a + b) = T(a) + T(b) , F and a, b V


where V and W are vector spaces over the field F .
Question: How to describe the linear transformation T?

For V, basis α_1, α_2, ⋯, α_n
For W, basis β_1, β_2, ⋯, β_m

α_1 ∈ V gets mapped to T(α_1) ∈ W:

    T(α_1) = a_11 β_1 + a_21 β_2 + ⋯ + a_m1 β_m

Similarly, enumerate T(α_j) = ∑_{i=1}^{m} a_ij β_i.

Matrix A = [a_1  a_2  ⋯  a_n] codes this description!

408,
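As an illustration of this matrix description, consider the differentiation operator on polynomials of degree up to three, with the monomial basis assumed for both spaces (a choice made here only for the example):

import numpy as np

# T = d/dx on polynomials of degree <= 3, basis (1, x, x^2, x^3) for both V and W.
# Column j of A lists the coordinates of T(basis_j):
#   T(1) = 0, T(x) = 1, T(x^2) = 2x, T(x^3) = 3x^2.
A = np.array([[0., 1., 0., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 3.],
              [0., 0., 0., 0.]])

p = np.array([5., 0., -1., 2.])           # coordinates of 5 - x^2 + 2x^3
print(A @ p)                              # [0, -2, 6, 0]  ->  -2x + 6x^2, its derivative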

Mathematical Methods in Engineering and Science

Linear Transformation

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

A general element of V can be expressed as

= x1 1 + x2 2 + + xn n
Coordinates in a column: x = [x1 x2 xn ]T

409,

Mathematical Methods in Engineering and Science

Linear Transformation

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

A general element of V can be expressed as

= x1 1 + x2 2 + + xn n
Coordinates in a column: x = [x1 x2 xn ]T
Mapping:
T() = x1 T(1 ) + x2 T(2 ) + + xn T(n ),
with coordinates Ax, as we know!

410,

Mathematical Methods in Engineering and Science

Linear Transformation

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

A general element ξ of V can be expressed as

    ξ = x_1 α_1 + x_2 α_2 + ⋯ + x_n α_n

Coordinates in a column: x = [x_1 x_2 ⋯ x_n]ᵀ
Mapping:
    T(ξ) = x_1 T(α_1) + x_2 T(α_2) + ⋯ + x_n T(α_n),
with coordinates Ax, as we know!
Summary:

basis vectors of V get mapped to vectors in W whose


coordinates are listed in columns of A, and

a vector of V, having its coordinates in x, gets mapped to a


vector in W whose coordinates are obtained from Ax.

411,

Mathematical Methods in Engineering and Science

Linear Transformation
Understanding:

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Vector is an actual object in the set V and the column


x R n is merely a list of its coordinates.

T : V W is the linear transformation and the matrix A


simply stores coefficients needed to describe it.

By changing bases of V and W, the same vector and the


same linear transformation are now expressed by different x
and A, respectively.

412,

Mathematical Methods in Engineering and Science

Linear Transformation
Understanding:

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Vector is an actual object in the set V and the column


x R n is merely a list of its coordinates.

T : V W is the linear transformation and the matrix A


simply stores coefficients needed to describe it.

By changing bases of V and W, the same vector and the


same linear transformation are now expressed by different x
and A, respectively.
Matrix representation emerges as the natural description
of a linear transformation between two vector spaces.

413,

Mathematical Methods in Engineering and Science

Linear Transformation
Understanding:

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Vector is an actual object in the set V and the column


x R n is merely a list of its coordinates.

T : V W is the linear transformation and the matrix A


simply stores coefficients needed to describe it.

By changing bases of V and W, the same vector and the


same linear transformation are now expressed by different x
and A, respectively.
Matrix representation emerges as the natural description
of a linear transformation between two vector spaces.

Exercise: Set of all T : V W form a vector space of their own!!


Analyze and describe that vector space.

414,

Mathematical Methods in Engineering and Science

Isomorphism

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Consider T : V W that establishes a one-to-one correspondence.

Linear transformation T defines a one-one onto mapping,


which is invertible.

dim V = dim W

Inverse linear transformation T1 : W V

T defines (is) an isomorphism.

Vector spaces V and W are isomorphic to each other.

Isomorphism is an equivalence relation. V and W are


equivalent!

415,

Mathematical Methods in Engineering and Science

Isomorphism

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Consider T : V W that establishes a one-to-one correspondence.

Linear transformation T defines a one-one onto mapping,


which is invertible.

dim V = dim W

Inverse linear transformation T1 : W V

T defines (is) an isomorphism.

Vector spaces V and W are isomorphic to each other.

Isomorphism is an equivalence relation. V and W are


equivalent!

If we need to perform some operations on vectors in one vector


space, we may as well
1. transform the vectors to another vector space through an
isomorphism,
2. conduct the required operations there, and
3. map the results back to the original space through the inverse.

416,

Mathematical Methods in Engineering and Science

Isomorphism

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Consider vector spaces V and W over the same field F and of the
same dimension n.

417,

Mathematical Methods in Engineering and Science

Isomorphism

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Consider vector spaces V and W over the same field F and of the
same dimension n.
Question: Can we define an isomorphism between them?

418,

Mathematical Methods in Engineering and Science

Isomorphism

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Consider vector spaces V and W over the same field F and of the
same dimension n.
Question: Can we define an isomorphism between them?
Answer: Of course. As many as we want!

419,

Mathematical Methods in Engineering and Science

Isomorphism

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Consider vector spaces V and W over the same field F and of the
same dimension n.
Question: Can we define an isomorphism between them?
Answer: Of course. As many as we want!
The underlying field and the dimension together
completely specify a vector space, up to an isomorphism.

420,

Mathematical Methods in Engineering and Science

Isomorphism

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Consider vector spaces V and W over the same field F and of the
same dimension n.
Question: Can we define an isomorphism between them?
Answer: Of course. As many as we want!
The underlying field and the dimension together
completely specify a vector space, up to an isomorphism.

All n-dimensional vector spaces over the field F are


isomorphic to one another.

In particular, they are all isomorphic to F n .

The representation (columns) can be considered as the


objects (vectors) themselves.

421,

Mathematical Methods in Engineering and Science

Inner Product Space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Inner product (a, b) in a real or complex vector space: a scalar
function p : V × V → F satisfying

Closure: ∀ a, b ∈ V, (a, b) ∈ F
Associativity: (αa, b) = α(a, b)
Distributivity: (a + b, c) = (a, c) + (b, c)
Conjugate commutativity: (b, a) = (a, b)*  (the complex conjugate)
Positive definiteness: (a, a) ≥ 0; and (a, a) = 0 iff a = 0

Note: The property of conjugate commutativity forces (a, a) to be real.
Examples: aᵀb, aᵀWb in Rⁿ, āᵀb in Cⁿ etc.

422,

Mathematical Methods in Engineering and Science

Inner Product Space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Inner product (a, b) in a real or complex vector space: a scalar


function p : V V F satisfying
Closure:

a, b V, (a, b) F

Associativity: (a, b) = (a, b)

Distributivity: (a + b, c) = (a, c) + (b, c)


Conjugate commutativity: (b, a) = (a, b)
Positive definiteness: (a, a) 0; and (a, a) = 0 iff a = 0
Note: Property of conjugate commutativity forces (a, a) to be real.
Examples: aT b, aT Wb in R, a b in C etc.
Inner product space: a vector space possessing an inner product

Euclidean space: over R

Unitary space: over C

423,

Mathematical Methods in Engineering and Science

Inner Product Space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Inner products bring in ideas of angle and length in the geometry


of vector spaces.

424,

Mathematical Methods in Engineering and Science

Vector Spaces: Fundamental Concepts*

Inner Product Space

Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Inner products bring in ideas of angle and length in the geometry


of vector spaces.
Orthogonality: (a, b) = 0
Norm: k k : V R, such that kak =

(a, a)

425,

Mathematical Methods in Engineering and Science

Vector Spaces: Fundamental Concepts*

Inner Product Space

Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Inner products bring in ideas of angle and length in the geometry


of vector spaces.
Orthogonality: (a, b) = 0
Norm: k k : V R, such that kak =
Associativity: kak = || kak

(a, a)

Positive definiteness: kak > 0 for a 6= 0 and k0k = 0


Triangle inequality: ka + bk kak + kbk

Cauchy-Schwarz inequality: |(a, b)| kak kbk

426,

Mathematical Methods in Engineering and Science

Vector Spaces: Fundamental Concepts*

Inner Product Space

Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Inner products bring in ideas of angle and length in the geometry
of vector spaces.

Orthogonality: (a, b) = 0
Norm: ‖·‖ : V → R, such that ‖a‖ = √(a, a)
    Associativity: ‖αa‖ = |α| ‖a‖
    Positive definiteness: ‖a‖ > 0 for a ≠ 0 and ‖0‖ = 0
    Triangle inequality: ‖a + b‖ ≤ ‖a‖ + ‖b‖
    Cauchy-Schwarz inequality: |(a, b)| ≤ ‖a‖ ‖b‖

A distance function or metric: d_V : V × V → R such that
    d_V(a, b) = ‖a − b‖

427,

Mathematical Methods in Engineering and Science

Vector Spaces: Fundamental Concepts*

Function Space

Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Suppose we decide to represent a continuous function


f : [a, b] R by the listing
vf = [f (x1 ) f (x2 )

f (x3 )

f (xN )]T

with a = x1 < x2 < x3 < < xN = b.


Note: The true representation will require N to be infinite!

428,

Mathematical Methods in Engineering and Science

Vector Spaces: Fundamental Concepts*

Function Space

Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Suppose we decide to represent a continuous function


f : [a, b] R by the listing
vf = [f (x1 ) f (x2 )

f (x3 )

f (xN )]T

with a = x1 < x2 < x3 < < xN = b.


Note: The true representation will require N to be infinite!
Here, vf is a real column vector.
Do such vectors form a vector space?

429,

Mathematical Methods in Engineering and Science

Vector Spaces: Fundamental Concepts*

Function Space

Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Suppose we decide to represent a continuous function


f : [a, b] R by the listing
vf = [f (x1 ) f (x2 )

f (x3 )

f (xN )]T

with a = x1 < x2 < x3 < < xN = b.


Note: The true representation will require N to be infinite!
Here, vf is a real column vector.
Do such vectors form a vector space?
Correspondingly, does the set F of continuous functions
over [a, b] form a vector space?
infinite dimensional vector space

430,

Mathematical Methods in Engineering and Science

Function Space
Vector space of continuous functions
First,

(F, +) is a commutative group.

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

431,

Mathematical Methods in Engineering and Science

Function Space
Vector space of continuous functions
First,

(F, +) is a commutative group.

Next, with , R, x [a, b],

if f (x) R, then f (x) R


1 f (x) = f (x)

()f (x) = [f (x)]

[f1 (x) + f2 (x)] = f1 (x) + f2 (x)

( + )f (x) = f (x) + f (x)

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

432,

Mathematical Methods in Engineering and Science

Function Space
Vector space of continuous functions
First,

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

(F, +) is a commutative group.

Next, with , R, x [a, b],

if f (x) R, then f (x) R


1 f (x) = f (x)

()f (x) = [f (x)]

[f1 (x) + f2 (x)] = f1 (x) + f2 (x)

( + )f (x) = f (x) + f (x)

Thus, F forms a vector space over R.

Every function in this space is an (infinite dimensional) vector.

Listing of values is just an obvious basis.

433,

Mathematical Methods in Engineering and Science

Function Space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner
1 Product Space
2
Function Space

Linear dependence of (non-zero) functions f and f

f2 (x) = kf1 (x) for all x in the domain

k1 f1 (x) + k2 f2 (x) = 0, x with k1 and k2 not both zero.

Linear independence: k1 f1 (x) + k2 f2 (x) = 0 x k1 = k2 = 0

434,

Mathematical Methods in Engineering and Science

Function Space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner
1 Product Space
2
Function Space

Linear dependence of (non-zero) functions f and f

f2 (x) = kf1 (x) for all x in the domain

k1 f1 (x) + k2 f2 (x) = 0, x with k1 and k2 not both zero.

Linear independence: k1 f1 (x) + k2 f2 (x) = 0 x k1 = k2 = 0


In general,

Functions f1 , f2 , f3 , , fn F are linearly dependent if


k1 , k2 , k3 , , kn , not all zero, such that
k1 f1 (x) + k2 f2 (x) + k3 f3 (x) + + kn fn (x) = 0 x [a, b].

k1 f1 (x) + k2 f2 (x) + k3 f3 (x) + + kn fn (x) = 0 x [a, b]


k1 , k2 , k3 , , kn = 0 means that functions f1 , f2 , f3 , , fn are
linearly independent.

435,

Mathematical Methods in Engineering and Science

Function Space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner
1 Product Space
2
Function Space

Linear dependence of (non-zero) functions f and f

f2 (x) = kf1 (x) for all x in the domain

k1 f1 (x) + k2 f2 (x) = 0, x with k1 and k2 not both zero.

Linear independence: k1 f1 (x) + k2 f2 (x) = 0 x k1 = k2 = 0


In general,

Functions f1 , f2 , f3 , , fn F are linearly dependent if


k1 , k2 , k3 , , kn , not all zero, such that
k1 f1 (x) + k2 f2 (x) + k3 f3 (x) + + kn fn (x) = 0 x [a, b].

k1 f1 (x) + k2 f2 (x) + k3 f3 (x) + + kn fn (x) = 0 x [a, b]


k1 , k2 , k3 , , kn = 0 means that functions f1 , f2 , f3 , , fn are
linearly independent.

Example: functions 1, x, x 2 , x 3 , are a set of linearly


independent functions.
Incidentally, this set is a commonly used basis.

436,

Mathematical Methods in Engineering and Science

Function Space

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Inner product: For functions f (x) and g (x) in F, the usual inner
product between corresponding vectors:
(vf , vg ) = vfT vg = f (x1 )g (x1 ) + f (x2 )g (x2 ) + f (x3 )g (x3 ) +
P
Weighted inner product: (vf , vg ) = vfT Wvg = i wi f (xi )g (xi )

437,

Mathematical Methods in Engineering and Science

Vector Spaces: Fundamental Concepts*

Function Space

Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Inner product: For functions f (x) and g (x) in F, the usual inner
product between corresponding vectors:
(vf , vg ) = vfT vg = f (x1 )g (x1 ) + f (x2 )g (x2 ) + f (x3 )g (x3 ) +
P
Weighted inner product: (vf , vg ) = vfT Wvg = i wi f (xi )g (xi )

For the functions,

(f , g ) =

w (x)f (x)g (x)dx

438,

Mathematical Methods in Engineering and Science

Vector Spaces: Fundamental Concepts*

Function Space

Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Inner product: For functions f (x) and g (x) in F, the usual inner
product between corresponding vectors:
(vf , vg ) = vfT vg = f (x1 )g (x1 ) + f (x2 )g (x2 ) + f (x3 )g (x3 ) +
P
Weighted inner product: (vf , vg ) = vfT Wvg = i wi f (xi )g (xi )

For the functions,

    (f, g) = ∫_a^b w(x) f(x) g(x) dx

Orthogonality: (f, g) = ∫_a^b w(x) f(x) g(x) dx = 0

Norm: ‖f‖ = √( ∫_a^b w(x) [f(x)]² dx )

Orthonormal basis:
    (f_j, f_k) = ∫_a^b w(x) f_j(x) f_k(x) dx = δ_jk  ∀ j, k

439,
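A quick numerical illustration of such inner products, using the first three Legendre polynomials on [−1, 1] with weight w(x) = 1 as an assumed example and a simple trapezoidal quadrature:

import numpy as np

def inner(f, g, a=-1.0, b=1.0, w=lambda x: np.ones_like(x), n=2001):
    # (f, g) = integral of w(x) f(x) g(x) over [a, b], trapezoidal rule
    x = np.linspace(a, b, n)
    y = w(x) * f(x) * g(x)
    h = (b - a) / (n - 1)
    return h * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

P0 = lambda x: np.ones_like(x)
P1 = lambda x: x
P2 = lambda x: 0.5 * (3 * x**2 - 1)

print(round(inner(P0, P1), 6), round(inner(P0, P2), 6), round(inner(P1, P2), 6))  # ~ 0
print(round(inner(P1, P1), 6))            # ||P1||^2 = 2/3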

Mathematical Methods in Engineering and Science

Points to note

Vector Spaces: Fundamental Concepts*


Group
Field
Vector Space
Linear Transformation
Isomorphism
Inner Product Space
Function Space

Matrix algebra provides a natural description for vector spaces


and linear transformations.

Through isomorphisms, R n can represent all n-dimensional


real vector spaces.

Through the definition of an inner product, a vector space


incorporates key geometric features of physical space.

Continuous functions over an interval constitute an infinite


dimensional vector space, complete with the usual notions.

Necessary Exercises: 6,7

440,

Mathematical Methods in Engineering and Science

Outline

Topics in Multivariate Calculus


Derivatives in Multi-Dimensional Spaces
Taylors Series
Chain Rule and Change of Variables
Numerical Differentiation
An Introduction to Tensors*

Topics in Multivariate Calculus


Derivatives in Multi-Dimensional Spaces
Taylors Series
Chain Rule and Change of Variables
Numerical Differentiation
An Introduction to Tensors*

441,


Mathematical Methods in Engineering and Science

Topics in Multivariate Calculus

Derivatives in Multi-Dimensional Spaces


Derivatives in Multi-Dimensional Spaces
Taylors Series
Chain Rule and Change of Variables
Numerical Differentiation
An Introduction to Tensors*

Gradient

    ∇f(x) = ∂f/∂x = [∂f/∂x_1  ∂f/∂x_2  ⋯  ∂f/∂x_n]ᵀ

Up to the first order, δf ≈ [∇f(x)]ᵀ δx

Directional derivative

    ∂f/∂d = lim_{δ→0} [f(x + δd) − f(x)] / δ

Relationships:

    ∂f/∂e_j = ∂f/∂x_j,    ∂f/∂d = dᵀ∇f(x),

and, along the unit vector in the direction of ∇f(x),

    ∂f/∂d = ‖∇f(x)‖.

Among all unit vectors, taken as directions,


the rate of change of a function in a direction is the same as
the component of its gradient along that direction, and
the rate of change along the direction of the gradient is the
greatest and is equal to the magnitude of the gradient.

443,
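Both statements are easy to verify numerically; a minimal sketch with an arbitrary quadratic example function:

import numpy as np

f     = lambda x: x[0]**2 + 3 * x[0] * x[1]            # example function
gradf = lambda x: np.array([2 * x[0] + 3 * x[1], 3 * x[0]])

x = np.array([1.0, 2.0])
d = np.array([0.6, 0.8])                  # a unit direction
eps = 1e-6

num = (f(x + eps * d) - f(x)) / eps       # finite-difference directional derivative
print(num, d @ gradf(x))                  # both ~ d^T grad f(x)

g = gradf(x) / np.linalg.norm(gradf(x))   # unit vector along the gradient
print((f(x + eps * g) - f(x)) / eps, np.linalg.norm(gradf(x)))   # greatest rate of change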


Mathematical Methods in Engineering and Science

Topics in Multivariate Calculus

Derivatives in Multi-Dimensional Spaces


Derivatives in Multi-Dimensional Spaces
Taylors Series
Chain Rule and Change of Variables
Numerical Differentiation
An Introduction to Tensors*

Hessian

    H(x) = ∂²f/∂x² =
        [ ∂²f/∂x_1²      ∂²f/∂x_1∂x_2   ⋯   ∂²f/∂x_1∂x_n ]
        [ ∂²f/∂x_2∂x_1   ∂²f/∂x_2²      ⋯   ∂²f/∂x_2∂x_n ]
        [      ⋮               ⋮         ⋱        ⋮       ]
        [ ∂²f/∂x_n∂x_1   ∂²f/∂x_n∂x_2   ⋯   ∂²f/∂x_n²    ]

Meaning: f(x + δx) ≈ f(x) + [∇f(x)]ᵀδx + ½ δxᵀ [∂²f/∂x² (x)] δx

For a vector function h(x), Jacobian

    J(x) = ∂h/∂x (x) = [∂h/∂x_1  ∂h/∂x_2  ⋯  ∂h/∂x_n]

Underlying notion: δh ≈ [J(x)] δx

445,


Mathematical Methods in Engineering and Science

Taylors Series
Taylors formula in the remainder form:

Topics in Multivariate Calculus


Derivatives in Multi-Dimensional Spaces
Taylors Series
Chain Rule and Change of Variables
Numerical Differentiation
An Introduction to Tensors*

    f(x + δx) = f(x) + f′(x)δx + (1/2!) f″(x)δx² + ⋯
                + [1/(n−1)!] f⁽ⁿ⁻¹⁾(x)δxⁿ⁻¹ + (1/n!) f⁽ⁿ⁾(x_c)δxⁿ,

where x_c = x + tδx with 0 ≤ t ≤ 1.
Mean value theorem: existence of x_c

Taylor's series:

    f(x + δx) = f(x) + f′(x)δx + (1/2!) f″(x)δx² + ⋯

For a multivariate function,

    f(x + δx) = f(x) + [δxᵀ∇]f(x) + (1/2!)[δxᵀ∇]²f(x) + ⋯
                + [1/(n−1)!][δxᵀ∇]ⁿ⁻¹f(x) + (1/n!)[δxᵀ∇]ⁿf(x + tδx)

    f(x + δx) ≈ f(x) + [∇f(x)]ᵀδx + ½ δxᵀ [∂²f/∂x² (x)] δx

448,
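A numerical check of the truncated (second-order) multivariate formula, for an arbitrarily chosen smooth function:

import numpy as np

f = lambda x: np.exp(x[0]) * np.sin(x[1])

def grad(x):
    return np.array([np.exp(x[0]) * np.sin(x[1]),
                     np.exp(x[0]) * np.cos(x[1])])

def hess(x):
    return np.array([[np.exp(x[0]) * np.sin(x[1]),  np.exp(x[0]) * np.cos(x[1])],
                     [np.exp(x[0]) * np.cos(x[1]), -np.exp(x[0]) * np.sin(x[1])]])

x0 = np.array([0.3, 0.7])
dx = np.array([0.05, -0.02])

approx = f(x0) + grad(x0) @ dx + 0.5 * dx @ hess(x0) @ dx
print(f(x0 + dx), approx)                 # agree to O(||dx||^3)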


Mathematical Methods in Engineering and Science

Topics in Multivariate Calculus

Chain Rule and Change of Variables


451,

Derivatives in Multi-Dimensional Spaces
Taylors Series
Chain Rule and Change of Variables
Numerical Differentiation
An Introduction to Tensors*

For f(x), the total differential:

    df = [∇f(x)]ᵀ dx = (∂f/∂x_1)dx_1 + (∂f/∂x_2)dx_2 + ⋯ + (∂f/∂x_n)dx_n

Ordinary derivative or total derivative:

    df/dt = [∇f(x)]ᵀ dx/dt

For f(t, x(t)), total derivative:  df/dt = ∂f/∂t + [∇f(x)]ᵀ dx/dt

For f(v, x(v)) = f(v_1, v_2, ⋯, v_m, x_1(v), x_2(v), ⋯, x_n(v)),

    ∂f/∂v_i (v, x(v)) = ∂f/∂v_i (v, x) + [∂x/∂v_i]ᵀ ∇_x f(v, x)

    ∇_v f(v, x(v)) = ∇_v f(v, x) + [∂x/∂v (v)]ᵀ ∇_x f(v, x)


Chain Rule and Change of Variables

Let x ∈ R^(m+n) and h(x) ∈ R^m.
Partition x ∈ R^(m+n) into z ∈ R^n and w ∈ R^m.
The system of equations h(x) = 0 then means h(z, w) = 0.

Question: Can we work out the function w = w(z)?
  Solution of m equations in m unknowns?
Question: If we have one valid pair (z, w), then is it possible to
  develop w = w(z) in the local neighbourhood?
Answer: Yes, if the Jacobian ∂h/∂w is non-singular.
  (Implicit function theorem)

  ∂h/∂z + (∂h/∂w)(∂w/∂z) = 0   ⇒   ∂w/∂z = −[∂h/∂w]⁻¹ ∂h/∂z

Up to first order, w₁ = w + [∂w/∂z] (z₁ − z).
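
As a small numeric illustration of the last two formulas (not from the slides: the function h, the chosen point and the helper names are made up, and numpy is assumed), take h(z, w) = z² + w² − 1 at the valid pair (z, w) = (0.6, 0.8); the theorem gives ∂w/∂z = −[∂h/∂w]⁻¹ ∂h/∂z = −z/w:

    import numpy as np

    z, w = 0.6, 0.8                        # a valid pair: h(z, w) = z**2 + w**2 - 1 = 0
    dh_dz = np.array([[2.0 * z]])          # dh/dz (1x1 here; m x n in general)
    dh_dw = np.array([[2.0 * w]])          # dh/dw (m x m, assumed non-singular)

    dw_dz = -np.linalg.solve(dh_dw, dh_dz)   # dw/dz = -[dh/dw]^(-1) dh/dz
    print(dw_dz[0, 0])                       # -0.75, i.e. -z/w

    z1 = 0.61                                # first order prediction at a nearby z1
    w1 = w + dw_dz[0, 0] * (z1 - z)
    print(w1, np.sqrt(1.0 - z1**2))          # prediction vs. exact value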

For a multiple integral
  I = ∫∫∫_A f(x, y, z) dx dy dz,
the change of variables x = x(u, v, w), y = y(u, v, w), z = z(u, v, w) gives
  I = ∫∫∫ f(x(u, v, w), y(u, v, w), z(u, v, w)) |J(u, v, w)| du dv dw,
where the Jacobian determinant |J(u, v, w)| = ∂(x, y, z)/∂(u, v, w).

For the differential
  P₁(x) dx₁ + P₂(x) dx₂ + ··· + P_n(x) dx_n,
we ask: does there exist a function f(x) of which this is the differential,
or equivalently, the gradient of which is P(x)?
Perfect or exact differential: can be integrated to find f.

Differentiation under the integral sign

How to differentiate Φ(x) = φ(x, u(x), v(x)) = ∫_{u(x)}^{v(x)} f(x, t) dt?

In the expression
  Φ′(x) = ∂φ/∂x + (∂φ/∂u)(du/dx) + (∂φ/∂v)(dv/dx),
we have ∂φ/∂x = ∫_u^v ∂f/∂x (x, t) dt.

Now, considering a function F(x, t) such that f(x, t) = ∂F(x, t)/∂t,
  φ(x) = ∫_u^v ∂F/∂t (x, t) dt = F(x, v) − F(x, u) ≡ φ(x, u, v).

Using ∂φ/∂v = f(x, v) and ∂φ/∂u = −f(x, u),

  Φ′(x) = ∫_{u(x)}^{v(x)} ∂f/∂x (x, t) dt + f(x, v) dv/dx − f(x, u) du/dx.

Leibnitz rule
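
A quick numerical check of the Leibnitz rule (an illustration only, not part of the slides; the integrand f(x, t) = sin(xt) with limits u(x) = 0 and v(x) = x² is made up, and numpy/scipy are assumed to be available):

    import numpy as np
    from scipy.integrate import quad

    f     = lambda x, t: np.sin(x * t)
    df_dx = lambda x, t: t * np.cos(x * t)            # partial derivative of f w.r.t. x
    u, du = lambda x: 0.0,   lambda x: 0.0
    v, dv = lambda x: x**2,  lambda x: 2.0 * x

    def Phi(x):                                       # Phi(x) = integral of f(x,t) from u(x) to v(x)
        return quad(lambda t: f(x, t), u(x), v(x))[0]

    def dPhi_leibnitz(x):                             # Leibnitz rule
        integral = quad(lambda t: df_dx(x, t), u(x), v(x))[0]
        return integral + f(x, v(x)) * dv(x) - f(x, u(x)) * du(x)

    x, h = 1.3, 1e-5
    print(dPhi_leibnitz(x))                           # Leibnitz-rule value
    print((Phi(x + h) - Phi(x - h)) / (2 * h))        # central-difference check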

Numerical Differentiation

Forward difference formula:
  f′(x) = [f(x + Δx) − f(x)] / Δx + O(Δx)

Central difference formulae:
  f′(x) = [f(x + Δx) − f(x − Δx)] / (2Δx) + O(Δx²)
  f″(x) = [f(x + Δx) − 2f(x) + f(x − Δx)] / Δx² + O(Δx²)

For the gradient ∇f(x) and the Hessian,
  ∂f/∂x_i (x) ≈ [f(x + δe_i) − f(x − δe_i)] / (2δ),
  ∂²f/∂x_i² (x) ≈ [f(x + δe_i) − 2f(x) + f(x − δe_i)] / δ², and
  ∂²f/(∂x_i ∂x_j) (x) ≈ [f(x + δe_i + δe_j) − f(x + δe_i − δe_j)
                          − f(x − δe_i + δe_j) + f(x − δe_i − δe_j)] / (4δ²).
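
The central-difference formulas above translate directly into a small routine. A minimal sketch (numpy assumed; the function names and the test function are ours):

    import numpy as np

    def num_grad(f, x, d=1e-5):
        x = np.asarray(x, dtype=float)
        g = np.zeros_like(x)
        for i in range(x.size):
            e = np.zeros_like(x); e[i] = d
            g[i] = (f(x + e) - f(x - e)) / (2 * d)        # central difference
        return g

    def num_hess(f, x, d=1e-4):
        x = np.asarray(x, dtype=float)
        n = x.size
        H = np.zeros((n, n))
        for i in range(n):
            ei = np.zeros(n); ei[i] = d
            H[i, i] = (f(x + ei) - 2 * f(x) + f(x - ei)) / d**2
            for j in range(i + 1, n):
                ej = np.zeros(n); ej[j] = d
                H[i, j] = H[j, i] = (f(x + ei + ej) - f(x + ei - ej)
                                     - f(x - ei + ej) + f(x - ei - ej)) / (4 * d**2)
        return H

    f = lambda x: x[0]**2 * x[1] + np.sin(x[1])
    x0 = np.array([1.0, 2.0])
    print(num_grad(f, x0))    # approx [2*x0*x1, x0**2 + cos(x1)]
    print(num_hess(f, x0))    # approx [[2*x1, 2*x0], [2*x0, -sin(x1)]]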

An Introduction to Tensors*

  Indicial notation and summation convention
  Kronecker delta and Levi-Civita symbol
  Rotation of reference axes
  Tensors of order zero, or scalars
  Contravariant and covariant tensors of order one, or vectors
  Cartesian tensors
  Cartesian tensors of order two
  Higher order tensors
  Elementary tensor operations
  Symmetric tensors
  Tensor fields

Points to note

  Gradient, Hessian, Jacobian and the Taylor's series
  Partial and total gradients
  Implicit functions
  Leibnitz rule
  Numerical derivatives

Necessary Exercises: 2, 3, 4, 8

Outline

Vector Analysis: Curves and Surfaces
  Recapitulation of Basic Notions
  Curves in Space
  Surfaces*

Recapitulation of Basic Notions

  Dot and cross products: their implications
  Scalar and vector triple products
  Differentiation rules

Interface with matrix algebra:
  a · x = aᵀx,   (a · x) b = (b aᵀ) x,   and
  a × x = ãᵀx for 2-d vectors,  ã x for 3-d vectors,
where
  ã = [−a_y  a_x]ᵀ   (2-d)   and   ã = [  0   −a_z   a_y
                                          a_z    0   −a_x
                                         −a_y   a_x    0  ]   (3-d).

Curves in Space

  Explicit equation: y = y(x) and z = z(x)
  Implicit equation: F(x, y, z) = 0 = G(x, y, z)
  Parametric equation: r(t) = x(t)i + y(t)j + z(t)k ≡ [x(t) y(t) z(t)]ᵀ

Tangent vector: r′(t)
Speed: ‖r′‖
Unit tangent: u(t) = r′/‖r′‖
Length of the curve: l = ∫_a^b ‖dr‖ = ∫_a^b √(r′ · r′) dt

Arc length function:
  s(t) = ∫_a^t √(r′(τ) · r′(τ)) dτ
with ds = ‖dr‖ = √(dx² + dy² + dz²) and ds/dt = ‖r′‖.

Curve r(t) is regular if r′(t) ≠ 0 for all t.

Reparametrization with respect to a parameter t*, some strictly increasing function of t.

Observations:
  Arc length s(t) is obviously a monotonically increasing function.
  For a regular curve, ds/dt ≠ 0.
  Then, s(t) has an inverse function.
  The inverse t(s) reparametrizes the curve as r(t(s)).

For a unit speed curve r(s), ‖r′(s)‖ = 1 and the unit tangent is u(s) = r′(s).

Curvature: the rate at which the direction changes with arc length,
  κ(s) = ‖u′(s)‖ = ‖r″(s)‖

Unit principal normal:
  p = (1/κ) u′(s)

With general parametrization,
  r″(t) = (d‖r′‖/dt) u(t) + ‖r′(t)‖ du/dt = (d‖r′‖/dt) u(t) + κ(t) ‖r′‖² p(t)

Osculating plane; centre of curvature; radius of curvature AC = ρ = 1/κ.

Figure: Tangent and normal to a curve

Binormal: b = u × p
Serret-Frenet frame: right-handed triad {u, p, b}
Osculating, rectifying and normal planes

Torsion: twisting out of the osculating plane,
  the rate of change of b with respect to the arc length s.

  b′ = u′ × p + u × p′ = κ(s) p × p + u × p′ = u × p′

What is p′? Taking p′ = αu + βb,
  b′ = u × (αu + βb) = −β p.
Torsion of the curve: τ(s) = −p(s) · b′(s)

We have u′ and b′. What is p′?

From p = b × u,
  p′ = b′ × u + b × u′ = −τ p × u + κ b × p = −κ u + τ b.

Serret-Frenet formulae:
  u′ = κ p,
  p′ = −κ u + τ b,
  b′ = −τ p

Intrinsic representation of a curve is complete with κ(s) and τ(s):
  the arc-length parametrization of a curve is completely determined by its
  curvature κ(s) and torsion τ(s) functions, except for a rigid body motion.
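
As a concrete check on curvature and torsion (illustration only, not from the slides), the standard general-parameter formulas κ = ‖r′ × r″‖/‖r′‖³ and τ = (r′ × r″) · r‴ / ‖r′ × r″‖², which follow from the relations above, can be evaluated symbolically for a circular helix, assuming sympy is available:

    import sympy as sp

    t, a, b = sp.symbols('t a b', positive=True)
    r = sp.Matrix([a*sp.cos(t), a*sp.sin(t), b*t])      # circular helix

    r1, r2, r3 = r.diff(t), r.diff(t, 2), r.diff(t, 3)
    c = r1.cross(r2)

    kappa = sp.simplify(c.norm() / r1.norm()**3)        # curvature
    tau   = sp.simplify(c.dot(r3) / c.norm()**2)        # torsion

    print(kappa)   # -> a/(a**2 + b**2)
    print(tau)     # -> b/(a**2 + b**2)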

Surfaces*

Parametric surface equation:
  r(u, v) = x(u, v)i + y(u, v)j + z(u, v)k ≡ [x(u, v)  y(u, v)  z(u, v)]ᵀ

Tangent vectors r_u and r_v define a tangent plane T.
N = r_u × r_v is normal to the surface and the unit normal is
  n = N/‖N‖ = (r_u × r_v)/‖r_u × r_v‖.

Question: How does n vary over the surface?
  Information on local geometry: curvature tensor
  Normal and principal curvatures
  Local shape: convex, concave, saddle, cylindrical, planar

Points to note

  Parametric equation is the general and most convenient representation of curves and surfaces.
  Arc length is the natural parameter and the Serret-Frenet frame offers the natural frame of reference.
  Curvature and torsion are the only inherent properties of a curve.
  The local shape of a surface patch can be understood through an analysis of its curvature tensor.

Necessary Exercises: 1, 2, 3, 6

Outline

Scalar and Vector Fields
  Differential Operations on Field Functions
  Integral Operations on Field Functions
  Integral Theorems
  Closure

Differential Operations on Field Functions

Scalar point function or scalar field φ(x, y, z): R³ → R
Vector point function or vector field V(x, y, z): R³ → R³

The del or nabla (∇) operator
  ∇ ≡ i ∂/∂x + j ∂/∂y + k ∂/∂z
  - it is a vector,
  - it signifies a differentiation, and
  - it operates from the left side.

Laplacian operator:
  ∇ · ∇ ≡ ∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z²

Laplace's equation:
  ∇²φ = ∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z² = 0
Solution of ∇²φ = 0: harmonic function

Gradient
  grad φ ≡ ∇φ = (∂φ/∂x) i + (∂φ/∂y) j + (∂φ/∂z) k

  ∇φ is orthogonal to the level surfaces.
  Flow fields: ∇φ gives the velocity vector.

Divergence
  For V(x, y, z) ≡ V_x(x, y, z) i + V_y(x, y, z) j + V_z(x, y, z) k,
  div V ≡ ∇ · V = ∂V_x/∂x + ∂V_y/∂y + ∂V_z/∂z

  Divergence of V: flow rate of mass per unit volume out of the control volume.
  Similar relation between field and flux in electromagnetics.

Curl
  curl V ≡ ∇ × V = det [ i, j, k; ∂/∂x, ∂/∂y, ∂/∂z; V_x, V_y, V_z ]
         = (∂V_z/∂y − ∂V_y/∂z) i + (∂V_x/∂z − ∂V_z/∂x) j + (∂V_y/∂x − ∂V_x/∂y) k

If V = ω × r represents the velocity field, then the angular velocity
  ω = ½ curl V.

Curl represents rotationality.
Connections between electric and magnetic fields!

Composite operations

Operator ∇ is linear:
  ∇(φ + ψ) = ∇φ + ∇ψ,
  ∇ · (V + W) = ∇ · V + ∇ · W,   and
  ∇ × (V + W) = ∇ × V + ∇ × W.

Considering the products φψ, φV, V · W and V × W:
  ∇(φψ)       = φ∇ψ + ψ∇φ
  ∇ · (φV)    = φ∇ · V + V · ∇φ
  ∇ × (φV)    = φ∇ × V + ∇φ × V
  ∇(V · W)    = (W · ∇)V + (V · ∇)W + W × (∇ × V) + V × (∇ × W)
  ∇ · (V × W) = W · (∇ × V) − V · (∇ × W)
  ∇ × (V × W) = (W · ∇)V − W(∇ · V) − (V · ∇)W + V(∇ · W)

Note: the expression V · ∇ ≡ V_x ∂/∂x + V_y ∂/∂y + V_z ∂/∂z is an operator!

Second order differential operators

  div grad φ ≡ ∇ · (∇φ)
  curl grad φ ≡ ∇ × (∇φ)
  div curl V ≡ ∇ · (∇ × V)
  curl curl V ≡ ∇ × (∇ × V)
  grad div V ≡ ∇(∇ · V)

Important identities:
  div grad φ ≡ ∇ · (∇φ) = ∇²φ
  curl grad φ ≡ ∇ × (∇φ) = 0
  div curl V ≡ ∇ · (∇ × V) = 0
  curl curl V ≡ ∇ × (∇ × V) = ∇(∇ · V) − ∇²V = grad div V − ∇²V
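
A symbolic spot-check of two of these identities (illustration only; sympy and its sympy.vector module are assumed, and the particular fields φ and V are made up):

    from sympy import sin, exp
    from sympy.vector import CoordSys3D, gradient, divergence, curl

    N = CoordSys3D('N')
    x, y, z = N.x, N.y, N.z

    phi = x**2 * y + sin(z)                          # a scalar field
    V = x*y*N.i + exp(z)*N.j + (y**2 - x)*N.k        # a vector field

    print(curl(gradient(phi)))       # curl grad phi = 0 (zero vector)
    print(divergence(curl(V)))       # div curl V = 0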

Integral Operations on Field Functions

Line integral along curve C:
  I = ∫_C V · dr = ∫_C (V_x dx + V_y dy + V_z dz)

For a parametrized curve r(t), t ∈ [a, b],
  I = ∫_C V · dr = ∫_a^b V · (dr/dt) dt.

For simple (non-intersecting) paths contained in a simply connected region, equivalent statements:
  V_x dx + V_y dy + V_z dz is an exact differential.
  V = ∇φ for some φ(r).
  ∫_C V · dr is independent of path.
  Circulation ∮ V · dr = 0 around any closed path.
  curl V = 0.
  Field V is conservative.

Surface integral over an orientable surface S:
  J = ∫∫_S V · dS = ∫∫_S V · n dS

For r(u, w), dS = ‖r_u × r_w‖ du dw and
  J = ∫∫_S V · n dS = ∫∫ V · (r_u × r_w) du dw.

Volume integrals of point functions over a region T:
  M = ∫∫∫_T φ dv   and   F = ∫∫∫_T V dv

Integral Theorems

Green's theorem in the plane
  R: closed bounded region in the xy-plane
  C: boundary, a piecewise smooth closed curve
  F₁(x, y) and F₂(x, y): first order continuous functions

  ∮_C (F₁ dx + F₂ dy) = ∫∫_R (∂F₂/∂x − ∂F₁/∂y) dx dy

Figure: Regions for proof of Green's theorem in the plane
  (a) Simple domain   (b) General domain

Proof:
  ∫∫_R (∂F₁/∂y) dx dy = ∫_a^b ∫_{y₁(x)}^{y₂(x)} (∂F₁/∂y) dy dx
                       = ∫_a^b [F₁{x, y₂(x)} − F₁{x, y₁(x)}] dx
                       = −∫_a^b F₁{x, y₁(x)} dx − ∫_b^a F₁{x, y₂(x)} dx
                       = −∮_C F₁(x, y) dx

Similarly,
  ∫∫_R (∂F₂/∂x) dx dy = ∫_c^d ∫_{x₁(y)}^{x₂(y)} (∂F₂/∂x) dx dy = ∮_C F₂(x, y) dy.

Difference: ∮_C (F₁ dx + F₂ dy) = ∫∫_R (∂F₂/∂x − ∂F₁/∂y) dx dy

In the alternative form, ∮_C F · dr = ∫∫_R (curl F) · k dx dy.

Gauss's divergence theorem
  T: a closed bounded region
  S: boundary, a piecewise smooth closed orientable surface
  F(x, y, z): a first order continuous vector function

  ∫∫∫_T div F dv = ∫∫_S F · n dS

Interpretation of the definition (of divergence) extended to finite domains.

To show:
  ∫∫∫_T (∂F_x/∂x + ∂F_y/∂y + ∂F_z/∂z) dx dy dz = ∫∫_S (F_x n_x + F_y n_y + F_z n_z) dS;
in particular,
  ∫∫∫_T (∂F_z/∂z) dx dy dz = ∫∫_S F_z n_z dS.

First consider a region, the boundary of which is intersected at most twice
by any line parallel to a coordinate axis.

Lower and upper segments of S: z = z₁(x, y) and z = z₂(x, y).

  ∫∫∫_T (∂F_z/∂z) dx dy dz = ∫∫_R [∫_{z₁}^{z₂} (∂F_z/∂z) dz] dx dy
                            = ∫∫_R [F_z{x, y, z₂(x, y)} − F_z{x, y, z₁(x, y)}] dx dy

R: projection of T on the xy-plane

Projection of the area element of the upper segment: n_z dS = dx dy
Projection of the area element of the lower segment: n_z dS = −dx dy

Thus, ∫∫∫_T (∂F_z/∂z) dx dy dz = ∫∫_S F_z n_z dS.
The sum of three such components leads to the result.

Extension to arbitrary regions by a suitable subdivision of the domain!

Green's identities (theorem)
  Region T and boundary S: as required in the premises of Gauss's theorem
  φ(x, y, z) and ψ(x, y, z): second order continuous scalar functions

  ∫∫_S φ∇ψ · n dS = ∫∫∫_T (φ∇²ψ + ∇φ · ∇ψ) dv
  ∫∫_S (φ∇ψ − ψ∇φ) · n dS = ∫∫∫_T (φ∇²ψ − ψ∇²φ) dv

Direct consequences of Gauss's theorem:
  to establish, apply Gauss's divergence theorem on φ∇ψ, and then on ψ∇φ as well.

Stokes's theorem
  S: a piecewise smooth surface
  C: boundary, a piecewise smooth simple closed curve
  F(x, y, z): first order continuous vector function

  ∮_C F · dr = ∫∫_S curl F · n dS

  n: unit normal given by the right hand clasp rule on C

For F(x, y, z) = F_x(x, y, z) i,
  ∮_C F_x dx = ∫∫_S [(∂F_x/∂z) j − (∂F_x/∂y) k] · n dS
             = ∫∫_S [(∂F_x/∂z) n_y − (∂F_x/∂y) n_z] dS.

First, consider a surface S intersected at most once by any line parallel to a coordinate axis.

Represent S as z = z(x, y) ≡ f(x, y).
The unit normal n = [n_x n_y n_z]ᵀ is proportional to [−∂f/∂x  −∂f/∂y  1]ᵀ, so
  n_y = −n_z ∂z/∂y.

  ∫∫_S [(∂F_x/∂z) n_y − (∂F_x/∂y) n_z] dS = −∫∫_S [∂F_x/∂y + (∂F_x/∂z)(∂z/∂y)] n_z dS

Over the projection R of S on the xy-plane, with φ(x, y) = F_x(x, y, z(x, y)),
the right-hand side becomes −∫∫_R (∂φ/∂y) dx dy, which by Green's theorem in the plane equals
  ∮_C φ(x, y) dx = ∮_C F_x dx = LHS.

Similar results hold for F_y(x, y, z) j and F_z(x, y, z) k.

Points to note

  The del operator
  Gradient, divergence and curl
  Composite and second order operators
  Line, surface and volume integrals
  Green's, Gauss's and Stokes's theorems
  Applications in physics (and engineering)

Necessary Exercises: 1, 2, 3, 6, 7

Outline

Polynomial Equations
  Basic Principles
  Analytical Solution
  General Polynomial Equations
  Two Simultaneous Equations
  Elimination Methods*
  Advanced Techniques*

Basic Principles

Fundamental theorem of algebra:
  p(x) = a₀xⁿ + a₁xⁿ⁻¹ + a₂xⁿ⁻² + ··· + a_{n−1}x + a_n
has exactly n roots x₁, x₂, …, x_n, with
  p(x) = a₀(x − x₁)(x − x₂)(x − x₃) ··· (x − x_n).
In general, the roots are complex.

Multiplicity: a root of p(x) with multiplicity k satisfies
  p(x) = p′(x) = p″(x) = ··· = p^(k−1)(x) = 0.

Descartes' rule of signs
Bracketing and separation
Synthetic division and deflation:
  p(x) = f(x) q(x) + r(x)
Mathematical Methods in Engineering and Science

Analytical Solution
Quadratic equation
2

ax + bx + c = 0 x =

Polynomial Equations
Basic Principles
Analytical Solution
General Polynomial Equations
Two Simultaneous Equations
Elimination Methods*
Advanced Techniques*

b 2 4ac
2a

530,

Mathematical Methods in Engineering and Science

Analytical Solution
Quadratic equation
2

ax + bx + c = 0 x =

Polynomial Equations
Basic Principles
Analytical Solution
General Polynomial Equations
Two Simultaneous Equations
Elimination Methods*
Advanced Techniques*

b 2 4ac
2a

Method of completing the square:


 2


b
b2
c
b 2 b 2 4ac
b
2
= 2
=
x+
x + x+
a
2a
4a
a
2a
4a2

531,

Mathematical Methods in Engineering and Science

Analytical Solution

Polynomial Equations
Basic Principles
Analytical Solution
General Polynomial Equations
Two Simultaneous Equations
Elimination Methods*
Advanced Techniques*

Quadratic equation
2

ax + bx + c = 0 x =

b 2 4ac
2a

Method of completing the square:


 2


b
b2
c
b 2 b 2 4ac
b
2
= 2
=
x+
x + x+
a
2a
4a
a
2a
4a2
Cubic equations (Cardano):
x 3 + ax 2 + bx + c = 0
Completing the cube?

532,

Mathematical Methods in Engineering and Science

Analytical Solution

Polynomial Equations
Basic Principles
Analytical Solution
General Polynomial Equations
Two Simultaneous Equations
Elimination Methods*
Advanced Techniques*

Quadratic equation
2

ax + bx + c = 0 x =

b 2 4ac
2a

Method of completing the square:


 2


b
b2
c
b 2 b 2 4ac
b
2
= 2
=
x+
x + x+
a
2a
4a
a
2a
4a2
Cubic equations (Cardano):
x 3 + ax 2 + bx + c = 0
Completing the cube?
Substituting y = x + k,
y 3 + (a 3k)y 2 + (b 2ak + 3k 2 )y + (c bk + ak 2 k 3 ) = 0.
Choose the shift k = a/3.

533,

  y³ + py + q = 0

Assuming y = u + v, we have y³ = u³ + v³ + 3uv(u + v), so
  uv = −p/3   and   u³ + v³ = −q,
and hence
  (u³ − v³)² = q² + 4p³/27.

Solution:
  u³, v³ = −q/2 ± √(q²/4 + p³/27) = A, B (say).

With ω a complex cube root of unity,
  u = A₁, A₁ω, A₁ω²   and   v = B₁, B₁ω, B₁ω²,
paired so that uv = −p/3:
  y₁ = A₁ + B₁,  y₂ = A₁ω + B₁ω²  and  y₃ = A₁ω² + B₁ω.
At least one of the solutions is real!!
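
The recipe above translates almost directly into code. A minimal sketch of Cardano's method in Python (the function name, the tolerance and the numpy cross-check are ours; roots come back as complex numbers, with real roots carrying negligible imaginary parts):

    import cmath

    def cardano_roots(a, b, c):
        """Roots of x^3 + a x^2 + b x + c = 0 via the shift y = x + a/3 and Cardano's formula."""
        p = b - a*a/3.0
        q = c - a*b/3.0 + 2.0*a**3/27.0
        disc = cmath.sqrt(q*q/4.0 + p**3/27.0)
        A3 = -q/2.0 + disc                      # u^3
        B3 = -q/2.0 - disc                      # v^3
        u = (A3 if abs(A3) >= abs(B3) else B3) ** (1.0/3.0)   # better-conditioned cube root
        v = -p/(3.0*u) if abs(u) > 1e-14 else 0.0             # pairing: u v = -p/3
        w = cmath.exp(2j*cmath.pi/3.0)                        # primitive cube root of unity
        ys = [u + v, u*w + v*w**2, u*w**2 + v*w]
        return [y - a/3.0 for y in ys]

    import numpy as np
    print(sorted(cardano_roots(-6.0, 11.0, -6.0), key=lambda z: z.real))  # roots of (x-1)(x-2)(x-3)
    print(np.roots([1.0, -6.0, 11.0, -6.0]))                              # cross-check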

Quartic equations (Ferrari):
  x⁴ + ax³ + bx² + cx + d = 0   ⇒   (x² + (a/2)x)² = (a²/4 − b)x² − cx − d

For a perfect square on the left, introduce y:
  (x² + (a/2)x + y/2)² = (a²/4 − b + y)x² + (ay/2 − c)x + (y²/4 − d)

Under what condition will the new RHS be a perfect square?
  (ay/2 − c)² − 4(a²/4 − b + y)(y²/4 − d) = 0

Resolvent of a quartic:
  y³ − by² + (ac − 4d)y + (4bd − a²d − c²) = 0

Procedure:
  Frame the cubic resolvent.
  Solve this cubic equation.
  Pick up one solution as y.
  Insert this y to form
    (x² + (a/2)x + y/2)² = (ex + f)².
  Split it into two quadratic equations:
    x² + (a/2)x + y/2 = ±(ex + f).
  Solve each of the two quadratic equations to obtain a total of
    four solutions of the original quartic equation.
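
A rough sketch of Ferrari's procedure in code, reusing the cardano_roots() sketch above for the resolvent cubic (again only an illustration under the stated assumptions, not the book's implementation; everything is done in complex arithmetic):

    import cmath

    def ferrari_roots(a, b, c, d):
        # cubic resolvent: y^3 - b y^2 + (ac - 4d) y + (4bd - a^2 d - c^2) = 0
        y = cardano_roots(-b, a*c - 4.0*d, 4.0*b*d - a*a*d - c*c)[0]
        # RHS = (a^2/4 - b + y) x^2 + (a y/2 - c) x + (y^2/4 - d) = (e x + f)^2
        e = cmath.sqrt(a*a/4.0 - b + y)
        f = (a*y/2.0 - c)/(2.0*e) if abs(e) > 1e-12 else cmath.sqrt(y*y/4.0 - d)
        roots = []
        for sign in (+1.0, -1.0):
            # x^2 + (a/2) x + y/2 = +/- (e x + f)
            B, C = a/2.0 - sign*e, y/2.0 - sign*f
            disc = cmath.sqrt(B*B - 4.0*C)
            roots += [(-B + disc)/2.0, (-B - disc)/2.0]
        return roots

    print(ferrari_roots(-10.0, 35.0, -50.0, 24.0))   # roots of (x-1)(x-2)(x-3)(x-4), up to rounding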

General Polynomial Equations

Analytical solution of the general quintic equation?
  Galois (group theory): a general quintic, or higher degree, equation is not solvable by radicals.

General polynomial equations: iterative algorithms
  Methods for nonlinear equations
  Methods specific to polynomial equations

Solution through the companion matrix:
  Roots of a polynomial are the same as the eigenvalues of its companion matrix.

  Companion matrix (for the polynomial scaled to the monic form xⁿ + a₁xⁿ⁻¹ + ··· + a_{n−1}x + a_n):

      [ 0  0  ···  0  −a_n     ]
      [ 1  0  ···  0  −a_{n−1} ]
      [ ⋮  ⋮   ⋱   ⋮     ⋮     ]
      [ 0  0  ···  0  −a₂      ]
      [ 0  0  ···  1  −a₁      ]
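
A direct rendering of the companion-matrix idea with numpy (assumed available; the helper name and the test polynomial are ours):

    import numpy as np

    def roots_via_companion(coeffs):
        """coeffs = [a0, a1, ..., an] of p(x) = a0 x^n + ... + an, with a0 != 0."""
        a = np.asarray(coeffs, dtype=float)
        a = a / a[0]                       # scale to monic form
        n = len(a) - 1
        C = np.zeros((n, n))
        C[1:, :-1] = np.eye(n - 1)         # ones on the subdiagonal
        C[:, -1] = -a[:0:-1]               # last column: -a_n, ..., -a_1 (top to bottom)
        return np.linalg.eigvals(C)

    print(np.sort(roots_via_companion([1.0, -6.0, 11.0, -6.0])))   # approx [1, 2, 3]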

Bairstow's method: to separate out factors of small degree.
Attempt to separate real linear factors?

Real quadratic factors:
  Synthetic division with a guess factor x² + q₁x + q₂: remainder r₁x + r₂.
  r = [r₁ r₂]ᵀ is a vector function of q = [q₁ q₂]ᵀ.
  Iterate over (q₁, q₂) to make (r₁, r₂) zero.
  Newton-Raphson (Jacobian based) iteration: see exercise.

Two Simultaneous Equations

  p₁x² + q₁xy + r₁y² + u₁x + v₁y + w₁ = 0
  p₂x² + q₂xy + r₂y² + u₂x + v₂y + w₂ = 0

Rearranging as quadratics in x (with y-dependent coefficients),
  a₁x² + b₁x + c₁ = 0
  a₂x² + b₂x + c₂ = 0

Cramer's rule (cross multiplication):
  x²/(b₁c₂ − b₂c₁) = x/(a₂c₁ − a₁c₂) = 1/(a₁b₂ − a₂b₁)
  ⇒  x = (b₁c₂ − b₂c₁)/(a₂c₁ − a₁c₂) = (a₂c₁ − a₁c₂)/(a₁b₂ − a₂b₁)

Consistency condition:
  (a₁b₂ − a₂b₁)(b₁c₂ − b₂c₁) − (a₁c₂ − a₂c₁)² = 0,
a 4th degree equation in y.

Elimination Methods*

The method operates similarly even if the degrees of the original equations in y are higher.
What about the degree of the eliminant equation?

Two equations in x and y of degrees n₁ and n₂:
  the x-eliminant is an equation of degree n₁n₂ in y.
Maximum number of solutions:
  Bezout number = n₁n₂
Note: deficient systems may have fewer solutions.

Classical methods of elimination:
  Sylvester's dialytic method
  Bezout's method

Advanced Techniques*

Three or more independent equations in as many unknowns?
  Cascaded elimination? Objections!

  Exploitation of special structures through clever heuristics (mechanism kinematics literature)
  Gröbner basis representation (algebraic geometry)
  Continuation or homotopy method (Morgan):
    For solving the system f(x) = 0, identify another structurally similar system g(x) = 0
    with known solutions and construct the parametrized system
      h(x) = t f(x) + (1 − t) g(x) = 0   for t ∈ [0, 1].
    Track each solution from t = 0 to t = 1.

Points to note

  Roots of cubic and quartic polynomials by the methods of Cardano and Ferrari
  For higher degree polynomials,
    Bairstow's method: a clever implementation of the Newton-Raphson method for polynomials
    Eigenvalue problem of a companion matrix
  Reduction of a system of polynomial equations in two unknowns by elimination

Necessary Exercises: 1, 3, 4, 6

Outline

Solution of Nonlinear Equations and Systems
  Methods for Nonlinear Equations
  Systems of Nonlinear Equations
  Closure

Methods for Nonlinear Equations

Algebraic and transcendental equations in the form f(x) = 0.
Practical problem: to find one real root (zero) of f(x).
Examples of f(x): x³ − 2x + 5, x³ ln x − sin x + 2, etc.

If f(x) is continuous, then
  Bracketing: f(x₀)f(x₁) < 0 ⇒ there must be a root of f(x) between x₀ and x₁.
  Bisection: check the sign of f((x₀ + x₁)/2); replace either x₀ or x₁ with (x₀ + x₁)/2.
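
A minimal bisection sketch (illustrative only; the test function is the first example above, with the signs as reconstructed here):

    def bisect(f, x0, x1, tol=1e-10, itmax=100):
        """Assumes f continuous, x0 < x1 and f(x0)*f(x1) < 0."""
        f0 = f(x0)
        for _ in range(itmax):
            xm = 0.5 * (x0 + x1)
            fm = f(xm)
            if abs(fm) < tol or 0.5 * (x1 - x0) < tol:
                return xm
            if f0 * fm < 0:          # root lies in [x0, xm]
                x1 = xm
            else:                    # root lies in [xm, x1]
                x0, f0 = xm, fm
        return xm

    root = bisect(lambda x: x**3 - 2*x + 5, -3.0, 0.0)
    print(root)                      # approx -2.0946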

Fixed point iteration

Rearrange f(x) = 0 in the form x = g(x).

Example: for f(x) = tan x − x³ − 2, possible rearrangements:
  g₁(x) = tan⁻¹(x³ + 2)
  g₂(x) = (tan x − 2)^(1/3)
  g₃(x) = (tan x − 2)/x²

Iteration: x_{k+1} = g(x_k)

Figure: Fixed point iteration

If x* is the unique solution in an interval J and |g′(x)| ≤ h < 1 in J,
then iteration from any x₀ ∈ J converges to x*.

Newton-Raphson method

First order Taylor series: f(x + Δx) ≈ f(x) + f′(x) Δx
From f(x_k + Δx) = 0,  Δx = −f(x_k)/f′(x_k)
Iteration: x_{k+1} = x_k − f(x_k)/f′(x_k)
Convergence criterion: |f(x) f″(x)| < |f′(x)|²

Geometrically: draw the tangent to f(x) and take its x-intercept.

Figure: Newton-Raphson method

Merit: quadratic speed of convergence, |x_{k+1} − x*| = c|x_k − x*|².
Demerit: if the starting point is not appropriate,
  haphazard wandering, oscillations or outright divergence!
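
A minimal Newton-Raphson sketch for a single equation (illustrative names; the derivative is supplied analytically and the test function matches the bisection example above):

    def newton(f, df, x0, tol=1e-12, itmax=50):
        x = x0
        for _ in range(itmax):
            dx = -f(x) / df(x)           # Delta x = -f(x_k)/f'(x_k)
            x += dx
            if abs(dx) < tol:
                break
        return x

    f  = lambda x: x**3 - 2*x + 5
    df = lambda x: 3*x**2 - 2
    print(newton(f, df, -3.0))           # approx -2.0946, as from bisection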

Secant method and method of false position

In the Newton-Raphson formula, approximate the derivative by
  f′(x_k) ≈ [f(x_k) − f(x_{k−1})] / (x_k − x_{k−1}):
  x_{k+1} = x_k − f(x_k) (x_k − x_{k−1}) / [f(x_k) − f(x_{k−1})]

Draw the chord or secant to f(x) through (x_{k−1}, f(x_{k−1})) and (x_k, f(x_k));
take its x-intercept.

Figure: Method of false position

Special case: maintaining a bracket over the root at every iteration gives
  the method of false position or regula falsi.
Convergence is guaranteed!

Quadratic interpolation method or Muller method:
  evaluate f(x) at three points and model y = a + bx + cx².
  Set y = 0 and solve for x.

Inverse quadratic interpolation:
  evaluate f(x) at three points and model x = a + by + cy².
  Set y = 0 to get x = a.

Figure: Interpolation schemes (quadratic and inverse quadratic interpolation)

Van Wijngaarden-Dekker-Brent method:
  maintains the bracket,
  uses inverse quadratic interpolation, and
  accepts the outcome if within bounds, else takes a bisection step.
Opportunistic manoeuvring between a fast method and a safe one!

Systems of Nonlinear Equations

  f₁(x₁, x₂, …, x_n) = 0,
  f₂(x₁, x₂, …, x_n) = 0,
  ⋮
  f_n(x₁, x₂, …, x_n) = 0;
i.e. f(x) = 0.

Number of variables and number of equations?
No bracketing!
Fixed point iteration schemes x = g(x)?

Newton's method for systems of equations:
  f(x + Δx) ≈ f(x) + [∂f/∂x (x)] Δx = f(x) + J(x) Δx
  x_{k+1} = x_k − [J(x_k)]⁻¹ f(x_k)
with the usual merits and demerits!
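
A sketch of this iteration in numpy (assumed available); the 2-by-2 test system, an intersection of a circle and a hyperbola, is made up for illustration:

    import numpy as np

    def newton_system(f, J, x0, tol=1e-12, itmax=50):
        x = np.asarray(x0, dtype=float)
        for _ in range(itmax):
            dx = np.linalg.solve(J(x), -f(x))   # solve J(x_k) dx = -f(x_k) rather than invert J
            x += dx
            if np.linalg.norm(dx) < tol:
                break
        return x

    f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0,     # circle of radius 2
                            x[0] * x[1] - 1.0])          # hyperbola xy = 1
    J = lambda x: np.array([[2*x[0], 2*x[1]],
                            [  x[1],   x[0]]])

    print(newton_system(f, J, [2.0, 0.5]))    # one of the four intersection points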

Closure

Modified Newton's method:
  x_{k+1} = x_k − α_k [J(x_k)]⁻¹ f(x_k)

Broyden's secant method:
  the Jacobian is not evaluated at every iteration, but gets developed through updates.

Optimization-based formulation:
  seek the global minimum of the function
    ‖f(x)‖² = f₁² + f₂² + ··· + f_n²
  Levenberg-Marquardt method

Points to note

  Iteration schemes for solving f(x) = 0
  Newton (or Newton-Raphson) iteration for a system of equations:
    x_{k+1} = x_k − [J(x_k)]⁻¹ f(x_k)
  Optimization formulation of a multi-dimensional root finding problem

Necessary Exercises: 1, 2, 3

Outline

Optimization: Introduction
  The Methodology of Optimization
  Single-Variable Optimization
  Conceptual Background of Multivariate Optimization

The Methodology of Optimization

  Parameters and variables
  The statement of the optimization problem:
      minimize    f(x)
      subject to  g(x) ≤ 0,
                  h(x) = 0.
  Optimization methods
  Sensitivity analysis

  Optimization problems: unconstrained and constrained
  Optimization problems: linear and nonlinear
  Single-variable and multi-variable problems
Mathematical Methods in Engineering and Science

Optimization: Introduction
Single-Variable Optimization

For a function f(x), a point x* is defined as a relative (local)
minimum if there exists a δ > 0 such that f(x) ≥ f(x*) for all x in [x* − δ, x* + δ].

Figure: Schematic of optima of a univariate function (points a, x1, x2, ..., x6)

Optimality criteria
First order necessary condition: If x* is a local minimum or
maximum point and if f'(x*) exists, then f'(x*) = 0.
Second order necessary condition: If x* is a local minimum point
and f''(x*) exists, then f''(x*) ≥ 0.
Second order sufficient condition: If f'(x*) = 0 and f''(x*) > 0,
then x* is a local minimum point.

Mathematical Methods in Engineering and Science

Optimization: Introduction
Single-Variable Optimization

Higher order analysis: From Taylor's series,

Δf = f(x* + Δx) − f(x*)
   = f'(x*) Δx + (1/2!) f''(x*) Δx^2 + (1/3!) f'''(x*) Δx^3 + (1/4!) f^{iv}(x*) Δx^4 + ...

For an extremum to occur at point x*, the lowest order
derivative with non-zero value should be of even order.

If f'(x*) = 0, then

x* is a stationary point, a candidate for an extremum.

Evaluate higher order derivatives till one of them is found to
be non-zero.

If its order is odd, then x* is an inflection point.

If its order is even, then x* is a local minimum or maximum,
according as the derivative value is positive or negative, respectively.

Mathematical Methods in Engineering and Science

Optimization: Introduction
Single-Variable Optimization

Iterative methods of line search
Methods based on gradient root finding

Newton's method

x_{k+1} = x_k − f'(x_k) / f''(x_k)

Secant method

x_{k+1} = x_k − f'(x_k) (x_k − x_{k−1}) / [f'(x_k) − f'(x_{k−1})]

Method of cubic estimation

point of vanishing gradient of the cubic fit with
f(x_{k−1}), f(x_k), f'(x_{k−1}) and f'(x_k)

Method of quadratic estimation

point of vanishing gradient of the quadratic fit
through three points

Disadvantage: treating all stationary points alike!
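A short Python sketch of the first two schemes above, on a hypothetical quartic chosen by me; as noted, both treat every stationary point of f' alike, so the second order condition still has to be checked separately.

def newton_1d_min(df, d2f, x0, tol=1e-10, max_iter=50):
    """Newton iteration on the gradient: x_{k+1} = x_k - f'(x_k)/f''(x_k)."""
    x = x0
    for _ in range(max_iter):
        step = df(x) / d2f(x)
        x -= step
        if abs(step) < tol:
            return x
    return x

def secant_1d_min(df, x0, x1, tol=1e-10, max_iter=50):
    """Secant version: f'' replaced by a difference quotient of f'."""
    for _ in range(max_iter):
        d0, d1 = df(x0), df(x1)
        x2 = x1 - d1 * (x1 - x0) / (d1 - d0)
        x0, x1 = x1, x2
        if abs(x1 - x0) < tol:
            return x1
    return x1

# f(x) = x^4 - 3x^2 + x has a local minimum near x = 1.13
df  = lambda x: 4*x**3 - 6*x + 1
d2f = lambda x: 12*x**2 - 6
print(newton_1d_min(df, d2f, 1.5), secant_1d_min(df, 1.0, 1.5))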

Mathematical Methods in Engineering and Science

Optimization: Introduction
Single-Variable Optimization

Bracketing:
x1 < x2 < x3 with f(x1) ≥ f(x2) ≤ f(x3)
Exhaustive search method or its variants

Direct optimization algorithms
Fibonacci search uses a pre-defined number N of function
evaluations, and the Fibonacci sequence

F_0 = 1, F_1 = 1, F_2 = 2, ..., F_j = F_{j−2} + F_{j−1},

to tighten a bracket with an economized number of function
evaluations.
Golden section search uses a constant ratio

τ = (√5 − 1)/2 ≈ 0.618,

the golden section ratio, of interval reduction, which is
obtained as the limiting case of N → ∞; the actual
number of steps is decided by the accuracy desired.
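A minimal Python sketch of golden section search on a hypothetical unimodal function; note that after the first iteration only one new function evaluation is needed per interval reduction.

import math

def golden_section(f, a, b, tol=1e-8):
    """Shrink the bracket [a, b] by the golden ratio until it is tol wide."""
    tau = (math.sqrt(5) - 1) / 2          # ~0.618, the golden section ratio
    x1 = b - tau * (b - a)
    x2 = a + tau * (b - a)
    f1, f2 = f(x1), f(x2)
    while (b - a) > tol:
        if f1 > f2:                        # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + tau * (b - a)
            f2 = f(x2)
        else:                              # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - tau * (b - a)
            f1 = f(x1)
    return 0.5 * (a + b)

# Example: unimodal on [0, 2], minimum at x = 0.5
print(golden_section(lambda x: (x - 0.5)**2 + 1.0, 0.0, 2.0))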

Mathematical Methods in Engineering and Science

Optimization: Introduction
Conceptual Background of Multivariate Optimization

Unconstrained minimization problem

x* is called a local minimum of f(x) if there exists a δ > 0 such that
f(x) ≥ f(x*) for all x satisfying ||x − x*|| < δ.

Optimality criteria
From Taylor's series,

f(x) − f(x*) = [g(x*)]^T δx + (1/2) δx^T [H(x*)] δx + ...

For x* to be a local minimum,
necessary condition: g(x*) = 0 and H(x*) is positive semi-definite;
sufficient condition: g(x*) = 0 and H(x*) is positive definite.
An indefinite Hessian matrix characterizes a saddle point.

Mathematical Methods in Engineering and Science

Optimization: Introduction
Conceptual Background of Multivariate Optimization

Convexity
Set S ⊆ R^n is a convex set if

for all x1, x2 ∈ S and α ∈ (0, 1),  αx1 + (1 − α)x2 ∈ S.

Function f(x) over a convex set S is a convex function if

for all x1, x2 ∈ S and α ∈ (0, 1),

f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2).

The chord approximation is an overestimate at intermediate points!

Figure: A convex domain
Figure: A convex function

Mathematical Methods in Engineering and Science

Optimization: Introduction
Conceptual Background of Multivariate Optimization

First order characterization of convexity

From f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2),

f(x1) − f(x2) ≥ [f(x2 + α(x1 − x2)) − f(x2)] / α.

As α → 0,  f(x1) ≥ f(x2) + [∇f(x2)]^T (x1 − x2).

The tangent approximation is an underestimate at intermediate points!
Second order characterization: the Hessian is positive semi-definite.

Convex programming problem: convex function over a convex set
A local minimum is also a global minimum, and all
minima are connected in a convex set.
Note: Convexity is a stronger condition than unimodality!

Mathematical Methods in Engineering and Science

Optimization: Introduction
Conceptual Background of Multivariate Optimization

Quadratic function

q(x) = (1/2) x^T A x + b^T x + c

Gradient ∇q(x) = Ax + b and Hessian = A is constant.

If A is positive definite, then the unique solution of Ax = −b
is the only minimum point.

If A is positive semi-definite and b ∈ Range(A), then the
entire family of solutions of Ax = −b consists of global minima.

If A is positive semi-definite but b ∉ Range(A), then the
function is unbounded!

Note: A quadratic problem (with positive definite Hessian) acts as
a benchmark for optimization algorithms.
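For the positive definite case, the minimizer can be read off directly by solving the linear system, as in this small NumPy illustration (the matrix and vector are my own example data).

import numpy as np

# q(x) = 1/2 x^T A x + b^T x + c with positive definite A:
# the gradient A x + b vanishes at the unique minimizer x* satisfying A x* = -b.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])       # symmetric positive definite
b = np.array([-1.0, -2.0])
x_star = np.linalg.solve(A, -b)  # solve A x = -b rather than inverting A
print(x_star, A @ x_star + b)    # gradient at x* is (numerically) zero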

Mathematical Methods in Engineering and Science

Optimization: Introduction
Conceptual Background of Multivariate Optimization

Optimization Algorithms
From the current point, move to another point, hopefully better.
Which way to go? How far to go? Which decision is first?
Strategies and versions of algorithms:
Trust region: Develop a local quadratic model

f(x_k + δx) ≈ f(x_k) + [g(x_k)]^T δx + (1/2) δx^T F_k δx,

and minimize it in a small trust region around x_k.
(Define the trust region with dummy boundaries.)
Line search: Identify a descent direction d_k and minimize the
function along it through the univariate function

φ(α) = f(x_k + α d_k).

Exact or accurate line search
Inexact or inaccurate line search

Armijo, Goldstein and Wolfe conditions

Convergence of algorithms: notions of guarantee and speed

Global convergence: the ability of an algorithm to approach and
converge to an optimal solution for an arbitrary
problem, starting from an arbitrary point.
Practically, a sequence (or even a subsequence) of
monotonically decreasing errors is enough.
Local convergence: the rate/speed of approach, measured by p,
where

β = lim_{k→∞} ||x_{k+1} − x*|| / ||x_k − x*||^p < ∞.

Linear, quadratic and superlinear rates of
convergence for p = 1, 2 and intermediate values.
Comparison among algorithms with linear rates
of convergence is by the convergence ratio β.
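The Armijo condition mentioned above is easy to sketch as a backtracking inexact line search; the quadratic test function and parameter values below are illustrative assumptions, not prescriptions from the text.

import numpy as np

def backtracking_line_search(f, x, g, d, alpha0=1.0, rho=0.5, c1=1e-4):
    """Shrink alpha until the Armijo sufficient-decrease condition
    f(x + alpha d) <= f(x) + c1 * alpha * g^T d holds."""
    alpha, fx, slope = alpha0, f(x), g @ d     # slope must be negative (descent direction)
    while f(x + alpha * d) > fx + c1 * alpha * slope:
        alpha *= rho
    return alpha

# Example on a quadratic, stepping along the steepest descent direction
A = np.array([[4.0, 1.0], [1.0, 3.0]]); b = np.array([-1.0, -2.0])
f = lambda x: 0.5 * x @ A @ x + b @ x
x = np.array([5.0, 5.0]); g = A @ x + b
print(backtracking_line_search(f, x, g, -g))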

Mathematical Methods in Engineering and Science

Optimization: Introduction
Points to note

Theory and methods of single-variable optimization

Optimality criteria in multivariate optimization

Convexity in optimization

The quadratic function

Trust region

Line search

Global and local convergence

Necessary Exercises: 1, 2, 5, 7, 8

Mathematical Methods in Engineering and Science

Outline

Multivariate Optimization
Direct Methods
Steepest Descent (Cauchy) Method
Newton's Method
Hybrid (Levenberg-Marquardt) Method
Least Square Problems

Mathematical Methods in Engineering and Science

Multivariate Optimization
Direct Methods

Direct search methods using only function values

Cyclic coordinate search

Rosenbrock's method

Hooke-Jeeves pattern search

Box's complex method

Nelder and Mead's simplex search

Powell's conjugate directions method

Useful for functions whose derivative either does not exist at all
points in the domain or is computationally costly to evaluate.
Note: When derivatives are easily available, gradient-based
algorithms appear as the mainstream methods.

Mathematical Methods in Engineering and Science

Multivariate Optimization
Direct Methods

Nelder and Mead's simplex method

Simplex in n-dimensional space: polytope formed by n + 1 vertices.
Nelder and Mead's method iterates over simplices that are
non-degenerate (i.e. enclosing non-zero hypervolume).
First, n + 1 suitable points are selected for the starting simplex.
Among the vertices of the current simplex, identify the worst point x_w,
the best point x_b and the second worst point x_s.
Need to replace x_w with a good point.
Centre of gravity of the face not containing x_w:

x_c = (1/n) Σ_{i=1, i≠w}^{n+1} x_i

Reflect x_w with respect to x_c as x_r = 2x_c − x_w. Consider options.

Default: x_new = x_r.
Revision possibilities:

Figure: Nelder and Mead's simplex method (default step, expansion, positive contraction and negative contraction of the reflected point)

1. For f(x_r) < f(x_b), expansion:
x_new = x_c + α(x_c − x_w), α > 1.
2. For f(x_r) ≥ f(x_w), negative contraction:
x_new = x_c − β(x_c − x_w), 0 < β < 1.
3. For f(x_s) < f(x_r) < f(x_w), positive contraction:
x_new = x_c + β(x_c − x_w), with 0 < β < 1.

Replace x_w with x_new. Continue with the new simplex.
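In practice the full bookkeeping of reflections, expansions and contractions is usually left to a library routine; a minimal usage sketch (assuming SciPy is available, and using Rosenbrock's banana function as my own test case) is:

import numpy as np
from scipy.optimize import minimize

# Rosenbrock's function: a classic test for derivative-free search
rosen = lambda x: 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

result = minimize(rosen, x0=np.array([-1.2, 1.0]), method='Nelder-Mead')
print(result.x)          # close to the minimizer (1, 1)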

Mathematical Methods in Engineering and Science

Multivariate Optimization
Steepest Descent (Cauchy) Method

From a point x_k, a move through α units in direction d_k:

f(x_k + α d_k) = f(x_k) + α [g(x_k)]^T d_k + O(α^2)

Descent direction d_k: for α > 0, [g(x_k)]^T d_k < 0
Direction of steepest descent: d_k = −g_k   [or d_k = −g_k / ||g_k||]

Minimize
φ(α) = f(x_k + α d_k).
Exact line search:
φ'(α_k) = [g(x_k + α_k d_k)]^T d_k = 0
Search direction tangential to the contour surface at (x_k + α_k d_k).
Note: Next direction d_{k+1} = −g(x_{k+1}) is orthogonal to d_k.

Steepest descent algorithm

1. Select a starting point x_0, set k = 0 and several parameters:
tolerance ε_G on the gradient, absolute tolerance ε_A on reduction
in function value, relative tolerance ε_R on reduction in
function value and maximum number of iterations M.
2. If ||g_k|| ≤ ε_G, STOP. Else d_k = −g_k / ||g_k||.
3. Line search: Obtain α_k by minimizing φ(α) = f(x_k + α d_k),
α > 0. Update x_{k+1} = x_k + α_k d_k.
4. If |f(x_{k+1}) − f(x_k)| ≤ ε_A + ε_R |f(x_k)|, STOP. Else k ← k + 1.
5. If k > M, STOP. Else go to step 2.

Very good global convergence.
But, why so many STOPs?
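A compact NumPy sketch of the algorithm above, with an Armijo backtracking search standing in for the exact line search of step 3 (the quadratic test problem is an assumption of mine).

import numpy as np

def steepest_descent(f, grad, x0, eps_g=1e-6, max_iter=500):
    """Steepest descent with a simple backtracking (inexact) line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps_g:
            break
        d = -g / np.linalg.norm(g)          # unit steepest descent direction
        alpha, fx = 1.0, f(x)
        while f(x + alpha * d) > fx + 1e-4 * alpha * g @ d and alpha > 1e-14:
            alpha *= 0.5                    # Armijo-style sufficient decrease
        x = x + alpha * d
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]]); b = np.array([-1.0, -2.0])
quad = lambda x: 0.5 * x @ A @ x + b @ x
grad = lambda x: A @ x + b
print(steepest_descent(quad, grad, [5.0, 5.0]))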

Mathematical Methods in Engineering and Science

Multivariate Optimization
Steepest Descent (Cauchy) Method

Analysis on a quadratic function

For minimizing q(x) = (1/2) x^T A x + b^T x, the error function:

E(x) = (1/2) (x − x*)^T A (x − x*)

Convergence ratio:  E(x_{k+1}) / E(x_k) ≤ [(κ(A) − 1) / (κ(A) + 1)]^2,

where κ(A) is the condition number of A.
Local convergence is poor.

Importance of the steepest descent method

conceptual understanding

initial iterations in a completely new problem

spacer steps in other sophisticated methods

Re-scaling of the problem through change of variables?

Mathematical Methods in Engineering and Science

Multivariate Optimization
Newton's Method

Second order approximation of a function:

f(x) ≈ f(x_k) + [g(x_k)]^T (x − x_k) + (1/2) (x − x_k)^T H(x_k) (x − x_k)

Vanishing of the gradient,

g(x) ≈ g(x_k) + H(x_k)(x − x_k) = 0,

gives the iteration formula

x_{k+1} = x_k − [H(x_k)]^{-1} g(x_k).

Excellent local convergence property:

||x_{k+1} − x*|| = O(||x_k − x*||^2)

Caution: Does not have global convergence.
If H(x_k) is positive definite, then d_k = −[H(x_k)]^{-1} g(x_k)
is a descent direction.

Modified Newton's method

Replace the Hessian by F_k = H(x_k) + μI.

Replace the full Newton step by a line search.

Algorithm
1. Select x_0 and tolerance ε. Set k = 0.
2. Evaluate g_k = g(x_k) and H(x_k).
Choose μ, find F_k = H(x_k) + μI, solve F_k d_k = −g_k for d_k.
3. Line search: obtain α_k to minimize φ(α) = f(x_k + α d_k).
Update x_{k+1} = x_k + α_k d_k.
4. Check convergence: If |f(x_{k+1}) − f(x_k)| < ε, STOP.
Else, k ← k + 1 and go to step 2.
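A rough Python sketch of this modified Newton iteration, with μ raised (via a Cholesky test) until F_k is positive definite and a crude decrease-only line search; the test function, gradient and Hessian are my own illustrative choices.

import numpy as np

def modified_newton(f, grad, hess, x0, eps=1e-10, max_iter=100):
    """Modified Newton sketch: F_k = H(x_k) + mu*I, mu increased until
    F_k is positive definite, followed by a backtracking line search."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        mu = 0.0
        while True:                       # raise mu until H + mu*I is positive definite
            try:
                np.linalg.cholesky(H + mu * np.eye(n))
                break
            except np.linalg.LinAlgError:
                mu = max(2.0 * mu, 1e-3)
        d = np.linalg.solve(H + mu * np.eye(n), -g)
        alpha, fx = 1.0, f(x)
        while f(x + alpha * d) > fx and alpha > 1e-12:   # simple decrease test
            alpha *= 0.5
        x_new = x + alpha * d
        if abs(f(x_new) - fx) < eps:
            return x_new
        x = x_new
    return x

f    = lambda x: (x[0] - 1)**2 + 10.0 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([2*(x[0] - 1) - 40*x[0]*(x[1] - x[0]**2),
                           20*(x[1] - x[0]**2)])
hess = lambda x: np.array([[2 - 40*x[1] + 120*x[0]**2, -40*x[0]],
                           [-40*x[0], 20.0]])
print(modified_newton(f, grad, hess, [-1.0, 2.0]))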

Mathematical Methods in Engineering and Science

Multivariate Optimization
Hybrid (Levenberg-Marquardt) Method

Methods of deflected gradients:

x_{k+1} = x_k − α_k [M_k] g_k

identity matrix in place of M_k: steepest descent step

M_k = F_k^{-1}: step of the modified Newton's method

M_k = [H(x_k)]^{-1} and α_k = 1: pure Newton's step

In M_k = [H(x_k) + λ_k I]^{-1}, tune the parameter λ_k over iterations.

Initial value of λ: large enough to favour the steepest descent
trend

Improvement in an iteration: λ reduced by a factor

Increase in function value: step rejected and λ increased

Opportunism systematized!
Note: The cost of evaluating the Hessian remains a bottleneck.
Useful for problems where Hessian estimates come cheap!

Mathematical Methods in Engineering and Science

Multivariate Optimization
Least Square Problems

Linear least square problem:

y(θ) = x_1 φ_1(θ) + x_2 φ_2(θ) + ... + x_n φ_n(θ)

For measured values y(θ_i) = y_i,

e_i = Σ_{k=1}^{n} x_k φ_k(θ_i) − y_i = [φ(θ_i)]^T x − y_i.

Error vector: e = Ax − y
Least square fit:

Minimize E = (1/2) Σ_i e_i^2 = (1/2) e^T e

Pseudoinverse solution and its variants

Nonlinear least square problem

For a model function in the form

y(θ) = f(θ, x) = f(θ, x_1, x_2, ..., x_n),

square error function

E(x) = (1/2) Σ_i e_i^2 = (1/2) e^T e = (1/2) Σ_i [f(θ_i, x) − y_i]^2

Gradient: g(x) = ∇E(x) = Σ_i [f(θ_i, x) − y_i] ∇f(θ_i, x) = J^T e

Hessian: H(x) = ∂²E/∂x² = J^T J + Σ_i e_i ∂²f(θ_i, x)/∂x² ≈ J^T J

Combining a modified form [λ diag(J^T J)] δx = −g(x) of the steepest
descent formula with Newton's formula,

Levenberg-Marquardt step: [J^T J + λ diag(J^T J)] δx = −g(x)

Levenberg-Marquardt algorithm
1. Select x_0, evaluate E(x_0). Select tolerance ε, initial λ and its
update factor. Set k = 0.
2. Evaluate g_k and H̃_k = J^T J + λ diag(J^T J).
Solve H̃_k δx = −g_k. Evaluate E(x_k + δx).
3. If |E(x_k + δx) − E(x_k)| < ε, STOP.
4. If E(x_k + δx) < E(x_k), then decrease λ,
update x_{k+1} = x_k + δx, k ← k + 1.
Else increase λ.
5. Go to step 2.
A professional procedure for nonlinear least square problems and also
for solving systems of nonlinear equations in the form h(x) = 0.
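A minimal NumPy sketch of the algorithm above, fitting an exponential model to synthetic data (the model, data and update factors are my own assumptions, not the book's).

import numpy as np

def levenberg_marquardt(residual, jac, x0, lam=1e-2, tol=1e-10, max_iter=100):
    """Minimize (1/2)||e(x)||^2 with the step [J^T J + lam*diag(J^T J)] dx = -J^T e."""
    x = np.asarray(x0, dtype=float)
    e = residual(x)
    E = 0.5 * e @ e
    for _ in range(max_iter):
        J = jac(x)
        A = J.T @ J
        g = J.T @ e
        dx = np.linalg.solve(A + lam * np.diag(np.diag(A)), -g)
        e_new = residual(x + dx)
        E_new = 0.5 * e_new @ e_new
        if abs(E_new - E) < tol:
            return x + dx
        if E_new < E:                 # accept the step, trust the model more
            x, e, E = x + dx, e_new, E_new
            lam *= 0.2
        else:                         # reject the step, lean towards steepest descent
            lam *= 10.0
    return x

# Fit y = a * exp(b * t) to data; the unknowns are x = [a, b]
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(1.5 * t)
residual = lambda x: x[0] * np.exp(x[1] * t) - y
jac = lambda x: np.column_stack([np.exp(x[1] * t), x[0] * t * np.exp(x[1] * t)])
print(levenberg_marquardt(residual, jac, [1.0, 1.0]))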

Mathematical Methods in Engineering and Science

Multivariate Optimization
Points to note

Simplex method of Nelder and Mead

Steepest descent method with its global convergence

Newton's method for fast local convergence

Levenberg-Marquardt method for equation solving and least squares

Necessary Exercises: 1, 2, 3, 4, 5, 6

Mathematical Methods in Engineering and Science

Outline

Methods of Nonlinear Optimization*
Conjugate Direction Methods
Quasi-Newton Methods
Closure

Mathematical Methods in Engineering and Science

Methods of Nonlinear Optimization*
Conjugate Direction Methods

Conjugacy of directions:
Two vectors d_1 and d_2 are mutually conjugate with
respect to a symmetric matrix A if d_1^T A d_2 = 0.
Linear independence of conjugate directions:
Conjugate directions with respect to a positive definite
matrix are linearly independent.
Expanding subspace property: In R^n, with conjugate vectors
{d_0, d_1, ..., d_{n−1}} with respect to symmetric positive definite A,
for any x_0 ∈ R^n, the sequence {x_0, x_1, x_2, ..., x_n} generated as

x_{k+1} = x_k + α_k d_k,   with α_k = − g_k^T d_k / (d_k^T A d_k),

where g_k = A x_k + b, has the property that

x_k minimizes q(x) = (1/2) x^T A x + b^T x on the line
x_{k−1} + α d_{k−1}, as well as on the linear variety x_0 + B_k,
where B_k is the span of d_0, d_1, ..., d_{k−1}.

Mathematical Methods in Engineering and Science

Methods of Nonlinear Optimization*
Conjugate Direction Methods

Question: How to find a set of n conjugate directions?

The Gram-Schmidt procedure is a poor option!

Conjugate gradient method
Starting from d_0 = −g_0,

d_{k+1} = −g_{k+1} + β_k d_k

Imposing the condition of conjugacy of d_{k+1} with d_k,

β_k = g_{k+1}^T A d_k / (d_k^T A d_k) = g_{k+1}^T (g_{k+1} − g_k) / (α_k d_k^T A d_k)

The resulting d_{k+1} is conjugate to all the earlier directions, for
a quadratic problem.

Using k in place of k + 1 in the formula for d_{k+1},

d_k = −g_k + β_{k−1} d_{k−1}  ⇒  g_k^T d_k = −g_k^T g_k  and  α_k = g_k^T g_k / (d_k^T A d_k)

Polak-Ribiere formula:

β_k = g_{k+1}^T (g_{k+1} − g_k) / (g_k^T g_k)

No need to know A!
Further,

g_{k+1}^T d_k = 0  ⇒  g_{k+1}^T g_k = β_{k−1} (g_k^T + α_k d_k^T A) d_{k−1} = 0.

Fletcher-Reeves formula:

β_k = g_{k+1}^T g_{k+1} / (g_k^T g_k)

Mathematical Methods in Engineering and Science

Methods of Nonlinear Optimization*
Conjugate Direction Methods

Extension to general (non-quadratic) functions

Varying Hessian A: determine the step size by line search.
After n steps, the minimum is not attained.
But, g_k^T d_k = −g_k^T g_k implies guaranteed descent.
Globally convergent, with a superlinear rate of convergence.
What to do after n steps? Restart or continue?

Algorithm
1. Select x_0 and tolerances ε_G, ε_D. Evaluate g_0 = ∇f(x_0).
2. Set k = 0 and d_k = −g_k.
3. Line search: find α_k; update x_{k+1} = x_k + α_k d_k.
4. Evaluate g_{k+1} = ∇f(x_{k+1}). If ||g_{k+1}|| ≤ ε_G, STOP.
5. Find β_k = g_{k+1}^T (g_{k+1} − g_k) / (g_k^T g_k)  (Polak-Ribiere)
or β_k = g_{k+1}^T g_{k+1} / (g_k^T g_k)  (Fletcher-Reeves).
Obtain d_{k+1} = −g_{k+1} + β_k d_k.
6. If 1 − d_k^T d_{k+1} / (||d_k|| ||d_{k+1}||) < ε_D, reset g_0 = g_{k+1} and go to step 2.
Else, k ← k + 1 and go to step 3.
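A compact Python sketch of nonlinear conjugate gradients along these lines, with a backtracking line search in place of an exact one and a simple restart every n steps (or whenever the new direction fails to be a descent direction); the Rosenbrock test case is my own.

import numpy as np

def conjugate_gradient(f, grad, x0, eps_g=1e-6, max_iter=2000, variant="PR"):
    """Nonlinear CG sketch with Polak-Ribiere or Fletcher-Reeves beta."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    g = grad(x)
    d = -g
    for k in range(max_iter):
        if np.linalg.norm(g) <= eps_g:
            break
        alpha, fx = 1.0, f(x)
        while f(x + alpha * d) > fx + 1e-4 * alpha * g @ d and alpha > 1e-14:
            alpha *= 0.5                              # Armijo backtracking
        x = x + alpha * d
        g_new = grad(x)
        if variant == "PR":
            beta = g_new @ (g_new - g) / (g @ g)      # Polak-Ribiere
        else:
            beta = (g_new @ g_new) / (g @ g)          # Fletcher-Reeves
        d = -g_new + beta * d
        g = g_new
        if (k + 1) % n == 0 or g @ d >= 0:            # restart when needed
            d = -g
    return x

rosen = lambda x: 100.0*(x[1] - x[0]**2)**2 + (1.0 - x[0])**2
rosen_grad = lambda x: np.array([-400.0*x[0]*(x[1]-x[0]**2) - 2.0*(1.0-x[0]),
                                 200.0*(x[1]-x[0]**2)])
print(conjugate_gradient(rosen, rosen_grad, [-1.2, 1.0]))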

Mathematical Methods in Engineering and Science

Methods of Nonlinear Optimization*
Conjugate Direction Methods

Powell's conjugate direction method

For q(x) = (1/2) x^T A x + b^T x, suppose
x_1 = x_A + α_1 d such that d^T g_1 = 0 and
x_2 = x_B + α_2 d such that d^T g_2 = 0.
Then, d^T A (x_2 − x_1) = d^T (g_2 − g_1) = 0.

Parallel subspace property: In R^n, consider two parallel
linear varieties S_1 = v_1 + B_k and S_2 = v_2 + B_k, with
B_k = {d_1, d_2, ..., d_k}, k < n.
If x_1 and x_2 minimize q(x) = (1/2) x^T A x + b^T x on S_1 and S_2,
respectively, then x_2 − x_1 is conjugate to d_1, d_2, ..., d_k.

The assumptions imply g_1, g_2 ⊥ B_k and hence

(g_2 − g_1) ⊥ B_k  ⇒  d_i^T A (x_2 − x_1) = d_i^T (g_2 − g_1) = 0 for i = 1, 2, ..., k.

Algorithm
1. Select x_0, ε and a set of n linearly independent (preferably
normalized) directions d_1, d_2, ..., d_n; possibly d_i = e_i.
2. Line search along d_n and update x_1 = x_0 + α d_n; set k = 1.
3. Line searches along d_1, d_2, ..., d_n in sequence to obtain
z = x_k + Σ_{j=1}^{n} α_j d_j.
4. New conjugate direction d = z − x_k. If ||d|| < ε, STOP.
5. Reassign directions d_j ← d_{j+1} for j = 1, 2, ..., (n − 1) and
d_n = d/||d||.
(Old d_1 gets discarded at this step.)
6. Line search and update x_{k+1} = z + α d_n; set k ← k + 1 and
go to step 3.

x_0-x_1 and b-z_1: x_1-z_1 is conjugate to b-z_1.
b-z_1-x_2 and c-d-z_2: c-d, d-z_2 and x_2-z_2 are mutually conjugate.

Figure: Schematic of Powell's conjugate direction method

The performance of Powell's method approaches that of the
conjugate gradient method!

Mathematical Methods in Engineering and Science

Methods of Nonlinear Optimization*
Quasi-Newton Methods

Variable metric methods

attempt to construct the inverse Hessian B_k.

p_k = x_{k+1} − x_k  and  q_k = g_{k+1} − g_k,  q_k ≈ H p_k

With n such steps, B = PQ^{-1}: update and construct B_k → H^{-1}.

Rank one correction: B_{k+1} = B_k + a_k z_k z_k^T ?
Rank two correction:

B_{k+1} = B_k + a_k z_k z_k^T + b_k w_k w_k^T

Davidon-Fletcher-Powell (DFP) method

Select x_0, tolerance ε and B_0 = I_n. For k = 0, 1, 2, ...,

d_k = −B_k g_k.
Line search for α_k; update p_k = α_k d_k, x_{k+1} = x_k + p_k,
q_k = g_{k+1} − g_k.
If ||p_k|| < ε or ||q_k|| < ε, STOP.

Rank two correction:

B^{DFP}_{k+1} = B_k + p_k p_k^T / (p_k^T q_k) − B_k q_k q_k^T B_k / (q_k^T B_k q_k).
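A short NumPy sketch of these DFP steps, using a backtracking line search and a convex quadratic test problem of my own choosing (so that the curvature condition p_k^T q_k > 0 needed by the update holds automatically).

import numpy as np

def dfp_minimize(f, grad, x0, eps=1e-8, max_iter=200):
    """Quasi-Newton (DFP) sketch: rank-two updates of the inverse-Hessian estimate B."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(len(x))
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:
            break
        d = -B @ g
        alpha, fx = 1.0, f(x)
        while f(x + alpha * d) > fx + 1e-4 * alpha * g @ d and alpha > 1e-14:
            alpha *= 0.5                   # Armijo backtracking
        p = alpha * d                      # p_k = x_{k+1} - x_k
        x = x + p
        g_new = grad(x)
        q = g_new - g                      # q_k = g_{k+1} - g_k
        Bq = B @ q                         # DFP rank-two correction
        B = B + np.outer(p, p) / (p @ q) - np.outer(Bq, Bq) / (q @ Bq)
        g = g_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]]); b = np.array([-1.0, -2.0])
quad = lambda x: 0.5 * x @ A @ x + b @ x
quad_grad = lambda x: A @ x + b
print(dfp_minimize(quad, quad_grad, [5.0, 5.0]))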

Mathematical Methods in Engineering and Science

Methods of Nonlinear Optimization*
Quasi-Newton Methods

Properties of DFP iterations:

1. If B_k is symmetric and positive definite, then so is B_{k+1}.
2. For a quadratic function with positive definite Hessian H,

p_i^T H p_j = 0  for  0 ≤ i < j ≤ k,  and  B_{k+1} H p_i = p_i  for  0 ≤ i ≤ k.

Implications:
1. Positive definiteness of the inverse Hessian estimate is never lost.
2. Successive search directions are conjugate directions.
3. With B_0 = I, the algorithm is a conjugate gradient method.
4. For a quadratic problem, the inverse Hessian gets completely
constructed after n steps.

Variants: the Broyden-Fletcher-Goldfarb-Shanno (BFGS)
method and the Broyden family of methods

Mathematical Methods in Engineering and Science

Methods of Nonlinear Optimization*
Closure

Table 23.1: Summary of performance of optimization methods

Cauchy (steepest descent): for quadratic problems, N f and N g evaluations, equivalent function evaluations N(2n + 1); storage: vector; in general problems: slow; practically good for: unknown start-up.

Newton: 2 f, 2 g and 1 H evaluations, equivalent function evaluations 2n^2 + 2n + 1; storage: matrix; in general problems: risky; practically good for: good functions.

Levenberg-Marquardt (hybrid, deflected gradient): N f, N g and N H evaluations, equivalent function evaluations N(2n^2 + 1); storage: matrix; in general problems: costly; practically good for: nonlinear equation systems and nonlinear least squares.

DFP/BFGS (quasi-Newton, variable metric): (n + 1) f and (n + 1) g evaluations, equivalent function evaluations 2n^2 + 3n + 1; storage: matrix; in general problems: flexible; practically good for: bad functions.

FR/PR (conjugate gradient): (n + 1) f and (n + 1) g evaluations, equivalent function evaluations 2n^2 + 3n + 1; storage: vector; in general problems: good; practically good for: large problems.

Powell (direction set): n^2 f evaluations, equivalent function evaluations n^2; storage: matrix; in general problems: okay; practically good for: small problems.

(The table also records convergence steps and line searches for quadratic problems.)

Mathematical Methods in Engineering and Science

Methods of Nonlinear Optimization*
Points to note

Conjugate directions and the expanding subspace property

Conjugate gradient method

Powell-Smith direction set method

The quasi-Newton concept in professional optimization

Necessary Exercises: 1, 2, 3

Mathematical Methods in Engineering and Science

Outline

Constrained Optimization
Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

Mathematical Methods in Engineering and Science

Constrained Optimization
Constraints

Constrained optimization problem:

Minimize f(x)
subject to g_i(x) ≤ 0 for i = 1, 2, ..., l,  or g(x) ≤ 0;
and h_j(x) = 0 for j = 1, 2, ..., m,  or h(x) = 0.

Conceptually, minimize f(x), x ∈ Ω.

Equality constraints reduce the domain to a surface or a manifold,
possessing a tangent plane at every point.
Gradient of the vector function h(x):

∇h(x) ≡ [∇h_1(x)  ∇h_2(x)  ...  ∇h_m(x)],

related to the usual Jacobian as J_h(x) = ∂h/∂x = [∇h(x)]^T.

Mathematical Methods in Engineering and Science

Constrained Optimization
Constraints

Constraint qualification

∇h_1(x), ∇h_2(x) etc. are linearly independent, i.e. ∇h(x) is
full-rank.
If a feasible point x_0, with h(x_0) = 0, satisfies the constraint
qualification condition, we call it a regular point.
At a regular feasible point x_0, the tangent plane

M = {y : [∇h(x_0)]^T y = 0}

gives the collection of feasible directions.
Equality constraints reduce the dimension of the problem.
Variable elimination?

Active inequality constraints g_i(x_0) = 0:

included among the h_j(x_0)
for the tangent plane.

Cone of feasible directions:

[∇h(x_0)]^T d = 0  and  [∇g_i(x_0)]^T d ≤ 0 for i ∈ I,

where I is the set of indices of active inequality constraints.

Handling inequality constraints:

Active set strategy maintains a list of active constraints,
keeps checking at every step for a change of scenario and
updates the list by inclusions and exclusions.

Slack variable strategy replaces all the inequality constraints
by equality constraints as g_i(x) + x_{n+i} = 0 with the inclusion
of non-negative slack variables (x_{n+i}).

Mathematical Methods in Engineering and Science

Constrained Optimization
Optimality Criteria

Suppose x* is a regular point with

active inequality constraints: g^(a)(x*) = 0

inactive constraints: g^(i)(x*) < 0

Columns of ∇h(x*) and ∇g^(a)(x*): basis for the orthogonal
complement of the tangent plane
Basis of the tangent plane: D = [d_1  d_2  ...  d_k]
Then, [D  ∇h(x*)  ∇g^(a)(x*)]: basis of R^n

Now, −∇f(x*) is a vector in R^n:

−∇f(x*) = [D  ∇h(x*)  ∇g^(a)(x*)] [z; λ; μ^(a)]

with unique z, λ and μ^(a) for a given ∇f(x*).

What can you say if x* is a solution to the NLP problem?

Components of ∇f(x*) in the tangent plane must be zero.

z = 0  ⇒  −∇f(x*) = [∇h(x*)]λ + [∇g^(a)(x*)]μ^(a)

For inactive constraints, insisting on μ^(i) = 0,

−∇f(x*) = [∇h(x*)]λ + [∇g^(a)(x*)  ∇g^(i)(x*)] [μ^(a); μ^(i)]

or

∇f(x*) + [∇h(x*)]λ + [∇g(x*)]μ = 0,

where g(x) = [g^(a)(x); g^(i)(x)] and μ = [μ^(a); μ^(i)].

Notice: g^(a)(x*) = 0 and μ^(i) = 0  ⇒  μ_i g_i(x*) = 0 for all i, or μ^T g(x*) = 0.

Now, the components in g(x) are free to appear in any order.

Finally, what about the feasible directions in the cone?

Answer: The negative gradient −∇f(x*) can have no component
towards decreasing g_i(x), i.e. μ_i^(a) ≥ 0 for all i.
Combining it with μ_i^(i) = 0,  μ ≥ 0.

First order necessary conditions or Karush-Kuhn-Tucker
(KKT) conditions: If x* is a regular point of the constraints and
a solution to the NLP problem, then there exist Lagrange
multiplier vectors, λ and μ, such that

Optimality:  ∇f(x*) + [∇h(x*)]λ + [∇g(x*)]μ = 0,  μ ≥ 0;
Feasibility:  h(x*) = 0,  g(x*) ≤ 0;
Complementarity:  μ^T g(x*) = 0.

Convex programming problem: convex objective function f(x)
and convex domain (convex g_i(x) and linear h_j(x)):
the KKT conditions are sufficient as well!
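A small numerical check of these KKT conditions at a candidate point, on a two-variable example of my own with both inequality constraints active (h is absent here, so only the μ terms appear).

import numpy as np

# Minimize f(x) = (x1-2)^2 + (x2-1)^2
# subject to g1(x) = x1^2 - x2 <= 0 and g2(x) = x1 + x2 - 2 <= 0.
grad_f  = lambda x: np.array([2*(x[0] - 2), 2*(x[1] - 1)])
g       = lambda x: np.array([x[0]**2 - x[1], x[0] + x[1] - 2])
grad_g  = lambda x: np.array([[2*x[0], -1.0], [1.0, 1.0]])   # rows: grad g1, grad g2

x_star  = np.array([1.0, 1.0])     # candidate: both constraints active
mu      = np.array([2/3, 2/3])     # candidate multipliers

optimality      = grad_f(x_star) + grad_g(x_star).T @ mu     # should be ~0
feasibility     = g(x_star)                                   # should be <= 0
complementarity = mu @ g(x_star)                              # should be 0
print(optimality, feasibility, complementarity, (mu >= 0).all())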

Mathematical Methods in Engineering and Science

Constrained Optimization
Optimality Criteria

Lagrangian function:

L(x, λ, μ) = f(x) + λ^T h(x) + μ^T g(x)

Necessary conditions for a stationary point of the Lagrangian:

∇_x L = 0,  ∇_λ L = 0

Second order conditions

Consider a curve z(t) in the tangent plane with z(0) = x*.

d²/dt² f(z(t)) |_{t=0} = d/dt [∇f(z(t))^T z'(t)] |_{t=0}
= z'(0)^T H(x*) z'(0) + [∇f(x*)]^T z''(0) ≥ 0

Similarly, from h_j(z(t)) = 0,

z'(0)^T H_{h_j}(x*) z'(0) + [∇h_j(x*)]^T z''(0) = 0.

Including contributions from all active constraints,

d²/dt² f(z(t)) |_{t=0} = z'(0)^T H_L(x*) z'(0) + [∇_x L(x*, λ, μ)]^T z''(0) ≥ 0,

where H_L(x) = ∂²L/∂x² = H(x) + Σ_j λ_j H_{h_j}(x) + Σ_i μ_i H_{g_i}(x).

The first order necessary condition makes the second term vanish!

Second order necessary condition:
The Hessian matrix of the Lagrangian function is positive
semi-definite on the tangent plane M.
Sufficient condition: ∇_x L = 0 and H_L(x) positive definite on M.
Restriction of the mapping H_L(x*) : R^n → R^n to the subspace M?

Take y ∈ M, operate H_L(x*) on it, project the image back to M.

Restricted mapping L_M : M → M

Question: Matrix representation for L_M of size (n − m) × (n − m)?

Select a local orthonormal basis D ∈ R^{n×(n−m)} for M.
For arbitrary z ∈ R^{n−m}, map y = Dz ∈ R^n as H_L y = H_L D z.
Its component along d_i: d_i^T H_L D z.
Hence, projection back on M:

L_M z = D^T H_L D z.

The (n − m) × (n − m) matrix L_M = D^T H_L D: the restriction!
Second order necessary/sufficient condition: L_M p.s.d./p.d.
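The restricted Hessian D^T H_L D is easy to form numerically once a basis of the tangent plane is available; a minimal sketch on a hypothetical equality-constrained problem of mine, using the SVD to get an orthonormal null-space basis of the constraint Jacobian:

import numpy as np

# Example: minimize f = x1^2 + x2^2 + x3^2 subject to h = x1 + x2 + x3 - 3 = 0.
x_star = np.array([1.0, 1.0, 1.0])      # candidate (regular) point
grad_h = np.array([[1.0, 1.0, 1.0]])    # Jacobian of h, shape (m, n)
H_L = 2.0 * np.eye(3)                   # Hessian of the Lagrangian at x_star (h is linear)

# Orthonormal basis D of the tangent plane M = null space of grad_h:
# rows of Vt beyond the rank span the null space.
_, _, Vt = np.linalg.svd(grad_h)
D = Vt[1:].T                            # n x (n - m) basis matrix
L_M = D.T @ H_L @ D                     # restriction of H_L to M
print(np.linalg.eigvalsh(L_M))          # all positive => second order sufficient condition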

Mathematical Methods in Engineering and Science

Constrained Optimization
Sensitivity

Suppose the original objective and constraint functions as

f(x, p), g(x, p) and h(x, p).

By choosing parameters (p), we arrive at x*. Call it x*(p).

Question: How does f(x*(p), p) depend on p?
Total gradients:

∇_p f(x*(p), p) = ∇_p x*(p) ∇_x f(x*, p) + ∇_p f(x*, p),
∇_p h(x*(p), p) = ∇_p x*(p) ∇_x h(x*, p) + ∇_p h(x*, p) = 0,

and similarly for g(x*(p), p).

In view of ∇_x L = 0, from the KKT conditions,

∇_p f(x*(p), p) = ∇_p f(x*, p) + [∇_p h(x*, p)]λ + [∇_p g(x*, p)]μ.

Sensitivity to constraints
In particular, in a revised problem, with h(x) = c and g(x) ≤ d,
using p = c,

∇_p f(x*, p) = 0, ∇_p h(x*, p) = −I and ∇_p g(x*, p) = 0.

∇_c f(x*(p), p) = −λ

Similarly, using p = d, we get

∇_d f(x*(p), p) = −μ.

Lagrange multipliers λ and μ signify the costs of pulling the minimum
point in order to satisfy the constraints!

Equality constraint: both sides infeasible, the sign of λ_j identifies
one side or the other of the hypersurface.

Inequality constraint: one side is feasible, no cost of pulling
from that side, so μ_i ≥ 0.

Mathematical Methods in Engineering and Science

Constrained Optimization
Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

Duality*

Dual problem:
Reformulation of a problem in terms of the Lagrange multipliers.
Suppose x as a local minimum for the problem
Minimize f (x) subject to h(x) = 0,
with Lagrange multiplier (vector) .
f (x ) + [h(x )] = 0

677,

Mathematical Methods in Engineering and Science

Constrained Optimization
Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

Duality*

Dual problem:
Reformulation of a problem in terms of the Lagrange multipliers.
Suppose x* as a local minimum for the problem
Minimize f(x) subject to h(x) = 0,
with Lagrange multiplier (vector) λ*.

∇f(x*) + [∇h(x*)] λ* = 0

If H_L(x*) is positive definite (assumption of local duality), then x*

is also a local minimum of
f̄(x) = f(x) + λ*^T h(x).
If we vary λ around λ*, the minimizer of
L(x, λ) = f(x) + λ^T h(x)
varies continuously with λ.

678,

Mathematical Methods in Engineering and Science

Constrained Optimization
Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

Duality*

In the neighbourhood of , define the dual function

() = min L(x, ) = min[f (x) + T h(x)].


x

For a pair {x, }, the dual solution is feasible if and only


if the primal solution is optimal.

679,

Mathematical Methods in Engineering and Science

Constrained Optimization
Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

Duality*

In the neighbourhood of , define the dual function

() = min L(x, ) = min[f (x) + T h(x)].


x

For a pair {x, }, the dual solution is feasible if and only


if the primal solution is optimal.
Define x() as the local minimizer of L(x, ).
() = L(x(), ) = f (x()) + T h(x())

680,

Mathematical Methods in Engineering and Science

Constrained Optimization
Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

Duality*

In the neighbourhood of λ*, define the dual function

φ(λ) = min_x L(x, λ) = min_x [f(x) + λ^T h(x)].

For a pair {x, λ}, the dual solution is feasible if and only

if the primal solution is optimal.
Define x(λ) as the local minimizer of L(x, λ).
φ(λ) = L(x(λ), λ) = f(x(λ)) + λ^T h(x(λ))
First derivative:
∇φ(λ) = [∇_λ x(λ)] ∇_x L(x(λ), λ) + h(x(λ)) = h(x(λ))
For a pair {x, λ}, the dual solution is optimal if and only
if the primal solution is feasible.

681,

Mathematical Methods in Engineering and Science

Duality*
Hessian of the dual function:

Constrained Optimization
Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

H () = x()x h(x())
Differentiating x L(x(), ) = 0, we have
x()HL (x(), ) + [x h(x())]T = 0.
Solving for x() and substituting,
H () = [x h(x())]T [HL (x(), )]1 x h(x()),
negative definite!

682,

Mathematical Methods in Engineering and Science

Duality*
Hessian of the dual function:

Constrained Optimization
Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

H_φ(λ) = [∇_λ x(λ)] ∇_x h(x(λ))
Differentiating ∇_x L(x(λ), λ) = 0, we have
[∇_λ x(λ)] H_L(x(λ), λ) + [∇_x h(x(λ))]^T = 0.
Solving for ∇_λ x(λ) and substituting,
H_φ(λ) = −[∇_x h(x(λ))]^T [H_L(x(λ), λ)]^{−1} ∇_x h(x(λ)),
negative definite!
At λ*, x(λ*) = x*, ∇φ(λ*) = h(x*) = 0, H_φ(λ*) is negative
definite and the dual function is maximized.
φ(λ*) = L(x*, λ*) = f(x*)

683,

Mathematical Methods in Engineering and Science

Constrained Optimization

Duality*
Consolidation (including all constraints)

Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

Assuming local convexity, the dual function:

φ(λ, μ) = min_x L(x, λ, μ) = min_x [f(x) + λ^T h(x) + μ^T g(x)].

Constraints on the dual: ∇_x L(x, λ, μ) = 0, optimality of the

primal.

Corresponding to inequality constraints of the primal problem,

non-negative variables in the dual problem.

First order necessary conditions for the dual optimality:

equivalent to the feasibility of the primal problem.

The dual function is concave globally!

Under suitable conditions, φ(λ*, μ*) = L(x*, λ*, μ*) = f(x*).

The Lagrangian L(x, λ, μ) has a saddle point in the combined

space of primal and dual variables: positive curvature along x
directions and negative curvature along λ and μ directions.

684,

Mathematical Methods in Engineering and Science

Constrained Optimization

Structure of Methods: An Overview*

Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

For a problem of n variables, with m active constraints,

nature and dimension of working spaces

Penalty methods (R^n): Minimize the penalized function

q(c, x) = f(x) + c P(x).
Example: P(x) = ½ ‖h(x)‖² + ½ ‖max(0, g(x))‖² (see the sketch after this list).

Primal methods (R^{n−m}): Work only in feasible domain, restricting

steps to the tangent plane.
Example: Gradient projection method.
Dual methods (R^m): Transform the problem to the space of
Lagrange multipliers and maximize the dual.
Example: Augmented Lagrangian method.
Lagrange methods (R^{m+n}): Solve equations appearing in the KKT
conditions directly.
Example: Sequential quadratic programming.
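A rough sketch of the penalty approach named in the list above (an assumed toy problem; the increasing weight schedule and the use of scipy.optimize.minimize are illustrative choices, not the text's prescription):

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 2.0)**2 + (x[1] - 1.0)**2        # objective
h = lambda x: x[0] + x[1] - 1.0                         # equality constraint h(x) = 0
g = lambda x: x[0] - 0.8                                # inequality constraint g(x) <= 0

def q(x, c):
    P = 0.5 * h(x)**2 + 0.5 * max(0.0, g(x))**2         # penalty function P(x)
    return f(x) + c * P

x = np.zeros(2)
for c in [1.0, 10.0, 100.0, 1000.0]:                    # raise the penalty weight gradually
    x = minimize(lambda z: q(z, c), x, method="BFGS").x
print(x)    # approaches the constrained minimizer (0.8, 0.2) as c grows
```

Each unconstrained solve starts from the previous solution, which is the usual way to keep the ill-conditioning of large penalty weights manageable.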

685,

Mathematical Methods in Engineering and Science

Points to note

Constraint qualification

KKT conditions

Second order conditions

Basic ideas for solution strategy

Necessary Exercises: 1,2,3,4,5,6

Constrained Optimization
Constraints
Optimality Criteria
Sensitivity
Duality*
Structure of Methods: An Overview*

686,

Mathematical Methods in Engineering and Science

Outline

Linear and Quadratic Programming Problems*


Linear Programming
Quadratic Programming

Linear and Quadratic Programming Problems*


Linear Programming
Quadratic Programming

687,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Linear Programming

Linear Programming
Quadratic Programming

Standard form of an LP problem:


Minimize
subject to

f (x) = cT x,
Ax = b, x 0;

with b 0.

688,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Linear Programming

Linear Programming
Quadratic Programming

Standard form of an LP problem:


Minimize
subject to

f (x) = cT x,
Ax = b, x 0;

with b 0.

Preprocessing to cast a problem to the standard form


Maximization: Minimize the negative function.
Variables of unrestricted sign: Use two variables.
Inequality constraints: Use slack/surplus variables.
Negative RHS: Multiply with 1.

689,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Linear Programming

Linear Programming
Quadratic Programming

Standard form of an LP problem:


Minimize
subject to

f (x) = cT x,
Ax = b, x 0;

with b 0.

Preprocessing to cast a problem to the standard form


Maximization: Minimize the negative function.
Variables of unrestricted sign: Use two variables.
Inequality constraints: Use slack/surplus variables.
Negative RHS: Multiply with 1.
Geometry of an LP problem
Infinite domain: does a minimum exist?
Finite convex polytope: existence guaranteed
Operating with vertices sufficient as a strategy
Extension with slack/surplus variables: original solution space
a subspace in the extented space, x 0 marking the domain
Essence of the non-negativity condition of variables

690,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Linear Programming

Linear Programming
Quadratic Programming

The simplex method


Suppose x R N , b R M and A R MN full-rank, with M < N.
IM xB + A xNB = b
Basic and non-basic variables: xB R M and xNB R NM
Basic feasible solution: xB = b 0 and xNB = 0

691,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Linear Programming

Linear Programming
Quadratic Programming

The simplex method


Suppose x R N , b R M and A R MN full-rank, with M < N.
IM xB + A xNB = b
Basic and non-basic variables: xB R M and xNB R NM
Basic feasible solution: xB = b 0 and xNB = 0
At every iteration,
selection of a non-basic variable to enter the basis

selection of a basic variable to leave the basis

edge of travel selected based on maximum rate of descent


no qualifier: current vertex is optimal
based on the first constraint becoming active along the edge
no constraint ahead: function is unbounded

elementary row operations: new basic feasible solution

692,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Linear Programming

Linear Programming
Quadratic Programming

The simplex method


Suppose x R N , b R M and A R MN full-rank, with M < N.
IM xB + A xNB = b
Basic and non-basic variables: xB R M and xNB R NM
Basic feasible solution: xB = b 0 and xNB = 0
At every iteration,
selection of a non-basic variable to enter the basis

selection of a basic variable to leave the basis

edge of travel selected based on maximum rate of descent


no qualifier: current vertex is optimal
based on the first constraint becoming active along the edge
no constraint ahead: function is unbounded

elementary row operations: new basic feasible solution

Two-phase method: Inclusion of a pre-processing phase with


artificial variables to develop a basic feasible solution
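As a practical aside (not part of the slides), a standard-form LP like the one above can be handed to an off-the-shelf solver; the small assumed instance below uses scipy.optimize.linprog, which in recent SciPy versions routes to the HiGHS simplex/interior-point codes.

```python
import numpy as np
from scipy.optimize import linprog

# Minimize f(x) = c^T x  subject to  A x = b,  x >= 0,  with b >= 0
c = np.array([-2.0, -3.0, 0.0, 0.0])        # last two entries are slack variables
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])

res = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 4)
print(res.x, res.fun)                       # basic feasible optimum (3, 1, 0, 0), value -9
```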

693,

Mathematical Methods in Engineering and Science

Linear Programming

Linear and Quadratic Programming Problems*


Linear Programming
Quadratic Programming

General perspective
LP problem:
Minimize
subject to
Lagrangian:

T
f (x, y) = cT
1 x + c2 y;
A11 x + A12 y = b1 , A21 x + A22 y b2 ,

y 0.

T
L(x, y, , , ) = cT
1 x + c2 y

+ T (A11 x + A12 y b1 ) + T (A21 x + A22 y b2 ) T y

694,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Linear Programming

Linear Programming
Quadratic Programming

General perspective
LP problem:
Minimize
subject to
Lagrangian:

T
f (x, y) = cT
1 x + c2 y;
A11 x + A12 y = b1 , A21 x + A22 y b2 ,

y 0.

T
L(x, y, , , ) = cT
1 x + c2 y

+ T (A11 x + A12 y b1 ) + T (A21 x + A22 y b2 ) T y

Optimality conditions:

T
c1 + AT
11 + A21 = 0

and

T
= c2 + AT
12 + A22 0

Substituting back, optimal function value: f = T b1 T b2

f
Sensitivity to the constraints: f
b1 = and b2 =

695,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Linear Programming

696,

Linear Programming
Quadratic Programming

General perspective
LP problem:
Minimize     f(x, y) = c1^T x + c2^T y;
subject to   A11 x + A12 y = b1,  A21 x + A22 y ≤ b2,  y ≥ 0.
Lagrangian:

L(x, y, λ, μ, ν) = c1^T x + c2^T y
                   + λ^T (A11 x + A12 y − b1) + μ^T (A21 x + A22 y − b2) − ν^T y

Optimality conditions:

c1 + A11^T λ + A21^T μ = 0

and

ν = c2 + A12^T λ + A22^T μ ≥ 0

Substituting back, optimal function value: f* = −λ^T b1 − μ^T b2

Sensitivity to the constraints: ∂f*/∂b1 = −λ and ∂f*/∂b2 = −μ
Dual problem:
maximize     φ(λ, μ) = −b1^T λ − b2^T μ;
subject to   A11^T λ + A21^T μ = −c1,  A12^T λ + A22^T μ ≥ −c2,  μ ≥ 0.

Notice the symmetry between the primal and dual problems.

Mathematical Methods in Engineering and Science

Quadratic Programming

Linear and Quadratic Programming Problems*


Linear Programming
Quadratic Programming

A quadratic objective function and linear constraints define


a QP problem.
Equations from the KKT conditions: linear!
Lagrange methods are the natural choice!

697,

Mathematical Methods in Engineering and Science

Quadratic Programming

Linear and Quadratic Programming Problems*


Linear Programming
Quadratic Programming

A quadratic objective function and linear constraints define


a QP problem.
Equations from the KKT conditions: linear!
Lagrange methods are the natural choice!
With equality constraints only,
Minimize

1
f (x) = xT Qx + cT x,
2

subject to Ax = b.

First order necessary conditions:



   

c
x
Q AT
=

b
A 0
Solution of this linear system yields the complete result!

698,

Mathematical Methods in Engineering and Science

Quadratic Programming

Linear and Quadratic Programming Problems*


Linear Programming
Quadratic Programming

A quadratic objective function and linear constraints define


a QP problem.
Equations from the KKT conditions: linear!
Lagrange methods are the natural choice!
With equality constraints only,
Minimize

f(x) = ½ x^T Q x + c^T x,

subject to Ax = b.

First order necessary conditions:

[ Q   A^T ] [ x ]   [ −c ]
[ A    0  ] [ λ ] = [  b ]

Solution of this linear system yields the complete result!
Caution: This coefficient matrix is indefinite.
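A minimal sketch (assumed numbers) of solving this KKT system directly with dense linear algebra:

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 2.0]])                  # positive definite
c = np.array([-1.0, -1.0])
A = np.array([[1.0, 1.0]])                  # equality constraint x1 + x2 = 1
b = np.array([1.0])

n, m = Q.shape[0], A.shape[0]
K = np.block([[Q, A.T],
              [A, np.zeros((m, m))]])       # indefinite KKT matrix
sol = np.linalg.solve(K, np.concatenate([-c, b]))
x, lam = sol[:n], sol[n:]
print(x, lam)                               # minimizer and Lagrange multiplier
```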

699,

Mathematical Methods in Engineering and Science

Quadratic Programming

Linear and Quadratic Programming Problems*


Linear Programming
Quadratic Programming

Active set method


Minimize
subject to

f (x) = 21 xT Qx + cT x;
A1 x = b1 ,
A2 x b2 .

700,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Quadratic Programming

Linear Programming
Quadratic Programming

Active set method


Minimize
subject to

f (x) = 21 xT Qx + cT x;
A1 x = b1 ,
A2 x b2 .

Start the iterative process from a feasible point.

Construct active set of constraints as A x = b.
From the current point x_k, with x = x_k + d_k,

f(x) = ½ (x_k + d_k)^T Q (x_k + d_k) + c^T (x_k + d_k)
     = ½ d_k^T Q d_k + (c + Q x_k)^T d_k + f(x_k).

Since g_k ≡ ∇f(x_k) = c + Q x_k, subsidiary quadratic program:

minimize ½ d_k^T Q d_k + g_k^T d_k   subject to A d_k = 0.

Examining solution d_k and Lagrange multipliers, decide to

terminate, proceed or revise the active set.

701,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Quadratic Programming

Linear Programming
Quadratic Programming

Linear complementary problem (LCP)


Slack variable strategy with inequality constraints
Minimize

1 T
2 x Qx

+ cT x,

subject to Ax b,

x 0.

702,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Quadratic Programming

Linear Programming
Quadratic Programming

Linear complementary problem (LCP)


Slack variable strategy with inequality constraints
Minimize

1 T
2 x Qx

+ cT x,

subject to Ax b,

KKT conditions: With x, y, , 0,


Qx + c + AT = 0,

Ax + y = b,

x = T y = 0.

x 0.

703,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Quadratic Programming

Linear Programming
Quadratic Programming

Linear complementary problem (LCP)


Slack variable strategy with inequality constraints
Minimize

½ x^T Q x + c^T x,

subject to Ax ≤ b,  x ≥ 0.

KKT conditions: With x, y, μ, ν ≥ 0,

Qx + c + A^T μ − ν = 0,

Ax + y = b,

ν^T x = μ^T y = 0.

Denoting

z = [ x ],  w = [ ν ],  q = [ c ],  and  M = [  Q   A^T ]
    [ μ ]       [ y ]       [ b ]            [ −A    0  ],

w − Mz = q,   w^T z = 0.

Find mutually complementary non-negative w and z.

704,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Quadratic Programming

Linear Programming
Quadratic Programming

If q 0, then w = q, z = 0 is a solution!

Lemkes method: artificial variable z0 with e = [1 1 1 1]T :


Iw Mz ez0 = q
With z0 = max(qi ),
w = q + ez0 0 and z = 0: basic feasible solution

705,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Quadratic Programming

Linear Programming
Quadratic Programming

If q 0, then w = q, z = 0 is a solution!

Lemkes method: artificial variable z0 with e = [1 1 1 1]T :


Iw Mz ez0 = q
With z0 = max(qi ),
w = q + ez0 0 and z = 0: basic feasible solution

Evolution of the basis similar to the simplex method.

Out of a pair of w and z variables, only one can be there in


any basis.

At every step, one variable is driven out of the basis and its
partner called in.

The step driving out z0 flags termination.

706,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Quadratic Programming

Linear Programming
Quadratic Programming

If q 0, then w = q, z = 0 is a solution!

Lemkes method: artificial variable z0 with e = [1 1 1 1]T :


Iw Mz ez0 = q
With z0 = max(qi ),
w = q + ez0 0 and z = 0: basic feasible solution

Evolution of the basis similar to the simplex method.

Out of a pair of w and z variables, only one can be there in


any basis.

At every step, one variable is driven out of the basis and its
partner called in.

The step driving out z0 flags termination.

Handling of equality constraints?

707,

Mathematical Methods in Engineering and Science

Linear and Quadratic Programming Problems*

Quadratic Programming

Linear Programming
Quadratic Programming

If q ≥ 0, then w = q, z = 0 is a solution!

Lemke's method: artificial variable z0 with e = [1 1 1 ⋯ 1]^T:

I w − M z − e z0 = q
With z0 = max_i(−q_i),
w = q + e z0 ≥ 0 and z = 0: basic feasible solution

Evolution of the basis similar to the simplex method.

Out of a pair of w and z variables, only one can be there in

any basis.

At every step, one variable is driven out of the basis and its
partner called in.

The step driving out z0 flags termination.

Handling of equality constraints?

Very clumsy!!

708,

Mathematical Methods in Engineering and Science

Points to note

Linear and Quadratic Programming Problems*


Linear Programming
Quadratic Programming

Fundamental issues and general perspective of the linear


programming problem

The simplex method


Quadratic programming

The active set method


Lemke's method via the linear complementary problem

Necessary Exercises: 1,2,3,4,5

709,

Mathematical Methods in Engineering and Science

Outline

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

710,

Mathematical Methods in Engineering and Science

Polynomial Interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Problem: To develop an analytical representation of a function


from information at discrete data points.
Purpose

Evaluation at arbitrary points

Differentiation and/or integration

Drawing conclusion regarding the trends or nature

711,

Mathematical Methods in Engineering and Science

Polynomial Interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Problem: To develop an analytical representation of a function


from information at discrete data points.
Purpose

Evaluation at arbitrary points

Differentiation and/or integration

Drawing conclusion regarding the trends or nature

Interpolation: one of the ways of function representation

sampled data are exactly satisfied

Polynomial: a convenient class of basis functions


For yi = f (xi ) for i = 0, 1, 2, , n with x0 < x1 < x2 < < xn ,
p(x) = a0 + a1 x + a2 x 2 + + an x n .
Find the coefficients such that p(xi ) = f (xi ) for i = 0, 1, 2, , n.
Values of p(x) for x [x0 , xn ] interpolate n + 1 values
of f (x), an outside estimate is extrapolation.

712,

Mathematical Methods in Engineering and Science

Polynomial Interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

To determine p(x), solve the linear system

1 x0 x02 x0n
a0
f (x0 )
1 x1 x 2 x n

a1
1
1

f (x1 )
1 x2 x 2 x n

f (x2 )
a
=
2
2

..
..
.. . .
..

.
.
.
.
.
n
2
a
f
(x
n
n)
1 xn xn xn

Vandermonde matrix: invertible, but typically ill-conditioned!

713,

Mathematical Methods in Engineering and Science

Polynomial Interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

To determine p(x), solve the linear system

[ 1  x0  x0²  ⋯  x0^n ] [ a0 ]   [ f(x0) ]
[ 1  x1  x1²  ⋯  x1^n ] [ a1 ]   [ f(x1) ]
[ 1  x2  x2²  ⋯  x2^n ] [ a2 ] = [ f(x2) ]
[ ⋮    ⋮    ⋮    ⋱   ⋮  ] [ ⋮  ]   [   ⋮   ]
[ 1  xn  xn²  ⋯  xn^n ] [ an ]   [ f(xn) ]

Vandermonde matrix: invertible, but typically ill-conditioned!

Invertibility means existence and uniqueness of polynomial p(x).
Two polynomials p1(x) and p2(x) matching the function f(x) at
x0, x1, x2, ⋯, xn imply
n-th degree polynomial Δp(x) = p1(x) − p2(x) with
n + 1 roots!

⇒ Δp ≡ 0 ⇒ p1(x) = p2(x): p(x) is unique.
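A quick sketch (assumed sample data) that sets up this Vandermonde system with NumPy and shows its rapidly growing condition number:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 8)
f = np.sin(2.0 * np.pi * x)

V = np.vander(x, increasing=True)            # columns 1, x, x^2, ..., x^n
a = np.linalg.solve(V, f)                    # interpolation coefficients a0, ..., an

print(np.linalg.cond(V))                     # ill-conditioning already visible for n = 7
print(np.allclose(V @ a, f))                 # interpolation conditions p(x_i) = f(x_i) satisfied
```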

714,

Mathematical Methods in Engineering and Science

Polynomial Interpolation
Lagrange interpolation
Basis functions:
Qn
j=0,j6=k (x xj )
Lk (x) = Qn
j=0,j6=k (xk xj )
=

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

(x x0 )(x x1 ) (x xk1 )(x xk+1 ) (x xn )


(xk x0 )(xk x1 ) (xk xk1 )(xk xk+1 ) (xk xn )

Interpolating polynomial:

p(x) = 0 L0 (x) + 1 L1 (x) + 2 L2 (x) + + n Ln (x)

715,

Mathematical Methods in Engineering and Science

Polynomial Interpolation

Interpolation and Approximation

Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Lagrange interpolation
Basis functions:

L_k(x) = ∏_{j=0, j≠k}^{n} (x − x_j) / ∏_{j=0, j≠k}^{n} (x_k − x_j)
       = [(x − x0)(x − x1) ⋯ (x − x_{k−1})(x − x_{k+1}) ⋯ (x − x_n)]
         / [(x_k − x0)(x_k − x1) ⋯ (x_k − x_{k−1})(x_k − x_{k+1}) ⋯ (x_k − x_n)]

Interpolating polynomial:

p(x) = α0 L0(x) + α1 L1(x) + α2 L2(x) + ⋯ + αn Ln(x)

At the data points, L_k(x_i) = δ_ik.
⇒ Coefficient matrix identity and α_i = f(x_i).
Lagrange interpolation formula:

p(x) = Σ_{k=0}^{n} f(x_k) L_k(x) = L0(x) f(x0) + L1(x) f(x1) + ⋯ + Ln(x) f(xn)

Existence of p(x) is a trivial consequence!
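A direct (and deliberately naive) sketch of evaluating the Lagrange formula; the data below are an assumed example.

```python
import numpy as np

def lagrange_eval(x_nodes, f_nodes, x):
    """Evaluate p(x) = sum_k f(x_k) L_k(x) at a scalar x."""
    p = 0.0
    for k, xk in enumerate(x_nodes):
        Lk = 1.0
        for j, xj in enumerate(x_nodes):
            if j != k:
                Lk *= (x - xj) / (xk - xj)
        p += f_nodes[k] * Lk
    return p

x_nodes = np.array([0.0, 1.0, 2.0, 3.0])
f_nodes = x_nodes**3 - 2.0 * x_nodes            # samples of a cubic
print(lagrange_eval(x_nodes, f_nodes, 1.5))     # reproduces 1.5**3 - 3.0 = 0.375 exactly
```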

716,

Mathematical Methods in Engineering and Science

Polynomial Interpolation
Two interpolation formulae

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

one costly to determine, but easy to process

the other trivial to determine, costly to process

717,

Mathematical Methods in Engineering and Science

Polynomial Interpolation
Two interpolation formulae

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

one costly to determine, but easy to process

the other trivial to determine, costly to process

Newton interpolation for an intermediate trade-off: Q


p(x) = c0 + c1 (x x0 ) + c2 (x x0 )(x x1 ) + + cn n1
i =0 (x xi )

718,

Mathematical Methods in Engineering and Science

Polynomial Interpolation
Two interpolation formulae

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

one costly to determine, but easy to process

the other trivial to determine, costly to process

Newton interpolation for an intermediate trade-off: Q


p(x) = c0 + c1 (x x0 ) + c2 (x x0 )(x x1 ) + + cn n1
i =0 (x xi )

Hermite interpolation

uses derivatives as well as function values.


Data: f (xi ), f (xi ), , f (ni 1) (xi ) at x = xi , for i = 0, 1, , m:
Pm
At (m + 1) points, a total of n + 1 =
i =0 ni conditions

719,

Mathematical Methods in Engineering and Science

Polynomial Interpolation
Two interpolation formulae

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

one costly to determine, but easy to process

the other trivial to determine, costly to process

Newton interpolation for an intermediate trade-off:

p(x) = c0 + c1 (x − x0) + c2 (x − x0)(x − x1) + ⋯ + cn ∏_{i=0}^{n−1} (x − x_i)

Hermite interpolation

uses derivatives as well as function values.

Data: f(x_i), f'(x_i), ⋯, f^{(n_i − 1)}(x_i) at x = x_i, for i = 0, 1, ⋯, m:
At (m + 1) points, a total of n + 1 = Σ_{i=0}^{m} n_i conditions
Limitations of single-polynomial interpolation

With large number of data points, polynomial degree is high.

Computational cost and numerical imprecision

Lack of representative nature due to oscillations
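For the Newton form shown above, the coefficients c_i are divided differences; a minimal sketch (assumed example data, hypothetical helper names):

```python
import numpy as np

def newton_coefficients(x, f):
    """Divided differences: c[0] = f[x0], c[1] = f[x0,x1], c[2] = f[x0,x1,x2], ..."""
    c = np.array(f, dtype=float)
    n = len(x)
    for j in range(1, n):
        c[j:] = (c[j:] - c[j-1:-1]) / (x[j:] - x[:n-j])
    return c

def newton_eval(x_nodes, c, x):
    """Nested (Horner-like) evaluation of the Newton form."""
    p = c[-1]
    for k in range(len(c) - 2, -1, -1):
        p = p * (x - x_nodes[k]) + c[k]
    return p

x_nodes = np.array([0.0, 1.0, 2.0, 3.0])
f_nodes = np.exp(x_nodes)
c = newton_coefficients(x_nodes, f_nodes)
print(newton_eval(x_nodes, c, x_nodes) - f_nodes)   # ~0: the data are reproduced
```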

720,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation


Piecewise linear interpolation
f (x) = f (xi 1 ) +

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

f (xi ) f (xi 1 )
(x xi 1 ) for x [xi 1 , xi ]
xi xi 1

Handy for many uses with dense data. But, not differentiable.

721,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation


Piecewise linear interpolation
f (x) = f (xi 1 ) +

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

f (xi ) f (xi 1 )
(x xi 1 ) for x [xi 1 , xi ]
xi xi 1

Handy for many uses with dense data. But, not differentiable.
Piecewise cubic interpolation
With function values and derivatives at (n + 1) points,
n cubic Hermite segments
Data for the j-th segment:

f (xj1 ) = fj1 , f (xj ) = fj , f (xj1 ) = fj1


and f (xj ) = fj

Interpolating polynomial:
pj (x) = a0 + a1 x + a2 x 2 + a3 x 3
, f
Coefficients a0 , a1 , a2 , a3 : linear combinations of fj1 , fj , fj1
j

722,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation

Interpolation and Approximation

Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Piecewise linear interpolation

f(x) = f(x_{i−1}) + [f(x_i) − f(x_{i−1})] / (x_i − x_{i−1}) · (x − x_{i−1})   for x ∈ [x_{i−1}, x_i]

Handy for many uses with dense data. But, not differentiable.
Piecewise cubic interpolation
With function values and derivatives at (n + 1) points,
n cubic Hermite segments
Data for the j-th segment:

f(x_{j−1}) = f_{j−1},  f(x_j) = f_j,  f'(x_{j−1}) = f'_{j−1}  and  f'(x_j) = f'_j

Interpolating polynomial:
p_j(x) = a0 + a1 x + a2 x² + a3 x³
Coefficients a0, a1, a2, a3: linear combinations of f_{j−1}, f_j, f'_{j−1}, f'_j

Composite function C¹ continuous at knot points.

723,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

General formulation through normalization of intervals


x = xj1 + t(xj xj1 ), t [0, 1]
With g (t) = f (x(t)), g (t) = (xj xj1 )f (x(t));

and g1 = (xj xj1 )fj .


g0 = fj1 , g1 = fj , g0 = (xj xj1 )fj1

Cubic polynomial for the j-th segment:


qj (t) = 0 + 1 t + 2 t 2 + 3 t 3

724,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

General formulation through normalization of intervals


x = xj1 + t(xj xj1 ), t [0, 1]
With g (t) = f (x(t)), g (t) = (xj xj1 )f (x(t));

and g1 = (xj xj1 )fj .


g0 = fj1 , g1 = fj , g0 = (xj xj1 )fj1

Cubic polynomial for the j-th segment:


qj (t) = 0 + 1 t + 2 t 2 + 3 t 3
Modular expression:

1
t

qj (t) = [0 1 2 3 ]
t 2 = [g0 g1 g0 g1 ] W
t3

1
t

t 2 = Gj WT
t3

Packaging data, interpolation type and variable terms separately!

725,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

General formulation through normalization of intervals


x = xj1 + t(xj xj1 ), t [0, 1]
With g (t) = f (x(t)), g (t) = (xj xj1 )f (x(t));

and g1 = (xj xj1 )fj .


g0 = fj1 , g1 = fj , g0 = (xj xj1 )fj1

Cubic polynomial for the j-th segment:


qj (t) = 0 + 1 t + 2 t 2 + 3 t 3
Modular expression:

1
t

qj (t) = [0 1 2 3 ]
t 2 = [g0 g1 g0 g1 ] W
t3

1
t

t 2 = Gj WT
t3

Packaging data, interpolation type and variable terms separately!


Question: How to supply derivatives? And, why?

726,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation


Spline interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Spline: a drafting tool to draw a smooth curve through key points.

727,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation


Spline interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Spline: a drafting tool to draw a smooth curve through key points.


Data: fi = f (xi ), for x0 < x1 < x2 < < xn .
If kj = f (xj ), then
pj (x) can be determined in terms of fj1 , fj , kj1 , kj
and pj+1 (x) in terms of fj , fj+1 , kj , kj+1 .
(x ): a linear equation in k
Then, pj (xj ) = pj+1
j
j1 , kj and kj+1

728,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation


Spline interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Spline: a drafting tool to draw a smooth curve through key points.


Data: fi = f (xi ), for x0 < x1 < x2 < < xn .
If kj = f (xj ), then
pj (x) can be determined in terms of fj1 , fj , kj1 , kj
and pj+1 (x) in terms of fj , fj+1 , kj , kj+1 .
(x ): a linear equation in k
Then, pj (xj ) = pj+1
j
j1 , kj and kj+1

From n 1 interior knot points,

n 1 linear equations in derivative values k0 , k1 , , kn .

Prescribing k0 and kn , a diagonally dominant tridiagonal system!

729,

Mathematical Methods in Engineering and Science

Piecewise Polynomial Interpolation


Spline interpolation

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Spline: a drafting tool to draw a smooth curve through key points.

Data: f_i = f(x_i), for x0 < x1 < x2 < ⋯ < xn.
If k_j = f'(x_j), then
p_j(x) can be determined in terms of f_{j−1}, f_j, k_{j−1}, k_j
and p_{j+1}(x) in terms of f_j, f_{j+1}, k_j, k_{j+1}.
Then, p_j''(x_j) = p_{j+1}''(x_j): a linear equation in k_{j−1}, k_j and k_{j+1}

From n − 1 interior knot points,

n − 1 linear equations in derivative values k0, k1, ⋯, kn.

Prescribing k0 and kn, a diagonally dominant tridiagonal system!

A spline is a smooth interpolation, with C² continuity.
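In practice a library spline performs this tridiagonal setup and solve internally; a short sketch (assumed example, requires a reasonably recent SciPy for CubicHermiteSpline):

```python
import numpy as np
from scipy.interpolate import CubicSpline, CubicHermiteSpline

x = np.linspace(0.0, 2.0 * np.pi, 9)
f = np.sin(x)

s = CubicSpline(x, f, bc_type="natural")       # C2 spline: knot derivatives from a tridiagonal system
h = CubicHermiteSpline(x, f, np.cos(x))        # C1 piecewise cubic from prescribed derivatives

xx = np.linspace(0.0, 2.0 * np.pi, 201)
print(np.max(np.abs(s(xx) - np.sin(xx))),      # spline error
      np.max(np.abs(h(xx) - np.sin(xx))))      # Hermite error
```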

730,

Mathematical Methods in Engineering and Science

Interpolation and Approximation

Polynomial Interpolation
Interpolation of Multivariate Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions

Piecewise bilinear interpolation

A Note on Approximation of Functions


Modelling of Curves and Surfaces*

Data: f (x, y ) over a dense rectangular grid


x = x0 , x1 , x2 , , xm and y = y0 , y1 , y2 , , yn
Rectangular domain: {(x, y ) : x0 x xm , y0 y yn }

731,

Mathematical Methods in Engineering and Science

Interpolation and Approximation

Polynomial Interpolation
Interpolation of Multivariate Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions

Piecewise bilinear interpolation

A Note on Approximation of Functions


Modelling of Curves and Surfaces*

Data: f (x, y ) over a dense rectangular grid


x = x0 , x1 , x2 , , xm and y = y0 , y1 , y2 , , yn
Rectangular domain: {(x, y ) : x0 x xm , y0 y yn }
For xi 1 x xi and yj1 y yj ,
f (x, y ) = a0,0 + a1,0 x + a0,1 y + a1,1 xy = [1 x]

a0,0 a0,1
a1,0 a1,1



1
y

With data at four corner points, coefficient matrix determined from




 


1
1
fi 1,j1 fi 1,j
a0,0 a0,1
1 xi 1
.
=
fi ,j1
fi ,j
a1,0 a1,1
yj1 yj
1 xi

732,

Mathematical Methods in Engineering and Science

Interpolation and Approximation

Polynomial Interpolation
Interpolation of Multivariate Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions

Piecewise bilinear interpolation

A Note on Approximation of Functions


Modelling of Curves and Surfaces*

Data: f (x, y ) over a dense rectangular grid


x = x0 , x1 , x2 , , xm and y = y0 , y1 , y2 , , yn
Rectangular domain: {(x, y ) : x0 x xm , y0 y yn }
For xi 1 x xi and yj1 y yj ,
f (x, y ) = a0,0 + a1,0 x + a0,1 y + a1,1 xy = [1 x]

a0,0 a0,1
a1,0 a1,1



1
y

With data at four corner points, coefficient matrix determined from




 


1
1
fi 1,j1 fi 1,j
a0,0 a0,1
1 xi 1
.
=
fi ,j1
fi ,j
a1,0 a1,1
yj1 yj
1 xi
Approximation only C 0 continuous.

733,

Mathematical Methods in Engineering and Science

Interpolation and Approximation

Polynomial Interpolation
Interpolation of Multivariate Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions

Alternative local formula through reparametrization


Modelling of Curves and Surfaces*
y yj1
xxi 1
With u = xi xi 1 and v = yj yj1 , denoting
fi 1,j1 = g0,0 , fi ,j1 = g1,0 , fi 1,j = g0,1 and fi ,j = g1,1 ;
bilinear interpolation:
g (u, v ) = [1 u]

0,0 0,1
1,0 1,1



1
v

for u, v [0, 1].

734,

Mathematical Methods in Engineering and Science

Interpolation and Approximation

735,

Polynomial Interpolation
Interpolation of Multivariate Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions

Alternative local formula through reparametrization


Modelling of Curves and Surfaces*
y yj1
xxi 1
With u = xi xi 1 and v = yj yj1 , denoting
fi 1,j1 = g0,0 , fi ,j1 = g1,0 , fi 1,j = g0,1 and fi ,j = g1,1 ;
bilinear interpolation:
g (u, v ) = [1 u]

0,0 0,1
1,0 1,1



1
v

for u, v [0, 1].

Values at four corner points fix the coefficient matrix as

[ α_{0,0}  α_{0,1} ]   [  1  0 ] [ g_{0,0}  g_{0,1} ] [ 1  −1 ]
[ α_{1,0}  α_{1,1} ] = [ −1  1 ] [ g_{1,0}  g_{1,1} ] [ 0   1 ].

Concisely,   g(u, v) = U^T W^T G_{i,j} W V,

in which

U = [ 1 ],  V = [ 1 ],  W = [ 1  −1 ],  G_{i,j} = [ f_{i−1,j−1}  f_{i−1,j} ]
    [ u ]       [ v ]       [ 0   1 ]             [ f_{i,j−1}    f_{i,j}   ]
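A minimal sketch of this normalized bilinear form on a single cell (assumed corner values; the helper bilinear is hypothetical):

```python
import numpy as np

W = np.array([[1.0, -1.0],
              [0.0,  1.0]])

def bilinear(x, y, xs, ys, G):
    """G[a, b] = f at corner (xs[a], ys[b]); evaluates g(u, v) = U^T W^T G W V."""
    u = (x - xs[0]) / (xs[1] - xs[0])
    v = (y - ys[0]) / (ys[1] - ys[0])
    U = np.array([1.0, u])
    V = np.array([1.0, v])
    return U @ W.T @ G @ W @ V

G = np.array([[1.0, 2.0],
              [3.0, 5.0]])
print(bilinear(0.5, 0.5, (0.0, 1.0), (0.0, 1.0), G))   # 2.75, the mean of the four corners
```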

Mathematical Methods in Engineering and Science

Interpolation and Approximation

Polynomial Interpolation
Interpolation of Multivariate Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions

Piecewise bicubic interpolation


f f
2f
Data: f , x
, y and xy
over grid points
With normalizing parameters u and v ,
g
u

f
= (xi xi 1 ) x
,
2g
uv

g
v

A Note on Approximation of Functions


Modelling of Curves and Surfaces*

f
= (yj yj1 ) y
, and
2

f
= (xi xi 1 )(yj yj1 ) xy

736,

Mathematical Methods in Engineering and Science

Interpolation and Approximation

Polynomial Interpolation
Interpolation of Multivariate Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions

Piecewise bicubic interpolation

A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Data: f, ∂f/∂x, ∂f/∂y and ∂²f/∂x∂y over grid points
With normalizing parameters u and v,

∂g/∂u = (x_i − x_{i−1}) ∂f/∂x,   ∂g/∂v = (y_j − y_{j−1}) ∂f/∂y,   and
∂²g/∂u∂v = (x_i − x_{i−1})(y_j − y_{j−1}) ∂²f/∂x∂y

In {(x, y) : x_{i−1} ≤ x ≤ x_i, y_{j−1} ≤ y ≤ y_j} or {(u, v) : u, v ∈ [0, 1]},

g(u, v) = U^T W^T G_{i,j} W V,

with U = [1 u u² u³]^T,  V = [1 v v² v³]^T,  and

          [ g(0,0)    g(0,1)    g_v(0,0)    g_v(0,1)  ]
G_{i,j} = [ g(1,0)    g(1,1)    g_v(1,0)    g_v(1,1)  ]
          [ g_u(0,0)  g_u(0,1)  g_uv(0,0)   g_uv(0,1) ]
          [ g_u(1,0)  g_u(1,1)  g_uv(1,0)   g_uv(1,1) ].

737,

Mathematical Methods in Engineering and Science

Interpolation and Approximation

Polynomial Interpolation
A Note on Approximation of Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

A common strategy of function approximation is to

express a function as a linear combination of a set of basis


functions (which?), and

determine coefficients based on some criteria (what?).

738,

Mathematical Methods in Engineering and Science

Interpolation and Approximation

Polynomial Interpolation
A Note on Approximation of Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

A common strategy of function approximation is to

express a function as a linear combination of a set of basis


functions (which?), and

determine coefficients based on some criteria (what?).

Criteria:
Interpolatory approximation: Exact agreement with sampled data
Least square approximation: Minimization of a sum (or integral) of
square errors over sampled data
Minimax approximation: Limiting the largest deviation

739,

Mathematical Methods in Engineering and Science

Interpolation and Approximation

Polynomial Interpolation
A Note on Approximation of Functions
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

A common strategy of function approximation is to

express a function as a linear combination of a set of basis


functions (which?), and

determine coefficients based on some criteria (what?).

Criteria:
Interpolatory approximation: Exact agreement with sampled data
Least square approximation: Minimization of a sum (or integral) of
square errors over sampled data
Minimax approximation: Limiting the largest deviation
Basis functions:
polynomials, sinusoids, orthogonal eigenfunctions or
field-specific heuristic choice

740,

Mathematical Methods in Engineering and Science

Points to note

Interpolation and Approximation


Polynomial Interpolation
Piecewise Polynomial Interpolation
Interpolation of Multivariate Functions
A Note on Approximation of Functions
Modelling of Curves and Surfaces*

Lagrange, Newton and Hermite interpolations

Piecewise polynomial functions and splines

Bilinear and bicubic interpolation of bivariate functions

Direct extension to vector functions: curves and surfaces!

Necessary Exercises: 1,2,4,6

741,

Mathematical Methods in Engineering and Science

Outline

Basic Methods of Numerical Integration

742,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Basic Methods of Numerical Integration


Newton-Cotes Integration Formulae
Richardson Extrapolation and Romberg Integration
Further Issues

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae


J=

f (x)dx
a

743,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae


J=

f (x)dx
a

Divide [a, b] into n sub-intervals with


a = x0 < x1 < x2 < < xn1 < xn = b,
where xi xi 1 = h =
J =

n
X
i =1

Taking

xi

744,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

ba
n .

hf (xi ) = h[f (x1 ) + f (x2 ) + + f (xn )]

[xi 1 , xi ] as xi 1 and xi , we get summations J1 and J2 .

As n (i.e. h 0), if J1 and J2 approach the same


limit, then function f (x) is integrable over interval [a, b].
A rectangular rule or a one-point rule

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae


745,

Newton-Cotes Integration Formulae

Richardson Extrapolation and Romberg Integration
Further Issues

J = ∫_a^b f(x) dx

Divide [a, b] into n sub-intervals with

a = x0 < x1 < x2 < ⋯ < x_{n−1} < xn = b,
where x_i − x_{i−1} = h = (b − a)/n.

J ≈ Σ_{i=1}^{n} h f(x̄_i) = h [f(x̄_1) + f(x̄_2) + ⋯ + f(x̄_n)]

Taking x̄_i ∈ [x_{i−1}, x_i] as x_{i−1} and x_i, we get summations J1 and J2.

As n → ∞ (i.e. h → 0), if J1 and J2 approach the same

limit, then function f(x) is integrable over interval [a, b].
A rectangular rule or a one-point rule
Question: Which point to take as x̄_i?

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae


Mid-point rule
x +x
Selecting xi as xi = i 12 i ,
Z xi
f (x)dx hf (
xi ) and
xi 1

b
a

746,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

f (x)dx h

n
X
i =1

f (
xi ).

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae


Mid-point rule
x +x
Selecting xi as xi = i 12 i ,
Z xi
f (x)dx hf (
xi ) and
xi 1

b
a

747,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

f (x)dx h

n
X

f (
xi ).

i =1

Error analysis: From Taylors series of f (x) about xi ,



Z xi 
Z xi
(x xi )2

f (x)dx =
f (
xi ) + f (
xi )(x xi ) + f (
xi )
+ dx
2
xi 1
xi 1
= hf (
xi ) +
third order accurate!

h5 iv
h3
f (
xi ) +
f (
xi ) + ,
24
1920

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae


Mid-point rule
Selecting x̄_i as x̄_i = (x_{i−1} + x_i)/2,

∫_{x_{i−1}}^{x_i} f(x) dx ≈ h f(x̄_i)   and   ∫_a^b f(x) dx ≈ h Σ_{i=1}^{n} f(x̄_i).

748,

Newton-Cotes Integration Formulae

Richardson Extrapolation and Romberg Integration
Further Issues

Error analysis: From Taylor's series of f(x) about x̄_i,

∫_{x_{i−1}}^{x_i} f(x) dx = ∫_{x_{i−1}}^{x_i} [ f(x̄_i) + f'(x̄_i)(x − x̄_i) + f''(x̄_i)(x − x̄_i)²/2 + ⋯ ] dx
                          = h f(x̄_i) + (h³/24) f''(x̄_i) + (h⁵/1920) f^{iv}(x̄_i) + ⋯,

third order accurate!

Over the entire domain [a, b],

∫_a^b f(x) dx = h Σ_{i=1}^{n} f(x̄_i) + (h³/24) Σ_{i=1}^{n} f''(x̄_i) + ⋯
              = h Σ_{i=1}^{n} f(x̄_i) + (h²/24)(b − a) f''(ξ) + ⋯,

for ξ ∈ [a, b] (from mean value theorem): second order accurate.
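A quick sketch (assumed integrand) confirming the second-order behaviour of the composite mid-point rule: halving h cuts the error by roughly a factor of four.

```python
import numpy as np

def midpoint(f, a, b, n):
    h = (b - a) / n
    xbar = a + h * (np.arange(n) + 0.5)          # interval mid-points
    return h * np.sum(f(xbar))

exact = 1.0 - np.cos(1.0)                        # integral of sin(x) on [0, 1]
for n in [10, 20, 40, 80]:
    err = abs(midpoint(np.sin, 0.0, 1.0, n) - exact)
    print(n, err)                                # error falls roughly as 1/n**2
```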

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae

Trapezoidal rule
Approximating function f (x) with a linear interpolation,
Z xi
h
f (x)dx [f (xi 1 ) + f (xi )]
2
xi 1
and
Z

#
n1
X
1
1
f (x)dx h f (x0 ) +
f (xi ) + f (xn ) .
2
2
"

i =1

749,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae

750,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Trapezoidal rule
Approximating function f(x) with a linear interpolation,

∫_{x_{i−1}}^{x_i} f(x) dx ≈ (h/2) [f(x_{i−1}) + f(x_i)]

and

∫_a^b f(x) dx ≈ h [ ½ f(x0) + Σ_{i=1}^{n−1} f(x_i) + ½ f(xn) ].

Taylor series expansions about the mid-point:

f(x_{i−1}) = f(x̄_i) − (h/2) f'(x̄_i) + (h²/8) f''(x̄_i) − (h³/48) f'''(x̄_i) + (h⁴/384) f^{iv}(x̄_i) − ⋯
f(x_i)     = f(x̄_i) + (h/2) f'(x̄_i) + (h²/8) f''(x̄_i) + (h³/48) f'''(x̄_i) + (h⁴/384) f^{iv}(x̄_i) + ⋯

⇒ (h/2) [f(x_{i−1}) + f(x_i)] = h f(x̄_i) + (h³/8) f''(x̄_i) + (h⁵/384) f^{iv}(x̄_i) + ⋯

Recall ∫_{x_{i−1}}^{x_i} f(x) dx = h f(x̄_i) + (h³/24) f''(x̄_i) + (h⁵/1920) f^{iv}(x̄_i) + ⋯.

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae

751,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Error estimate of trapezoidal rule


Z xi
h3
h5 iv
h
xi )
f (
xi ) +
f (x)dx = [f (xi 1 ) + f (xi )] f (
2
12
480
xi 1
Over an extended domain,
"
#
Z b
n1
X
1
h2
f (x)dx = h {f (x0 ) + f (xn )} +
f (xi ) (ba)f ()+ .
2
12
a
i =1

The same order of accuracy as the mid-point rule!

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

Newton-Cotes Integration Formulae

Error estimate of trapezoidal rule

∫_{x_{i−1}}^{x_i} f(x) dx = (h/2) [f(x_{i−1}) + f(x_i)] − (h³/12) f''(x̄_i) − (h⁵/480) f^{iv}(x̄_i) − ⋯

Over an extended domain,

∫_a^b f(x) dx = h [ ½ {f(x0) + f(xn)} + Σ_{i=1}^{n−1} f(x_i) ] − (h²/12)(b − a) f''(ξ) + ⋯.

The same order of accuracy as the mid-point rule!

Different sources of merit

Mid-point rule: Use of mid-point leads to symmetric

error-cancellation.

Trapezoidal rule: Use of end-points allows double utilization

of boundary points in adjacent intervals.

How to use both the merits?

752,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Mathematical Methods in Engineering and Science

Newton-Cotes Integration Formulae

Basic Methods of Numerical Integration

Simpsons rules
Divide [a, b] into an even number (n = 2m) of intervals.
Fit a quadratic polynomial over a panel of two intervals.
For this panel of length 2h, two estimates:
M(f ) = 2hf (xi ) and T (f ) = h[f (xi 1 ) + f (xi +1 )]
h5
h3
f (xi ) + f iv (xi ) +
3
60
h5
2h3
f (xi ) f iv (xi ) +
J = T (f )
3
15
J = M(f ) +

753,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Mathematical Methods in Engineering and Science

Newton-Cotes Integration Formulae

Basic Methods of Numerical Integration

754,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Simpsons rules
Divide [a, b] into an even number (n = 2m) of intervals.
Fit a quadratic polynomial over a panel of two intervals.
For this panel of length 2h, two estimates:
M(f ) = 2hf (xi ) and T (f ) = h[f (xi 1 ) + f (xi +1 )]
h5
h3
f (xi ) + f iv (xi ) +
3
60
h5
2h3
f (xi ) f iv (xi ) +
J = T (f )
3
15
Simpsons one-third rule (with error estimate):
Z xi +1
h
h5
f (x)dx = [f (xi 1 ) + 4f (xi ) + f (xi +1 )] f iv (xi )
3
90
xi 1
J = M(f ) +

Fifth (not fourth) order accurate!

Mathematical Methods in Engineering and Science

Newton-Cotes Integration Formulae

Basic Methods of Numerical Integration

755,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Simpson's rules
Divide [a, b] into an even number (n = 2m) of intervals.
Fit a quadratic polynomial over a panel of two intervals.
For this panel of length 2h, two estimates:

M(f) = 2h f(x_i)   and   T(f) = h [f(x_{i−1}) + f(x_{i+1})]

J = M(f) + (h³/3) f''(x_i) + (h⁵/60) f^{iv}(x_i) + ⋯
J = T(f) − (2h³/3) f''(x_i) − (h⁵/15) f^{iv}(x_i) + ⋯

Simpson's one-third rule (with error estimate):

∫_{x_{i−1}}^{x_{i+1}} f(x) dx = (h/3) [f(x_{i−1}) + 4 f(x_i) + f(x_{i+1})] − (h⁵/90) f^{iv}(x_i)

Fifth (not fourth) order accurate!

A four-point rule: Simpson's three-eighth rule
Still higher order rules NOT advisable!
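A composite Simpson sketch (assumed integrand) showing the jump in accuracy over the trapezoidal rule at the same step size:

```python
import numpy as np

def trapezoid(f, a, b, n):
    x = np.linspace(a, b, n + 1)
    return (b - a) / n * (0.5 * f(x[0]) + np.sum(f(x[1:-1])) + 0.5 * f(x[-1]))

def simpson(f, a, b, n):                 # n must be even
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    return h / 3.0 * (f(x[0]) + 4.0 * np.sum(f(x[1:-1:2]))
                      + 2.0 * np.sum(f(x[2:-1:2])) + f(x[-1]))

exact = 1.0 - np.cos(1.0)
for n in [4, 8, 16]:
    print(n, abs(trapezoid(np.sin, 0.0, 1.0, n) - exact),
             abs(simpson(np.sin, 0.0, 1.0, n) - exact))
```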

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

756,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg
Integration
Richardson
Extrapolation and Romberg Integration
Further Issues

To determine quantity F
using a step size h, estimate F (h)
error terms: h p , h q , h r etc (p < q < r )
F = lim0 F ()?
plot F (h), F (h), F (2 h) (with < 1) and extrapolate?

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

757,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg
Integration
Richardson
Extrapolation and Romberg Integration
Further Issues

To determine quantity F
using a step size h, estimate F (h)
error terms: h p , h q , h r etc (p < q < r )
F = lim0 F ()?
plot F (h), F (h), F (2 h) (with < 1) and extrapolate?
F (h) = F + chp + O(hq )

F (h) = F + c(h)p + O(hq )

F (2 h) = F + c(2 h)p + O(hq )

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

758,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg
Integration
Richardson
Extrapolation and Romberg Integration
Further Issues

To determine quantity F
using a step size h, estimate F (h)
error terms: h p , h q , h r etc (p < q < r )
F = lim0 F ()?
plot F (h), F (h), F (2 h) (with < 1) and extrapolate?
F (h) = F + chp + O(hq )

F (h) = F + c(h)p + O(hq )

F (2 h) = F + c(2 h)p + O(hq )


Eliminate c and determine (better estimates of) F :
F1 (h) =
F1 (h) =

F (h) p F (h)
= F + c1 hq + O(hr )
1 p
F (2 h) p F (h)
= F + c1 (h)q + O(hr )
1 p

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

759,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg
Integration
Richardson
Extrapolation and Romberg Integration
Further Issues

To determine quantity F
using a step size h, estimate F (h)
error terms: h p , h q , h r etc (p < q < r )
F = lim0 F ()?
plot F (h), F (h), F (2 h) (with < 1) and extrapolate?
F (h) = F + chp + O(hq )

1
2
4

F (h) = F + c(h)p + O(hq )

F (2 h) = F + c(2 h)p + O(hq )

Eliminate c and determine (better estimates of) F :


3

F1 (h) =
F1 (h) =

F (h) p F (h)
= F + c1 hq + O(hr )
1 p
F (2 h) p F (h)
= F + c1 (h)q + O(hr )
1 p

Still better estimate:

F2 (h) =

F1 (h)q F1 (h)
1q

= F + O(hr )

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

760,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg
Integration
Richardson
Extrapolation and Romberg Integration
Further Issues

To determine quantity F
using a step size h, estimate F(h)
error terms: h^p, h^q, h^r etc (p < q < r)
F = lim_{h→0} F(h)?
plot F(h), F(αh), F(α²h) (with α < 1) and extrapolate?

F(h)   = F + c h^p + O(h^q)
F(αh)  = F + c (αh)^p + O(h^q)
F(α²h) = F + c (α²h)^p + O(h^q)

Eliminate c and determine (better estimates of) F:

F1(h)  = [F(αh) − α^p F(h)] / (1 − α^p)   = F + c1 h^q + O(h^r)
F1(αh) = [F(α²h) − α^p F(αh)] / (1 − α^p) = F + c1 (αh)^q + O(h^r)

Still better estimate:

F2(h) = [F1(αh) − α^q F1(h)] / (1 − α^q) = F + O(h^r)

Richardson extrapolation

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

761,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg
Integration
Richardson
Extrapolation and Romberg Integration
Further Issues

Trapezoidal rule for J =

Rb
a

f (x)dx: p = 2, q = 4, r = 6 etc

T (f ) = J + ch2 + dh4 + eh6 +


With = 12 , half the sum available for successive levels.

Mathematical Methods in Engineering and Science

Basic Methods of Numerical Integration

762,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg
Integration
Richardson
Extrapolation and Romberg Integration
Further Issues

Trapezoidal rule for J = ∫_a^b f(x) dx:  p = 2, q = 4, r = 6 etc

T(f) = J + c h² + d h⁴ + e h⁶ + ⋯

With α = ½, half the sum is available for successive levels.
Romberg integration
Trapezoidal rule with h = H: find J11.
With h = H/2, find J12.

J22 = [J12 − (1/2)² J11] / [1 − (1/2)²] = (4 J12 − J11) / 3.

If |J22 − J12| is within tolerance, STOP. Accept J ≈ J22.

With h = H/4, find J13.

J23 = (4 J13 − J12) / 3   and   J33 = [J23 − (1/2)⁴ J22] / [1 − (1/2)⁴] = (16 J23 − J22) / 15.

If |J33 − J23| is within tolerance, STOP with J ≈ J33.
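A compact sketch of this Romberg scheme (assumed integrand; the helper romberg is hypothetical), reusing previously computed function values each time h is halved:

```python
import numpy as np

def romberg(f, a, b, max_levels=6, tol=1e-10):
    R = [[0.5 * (b - a) * (f(a) + f(b))]]             # J11: trapezoidal rule with h = b - a
    for k in range(1, max_levels):
        h = (b - a) / 2**k
        new_x = a + h * np.arange(1, 2**k, 2)         # only the newly introduced points
        row = [0.5 * R[k-1][0] + h * np.sum(f(new_x))]
        for m in range(1, k + 1):                     # Richardson extrapolation along the row
            row.append((4**m * row[m-1] - R[k-1][m-1]) / (4**m - 1))
        if abs(row[-1] - row[-2]) < tol:
            return row[-1]
        R.append(row)
    return R[-1][-1]

print(romberg(np.sin, 0.0, 1.0), 1.0 - np.cos(1.0))   # agree to full tolerance
```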

Mathematical Methods in Engineering and Science

Further Issues

Basic Methods of Numerical Integration

Featured functions: adaptive quadrature

763,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

With prescribed tolerance , assign quota i =


error to every interval [xi 1 , xi ].

(xi xi 1 )
ba

For each interval, find two estimates of the integral and


estimate the error.

If error estimate is not within quota, then subdivide.

of

Mathematical Methods in Engineering and Science

Further Issues

Basic Methods of Numerical Integration

Featured functions: adaptive quadrature

764,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

With prescribed tolerance , assign quota i =


error to every interval [xi 1 , xi ].

(xi xi 1 )
ba

For each interval, find two estimates of the integral and


estimate the error.

If error estimate is not within quota, then subdivide.

of

Function as tabulated data

Only trapezoidal rule applicable?

Fit a spline over data points and integrate the segments?

Mathematical Methods in Engineering and Science

Further Issues

Basic Methods of Numerical Integration

Featured functions: adaptive quadrature

With prescribed tolerance ε, assign quota δ_i = ε (x_i − x_{i−1}) / (b − a)

of error to every interval [x_{i−1}, x_i].

For each interval, find two estimates of the integral and

estimate the error.

If error estimate is not within quota, then subdivide.

Function as tabulated data

Only trapezoidal rule applicable?

Fit a spline over data points and integrate the segments?

Improper integral: Newton-Cotes closed formulae not applicable!

Open Newton-Cotes formulae

Gaussian quadrature
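A recursive adaptive-Simpson sketch along these lines (an assumed illustration, not the text's specific algorithm): each interval gets its share of the error quota and is split whenever two estimates of its integral disagree by more than that share.

```python
import numpy as np

def adaptive_simpson(f, a, b, eps):
    def simpson(a, b):
        c = 0.5 * (a + b)
        return (b - a) / 6.0 * (f(a) + 4.0 * f(c) + f(b))

    def recurse(a, b, quota, whole):
        c = 0.5 * (a + b)
        left, right = simpson(a, c), simpson(c, b)
        if abs(left + right - whole) < 15.0 * quota:   # error estimate within this interval's quota
            return left + right
        return recurse(a, c, quota / 2.0, left) + recurse(c, b, quota / 2.0, right)

    return recurse(a, b, eps, simpson(a, b))

print(adaptive_simpson(lambda x: np.sqrt(x), 0.0, 1.0, 1e-8))   # ~2/3; refinement clusters near x = 0
```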

765,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Mathematical Methods in Engineering and Science

Points to note

Basic Methods of Numerical Integration

Definition of an integral and integrability

Closed Newton-Cotes formulae and their error estimates

Richardson extrapolation as a general technique

Romberg integration

Adaptive quadrature

Necessary Exercises: 1,2,3,4

766,

Newton-Cotes Integration Formulae


Richardson Extrapolation and Romberg Integration
Further Issues

Mathematical Methods in Engineering and Science

Advanced Topics in Numerical Integration*

Outline

Advanced Topics in Numerical Integration*


Gaussian Quadrature
Multiple Integrals

Gaussian Quadrature
Multiple Integrals

767,

Mathematical Methods in Engineering and Science

Gaussian Quadrature

Advanced Topics in Numerical Integration*


Gaussian Quadrature
Multiple Integrals

P
A typical quadrature formula: a weighted sum ni=0 wi fi
fi : function value at i -th sampled point
wi : corresponding weight
Newton-Cotes formulae:
Abscissas (xi s) of sampling prescribed
Coefficients or weight values determined to eliminate
dominant error terms

768,

Mathematical Methods in Engineering and Science

Gaussian Quadrature

Advanced Topics in Numerical Integration*


Gaussian Quadrature
Multiple Integrals

P
A typical quadrature formula: a weighted sum ni=0 wi fi
fi : function value at i -th sampled point
wi : corresponding weight
Newton-Cotes formulae:
Abscissas (xi s) of sampling prescribed
Coefficients or weight values determined to eliminate
dominant error terms
Gaussian quadrature rules:
no prescription of quadrature points
only the number of quadrature points prescribed
locations as well as weights contribute to the accuracy criteria

769,

Mathematical Methods in Engineering and Science

Gaussian Quadrature

Advanced Topics in Numerical Integration*


Gaussian Quadrature
Multiple Integrals

P
A typical quadrature formula: a weighted sum ni=0 wi fi
fi : function value at i -th sampled point
wi : corresponding weight
Newton-Cotes formulae:
Abscissas (xi s) of sampling prescribed
Coefficients or weight values determined to eliminate
dominant error terms
Gaussian quadrature rules:
no prescription of quadrature points
only the number of quadrature points prescribed
locations as well as weights contribute to the accuracy criteria
with n integration points, 2n degrees of freedom
can be made exact for polynomials of degree up to 2n 1

770,

Mathematical Methods in Engineering and Science

Gaussian Quadrature

Advanced Topics in Numerical Integration*


Gaussian Quadrature
Multiple Integrals

P
A typical quadrature formula: a weighted sum ni=0 wi fi
fi : function value at i -th sampled point
wi : corresponding weight
Newton-Cotes formulae:
Abscissas (xi s) of sampling prescribed
Coefficients or weight values determined to eliminate
dominant error terms
Gaussian quadrature rules:
no prescription of quadrature points
only the number of quadrature points prescribed
locations as well as weights contribute to the accuracy criteria
with n integration points, 2n degrees of freedom
can be made exact for polynomials of degree up to 2n 1
best locations: interior points
open quadrature rules: can handle integrable singularities

771,

Mathematical Methods in Engineering and Science

Gaussian Quadrature

Advanced Topics in Numerical Integration*


Gaussian Quadrature
Multiple Integrals

Gauss-Legendre quadrature
Z

f (x)dx = w1 f (x1 ) + w2 f (x2 )

Four variables: Insist that it is exact for 1, x, x 2 and x 3 .

772,

Mathematical Methods in Engineering and Science

Advanced Topics in Numerical Integration*

Gaussian Quadrature

Gaussian Quadrature
Multiple Integrals

Gauss-Legendre quadrature
Z

f (x)dx = w1 f (x1 ) + w2 f (x2 )

Four variables: Insist that it is exact for 1, x, x 2 and x 3 .


w1 + w2 =
w1 x1 + w2 x2 =
w1 x12 + w2 x22 =
and w1 x13 + w2 x23 =

1
Z 1

1
Z 1
1
1

dx = 2,
xdx = 0,
x 2 dx =

2
3

x 3 dx = 0.

773,

Mathematical Methods in Engineering and Science

Advanced Topics in Numerical Integration*

Gaussian Quadrature

Gaussian Quadrature
Multiple Integrals

Gauss-Legendre quadrature
Z

f (x)dx = w1 f (x1 ) + w2 f (x2 )

Four variables: Insist that it is exact for 1, x, x 2 and x 3 .


w1 + w2 =
w1 x1 + w2 x2 =
w1 x12 + w2 x22 =
and w1 x13 + w2 x23 =
x1 = x2 , w1 = w2

1
Z 1

1
Z 1
1
1

dx = 2,
xdx = 0,
x 2 dx =

2
3

x 3 dx = 0.

774,

Mathematical Methods in Engineering and Science

Advanced Topics in Numerical Integration*

Gaussian Quadrature

Gaussian Quadrature
Multiple Integrals

Gauss-Legendre quadrature

∫_{−1}^{1} f(x) dx = w1 f(x1) + w2 f(x2)

Four variables: Insist that it is exact for 1, x, x² and x³.

w1 + w2             = ∫_{−1}^{1} dx    = 2,
w1 x1 + w2 x2       = ∫_{−1}^{1} x dx  = 0,
w1 x1² + w2 x2²     = ∫_{−1}^{1} x² dx = 2/3,
and w1 x1³ + w2 x2³ = ∫_{−1}^{1} x³ dx = 0.

⇒ x1 = −x2, w1 = w2  ⇒  w1 = w2 = 1,  x1 = −1/√3,  x2 = 1/√3

775,

Two-point Gauss-Legendre quadrature formula

    ∫_{-1}^{1} f(x) dx = f(−1/√3) + f(1/√3)

Exact for any cubic polynomial: parallels Simpson's rule!

Three-point quadrature rule along similar lines:

    ∫_{-1}^{1} f(x) dx = (5/9) f(−√(3/5)) + (8/9) f(0) + (5/9) f(√(3/5))

A large number of formulae: Consult mathematical handbooks.

For domain of integration [a, b],

    x = (a + b)/2 + ((b − a)/2) t    and    dx = ((b − a)/2) dt

With scaling and relocation,

    ∫_a^b f(x) dx = ((b − a)/2) ∫_{-1}^{1} f[x(t)] dt
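To make the rules concrete, here is a minimal sketch (not from the text; the function name is illustrative) that applies the two- or three-point rule on a general interval [a, b] through the mapping above:

```python
import math

def gauss_legendre(f, a, b, n=3):
    """Two- or three-point Gauss-Legendre quadrature on [a, b]."""
    if n == 2:
        nodes, weights = [-1/math.sqrt(3), 1/math.sqrt(3)], [1.0, 1.0]
    elif n == 3:
        r = math.sqrt(3.0/5.0)
        nodes, weights = [-r, 0.0, r], [5.0/9.0, 8.0/9.0, 5.0/9.0]
    else:
        raise ValueError("only n = 2 or 3 tabulated here")
    mid, half = 0.5*(a + b), 0.5*(b - a)     # x = (a+b)/2 + (b-a)/2 * t
    return half * sum(w * f(mid + half*t) for t, w in zip(nodes, weights))

# The 3-point rule is exact up to degree 5: integral of x^5 - x^3 + 1 on [0, 2] is 26/3
print(gauss_legendre(lambda x: x**5 - x**3 + 1, 0.0, 2.0))   # 8.6666...
```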

General Framework for n-point formula

f(x): a polynomial of degree 2n − 1
p(x): Lagrange polynomial through the n quadrature points
f(x) − p(x): a (2n − 1)-degree polynomial having n of its roots at the quadrature points

Then, with φ(x) = (x − x1)(x − x2) ··· (x − xn),

    f(x) − p(x) = φ(x) q(x).

Quotient polynomial: q(x) = Σ_{i=0}^{n−1} ai x^i

Direct integration:

    ∫ f(x) dx = ∫ p(x) dx + ∫ φ(x) [ Σ_{i=0}^{n−1} ai x^i ] dx

How to make the second term vanish?
Choose quadrature points x1, x2, ..., xn so that φ(x) is orthogonal to all polynomials of degree less than n.
  ⟹ Legendre polynomial

Gauss-Legendre quadrature
1. Choose Pn(x), Legendre polynomial of degree n, as φ(x).
2. Take its roots x1, x2, ..., xn as the quadrature points.
3. Fit Lagrange polynomial of f(x), using these n points:

       p(x) = L1(x) f(x1) + L2(x) f(x2) + ··· + Ln(x) f(xn)

4.     ∫_{-1}^{1} f(x) dx = ∫_{-1}^{1} p(x) dx = Σ_{j=1}^{n} f(xj) ∫_{-1}^{1} Lj(x) dx

Weight values: wj = ∫_{-1}^{1} Lj(x) dx,  for j = 1, 2, ..., n
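As a quick check of this construction (an illustrative sketch, not part of the text), NumPy tabulates the Legendre roots and the corresponding weights, so the exactness up to degree 2n − 1 can be verified directly:

```python
import numpy as np

n = 4
x, w = np.polynomial.legendre.leggauss(n)   # roots of P_n(x) and weights w_j

# check exactness for monomials up to degree 2n - 1 on [-1, 1]
for k in range(2*n):
    approx = np.sum(w * x**k)
    exact = 0.0 if k % 2 else 2.0/(k + 1)
    print(k, round(approx, 12), exact)
```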

Weight functions in Gaussian quadrature


What is so great about exact integration of polynomials?

Demand something else: generalization
Exact integration of polynomials times function W(x)

Given weight function W(x) and number (n) of quadrature points, work out the locations (xj's) of the n points and the corresponding weights (wj's), so that the integral

    ∫ W(x) f(x) dx = Σ_{j=1}^{n} wj f(xj)

is exact for an arbitrary polynomial f(x) of degree up to (2n − 1).

A family of orthogonal polynomials with increasing degree:


quadrature points: roots of n-th member of the family.

For different kinds of functions and different domains,

Gauss-Chebyshev quadrature

Gauss-Laguerre quadrature

Gauss-Hermite quadrature

Several singular functions and infinite domains can be handled.



A very special case:
For W (x) = 1, Gauss-Legendre quadrature!

Multiple Integrals

    S = ∫ [ ∫_{g1(x)}^{g2(x)} f(x, y) dy ] dx

    F(x) = ∫_{g1(x)}^{g2(x)} f(x, y) dy    and    S = ∫ F(x) dx,

with complete flexibility of individual quadrature methods.

Double integral on rectangular domain
Two-dimensional version of Simpson's one-third rule:

    ∫_{-1}^{1} ∫_{-1}^{1} f(x, y) dx dy = w0 f(0, 0)
        + w1 [f(1, 0) + f(−1, 0) + f(0, 1) + f(0, −1)]
        + w2 [f(1, 1) + f(1, −1) + f(−1, 1) + f(−1, −1)]

Exact for bicubic functions: w0 = 16/9, w1 = 4/9 and w2 = 1/9.
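A minimal sketch (not from the text) of this nine-point product rule, checked on a bicubic integrand:

```python
def simpson_2d(f):
    """Nine-point product Simpson rule on the square [-1, 1] x [-1, 1]."""
    w0, w1, w2 = 16.0/9.0, 4.0/9.0, 1.0/9.0
    edges = f(1, 0) + f(-1, 0) + f(0, 1) + f(0, -1)
    corners = f(1, 1) + f(1, -1) + f(-1, 1) + f(-1, -1)
    return w0*f(0, 0) + w1*edges + w2*corners

# integral of x^2 y^2 over the square is (2/3)*(2/3) = 4/9
print(simpson_2d(lambda x, y: x*x*y*y))   # 0.444...
```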

Monte Carlo integration

    I = ∫_Ω f(x) dV

Requirements:
  a simple volume V enclosing the domain
  a point classification scheme

Generating random points in V,

    F(x) = f(x) if x ∈ Ω,  0 otherwise.

    I ≈ (V/N) Σ_{i=1}^{N} F(xi)
Estimate of I (usually) improves with increasing N.
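A minimal sketch of the scheme (assuming a 2-D domain; the enclosing box, classification test and integrand below are made up for illustration):

```python
import random

def monte_carlo(f, inside, box, N=100000):
    """Estimate the integral of f over a domain, given an enclosing box and a
    point-classification test 'inside'.  box = ((xlo, xhi), (ylo, yhi))."""
    (xlo, xhi), (ylo, yhi) = box
    V = (xhi - xlo) * (yhi - ylo)          # volume (area) of the enclosing box
    total = 0.0
    for _ in range(N):
        x = random.uniform(xlo, xhi)
        y = random.uniform(ylo, yhi)
        if inside(x, y):                   # F = f inside the domain, 0 outside
            total += f(x, y)
    return V * total / N

# Example: integrate f(x, y) = 1 over the unit disc -> area pi
est = monte_carlo(lambda x, y: 1.0,
                  lambda x, y: x*x + y*y <= 1.0,
                  ((-1.0, 1.0), (-1.0, 1.0)))
print(est)   # close to 3.1416 for large N
```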

Points to note
Basic strategy of Gauss-Legendre quadrature

Formulation of a double integral from fundamental principle

Monte Carlo integration

Necessary Exercises: 2,5,6

Outline

Numerical Solution of Ordinary Differential Equations
  Single-Step Methods
  Practical Implementation of Single-Step Methods
  Systems of ODEs
  Multi-Step Methods*

Single-Step Methods
Initial value problem (IVP) of a first order ODE:

    dy/dx = f(x, y),   y(x0) = y0

To determine: y(x) for x ∈ [a, b] with x0 = a.

Numerical solution: Start from the point (x0, y0).

    y1 = y(x1) = y(x0 + h) = ?

Found (x1, y1). Repeat up to x = b.

Information at how many points is used at every step?
  Single-step method: Only the current value
  Multi-step method: History of several recent steps

Euler's method

At (xn, yn), evaluate slope dy/dx = f(xn, yn).
For a small step h,

    yn+1 = yn + h f(xn, yn)

Repetition of such steps constructs y(x).
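A minimal sketch of the scheme (names are illustrative, not from the text):

```python
def euler(f, x0, y0, h, n_steps):
    """Forward Euler: advance y' = f(x, y) from (x0, y0) in n_steps steps of size h."""
    xs, ys = [x0], [y0]
    x, y = x0, y0
    for _ in range(n_steps):
        y = y + h * f(x, y)      # y_{n+1} = y_n + h f(x_n, y_n)
        x = x + h
        xs.append(x); ys.append(y)
    return xs, ys

# Example: y' = y, y(0) = 1; Euler underestimates e at x = 1 for finite h
xs, ys = euler(lambda x, y: y, 0.0, 1.0, 0.1, 10)
print(ys[-1])   # about 2.594, versus e = 2.718...
```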


First order truncated Taylor's series:
  Expected error: O(h²)
Accumulation over steps ⟹ Total error: O(h)

Euler's method is a first order method.


Question: Total error = Sum of errors over the steps?
Answer: No, in general.

Initial slope for the entire step: is it a good idea?

[Figure: Euler's method]    [Figure: Improved Euler's method]
Improved Euler's method or Heun's method


    yn+1* = yn + h f(xn, yn)
    yn+1 = yn + (h/2) [f(xn, yn) + f(xn+1, yn+1*)]

The order of Heun's method is two.
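A corresponding sketch of Heun's predictor-corrector step (illustrative, not from the text):

```python
def heun(f, x0, y0, h, n_steps):
    """Heun's (improved Euler) method for y' = f(x, y)."""
    x, y = x0, y0
    for _ in range(n_steps):
        y_pred = y + h * f(x, y)                          # Euler predictor
        y = y + 0.5 * h * (f(x, y) + f(x + h, y_pred))    # trapezoidal corrector
        x = x + h
    return y

print(heun(lambda x, y: y, 0.0, 1.0, 0.1, 10))   # about 2.714, closer to e than Euler
```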

Runge-Kutta methods

Second order method:

    k1 = h f(xn, yn),   k2 = h f(xn + αh, yn + βk1)
    k = w1 k1 + w2 k2,   xn+1 = xn + h,   yn+1 = yn + k

Force agreement up to the second order.

    yn+1 = yn + w1 h f(xn, yn) + w2 h [f(xn, yn) + αh fx(xn, yn) + βk1 fy(xn, yn) + ···]
         = yn + (w1 + w2) h f(xn, yn) + h² w2 [α fx(xn, yn) + β f(xn, yn) fy(xn, yn)] + ···

From Taylor's series, using y' = f(x, y) and y'' = fx + f fy,

    y(xn+1) = yn + h f(xn, yn) + (h²/2) [fx(xn, yn) + f(xn, yn) fy(xn, yn)] + ···

    w1 + w2 = 1,   w2 α = w2 β = 1/2   ⟹   α = β = 1/(2 w2),   w1 = 1 − w2
With continuous choice of w2,
  a family of second order Runge-Kutta (RK2) formulae

Popular form of RK2: with choice w2 = 1,

    k1 = h f(xn, yn),   k2 = h f(xn + h/2, yn + k1/2)
    xn+1 = xn + h,   yn+1 = yn + k2
Fourth order Runge-Kutta method (RK4):

    k1 = h f(xn, yn)
    k2 = h f(xn + h/2, yn + k1/2)
    k3 = h f(xn + h/2, yn + k2/2)
    k4 = h f(xn + h, yn + k3)

    k = (1/6)(k1 + 2k2 + 2k3 + k4)
    xn+1 = xn + h,   yn+1 = yn + k
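A minimal sketch of the classical RK4 step (illustrative, not from the text):

```python
def rk4(f, x0, y0, h, n_steps):
    """Classical fourth order Runge-Kutta method for y' = f(x, y)."""
    x, y = x0, y0
    for _ in range(n_steps):
        k1 = h * f(x, y)
        k2 = h * f(x + h/2, y + k1/2)
        k3 = h * f(x + h/2, y + k2/2)
        k4 = h * f(x + h, y + k3)
        y = y + (k1 + 2*k2 + 2*k3 + k4) / 6
        x = x + h
    return y

print(rk4(lambda x, y: y, 0.0, 1.0, 0.1, 10))   # about 2.71828, very close to e
```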

Practical Implementation of Single-Step Methods

Question: How to decide whether the error is within tolerance?

Additional estimates:
  handle to monitor the error
  further efficient algorithms
Runge-Kutta method with adaptive step size
In an interval [xn, xn + h],

    yn+1^(1) = yn+1 + c h^5 + higher order terms

Over two steps of size h/2,

    yn+1^(2) = yn+1 + 2c (h/2)^5 + higher order terms

Difference of two estimates:

    Δ = yn+1^(1) − yn+1^(2) ≈ (15/16) c h^5

Best available value:  yn+1^(2) − Δ/15 = [16 yn+1^(2) − yn+1^(1)] / 15
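An illustrative sketch of this step-doubling estimate built on the RK4 step (the function names and tolerance handling are assumptions, not the book's algorithm):

```python
def rk4_step(f, x, y, h):
    k1 = h * f(x, y)
    k2 = h * f(x + h/2, y + k1/2)
    k3 = h * f(x + h/2, y + k2/2)
    k4 = h * f(x + h, y + k3)
    return y + (k1 + 2*k2 + 2*k3 + k4) / 6

def adaptive_step(f, x, y, h, tol=1e-8):
    """One step-doubling error estimate: full step versus two half steps."""
    y1 = rk4_step(f, x, y, h)                       # one step of size h
    y_half = rk4_step(f, x, y, h/2)
    y2 = rk4_step(f, x + h/2, y_half, h/2)          # two steps of size h/2
    delta = y1 - y2                                  # ~ (15/16) c h^5
    if abs(delta) > tol:
        return None, delta          # too large: caller should subdivide h
    return y2 - delta/15, delta     # Richardson-improved value

# accepted step: returns an improved value close to e^0.5 = 1.6487
print(adaptive_step(lambda x, y: y, 0.0, 1.0, 0.5, tol=1e-3))
```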

Evaluation of a step:
  Δ > ε : Step size is too large for accuracy. Subdivide the interval.
  Δ << ε : Step size is inefficient!

Start with a large step size.
Keep subdividing intervals whenever Δ > ε.
  Fast marching over smooth segments and small steps in zones featured with rapid changes in y(x).

Runge-Kutta-Fehlberg method
With six function values,
  an RK4 formula embedded in an RK5 formula
  ⟹ two independent estimates and an error estimate!
RKF45 in professional implementations

Systems of ODEs

Methods for a single first order ODE
  directly applicable to a first order vector ODE

A typical IVP with an ODE system:

    dy/dx = f(x, y),   y(x0) = y0
An n-th order ODE: convert into a system of first order ODEs
Defining state vector z(x) = [y(x)  y'(x)  ···  y^(n−1)(x)]^T,
work out dz/dx to form the state space equation.

Initial condition: z(x0) = [y(x0)  y'(x0)  ···  y^(n−1)(x0)]^T

A system of higher order ODEs with the highest order derivatives of orders n1, n2, n3, ..., nk:
  cast into the state space form with the state vector of dimension n = n1 + n2 + n3 + ··· + nk

State space formulation is directly applicable when
  the highest order derivatives can be solved explicitly.
The resulting form of the ODEs: normal system of ODEs

Example: a coupled pair of nonlinear ODEs in x(t) and y(t), with highest order derivatives
d²x/dt² and d³y/dt³ (orders 2 and 3).

State vector: z(t) = [x  dx/dt  y  dy/dt  d²y/dt²]^T

With three trivial derivatives z1'(t) = z2, z3'(t) = z4 and z4'(t) = z5,
and the other two obtained from the given ODEs,
we get the state space equations as dz/dt = f(t, z).
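A minimal sketch of the state-space idea for a single second order ODE x'' + 3x' + 2x = 0 (an assumed example), advanced with a vector RK4 step:

```python
import numpy as np

def rk4_system(f, t0, z0, h, n_steps):
    """Vector RK4 for dz/dt = f(t, z), with z a NumPy array."""
    t, z = t0, np.asarray(z0, dtype=float)
    for _ in range(n_steps):
        k1 = h * f(t, z)
        k2 = h * f(t + h/2, z + k1/2)
        k3 = h * f(t + h/2, z + k2/2)
        k4 = h * f(t + h, z + k3)
        z = z + (k1 + 2*k2 + 2*k3 + k4) / 6
        t = t + h
    return z

# x'' + 3x' + 2x = 0 as a normal system with state z = [x, x']
def f(t, z):
    x, v = z
    return np.array([v, -3*v - 2*x])

print(rk4_system(f, 0.0, [0.0, 1.0], 0.01, 100))   # state at t = 1, roughly [0.2325, -0.0972]
```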

Multi-Step Methods*

Single-step methods: every step a brand new IVP!
  Why not try to capture the trend?

A typical multi-step formula:

    yn+1 = yn + h [c0 f(xn+1, yn+1) + c1 f(xn, yn) + c2 f(xn−1, yn−1) + c3 f(xn−2, yn−2) + ···]

Determine coefficients by demanding the exactness for leading polynomial terms.

Explicit methods: c0 = 0, evaluation easy, but involves extrapolation.
Implicit methods: c0 ≠ 0, difficult to evaluate, but better stability.
Predictor-corrector methods
  Example: Adams-Bashforth-Moulton method
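An illustrative sketch, assuming a second order Adams-Bashforth predictor paired with the trapezoidal Adams-Moulton corrector (practical codes usually use the fourth order pair):

```python
def abm2(f, x0, y0, h, n_steps):
    """Two-step Adams-Bashforth predictor + Adams-Moulton (trapezoid) corrector."""
    # bootstrap the second starting value with Heun's method
    y1 = y0 + 0.5*h*(f(x0, y0) + f(x0 + h, y0 + h*f(x0, y0)))
    xs, ys = [x0, x0 + h], [y0, y1]
    for n in range(1, n_steps):
        x_prev, x_cur = xs[n-1], xs[n]
        y_prev, y_cur = ys[n-1], ys[n]
        x_next = x_cur + h
        # predictor: y_{n+1} = y_n + (h/2)(3 f_n - f_{n-1})
        y_pred = y_cur + 0.5*h*(3*f(x_cur, y_cur) - f(x_prev, y_prev))
        # corrector: y_{n+1} = y_n + (h/2)(f(x_{n+1}, y_pred) + f_n)
        y_next = y_cur + 0.5*h*(f(x_next, y_pred) + f(x_cur, y_cur))
        xs.append(x_next); ys.append(y_next)
    return xs, ys

print(abm2(lambda x, y: y, 0.0, 1.0, 0.1, 10)[1][-1])   # approximates e = 2.718...
```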

Points to note

Euler's and Runge-Kutta methods
Step size adaptation
State space formulation of dynamic systems

Necessary Exercises: 1,2,5,6
Outline

ODE Solutions: Advanced Issues
  Stability Analysis
  Implicit Methods
  Stiff Differential Equations
  Boundary Value Problems

Stability Analysis
Adaptive RK4 is an extremely successful method.
  But, its scope has a limitation.
Focus of explicit methods (such as RK) is accuracy and efficiency.
  The issue of stability is handled indirectly.

Stability of explicit methods
For the ODE system y' = f(x, y), Euler's method gives

    yn+1 = yn + f(xn, yn) h + O(h²).

Taylor's series of the actual solution:

    y(xn+1) = y(xn) + f(xn, y(xn)) h + O(h²)

Discrepancy or error:

    εn+1 = yn+1 − y(xn+1)
         = [yn − y(xn)] + [f(xn, yn) − f(xn, y(xn))] h + O(h²)
         = εn + (∂f/∂y)(xn, yn) εn h + O(h²) ≈ (I + hJ) εn

Euler's step magnifies the error by a factor (I + hJ).

Using J loosely as the representative Jacobian,

    εn+1 ≈ (I + hJ)^n ε1.

For stability, εn+1 → 0 as n → ∞.
  Eigenvalues of (I + hJ) must fall within the unit circle |z| = 1.
  By shift theorem, eigenvalues of hJ must fall inside the unit circle with the centre at z0 = −1.

    |1 + λh| < 1  ⟹  h < −2 Re(λ)/|λ|²

Note: Same result for single ODE w' = λw, with complex λ.

For second order Runge-Kutta method,

    εn+1 = (1 + λh + λ²h²/2) εn

Region of stability in the plane of z = λh:  |1 + z + z²/2| < 1

[Figure: Stability regions of explicit methods (Euler, RK2, RK4) in the complex λh-plane, axes Re(λh) and Im(λh)]

Question: What do these stability regions mean with reference to the system eigenvalues?
Question: How does the step size adaptation of RK4 operate on a system with eigenvalues on the left half of complex plane?

Step size adaptation tackles instability by its symptom!
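A tiny numerical illustration (not from the text) of the Euler stability bound h < −2 Re(λ)/|λ|² on the scalar test equation:

```python
def euler_decay(lam, h, n_steps, y0=1.0):
    """Apply Euler's method to y' = lam*y and return the final value."""
    y = y0
    for _ in range(n_steps):
        y = y + h*lam*y          # growth factor per step: (1 + lam*h)
    return y

lam = -10.0                       # stable ODE; the bound gives h < 0.2
print(euler_decay(lam, 0.1, 50))  # |1 + lam*h| = 0.0: decays (stable)
print(euler_decay(lam, 0.25, 50)) # |1 + lam*h| = 1.5: blows up (h > 0.2)
```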

Implicit Methods

Backward Euler's method

    yn+1 = yn + f(xn+1, yn+1) h

Solve it?  Is it worth solving?

    εn+1 ≡ yn+1 − y(xn+1)
         = [yn − y(xn)] + h [f(xn+1, yn+1) − f(xn+1, y(xn+1))]
         ≈ εn + h J(xn+1, yn+1) εn+1

Notice the flip in the form of this equation.

    εn+1 ≈ (I − hJ)^{-1} εn

Stability: eigenvalues of (I − hJ) outside the unit circle |z| = 1

    |λh − 1| > 1  ⟺  h > 2 Re(λ)/|λ|²

Absolute stability for a stable ODE, i.e. one with Re(λ) < 0

[Figure: Stability region of backward Euler's method in the complex λh-plane, axes Re(λh) and Im(λh)]

How to solve g(yn+1) = yn + h f(xn+1, yn+1) − yn+1 = 0 for yn+1?

Typical Newton's iteration:

    yn+1^(k+1) = yn+1^(k) + (I − hJ)^{-1} [ yn − yn+1^(k) + h f(xn+1, yn+1^(k)) ]

Semi-implicit Euler's method for local solution:

    yn+1 = yn + h (I − hJ)^{-1} f(xn+1, yn)
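A minimal scalar sketch (illustrative, not from the text) of backward Euler with the Newton iteration written out:

```python
def backward_euler(f, dfdy, x0, y0, h, n_steps, newton_iters=5):
    """Backward Euler for a scalar ODE y' = f(x, y), solving the implicit
    equation g(y) = y_n + h f(x_{n+1}, y) - y = 0 by Newton's iteration."""
    x, y = x0, y0
    for _ in range(n_steps):
        x_next = x + h
        y_next = y                      # initial guess: previous value
        for _ in range(newton_iters):
            g = y + h*f(x_next, y_next) - y_next
            dg = h*dfdy(x_next, y_next) - 1.0
            y_next = y_next - g/dg
        x, y = x_next, y_next
    return y

# Stable for any h on the stiff test problem y' = -1000 y
print(backward_euler(lambda x, y: -1000*y, lambda x, y: -1000.0,
                     0.0, 1.0, 0.1, 10))   # decays toward 0, no blow-up
```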

Stiff Differential Equations

Example: IVP of a mass-spring-damper system:

    x'' + c x' + k x = 0,   x(0) = 0,   x'(0) = 1

(a) c = 3, k = 2:      x = e^{−t} − e^{−2t}
(b) c = 49, k = 600:   x = e^{−24t} − e^{−25t}

[Figure: Solutions of a mass-spring-damper system: ordinary situations.
 (a) Case of c = 3, k = 2;  (b) Case of c = 49, k = 600]

(c) c = 302, k = 600:  x = (e^{−2t} − e^{−300t}) / 298

[Figure: Solutions of a mass-spring-damper system: stiff situation.
 (c) With RK4;  (d) With implicit Euler]

To solve stiff ODE systems,
  use implicit method, preferably with explicit Jacobian.
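As an illustrative sketch (assumptions: the linear state-space form of case (c) and the semi-implicit Euler update quoted earlier), the implicit factor (I − hJ)^{-1} keeps the step bounded where explicit Euler with the same h would blow up:

```python
import numpy as np

# Stiff linear system from case (c): z = [x, x'], z' = A z, eigenvalues -2 and -300
A = np.array([[0.0, 1.0], [-600.0, -302.0]])

def semi_implicit_euler(A, z0, h, n_steps):
    """z_{n+1} = z_n + h (I - hJ)^{-1} f(z_n), with f(z) = A z and J = A."""
    I = np.eye(len(z0))
    M = np.linalg.inv(I - h*A)      # constant for a linear, autonomous system
    z = np.asarray(z0, dtype=float)
    for _ in range(n_steps):
        z = z + h * (M @ (A @ z))
    return z

# h = 0.01 makes explicit Euler unstable (|1 + h*lambda| = 2 for lambda = -300),
# but the semi-implicit step stays bounded.
print(semi_implicit_euler(A, [0.0, 1.0], 0.01, 100))   # state near t = 1
```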

Boundary Value Problems

A paradigm shift from the initial value problems
  A ball is thrown with a particular velocity. What trajectory does the ball follow?
  How to throw a ball such that it hits a particular window at a neighbouring house after 15 seconds?

Two-point BVP in ODEs:
  boundary conditions at two values of the independent variable

Methods of solution
  Shooting method
  Finite difference (relaxation) method
  Finite element method

Shooting method
  follows the strategy to adjust trials to hit a target.
Consider the 2-point BVP

    y' = f(x, y),   g1(y(a)) = 0,   g2(y(b)) = 0,

where g1 ∈ R^{n1}, g2 ∈ R^{n2} and n1 + n2 = n.

Parametrize initial state: y(a) = h(p) with p ∈ R^{n2}.
  Guess n2 values of p to define the IVP
      y' = f(x, y),   y(a) = h(p).
  Solve this IVP for [a, b] and evaluate y(b).
  Define error vector E(p) = g2(y(b)).

Objective: To solve E(p) = 0
  From the current vector p, n2 perturbations p + ε ei give the Jacobian ∂E/∂p.
  Each Newton's step: solution of n2 + 1 initial value problems!

  Computational cost
  Convergence not guaranteed (initial guess important)

Merits of shooting method
  Very few parameters to start
  In many cases, it is found quite efficient.
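A minimal sketch (not from the text) for the common special case of a single second order ODE with its value prescribed at both ends; the secant update below stands in for the Newton step on E(p) described above:

```python
import math

def shoot(f, a, b, ya, yb, p0, p1, h=0.01, iters=20, tol=1e-10):
    """Shooting method for y'' = f(x, y, y') with y(a) = ya, y(b) = yb.
    The unknown initial slope p = y'(a) is adjusted by secant iteration."""
    steps = max(1, int(round((b - a)/h)))
    dx = (b - a)/steps

    def end_value(p):
        # integrate the IVP with RK4 on the state (y, y')
        def F(x, y, v):
            return v, f(x, y, v)
        x, y, v = a, ya, p
        for _ in range(steps):
            k1y, k1v = F(x, y, v)
            k2y, k2v = F(x + dx/2, y + dx*k1y/2, v + dx*k1v/2)
            k3y, k3v = F(x + dx/2, y + dx*k2y/2, v + dx*k2v/2)
            k4y, k4v = F(x + dx, y + dx*k3y, v + dx*k3v)
            y += dx*(k1y + 2*k2y + 2*k3y + k4y)/6
            v += dx*(k1v + 2*k2v + 2*k3v + k4v)/6
            x += dx
        return y

    E0, E1 = end_value(p0) - yb, end_value(p1) - yb
    for _ in range(iters):
        if abs(E1) < tol:
            break
        p2 = p1 - E1*(p1 - p0)/(E1 - E0)    # secant step on E(p) = 0
        p0, E0 = p1, E1
        p1, E1 = p2, end_value(p2) - yb
    return p1

# y'' = -y, y(0) = 0, y(pi/2) = 1  =>  y = sin(x), so the slope sought is y'(0) = 1
print(shoot(lambda x, y, v: -y, 0.0, math.pi/2, 0.0, 1.0, 0.0, 2.0))
```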

Finite difference (relaxation) method
  adopts a global perspective.

1. Discretize domain [a, b]: grid of points
       a = x0 < x1 < x2 < ··· < x_{N−1} < x_N = b.
   Function values y(xi): n(N + 1) unknowns
2. Replace the ODE over intervals by finite difference equations.
   Considering mid-points, a typical (vector) FDE:

       yi − yi−1 − h f( (xi + xi−1)/2, (yi + yi−1)/2 ) = 0,   for i = 1, 2, 3, ..., N

   ⟹ nN (scalar) equations
3. Assemble additional n equations from boundary conditions.
4. Starting from a guess solution over the grid, solve this system.
   (Sparse Jacobian is an advantage.)

Iterative schemes for solution of systems of linear equations.
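An illustrative sketch for a linear special case (assumed example y'' = −y with values fixed at both ends), where the finite-difference equations form a tridiagonal linear system; central differences are used here in place of the midpoint form above:

```python
import numpy as np

# Finite-difference solution of the linear BVP  y'' = -y,  y(0) = 0, y(pi/2) = 1
# (exact solution: y = sin x).  Central differences give a tridiagonal system.
N = 50
a, b = 0.0, np.pi/2
x = np.linspace(a, b, N + 1)
h = x[1] - x[0]

# interior unknowns y_1 ... y_{N-1}:  (y_{i-1} - 2 y_i + y_{i+1})/h^2 + y_i = 0
A = np.zeros((N - 1, N - 1))
rhs = np.zeros(N - 1)
for i in range(N - 1):
    A[i, i] = -2.0/h**2 + 1.0
    if i > 0:
        A[i, i - 1] = 1.0/h**2
    if i < N - 2:
        A[i, i + 1] = 1.0/h**2
rhs[-1] = -1.0/h**2      # boundary value y(b) = 1 moved to the right-hand side

y_interior = np.linalg.solve(A, rhs)
print(np.max(np.abs(y_interior - np.sin(x[1:-1]))))   # small discretization error
```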

Points to note

Numerical stability of ODE solution methods
Computational cost versus better stability of implicit methods
Multiscale responses leading to stiffness: failure of explicit methods
Implicit methods for stiff systems
Shooting method for two-point boundary value problems
Relaxation method for boundary value problems

Necessary Exercises: 1,2,3,4,5

Outline

Existence and Uniqueness Theory
  Well-Posedness of Initial Value Problems
  Uniqueness Theorems
  Extension to ODE Systems
  Closure

Well-Posedness of Initial Value Problems
Pierre Simon de Laplace (1749-1827):
We may regard the present state of the
universe as the effect of its past and the
cause of its future. An intellect which at a
certain moment would know all forces that
set nature in motion, and all positions of all
items of which nature is composed, if this
intellect were also vast enough to submit
these data to analysis, it would embrace in a
single formula the movements of the greatest
bodies of the universe and those of the
tiniest atom; for such an intellect nothing
would be uncertain and the future just like
the past would be present before its eyes.

Initial value problem

    y' = f(x, y),   y(x0) = y0

From (x, y), the trajectory develops according to y' = f(x, y).
  The new point: (x + Δx, y + f(x, y) Δx)
  The slope now: f(x + Δx, y + f(x, y) Δx)
Question: Was the old direction of approach valid?

With Δx → 0, directions appropriate, if

    lim_{x→x̄} f(x, y) = f(x̄, y(x̄)),

i.e. if f(x, y) is continuous.

If f(x, y) = ∞, then y' = ∞ and the trajectory is vertical.
  For the same value of x, several values of y!
y(x) not a function, unless f(x, y) ≠ ∞, i.e. f(x, y) is bounded.

Peano's theorem: If f(x, y) is continuous and bounded in a rectangle
R = {(x, y) : |x − x0| < h, |y − y0| < k}, with |f(x, y)| ≤ M < ∞,
then the IVP y' = f(x, y), y(x0) = y0 has a solution y(x) defined in a
neighbourhood of x0.
[Figure: Regions containing the trajectories: (a) Mh ≤ k; (b) Mh ≥ k]

Guaranteed neighbourhood: [x0 − δ, x0 + δ], where δ = min(h, k/M) > 0

Example:

    y' = (y − 1)/x,   y(0) = 1

Function f(x, y) = (y − 1)/x undefined at (0, 1).
  Premises of existence theorem not satisfied.
  But, premises here are sufficient, not necessary! Result inconclusive.
The IVP has solutions: y(x) = 1 + cx for all values of c.
  The solution is not unique.

Example: y'² = |y|, y(0) = 0
  Existence theorem guarantees a solution.
  But, there are two solutions:
    y(x) = 0 and y(x) = sgn(x) x²/4.

Physical system to mathematical model
  Mathematical solution
  Interpretation about the physical system

Meanings of non-uniqueness of a solution
  Mathematical model admits of extraneous solution(s)?
  Physical system itself can exhibit alternative behaviours?

Indeterminacy of the solution
  Mathematical model of the system is not complete.
  The initial value problem is not well-posed.

After existence, next important question:
  Uniqueness of a solution

Continuous dependence on initial condition

Suppose that for IVP y' = f(x, y), y(x0) = y0,
  unique solution: y1(x).
Applying a small perturbation to the initial condition, the new IVP:

    y' = f(x, y),   y(x0) = y0 + δ

  unique solution: y2(x)

Question: By how much does y2(x) differ from y1(x) for x > x0?
  Large difference: solution sensitive to initial condition
  Practically unreliable solution

Well-posed IVP:
  An initial value problem is said to be well-posed if there exists a solution to it,
  the solution is unique and it depends continuously on the initial conditions.

Uniqueness Theorems

Lipschitz condition:

    |f(x, y) − f(x, z)| ≤ L |y − z|

L: finite positive constant (Lipschitz constant)

Theorem: If f(x, y) is a continuous function satisfying a Lipschitz condition on a strip
S = {(x, y) : a < x < b, −∞ < y < ∞}, then for any point (x0, y0) ∈ S, the initial value
problem of y' = f(x, y), y(x0) = y0 is well-posed.

Assume y1(x) and y2(x): solutions of the ODE y' = f(x, y) with initial conditions
y(x0) = (y1)0 and y(x0) = (y2)0.
Consider E(x) = [y1(x) − y2(x)]².

    E'(x) = 2(y1 − y2)(y1' − y2') = 2(y1 − y2)[f(x, y1) − f(x, y2)]

Applying Lipschitz condition,

    |E'(x)| ≤ 2L (y1 − y2)² = 2L E(x).

Need to consider the case of E'(x) ≥ 0 only.

Mathematical Methods in Engineering and Science

Existence and Uniqueness Theory

Uniqueness Theorems
E (x)
2L
E (x)

Well-Posedness of Initial Value Problems


Uniqueness Theorems
Extension to ODE Systems
Closure

x0

E (x)
dx 2L(x x0 )
E (x)

Integrating, E (x) E (x0 )e 2L(xx0 ) .

880,

Mathematical Methods in Engineering and Science

Existence and Uniqueness Theory

Uniqueness Theorems
E (x)
2L
E (x)

Well-Posedness of Initial Value Problems


Uniqueness Theorems
Extension to ODE Systems
Closure

x0

E (x)
dx 2L(x x0 )
E (x)

Integrating, E (x) E (x0 )e 2L(xx0 ) .


Hence,
|y1 (x) y2 (x)| e L(xx0 ) |(y1 )0 (y2 )0 |.

881,

Mathematical Methods in Engineering and Science

Existence and Uniqueness Theory

Uniqueness Theorems
E (x)
2L
E (x)

Well-Posedness of Initial Value Problems


Uniqueness Theorems
Extension to ODE Systems
Closure

x0

E (x)
dx 2L(x x0 )
E (x)

Integrating, E (x) E (x0 )e 2L(xx0 ) .


Hence,
|y1 (x) y2 (x)| e L(xx0 ) |(y1 )0 (y2 )0 |.
Since x [a, b], e L(xx0 ) is finite.
|(y1 )0 (y2 )0 | = |y1 (x) y2 (x)| e L(xx0 )
continuous dependence of the solution on initial condition

882,

Mathematical Methods in Engineering and Science

Existence and Uniqueness Theory

Uniqueness Theorems
E (x)
2L
E (x)

Well-Posedness of Initial Value Problems


Uniqueness Theorems
Extension to ODE Systems
Closure

x0

E (x)
dx 2L(x x0 )
E (x)

Integrating, E (x) E (x0 )e 2L(xx0 ) .


Hence,
|y1 (x) y2 (x)| e L(xx0 ) |(y1 )0 (y2 )0 |.
Since x [a, b], e L(xx0 ) is finite.
|(y1 )0 (y2 )0 | = |y1 (x) y2 (x)| e L(xx0 )
continuous dependence of the solution on initial condition
In particular, (y1 )0 = (y2 )0 = y0 y1 (x) = y2 (x) x [a, b].
The initial value problem is well-posed.

883,
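The exponential bound just derived can be checked numerically. The sketch below is illustrative only (not from the text): it picks the sample rate function $f(x, y) = \cos y$, which satisfies a Lipschitz condition with $L = 1$, integrates two IVPs whose initial values differ by a perturbation eps using a small hand-rolled RK4 helper, and compares the observed separation with the bound $\epsilon\, e^{L(x - x_0)}$.

```python
# Sketch (not from the text): numerically checking the continuous-dependence
# bound |y1(x) - y2(x)| <= e^{L(x - x0)} |(y1)_0 - (y2)_0| for a sample ODE.
# Here f(x, y) = cos(y), which satisfies a Lipschitz condition with L = 1.
import numpy as np

def f(x, y):
    return np.cos(y)

def rk4(f, x0, y0, x_end, n=1000):
    """Classical 4th-order Runge-Kutta integration of y' = f(x, y)."""
    xs = np.linspace(x0, x_end, n + 1)
    h = xs[1] - xs[0]
    ys = np.empty_like(xs)
    ys[0] = y0
    for i in range(n):
        x, y = xs[i], ys[i]
        k1 = f(x, y)
        k2 = f(x + h/2, y + h*k1/2)
        k3 = f(x + h/2, y + h*k2/2)
        k4 = f(x + h, y + h*k3)
        ys[i+1] = y + h*(k1 + 2*k2 + 2*k3 + k4)/6
    return xs, ys

x0, y0, eps, L = 0.0, 0.3, 1e-3, 1.0
xs, y1 = rk4(f, x0, y0, 2.0)
_,  y2 = rk4(f, x0, y0 + eps, 2.0)

diff  = np.abs(y1 - y2)
bound = eps * np.exp(L * (xs - x0))
print("max |y1 - y2|      :", diff.max())
print("bound at same point:", bound[np.argmax(diff)])
print("bound respected    :", np.all(diff <= bound + 1e-12))
```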

A weaker theorem (hypotheses are stronger):

Picard's theorem: If $f(x, y)$ and $\frac{\partial f}{\partial y}$ are continuous and
bounded on a rectangle $R = \{(x, y) : a < x < b,\ c < y < d\}$, then for every
$(x_0, y_0) \in R$, the IVP $y' = f(x, y)$, $y(x_0) = y_0$ has a
unique solution in some neighbourhood $|x - x_0| \le h$.

From the mean value theorem,
    $f(x, y_1) - f(x, y_2) = \frac{\partial f}{\partial y}(x, \xi)\, (y_1 - y_2)$.
With Lipschitz constant $L = \sup \left| \frac{\partial f}{\partial y} \right|$,
the Lipschitz condition is satisfied lavishly!

Note: All these theorems give only sufficient conditions!

Hypotheses of Picard's theorem $\Rightarrow$ Lipschitz condition
    $\Rightarrow$ Well-posedness $\Rightarrow$ Existence and uniqueness
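A minimal numerical sanity check of the mean-value-theorem argument, assuming the illustrative rate function $f(x, y) = x + \sin y$ (so $\partial f/\partial y = \cos y$ and $L = 1$); the random sampling below is purely a demonstration, not part of the text.

```python
# Sketch (illustrative): with f(x, y) = x + sin(y), df/dy = cos(y) is bounded
# by 1, so L = sup|df/dy| = 1 should satisfy |f(x,y)-f(x,z)| <= L|y-z|.
import numpy as np

def f(x, y):
    return x + np.sin(y)

rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 1.0, 10000)
ya = rng.uniform(-5.0, 5.0, 10000)
yb = rng.uniform(-5.0, 5.0, 10000)

ratio = np.abs(f(xs, ya) - f(xs, yb)) / np.maximum(np.abs(ya - yb), 1e-15)
print("largest observed |f(x,y)-f(x,z)|/|y-z| :", ratio.max())
print("consistent with L = sup|df/dy| = 1     :", ratio.max() <= 1 + 1e-12)
```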

Extension to ODE Systems

For the ODE system
    $\frac{d\mathbf{y}}{dx} = \mathbf{f}(x, \mathbf{y}), \quad \mathbf{y}(x_0) = \mathbf{y}_0$

Lipschitz condition:
    $\|\mathbf{f}(x, \mathbf{y}) - \mathbf{f}(x, \mathbf{z})\| \le L \|\mathbf{y} - \mathbf{z}\|$

Scalar function $E(x)$ generalized as
    $E(x) = \|\mathbf{y}_1(x) - \mathbf{y}_2(x)\|^2 = (\mathbf{y}_1 - \mathbf{y}_2)^T (\mathbf{y}_1 - \mathbf{y}_2)$

Partial derivative $\frac{\partial f}{\partial y}$ replaced by the Jacobian $\mathbf{A} = \frac{\partial \mathbf{f}}{\partial \mathbf{y}}$

Boundedness to be inferred from the boundedness of its norm

With these generalizations, the formulations work as usual.

IVP of linear first order ODE system

    $\mathbf{y}' = \mathbf{A}(x)\mathbf{y} + \mathbf{g}(x), \quad \mathbf{y}(x_0) = \mathbf{y}_0$

Rate function: $\mathbf{f}(x, \mathbf{y}) = \mathbf{A}(x)\mathbf{y} + \mathbf{g}(x)$
    Continuity and boundedness of the coefficient functions
    in $\mathbf{A}(x)$ and $\mathbf{g}(x)$ are sufficient for well-posedness.

An n-th order linear ordinary differential equation
    $y^{(n)} + P_1(x) y^{(n-1)} + P_2(x) y^{(n-2)} + \cdots + P_{n-1}(x) y' + P_n(x) y = R(x)$

State vector: $\mathbf{z} = [y \ \ y' \ \ y'' \ \ \cdots \ \ y^{(n-1)}]^T$

With $z_1' = z_2$, $z_2' = z_3$, $\cdots$, $z_{n-1}' = z_n$ and $z_n'$ from the ODE,
state space equation in the form $\mathbf{z}' = \mathbf{A}(x)\mathbf{z} + \mathbf{g}(x)$

Continuity and boundedness of $P_1(x), P_2(x), \cdots, P_n(x)$
and $R(x)$ guarantee well-posedness.
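As a sketch of this state-space recasting (illustrative choices throughout: the sample equation $y'' + \frac{1}{x} y' + y = x$ on $[1, 2]$, the initial data, and the RK4 helper are not from the text), the second-order equation is written as $\mathbf{z}' = \mathbf{A}(x)\mathbf{z} + \mathbf{g}(x)$ with $\mathbf{z} = [y,\ y']^T$ and integrated numerically; the coefficients are continuous and bounded on $[1, 2]$, so the IVP is well-posed there.

```python
# Sketch (illustrative): recasting a 2nd-order linear ODE in state-space
# form z' = A(x) z + g(x).  Sample equation: y'' + (1/x) y' + y = x on [1, 2]
# with y(1) = 0, y'(1) = 1 (made-up choices for the demo).
import numpy as np

def A(x):                      # coefficient matrix of the first-order system
    return np.array([[0.0, 1.0],
                     [-1.0, -1.0/x]])

def g(x):                      # non-homogeneous term
    return np.array([0.0, x])

def rhs(x, z):                 # z = [y, y']
    return A(x) @ z + g(x)

def rk4(rhs, x0, z0, x_end, n=2000):
    """Classical 4th-order Runge-Kutta march from x0 to x_end."""
    xs = np.linspace(x0, x_end, n + 1)
    h = xs[1] - xs[0]
    z = np.array(z0, dtype=float)
    for x in xs[:-1]:
        k1 = rhs(x, z)
        k2 = rhs(x + h/2, z + h*k1/2)
        k3 = rhs(x + h/2, z + h*k2/2)
        k4 = rhs(x + h, z + h*k3)
        z = z + h*(k1 + 2*k2 + 2*k3 + k4)/6
    return z

print("[y(2), y'(2)] =", rk4(rhs, 1.0, [0.0, 1.0], 2.0))
```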

Closure

A practical by-product of existence and uniqueness results:
    important results concerning the solutions

A sizeable segment of current research: ill-posed problems
    Dynamics of some nonlinear systems
    Chaos: sensitive dependence on initial conditions

For boundary value problems,
    No general criteria for existence and uniqueness

Note: Taking a clue from the shooting method, a BVP in ODEs
can be visualized as a complicated root-finding problem!
Multiple solutions or non-existence of a solution is no surprise.

Points to note

  For a solution of initial value problems, questions of existence,
  uniqueness and continuous dependence on initial condition are
  of crucial importance.
  These issues pertain to aspects of practical relevance
  regarding a physical system and its dynamic simulation.
  The Lipschitz condition is the tightest (available) criterion for
  deciding these questions regarding well-posedness.

Necessary Exercises: 1, 2

First Order Ordinary Differential Equations

Outline

Formation of Differential Equations and Their Solutions
Separation of Variables
ODEs with Rational Slope Functions
Some Special ODEs
Exact Differential Equations and Reduction to the Exact Form
First Order Linear (Leibnitz) ODE and Associated Forms
Orthogonal Trajectories
Modelling and Simulation

Formation of Differential Equations and Their Solutions

A differential equation represents a class of functions.

Example: $y(x) = c\, x^k$
With $\frac{dy}{dx} = c k x^{k-1}$ and $\frac{d^2y}{dx^2} = c k (k-1) x^{k-2}$,
eliminating $c$ and $k$,
    $x y \frac{d^2y}{dx^2} = x \left( \frac{dy}{dx} \right)^2 - y \frac{dy}{dx}$

A compact intrinsic description.

Important terms
  Order and degree of differential equations
  Homogeneous and non-homogeneous ODEs

Solution of a differential equation
  general, particular and singular solutions
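The eliminated-constant form written above is reconstructed here from the example; a quick symbolic check (illustrative) confirms that the whole two-parameter family $y = c x^k$ satisfies it, with both $c$ and $k$ dropping out.

```python
# Quick symbolic check (illustrative) that the family y = c x^k satisfies the
# reconstructed equation  x y y'' = x (y')^2 - y y'  for all c and k.
import sympy as sp

x, c, k = sp.symbols('x c k', positive=True)
y = c * x**k
lhs = x * y * sp.diff(y, x, 2)
rhs = x * sp.diff(y, x)**2 - y * sp.diff(y, x)
print(sp.simplify(lhs - rhs))   # 0: both parameters have been eliminated
```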

Separation of Variables

ODE form with separable variables:
    $y' = f(x, y) = \frac{\phi(x)}{\psi(y)}$, or $\psi(y)\, dy = \phi(x)\, dx$

Solution as quadrature:
    $\int \psi(y)\, dy = \int \phi(x)\, dx + c$.

Separation of variables through substitution
Example:
    $y' = g(\alpha x + \beta y + \gamma)$
Substitute $v = \alpha x + \beta y + \gamma$ to arrive at
    $\frac{dv}{dx} = \alpha + \beta\, g(v) \ \Rightarrow \ x = \int \frac{dv}{\alpha + \beta\, g(v)} + c$
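A small sympy sketch (illustrative example, not from the text): the separable equation $y' = x(1 + y^2)$ is solved by the quadrature $\int \frac{dy}{1 + y^2} = \int x\, dx + c$, and the result is cross-checked with sympy's ODE solver.

```python
# Sketch (illustrative): separation of variables for y' = x (1 + y^2),
# cross-checked with sympy's solver.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x), x*(1 + y(x)**2))
sol = sp.dsolve(ode, y(x))        # from dy/(1+y^2) = x dx: atan(y) = x^2/2 + C
print(sol)                        # y(x) = tan(C1 + x**2/2)
print(sp.checkodesol(ode, sol))   # (True, 0)
```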

ODEs with Rational Slope Functions

    $y' = \frac{f_1(x, y)}{f_2(x, y)}$

If $f_1$ and $f_2$ are homogeneous functions of $n$-th degree, then the
substitution $y = ux$ separates the variables $x$ and $u$:
    $\frac{dy}{dx} = \frac{\phi_1(y/x)}{\phi_2(y/x)} \ \Rightarrow \ u + x\frac{du}{dx} = \frac{\phi_1(u)}{\phi_2(u)}
    \ \Rightarrow \ \frac{dx}{x} = \frac{\phi_2(u)\, du}{\phi_1(u) - u\,\phi_2(u)}$

For $y' = \frac{a_1 x + b_1 y + c_1}{a_2 x + b_2 y + c_2}$, the coordinate shift
    $x = X + h, \ y = Y + k \ \Rightarrow \ y' = \frac{dy}{dx} = \frac{dY}{dX}$
produces
    $\frac{dY}{dX} = \frac{a_1 X + b_1 Y + (a_1 h + b_1 k + c_1)}{a_2 X + b_2 Y + (a_2 h + b_2 k + c_2)}$.
Choose $h$ and $k$ such that
    $a_1 h + b_1 k + c_1 = 0 = a_2 h + b_2 k + c_2$.
If this system is inconsistent, then substitute $u = a_1 x + b_1 y$
(equivalently $a_2 x + b_2 y$, the two being proportional in that case).
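An illustrative check of the homogeneous-function case (the sample equation $y' = (x^2 + y^2)/(xy)$ is a made-up choice): with $y = ux$, the variables separate as $u\, du = dx/x$, giving $u^2/2 = \ln x + C$ and hence $y^2 = x^2(2\ln x + 2C)$; the candidate is verified symbolically below.

```python
# Sketch (illustrative): y' = (x^2 + y^2)/(x y) with y = u x gives
# u + x u' = (1 + u^2)/u, i.e. u du = dx/x, so u^2/2 = ln(x) + C and
# y^2 = x^2 (2 ln x + 2C).  Verify the candidate solution directly.
import sympy as sp

x, C = sp.symbols('x C', positive=True)
y = x*sp.sqrt(2*sp.log(x) + C)               # candidate from the separated form
residual = sp.diff(y, x) - (x**2 + y**2)/(x*y)
print(sp.simplify(residual))                 # 0
```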

Some Special ODEs

Clairaut's equation
    $y = x y' + f(y')$
Substitute $p = y'$ and differentiate:
    $p = p + x \frac{dp}{dx} + f'(p) \frac{dp}{dx} \ \Rightarrow \ [x + f'(p)]\, \frac{dp}{dx} = 0$

$\frac{dp}{dx} = 0$ means $y' = p = m$ (constant):
    family of straight lines $y = mx + f(m)$ as general solution

Singular solution:
    $x = -f'(p)$ and $y = f(p) - p f'(p)$

Singular solution is the envelope of the family of straight
lines that constitute the general solution.
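A minimal symbolic check for the concrete choice $f(p) = p^2$ (illustrative, not from the text): both the line family $y = mx + m^2$ and the envelope $y = -x^2/4$, obtained by eliminating $p$ from the parametric singular solution, satisfy $y = xy' + (y')^2$.

```python
# Sketch (illustrative) for Clairaut's equation y = x y' + (y')^2, i.e. f(p) = p^2.
import sympy as sp

x, m = sp.symbols('x m')

def residual(expr):
    """Residual of y = x y' + (y')^2 for a candidate y(x)."""
    dy = sp.diff(expr, x)
    return sp.simplify(expr - (x*dy + dy**2))

line = m*x + m**2          # general solution: straight-line family y = m x + f(m)
envelope = -x**2/4         # singular solution: x = -f'(p), y = f(p) - p f'(p)
print(residual(line))      # 0
print(residual(envelope))  # 0
```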

Second order ODEs with the function not appearing explicitly
    $f(x, y', y'') = 0$
Substitute $y' = p$ and solve $f(x, p, p') = 0$ for $p(x)$.

Second order ODEs with the independent variable not appearing explicitly
    $f(y, y', y'') = 0$
Use $y' = p$ and
    $y'' = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p \frac{dp}{dy} \ \Rightarrow \ f\!\left(y, p, p\frac{dp}{dy}\right) = 0$.
Solve for $p(y)$.

Resulting equation solved through a quadrature as
    $\frac{dy}{dx} = p(y) \ \Rightarrow \ x = x_0 + \int \frac{dy}{p(y)}$.
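A short sketch of the second case on a made-up example: for $y y'' + (y')^2 = 0$, where $x$ does not appear, the substitution $y'' = p\, dp/dy$ gives $y p\, dp/dy + p^2 = 0$, so $p = C/y$, and the final quadrature yields $y^2 = 2Cx + D$; the candidate is verified symbolically below.

```python
# Sketch (illustrative): f(y, y', y'') = 0 with x absent, via y' = p, y'' = p dp/dy.
# For y y'' + (y')^2 = 0 this gives p = C/y and the quadrature y^2 = 2Cx + D.
import sympy as sp

x, C, D = sp.symbols('x C D', positive=True)
y = sp.sqrt(2*C*x + D)                       # candidate from the quadrature
residual = y*sp.diff(y, x, 2) + sp.diff(y, x)**2
print(sp.simplify(residual))                 # 0
```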

Exact Differential Equations and Reduction to the Exact Form

$M\,dx + N\,dy$: an exact differential if
    $M = \frac{\partial \phi}{\partial x}$ and $N = \frac{\partial \phi}{\partial y}$, or
    $\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$

$M(x, y)\,dx + N(x, y)\,dy = 0$ is an exact ODE if $\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$.

With $M(x, y) = \frac{\partial \phi}{\partial x}$ and $N(x, y) = \frac{\partial \phi}{\partial y}$,
    $\frac{\partial \phi}{\partial x}\, dx + \frac{\partial \phi}{\partial y}\, dy = 0 \ \Rightarrow \ d\phi = 0$.
Solution: $\phi(x, y) = c$

Working rule:
    $\phi_1(x, y) = \int M(x, y)\, dx + g_1(y)$ and $\phi_2(x, y) = \int N(x, y)\, dy + g_2(x)$
Determine $g_1(y)$ and $g_2(x)$ from $\phi_1(x, y) = \phi_2(x, y) = \phi(x, y)$.

If $\frac{\partial M}{\partial y} \ne \frac{\partial N}{\partial x}$, but $\frac{\partial}{\partial y}(FM) = \frac{\partial}{\partial x}(FN)$?
    $F$: integrating factor
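A sketch of the working rule on a made-up exact equation with $M = 2xy + y^2$ and $N = x^2 + 2xy$: first check exactness, then build $\phi$ by integrating $M$ with respect to $x$ and fixing $g_1(y)$.

```python
# Sketch (illustrative): exactness test and working rule for
# (2xy + y^2) dx + (x^2 + 2xy) dy = 0.
import sympy as sp

x, y = sp.symbols('x y')
M = 2*x*y + y**2            # made-up example
N = x**2 + 2*x*y

# Exactness test: dM/dy and dN/dx should agree
print(sp.diff(M, y), sp.diff(N, x))            # 2*x + 2*y, twice

# Working rule: phi = integral(M dx) + g1(y), with g1'(y) = N - d/dy(integral(M dx))
phi_x = sp.integrate(M, x)                     # x**2*y + x*y**2
g1p   = sp.simplify(N - sp.diff(phi_x, y))     # 0 here
phi   = phi_x + sp.integrate(g1p, y)
print(phi)                                     # x**2*y + x*y**2
# Implicit general solution: phi(x, y) = c
```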

First Order Linear (Leibnitz) ODE and Associated Forms

General first order linear ODE:
    $\frac{dy}{dx} + P(x)\, y = Q(x)$
Leibnitz equation

For integrating factor $F(x)$,
    $F(x) \frac{dy}{dx} + F(x) P(x)\, y = \frac{d}{dx}[F(x)\, y] \ \Rightarrow \ \frac{dF}{dx} = F(x) P(x)$.

Separating variables,
    $\int \frac{dF}{F} = \int P(x)\, dx \ \Rightarrow \ \ln F = \int P(x)\, dx$.

Integrating factor: $F(x) = e^{\int P(x)\, dx}$

    $y\, e^{\int P(x)\, dx} = \int Q(x)\, e^{\int P(x)\, dx}\, dx + C$
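A small sympy sketch of the integrating-factor recipe (the sample choices $P = 1/x$, $Q = x^2$ are illustrative): $F = e^{\int P\, dx} = x$, and the quadrature formula above gives $y$ directly.

```python
# Sketch (illustrative): Leibnitz equation y' + (1/x) y = x^2 solved with the
# integrating factor F(x) = exp(integral of P dx).
import sympy as sp

x, C = sp.symbols('x C', positive=True)
P = 1/x
Q = x**2

F = sp.exp(sp.integrate(P, x))                  # F(x) = x
y = (sp.integrate(Q*F, x) + C) / F              # y = (x^4/4 + C)/x
print(sp.simplify(y))                           # C/x + x**3/4

# Check: y' + P y - Q should vanish
print(sp.simplify(sp.diff(y, x) + P*y - Q))     # 0
```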

Bernoulli's equation
    $\frac{dy}{dx} + P(x)\, y = Q(x)\, y^k$
Substitution: $z = y^{1-k}$, $\frac{dz}{dx} = (1-k)\, y^{-k} \frac{dy}{dx}$ gives
    $\frac{dz}{dx} + (1-k) P(x)\, z = (1-k) Q(x)$,
in the Leibnitz form.

Riccati equation
    $y' = a(x) + b(x)\, y + c(x)\, y^2$
If one solution $y_1(x)$ is known, then propose $y(x) = y_1(x) + z(x)$:
    $y_1'(x) + z'(x) = a(x) + b(x)[y_1(x) + z(x)] + c(x)[y_1(x) + z(x)]^2$
Since $y_1'(x) = a(x) + b(x)\, y_1(x) + c(x)[y_1(x)]^2$,
    $z'(x) = [b(x) + 2 c(x)\, y_1(x)]\, z(x) + c(x)\, [z(x)]^2$,
in the form of Bernoulli's equation.
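An illustrative Bernoulli example (not from the text): for $y' + y = y^2$, so $P = Q = 1$ and $k = 2$, the substitution $z = y^{1-k} = 1/y$ produces the Leibnitz equation $z' - z = -1$, which is solved and mapped back below.

```python
# Sketch (illustrative): Bernoulli's equation y' + y = y^2 via z = 1/y.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
z = sp.Function('z')

# Solve the transformed linear (Leibnitz) equation for z, then recover y = 1/z
z_sol = sp.dsolve(sp.Eq(z(x).diff(x) - z(x), -1), z(x))   # z = C1*exp(x) + 1
y_expr = 1/z_sol.rhs
print(y_expr)

# Verify against the original Bernoulli equation
ode = sp.Eq(y(x).diff(x) + y(x), y(x)**2)
print(sp.checkodesol(ode, sp.Eq(y(x), y_expr)))            # (True, 0)
```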

Orthogonal Trajectories

In the xy-plane, one-parameter equation $\phi(x, y, c) = 0$:
    a family of curves

Differential equation of the family of curves:
    $\frac{dy}{dx} = f_1(x, y)$

Slope of curves orthogonal to $\phi(x, y, c) = 0$:
    $\frac{dy}{dx} = -\frac{1}{f_1(x, y)}$
Solving this ODE, another family of curves $\psi(x, y, k) = 0$:
    orthogonal trajectories

If $\phi(x, y, c) = 0$ represents the potential lines (contours),
then $\psi(x, y, k) = 0$ will represent the streamlines!
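A short illustrative computation on a made-up family: for the parabolas $y = cx^2$, the family ODE is $y' = 2y/x$, so the orthogonal trajectories satisfy $y' = -x/(2y)$; solving this separable ODE gives the ellipses $x^2/2 + y^2 = k$.

```python
# Sketch (illustrative): orthogonal trajectories of the parabola family y = c x^2.
# Family ODE: y' = 2y/x; orthogonal slope: y' = -x/(2y), another separable ODE.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

orth_ode = sp.Eq(y(x).diff(x), -x/(2*y(x)))
sols = sp.dsolve(orth_ode, y(x))
print(sols)   # branches of the ellipses x^2/2 + y^2 = const (printed form may vary)
```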

Points to note

  Meaning and solution of ODEs
  Separating variables
  Exact ODEs and integrating factors
  Linear (Leibnitz) equations
  Orthogonal families of curves

Necessary Exercises: 1, 3, 5, 7

Second Order Linear Homogeneous ODEs

Outline

Introduction
Homogeneous Equations with Constant Coefficients
Euler-Cauchy Equation
Theory of the Homogeneous Equations
Basis for Solutions

Introduction

Second order ODE:
    $f(x, y, y', y'') = 0$

Special case of a linear (non-homogeneous) ODE:
    $y'' + P(x)\, y' + Q(x)\, y = R(x)$
Non-homogeneous linear ODE with constant coefficients:
    $y'' + a y' + b y = R(x)$

For $R(x) = 0$, linear homogeneous differential equation
    $y'' + P(x)\, y' + Q(x)\, y = 0$
and linear homogeneous ODE with constant coefficients
    $y'' + a y' + b y = 0$

Homogeneous Equations with Constant Coefficients

    $y'' + a y' + b y = 0$
Assume $y = e^{\lambda x}$
    $\Rightarrow \ y' = \lambda e^{\lambda x}$ and $y'' = \lambda^2 e^{\lambda x}$.
Substitution: $(\lambda^2 + a \lambda + b)\, e^{\lambda x} = 0$
Auxiliary equation:
    $\lambda^2 + a \lambda + b = 0$
Solve for $\lambda_1$ and $\lambda_2$:
    solutions $e^{\lambda_1 x}$ and $e^{\lambda_2 x}$

Three cases

Real and distinct ($a^2 > 4b$): $\lambda_1 \ne \lambda_2$
    $y(x) = c_1 y_1(x) + c_2 y_2(x) = c_1 e^{\lambda_1 x} + c_2 e^{\lambda_2 x}$

Real and equal ($a^2 = 4b$): $\lambda_1 = \lambda_2 = \lambda = -\frac{a}{2}$
    Only solution in hand: $y_1 = e^{\lambda x}$
    Method to develop another solution?
    Verify that $y_2 = x e^{\lambda x}$ is another solution.
    $y(x) = c_1 y_1(x) + c_2 y_2(x) = (c_1 + c_2 x)\, e^{\lambda x}$

Complex conjugate ($a^2 < 4b$): $\lambda_{1,2} = -\frac{a}{2} \pm i\omega$
    $y(x) = c_1 e^{(-\frac{a}{2} + i\omega)x} + c_2 e^{(-\frac{a}{2} - i\omega)x}$
    $\quad\ = e^{-\frac{ax}{2}} [c_1 (\cos \omega x + i \sin \omega x) + c_2 (\cos \omega x - i \sin \omega x)]$
    $\quad\ = e^{-\frac{ax}{2}} [A \cos \omega x + B \sin \omega x]$,
    with $A = c_1 + c_2$, $B = i(c_1 - c_2)$.
    A third form: $y(x) = C e^{-\frac{ax}{2}} \cos(\omega x - \delta)$
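The three cases can be checked quickly with sympy; the coefficient choices below are illustrative, and the solver's output matches the closed forms above.

```python
# Sketch (illustrative): the three root patterns of lambda^2 + a*lambda + b = 0.
import sympy as sp

x, lam = sp.symbols('x lam')
y = sp.Function('y')

cases = {
    "real, distinct (a=-3, b=2)": (-3, 2),   # roots 1 and 2
    "real, repeated (a=2,  b=1)": (2, 1),    # double root -1
    "complex pair   (a=2,  b=5)": (2, 5),    # roots -1 +/- 2i
}
for label, (a, b) in cases.items():
    roots = sp.solve(lam**2 + a*lam + b, lam)
    sol = sp.dsolve(sp.Eq(y(x).diff(x, 2) + a*y(x).diff(x) + b*y(x), 0), y(x))
    print(label, "| roots:", roots, "| y(x) =", sol.rhs)
```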

Euler-Cauchy Equation

    $x^2 y'' + a x y' + b y = 0$
Substituting $y = x^k$, auxiliary (or indicial) equation:
    $k^2 + (a - 1) k + b = 0$

1. Roots real and distinct [$(a-1)^2 > 4b$]: $k_1 \ne k_2$.
    $y(x) = c_1 x^{k_1} + c_2 x^{k_2}$.

2. Roots real and equal [$(a-1)^2 = 4b$]: $k_1 = k_2 = k = \frac{1-a}{2}$.
    $y(x) = (c_1 + c_2 \ln x)\, x^k$.

3. Roots complex conjugate [$(a-1)^2 < 4b$]: $k_{1,2} = \frac{1-a}{2} \pm i\mu$.
    $y(x) = x^{\frac{1-a}{2}} [A \cos(\mu \ln x) + B \sin(\mu \ln x)] = C x^{\frac{1-a}{2}} \cos(\mu \ln x - \delta)$.

Alternative approach: substitution
    $x = e^t \ \Rightarrow \ t = \ln x, \quad \frac{dx}{dt} = e^t = x$ and $\frac{dt}{dx} = \frac{1}{x}$, etc.
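A small sympy check on a made-up Euler-Cauchy equation with $a = -2$, $b = 2$: the indicial roots are $k = 1, 2$, so the general solution is $c_1 x + c_2 x^2$.

```python
# Sketch (illustrative): Euler-Cauchy equation x^2 y'' - 2x y' + 2y = 0.
import sympy as sp

x, k = sp.symbols('x k', positive=True)
y = sp.Function('y')

a, b = -2, 2
print(sp.solve(k**2 + (a - 1)*k + b, k))                      # indicial roots [1, 2]

ode = sp.Eq(x**2*y(x).diff(x, 2) + a*x*y(x).diff(x) + b*y(x), 0)
print(sp.dsolve(ode, y(x)))                                   # y = C1*x + C2*x**2
```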

Theory of the Homogeneous Equations

    $y'' + P(x)\, y' + Q(x)\, y = 0$
Well-posedness of its IVP:
    The initial value problem of the ODE, with arbitrary
    initial conditions $y(x_0) = Y_0$, $y'(x_0) = Y_1$, has a unique
    solution, as long as $P(x)$ and $Q(x)$ are continuous in the
    interval under question.

At least two linearly independent solutions:
    $y_1(x)$: IVP with initial conditions $y(x_0) = 1$, $y'(x_0) = 0$
    $y_2(x)$: IVP with initial conditions $y(x_0) = 0$, $y'(x_0) = 1$
    $c_1 y_1(x) + c_2 y_2(x) = 0 \ \Rightarrow \ c_1 = c_2 = 0$

At most two linearly independent solutions?

Wronskian of two solutions $y_1(x)$ and $y_2(x)$:
    $W(y_1, y_2) = \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix} = y_1 y_2' - y_2 y_1'$

Solutions $y_1$ and $y_2$ are linearly dependent, if and only if $\exists\, x_0$
such that $W[y_1(x_0), y_2(x_0)] = 0$.

$W[y_1(x_0), y_2(x_0)] = 0 \ \Rightarrow \ W[y_1(x), y_2(x)] = 0 \ \forall x$.

$W[y_1(x_1), y_2(x_1)] \ne 0 \ \Rightarrow \ W[y_1(x), y_2(x)] \ne 0 \ \forall x$, and $y_1(x)$
and $y_2(x)$ are linearly independent solutions.

Complete solution:
    If $y_1(x)$ and $y_2(x)$ are two linearly independent solutions,
    then the general solution is
        $y(x) = c_1 y_1(x) + c_2 y_2(x)$.
    And, the general solution is the complete solution.
No third linearly independent solution. No singular solution.
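As a quick computational illustration (made-up example): for $y'' - 3y' + 2y = 0$, the solutions $e^x$ and $e^{2x}$ have Wronskian $e^{3x}$, which never vanishes, so they form a basis.

```python
# Sketch (illustrative): Wronskian of e^x and e^{2x}, solutions of y'' - 3y' + 2y = 0.
import sympy as sp

x = sp.symbols('x')
y1, y2 = sp.exp(x), sp.exp(2*x)

W = sp.Matrix([[y1, y2],
               [sp.diff(y1, x), sp.diff(y2, x)]]).det()
print(sp.simplify(W))          # exp(3*x), never zero
```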

If $y_1(x)$ and $y_2(x)$ are linearly dependent, then $y_2 = k y_1$:
    $W(y_1, y_2) = y_1 y_2' - y_2 y_1' = y_1 (k y_1') - (k y_1) y_1' = 0$
In particular, $W[y_1(x_0), y_2(x_0)] = 0$.

Conversely, if there is a value $x_0$, where
    $W[y_1(x_0), y_2(x_0)] = \begin{vmatrix} y_1(x_0) & y_2(x_0) \\ y_1'(x_0) & y_2'(x_0) \end{vmatrix} = 0$,
then for
    $\begin{bmatrix} y_1(x_0) & y_2(x_0) \\ y_1'(x_0) & y_2'(x_0) \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \mathbf{0}$,
the coefficient matrix is singular.

Choose non-zero $\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$ and frame $y(x) = c_1 y_1 + c_2 y_2$, satisfying the
IVP $y'' + P y' + Q y = 0$, $y(x_0) = 0$, $y'(x_0) = 0$.

Therefore, $y(x) = 0 \ \Rightarrow \ y_1$ and $y_2$ are linearly dependent.

Pick a candidate solution $Y(x)$, choose a point $x_0$, evaluate the
functions $y_1$, $y_2$, $Y$ and their derivatives at that point, frame
    $\begin{bmatrix} y_1(x_0) & y_2(x_0) \\ y_1'(x_0) & y_2'(x_0) \end{bmatrix} \begin{bmatrix} C_1 \\ C_2 \end{bmatrix} = \begin{bmatrix} Y(x_0) \\ Y'(x_0) \end{bmatrix}$
and ask for the solution $\begin{bmatrix} C_1 \\ C_2 \end{bmatrix}$.

Unique solution for $C_1$, $C_2$. Hence, the particular solution
    $y(x) = C_1 y_1(x) + C_2 y_2(x)$
is the unique solution of the IVP
    $y'' + P y' + Q y = 0$, $y(x_0) = Y(x_0)$, $y'(x_0) = Y'(x_0)$.
But, that is the candidate function $Y(x)$! Hence, $Y(x) = y(x)$.

Basis for Solutions

For completely describing the solutions, we need
    two linearly independent solutions.
No guaranteed procedure to identify two basis members!

If one solution $y_1(x)$ is available, then to find another?
    Reduction of order
Assume the second solution as
    $y_2(x) = u(x)\, y_1(x)$
and determine $u(x)$ such that $y_2(x)$ satisfies the ODE:
    $u'' y_1 + 2 u' y_1' + u y_1'' + P (u' y_1 + u y_1') + Q u y_1 = 0$
    $\Rightarrow \ u'' y_1 + 2 u' y_1' + P u' y_1 + u (y_1'' + P y_1' + Q y_1) = 0$.
Since $y_1'' + P y_1' + Q y_1 = 0$, we have $y_1 u'' + (2 y_1' + P y_1) u' = 0$.

Denoting $u' = U$,
    $U' + \left( 2 \frac{y_1'}{y_1} + P \right) U = 0$.
Rearrangement and integration of the reduced equation:
    $\frac{dU}{U} + 2 \frac{dy_1}{y_1} + P\, dx = 0 \ \Rightarrow \ U y_1^2\, e^{\int P\, dx} = C = 1$ (choose).
Then,
    $u' = U = \frac{1}{y_1^2}\, e^{-\int P\, dx}$.
Integrating,
    $u(x) = \int \frac{1}{y_1^2}\, e^{-\int P\, dx}\, dx$,
and
    $y_2(x) = y_1(x) \int \frac{1}{y_1^2}\, e^{-\int P\, dx}\, dx$.

Note: The factor $u(x)$ is never constant!
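A sketch applying the reduction-of-order formula to a made-up example: for $y'' - 2y' + y = 0$ (so $P = -2$) with the known solution $y_1 = e^x$, the formula reproduces the second basis member $y_2 = x e^x$.

```python
# Sketch (illustrative): reduction of order for y'' - 2y' + y = 0 with y1 = e^x.
#   y2 = y1 * integral( (1/y1^2) * exp(-integral(P dx)) dx )
import sympy as sp

x = sp.symbols('x')
P = -2
y1 = sp.exp(x)

u = sp.integrate(sp.exp(-sp.integrate(P, x)) / y1**2, x)   # u(x) = x
y2 = sp.simplify(y1 * u)
print(y2)                                                   # x*exp(x)

# Check that y2 satisfies the ODE
print(sp.simplify(sp.diff(y2, x, 2) - 2*sp.diff(y2, x) + y2))   # 0
```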

Function space perspective:

The operator $D$ means differentiation; it operates on an infinite
dimensional function space as a linear transformation.
    It maps all constant functions to zero.
    It has a one-dimensional null space.

The second derivative, or $D^2$, is an operator that has a two-dimensional
null space, $c_1 + c_2 x$, with basis $\{1, x\}$.

Examples of composite operators
    $(D + a)$ has the null space $c\, e^{-ax}$.
    $(xD + a)$ has the null space $c\, x^{-a}$.

A second order linear operator $D^2 + P(x) D + Q(x)$ possesses a
two-dimensional null space.

Solution of $[D^2 + P(x) D + Q(x)]\, y = 0$: description of the
null space, or a basis for it.
    Analogous to the solution of $Ax = 0$, i.e. development of a basis
    for Null($A$).

Points to note

  Second order linear homogeneous ODEs
  Wronskian and related results
  Solution basis
  Reduction of order
  Null space of a differential operator

Necessary Exercises: 1, 2, 3, 7, 8

Second Order Linear Non-Homogeneous ODEs

Outline

Linear ODEs and Their Solutions
Method of Undetermined Coefficients
Method of Variation of Parameters
Closure

Linear ODEs and Their Solutions

The Complete Analogy

Table: Linear systems and mappings: algebraic and differential

  In ordinary vector space              | In infinite-dimensional function space
  --------------------------------------|------------------------------------------------
  $Ax = b$                              | $y'' + Py' + Qy = R$
  The system is consistent.             | $P(x)$, $Q(x)$, $R(x)$ are continuous.
  A solution $\bar{x}$                  | A solution $y_p(x)$
  Alternative solution: $x$             | Alternative solution: $y(x)$
  $x - \bar{x}$ satisfies $Ax = 0$,     | $y(x) - y_p(x)$ satisfies $y'' + Py' + Qy = 0$,
  is in null space of $A$.              | is in null space of $D^2 + P(x)D + Q(x)$.
  Complete solution:                    | Complete solution:
  $x = \bar{x} + \sum_i c_i (x_0)_i$    | $y_p(x) + \sum_i c_i y_i(x)$
  Methodology:                          | Methodology:
  Find null space of $A$,               | Find null space of $D^2 + P(x)D + Q(x)$,
  i.e. basis members $(x_0)_i$.         | i.e. basis members $y_i(x)$.
  Find $\bar{x}$ and compose.           | Find $y_p(x)$ and compose.


Mathematical Methods in Engineering and Science

Linear ODEs and Their Solutions

Second Order Linear Non-Homogeneous ODEs


Linear ODEs and Their Solutions
Method of Undetermined Coefficients
Method of Variation of Parameters
Closure

Procedure to solve y'' + P(x)y' + Q(x)y = R(x)


1. First, solve the corresponding homogeneous equation, obtain a
basis with two solutions and construct
yh (x) = c1 y1 (x) + c2 y2 (x).
2. Next, find one particular solution yp (x) of the NHE and
compose the complete solution
y (x) = yh (x) + yp (x) = c1 y1 (x) + c2 y2 (x) + yp (x).
3. If some initial or boundary conditions are known, they can be
imposed now to determine c1 and c2 .
Caution: If y1 and y2 are two solutions of the NHE, then
do not expect c1 y1 + c2 y2 to satisfy the equation.
Implication of linearity or superposition:
With zero initial conditions, if y1 and y2 are responses
due to inputs R1 (x) and R2 (x), respectively, then the
response due to input c1 R1 + c2 R2 is c1 y1 + c2 y2 .

975,
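The superposition remark can be checked mechanically with a computer algebra system. The following is a minimal sketch, assuming sympy is available; the sample equation y'' + 3y' + 2y = R(x) and the inputs R1(x) = x, R2(x) = e^x are illustrative choices, not taken from the text.

# Superposition of responses under zero initial conditions (illustrative equation).
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

def response(R):
    # Solve y'' + 3y' + 2y = R with y(0) = y'(0) = 0.
    ode = sp.Eq(y(x).diff(x, 2) + 3*y(x).diff(x) + 2*y(x), R)
    return sp.dsolve(ode, y(x), ics={y(0): 0, y(x).diff(x).subs(x, 0): 0}).rhs

y1 = response(x)                    # response to R1(x) = x
y2 = response(sp.exp(x))            # response to R2(x) = exp(x)
y12 = response(4*x + 5*sp.exp(x))   # response to 4*R1 + 5*R2

# The difference simplifies to zero, confirming linearity of the input-output map.
print(sp.simplify(y12 - (4*y1 + 5*y2)))   # -> 0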


Mathematical Methods in Engineering and Science

Second Order Linear Non-Homogeneous ODEs

Method of Undetermined Coefficients

Linear ODEs and Their Solutions
Method of Undetermined Coefficients
Method of Variation of Parameters
Closure

y'' + ay' + by = R(x)

What kind of function to propose as yp(x) if R(x) = x^n?

And what if R(x) = e^{αx}?
If R(x) = x^n + e^{αx}, i.e. in the form k1 R1(x) + k2 R2(x)?
The principle of superposition (linearity)

Table: Candidate solutions for linear non-homogeneous ODEs

RHS function R(x)                                  Candidate solution yp(x)
pn(x)                                              qn(x)
e^{αx}                                             k e^{αx}
cos ωx  or  sin ωx                                 k1 cos ωx + k2 sin ωx
e^{αx} cos ωx  or  e^{αx} sin ωx                   k1 e^{αx} cos ωx + k2 e^{αx} sin ωx
pn(x) e^{αx}                                       qn(x) e^{αx}
pn(x) cos ωx  or  pn(x) sin ωx                     qn(x) cos ωx + rn(x) sin ωx
pn(x) e^{αx} cos ωx  or  pn(x) e^{αx} sin ωx       qn(x) e^{αx} cos ωx + rn(x) e^{αx} sin ωx

980,

Mathematical Methods in Engineering and Science

Second Order Linear Non-Homogeneous ODEs

Method of Undetermined Coefficients

Linear ODEs and Their Solutions
Method of Undetermined Coefficients
Method of Variation of Parameters
Closure

Example:
(a) y'' - 6y' + 5y = e^{3x}
(b) y'' - 5y' + 6y = e^{3x}
(c) y'' - 6y' + 9y = e^{3x}

In each case, the first official proposal: yp = k e^{3x}

(a) y(x) = c1 e^x + c2 e^{5x} - e^{3x}/4
(b) y(x) = c1 e^{2x} + c2 e^{3x} + x e^{3x}
(c) y(x) = c1 e^{3x} + c2 x e^{3x} + (1/2) x^2 e^{3x}

Modification rule

If the candidate function (k e^{αx}, k1 cos ωx + k2 sin ωx or
k1 e^{αx} cos ωx + k2 e^{αx} sin ωx) is a solution of the corresponding
HE, with α, ±iω or α ± iω (respectively) satisfying the
auxiliary equation, then modify it by multiplying with x.
In the case of α being a double root, i.e. both e^{αx} and x e^{αx}
being solutions of the HE, choose yp = k x^2 e^{αx}.

988,
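The three cases above, and the modification rule, can be verified with a computer algebra system. A minimal sketch, assuming sympy; the solver returns exactly the particular terms -e^{3x}/4, x e^{3x} and (1/2) x^2 e^{3x} quoted above.

# Check examples (a), (b), (c): same right-hand side, different auxiliary roots.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
e3 = sp.exp(3*x)

for a, b in [(-6, 5), (-5, 6), (-6, 9)]:   # cases (a), (b), (c)
    ode = sp.Eq(y(x).diff(x, 2) + a*y(x).diff(x) + b*y(x), e3)
    print(sp.dsolve(ode, y(x)))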


Mathematical Methods in Engineering and Science

Second Order Linear Non-Homogeneous ODEs

Method of Variation of Parameters

Linear ODEs and Their Solutions


Method of Undetermined Coefficients
Method of Variation of Parameters
Closure

Solution of the HE:


yh (x) = c1 y1 (x) + c2 y2 (x),
in which c1 and c2 are constant parameters.
For solution of the NHE,
how about variable parameters ?
Propose
yp (x) = u1 (x)y1 (x) + u2 (x)y2 (x)
and force yp (x) to satisfy the ODE.
A single second order ODE in u1 (x) and u2 (x).
We need one more condition to fix them.

990,

Mathematical Methods in Engineering and Science

Second Order Linear Non-Homogeneous ODEs

Method of Variation of Parameters


From yp = u1 y1 + u2 y2 ,

994,

Linear ODEs and Their Solutions


Method of Undetermined Coefficients
Method of Variation of Parameters
Closure

yp' = u1' y1 + u1 y1' + u2' y2 + u2 y2'.

Condition

u1' y1 + u2' y2 = 0   gives
yp' = u1 y1' + u2 y2'.

Differentiating,
yp'' = u1' y1' + u2' y2' + u1 y1'' + u2 y2''.
Substitution into the ODE:
u1' y1' + u2' y2' + u1 y1'' + u2 y2'' + P(x)(u1 y1' + u2 y2') + Q(x)(u1 y1 + u2 y2) = R(x)
Rearranging,
u1' y1' + u2' y2' + u1 [y1'' + P(x)y1' + Q(x)y1] + u2 [y2'' + P(x)y2' + Q(x)y2] = R(x).
As y1 and y2 satisfy the associated HE, u1' y1' + u2' y2' = R(x).


Mathematical Methods in Engineering and Science

Second Order Linear Non-Homogeneous ODEs

Method of Variation of Parameters




Linear ODEs and Their Solutions
Method of Undetermined Coefficients
Method of Variation of Parameters
Closure

[ y1   y2  ] [ u1' ]   [ 0 ]
[ y1'  y2' ] [ u2' ] = [ R ]

Since the Wronskian is non-zero, this system has a unique solution:

u1' = - y2 R / W    and    u2' = y1 R / W.

Direct quadrature:

u1(x) = - ∫ y2(x) R(x) / W[y1(x), y2(x)] dx    and    u2(x) = ∫ y1(x) R(x) / W[y1(x), y2(x)] dx
In contrast with the method of undetermined coefficients,
variation of parameters is general: it is applicable for any
continuous functions P(x), Q(x) and R(x).

997,
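The quadrature formulas above translate directly into a short computation. A minimal sketch, assuming sympy; the equation y'' + y = sec x with basis {cos x, sin x} is an illustrative choice (a case the method of undetermined coefficients cannot handle).

# Variation of parameters: u1 = -∫ y2 R / W dx, u2 = ∫ y1 R / W dx.
import sympy as sp

x = sp.symbols('x')
y1, y2, R = sp.cos(x), sp.sin(x), sp.sec(x)

W = sp.simplify(y1*y2.diff(x) - y2*y1.diff(x))   # Wronskian, here = 1
u1 = sp.integrate(-y2*R/W, x)
u2 = sp.integrate( y1*R/W, x)
yp = sp.simplify(u1*y1 + u2*y2)

print(yp)   # expect cos(x)*log(cos(x)) + x*sin(x), up to simplification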

Mathematical Methods in Engineering and Science

Second Order Linear Non-Homogeneous ODEs

Points to note

Linear ODEs and Their Solutions


Method of Undetermined Coefficients
Method of Variation of Parameters
Closure

Function space perspective of linear ODEs

Method of undetermined coefficients

Method of variation of parameters

Necessary Exercises: 1,3,5,6

998,

Mathematical Methods in Engineering and Science

Outline

Higher Order Linear ODEs

999,

Theory of Linear ODEs


Homogeneous Equations with Constant Coefficients
Non-Homogeneous Equations
Euler-Cauchy Equation of Higher Order

Higher Order Linear ODEs


Theory of Linear ODEs
Homogeneous Equations with Constant Coefficients
Non-Homogeneous Equations
Euler-Cauchy Equation of Higher Order

Mathematical Methods in Engineering and Science

Theory of Linear ODEs

Higher Order Linear ODEs

1000,

Theory of Linear ODEs


Homogeneous Equations with Constant Coefficients
Non-Homogeneous Equations
Euler-Cauchy Equation of Higher Order

y^(n) + P1(x) y^(n-1) + P2(x) y^(n-2) + ··· + P_{n-1}(x) y' + Pn(x) y = R(x)

General solution: y(x) = yh(x) + yp(x), where

yp(x): a particular solution
yh(x): general solution of the corresponding HE

y^(n) + P1(x) y^(n-1) + P2(x) y^(n-2) + ··· + P_{n-1}(x) y' + Pn(x) y = 0

Mathematical Methods in Engineering and Science

Theory of Linear ODEs

Higher Order Linear ODEs

1002,

Theory of Linear ODEs


Homogeneous Equations with Constant Coefficients
Non-Homogeneous Equations
Euler-Cauchy Equation of Higher Order

y^(n) + P1(x) y^(n-1) + P2(x) y^(n-2) + ··· + P_{n-1}(x) y' + Pn(x) y = R(x)

General solution: y(x) = yh(x) + yp(x), where

yp(x): a particular solution
yh(x): general solution of the corresponding HE

y^(n) + P1(x) y^(n-1) + P2(x) y^(n-2) + ··· + P_{n-1}(x) y' + Pn(x) y = 0

For the HE, suppose we have n solutions y1(x), y2(x), ..., yn(x).

Assemble the state vectors in the matrix

         [ y1         y2         ···   yn        ]
         [ y1'        y2'        ···   yn'       ]
Y(x) =   [ y1''       y2''       ···   yn''      ]
         [ ...        ...        ···   ...       ]
         [ y1^(n-1)   y2^(n-1)   ···   yn^(n-1)  ]

Wronskian:

W(y1, y2, ..., yn) = det[Y(x)]

Mathematical Methods in Engineering and Science

Higher Order Linear ODEs

Theory of Linear ODEs

If solutions y1(x), y2(x), ..., yn(x) of the HE are linearly
dependent, then for a non-zero k ∈ R^n,

Σ_{i=1}^{n} ki yi(x) = 0   ⇒   Σ_{i=1}^{n} ki yi^(j)(x) = 0  for j = 1, 2, 3, ..., (n-1)

⇒ [Y(x)] k = 0  ⇒  [Y(x)] is singular

⇒ W[y1(x), y2(x), ..., yn(x)] = 0.

1005,

Theory of Linear ODEs


Homogeneous Equations with Constant Coefficients
Non-Homogeneous Equations
Euler-Cauchy Equation of Higher Order

If the Wronskian is zero at x = x0, then Y(x0) is singular and a
non-zero k ∈ Null[Y(x0)] gives Σ_{i=1}^{n} ki yi(x) = 0, implying
y1(x), y2(x), ..., yn(x) to be linearly dependent.

Zero Wronskian at some x = x0 implies zero Wronskian
everywhere. Non-zero Wronskian at some x = x1 ensures
non-zero Wronskian everywhere and the corresponding
solutions are linearly independent.

With n linearly independent solutions y1(x), y2(x), ..., yn(x)
of the HE, we have its general solution yh(x) = Σ_{i=1}^{n} ci yi(x),
acting as the complementary function for the NHE.

Mathematical Methods in Engineering and Science

Higher Order Linear ODEs

1007,

Theory of Linear ODEs


Homogeneous Equations with Constant
Coefficients
Homogeneous
Equations with Constant Coefficients
Non-Homogeneous Equations
Euler-Cauchy Equation of Higher Order

y^(n) + a1 y^(n-1) + a2 y^(n-2) + ··· + a_{n-1} y' + an y = 0

With the trial solution y = e^{λx}, the auxiliary equation:

λ^n + a1 λ^{n-1} + a2 λ^{n-2} + ··· + a_{n-1} λ + an = 0

Construction of the basis (a sketch enumerating such a basis from the roots follows below):
1. For every simple real root λ = μ, e^{μx} is a solution.
2. For every simple pair of complex roots λ = μ ± iω,
   e^{μx} cos ωx and e^{μx} sin ωx are linearly independent solutions.
3. For every real root λ = μ of multiplicity r: e^{μx}, x e^{μx}, x^2 e^{μx},
   ..., x^{r-1} e^{μx} are all linearly independent solutions.
4. For every complex pair of roots λ = μ ± iω of multiplicity r:
   e^{μx} cos ωx, e^{μx} sin ωx, x e^{μx} cos ωx, x e^{μx} sin ωx, ...,
   x^{r-1} e^{μx} cos ωx, x^{r-1} e^{μx} sin ωx are the required solutions.
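A minimal sketch, assuming sympy, that enumerates a real solution basis from the roots of the auxiliary polynomial, following rules 1-4 above. The polynomial λ^5 - 4λ^4 + 14λ^3 - 20λ^2 + 25λ = λ(λ^2 - 2λ + 5)^2 is an illustrative choice, with a simple real root and a repeated complex pair.

import sympy as sp

lam, x = sp.symbols('lambda x')
poly = sp.Poly(lam**5 - 4*lam**4 + 14*lam**3 - 20*lam**2 + 25*lam, lam)

basis = []
for root, r in sp.roots(poly).items():          # root -> multiplicity r
    mu, om = sp.re(root), sp.im(root)
    for k in range(r):
        if om == 0:
            basis.append(x**k * sp.exp(mu*x))
        elif om > 0:                            # take each conjugate pair once
            basis.append(x**k * sp.exp(mu*x) * sp.cos(om*x))
            basis.append(x**k * sp.exp(mu*x) * sp.sin(om*x))

print(basis)   # 1, e^x cos 2x, e^x sin 2x, x e^x cos 2x, x e^x sin 2x (order may vary)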

Mathematical Methods in Engineering and Science

Higher Order Linear ODEs

Non-Homogeneous Equations
Method of undetermined coefficients

1011,

Theory of Linear ODEs


Homogeneous Equations with Constant Coefficients
Non-Homogeneous Equations
Euler-Cauchy Equation of Higher Order

y^(n) + a1 y^(n-1) + a2 y^(n-2) + ··· + a_{n-1} y' + an y = R(x)

Extension of the second order case

Method of variation of parameters:

yp(x) = Σ_{i=1}^{n} ui(x) yi(x)

Imposed condition Σ_{i=1}^{n} ui'(x) yi(x) = 0 gives the derivative yp'(x) = Σ_{i=1}^{n} ui(x) yi'(x).
Imposed condition Σ_{i=1}^{n} ui'(x) yi'(x) = 0 gives yp''(x) = Σ_{i=1}^{n} ui(x) yi''(x).
Continuing likewise, the condition Σ_{i=1}^{n} ui'(x) yi^(n-2)(x) = 0 gives yp^(n-1)(x) = Σ_{i=1}^{n} ui(x) yi^(n-1)(x).

Finally, yp^(n)(x) = Σ_{i=1}^{n} ui'(x) yi^(n-1)(x) + Σ_{i=1}^{n} ui(x) yi^(n)(x).

Substitution into the ODE:

Σ_{i=1}^{n} ui'(x) yi^(n-1)(x) + Σ_{i=1}^{n} ui(x) [ yi^(n) + P1 yi^(n-1) + ··· + Pn yi ] = R(x)

Mathematical Methods in Engineering and Science

Higher Order Linear ODEs

Non-Homogeneous Equations
Since each yi(x) is a solution of the HE,

Σ_{i=1}^{n} ui'(x) yi^(n-1)(x) = R(x).

1014,

Theory of Linear ODEs

Homogeneous Equations with Constant Coefficients
Non-Homogeneous Equations
Euler-Cauchy Equation of Higher Order

Assembling all the conditions on u'(x) together,

[Y(x)] u'(x) = en R(x).

Since Y^{-1} = adj Y / det(Y),

u'(x) = (1/det[Y(x)]) [adj Y(x)] en R(x) = (R(x)/W(x)) [last column of adj Y(x)].

Using cofactors of elements from the last row only,

ui'(x) = (Wi(x)/W(x)) R(x),

with Wi(x) = the Wronskian evaluated with en in place of the i-th column.

ui(x) = ∫ Wi(x) R(x) / W(x) dx

Mathematical Methods in Engineering and Science

Points to note

Higher Order Linear ODEs

Wronskian for a higher order ODE


General theory of linear ODEs

Variation of parameters for n-th order ODE

Necessary Exercises: 1,3,4

1015,

Theory of Linear ODEs


Homogeneous Equations with Constant Coefficients
Non-Homogeneous Equations
Euler-Cauchy Equation of Higher Order

Mathematical Methods in Engineering and Science

Outline

Laplace Transforms
Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

Laplace Transforms
Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

1016,


Mathematical Methods in Engineering and Science

Introduction
Classical perspective

Laplace Transforms
Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

Entire differential equation is known in advance.

Go for a complete solution first.

Afterwards, use the initial (or other) conditions.

A practical situation
You have a plant

intrinsic dynamic model as well as the starting conditions.

You may drive the plant with different kinds of inputs on


different occasions.

Implication

Left-hand side of the ODE and the initial conditions are


known a priori.

Right-hand side, R(x), changes from task to task.

1019,


Mathematical Methods in Engineering and Science

Laplace Transforms

Introduction

Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

Another question: What if R(x) is not continuous?

When power is switched on or off, what happens?

If there is a sudden voltage fluctuation, what happens to the


equipment connected to the power line?

Or, does anything happen in the immediate future?


Something certainly happens. The IVP has a solution!
Laplace transforms provide a tool to find the solution, in
spite of the discontinuity of R(x).
Integral transform:

T[f(t)](s) = ∫_a^b K(s, t) f(t) dt

s: frequency variable
K (s, t): kernel of the transform
Note: T [f (t)] is a function of s, not t.

1022,


Mathematical Methods in Engineering and Science

Laplace Transforms

Introduction

Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

With the kernel function K(s, t) = e^{-st}, and limits a = 0, b = ∞,

Laplace transform:

F(s) = L{f(t)} = ∫_0^∞ e^{-st} f(t) dt = lim_{b→∞} ∫_0^b e^{-st} f(t) dt

When this integral exists, f(t) has its Laplace transform.

Sufficient condition:

f(t) is piecewise continuous, and

it is of exponential order, i.e. |f(t)| < M e^{ct} for some (finite) M and c.

Inverse Laplace transform:

f(t) = L^{-1}{F(s)}

1025,
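For quick checks, a computer algebra system evaluates the defining integral directly. A minimal sketch, assuming sympy; the functions are illustrative choices.

import sympy as sp

t, s = sp.symbols('t s', positive=True)
a = sp.symbols('a', positive=True)

for f in (1, t, t**3, sp.exp(a*t), sp.sin(2*t)):
    F = sp.laplace_transform(f, t, s, noconds=True)
    print(f, '->', F)
# expect: 1 -> 1/s, t -> 1/s**2, t**3 -> 6/s**4, exp(a*t) -> 1/(s - a), sin(2t) -> 2/(s**2 + 4)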

Mathematical Methods in Engineering and Science

Basic Properties and Results


Linearity:

Laplace Transforms
Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

L{af (t) + bg (t)} = aL{f (t)} + bL{g (t)}


First shifting property, or the frequency shifting rule:

L{e^{at} f(t)} = F(s - a)

Laplace transforms of some elementary functions:

L(1) = ∫_0^∞ e^{-st} dt = [ -e^{-st}/s ]_0^∞ = 1/s,

L(t) = ∫_0^∞ e^{-st} t dt = [ -t e^{-st}/s ]_0^∞ + (1/s) ∫_0^∞ e^{-st} dt = 1/s^2,

L(t^n) = n!/s^{n+1}   (for positive integer n),

L(t^a) = Γ(a + 1)/s^{a+1}   (for a ∈ R^+),

and L(e^{at}) = 1/(s - a).

1026,

Mathematical Methods in Engineering and Science

Laplace Transforms

Basic Properties and Results

Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

L(cos ωt) = s/(s^2 + ω^2),      L(sin ωt) = ω/(s^2 + ω^2);
L(cosh at) = s/(s^2 - a^2),     L(sinh at) = a/(s^2 - a^2);
L(e^{λt} cos ωt) = (s - λ)/((s - λ)^2 + ω^2),      L(e^{λt} sin ωt) = ω/((s - λ)^2 + ω^2).

Laplace transform of the derivative:

L{f'(t)} = ∫_0^∞ e^{-st} f'(t) dt = [ e^{-st} f(t) ]_0^∞ + s ∫_0^∞ e^{-st} f(t) dt = s L{f(t)} - f(0)

Using this process recursively,

L{f^(n)(t)} = s^n L{f(t)} - s^{n-1} f(0) - s^{n-2} f'(0) - ··· - f^{(n-1)}(0).

For the integral g(t) = ∫_0^t f(τ) dτ,  g(0) = 0, and
L{f(t)} = L{g'(t)} = s L{g(t)} - g(0) = s L{g(t)}   ⇒   L{g(t)} = (1/s) L{f(t)}.

1029,
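A minimal sketch, assuming sympy, checking the derivative rule L{f'(t)} = s L{f(t)} - f(0) on an illustrative function.

import sympy as sp

t, s = sp.symbols('t s', positive=True)
f = sp.exp(-2*t)*sp.cos(3*t)

F  = sp.laplace_transform(f, t, s, noconds=True)
Fd = sp.laplace_transform(f.diff(t), t, s, noconds=True)

print(sp.simplify(Fd - (s*F - f.subs(t, 0))))   # -> 0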


Mathematical Methods in Engineering and Science

Application to Differential Equations

Laplace Transforms
Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

Example:
Initial value problem of a linear constant coefficient ODE

y'' + ay' + by = r(t),   y(0) = K0,   y'(0) = K1

Laplace transforms of both sides of the ODE:

s^2 Y(s) - s y(0) - y'(0) + a[sY(s) - y(0)] + bY(s) = R(s)

⇒ (s^2 + as + b) Y(s) = (s + a)K0 + K1 + R(s)

A differential equation in y(t) has been converted to an algebraic equation in Y(s).

Transfer function: the ratio of the Laplace transform of the output function y(t) to that of the input function r(t), with zero initial conditions:

Q(s) = Y(s)/R(s) = 1/(s^2 + as + b)   (in this case)

Y(s) = [(s + a)K0 + K1] Q(s) + Q(s) R(s)

Solution of the given IVP: y(t) = L^{-1}{Y(s)}

1032,
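The whole procedure, including the transfer function, can be walked through symbolically. A minimal sketch, assuming sympy; the coefficients a = 3, b = 2, the initial values K0 = 1, K1 = 0 and the unit-step input are illustrative choices.

import sympy as sp

t = sp.symbols('t', positive=True)
s = sp.symbols('s')
K0, K1 = 1, 0
a, b = 3, 2

R = sp.laplace_transform(sp.Heaviside(t), t, s, noconds=True)   # R(s) = 1/s
Q = 1/(s**2 + a*s + b)                                          # transfer function
Y = ((s + a)*K0 + K1)*Q + Q*R                                   # Y(s)
y = sp.inverse_laplace_transform(Y, s, t)                       # y(t) = L^{-1}{Y(s)}
print(sp.simplify(y))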

Mathematical Methods in Engineering and Science

Laplace Transforms

Handling Discontinuities

Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

Unit step function

u(t - a) = { 0 if t < a,   1 if t > a }

Its Laplace transform:

L{u(t - a)} = ∫_0^∞ e^{-st} u(t - a) dt = ∫_0^a 0 dt + ∫_a^∞ e^{-st} dt = e^{-as}/s

For an input f(t) with a time delay,

f(t - a) u(t - a) = { 0 if t < a,   f(t - a) if t > a }

has its Laplace transform as

L{f(t - a) u(t - a)} = ∫_a^∞ e^{-st} f(t - a) dt = ∫_0^∞ e^{-s(a+τ)} f(τ) dτ = e^{-as} L{f(t)}.

Second shifting property, or the time shifting rule

1034,

Mathematical Methods in Engineering and Science

Laplace Transforms

Handling Discontinuities

Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

Define

fk(t - a) = { 1/k if a ≤ t ≤ a + k,   0 otherwise } = (1/k) u(t - a) - (1/k) u(t - a - k)

[Figure: Step and impulse functions — (a) unit step function u(t - a), (b) composition (1/k)u(t - a) - (1/k)u(t - a - k), (c) function fk(t - a), (d) Dirac's δ(t - a)]

and note that its integral

Ik = ∫_0^∞ fk(t - a) dt = ∫_a^{a+k} (1/k) dt = 1

does not depend on k.

1036,

Mathematical Methods in Engineering and Science

Laplace Transforms

Handling Discontinuities
In the limit,

δ(t - a) = lim_{k→0} fk(t - a),

or,

δ(t - a) = { ∞ if t = a,   0 otherwise }    and    ∫_0^∞ δ(t - a) dt = 1.

1039,

Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

Unit impulse function, or Dirac's delta function.

L{δ(t - a)} = lim_{k→0} (1/k) [ L{u(t - a)} - L{u(t - a - k)} ]
            = lim_{k→0} (e^{-as} - e^{-(a+k)s}) / (ks) = e^{-as}

Through step and impulse functions, the Laplace transform
method can handle IVPs with discontinuous inputs.
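A minimal sketch, assuming sympy, of such a discontinuous-input IVP: a unit impulse applied at t = 1 to the illustrative system y'' + 2y' + 5y = δ(t - 1) with zero initial conditions.

import sympy as sp

t = sp.symbols('t', positive=True)
s = sp.symbols('s')

R = sp.laplace_transform(sp.DiracDelta(t - 1), t, s, noconds=True)  # = e^{-s}
Y = R/(s**2 + 2*s + 5)
y = sp.inverse_laplace_transform(Y, s, t)
print(y)   # expect (1/2) e^{-(t-1)} sin(2(t-1)) u(t-1): nothing happens before t = 1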

Mathematical Methods in Engineering and Science

Laplace Transforms

Convolution

1042,

Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

A generalized product of two functions:

h(t) = f(t) ∗ g(t) = ∫_0^t f(τ) g(t - τ) dτ

Laplace transform of the convolution:

H(s) = ∫_0^∞ e^{-st} [ ∫_0^t f(τ) g(t - τ) dτ ] dt = ∫_0^∞ f(τ) [ ∫_τ^∞ e^{-st} g(t - τ) dt ] dτ

[Figure: Region of integration for L{h(t)} in the t-τ plane — (a) original order, (b) changed order]

Mathematical Methods in Engineering and Science

Laplace Transforms

Convolution

Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

Through the substitution t' = t - τ,

H(s) = ∫_0^∞ f(τ) [ ∫_0^∞ e^{-s(t'+τ)} g(t') dt' ] dτ
     = ∫_0^∞ e^{-sτ} f(τ) [ ∫_0^∞ e^{-st'} g(t') dt' ] dτ

⇒ H(s) = F(s) G(s)

Convolution theorem:
The Laplace transform of the convolution integral of two
functions is given by the product of the Laplace
transforms of the two functions.

Utilities:

To invert Q(s)R(s), one can convolute y(t) = q(t) ∗ r(t).

In solving some integral equations.

1044,
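A minimal sketch, assuming sympy, verifying the convolution theorem on the illustrative pair f(t) = t and g(t) = e^{-t}: the transform of the convolution integral equals F(s)G(s).

import sympy as sp

t, tau = sp.symbols('t tau', positive=True)
s = sp.symbols('s', positive=True)

f = lambda u: u
g = lambda u: sp.exp(-u)

h = sp.integrate(f(tau)*g(t - tau), (tau, 0, t))     # (f * g)(t) = t - 1 + e^{-t}
H = sp.laplace_transform(h, t, s, noconds=True)
FG = sp.laplace_transform(f(t), t, s, noconds=True) * \
     sp.laplace_transform(g(t), t, s, noconds=True)

print(sp.simplify(H - FG))   # -> 0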

Mathematical Methods in Engineering and Science

Points to note

Laplace Transforms
Introduction
Basic Properties and Results
Application to Differential Equations
Handling Discontinuities
Convolution
Advanced Issues

A paradigm shift in solution of IVPs

Handling discontinuous input functions

Extension to ODE systems

The idea of integral transforms

Necessary Exercises: 1,2,4

1045,

Mathematical Methods in Engineering and Science

Outline

ODE Systems

1046,

Fundamental Ideas
Linear Homogeneous Systems with Constant Coefficients
Linear Non-Homogeneous Systems
Nonlinear Systems

ODE Systems
Fundamental Ideas
Linear Homogeneous Systems with Constant Coefficients
Linear Non-Homogeneous Systems
Nonlinear Systems

Mathematical Methods in Engineering and Science

ODE Systems

Fundamental Ideas
y' = f(t, y)
Solution: a vector function y = h(t)
Autonomous system: y' = f(y)

1049,

Fundamental Ideas
Linear Homogeneous Systems with Constant Coeffici
Linear Non-Homogeneous Systems
Nonlinear Systems

Points in y-space where f(y) = 0:


equilibrium points or critical points

System of linear ODEs:


y' = A(t) y + g(t)

autonomous systems if A and g are constant

homogeneous systems if g(t) = 0

homogeneous constant coefficient systems if A is constant


and g(t) = 0

Mathematical Methods in Engineering and Science

ODE Systems

Fundamental Ideas
For a homogeneous system,

y' = A(t) y

Wronskian: W(y1, y2, y3, ..., yn) = |y1 y2 y3 ··· yn|

If the Wronskian is non-zero, then

Fundamental matrix: Y(t) = [y1 y2 y3 ··· yn],

giving a basis.

General solution:

y(t) = Σ_{i=1}^{n} ci yi(t) = [Y(t)] c

1051,

Fundamental Ideas
Linear Homogeneous Systems with Constant Coefficients
Linear Non-Homogeneous Systems
Nonlinear Systems

Mathematical Methods in Engineering and Science

ODE Systems

1059,

Fundamental Ideas
Linear Homogeneous Systems with Constant Coefficients
Linear Non-Homogeneous Systems
Nonlinear Systems

y' = A y

Non-degenerate case: matrix A non-singular

Origin (y = 0) is the unique equilibrium point.

Attempt y = x e^{λt}   ⇒   y' = λ x e^{λt}.
Substitution: A x e^{λt} = λ x e^{λt}   ⇒   A x = λ x

If A is diagonalizable,
n linearly independent solutions yi = xi e^{λi t} corresponding to the n eigenpairs.

If A is not diagonalizable?
All the xi e^{λi t} together will not complete the basis.
Try y = x t e^{λt}?  Substitution leads to
x e^{λt} + λ x t e^{λt} = A x t e^{λt}   ⇒   x e^{λt} = 0   ⇒   x = 0.
Absurd!

Mathematical Methods in Engineering and Science

ODE Systems

1062,

Fundamental Ideas
Linear Homogeneous Systems with Constant
Coefficients
Linear Homogeneous
Systems with Constant Coeffici
Linear Non-Homogeneous Systems
Nonlinear Systems

Try a linearly independent solution in the form


y = xte t + ue t .

Linear independence here has two implications: in


function space AND in ordinary vector space!
Substitution:
xe t + xte t + ue t = Axte t + Aue t (A I)u = x
Solve for u, the generalized eigenvector of A.
For Jordan blocks of larger sizes,
1
y1 = xe t , y2 = xte t +u1 e t , y3 = xt 2 e t +u1 te t +u2 e t etc.
2
Jordan canonical form (JCF) of A provides a set of basis
functions to describe the complete solution of the ODE
system.
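For the diagonalizable case, the eigenpair construction above is easy to check numerically. A minimal sketch, assuming numpy and scipy; the matrix A and the initial state are illustrative choices, and the eigenpair solution is compared against the matrix exponential.

import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])        # eigenvalues -1 and -2
y0 = np.array([1.0, 0.0])

lam, X = np.linalg.eig(A)           # A X = X diag(lam)
c = np.linalg.solve(X, y0)          # y(0) = X c

t = 1.7
y_eig = (X * np.exp(lam*t)) @ c     # sum_i c_i x_i exp(lam_i t)
y_expm = expm(A*t) @ y0

print(np.allclose(y_eig, y_expm))   # True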

Mathematical Methods in Engineering and Science

ODE Systems

Linear Non-Homogeneous Systems


y' = A y + g(t)

Complementary function:

yh(t) = Σ_{i=1}^{n} ci yi(t) = [Y(t)] c

Complete solution:

y(t) = yh(t) + yp(t)

1065,

Fundamental Ideas
Linear Homogeneous Systems with Constant Coefficients
Linear Non-Homogeneous Systems
Nonlinear Systems

We need to develop one particular solution yp.

Method of undetermined coefficients:
Based on g(t), select candidate functions Gk(t) and propose

yp = Σ_k uk Gk(t),

with the vector coefficients uk to be determined by substitution.

Mathematical Methods in Engineering and Science

ODE Systems

Linear Non-Homogeneous Systems

Method of diagonalization

If A is a diagonalizable constant matrix, with X^{-1} A X = D,
changing variables to z = X^{-1} y, such that y = X z,

X z' = A X z + g(t)   ⇒   z' = X^{-1} A X z + X^{-1} g(t) = D z + h(t) (say).

Single decoupled Leibnitz equations

zk' = dk zk + hk(t),   k = 1, 2, 3, ..., n;

leading to individual solutions

zk(t) = ck e^{dk t} + e^{dk t} ∫ e^{-dk t} hk(t) dt.

1068,

Fundamental Ideas
Linear Homogeneous Systems with Constant Coefficients
Linear Non-Homogeneous Systems
Nonlinear Systems

After assembling z(t), we reconstruct y = X z.
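A minimal sketch, assuming sympy, of the diagonalization method for an illustrative 2x2 system y' = A y + g(t); c1 and c2 stand for the free constants of the complementary function.

import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[0, 1], [-2, -3]])
g = sp.Matrix([0, sp.exp(-3*t)])

X, D = A.diagonalize()                    # X^{-1} A X = D
h = X.inv() * g                           # h(t) = X^{-1} g(t)

z = sp.zeros(2, 1)
for k in range(2):
    dk = D[k, k]
    ck = sp.Symbol('c%d' % (k + 1))
    z[k] = ck*sp.exp(dk*t) + sp.exp(dk*t)*sp.integrate(sp.exp(-dk*t)*h[k], t)

y = sp.simplify(X * z)                    # reconstruct y = X z
print(y)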

Mathematical Methods in Engineering and Science

ODE Systems

Linear Non-Homogeneous Systems

Method of variation of parameters

If we can supply a basis Y(t) of the complementary function yh(t),
then we propose

yp(t) = [Y(t)] u(t)

Substitution leads to

Y' u + Y u' = A Y u + g.

Since Y' = A Y,

Y u' = g,  or,  u' = [Y]^{-1} g.

Complete solution:

y(t) = yh + yp = [Y] c + [Y] ∫ [Y]^{-1} g dt

This method is completely general.

1071,

Fundamental Ideas
Linear Homogeneous Systems with Constant Coefficients
Linear Non-Homogeneous Systems
Nonlinear Systems

Mathematical Methods in Engineering and Science

Points to note

ODE Systems

Theory of ODEs in terms of vector functions


Methods to find

complementary functions in the case of constant coefficients


particular solutions for all cases

Necessary Exercises: 1

1072,

Fundamental Ideas
Linear Homogeneous Systems with Constant Coeffici
Linear Non-Homogeneous Systems
Nonlinear Systems

Mathematical Methods in Engineering and Science

Outline

Stability of Dynamic Systems


Second Order Linear Systems
Nonlinear Dynamic Systems
Lyapunov Stability Analysis

Stability of Dynamic Systems


Second Order Linear Systems
Nonlinear Dynamic Systems
Lyapunov Stability Analysis

1073,

Mathematical Methods in Engineering and Science

Second Order Linear Systems

Stability of Dynamic Systems


Second Order Linear Systems
Nonlinear Dynamic Systems
Lyapunov Stability Analysis

A system of two first order linear differential equations:

y1' = a11 y1 + a12 y2
y2' = a21 y1 + a22 y2,
or,   y' = A y

Phase: a pair of values of y1 and y2

Phase plane: the plane of y1 and y2
Trajectory: a curve showing the evolution of the system for a
particular initial value problem
Phase portrait: all trajectories together showing the complete
picture of the behaviour of the dynamic system

Allowing only isolated equilibrium points,
matrix A is non-singular: the origin is the only equilibrium point.
Eigenvalues of A:
λ^2 - (a11 + a22) λ + (a11 a22 - a12 a21) = 0

1076,

Mathematical Methods in Engineering and Science

Stability of Dynamic Systems

Second Order Linear Systems

Second Order Linear Systems


Nonlinear Dynamic Systems
Lyapunov Stability Analysis

Characteristic equation:

λ^2 - p λ + q = 0,

with p = (a11 + a22) = λ1 + λ2 and q = a11 a22 - a12 a21 = λ1 λ2

Discriminant D = p^2 - 4q and

λ_{1,2} = p/2 ± sqrt( (p/2)^2 - q ) = p/2 ± (√D)/2.

Solution (for diagonalizable A):

y = c1 x1 e^{λ1 t} + c2 x2 e^{λ2 t}

Solution for deficient A:
y = c1 x1 e^{λt} + c2 (t x1 + u) e^{λt}
y' = c1 λ x1 e^{λt} + c2 (x1 + λ u) e^{λt} + t c2 λ x1 e^{λt}

1079,
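The classification by p, q and D is easily mechanized. A minimal sketch, assuming numpy; the matrices are illustrative choices, and the degenerate case q = 0 (non-isolated equilibrium points) is excluded as in the text.

import numpy as np

def classify(A):
    p = A[0, 0] + A[1, 1]                 # trace = lambda1 + lambda2
    q = np.linalg.det(A)                  # det   = lambda1 * lambda2
    D = p*p - 4*q
    if q < 0:
        return 'saddle point (unstable)'
    if p == 0:
        return 'centre (stable)'
    kind = 'spiral' if D < 0 else 'node'
    return '%s (%s)' % (kind, 'stable' if p < 0 else 'unstable')

for A in (np.array([[0., 1.], [-2., -3.]]),    # stable node
          np.array([[0., 1.], [-1., 0.]]),     # centre
          np.array([[1., 2.], [3., 0.]])):     # saddle point
    print(classify(A))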

Mathematical Methods in Engineering and Science

Stability of Dynamic Systems

Second Order Linear Systems


Second Order Linear Systems
Nonlinear Dynamic Systems
Lyapunov Stability Analysis

[Figure: Neighbourhood of critical points — phase portraits in the y1-y2 plane: (a) saddle point, (b) centre, (c) spiral, (d) improper node, (e) proper node, (f) degenerate node]

1080,

Mathematical Methods in Engineering and Science

Stability of Dynamic Systems

Second Order Linear Systems

1082,

Second Order Linear Systems

Nonlinear Dynamic Systems
Lyapunov Stability Analysis

Table: Critical points of linear systems

Type        Sub-type     Eigenvalues                            Position in p-q chart            Stability
Saddle pt                real, opposite signs                   q < 0                            unstable
Centre                   pure imaginary                         q > 0, p = 0                     stable
Spiral                   complex, non-zero real and             q > 0, p ≠ 0, D = p^2 - 4q < 0   stable if p < 0,
                         imaginary parts                                                         unstable if p > 0
Node                     real, same sign                        q > 0, p ≠ 0, D ≥ 0              stable if p < 0,
            improper     unequal in magnitude                   D > 0                            unstable if p > 0
            proper       equal, diagonalizable                  D = 0
            degenerate   equal, deficient                       D = 0

[Figure: Zones of critical points in the p-q chart — the parabola p^2 - 4q = 0 separates nodes (D ≥ 0) from spirals (D < 0); q < 0 gives saddle points; within q > 0, p < 0 gives the stable side and p > 0 the unstable side, while p = 0 gives centres]

Mathematical Methods in Engineering and Science

Stability of Dynamic Systems

Nonlinear Dynamic Systems

Second Order Linear Systems


Nonlinear Dynamic Systems
Lyapunov Stability Analysis

Phase plane analysis

Determine all the critical points.

Linearize the ODE system around each of them as


y = J(y0 )(y y0 ).

With z = y y0 , analyze each neighbourhood from z = Jz.


Assemble outcomes of local phase plane analyses.

Features of a dynamic system are typically captured by


its critical points and their neighbourhoods.
Limit cycles

isolated closed trajectories (only in nonlinear systems)

Systems with arbitrary dimension of state space?
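The recipe above can be mechanized. The sketch below is a minimal illustration under an assumed example system (a damped pendulum, not from the slides): it forms the Jacobian at each critical point by finite differences and inspects its eigenvalues, as the local phase plane analysis prescribes.

```python
import numpy as np

def numerical_jacobian(f, y0, h=1e-6):
    """Forward-difference Jacobian J(y0) of the vector field f at a critical point y0."""
    y0 = np.asarray(y0, dtype=float)
    J = np.zeros((y0.size, y0.size))
    f0 = np.asarray(f(y0))
    for j in range(y0.size):
        yp = y0.copy()
        yp[j] += h
        J[:, j] = (np.asarray(f(yp)) - f0) / h
    return J

# Assumed example: damped pendulum  y1' = y2,  y2' = -sin(y1) - 0.5*y2; critical points at (m*pi, 0).
f = lambda y: np.array([y[1], -np.sin(y[0]) - 0.5 * y[1]])

for y0 in ([0.0, 0.0], [np.pi, 0.0]):
    eig = np.linalg.eigvals(numerical_jacobian(f, y0))
    print(y0, np.round(eig, 3))
# (0, 0): complex pair with negative real part -> stable spiral
# (pi, 0): real eigenvalues of opposite signs -> saddle point
```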

Lyapunov Stability Analysis

Important terms

Stability: If y0 is a critical point of the dynamic system ẏ = f(y) and for every ε > 0 there exists a δ > 0 such that

    ‖y(t0) − y0‖ < δ  ⇒  ‖y(t) − y0‖ < ε  ∀ t > t0,

then y0 is a stable critical point. If, further, y(t) → y0 as t → ∞, then y0 is said to be asymptotically stable.

Positive definite function: A function V(y), with V(0) = 0, is called positive definite if V(y) > 0 ∀ y ≠ 0.

Lyapunov function: A positive definite function V(y), having continuous partial derivatives ∂V/∂yi, with a negative semi-definite rate of change

    V̇ = [∇V(y)]ᵀ f(y).

Lyapunov's stability criteria:

  Theorem: For a system ẏ = f(y) with the origin as a critical point, if there exists a Lyapunov function V(y), then the system is stable at the origin, i.e. the origin is a stable critical point.
  Further, if V̇(y) is negative definite, then it is asymptotically stable.

A generalization of the notion of total energy: negativity of its rate corresponds to trajectories tending to decrease this energy.

Note: Lyapunov's method becomes particularly important when a linearized model allows no analysis or when its results are suspect.

Caution: It is a one-way criterion only!
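For a concrete illustration of the criterion, consider the assumed example system ẏ1 = −y1³ + y2, ẏ2 = −y1 − y2³ (not from the slides), whose linearization at the origin is a centre and hence inconclusive. A sampled numerical check of V(y) = y1² + y2² as a Lyapunov candidate:

```python
import numpy as np

# Assumed example (not from the slides): y1' = -y1**3 + y2, y2' = -y1 - y2**3.
# The linearization at the origin is a centre (inconclusive); try V(y) = y1^2 + y2^2.
f = lambda y1, y2: (-y1**3 + y2, -y1 - y2**3)
V = lambda y1, y2: y1**2 + y2**2
Vdot = lambda y1, y2: 2*y1*f(y1, y2)[0] + 2*y2*f(y1, y2)[1]    # [grad V]^T f(y) = -2(y1^4 + y2^4)

pts = np.random.uniform(-2.0, 2.0, size=(1000, 2))
pts = pts[np.linalg.norm(pts, axis=1) > 1e-6]                  # exclude the origin itself
assert np.all(V(pts[:, 0], pts[:, 1]) > 0)                     # V positive definite (sampled)
assert np.all(Vdot(pts[:, 0], pts[:, 1]) < 0)                  # rate negative definite (sampled)
print("V certifies asymptotic stability of the origin on the sampled region.")
```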

Points to note

  Analysis of second order systems
  Classification of critical points
  Nonlinear systems and local linearization
  Phase plane analysis
      Examples in physics, engineering, economics, biological and social systems
  Lyapunov's method of stability analysis

Necessary Exercises: 1, 2, 3, 4, 5

Series Solutions and Special Functions

Outline

  Power Series Method
  Frobenius Method
  Special Functions Defined as Integrals
  Special Functions Arising as Solutions of ODEs

Power Series Method

Methods to solve an ODE in terms of elementary functions:
  restricted in scope.
  Theory allows study of the properties of solutions!

When elementary methods fail,
  gain knowledge about solutions through properties, and
  for actual evaluation develop infinite series.

Power series:

    y(x) = Σ_{n=0}^∞ an xⁿ = a0 + a1 x + a2 x² + a3 x³ + a4 x⁴ + a5 x⁵ + ···

or in powers of (x − x0).

A simple exercise:
  Try developing power series solutions in the above form
  and study their properties for the differential equations
      y″ + y = 0   and   4x² y″ = y.

For the ODE

    y″ + P(x)y′ + Q(x)y = 0,

if P(x) and Q(x) are analytic at a point x = x0,
i.e. if they possess convergent series expansions in powers of (x − x0) with some radius of convergence R,
then the solution is analytic at x0, and a power series solution

    y(x) = a0 + a1(x − x0) + a2(x − x0)² + a3(x − x0)³ + ···

is convergent at least for |x − x0| < R.

For x0 = 0 (without loss of generality), suppose

    P(x) = Σ_{n=0}^∞ pn xⁿ = p0 + p1 x + p2 x² + p3 x³ + ···,
    Q(x) = Σ_{n=0}^∞ qn xⁿ = q0 + q1 x + q2 x² + q3 x³ + ···,

and assume y(x) = Σ_{n=0}^∞ an xⁿ.

Differentiation of y(x) = Σ_{n=0}^∞ an xⁿ gives

    y′(x) = Σ_{n=0}^∞ (n + 1) a_{n+1} xⁿ   and   y″(x) = Σ_{n=0}^∞ (n + 2)(n + 1) a_{n+2} xⁿ,

which leads to

    P(x)y′ = [Σ_{n=0}^∞ pn xⁿ] [Σ_{n=0}^∞ (n + 1) a_{n+1} xⁿ] = Σ_{n=0}^∞ [Σ_{k=0}^n p_{n−k} (k + 1) a_{k+1}] xⁿ,
    Q(x)y  = [Σ_{n=0}^∞ qn xⁿ] [Σ_{n=0}^∞ an xⁿ] = Σ_{n=0}^∞ [Σ_{k=0}^n q_{n−k} ak] xⁿ.

Substituting into the ODE,

    Σ_{n=0}^∞ [(n + 2)(n + 1) a_{n+2} + Σ_{k=0}^n p_{n−k} (k + 1) a_{k+1} + Σ_{k=0}^n q_{n−k} ak] xⁿ = 0.

Recursion formula:

    a_{n+2} = − 1/[(n + 2)(n + 1)] Σ_{k=0}^n [(k + 1) p_{n−k} a_{k+1} + q_{n−k} ak]
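A minimal sketch of the recursion in code (plain Python; the function name power_series_solution is assumed), applied to y″ + y = 0, i.e. p0 = 0 and q0 = 1 with all other coefficients zero:

```python
import math

def power_series_solution(p, q, a0, a1, N):
    """Coefficients a[0..N] of y = sum a_n x^n for y'' + P(x) y' + Q(x) y = 0,
    where p[j] and q[j] are the Taylor coefficients of P and Q about x = 0."""
    a = [0.0] * (N + 1)
    a[0], a[1] = a0, a1
    for n in range(N - 1):
        s = sum((k + 1) * p[n - k] * a[k + 1] + q[n - k] * a[k] for k in range(n + 1))
        a[n + 2] = -s / ((n + 2) * (n + 1))
    return a

# y'' + y = 0:  P = 0, Q = 1 (q0 = 1, all other coefficients zero); y(0) = 1, y'(0) = 0.
N = 12
a = power_series_solution([0.0] * (N + 1), [1.0] + [0.0] * N, a0=1.0, a1=0.0, N=N)
x = 0.7
print(sum(an * x**n for n, an in enumerate(a)), math.cos(x))   # truncated series reproduces cos x
```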

Frobenius Method

For the ODE y″ + P(x)y′ + Q(x)y = 0, a point x = x0 is an
  ordinary point if P(x) and Q(x) are analytic at x = x0: the power series solution is analytic;
  singular point if any of the two is non-analytic (singular) at x = x0:
      regular singularity: (x − x0)P(x) and (x − x0)²Q(x) are analytic at the point;
      irregular singularity: otherwise.

The case of regular singularity

For x0 = 0, with P(x) = b(x)/x and Q(x) = c(x)/x²,

    x² y″ + x b(x) y′ + c(x) y = 0,

in which b(x) and c(x) are analytic at the origin.

Working steps:
  1. Assume the solution in the form y(x) = x^r Σ_{n=0}^∞ an xⁿ.
  2. Differentiate to get the series expansions for y′(x) and y″(x).
  3. Substitute these series for y(x), y′(x) and y″(x) into the given ODE and collect coefficients of x^r, x^{r+1}, x^{r+2} etc.
  4. Equate the coefficient of x^r to zero to obtain an equation in the index r, called the indicial equation,
         r(r − 1) + b0 r + c0 = 0,
     allowing a0 to become arbitrary.
  5. For each solution r, equate other coefficients to obtain a1, a2, a3 etc in terms of a0.

Note: The need is to develop two solutions.

Special Functions Defined as Integrals

Gamma function: Γ(n) = ∫_0^∞ e^(−x) x^(n−1) dx, convergent for n > 0.
    Recurrence relation Γ(1) = 1, Γ(n + 1) = nΓ(n)
    allows extension of the definition for the entire real line except for zero and negative integers.
    Γ(n + 1) = n! for non-negative integers.
    (A generalization of the factorial function.)

Beta function: B(m, n) = ∫_0^1 x^(m−1) (1 − x)^(n−1) dx = 2 ∫_0^{π/2} sin^(2m−1)θ cos^(2n−1)θ dθ;  m, n > 0.
    B(m, n) = B(n, m);  B(m, n) = Γ(m)Γ(n)/Γ(m + n)

Error function: erf(x) = (2/√π) ∫_0^x e^(−t²) dt.
    (Area under the normal or Gaussian distribution)

Sine integral function: Si(x) = ∫_0^x (sin t / t) dt.
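A quick numerical sanity check of two of the identities above, using only the standard library (the Simpson-rule helper is an assumed convenience, not a library routine):

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule (n even) -- a small assumed quadrature helper."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, n))
    return s * h / 3

m, n = 2.5, 1.5
beta_integral = simpson(lambda t: t**(m - 1) * (1 - t)**(n - 1), 0.0, 1.0)
print(beta_integral, math.gamma(m) * math.gamma(n) / math.gamma(m + n))   # B(m,n) = G(m)G(n)/G(m+n)

erf_integral = (2 / math.sqrt(math.pi)) * simpson(lambda t: math.exp(-t * t), 0.0, 1.0)
print(erf_integral, math.erf(1.0))                                        # matches math.erf
```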

Special Functions Arising as Solutions of ODEs

In the study of some important problems in physics,
  some variable-coefficient ODEs appear recurrently,
  defying analytical solution!

Series solutions → properties and connections;
further problems → further solutions.

Table: Special functions of mathematical physics

  Name of the ODE              Form of the ODE                                Resulting functions
  Legendre's equation          (1 − x²)y″ − 2xy′ + k(k + 1)y = 0              Legendre functions, Legendre polynomials
  Airy's equation              y″ − k²xy = 0                                  Airy functions
  Chebyshev's equation         (1 − x²)y″ − xy′ + k²y = 0                     Chebyshev polynomials
  Hermite's equation           y″ − 2xy′ + 2ky = 0                            Hermite functions, Hermite polynomials
  Bessel's equation            x²y″ + xy′ + (x² − k²)y = 0                    Bessel functions, Neumann functions, Hankel functions
  Gauss's hypergeometric eq.   x(1 − x)y″ + [c − (a + b + 1)x]y′ − aby = 0    Hypergeometric function
  Laguerre's equation          xy″ + (1 − x)y′ + ky = 0                       Laguerre polynomials

Legendre's equation

    (1 − x²)y″ − 2xy′ + k(k + 1)y = 0

P(x) = −2x/(1 − x²) and Q(x) = k(k + 1)/(1 − x²) are analytic at x = 0 with radius of convergence R = 1.

x = 0 is an ordinary point and a power series solution y(x) = Σ_{n=0}^∞ an xⁿ is convergent at least for |x| < 1.

Apply power series method:

    a2 = − k(k + 1)/2! · a0,
    a3 = − (k − 1)(k + 2)/3! · a1,
and
    a_{n+2} = − (k − n)(k + n + 1)/[(n + 2)(n + 1)] · an   for n ≥ 2.

Solution: y(x) = a0 y1(x) + a1 y2(x)

Legendre functions

    y1(x) = 1 − k(k + 1)/2! · x² + k(k − 2)(k + 1)(k + 3)/4! · x⁴ − ···
    y2(x) = x − (k − 1)(k + 2)/3! · x³ + (k − 1)(k − 3)(k + 2)(k + 4)/5! · x⁵ − ···

Special significance: non-negative integral values of k.

  For each k = 0, 1, 2, 3, ···, one of the series terminates at the term containing x^k.
  Polynomial solution: valid for the entire real line!

Recurrence relation in reverse:

    a_{k−2} = − k(k − 1)/[2(2k − 1)] · ak

Legendre polynomial

Choosing ak = (2k − 1)(2k − 3) ··· 3 · 1 / k!,

    Pk(x) = [(2k − 1)(2k − 3) ··· 3 · 1 / k!] [x^k − k(k − 1)/[2(2k − 1)] · x^{k−2} + k(k − 1)(k − 2)(k − 3)/[2 · 4 (2k − 1)(2k − 3)] · x^{k−4} − ···].

This choice of ak ensures Pk(1) = 1 and implies Pk(−1) = (−1)^k.

Initial Legendre polynomials:

    P0(x) = 1,
    P1(x) = x,
    P2(x) = (3x² − 1)/2,
    P3(x) = (5x³ − 3x)/2,
    P4(x) = (35x⁴ − 30x² + 3)/8, etc.
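The same construction can be carried out in exact arithmetic. The sketch below (assumed helper legendre_coeffs) starts from the leading coefficient ak = (2k − 1)(2k − 3) ··· 3 · 1 / k! and runs the series recursion backwards as a_{m−2} = −m(m − 1)/[(k − m + 2)(k + m − 1)] · am, which for m = k reduces to the reverse recurrence displayed above:

```python
from fractions import Fraction

def legendre_coeffs(k):
    """Coefficients of x^0 ... x^k in P_k(x), from the leading term a_k and the reverse recursion."""
    a = Fraction(1)
    for j in range(1, 2 * k, 2):   # (2k-1)(2k-3)...3.1
        a *= j
    for j in range(1, k + 1):      # divide by k!
        a /= j
    c = {k: a}
    m = k
    while m >= 2:
        c[m - 2] = -Fraction(m * (m - 1), (k - m + 2) * (k + m - 1)) * c[m]
        m -= 2
    return [str(c.get(j, Fraction(0))) for j in range(k + 1)]

for k in range(5):
    print(k, legendre_coeffs(k))
# k = 4 gives ['3/8', '0', '-15/4', '0', '35/8'], i.e. P4(x) = (35x^4 - 30x^2 + 3)/8 as listed above.
```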

Figure: Legendre polynomials — plots of P0(x) through P5(x) over −1 ≤ x ≤ 1.

All roots of a Legendre polynomial are real and they lie in [−1, 1].

Orthogonality?

Bessel's equation

    x²y″ + xy′ + (x² − k²)y = 0

x = 0 is a regular singular point.

Frobenius method: carrying out the early steps,

    (r² − k²)a0 x^r + [(r + 1)² − k²]a1 x^{r+1} + Σ_{n=2}^∞ [a_{n−2} + {r² − k² + n(n + 2r)}an] x^{r+n} = 0

Indicial equation: r² − k² = 0  ⇒  r = ±k.

With r = k, (r + 1)² − k² ≠ 0  ⇒  a1 = 0 and

    an = − a_{n−2} / [n(n + 2r)]   for n ≥ 2.

Odd coefficients are zero and

    a2 = − a0 / [2(2k + 2)],   a4 = a0 / [2 · 4 (2k + 2)(2k + 4)],   etc.

Bessel functions:

Selecting a0 = 1/[2^k Γ(k + 1)] and using n = 2m,

    a_{2m} = (−1)^m / [2^{k+2m} m! Γ(k + m + 1)].

Bessel function of the first kind of order k:

    Jk(x) = Σ_{m=0}^∞ (−1)^m x^{k+2m} / [2^{k+2m} m! Γ(k + m + 1)] = Σ_{m=0}^∞ (−1)^m (x/2)^{k+2m} / [m! Γ(k + m + 1)]

When k is not an integer, J_{−k}(x) completes the basis.

  For integer k, J_{−k}(x) = (−1)^k Jk(x): linearly dependent!
  Reduction of order can be used to find another solution:
  Bessel function of the second kind or Neumann function.
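A direct implementation of the series above (truncated; math.gamma supplies Γ). The half-integer comparison uses the known closed form J_{1/2}(x) = √(2/(πx)) sin x purely as a sanity check; it is not derived in these slides.

```python
import math

def J(k, x, terms=30):
    """Bessel function of the first kind, from the truncated defining series."""
    return sum((-1)**m * (x / 2.0)**(k + 2*m) / (math.factorial(m) * math.gamma(k + m + 1))
               for m in range(terms))

print(J(0, 2.4048))                      # near the first zero of J_0, so approximately 0
xv = 1.3
print(J(0.5, xv), math.sqrt(2 / (math.pi * xv)) * math.sin(xv))   # sanity check for k = 1/2
```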

Points to note

  Solution in power series
  Ordinary points and singularities
  Definition of special functions
  Legendre polynomials
  Bessel functions

Necessary Exercises: 2, 3, 4, 5

Sturm-Liouville Theory

Outline

  Preliminary Ideas
  Sturm-Liouville Problems
  Eigenfunction Expansions

Preliminary Ideas

A simple boundary value problem:

    y″ + 2y = 0,  y(0) = 0,  y(π) = 0

General solution of the ODE:

    y(x) = a sin(√2 x) + b cos(√2 x)

Condition y(0) = 0 ⇒ b = 0. Hence, y(x) = a sin(√2 x).
Then, y(π) = 0 ⇒ a = 0. The only solution is y(x) = 0.

Now, consider the BVP

    y″ + 4y = 0,  y(0) = 0,  y(π) = 0.

The same steps give y(x) = a sin(2x), with arbitrary value of a.
Infinite number of non-trivial solutions!

Boundary value problems as eigenvalue problems

Explore the possible solutions of the BVP

    y″ + ky = 0,  y(0) = 0,  y(π) = 0.

With k ≤ 0, no hope for a non-trivial solution. Consider k = ν² > 0.
Solutions: y = a sin(νx), only for specific values of ν (or k):
ν = 0, 1, 2, 3, ···;  i.e.  k = 0, 1, 4, 9, ···.

Question:
  For what values of k (eigenvalues) does the given BVP possess non-trivial solutions, and
  what are the corresponding solutions (eigenfunctions), up to arbitrary scalar multiples?

Analogous to the algebraic eigenvalue problem Av = λv!

Analogy of a Hermitian matrix: self-adjoint differential operator.
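The analogy with Av = λv can be made literal by discretization. The sketch below (a standard central-difference approximation, assumed here for illustration) turns the BVP y″ + ky = 0, y(0) = y(π) = 0 into a matrix eigenvalue problem whose smallest eigenvalues approach k = 1, 4, 9, …:

```python
import numpy as np

# Central-difference discretization of -y'' = k y on (0, pi) with y(0) = y(pi) = 0:
# eigenvalues of (1/h^2) tridiag(-1, 2, -1) approximate k = 1, 4, 9, 16, ...
N = 200
h = np.pi / (N + 1)
A = (np.diag(2.0 * np.ones(N)) + np.diag(-np.ones(N - 1), 1) + np.diag(-np.ones(N - 1), -1)) / h**2
print(np.round(np.sort(np.linalg.eigvalsh(A))[:4], 4))   # close to [1, 4, 9, 16]
```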

Consider the ODE y″ + P(x)y′ + Q(x)y = 0.

Question:
  Is it possible to find functions F(x) and G(x) such that
      F(x)y″ + F(x)P(x)y′ + F(x)Q(x)y
  gets reduced to the derivative of F(x)y′ + G(x)y?

Comparing with

    d/dx [F(x)y′ + G(x)y] = F(x)y″ + [F′(x) + G(x)]y′ + G′(x)y,

    F′(x) + G(x) = F(x)P(x)   and   G′(x) = F(x)Q(x).

Elimination of G(x):

    F″(x) − P(x)F′(x) + [Q(x) − P′(x)]F(x) = 0

This is the adjoint of the original ODE.

The adjoint ODE

The adjoint of the ODE y″ + P(x)y′ + Q(x)y = 0 is

    F″ + P1 F′ + Q1 F = 0,

where P1 = −P and Q1 = Q − P′.

Then, the adjoint of F″ + P1 F′ + Q1 F = 0 is

    Φ″ + P2 Φ′ + Q2 Φ = 0,

where P2 = −P1 = P and Q2 = Q1 − P1′ = Q − P′ − (−P′) = Q.

The adjoint of the adjoint of a second order linear homogeneous equation is the original equation itself.

When is an ODE its own adjoint?

  y″ + P(x)y′ + Q(x)y = 0 is self-adjoint only in the trivial case of P(x) = 0.
  What about F(x)y″ + F(x)P(x)y′ + F(x)Q(x)y = 0?

Second order self-adjoint ODE

Question: What is the adjoint of Fy″ + FPy′ + FQy = 0?

Rephrased question: What is the ODE that Φ(x) has to satisfy if

    Φ[Fy″ + FPy′ + FQy] = d/dx [ΦFy′ + ψ(x)y] ?

Comparing terms,

    d/dx (ΦF) + ψ(x) = ΦFP   and   ψ′(x) = ΦFQ.

Eliminating ψ(x), we have

    d²/dx² (ΦF) + ΦFQ = d/dx (ΦFP)
    ⇒ FΦ″ + 2F′Φ′ + F″Φ + FQΦ = FPΦ′ + (FP)′Φ
    ⇒ FΦ″ + (2F′ − FP)Φ′ + [F″ − (FP)′ + FQ]Φ = 0.

This is the same as the original ODE when F′(x) = F(x)P(x).

Casting a given ODE into the self-adjoint form:

  Equation y″ + P(x)y′ + Q(x)y = 0 is converted to the self-adjoint form through multiplication by
      F(x) = e^{∫P(x)dx}.

General form of self-adjoint equations:

    d/dx [F(x)y′] + R(x)y = 0

Working rules:

  To determine whether a given ODE is in the self-adjoint form, check whether the coefficient of y′ is the derivative of the coefficient of y″.
  To convert an ODE into the self-adjoint form, first obtain the equation in normal form by dividing with the coefficient of y″. If the coefficient of y′ is now P(x), then multiply the resulting equation with e^{∫P dx}.
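A small symbolic check of the working rules with SymPy, applied to Legendre's equation in normal form (P = −2x/(1 − x²), Q = k(k + 1)/(1 − x²)); the printed forms may differ by an immaterial constant factor or sign, since any antiderivative in ∫P dx will do:

```python
import sympy as sp

x, k = sp.symbols('x k')
# Legendre's equation in normal form: y'' + P y' + Q y = 0 with
P = -2*x / (1 - x**2)
Q = k*(k + 1) / (1 - x**2)

F = sp.simplify(sp.exp(sp.integrate(P, x)))       # integrating factor e^(int P dx)
print(F)                                          # proportional to 1 - x**2 (sign/constant immaterial)
print(sp.simplify(sp.diff(F, x) - F*P))           # 0: coefficient of y' equals d/dx of coefficient of y''
print(sp.simplify(F*Q))                           # +/- k(k+1): the R(x) in [F y']' + R y = 0
```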

Sturm-Liouville Problems

Sturm-Liouville equation:

    [r(x)y′]′ + [q(x) + λp(x)]y = 0,

where p, q, r and r′ are continuous on [a, b], with p(x) > 0 on [a, b] and r(x) > 0 on (a, b).

With different boundary conditions,

  Regular S-L problem:
      a1 y(a) + a2 y′(a) = 0 and b1 y(b) + b2 y′(b) = 0,
      vectors [a1 a2]ᵀ and [b1 b2]ᵀ being non-zero.
  Periodic S-L problem: With r(a) = r(b),
      y(a) = y(b) and y′(a) = y′(b).
  Singular S-L problem: If r(a) = 0, no boundary condition is needed at x = a. If r(b) = 0, no boundary condition is needed at x = b.
      (We just look for bounded solutions over [a, b].)

Orthogonality of eigenfunctions

Theorem: If ym(x) and yn(x) are eigenfunctions (solutions) of a Sturm-Liouville problem corresponding to distinct eigenvalues λm and λn respectively, then

    (ym, yn) = ∫_a^b p(x) ym(x) yn(x) dx = 0,

i.e. they are orthogonal with respect to the weight function p(x).

From the hypothesis,

    (r ym′)′ + (q + λm p)ym = 0   ⇒   (q + λm p)ym yn = −(r ym′)′ yn,
    (r yn′)′ + (q + λn p)yn = 0   ⇒   (q + λn p)ym yn = −(r yn′)′ ym.

Subtracting,

    (λm − λn) p ym yn = (r yn′)′ ym + (r yn′) ym′ − (r ym′)′ yn − (r ym′) yn′ = [r (ym yn′ − yn ym′)]′.

Integrating both sides,

    (λm − λn) ∫_a^b p(x) ym(x) yn(x) dx
        = r(b)[ym(b) yn′(b) − yn(b) ym′(b)] − r(a)[ym(a) yn′(a) − yn(a) ym′(a)].

  In a regular S-L problem, from the boundary condition at x = a, the homogeneous system
      [ ym(a)  ym′(a) ] [a1]   [0]
      [ yn(a)  yn′(a) ] [a2] = [0]
  has non-trivial solutions. Therefore ym(a) yn′(a) − yn(a) ym′(a) = 0.
  Similarly, ym(b) yn′(b) − yn(b) ym′(b) = 0.

  In a singular S-L problem, zero value of r(x) at a boundary makes the corresponding term vanish even without a BC.
  In a periodic S-L problem, the two terms cancel out together.

Since λm ≠ λn, in all cases,

    ∫_a^b p(x) ym(x) yn(x) dx = 0.

Example: Legendre polynomials over [−1, 1]

Legendre's equation

    d/dx [(1 − x²)y′] + k(k + 1)y = 0

is self-adjoint and defines a singular Sturm-Liouville problem over [−1, 1] with p(x) = 1, q(x) = 0, r(x) = 1 − x² and λ = k(k + 1).

    (m − n)(m + n + 1) ∫_{−1}^{1} Pm(x) Pn(x) dx = [(1 − x²)(Pm Pn′ − Pn Pm′)]_{−1}^{1} = 0

From orthogonal decompositions

    1 = P0(x),   x = P1(x),
    x² = (2/3) P2(x) + (1/3) P0(x),
    x³ = (2/5) P3(x) + (3/5) P1(x),
    x⁴ = (8/35) P4(x) + (4/7) P2(x) + (1/5) P0(x),  etc.;

Pk(x) is orthogonal to all polynomials of degree less than k.
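A numerical confirmation of the orthogonality statement using Gauss-Legendre quadrature (numpy.polynomial.legendre); the diagonal values 2/(2m + 1) are the standard norms, quoted here only for reference:

```python
import numpy as np
from numpy.polynomial import legendre as L

# Check: integral over [-1, 1] of P_m(x) P_n(x) dx vanishes for m != n.
# Gauss-Legendre quadrature with 16 nodes is exact for these polynomial integrands.
xq, wq = L.leggauss(16)

def P(n, x):
    return L.legval(x, [0.0] * n + [1.0])    # coefficient vector selecting the basis polynomial P_n

for m in range(5):
    row = [np.sum(wq * P(m, xq) * P(n, xq)) for n in range(5)]
    print(np.round(row, 6))
# Off-diagonal entries are zero; diagonal entries are the norms 2/(2m + 1).
```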

Real eigenvalues

Eigenvalues of a Sturm-Liouville problem are real.

Let eigenvalue λ = μ + iν and eigenfunction y(x) = u(x) + iv(x).
Substitution leads to

    [r(u′ + iv′)]′ + [q + (μ + iν)p](u + iv) = 0.

Separation of real and imaginary parts:

    [ru′]′ + (q + μp)u − νpv = 0   ⇒   νpv² = [ru′]′ v + (q + μp)uv,
    [rv′]′ + (q + μp)v + νpu = 0   ⇒   νpu² = −[rv′]′ u − (q + μp)uv.

Adding together,

    νp(u² + v²) = [ru′]′ v + [ru′]v′ − [rv′]u′ − [rv′]′ u = [r(u′v − v′u)]′.

Integration and application of boundary conditions leads to

    ν ∫_a^b p(x)[u²(x) + v²(x)] dx = 0.

⇒ ν = 0 and λ = μ.

Eigenfunction Expansions

Eigenfunctions of Sturm-Liouville problems:
  convenient and powerful instruments to represent and manipulate fairly general classes of functions.

{y0, y1, y2, y3, ···}: a family of continuous functions over [a, b], mutually orthogonal with respect to p(x).

Representation of a function f(x) on [a, b]:

    f(x) = Σ_{m=0}^∞ am ym(x) = a0 y0(x) + a1 y1(x) + a2 y2(x) + a3 y3(x) + ···

Generalized Fourier series

Analogous to the representation of a vector as a linear combination of a set of mutually orthogonal vectors.

Question: How to determine the coefficients (an)?

Inner product:

    (f, yn) = ∫_a^b p(x) f(x) yn(x) dx = ∫_a^b Σ_{m=0}^∞ [am p(x) ym(x) yn(x)] dx = Σ_{m=0}^∞ am (ym, yn) = an ‖yn‖²,

where

    ‖yn‖ = √(yn, yn) = √(∫_a^b p(x) yn²(x) dx).

Fourier coefficients: an = (f, yn) / ‖yn‖²

Normalized eigenfunctions: φm(x) = ym(x) / ‖ym(x)‖

Generalized Fourier series (in orthonormal basis):

    f(x) = Σ_{m=0}^∞ cm φm(x) = c0 φ0(x) + c1 φ1(x) + c2 φ2(x) + c3 φ3(x) + ···
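A short sketch of the coefficient formula an = (f, yn)/‖yn‖² in action, expanding an assumed test function f(x) = eˣ in Legendre polynomials on [−1, 1] (weight p(x) = 1, with the known norms ∫Pn² dx = 2/(2n + 1)):

```python
import numpy as np
from numpy.polynomial import legendre as L

# Expand f(x) = exp(x) on [-1, 1] in Legendre polynomials (weight p(x) = 1):
#   a_n = (f, P_n) / ||P_n||^2   with ||P_n||^2 = 2/(2n + 1).
f = np.exp
xq, wq = L.leggauss(40)

def P(n, x):
    return L.legval(x, [0.0] * n + [1.0])

coeffs = [np.sum(wq * f(xq) * P(n, xq)) / (2.0 / (2*n + 1)) for n in range(9)]

x0 = 0.3
approx = sum(a * P(n, x0) for n, a in enumerate(coeffs))
print(approx, np.exp(x0))    # the truncated generalized Fourier series already matches closely
```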

In terms of a finite number of members of the family {φk(x)},

    ΦN(x) = Σ_{m=0}^N αm φm(x) = α0 φ0(x) + α1 φ1(x) + α2 φ2(x) + ··· + αN φN(x).

Error:

    E = ‖f − ΦN‖² = ∫_a^b p(x) [f(x) − Σ_{m=0}^N αm φm(x)]² dx

Error is minimized when

    ∂E/∂αn = ∫_a^b 2 p(x) [f(x) − Σ_{m=0}^N αm φm(x)] [−φn(x)] dx = 0
    ⇒ ∫_a^b αn p(x) φn²(x) dx = ∫_a^b p(x) f(x) φn(x) dx
    ⇒ αn = cn:

the best approximation in the mean or least square approximation.

Using the Fourier coefficients, the error

    E = (f, f) − 2 Σ_{n=0}^N cn (f, φn) + Σ_{n=0}^N cn² (φn, φn) = ‖f‖² − 2 Σ_{n=0}^N cn² + Σ_{n=0}^N cn²
    ⇒ E = ‖f‖² − Σ_{n=0}^N cn² ≥ 0.

Bessel's inequality:

    Σ_{n=0}^N cn² ≤ ‖f‖² = ∫_a^b p(x) f²(x) dx

Partial sum:

    sk(x) = Σ_{m=0}^k cm φm(x)

Question: Does the sequence {sk} converge?

Answer: The bound in Bessel's inequality ensures convergence.

Question: Does it converge to f? That is, is

    lim_{k→∞} ∫_a^b p(x) [sk(x) − f(x)]² dx = 0?

Answer: Depends on the basis used.

Convergence in the mean or mean-square convergence:
  An orthonormal set of functions {φk(x)} on an interval a ≤ x ≤ b is said to be complete in a class of functions, or to form a basis for it, if the corresponding generalized Fourier series for a function converges in the mean to the function, for every function belonging to that class.

Parseval's identity:  Σ_{n=0}^∞ cn² = ‖f‖²

Eigenfunction expansion: generalized Fourier series in terms of eigenfunctions of a Sturm-Liouville problem:
  convergent for continuous functions with piecewise continuous derivatives, i.e. they form a basis for this class.

Points to note

  Eigenvalue problems in ODEs
  Self-adjoint differential operators
  Sturm-Liouville problems
  Orthogonal eigenfunctions
  Eigenfunction expansions

Necessary Exercises: 1, 2, 4, 5

Fourier Series and Integrals

Outline

  Basic Theory of Fourier Series
  Extensions in Application
  Fourier Integrals

Basic Theory of Fourier Series

With q(x) = 0 and p(x) = r(x) = 1, the periodic S-L problem

    y″ + λy = 0,  y(−L) = y(L),  y′(−L) = y′(L)

has eigenfunctions 1, cos(πx/L), sin(πx/L), cos(2πx/L), sin(2πx/L), ···,
which constitute an orthogonal basis for representing functions.

For a periodic function f(x) of period 2L, we propose

    f(x) = a0 + Σ_{n=1}^∞ [an cos(nπx/L) + bn sin(nπx/L)]

and determine the Fourier coefficients from the Euler formulae

    a0 = (1/2L) ∫_{−L}^{L} f(x) dx,
    am = (1/L) ∫_{−L}^{L} f(x) cos(mπx/L) dx   and   bm = (1/L) ∫_{−L}^{L} f(x) sin(mπx/L) dx.

Question: Does the series converge?
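A small numerical illustration of the Euler formulae for an assumed 2L-periodic square wave (trapezoidal quadrature stands in for the exact integrals):

```python
import numpy as np

# Square wave of period 2L: f(x) = -1 on (-L, 0), +1 on (0, L); Euler formulae by quadrature.
L = 1.0
x = np.linspace(-L, L, 4001)
f = np.sign(x)

a0 = np.trapz(f, x) / (2 * L)
def a(m): return np.trapz(f * np.cos(m * np.pi * x / L), x) / L
def b(m): return np.trapz(f * np.sin(m * np.pi * x / L), x) / L

print(round(a0, 4), [round(b(m), 4) for m in range(1, 6)])   # a0 = 0; b_m close to 4/(m*pi) for odd m
x0 = 0.3
partial = a0 + sum(a(m) * np.cos(m*np.pi*x0/L) + b(m) * np.sin(m*np.pi*x0/L) for m in range(1, 60))
print(partial)   # close to f(x0) = 1; at the jump x = 0 the series converges to the mean value 0
```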

Dirichlet's conditions:
  If f(x) and its derivative are piecewise continuous on [−L, L] and are periodic with a period 2L,
  then the series converges to the mean [f(x+) + f(x−)]/2 of one-sided limits, at all points.
  ⇒ Fourier series

Note: The interval of integration can be [x0, x0 + 2L] for any x0.

  It is valid to integrate the Fourier series term by term.
  The Fourier series uniformly converges to f(x) over an interval on which f(x) is continuous. At a jump discontinuity, convergence to [f(x+) + f(x−)]/2 is not uniform: the mismatch peak shifts with inclusion of more terms (Gibbs phenomenon).
  Term-by-term differentiation of the Fourier series at a point requires f(x) to be smooth at that point.

Multiplying the Fourier series with f(x),

    f²(x) = a0 f(x) + Σ_{n=1}^∞ [an f(x) cos(nπx/L) + bn f(x) sin(nπx/L)]

Integrating over a period gives Parseval's identity:

    a0² + (1/2) Σ_{n=1}^∞ (an² + bn²) = (1/2L) ∫_{−L}^{L} f²(x) dx

The Fourier series representation is complete.

  A periodic function f(x) is composed of its mean value and several sinusoidal components, or harmonics.
  Fourier coefficients are the corresponding amplitudes.
  Parseval's identity is simply a statement on energy balance!

Bessel's inequality:

    a0² + (1/2) Σ_{n=1}^N (an² + bn²) ≤ (1/2L) ‖f(x)‖²

Extensions in Application

Original spirit of Fourier series:
  representation of periodic functions over (−∞, ∞).

Question: What about a function f(x) defined only on [−L, L]?

Answer: Extend the function as

    F(x) = f(x) for −L ≤ x ≤ L,   and   F(x + 2L) = F(x).

The Fourier series of F(x) acts as the Fourier series representation of f(x) in its own domain.

In the Euler formulae, notice that bm = 0 for an even function.
The Fourier series of an even function is a Fourier cosine series

    f(x) = a0 + Σ_{n=1}^∞ an cos(nπx/L),

where a0 = (1/L) ∫_0^L f(x) dx and an = (2/L) ∫_0^L f(x) cos(nπx/L) dx.

Similarly, for an odd function, a Fourier sine series.

Over [0, L], sometimes we need a series of sine terms only, or
cosine terms only!

[Figure: Periodic extensions for cosine and sine series:
(a) function f(x) over (0, L); (b) even periodic extension f_c(x);
(c) odd periodic extension f_s(x)]

Half-range expansions

For the Fourier cosine series of a function f(x) over [0, L], even
periodic extension:

  f_c(x) = f(x) for 0 ≤ x ≤ L,  f_c(x) = f(-x) for -L ≤ x < 0,  and f_c(x + 2L) = f_c(x).

For the Fourier sine series of a function f(x) over [0, L], odd
periodic extension:

  f_s(x) = f(x) for 0 ≤ x ≤ L,  f_s(x) = -f(-x) for -L ≤ x < 0,  and f_s(x + 2L) = f_s(x).

To develop the Fourier series of a function which is available only as a
set of tabulated values or a black-box library routine, the
integrals in the Euler formulae are evaluated numerically.

Important: The Fourier series representation is richer and more
powerful compared to interpolatory or least square approximation
in many contexts.
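Since the integrals in the Euler formulae are evaluated numerically for tabulated or black-box functions, here is a minimal sketch of that procedure for half-range expansions; the test function f(x) = x(L - x), the interval and the number of terms are assumptions for illustration.

```python
# Half-range expansion coefficients by numerical quadrature (a sketch).
import numpy as np
from scipy.integrate import quad

L = 1.0
f = lambda x: x * (L - x)            # assumed function, known only through evaluations

def cosine_coeffs(f, L, N):
    """Euler formulae for the even (cosine) half-range expansion on [0, L]."""
    a0 = quad(f, 0, L)[0] / L
    an = [2.0 / L * quad(lambda x, n=n: f(x) * np.cos(n * np.pi * x / L), 0, L)[0]
          for n in range(1, N + 1)]
    return a0, an

def sine_coeffs(f, L, N):
    """Euler formulae for the odd (sine) half-range expansion on [0, L]."""
    return [2.0 / L * quad(lambda x, n=n: f(x) * np.sin(n * np.pi * x / L), 0, L)[0]
            for n in range(1, N + 1)]

a0, an = cosine_coeffs(f, L, 8)
bn = sine_coeffs(f, L, 8)

# Reconstruct f at a test point from each expansion
x = 0.3
fc = a0 + sum(a * np.cos((n + 1) * np.pi * x / L) for n, a in enumerate(an))
fs = sum(b * np.sin((n + 1) * np.pi * x / L) for n, b in enumerate(bn))
print(f(x), fc, fs)                  # both partial sums approximate f(0.3) = 0.21
```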

Fourier Integrals

Question: How to apply the idea of Fourier series to a
non-periodic function over an infinite domain?
Answer: Magnify a single period to an infinite length.

Fourier series of a function f_L(x) of period 2L:

  f_L(x) = a_0 + \sum_{n=1}^{\infty} (a_n \cos p_n x + b_n \sin p_n x),

where p_n = \frac{n\pi}{L} is the frequency of the n-th harmonic.

Inserting the expressions for the Fourier coefficients,

  f_L(x) = \frac{1}{2L} \int_{-L}^{L} f_L(x)\,dx
    + \frac{1}{\pi} \sum_{n=1}^{\infty} \left[ \cos p_n x \int_{-L}^{L} f_L(v) \cos p_n v\,dv + \sin p_n x \int_{-L}^{L} f_L(v) \sin p_n v\,dv \right] \Delta p,

where \Delta p = p_{n+1} - p_n = \frac{\pi}{L}.
In the limit (if it exists), as L → ∞, Δp → 0,

  f(x) = \frac{1}{\pi} \int_0^{\infty} \left[ \cos px \int_{-\infty}^{\infty} f(v) \cos pv\,dv + \sin px \int_{-\infty}^{\infty} f(v) \sin pv\,dv \right] dp

Fourier integral of f(x):

  f(x) = \int_0^{\infty} [A(p) \cos px + B(p) \sin px]\,dp,

where the amplitude functions

  A(p) = \frac{1}{\pi} \int_{-\infty}^{\infty} f(v) \cos pv\,dv   and   B(p) = \frac{1}{\pi} \int_{-\infty}^{\infty} f(v) \sin pv\,dv

are defined for a continuous frequency variable p.

In phase angle form,

  f(x) = \frac{1}{\pi} \int_0^{\infty} \int_{-\infty}^{\infty} f(v) \cos p(x - v)\,dv\,dp.
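As a concrete check of the Fourier integral representation, the sketch below evaluates it for the rectangular pulse f(x) = 1 on |x| < 1 (an assumed example, not from the slides), whose amplitude functions are A(p) = 2 sin p/(π p) and B(p) = 0; the infinite upper limit is truncated for the quadrature.

```python
# Fourier integral of a rectangular pulse, evaluated numerically (a sketch).
import numpy as np
from scipy.integrate import quad

# A(p) = (1/pi) * integral_{-1}^{1} cos(p v) dv = 2 sin(p) / (pi p); B(p) = 0 by symmetry.
A = lambda p: (2.0 / np.pi) * np.sinc(p / np.pi)    # np.sinc(z) = sin(pi z)/(pi z), finite at p = 0

x = 0.5
val = quad(lambda p: A(p) * np.cos(p * x), 0, 200.0, limit=1000)[0]   # truncated upper limit
print(val)   # close to f(0.5) = 1; at x = +/-1 the integral converges to the mean value 1/2
```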

Using cos θ = (e^{iθ} + e^{-iθ})/2 in the phase angle form,

  f(x) = \frac{1}{2\pi} \int_0^{\infty} \int_{-\infty}^{\infty} f(v) \left[ e^{ip(x-v)} + e^{-ip(x-v)} \right] dv\,dp.

With the substitution p = -q,

  \int_0^{\infty} \int_{-\infty}^{\infty} f(v) e^{-ip(x-v)}\,dv\,dp = \int_{-\infty}^{0} \int_{-\infty}^{\infty} f(v) e^{iq(x-v)}\,dv\,dq.

Complex form of the Fourier integral:

  f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(v) e^{ip(x-v)}\,dv\,dp = \int_{-\infty}^{\infty} C(p) e^{ipx}\,dp,

in which the complex Fourier integral coefficient is

  C(p) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(v) e^{-ipv}\,dv.
Points to note

  Fourier series arising out of a Sturm-Liouville problem
  A versatile tool for function representation
  Fourier integral as the limiting case of Fourier series

Necessary Exercises: 1, 3, 6, 8

Outline

Fourier Transforms
  Definition and Fundamental Properties
  Important Results on Fourier Transforms
  Discrete Fourier Transform
Definition and Fundamental Properties

Complex form of the Fourier integral:

  f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(v) e^{-iwv}\,dv \right] e^{iwt}\,dw

Composition of an infinite number of functions in the form
\frac{e^{iwt}}{\sqrt{2\pi}}, over a continuous distribution of frequency w.

Fourier transform: amplitude of a frequency component,

  F(f) \equiv \hat{f}(w) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t) e^{-iwt}\,dt,

a function of the frequency variable.

Inverse Fourier transform

  F^{-1}(\hat{f}) \equiv f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat{f}(w) e^{iwt}\,dw

recovers the original function.
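A quick numerical illustration of the symmetric 1/√(2π) convention defined above: the sketch computes the transform of the Gaussian e^{-t²/2} by quadrature and compares it with the known result e^{-w²/2}. The test function and the sample frequencies are assumptions.

```python
# Numerical Fourier transform under the symmetric 1/sqrt(2*pi) convention (a sketch).
import numpy as np
from scipy.integrate import quad

def fourier_transform(f, w):
    """F{f}(w) = (1/sqrt(2*pi)) * integral f(t) exp(-i*w*t) dt, by quadrature."""
    re = quad(lambda t: f(t) * np.cos(w * t), -np.inf, np.inf)[0]
    im = quad(lambda t: -f(t) * np.sin(w * t), -np.inf, np.inf)[0]
    return (re + 1j * im) / np.sqrt(2 * np.pi)

f = lambda t: np.exp(-t**2 / 2)      # assumed test signal
for w in (0.0, 1.0, 2.0):
    print(w, fourier_transform(f, w), np.exp(-w**2 / 2))   # Gaussian maps to Gaussian
```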

Example: Fourier transform of f(t) = 1?

Let us find out the inverse Fourier transform of \hat{f}(w) = k\,\delta(w):

  f(t) = F^{-1}(\hat{f}) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} k\,\delta(w) e^{iwt}\,dw = \frac{k}{\sqrt{2\pi}}

  ⇒  F(1) = \sqrt{2\pi}\,\delta(w)

Linearity of Fourier transforms:

  F\{\alpha f_1(t) + \beta f_2(t)\} = \alpha \hat{f}_1(w) + \beta \hat{f}_2(w)

Scaling:

  F\{f(at)\} = \frac{1}{|a|} \hat{f}\!\left(\frac{w}{a}\right)   and   F^{-1}\!\left\{\hat{f}\!\left(\frac{w}{a}\right)\right\} = |a|\, f(at)

Shifting rules:

  F\{f(t - t_0)\} = e^{-iwt_0} F\{f(t)\}
  F^{-1}\{\hat{f}(w - w_0)\} = e^{iw_0 t} F^{-1}\{\hat{f}(w)\}

Important Results on Fourier Transforms

Fourier transform of the derivative of a function:
If f(t) is continuous in every interval and f'(t) is piecewise
continuous, \int_{-\infty}^{\infty} |f(t)|\,dt converges and f(t) approaches zero as
t → ±∞, then

  F\{f'(t)\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f'(t) e^{-iwt}\,dt
             = \frac{1}{\sqrt{2\pi}} \left[ f(t) e^{-iwt} \right]_{-\infty}^{\infty} - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (-iw) f(t) e^{-iwt}\,dt
             = iw\,\hat{f}(w).

Alternatively, differentiating the inverse Fourier transform,

  \frac{d}{dt}[f(t)] = \frac{d}{dt} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat{f}(w) e^{iwt}\,dw \right]
                     = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{\partial}{\partial t} \left[ \hat{f}(w) e^{iwt} \right] dw = F^{-1}\{iw\,\hat{f}(w)\}.
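The differentiation rule F{f'(t)} = iw f̂(w) can be verified numerically; the sketch below does so for the Gaussian (an assumed test signal) at one assumed frequency.

```python
# Check of the derivative rule F{f'} = i*w*fhat under the symmetric convention (a sketch).
import numpy as np
from scipy.integrate import quad

def ft(g, w):
    """Symmetric-convention Fourier transform evaluated by quadrature."""
    re = quad(lambda t: g(t) * np.cos(w * t), -np.inf, np.inf)[0]
    im = quad(lambda t: -g(t) * np.sin(w * t), -np.inf, np.inf)[0]
    return (re + 1j * im) / np.sqrt(2 * np.pi)

f = lambda t: np.exp(-t**2 / 2)            # assumed test signal
fprime = lambda t: -t * np.exp(-t**2 / 2)  # its derivative

w = 1.5
print(ft(fprime, w))            # transform of the derivative
print(1j * w * ft(f, w))        # i*w*fhat(w): the two agree (about 0.487j here)
```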

Under appropriate premises,

  F\{f''(t)\} = (iw)^2 \hat{f}(w) = -w^2 \hat{f}(w).

In general, F\{f^{(n)}(t)\} = (iw)^n \hat{f}(w).

Fourier transform of an integral:
If f(t) is piecewise continuous on every interval,
\int_{-\infty}^{\infty} |f(t)|\,dt converges and \hat{f}(0) = 0, then

  F\left\{ \int_{-\infty}^{t} f(\tau)\,d\tau \right\} = \frac{1}{iw}\, \hat{f}(w).

Derivative of a Fourier transform (with respect to the frequency
variable):

  F\{t^n f(t)\} = i^n \frac{d^n}{dw^n} \hat{f}(w),

if f(t) is piecewise continuous and \int_{-\infty}^{\infty} |t^n f(t)|\,dt converges.

Convolution of two functions:

  h(t) = f(t) * g(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\,d\tau

  \hat{h}(w) = F\{h(t)\}
     = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, e^{-iwt}\,d\tau\,dt
     = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(\tau)\, e^{-iw\tau} \int_{-\infty}^{\infty} g(t - \tau)\, e^{-iw(t - \tau)}\,dt\,d\tau
     = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(\tau)\, e^{-iw\tau} \left[ \int_{-\infty}^{\infty} g(t')\, e^{-iwt'}\,dt' \right] d\tau

Convolution theorem for Fourier transforms:

  \hat{h}(w) = \sqrt{2\pi}\; \hat{f}(w)\, \hat{g}(w)
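A numerical check of the convolution theorem under the same symmetric convention, using two assumed Gaussian signals; the convolution itself is evaluated by quadrature, so the check is slow but direct.

```python
# Check of the convolution theorem hhat = sqrt(2*pi)*fhat*ghat (a sketch; slow but direct).
import numpy as np
from scipy.integrate import quad

def ft(g, w):
    re = quad(lambda t: g(t) * np.cos(w * t), -np.inf, np.inf)[0]
    im = quad(lambda t: -g(t) * np.sin(w * t), -np.inf, np.inf)[0]
    return (re + 1j * im) / np.sqrt(2 * np.pi)

f = lambda t: np.exp(-t**2 / 2)
g = lambda t: np.exp(-t**2)                        # two assumed test signals

# Convolution h(t), itself computed by quadrature
h = lambda t: quad(lambda tau: f(tau) * g(t - tau), -np.inf, np.inf)[0]

w = 0.8
print(ft(h, w))                                    # transform of the convolution
print(np.sqrt(2 * np.pi) * ft(f, w) * ft(g, w))    # matches, about 1.10 here
```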

Conjugate of the Fourier transform:

  \overline{\hat{f}(w)} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \overline{f(t)}\, e^{iwt}\,dt

Inner product of \hat{f}(w) and \hat{g}(w):

  \int_{-\infty}^{\infty} \hat{f}(w)\, \overline{\hat{g}(w)}\,dw
    = \int_{-\infty}^{\infty} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t) e^{-iwt}\,dt \right] \overline{\hat{g}(w)}\,dw
    = \int_{-\infty}^{\infty} f(t) \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \overline{\hat{g}(w)}\, e^{-iwt}\,dw \right] dt
    = \int_{-\infty}^{\infty} f(t)\, \overline{g(t)}\,dt.

Parseval's identity: for g(t) = f(t) in the above,

  \int_{-\infty}^{\infty} \| \hat{f}(w) \|^2\,dw = \int_{-\infty}^{\infty} \| f(t) \|^2\,dt,

equating the total energy content of the frequency spectrum of a
wave or a signal to the total energy flow over time.
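Parseval's identity for transforms can be confirmed in the same spirit; the Gaussian signal below is an assumption, and both integrals equal √π.

```python
# Parseval check under the symmetric convention (a sketch with an assumed Gaussian signal).
import numpy as np
from scipy.integrate import quad

f = lambda t: np.exp(-t**2 / 2)
fhat = lambda w: np.exp(-w**2 / 2)          # known transform of the Gaussian above

energy_t = quad(lambda t: f(t)**2, -np.inf, np.inf)[0]
energy_w = quad(lambda w: abs(fhat(w))**2, -np.inf, np.inf)[0]
print(energy_t, energy_w)                   # both equal sqrt(pi)
```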

Discrete Fourier Transform

Consider a signal f(t) from actual measurement or sampling.
We want to analyze its amplitude spectrum (versus frequency).
For the Fourier transform, how to evaluate the integral over (-∞, ∞)?

Windowing: Sample the signal f(t) over a finite interval.
A window function:

  g(t) = 1 for a ≤ t ≤ b,  0 otherwise

Actual processing takes place on the windowed function f(t)g(t).

Next question: Do we need to evaluate the amplitude for all
w ∈ (-∞, ∞)?
Most useful signals are particularly rich only in their own
characteristic frequency bands.
  Decide on an expected frequency band, say [-w_c, w_c].

Time step for sampling?

With N samplings over [a, b), choose the time step Δ ≤ π/w_c,
data being collected at t = a, a + Δ, a + 2Δ, ..., a + (N - 1)Δ,
with NΔ = b - a.
Nyquist critical frequency: w_c = π/Δ.

Note the duality.
  The decision of sampling rate determines the band of frequency
  content that can be accommodated.
  The decision of the interval [a, b) dictates how finely the
  frequency spectrum can be developed.

Shannon's sampling theorem
  A band-limited signal can be reconstructed from a finite
  number of samples.

With discrete data at t_k = kΔ for k = 0, 1, 2, 3, ..., N - 1,

  \hat{f}(w) = \frac{\Delta}{\sqrt{2\pi}} \left[ m_{jk} \right] f(t),   where m_{jk} = e^{-i w_j t_k}

and [m_{jk}] is an N × N matrix.

A similar discrete version exists for the inverse Fourier transform.

  Reconstruction: a trigonometric interpolation of the sampled data.
  The structure of the Fourier and inverse Fourier transforms reduces the
  problem from a system of linear equations [O(N^3) operations]
  to a matrix-vector multiplication [O(N^2) operations].
  The structure of the matrix [m_{jk}], with patterns of redundancies,
  opens up a trick to reduce it further to O(N log N) operations.

Cooley-Tukey algorithm: fast Fourier transform (FFT)
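The O(N²)-versus-O(N log N) remark can be seen directly by building the DFT matrix and comparing its matrix-vector product with an FFT call. The sketch uses numpy's convention X_j = Σ_k x_k e^{-2πi jk/N}; the slides' additional scale factor Δ/√(2π) is a constant multiplier on top of this, and N and the random signal are assumptions.

```python
# O(N^2) DFT as a matrix-vector product versus the O(N log N) FFT (a sketch).
import numpy as np

N = 256
k = np.arange(N)
M = np.exp(-2j * np.pi * np.outer(k, k) / N)   # DFT matrix [m_jk] in numpy's convention

x = np.random.randn(N)
X_matrix = M @ x                 # O(N^2) operations
X_fft = np.fft.fft(x)            # O(N log N), Cooley-Tukey
print(np.allclose(X_matrix, X_fft))   # True
```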

The DFT representation is reliable only if the incoming signal is really
band-limited in the interval [-w_c, w_c].
Frequencies beyond [-w_c, w_c] distort the spectrum near w = ±w_c
by folding back: aliasing.
  Detection: a posteriori

Bandpass filtering: If we expect a signal having components only
in certain frequency bands and want to get rid of unwanted noise
frequencies,
  for every band [w_1, w_2] of our interest, we define a window
  function on the intervals [-w_2, -w_1] and [w_1, w_2];
  multiplying the Fourier transform by this window filters out frequency
  components outside the band.
For recovery, convolve the raw signal f(t) with the inverse Fourier
transform of the window function.
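Aliasing is easy to demonstrate with a deliberately under-sampled sinusoid; in the sketch below the sampling rate, window length and signal frequency are assumed values, and the 9 Hz component folds back to 1 Hz about the Nyquist frequency of 5 Hz.

```python
# Aliasing demonstration (a sketch): a 9 Hz sine sampled at 10 Hz appears at 1 Hz.
import numpy as np

fs, T = 10.0, 10.0                      # assumed sampling rate and window length
t = np.arange(0, T, 1 / fs)
x = np.sin(2 * np.pi * 9.0 * t)         # true frequency 9 Hz, above fs/2 = 5 Hz

X = np.fft.rfft(x)
freqs = np.fft.rfftfreq(len(x), 1 / fs)
print(freqs[np.argmax(np.abs(X))])      # prints 1.0, the alias of 9 Hz folded about 5 Hz
```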

Points to note

  Fourier transform as the amplitude function in the Fourier integral
  Basic operational tools in Fourier and inverse Fourier transforms
  Conceptual notions of the discrete Fourier transform (DFT)

Necessary Exercises: 1, 3, 6

Outline

Minimax Approximation*
  Approximation with Chebyshev polynomials
  Minimax Polynomial Approximation

Approximation with Chebyshev polynomials

Chebyshev polynomials:
Polynomial solutions of the singular Sturm-Liouville problem

  (1 - x^2) y'' - x y' + n^2 y = 0   or   \left[ \sqrt{1 - x^2}\; y' \right]' + \frac{n^2}{\sqrt{1 - x^2}}\, y = 0

over -1 ≤ x ≤ 1, with T_n(1) = 1 for all n.

Closed-form expression:

  T_n(x) = \cos(n \cos^{-1} x),

i.e.

  T_0(x) = 1, T_1(x) = x, T_2(x) = 2x^2 - 1, T_3(x) = 4x^3 - 3x, ...;

with the three-term recurrence relation

  T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x).
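The three-term recurrence gives a convenient way to evaluate T_n(x); the sketch below (illustrative, not from the text) checks it against the closed form cos(n cos⁻¹ x).

```python
# Chebyshev polynomials from the three-term recurrence, checked against the closed form (a sketch).
import numpy as np

def chebyshev_T(n, x):
    """Evaluate T_n(x) via T_{k+1} = 2 x T_k - T_{k-1}."""
    Tkm1, Tk = np.ones_like(x), x
    if n == 0:
        return Tkm1
    for _ in range(n - 1):
        Tkm1, Tk = Tk, 2 * x * Tk - Tkm1
    return Tk

x = np.linspace(-1, 1, 5)
print(chebyshev_T(8, x))
print(np.cos(8 * np.arccos(x)))      # identical values
```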

Immediate observations
  Coefficients in a Chebyshev polynomial are integers. In
  particular, the leading coefficient of T_n(x) is 2^{n-1}.
  For even n, T_n(x) is an even function, while for odd n it is an
  odd function.
  T_n(1) = 1, T_n(-1) = (-1)^n and |T_n(x)| ≤ 1 for -1 ≤ x ≤ 1.
  Zeros of a Chebyshev polynomial T_n(x) are real and lie inside
  the interval [-1, 1] at the locations x = \cos\frac{(2k-1)\pi}{2n} for
  k = 1, 2, 3, ..., n.
  These locations are also called Chebyshev accuracy points.
  Further, zeros of T_n(x) are interlaced by those of T_{n+1}(x).
  Extrema of T_n(x) are of magnitude equal to unity, alternate in
  sign and occur at x = \cos\frac{k\pi}{n} for k = 0, 1, 2, 3, ..., n.

Orthogonality and norms:

  \int_{-1}^{1} \frac{T_m(x) T_n(x)}{\sqrt{1 - x^2}}\,dx = 0 if m ≠ n,  \frac{\pi}{2} if m = n ≠ 0,  and  \pi if m = n = 0.

[Figure: Extrema and zeros of T_3(x)]   [Figure: Contrast: P_8(x) and T_8(x)]

Being cosines and polynomials at the same time, Chebyshev
polynomials possess a wide variety of interesting properties!

Most striking property:
  equal-ripple oscillations, leading to the minimax property

Minimax property
  Theorem: Among all polynomials p_n(x) of degree n > 0
  with the leading coefficient equal to unity, 2^{1-n} T_n(x)
  deviates least from zero in [-1, 1]. That is,

    \max_{-1 \le x \le 1} |p_n(x)| \ge \max_{-1 \le x \le 1} |2^{1-n} T_n(x)| = 2^{1-n}.

Proof: If there exists a monic polynomial p_n(x) of degree n such that

  \max_{-1 \le x \le 1} |p_n(x)| < 2^{1-n},

then at the (n + 1) locations of alternating extrema of 2^{1-n} T_n(x), the
polynomial

  q_n(x) = 2^{1-n} T_n(x) - p_n(x)

will have the same sign as 2^{1-n} T_n(x).
With alternating signs at (n + 1) locations in sequence, q_n(x) will
have n intervening zeros, even though it is a polynomial of degree
at most (n - 1): CONTRADICTION!

Chebyshev series

  f(x) = a_0 T_0(x) + a_1 T_1(x) + a_2 T_2(x) + a_3 T_3(x) + ...

with coefficients

  a_0 = \frac{1}{\pi} \int_{-1}^{1} \frac{f(x) T_0(x)}{\sqrt{1 - x^2}}\,dx   and   a_n = \frac{2}{\pi} \int_{-1}^{1} \frac{f(x) T_n(x)}{\sqrt{1 - x^2}}\,dx   for n = 1, 2, 3, ...

A truncated series \sum_{k=0}^{n} a_k T_k(x): Chebyshev economization
  The leading error term a_{n+1} T_{n+1}(x) deviates least from zero over
  [-1, 1] and is qualitatively similar to the error function.

Question: How to develop a Chebyshev series approximation?
Find out so many Chebyshev polynomials and evaluate coefficients?

For approximating f(t) over [a, b], scale the variable as

  t = \frac{a + b}{2} + \frac{b - a}{2}\, x,   with x ∈ [-1, 1].

Remark: The economized series \sum_{k=0}^{n} a_k T_k(x) gives minimax
deviation of the leading error term a_{n+1} T_{n+1}(x).

Assuming a_{n+1} T_{n+1}(x) to be the error, at the zeros of T_{n+1}(x)
the error will be officially zero, i.e.

  \sum_{k=0}^{n} a_k T_k(x_j) = f(t(x_j)),

where x_0, x_1, x_2, ..., x_n are the roots of T_{n+1}(x).

Recall: Values of an n-th degree polynomial at n + 1
points uniquely fix the entire polynomial.
Interpolation of these n + 1 values leads to the same polynomial!

  Chebyshev-Lagrange approximation
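A minimal sketch of the Chebyshev-Lagrange idea: interpolate f at the zeros of T_{n+1} and observe a nearly equal-ripple error. The test function e^x and the degree are assumptions; a degree-n fit through n+1 points reproduces the interpolating polynomial.

```python
# Chebyshev-Lagrange approximation: interpolation at the zeros of T_{n+1} (a sketch).
import numpy as np
from numpy.polynomial import chebyshev as C

f = np.exp                              # assumed test function
n = 8                                   # assumed polynomial degree
k = np.arange(1, n + 2)
nodes = np.cos((2 * k - 1) * np.pi / (2 * (n + 1)))   # zeros of T_{n+1}

coef = C.chebfit(nodes, f(nodes), n)    # degree-n fit through n+1 points = interpolation
x = np.linspace(-1, 1, 2001)
err = f(x) - C.chebval(x, coef)
print(np.max(np.abs(err)))              # small, nearly equal-ripple error, close to minimax
```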

Minimax Polynomial Approximation

Situations in which minimax approximation is desirable:
  Develop the approximation once and keep it for use in future.
  Requirement: uniform quality control over the entire domain.
Minimax approximation:
  deviation limited by the constant amplitude of ripple

Chebyshev's minimax theorem
  Theorem: Of all polynomials of degree up to n, p(x) is
  the minimax polynomial approximation of f(x), i.e. it
  minimizes

    \max |f(x) - p(x)|,

  if and only if there are n + 2 points x_i such that

    a ≤ x_1 < x_2 < x_3 < ... < x_{n+2} ≤ b,

  where the difference f(x) - p(x) takes its extreme values
  of the same magnitude and alternating signs.

Utilize any gap to reduce the deviation at the other extrema with
values at the bound.

[Figure: Schematic of an approximation that is not minimax, showing
the error f(x) - p(x) oscillating between bounds over the interval]

Construction of the minimax polynomial: Remez algorithm

Note: In the light of this theorem and algorithm, examine how
T_{n+1}(x) is qualitatively similar to the complete error function!

Points to note

  Unique features of Chebyshev polynomials
  The equal-ripple and minimax properties
  Chebyshev series and Chebyshev-Lagrange approximation
  Fundamental ideas of general minimax approximation

Necessary Exercises: 2, 3, 4

Outline

Partial Differential Equations
  Introduction
  Hyperbolic Equations
  Parabolic Equations
  Elliptic Equations
  Two-Dimensional Wave Equation

Introduction

Quasi-linear second order PDEs:

  a \frac{\partial^2 u}{\partial x^2} + 2b \frac{\partial^2 u}{\partial x \partial y} + c \frac{\partial^2 u}{\partial y^2} = F(x, y, u, u_x, u_y)

  hyperbolic if b^2 - ac > 0, modelling phenomena which evolve in
  time perpetually and do not approach a steady state
  parabolic if b^2 - ac = 0, modelling phenomena which evolve in
  time in a transient manner, approaching steady state
  elliptic if b^2 - ac < 0, modelling steady-state configurations,
  without evolution in time

If F(x, y, u, u_x, u_y) = 0,
  second order linear homogeneous differential equation
Principle of superposition: A linear combination of different
solutions is also a solution.

  Solutions are often in the form of infinite series.
  Solution techniques in PDEs typically attack the boundary
  value problem directly.

Initial and boundary conditions

Time and space variables are qualitatively different.

Conditions in time: typically initial conditions.
  For second order PDEs, u and u_t over the entire space
  domain: Cauchy conditions
  Time is a single variable and is decoupled from the space
  variables.

Conditions in space: typically boundary conditions.
  For u(t, x, y), boundary conditions over the entire curve in the
  x-y plane that encloses the domain. For second order PDEs,
    Dirichlet condition: value of the function
    Neumann condition: derivative normal to the boundary
    Mixed (Robin) condition

Dirichlet, Neumann and Cauchy problems

Method of separation of variables
For u(x, y), propose a solution in the form

  u(x, y) = X(x) Y(y)

and substitute

  u_x = X'Y,  u_y = XY',  u_{xx} = X''Y,  u_{xy} = X'Y',  u_{yy} = XY''

to cast the equation into the form

  \phi(x, X, X', X'') = \psi(y, Y, Y', Y'').

If the manoeuvre succeeds then, x and y being independent
variables, it implies

  \phi(x, X, X', X'') = \psi(y, Y, Y', Y'') = k.

The nature of the separation constant k is decided based on the
context; the resulting ODEs are solved in consistency with the
boundary conditions and assembled to construct u(x, y).

Hyperbolic Equations

Transverse vibrations of a string

[Figure: Transverse vibration of a stretched string, showing an
element PQ of the string with tension T acting at its ends]

Small deflection and slope: cos α ≈ 1, sin α ≈ tan α

Horizontal (longitudinal) forces on PQ balance.
From Newton's second law, for the vertical (transverse) deflection u(x, t)
of an element of length Δx and mass per unit length ρ:

  T \sin(\alpha + \Delta\alpha) - T \sin\alpha = \rho\,\Delta x\, \frac{\partial^2 u}{\partial t^2}

Under the assumptions, denoting c^2 = T/\rho,

  \frac{\partial^2 u}{\partial t^2} = \frac{c^2}{\Delta x} \left[ \left. \frac{\partial u}{\partial x} \right|_Q - \left. \frac{\partial u}{\partial x} \right|_P \right].

In the limit, as Δx → 0, the PDE of transverse vibration:

  \frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}

the one-dimensional wave equation.

Boundary conditions (in this case): u(0, t) = u(L, t) = 0
Initial configuration and initial velocity:
  u(x, 0) = f(x) and u_t(x, 0) = g(x)
Cauchy problem: Determine u(x, t) for 0 ≤ x ≤ L, t ≥ 0.

Solution by separation of variables

  u_{tt} = c^2 u_{xx},  u(0, t) = u(L, t) = 0,  u(x, 0) = f(x),  u_t(x, 0) = g(x)

Assuming u(x, t) = X(x) T(t), and substituting u_{tt} = X T'' and
u_{xx} = X'' T, variables are separated as

  \frac{T''}{c^2 T} = \frac{X''}{X} = -p^2.

The PDE splits into two ODEs:

  X'' + p^2 X = 0  and  T'' + c^2 p^2 T = 0.

Eigenvalues of the BVP X'' + p^2 X = 0, X(0) = X(L) = 0 are p = \frac{n\pi}{L},
with eigenfunctions

  X_n(x) = \sin px = \sin\frac{n\pi x}{L}  for n = 1, 2, 3, ...

Second ODE: T'' + \lambda_n^2 T = 0, with \lambda_n = \frac{cn\pi}{L}.

The corresponding solution is

  T_n(t) = A_n \cos\lambda_n t + B_n \sin\lambda_n t.

Then, for n = 1, 2, 3, ...,

  u_n(x, t) = X_n(x) T_n(t) = (A_n \cos\lambda_n t + B_n \sin\lambda_n t) \sin\frac{n\pi x}{L}

satisfies the PDE and the boundary conditions.

Since the PDE and the BCs are homogeneous, by superposition,

  u(x, t) = \sum_{n=1}^{\infty} [A_n \cos\lambda_n t + B_n \sin\lambda_n t] \sin\frac{n\pi x}{L}.

Question: How to determine the coefficients A_n and B_n?
Answer: By imposing the initial conditions.

Initial conditions: Fourier sine series of f(x) and g(x)

  u(x, 0) = f(x) = \sum_{n=1}^{\infty} A_n \sin\frac{n\pi x}{L}

  u_t(x, 0) = g(x) = \sum_{n=1}^{\infty} \lambda_n B_n \sin\frac{n\pi x}{L}

Hence, the coefficients:

  A_n = \frac{2}{L} \int_0^L f(x) \sin\frac{n\pi x}{L}\,dx   and   B_n = \frac{2}{cn\pi} \int_0^L g(x) \sin\frac{n\pi x}{L}\,dx

Related problems:
  Different boundary conditions: other kinds of series
  Long wire: infinite domain, continuous frequencies and
  solution from Fourier integrals
  Alternative: Reduce the problem using Fourier transforms.
  General wave equation in 3-d: u_{tt} = c^2 \nabla^2 u
  Membrane equation: u_{tt} = c^2 (u_{xx} + u_{yy})
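For a concrete case of these formulas, the sketch below evaluates the series solution for a plucked string released from rest (so g = 0 and B_n = 0); the triangular initial shape, string length and wave speed are assumptions.

```python
# Series solution of the vibrating string for a plucked string released from rest (a sketch).
import numpy as np
from scipy.integrate import quad

L, c, N = 1.0, 1.0, 40
f = lambda x: np.where(x < 0.5, x, 1.0 - x)       # assumed initial deflection, zero velocity

A = [2.0 / L * quad(lambda x, n=n: f(x) * np.sin(n * np.pi * x / L), 0, L)[0]
     for n in range(1, N + 1)]                     # A_n by the formula above; B_n = 0

def u(x, t):
    return sum(A[n - 1] * np.cos(n * np.pi * c * t / L) * np.sin(n * np.pi * x / L)
               for n in range(1, N + 1))

print(u(0.5, 0.0))      # about 0.5, the initial mid-span deflection
print(u(0.5, L / c))    # about -0.5: the shape is inverted after half a period (period = 2L/c)
```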

D'Alembert's solution of the wave equation

Method of characteristics
Canonical form:
By coordinate transformation from (x, y) to (\xi, \eta), with
U(\xi, \eta) = u[x(\xi, \eta), y(\xi, \eta)],

  hyperbolic equation:  U_{\xi\eta} = \Phi
  parabolic equation:   U_{\eta\eta} = \Phi
  elliptic equation:    U_{\xi\xi} + U_{\eta\eta} = \Phi

in which \Phi(\xi, \eta, U, U_\xi, U_\eta) is free from second derivatives.

For a hyperbolic equation, the entire domain becomes a network of \xi-\eta
coordinate curves, known as characteristic curves,
along which decoupled solutions can be tracked!

For a hyperbolic equation in the form

  a \frac{\partial^2 u}{\partial x^2} + 2b \frac{\partial^2 u}{\partial x \partial y} + c \frac{\partial^2 u}{\partial y^2} = F(x, y, u, u_x, u_y),

the roots of a m^2 + 2b m + c = 0 are

  m_{1,2} = \frac{-b \pm \sqrt{b^2 - ac}}{a},

real and distinct.

The coordinate transformation

  \xi = y + m_1 x,  \eta = y + m_2 x

leads to U_{\xi\eta} = \Phi(\xi, \eta, U, U_\xi, U_\eta).

For the BVP

  u_{tt} = c^2 u_{xx},  u(0, t) = u(L, t) = 0,  u(x, 0) = f(x),  u_t(x, 0) = g(x),

the canonical coordinate transformation is

  \xi = x - ct,  \eta = x + ct,

with x = \frac{1}{2}(\xi + \eta),  t = \frac{1}{2c}(\eta - \xi).

Substitution of the derivatives

  u_x = U_\xi \xi_x + U_\eta \eta_x = U_\xi + U_\eta,   u_{xx} = U_{\xi\xi} + 2U_{\xi\eta} + U_{\eta\eta},
  u_t = U_\xi \xi_t + U_\eta \eta_t = -c U_\xi + c U_\eta,   u_{tt} = c^2 U_{\xi\xi} - 2c^2 U_{\xi\eta} + c^2 U_{\eta\eta}

into the PDE u_{tt} = c^2 u_{xx} gives

  c^2 (U_{\xi\xi} - 2U_{\xi\eta} + U_{\eta\eta}) = c^2 (U_{\xi\xi} + 2U_{\xi\eta} + U_{\eta\eta}).

Canonical form: U_{\xi\eta} = 0

Integration:

  U_\eta = \int U_{\xi\eta}\,d\xi + \psi(\eta) = \psi(\eta)
  U(\xi, \eta) = \int \psi(\eta)\,d\eta + f_1(\xi) = f_1(\xi) + f_2(\eta)

D'Alembert's solution: u(x, t) = f_1(x - ct) + f_2(x + ct)
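A short sketch of D'Alembert's solution in action for an infinite string released from rest, where the solution reduces to the average of two travelling copies of the initial shape; the Gaussian initial hump and the wave speed are assumptions.

```python
# D'Alembert's solution for an infinite string released from rest (a sketch):
# with u(x, 0) = f(x) and u_t(x, 0) = 0, u(x, t) = [f(x - c t) + f(x + c t)] / 2.
import numpy as np

c = 1.0
f = lambda x: np.exp(-x**2)                 # assumed initial hump

def u(x, t):
    return 0.5 * (f(x - c * t) + f(x + c * t))

x = np.linspace(-5, 5, 11)
print(u(x, 0.0))    # the initial hump
print(u(x, 3.0))    # two half-height humps travelling left and right at speed c
```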

Physical insight from d'Alembert's solution:
$f_1(x - ct)$: a progressive wave in the forward direction with speed c.
Reflection at a boundary: in a manner depending upon the boundary condition.
Reflected wave $f_2(x + ct)$: another progressive wave, this one in the backward direction with speed c.
Superposition of the two waves: complete solution (response).

Note the components of the earlier (separated-variables) solution: with $\omega_n = \frac{cn\pi}{L}$,
$\cos \omega_n t \, \sin \frac{n\pi x}{L} = \frac{1}{2}\left[\sin \frac{n\pi}{L}(x - ct) + \sin \frac{n\pi}{L}(x + ct)\right],$
$\sin \omega_n t \, \sin \frac{n\pi x}{L} = \frac{1}{2}\left[\cos \frac{n\pi}{L}(x - ct) - \cos \frac{n\pi}{L}(x + ct)\right].$
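The traveling-wave identities above are easy to check numerically. The following is a small NumPy sketch (not from the textbook; the values of L, c, n and t are arbitrary choices) comparing one standing-wave term with the half-sum of the two progressive waves.

```python
import numpy as np

# Arbitrary illustrative values (not from the text)
L, c, n = 1.0, 2.0, 3
omega_n = n * np.pi * c / L

x = np.linspace(0.0, L, 201)
t = 0.37  # any instant

standing = np.cos(omega_n * t) * np.sin(n * np.pi * x / L)
travelling = 0.5 * (np.sin(n * np.pi * (x - c * t) / L)
                    + np.sin(n * np.pi * (x + c * t) / L))

# The two expressions agree to machine precision
print(np.max(np.abs(standing - travelling)))
```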

Parabolic Equations

Heat conduction equation or diffusion equation:
$\frac{\partial u}{\partial t} = c^2 \nabla^2 u$
One-dimensional heat (diffusion) equation: $u_t = c^2 u_{xx}$

Heat conduction in a finite bar: for a thin bar of length L with end-points at zero temperature,
$u_t = c^2 u_{xx}, \quad u(0, t) = u(L, t) = 0, \quad u(x, 0) = f(x).$
The assumption $u(x, t) = X(x) T(t)$ leads to
$X T' = c^2 X'' T \;\Rightarrow\; \frac{T'}{c^2 T} = \frac{X''}{X} = -p^2,$
giving rise to two ODEs as
$X'' + p^2 X = 0 \quad \text{and} \quad T' + c^2 p^2 T = 0.$

The BVP in the space coordinate, $X'' + p^2 X = 0$, $X(0) = X(L) = 0$, has solutions
$X_n(x) = \sin \frac{n\pi x}{L}.$
With $\lambda_n = \frac{cn\pi}{L}$, the ODE in $T(t)$ has the corresponding solutions
$T_n(t) = A_n e^{-\lambda_n^2 t}.$
By superposition,
$u(x, t) = \sum_{n=1}^{\infty} A_n \sin \frac{n\pi x}{L}\, e^{-\lambda_n^2 t},$
the coefficients being determined from the initial condition as
$u(x, 0) = f(x) = \sum_{n=1}^{\infty} A_n \sin \frac{n\pi x}{L},$
a Fourier sine series.
As $t \to \infty$, $u(x, t) \to 0$ (steady state).
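As a concrete illustration, the sine-series coefficients and the truncated series can be evaluated numerically. The sketch below is not the author's code; the bar length, diffusivity and initial profile f(x) are assumed for illustration.

```python
import numpy as np

L, c = 1.0, 0.5            # assumed bar length and diffusivity
f = lambda x: x * (L - x)  # assumed initial temperature profile

x = np.linspace(0.0, L, 1001)
dx = x[1] - x[0]
N = 50  # number of modes retained

def u(xq, t):
    total = np.zeros_like(xq)
    for n in range(1, N + 1):
        # A_n = (2/L) * integral_0^L f(x) sin(n pi x / L) dx (Riemann sum)
        A_n = (2.0 / L) * np.sum(f(x) * np.sin(n * np.pi * x / L)) * dx
        lam_n = c * n * np.pi / L
        total += A_n * np.sin(n * np.pi * xq / L) * np.exp(-lam_n**2 * t)
    return total

print(u(np.array([0.25, 0.5, 0.75]), t=0.1))
```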

Non-homogeneous boundary conditions:
$u_t = c^2 u_{xx}, \quad u(0, t) = u_1, \quad u(L, t) = u_2, \quad u(x, 0) = f(x).$
For $u_1 \neq u_2$, with $u(x, t) = X(x) T(t)$, the BCs do not separate!
Assume
$u(x, t) = U(x, t) + u_{ss}(x),$
where the component $u_{ss}(x)$, the steady-state temperature (distribution), does not enter the differential equation.
$u_{ss}''(x) = 0, \; u_{ss}(0) = u_1, \; u_{ss}(L) = u_2 \;\Rightarrow\; u_{ss}(x) = u_1 + \frac{u_2 - u_1}{L}\, x$
Substituting into the BVP,
$U_t = c^2 U_{xx}, \quad U(0, t) = U(L, t) = 0, \quad U(x, 0) = f(x) - u_{ss}(x).$
Final solution:
$u(x, t) = \sum_{n=1}^{\infty} B_n \sin \frac{n\pi x}{L}\, e^{-\lambda_n^2 t} + u_{ss}(x),$
$B_n$ being the coefficients of the Fourier sine series of $f(x) - u_{ss}(x)$.

Heat conduction in an infinite wire:
$u_t = c^2 u_{xx}, \quad u(x, 0) = f(x)$
In place of $\frac{n\pi}{L}$, we now have a continuous frequency p.
Solution as a superposition of all frequencies:
$u(x, t) = \int_0^{\infty} u_p(x, t)\, dp = \int_0^{\infty} [A(p) \cos px + B(p) \sin px]\, e^{-c^2 p^2 t}\, dp$
The initial condition
$u(x, 0) = f(x) = \int_0^{\infty} [A(p) \cos px + B(p) \sin px]\, dp$
gives the Fourier integral of f(x) and the amplitude functions
$A(p) = \frac{1}{\pi} \int_{-\infty}^{\infty} f(v) \cos pv\, dv \quad \text{and} \quad B(p) = \frac{1}{\pi} \int_{-\infty}^{\infty} f(v) \sin pv\, dv.$

Solution using Fourier transforms:
$u_t = c^2 u_{xx}, \quad u(x, 0) = f(x)$
Using the derivative formula of Fourier transforms,
$\mathcal{F}(u_t) = c^2 (iw)^2 \mathcal{F}(u) \;\Rightarrow\; \frac{\partial \hat{u}}{\partial t} = -c^2 w^2 \hat{u},$
since the variables x and t are independent.
Initial value problem in $\hat{u}(w, t)$:
$\frac{\partial \hat{u}}{\partial t} = -c^2 w^2 \hat{u}, \quad \hat{u}(w, 0) = \hat{f}(w)$
Solution: $\hat{u}(w, t) = \hat{f}(w)\, e^{-c^2 w^2 t}$
The inverse Fourier transform gives the solution of the original problem as
$u(x, t) = \mathcal{F}^{-1}\{\hat{u}(w, t)\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat{f}(w)\, e^{-c^2 w^2 t}\, e^{iwx}\, dw$
$\Rightarrow\; u(x, t) = \frac{1}{\pi} \int_{-\infty}^{\infty} f(v) \int_0^{\infty} e^{-c^2 w^2 t} \cos(wx - wv)\, dw\, dv.$
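A discrete analogue of this transform solution can be sketched with the FFT: on a periodic grid wide enough that the solution stays negligible at the ends, multiplying the discrete transform of f by $e^{-c^2 w^2 t}$ and inverting approximates u(x, t). The grid, diffusivity and initial Gaussian pulse below are illustrative assumptions, not taken from the text; the closed-form comparison holds for that particular Gaussian.

```python
import numpy as np

c = 1.0
Lbox = 40.0                      # assumed (large) periodic box
N = 1024
x = np.linspace(-Lbox / 2, Lbox / 2, N, endpoint=False)
f = np.exp(-x**2)                # assumed initial temperature pulse

w = 2.0 * np.pi * np.fft.fftfreq(N, d=x[1] - x[0])  # discrete frequencies

def u(t):
    # multiply the transform by exp(-c^2 w^2 t) and invert
    u_hat = np.fft.fft(f) * np.exp(-c**2 * w**2 * t)
    return np.fft.ifft(u_hat).real

# For this Gaussian initial condition the exact solution of u_t = c^2 u_xx
# is exp(-x^2/(1 + 4 c^2 t)) / sqrt(1 + 4 c^2 t); compare at t = 0.5
t = 0.5
exact = np.exp(-x**2 / (1 + 4 * c**2 * t)) / np.sqrt(1 + 4 * c**2 * t)
print(np.max(np.abs(u(t) - exact)))   # small
```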

Elliptic Equations

Heat flow in a plate: the two-dimensional heat equation
$\frac{\partial u}{\partial t} = c^2 \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right)$
Steady-state temperature distribution:
$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0,$
Laplace's equation.
Steady-state heat flow in a rectangular plate:
$u_{xx} + u_{yy} = 0, \quad u(0, y) = u(a, y) = u(x, 0) = 0, \quad u(x, b) = f(x);$
a Dirichlet problem over the domain $0 \le x \le a$, $0 \le y \le b$.
The proposal $u(x, y) = X(x) Y(y)$ leads to
$X'' Y + X Y'' = 0 \;\Rightarrow\; \frac{X''}{X} = -\frac{Y''}{Y} = -p^2.$
Separated ODEs:
$X'' + p^2 X = 0 \quad \text{and} \quad Y'' - p^2 Y = 0$

From the BVP $X'' + p^2 X = 0$, $X(0) = X(a) = 0$: $X_n(x) = \sin \frac{n\pi x}{a}$.
Corresponding solution of $Y'' - p^2 Y = 0$:
$Y_n(y) = A_n \cosh \frac{n\pi y}{a} + B_n \sinh \frac{n\pi y}{a}$
The condition $Y(0) = 0 \Rightarrow A_n = 0$, and
$u_n(x, y) = B_n \sin \frac{n\pi x}{a} \sinh \frac{n\pi y}{a}$
The complete solution:
$u(x, y) = \sum_{n=1}^{\infty} B_n \sin \frac{n\pi x}{a} \sinh \frac{n\pi y}{a}$
The last boundary condition, $u(x, b) = f(x)$, fixes the coefficients from the Fourier sine series of $f(x)$.
Note: In the example, the BCs on three sides were homogeneous.
How did it help? What if there are more non-homogeneous BCs?
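A sketch of the rectangle solution in code (with made-up a, b and boundary data f(x), not from the text): the product $B_n \sinh(n\pi b/a)$ is the n-th Fourier sine coefficient of f.

```python
import numpy as np

a, b = 2.0, 1.0                  # assumed plate dimensions
f = lambda x: np.sin(np.pi * x / a) + 0.3 * np.sin(3 * np.pi * x / a)  # assumed u(x, b)

xg = np.linspace(0.0, a, 2001)
dx = xg[1] - xg[0]
N = 40

def u(x, y):
    total = 0.0
    for n in range(1, N + 1):
        # Fourier sine coefficient of f on (0, a), by Riemann sum
        fn = (2.0 / a) * np.sum(f(xg) * np.sin(n * np.pi * xg / a)) * dx
        B_n = fn / np.sinh(n * np.pi * b / a)
        total += B_n * np.sin(n * np.pi * x / a) * np.sinh(n * np.pi * y / a)
    return total

print(u(0.7, 0.5))          # interior value
print(u(0.7, b), f(0.7))    # should nearly match the boundary data
```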

Steady-state heat flow with internal heat generation:
$\nabla^2 u = \phi(x, y),$
Poisson's equation.
Separation of variables impossible!
Consider the function $u(x, y)$ as
$u(x, y) = u_h(x, y) + u_p(x, y)$
Sequence of steps:
one particular solution $u_p(x, y)$ that may or may not satisfy some or all of the boundary conditions;
solution of the corresponding homogeneous equation, namely $u_{xx} + u_{yy} = 0$, for $u_h(x, y)$;
such that $u = u_h + u_p$ satisfies all the boundary conditions.

Two-Dimensional Wave Equation

Transverse vibration of a rectangular membrane:
$\frac{\partial^2 u}{\partial t^2} = c^2 \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right)$
A Cauchy problem of the membrane:
$u_{tt} = c^2 (u_{xx} + u_{yy});$
$u(x, y, 0) = f(x, y), \quad u_t(x, y, 0) = g(x, y);$
$u(0, y, t) = u(a, y, t) = u(x, 0, t) = u(x, b, t) = 0.$
Separate the time variable from the space variables:
$u(x, y, t) = F(x, y) T(t) \;\Rightarrow\; \frac{T''}{c^2 T} = \frac{F_{xx} + F_{yy}}{F} = -\nu^2$
Helmholtz equation:
$F_{xx} + F_{yy} + \nu^2 F = 0$
Assuming $F(x, y) = X(x) Y(y)$,
$\frac{X''}{X} = -\frac{Y'' + \nu^2 Y}{Y} = -\mu^2$
$\Rightarrow\; X'' + \mu^2 X = 0 \quad \text{and} \quad Y'' + \kappa^2 Y = 0, \quad \text{such that } \nu = \sqrt{\mu^2 + \kappa^2}.$
With the BCs $X(0) = X(a) = 0$ and $Y(0) = Y(b) = 0$,
$X_m(x) = \sin \frac{m\pi x}{a} \quad \text{and} \quad Y_n(y) = \sin \frac{n\pi y}{b}.$
The corresponding values of $\nu$ are
$\nu_{mn} = \pi \sqrt{\left(\frac{m}{a}\right)^2 + \left(\frac{n}{b}\right)^2},$
with solutions of $T'' + c^2 \nu^2 T = 0$ as
$T_{mn}(t) = A_{mn} \cos c\nu_{mn} t + B_{mn} \sin c\nu_{mn} t.$

Composing $X_m(x)$, $Y_n(y)$ and $T_{mn}(t)$ and superposing,
$u(x, y, t) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} [A_{mn} \cos c\nu_{mn} t + B_{mn} \sin c\nu_{mn} t] \sin \frac{m\pi x}{a} \sin \frac{n\pi y}{b},$
the coefficients being determined from the double Fourier series
$f(x, y) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} A_{mn} \sin \frac{m\pi x}{a} \sin \frac{n\pi y}{b}$
and
$g(x, y) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} c\nu_{mn} B_{mn} \sin \frac{m\pi x}{a} \sin \frac{n\pi y}{b}.$

BVPs modelled in polar coordinates:
For domains of circular symmetry, important in many practical systems, the BVP is conveniently modelled in polar coordinates, the separation of variables quite often producing
Bessel's equation, in cylindrical coordinates, and
Legendre's equation, in spherical coordinates.
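For the membrane, the double Fourier coefficients can likewise be approximated by a double sum. The sketch below uses illustrative membrane dimensions, wave speed and initial deflection (none from the text) and takes g = 0, so that all $B_{mn}$ vanish.

```python
import numpy as np

a, b, c = 1.0, 1.0, 1.0                      # assumed membrane data
f = lambda x, y: x * (a - x) * y * (b - y)   # assumed initial deflection, g = 0

nx = ny = 201
x = np.linspace(0, a, nx); y = np.linspace(0, b, ny)
X, Y = np.meshgrid(x, y, indexing="ij")
dA = (x[1] - x[0]) * (y[1] - y[0])
M = N = 10

def u(xq, yq, t):
    total = 0.0
    for m in range(1, M + 1):
        for n in range(1, N + 1):
            phi = np.sin(m * np.pi * X / a) * np.sin(n * np.pi * Y / b)
            # A_mn = (4/(a b)) * double integral of f * phi (Riemann sum)
            A_mn = (4.0 / (a * b)) * np.sum(f(X, Y) * phi) * dA
            nu_mn = np.pi * np.sqrt((m / a) ** 2 + (n / b) ** 2)
            total += A_mn * np.cos(c * nu_mn * t) * \
                     np.sin(m * np.pi * xq / a) * np.sin(n * np.pi * yq / b)
    return total

print(u(0.5, 0.5, 0.0), f(0.5, 0.5))  # truncated series approximately reproduces f at t = 0
```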

Points to note

PDEs in physically relevant contexts
Initial and boundary conditions
Separation of variables
Examples of boundary value problems with hyperbolic, parabolic and elliptic equations
Modelling, solution and interpretation
Cascaded application of separation of variables for problems with more than two independent variables

Necessary Exercises: 1, 2, 4, 7, 9, 10

Outline

Analytic Functions
  Analyticity of Complex Functions
  Conformal Mapping
  Potential Theory

Analyticity of Complex Functions

A function f of a complex variable z gives a rule to associate a unique complex number $w = u + iv$ to every $z = x + iy$ in a set.
Limit: If f(z) is defined in a neighbourhood of $z_0$ (except possibly at $z_0$ itself) and there exists $l \in \mathbb{C}$ such that for every $\epsilon > 0$ there exists $\delta > 0$ with
$0 < |z - z_0| < \delta \;\Rightarrow\; |f(z) - l| < \epsilon,$
then
$l = \lim_{z \to z_0} f(z).$
Crucial difference from real functions: z can approach $z_0$ in all possible manners in the complex plane.
The definition of the limit is more restrictive.
Continuity: $\lim_{z \to z_0} f(z) = f(z_0)$
Continuity in a domain D: continuity at every point in D.

Derivative of a complex function:
$f'(z_0) = \lim_{\Delta z \to 0} \frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z} = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}$
When this limit exists, the function f(z) is said to be differentiable.
Extremely restrictive definition!
Analytic function:
A function f(z) is called analytic in a domain D if it is defined and differentiable at all points in D.
Points to be settled later:
The derivative of an analytic function is also analytic.
An analytic function possesses derivatives of all orders.
A great qualitative difference between functions of a real variable and those of a complex variable!

Cauchy-Riemann conditions:
If $f(z) = u(x, y) + i v(x, y)$ is analytic, then
$f'(z) = \lim_{\Delta x, \Delta y \to 0} \frac{\Delta u + i\, \Delta v}{\Delta x + i\, \Delta y}$
along all paths of approach for $\Delta z = \Delta x + i\, \Delta y \to 0$, i.e. $\Delta x, \Delta y \to 0$.

[Figure: paths approaching $z_0$; the two paths used in the C-R equations, $\Delta z = \Delta x$ and $\Delta z = i\, \Delta y$]

Two expressions for the derivative:
$f'(z) = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} - i \frac{\partial u}{\partial y}$

The Cauchy-Riemann equations or conditions
$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$
are necessary for analyticity.
Question: Do the C-R conditions imply analyticity?
Consider u(x, y) and v(x, y) having continuous first order partial derivatives that satisfy the Cauchy-Riemann conditions.
By the mean value theorem,
$\Delta u = u(x + \Delta x, y + \Delta y) - u(x, y) = \Delta x\, \frac{\partial u}{\partial x}(x_1, y_1) + \Delta y\, \frac{\partial u}{\partial y}(x_1, y_1)$
with $x_1 = x + \theta \Delta x$, $y_1 = y + \theta \Delta y$ for some $\theta \in [0, 1]$; and
$\Delta v = v(x + \Delta x, y + \Delta y) - v(x, y) = \Delta x\, \frac{\partial v}{\partial x}(x_2, y_2) + \Delta y\, \frac{\partial v}{\partial y}(x_2, y_2)$
with $x_2 = x + \phi \Delta x$, $y_2 = y + \phi \Delta y$ for some $\phi \in [0, 1]$.
Then,
$\Delta f = \Delta u + i\, \Delta v = \Delta x\, \frac{\partial u}{\partial x}(x_1, y_1) + \Delta y\, \frac{\partial u}{\partial y}(x_1, y_1) + i\, \Delta x\, \frac{\partial v}{\partial x}(x_2, y_2) + i\, \Delta y\, \frac{\partial v}{\partial y}(x_2, y_2).$
Using the C-R conditions $\frac{\partial v}{\partial y} = \frac{\partial u}{\partial x}$ and $\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$,
$\Delta f = (\Delta x + i\, \Delta y) \frac{\partial u}{\partial x}(x_1, y_1) + i\, \Delta y \left[\frac{\partial u}{\partial x}(x_2, y_2) - \frac{\partial u}{\partial x}(x_1, y_1)\right] + i (\Delta x + i\, \Delta y) \frac{\partial v}{\partial x}(x_1, y_1) + i\, \Delta x \left[\frac{\partial v}{\partial x}(x_2, y_2) - \frac{\partial v}{\partial x}(x_1, y_1)\right]$
$\Rightarrow\; \frac{\Delta f}{\Delta z} = \frac{\partial u}{\partial x}(x_1, y_1) + i \frac{\partial v}{\partial x}(x_1, y_1) + i \frac{\Delta y}{\Delta z} \left[\frac{\partial u}{\partial x}(x_2, y_2) - \frac{\partial u}{\partial x}(x_1, y_1)\right] + i \frac{\Delta x}{\Delta z} \left[\frac{\partial v}{\partial x}(x_2, y_2) - \frac{\partial v}{\partial x}(x_1, y_1)\right].$
Since $\left|\frac{\Delta x}{\Delta z}\right| \le 1$ and $\left|\frac{\Delta y}{\Delta z}\right| \le 1$, and the bracketed differences vanish by continuity as $\Delta z \to 0$, the limit exists and
$f'(z) = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} - i \frac{\partial u}{\partial y}.$
The Cauchy-Riemann conditions are necessary and sufficient for the function $w = f(z) = u(x, y) + i v(x, y)$ to be analytic.
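A quick symbolic check of the C-R conditions and of the two derivative expressions, for one sample analytic function, can be sketched with SymPy (the choice of f below is arbitrary, not from the text):

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)
z = x + sp.I * y
f = z**3 + sp.exp(z)                 # sample analytic function

fc = sp.expand(f, complex=True)      # split into real and imaginary parts
u = sp.re(fc)                        # u(x, y)
v = sp.im(fc)                        # v(x, y)

# Cauchy-Riemann conditions: u_x = v_y and u_y = -v_x
print(sp.simplify(sp.diff(u, x) - sp.diff(v, y)))   # 0
print(sp.simplify(sp.diff(u, y) + sp.diff(v, x)))   # 0

# Both expressions for f'(z) agree
fp1 = sp.diff(u, x) + sp.I * sp.diff(v, x)
fp2 = sp.diff(v, y) - sp.I * sp.diff(u, y)
print(sp.simplify(fp1 - fp2))                       # 0
```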

Harmonic function:
Differentiating the C-R equations $\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}$ and $\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$,
$\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 v}{\partial x \partial y}, \quad \frac{\partial^2 u}{\partial y^2} = -\frac{\partial^2 v}{\partial y \partial x}, \quad \frac{\partial^2 v}{\partial x^2} = -\frac{\partial^2 u}{\partial x \partial y}, \quad \frac{\partial^2 v}{\partial y^2} = \frac{\partial^2 u}{\partial y \partial x},$
$\Rightarrow\; \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}.$
The real and imaginary components of an analytic function are harmonic functions.
Conjugate harmonic function of u(x, y): v(x, y).
The families of curves u(x, y) = c and v(x, y) = k are mutually orthogonal, except possibly at points where $f'(z) = 0$.
Question: If u(x, y) is given, then how to develop the complete analytic function $w = f(z) = u(x, y) + i v(x, y)$?

Conformal Mapping

Function: a mapping of elements in the domain to their images in the range.
Depiction of a complex variable requires a plane with two axes.
The mapping of a complex function w = f(z) is therefore shown in two planes.
Example: mapping of a rectangle under the transformation $w = e^z$

[Figure: mapping corresponding to the function $w = e^z$: (a) the z-plane, (b) the w-plane]

Conformal mapping: a mapping that preserves the angle between any two directions, in magnitude and sense.
Verify: $w = e^z$ defines a conformal mapping.
Through the relative orientations of curves at the points of intersection, the local shape of a figure is preserved.
Take a curve z(t), with $z(0) = z_0$, and its image $w(t) = f[z(t)]$, $w_0 = f(z_0)$.
For analytic f(z), $\dot{w}(0) = f'(z_0)\, \dot{z}(0)$, implying
$|\dot{w}(0)| = |f'(z_0)|\, |\dot{z}(0)| \quad \text{and} \quad \arg \dot{w}(0) = \arg f'(z_0) + \arg \dot{z}(0).$
For several curves through $z_0$, the image curves pass through $w_0$ and all of them turn by the same angle $\arg f'(z_0)$.
Cautions:
$f'(z)$ varies from point to point. Different scaling and turning effects take place at different points. The global shape changes.
For $f'(z) = 0$, the argument is undefined and conformality is lost.
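A numerical sanity check of conformality (a sketch, not from the text): push two directions through a point $z_0$ forward under $w = e^z$ and compare the angle between the image tangents with the original angle.

```python
import numpy as np

z0 = 0.3 + 0.4j
d1, d2 = np.exp(1j * 0.2), np.exp(1j * 1.1)   # two directions at z0

f = np.exp          # the mapping w = e^z
h = 1e-6            # small step for numerical tangents

# image tangents: (f(z0 + h*d) - f(z0)) / h is approximately f'(z0) * d
w1 = (f(z0 + h * d1) - f(z0)) / h
w2 = (f(z0 + h * d2) - f(z0)) / h

angle_before = np.angle(d2 / d1)
angle_after = np.angle(w2 / w1)
print(angle_before, angle_after)   # equal up to O(h): angle preserved in size and sense
```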

An analytic function defines a conformal mapping except at its critical points, where its derivative vanishes.
Except at critical points, an analytic function is invertible.
We can establish an inverse of any conformal mapping.
Examples:
Linear function w = az + b (for $a \neq 0$)
Linear fractional transformation
$w = \frac{az + b}{cz + d}, \quad ad - bc \neq 0$
Other elementary functions like $z^n$, $e^z$ etc.
Special significance of conformal mappings:
A harmonic function $\phi(u, v)$ in the w-plane is also a harmonic function, in the form $\phi(x, y)$, in the z-plane, as long as the two planes are related through a conformal mapping.

Potential Theory

Riemann mapping theorem: Let D be a simply connected domain in the z-plane bounded by a closed curve C. Then there exists a conformal mapping that gives a one-to-one correspondence between D and the unit disc |w| < 1, as well as between C and the unit circle |w| = 1 bounding the unit disc.

Application to boundary value problems:
First, establish a conformal mapping between the given domain and a domain of simple geometry.
Next, solve the BVP in this simple domain.
Finally, using the inverse of the conformal mapping, construct the solution for the given domain.

Example: the Dirichlet problem with Poisson's integral formula
$f(re^{i\theta}) = \frac{1}{2\pi} \int_0^{2\pi} \frac{(R^2 - r^2)\, f(Re^{i\phi})}{R^2 - 2Rr \cos(\theta - \phi) + r^2}\, d\phi$

Two-dimensional potential flow:
The velocity potential $\phi(x, y)$ gives the velocity components $V_x = \frac{\partial \phi}{\partial x}$ and $V_y = \frac{\partial \phi}{\partial y}$.
A streamline is a curve in the flow field, the tangent to which at any point is along the local velocity vector.
The stream function $\psi(x, y)$ remains constant along a streamline.
$\psi(x, y)$ is the conjugate harmonic function of $\phi(x, y)$.
The complex potential function $\Phi(z) = \phi(x, y) + i \psi(x, y)$ defines the flow.
If a flow field encounters a solid boundary of a complicated shape, transform the boundary conformally to a simple boundary to facilitate the study of the flow pattern.

Points to note

Analytic functions and Cauchy-Riemann conditions
Conformality of analytic functions
Applications in solving BVPs and flow description

Necessary Exercises: 1, 2, 3, 4, 7, 9

Outline

Integrals in the Complex Plane
  Line Integral
  Cauchy's Integral Theorem
  Cauchy's Integral Formula

Line Integral

For $w = f(z) = u(x, y) + i v(x, y)$, over a smooth curve C,
$\int_C f(z)\, dz = \int_C (u + iv)(dx + i\, dy) = \int_C (u\, dx - v\, dy) + i \int_C (v\, dx + u\, dy).$
Extension to piecewise smooth curves is obvious.
With parametrization, for $z = z(t)$, $a \le t \le b$, with $\dot{z}(t) \neq 0$,
$\int_C f(z)\, dz = \int_a^b f[z(t)]\, \dot{z}(t)\, dt.$
Over a simple closed curve, the contour integral: $\oint_C f(z)\, dz$
Example: $\oint_C z^n\, dz$ for integer n, around the circle $z = e^{i\theta}$:
$\oint_C z^n\, dz = i \int_0^{2\pi} e^{i(n+1)\theta}\, d\theta = \begin{cases} 0 & \text{for } n \neq -1, \\ 2\pi i & \text{for } n = -1. \end{cases}$
The M-L inequality: If C is a curve of finite length L and |f(z)| < M on C, then
$\left|\int_C f(z)\, dz\right| \le \int_C |f(z)|\, |dz| < M \int_C |dz| = ML.$
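The $z^n$ example is easy to reproduce numerically; the following sketch (not the book's code) discretizes the parametrized contour integral $\int_a^b f[z(t)]\,\dot z(t)\,dt$ on the unit circle.

```python
import numpy as np

def contour_integral(f, n_points=20000):
    """Integrate f over the unit circle z = e^{i t}, 0 <= t < 2*pi."""
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    z = np.exp(1j * t)
    dz_dt = 1j * np.exp(1j * t)
    return np.sum(f(z) * dz_dt) * (t[1] - t[0])

for n in (-2, -1, 0, 1, 3):
    print(n, contour_integral(lambda z, n=n: z**n))
# only n = -1 gives 2*pi*i; every other integer power integrates to ~0
```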

Cauchy's Integral Theorem

C is a simple closed curve in a simply connected domain D.
The function f(z) = u + iv is analytic in D.
Contour integral $\oint_C f(z)\, dz = ?$
If $f'(z)$ is continuous, then by Green's theorem in the plane,
$\oint_C f(z)\, dz = \iint_R \left(-\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) dx\, dy + i \iint_R \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) dx\, dy,$
where R is the region enclosed by C.
From the C-R conditions, $\oint_C f(z)\, dz = 0$.
Proof by Goursat: without the hypothesis of continuity of $f'(z)$.
Cauchy-Goursat theorem:
If f(z) is analytic in a simply connected domain D, then $\oint_C f(z)\, dz = 0$ for every simple closed curve C in D.
Importance of Goursat's contribution: continuity of $f'(z)$ appears as a consequence!

Principle of path independence:
Take two points $z_1$ and $z_2$ on the closed curve C, and the two open paths $C_1$ and $C_2$ from $z_1$ to $z_2$.
Cauchy's theorem on C, comprising $C_1$ in the forward direction and $C_2$ in the reverse direction:
$\int_{C_1} f(z)\, dz - \int_{C_2} f(z)\, dz = 0 \;\Rightarrow\; \int_{z_1}^{z_2} f(z)\, dz = \int_{C_1} f(z)\, dz = \int_{C_2} f(z)\, dz$
For an analytic function f(z) in a simply connected domain D, $\int_{z_1}^{z_2} f(z)\, dz$ is independent of the path and depends only on the end-points, as long as the path is completely contained in D.
Consequence: definition of the function
$F(z) = \int_{z_0}^{z} f(\zeta)\, d\zeta$
What does the formulation suggest?

Indefinite integral:
Question: Is F(z) analytic? Is $F'(z) = f(z)$?
$\frac{F(z + \Delta z) - F(z)}{\Delta z} - f(z) = \frac{1}{\Delta z} \left[\int_{z_0}^{z + \Delta z} f(\zeta)\, d\zeta - \int_{z_0}^{z} f(\zeta)\, d\zeta\right] - f(z) = \frac{1}{\Delta z} \int_{z}^{z + \Delta z} [f(\zeta) - f(z)]\, d\zeta$
f is continuous $\Rightarrow$ for every $\epsilon$ there exists $\delta$ such that $|\zeta - z| < \delta \Rightarrow |f(\zeta) - f(z)| < \epsilon$.
Choosing $|\Delta z| < \delta$,
$\left|\frac{F(z + \Delta z) - F(z)}{\Delta z} - f(z)\right| < \frac{\epsilon}{|\Delta z|} \int_{z}^{z + \Delta z} |d\zeta| = \epsilon.$
If f(z) is analytic in a simply connected domain D, then there exists an analytic function F(z) in D such that $F'(z) = f(z)$ and
$\int_{z_1}^{z_2} f(z)\, dz = F(z_2) - F(z_1).$

Principle of deformation of paths:
f(z) analytic everywhere other than at the isolated points $s_1$, $s_2$, $s_3$:
$\int_{C_1} f(z)\, dz = \int_{C_2} f(z)\, dz = \int_{C_3} f(z)\, dz;$
not so for the path $C^*$.

[Figure: path deformation in a domain D, with paths $C_1$, $C_2$, $C_3$ and $C^*$ from $z_1$ to $z_2$, and singular points $s_1$, $s_2$, $s_3$]

The line integral remains unaltered through a continuous deformation of the path of integration with fixed end-points, as long as the sweep of the deformation includes no point where the integrand is non-analytic.

Cauchy's theorem in a multiply connected domain:

[Figure: contour for a multiply connected domain, the outer contour C connected to the inner contours $C_1$, $C_2$, $C_3$ by the cuts $L_1$, $L_2$, $L_3$]

$\oint_C f(z)\, dz - \oint_{C_1} f(z)\, dz - \oint_{C_2} f(z)\, dz - \oint_{C_3} f(z)\, dz = 0.$
If f(z) is analytic in a region bounded by the contour C as the outer boundary and non-overlapping contours $C_1, C_2, C_3, \ldots, C_n$ as inner boundaries, then
$\oint_C f(z)\, dz = \sum_{i=1}^{n} \oint_{C_i} f(z)\, dz.$

Cauchy's Integral Formula

f(z): an analytic function in a simply connected domain D.
For $z_0 \in D$ and a simple closed curve C in D,
$\oint_C \frac{f(z)}{z - z_0}\, dz = 2\pi i\, f(z_0).$
Consider C as a circle with centre at $z_0$ and radius $\delta$, with no loss of generality (why?).
$\oint_C \frac{f(z)}{z - z_0}\, dz = f(z_0) \oint_C \frac{dz}{z - z_0} + \oint_C \frac{f(z) - f(z_0)}{z - z_0}\, dz$
From the continuity of f(z), for any $\epsilon$ there exists $\delta_1$ such that
$|z - z_0| < \delta_1 \;\Rightarrow\; |f(z) - f(z_0)| < \epsilon \quad \text{and} \quad \left|\frac{f(z) - f(z_0)}{z - z_0}\right| < \frac{\epsilon}{\delta},$
with $\delta < \delta_1$.
From the M-L inequality, the second integral vanishes (its modulus is below $\frac{\epsilon}{\delta} \cdot 2\pi\delta = 2\pi\epsilon$ for arbitrary $\epsilon$), while the first, from the earlier example, equals $2\pi i\, f(z_0)$.
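A numerical illustration of the formula (a sketch with an arbitrarily chosen analytic f, point $z_0$ and circular contour; not from the text):

```python
import numpy as np

def contour_integral(g, center=0.0, radius=1.0, n_points=20000):
    # circle z = center + radius * e^{i t}
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    z = center + radius * np.exp(1j * t)
    dz_dt = 1j * radius * np.exp(1j * t)
    return np.sum(g(z) * dz_dt) * (t[1] - t[0])

f = lambda z: np.exp(z) * np.cos(z)       # assumed analytic test function
z0 = 0.3 - 0.2j                           # point inside the unit-circle contour

lhs = contour_integral(lambda z: f(z) / (z - z0))
print(lhs / (2j * np.pi), f(z0))          # the two values agree
```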

Direct applications:
Evaluation of a contour integral:
If g(z) is analytic on the contour and in the enclosed region, Cauchy's theorem implies $\oint_C g(z)\, dz = 0$.
If the contour encloses a singularity at $z_0$, then Cauchy's formula supplies a non-zero contribution to the integral, if $f(z) = g(z)(z - z_0)$ is analytic.
Evaluation of a function at a point: if finding the integral on the left-hand side is relatively simple, then we use it to evaluate $f(z_0)$.
Significant in the solution of boundary value problems!
Example: Poisson's integral formula
$u(r, \theta) = \frac{1}{2\pi} \int_0^{2\pi} \frac{(R^2 - r^2)\, u(R, \phi)}{R^2 - 2Rr \cos(\theta - \phi) + r^2}\, d\phi$
for the Dirichlet problem over a circular disc.

Poisson's integral formula:
Taking $z_0 = re^{i\theta}$ and $z = Re^{i\phi}$ (with r < R) in Cauchy's formula,
$2\pi i\, f(re^{i\theta}) = \int_0^{2\pi} \frac{f(Re^{i\phi})}{Re^{i\phi} - re^{i\theta}}\, (iRe^{i\phi})\, d\phi.$
How to get rid of imaginary quantities from the expression?
Develop a complement. With $\frac{R^2}{r}$ in place of r (this point lies outside the circle, so the corresponding integral is zero),
$0 = \int_0^{2\pi} \frac{f(Re^{i\phi})}{Re^{i\phi} - \frac{R^2}{r} e^{i\theta}}\, (iRe^{i\phi})\, d\phi = \int_0^{2\pi} \frac{f(Re^{i\phi})}{re^{i\phi} - Re^{i\theta}}\, (ire^{i\phi})\, d\phi.$
Subtracting,
$2\pi i\, f(re^{i\theta}) = i \int_0^{2\pi} f(Re^{i\phi}) \left[\frac{Re^{i\phi}}{Re^{i\phi} - re^{i\theta}} + \frac{re^{-i\theta}}{Re^{-i\phi} - re^{-i\theta}}\right] d\phi = i \int_0^{2\pi} \frac{(R^2 - r^2)\, f(Re^{i\phi})}{(Re^{i\phi} - re^{i\theta})(Re^{-i\phi} - re^{-i\theta})}\, d\phi$
$\Rightarrow\; f(re^{i\theta}) = \frac{1}{2\pi} \int_0^{2\pi} \frac{(R^2 - r^2)\, f(Re^{i\phi})}{R^2 - 2Rr \cos(\theta - \phi) + r^2}\, d\phi.$
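The disc formula can be checked numerically. In the sketch below the boundary data are chosen, for illustration only, as the values of the harmonic function $\mathrm{Re}\, z^2 = r^2 \cos 2\theta$ on the circle of radius R.

```python
import numpy as np

R = 2.0
u_boundary = lambda phi: (R**2) * np.cos(2 * phi)   # boundary values of Re(z^2)

def u(r, theta, n=20000):
    phi = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    kernel = (R**2 - r**2) / (R**2 - 2 * R * r * np.cos(theta - phi) + r**2)
    return np.sum(kernel * u_boundary(phi)) * (phi[1] - phi[0]) / (2 * np.pi)

r, theta = 1.2, 0.8
print(u(r, theta), r**2 * np.cos(2 * theta))   # Poisson integral reproduces Re(z^2)
```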

Cauchy's integral formula evaluates the contour integral of g(z) if the contour encloses a point $z_0$ where g(z) is non-analytic but $g(z)(z - z_0)$ is analytic.
What if $g(z)(z - z_0)$ is also non-analytic, but $g(z)(z - z_0)^2$ is analytic?
$f(z_0) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z - z_0}\, dz,$
$f'(z_0) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^2}\, dz,$
$f''(z_0) = \frac{2!}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^3}\, dz,$
$\cdots,$
$f^{(n)}(z_0) = \frac{n!}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^{n+1}}\, dz.$
The formal expressions can be established through differentiation under the integral sign.
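The same kind of numerical contour integration verifies the derivative formula; in the sketch below (arbitrary choices, not from the text) $f = e^z$, every derivative of which at $z_0$ equals $e^{z_0}$.

```python
import numpy as np
from math import factorial

def contour_integral(g, center, radius=1.0, n_points=20000):
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    z = center + radius * np.exp(1j * t)
    dz_dt = 1j * radius * np.exp(1j * t)
    return np.sum(g(z) * dz_dt) * (t[1] - t[0])

f = np.exp
z0 = 0.5 + 0.1j
for n in range(4):
    # f^(n)(z0) = n!/(2*pi*i) * contour integral of f(z)/(z - z0)^(n+1)
    deriv = factorial(n) / (2j * np.pi) * contour_integral(
        lambda z, n=n: f(z) / (z - z0) ** (n + 1), center=z0)
    print(n, deriv, np.exp(z0))   # every derivative of e^z at z0 equals e^{z0}
```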

$\frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z} = \frac{1}{2\pi i\, \Delta z} \oint_C f(z) \left[\frac{1}{z - z_0 - \Delta z} - \frac{1}{z - z_0}\right] dz = \frac{1}{2\pi i} \oint_C \frac{f(z)\, dz}{(z - z_0 - \Delta z)(z - z_0)}$
$= \frac{1}{2\pi i} \oint_C \frac{f(z)\, dz}{(z - z_0)^2} + \frac{1}{2\pi i} \oint_C f(z) \left[\frac{1}{(z - z_0 - \Delta z)(z - z_0)} - \frac{1}{(z - z_0)^2}\right] dz$
$= \frac{1}{2\pi i} \oint_C \frac{f(z)\, dz}{(z - z_0)^2} + \frac{\Delta z}{2\pi i} \oint_C \frac{f(z)\, dz}{(z - z_0 - \Delta z)(z - z_0)^2}$
If |f(z)| < M on C, L is the path length and $d = \min_{z \in C} |z - z_0|$, then
$\left|\Delta z \oint_C \frac{f(z)\, dz}{(z - z_0 - \Delta z)(z - z_0)^2}\right| < |\Delta z|\, \frac{ML}{d^2 (d - |\Delta z|)} \to 0 \quad \text{as } \Delta z \to 0.$
An analytic function possesses derivatives of all orders at every point in its domain.
Analyticity implies much more than mere differentiability!

Points to note

Concept of the line integral in the complex plane
Cauchy's integral theorem
Consequences of analyticity
Cauchy's integral formula
Derivatives of arbitrary order for analytic functions

Necessary Exercises: 1, 2, 5, 7

Outline

Singularities of Complex Functions
  Series Representations of Complex Functions
  Zeros and Singularities
  Residues
  Evaluation of Real Integrals

Series Representations of Complex Functions

Taylor's series of a function f(z), analytic in a neighbourhood of $z_0$:
$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n = a_0 + a_1 (z - z_0) + a_2 (z - z_0)^2 + a_3 (z - z_0)^3 + \cdots,$
with coefficients
$a_n = \frac{1}{n!} f^{(n)}(z_0) = \frac{1}{2\pi i} \oint_C \frac{f(w)\, dw}{(w - z_0)^{n+1}},$
where C is a circle with centre at $z_0$.
Form of the series and coefficients: similar to real functions.
The series representation is convergent within a disc $|z - z_0| < R$, where the radius of convergence R is the distance of the nearest singularity from $z_0$.
Note: there is no valid power series representation around $z_0$, i.e. in powers of $(z - z_0)$, if f(z) is not analytic at $z_0$.
Question: In that case, what about a series representation that includes negative powers of $(z - z_0)$ as well?

Laurent's series: If f(z) is analytic on circles $C_1$ (outer) and $C_2$ (inner) with centre at $z_0$, and in the annulus in between, then
$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n = \sum_{m=0}^{\infty} b_m (z - z_0)^m + \sum_{m=1}^{\infty} \frac{c_m}{(z - z_0)^m};$
with coefficients
$a_n = \frac{1}{2\pi i} \oint_C \frac{f(w)\, dw}{(w - z_0)^{n+1}};$
or,
$b_m = \frac{1}{2\pi i} \oint_C \frac{f(w)\, dw}{(w - z_0)^{m+1}}, \quad c_m = \frac{1}{2\pi i} \oint_C f(w) (w - z_0)^{m-1}\, dw;$
the contour C lying in the annulus and enclosing $C_2$.
Validity of this series representation: in the annular region obtained by growing $C_1$ and shrinking $C_2$ till f(z) ceases to be analytic.
Observation: If f(z) is analytic inside $C_2$ as well, then $c_m = 0$ and Laurent's series reduces to Taylor's series.

Proof of Laurent's series

Cauchy's integral formula for any point z in the annulus:

  f(z) = \frac{1}{2\pi i} \oint_{C_1} \frac{f(w)\,dw}{w - z}
       - \frac{1}{2\pi i} \oint_{C_2} \frac{f(w)\,dw}{w - z}.

[Figure: the annulus, with centre z0, inner circle C2, outer circle C1, and the point z lying between them.]

Organization of the series:

  \frac{1}{w - z} = \frac{1}{(w - z_0)\,[1 - (z - z_0)/(w - z_0)]},
  \qquad
  \frac{1}{w - z} = -\frac{1}{(z - z_0)\,[1 - (w - z_0)/(z - z_0)]}.

Using the expression for the sum of a geometric series,

  1 + q + q^2 + \cdots + q^{n-1} = \frac{1 - q^n}{1 - q},
  \quad\text{i.e.}\quad
  \frac{1}{1 - q} = 1 + q + q^2 + \cdots + q^{n-1} + \frac{q^n}{1 - q}.

We use q = (z - z_0)/(w - z_0) for the integral over C1 and q = (w - z_0)/(z - z_0) for the integral over C2.

Using q = (z - z_0)/(w - z_0),

  \frac{1}{w - z} = \frac{1}{w - z_0} + \frac{z - z_0}{(w - z_0)^2} + \cdots
                    + \frac{(z - z_0)^{n-1}}{(w - z_0)^n}
                    + \left(\frac{z - z_0}{w - z_0}\right)^n \frac{1}{w - z}

  \Rightarrow\quad
  \frac{1}{2\pi i} \oint_{C_1} \frac{f(w)\,dw}{w - z}
    = a_0 + a_1 (z - z_0) + \cdots + a_{n-1} (z - z_0)^{n-1} + T_n,

with coefficients as required and

  T_n = \frac{1}{2\pi i} \oint_{C_1} \left(\frac{z - z_0}{w - z_0}\right)^n \frac{f(w)}{w - z}\,dw.

Similarly, with q = (w - z_0)/(z - z_0),

  -\frac{1}{2\pi i} \oint_{C_2} \frac{f(w)\,dw}{w - z}
    = a_{-1} (z - z_0)^{-1} + \cdots + a_{-n} (z - z_0)^{-n} + T_n^*,

with appropriate coefficients and the remainder term

  T_n^* = \frac{1}{2\pi i} \oint_{C_2} \left(\frac{w - z_0}{z - z_0}\right)^n \frac{f(w)}{z - w}\,dw.

Convergence of Laurent's series

  f(z) = \sum_{k=-n}^{n-1} a_k (z - z_0)^k + T_n + T_n^*,

where

  T_n   = \frac{1}{2\pi i} \oint_{C_1} \left(\frac{z - z_0}{w - z_0}\right)^n \frac{f(w)}{w - z}\,dw
  \quad\text{and}\quad
  T_n^* = \frac{1}{2\pi i} \oint_{C_2} \left(\frac{w - z_0}{z - z_0}\right)^n \frac{f(w)}{z - w}\,dw.

Since f(w) is bounded, |(z - z_0)/(w - z_0)| < 1 over C1 and |(w - z_0)/(z - z_0)| < 1 over C2; use the M-L inequality to show that the remainder terms T_n and T_n^* approach zero as n → ∞.

Remark: For actually developing Taylor's or Laurent's series of a function, algebraic manipulation of known facts is employed quite often, rather than evaluating so many contour integrals!
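To illustrate the remark, here is a minimal sketch (assuming Python with SymPy is available; the function e^z / z^3 is chosen only as an illustration, not taken from the slides) that develops a Laurent series about z0 = 0 by dividing the known Taylor series of e^z term by term, and cross-checks it against SymPy's own expansion.

# Laurent series of exp(z)/z**3 about z0 = 0 by algebraic manipulation of the
# known Taylor series of exp(z), rather than by evaluating contour integrals.
import sympy as sp

z = sp.symbols('z')
f = sp.exp(z) / z**3

# Truncated Taylor series of exp(z): 1 + z + z^2/2! + ...
taylor_exp = sum(z**n / sp.factorial(n) for n in range(6))

# Dividing term by term by z^3 gives the Laurent series directly:
# 1/z**3 + 1/z**2 + 1/(2*z) + 1/6 + z/24 + ...
laurent_manual = sp.expand(taylor_exp / z**3)
print(laurent_manual)

# Cross-check with SymPy's built-in series expansion (it handles the pole).
print(sp.series(f, z, 0, 3))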

Zeros and Singularities

Zeros of an analytic function: points where the function vanishes.

If, at a point z0, a function f(z) vanishes along with the first m - 1 of its derivatives, but f^{(m)}(z_0) ≠ 0, then z0 is a zero of f(z) of order m, giving the Taylor's series as

  f(z) = (z - z_0)^m g(z).

An isolated zero has a neighbourhood containing no other zero.

  For an analytic function, not identically zero, every point has a neighbourhood free of zeros of the function, except possibly for that point itself. In particular, zeros of such an analytic function are always isolated.

Implication: If f(z) has a zero in every neighbourhood around z0, then it cannot be analytic at z0, unless it is the zero function [i.e. f(z) = 0 everywhere].

Entire function: a function which is analytic everywhere.
  Examples: z^n (for positive integer n), e^z, sin z etc.
  The Taylor's series of an entire function has an infinite radius of convergence.

Singularities: points where a function ceases to be analytic.
  Removable singularity: f(z) is not defined at z0, but has a limit there.
    Example: f(z) = (e^z - 1)/z at z = 0.
  Pole: f(z) has a Laurent's series around z0 with a finite number of terms with negative powers. If a_n = 0 for n < -m but a_{-m} ≠ 0, then z0 is a pole of order m, lim_{z→z0} (z - z_0)^m f(z) being a non-zero finite number. A simple pole: a pole of order one.
  Essential singularity: a singularity which is neither a removable singularity nor a pole. If the function has a Laurent's series, then it has infinitely many terms with negative powers. Example: f(z) = e^{1/z} at z = 0.

Zeros and poles: complementary to each other.
  Poles are necessarily isolated singularities.
  A zero of f(z) of order m is a pole of 1/f(z) of the same order, and vice versa.
  If f(z) has a zero of order m at z0 where g(z) has a pole of the same order, then f(z) g(z) is either analytic at z0 or has a removable singularity there.

Argument theorem: If f(z) is analytic inside and on a simple closed curve C except for a finite number of poles inside, and f(z) ≠ 0 on C, then

  \frac{1}{2\pi i} \oint_C \frac{f'(z)}{f(z)}\,dz = N - P,

where N and P are the total numbers of zeros and poles inside C respectively, counting multiplicities (orders).
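A small numerical sketch of the argument theorem (an illustration added here, not part of the slides; the function, contour radius and step count are assumptions): for f(z) = (z^2 + 1)/(z - 2)^3 on |z| = 3, one expects N - P = 2 - 3 = -1.

# Numerical check of (1/2*pi*i) * integral of f'/f around |z| = 3.
# f has zeros at +/- i (N = 2) and a pole of order 3 at z = 2 (P = 3) inside.
import numpy as np

def f(z):
    return (z**2 + 1) / (z - 2)**3

def fprime(z):
    # derivative of f, simplified by the quotient rule
    return (2*z*(z - 2) - 3*(z**2 + 1)) / (z - 2)**4

n = 4000
t = np.linspace(0.0, 2*np.pi, n, endpoint=False)
z = 3.0 * np.exp(1j * t)          # contour parametrization z = 3 e^{it}
dz_dt = 3.0j * np.exp(1j * t)     # dz/dt
integral = np.sum(fprime(z) / f(z) * dz_dt) * (2*np.pi / n)   # rectangle rule

print((integral / (2j * np.pi)).real)   # approximately -1.0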

Residues

Term-by-term integration of Laurent's series:

  \oint_C f(z)\,dz = 2\pi i\, a_{-1}.

Residue:

  \mathrm{Res}_{z_0} f(z) = a_{-1} = \frac{1}{2\pi i} \oint_C f(z)\,dz.

If f(z) has a pole (of order m) at z0, then

  (z - z_0)^m f(z) = \sum_{n=-m}^{\infty} a_n (z - z_0)^{m+n}

is analytic at z0, and

  \frac{d^{m-1}}{dz^{m-1}} \big[(z - z_0)^m f(z)\big]
    = \sum_{n=-1}^{\infty} \frac{(m+n)!}{(n+1)!}\, a_n (z - z_0)^{n+1}

  \Rightarrow\quad
  \mathrm{Res}_{z_0} f(z) = a_{-1}
    = \frac{1}{(m-1)!} \lim_{z \to z_0} \frac{d^{m-1}}{dz^{m-1}} \big[(z - z_0)^m f(z)\big].

Residue theorem: If f(z) is analytic inside and on a simple closed curve C, except for singularities at z_1, z_2, z_3, \ldots, z_k inside C, then

  \oint_C f(z)\,dz = 2\pi i \sum_{i=1}^{k} \mathrm{Res}_{z_i} f(z).
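As a concrete check of the derivative formula for the residue at a pole of order m, a short sketch (assuming Python with SymPy; the function e^z/(z - 1)^2 and its pole are chosen only for illustration):

# Residue via  Res_{z0} f = (1/(m-1)!) * lim_{z->z0} d^{m-1}/dz^{m-1} [(z-z0)^m f(z)],
# checked against SymPy's built-in residue().
import sympy as sp

z = sp.symbols('z')
f = sp.exp(z) / (z - 1)**2       # pole of order m = 2 at z0 = 1
z0, m = 1, 2

res_formula = sp.limit(sp.diff((z - z0)**m * f, z, m - 1), z, z0) / sp.factorial(m - 1)
res_builtin = sp.residue(f, z, z0)

print(res_formula, res_builtin)  # both give E, i.e. the number e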

Evaluation of Real Integrals

General strategy:
  Identify the required integral as a contour integral of a complex function, or a part thereof.
  If the domain of integration is infinite, then extend the contour infinitely, without enclosing new singularities.

Example:

  I = \int_0^{2\pi} \phi(\cos\theta, \sin\theta)\,d\theta.

With z = e^{i\theta} and dz = iz\,d\theta,

  \cos\theta = \frac{1}{2}\left(z + \frac{1}{z}\right), \quad
  \sin\theta = \frac{1}{2i}\left(z - \frac{1}{z}\right), \quad
  d\theta = \frac{dz}{iz}
  \quad\Rightarrow\quad
  I = \oint_C f(z)\,dz,

where C is the unit circle centred at the origin. Denoting the poles falling inside the unit circle C as p_j,

  I = 2\pi i \sum_j \mathrm{Res}_{p_j} f(z).
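A worked instance of this substitution (the integrand 1/(2 + cos θ) is an assumed example, not from the slides), comparing the residue evaluation with direct symbolic integration; both should give 2π/√3:

# Trigonometric integral I = \int_0^{2pi} dtheta / (2 + cos(theta)) via residues.
import sympy as sp

z, theta = sp.symbols('z theta')

# Substitute cos(theta) -> (z + 1/z)/2 and dtheta -> dz/(i z).
f = sp.cancel(1 / (2 + (z + 1/z)/2) / (sp.I * z))

# Poles of f are z = -2 +/- sqrt(3); only -2 + sqrt(3) lies inside |z| = 1.
p_inside = -2 + sp.sqrt(3)
I_residues = 2 * sp.pi * sp.I * sp.residue(f, z, p_inside)

I_direct = sp.integrate(1 / (2 + sp.cos(theta)), (theta, 0, 2*sp.pi))

print(sp.simplify(I_residues), sp.simplify(I_direct))   # both 2*sqrt(3)*pi/3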

Example: For a real rational function f(x),

  I = \int_{-\infty}^{\infty} f(x)\,dx,

the denominator of f(x) being of degree at least two higher than the numerator.

Consider a contour C enclosing the semi-circular region |z| ≤ R, y ≥ 0, with R large enough to enclose all singularities above the x-axis.

[Figure: the contour, consisting of the segment [-R, R] of the real axis closed by the semi-circle S of radius R in the upper half-plane, with the poles p_j inside.]

  \oint_C f(z)\,dz = \int_{-R}^{R} f(x)\,dx + \int_S f(z)\,dz.

For some finite M, |f(z)| < M/R^2 on S for R sufficiently large, so

  \left| \int_S f(z)\,dz \right| < \frac{M}{R^2}\,\pi R = \frac{\pi M}{R}.

Hence, as R → ∞,

  I = \int_{-\infty}^{\infty} f(x)\,dx = 2\pi i \sum_j \mathrm{Res}_{p_j} f(z).
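A sketch of the same recipe on an assumed example, ∫ dx/(1 + x^4) = π/√2 (again cross-checked symbolically; SymPy is assumed available):

# Semicircular-contour evaluation of \int_{-inf}^{inf} dx / (1 + x**4).
import sympy as sp

x, z = sp.symbols('x z')
f = 1 / (1 + z**4)

# Poles in the upper half-plane: exp(i*pi/4) and exp(3*i*pi/4).
poles_upper = [sp.exp(sp.I * sp.pi / 4), sp.exp(3 * sp.I * sp.pi / 4)]
I_residues = 2 * sp.pi * sp.I * sum(sp.residue(f, z, p) for p in poles_upper)

I_direct = sp.integrate(1 / (1 + x**4), (x, -sp.oo, sp.oo))

print(sp.simplify(I_residues), sp.simplify(I_direct))   # both sqrt(2)*pi/2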

Example: Fourier integral coefficients

  A(s) = \int_{-\infty}^{\infty} f(x) \cos sx\,dx
  \quad\text{and}\quad
  B(s) = \int_{-\infty}^{\infty} f(x) \sin sx\,dx.

Consider

  I = A(s) + i B(s) = \int_{-\infty}^{\infty} f(x)\,e^{isx}\,dx.

Similar to the previous case,

  \oint_C f(z)\,e^{isz}\,dz = \int_{-R}^{R} f(x)\,e^{isx}\,dx + \int_S f(z)\,e^{isz}\,dz.

As |e^{isz}| = |e^{isx}|\,|e^{-sy}| = |e^{-sy}| ≤ 1 for y ≥ 0 (taking s ≥ 0), we have

  \left| \int_S f(z)\,e^{isz}\,dz \right| < \frac{M}{R^2}\,\pi R = \frac{\pi M}{R},

which yields, as R → ∞,

  I = 2\pi i \sum_j \mathrm{Res}_{p_j} \big[ f(z)\,e^{isz} \big].
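An assumed illustration of this Fourier-type case (not from the slides): A(s) = ∫ cos(sx)/(1 + x^2) dx = π e^{-s} for s > 0, checked with SymPy.

# Fourier-type integral via the residue recipe above, f(z) = 1/(1 + z**2).
import sympy as sp

x, z = sp.symbols('x z')
s = sp.symbols('s', positive=True)

g = sp.exp(sp.I * s * z) / (1 + z**2)     # f(z) * exp(i s z)

# Only the pole z = i lies in the upper half-plane.
I_residue = 2 * sp.pi * sp.I * sp.residue(g, z, sp.I)
A_s = sp.re(sp.expand(I_residue))          # A(s) is the real part, B(s) the imaginary part

A_direct = sp.integrate(sp.cos(s*x) / (1 + x**2), (x, -sp.oo, sp.oo))

print(sp.simplify(A_s), sp.simplify(A_direct))   # both should print pi*exp(-s)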

Points to note

  Taylor's series and Laurent's series
  Zeros and poles of analytic functions
  Residue theorem
  Evaluation of real integrals through contour integration of suitable complex functions

Necessary Exercises: 1, 2, 3, 5, 8, 9, 10

Variational Calculus*

Outline
  Introduction
  Euler's Equation
  Direct Methods

Introduction

Consider a particle moving on a smooth surface z = φ(q1, q2).

With position r = [q1(t)  q2(t)  φ(q1(t), q2(t))]^T on the surface and velocity \dot r = [\dot q_1  \dot q_2  (\nabla\phi)^T \dot q]^T in the tangent plane, the length of the path from q_i = q(t_i) to q_f = q(t_f) is

  l = \int_{t_i}^{t_f} \|\dot{\mathbf r}\|\,dt
    = \int_{t_i}^{t_f} \left[ \dot q_1^2 + \dot q_2^2 + (\nabla\phi^T \dot{\mathbf q})^2 \right]^{1/2} dt.

For the shortest path, or geodesic, minimize the path length l.

Question: What are the variables of the problem?
Answer: The entire curve, or function, q(t).

Variational problem: optimization of a function of functions, i.e. a functional.

Functionals and their extremization

Suppose that a candidate curve is represented as a sequence of points q_j = q(t_j) at time instants

  t_i = t_0 < t_1 < t_2 < t_3 < \cdots < t_{N-1} < t_N = t_f.

The geodesic problem is then a multivariate optimization problem in the 2(N - 1) variables {q_j, 1 ≤ j ≤ N - 1}. With N → ∞, we obtain the actual function.

First order necessary condition: the functional is stationary with respect to arbitrary small variations in {q_j}. [Equivalent to vanishing of the gradient.]

This gives equations for the stationary points. Here, these equations are differential equations!
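To make the "curve as 2(N - 1) variables" viewpoint concrete, here is a rough numerical sketch (the surface φ(q1, q2) = q1^2 + q2^2, the endpoints and the value of N are all assumptions, and SciPy is assumed available): the path is stored as its interior samples and the discrete path length is minimized directly.

# Discretized geodesic: minimize the polyline length on the surface z = phi(q1, q2).
import numpy as np
from scipy.optimize import minimize

def phi(q):                      # the (assumed) surface
    return q[:, 0]**2 + q[:, 1]**2

def path_length(interior, q_start, q_end):
    # Assemble the full polyline (endpoints fixed) and sum 3-D segment lengths.
    q = np.vstack([q_start, interior.reshape(-1, 2), q_end])
    pts = np.column_stack([q, phi(q)])
    return np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))

q_start = np.array([-1.0, 0.0])
q_end = np.array([1.0, 0.5])
N = 20

# Initial guess: straight line in the (q1, q2) plane.
t = np.linspace(0.0, 1.0, N + 2)[1:-1]
guess = (q_start + np.outer(t, q_end - q_start)).ravel()

res = minimize(path_length, guess, args=(q_start, q_end), method='BFGS')
print(res.fun)   # approximate geodesic length; increasing N refines the discrete curve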

Examples of variational problems

  Geodesic path: minimize  l = \int_a^b \|\dot{\mathbf r}(t)\|\,dt.
  Minimal surface of revolution: minimize  S = \int 2\pi y\,ds = 2\pi \int_a^b y \sqrt{1 + y'^2}\,dx.
  The brachistochrone problem: to find the curve along which the descent is fastest; minimize
    T = \int \frac{ds}{v} = \int_a^b \sqrt{\frac{1 + y'^2}{2gy}}\,dx.
  Fermat's principle: light takes the fastest path; minimize
    T = \int_{u_1}^{u_2} \frac{\sqrt{x'^2 + y'^2 + z'^2}}{c(x, y, z)}\,du.
  Isoperimetric problem: largest area in the plane enclosed by a closed curve of given perimeter. By extension, extremize a functional under one or more equality constraints.
  Hamilton's principle of least action: evolution of a dynamic system through the minimization of the action
    s = \int_{t_1}^{t_2} L\,dt = \int_{t_1}^{t_2} (K - P)\,dt.

Euler's Equation

Find a function y(x) that will make the functional

  I[y(x)] = \int_{x_1}^{x_2} f[x, y(x), y'(x)]\,dx

stationary, with boundary conditions y(x_1) = y_1 and y(x_2) = y_2.

Consider a variation δy(x) with δy(x_1) = δy(x_2) = 0 and the consistent variation δy'(x). Then

  \delta I = \int_{x_1}^{x_2} \left[ \frac{\partial f}{\partial y}\,\delta y
             + \frac{\partial f}{\partial y'}\,\delta y' \right] dx.

Integration of the second term by parts:

  \int_{x_1}^{x_2} \frac{\partial f}{\partial y'} \frac{d}{dx}(\delta y)\,dx
    = \left[ \frac{\partial f}{\partial y'}\,\delta y \right]_{x_1}^{x_2}
      - \int_{x_1}^{x_2} \frac{d}{dx}\!\left(\frac{\partial f}{\partial y'}\right) \delta y\,dx.

With δy(x_1) = δy(x_2) = 0, the first term vanishes identically, and

  \delta I = \int_{x_1}^{x_2} \left[ \frac{\partial f}{\partial y}
             - \frac{d}{dx}\frac{\partial f}{\partial y'} \right] \delta y\,dx.

For δI to vanish for arbitrary δy(x),

  \frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} = 0.
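The same derivation can be delegated to a computer algebra system. A minimal sketch (assuming Python with SymPy; the shortest-path integrand is chosen only as an illustration) generates Euler's equation for f = sqrt(1 + y'^2):

# Euler's equation for I[y] = \int sqrt(1 + y'(x)**2) dx, via SymPy's helper.
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
y = sp.Function('y')

f = sp.sqrt(1 + sp.diff(y(x), x)**2)
eq = euler_equations(f, y(x), x)[0]

# Printed equation is -y''/(1 + y'**2)**(3/2) = 0, i.e. y'' = 0: straight lines.
print(eq)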

Functionals involving higher order derivatives:

  I[y(x)] = \int_{x_1}^{x_2} f\big(x, y, y', y'', \ldots, y^{(n)}\big)\,dx,

with prescribed boundary values for y, y', y'', \ldots, y^{(n-1)}.

  \delta I = \int_{x_1}^{x_2} \left[ \frac{\partial f}{\partial y}\,\delta y
             + \frac{\partial f}{\partial y'}\,\delta y'
             + \frac{\partial f}{\partial y''}\,\delta y''
             + \cdots
             + \frac{\partial f}{\partial y^{(n)}}\,\delta y^{(n)} \right] dx.

Working rule: starting from the last term, integrate one term at a time by parts, using consistency of variations and the boundary conditions.

Euler's equation:

  \frac{\partial f}{\partial y}
  - \frac{d}{dx}\frac{\partial f}{\partial y'}
  + \frac{d^2}{dx^2}\frac{\partial f}{\partial y''}
  - \cdots
  + (-1)^n \frac{d^n}{dx^n}\frac{\partial f}{\partial y^{(n)}} = 0,

an ODE of order 2n, in general.

Functionals of a vector function:

  I[\mathbf r(t)] = \int_{t_1}^{t_2} f(t, \mathbf r, \dot{\mathbf r})\,dt.

In terms of the partial gradients \partial f/\partial \mathbf r and \partial f/\partial \dot{\mathbf r},

  \delta I = \int_{t_1}^{t_2} \left[ \left(\frac{\partial f}{\partial \mathbf r}\right)^T \delta\mathbf r
             + \left(\frac{\partial f}{\partial \dot{\mathbf r}}\right)^T \delta\dot{\mathbf r} \right] dt
           = \left[ \left(\frac{\partial f}{\partial \dot{\mathbf r}}\right)^T \delta\mathbf r \right]_{t_1}^{t_2}
             + \int_{t_1}^{t_2} \left[ \frac{\partial f}{\partial \mathbf r}
               - \frac{d}{dt}\frac{\partial f}{\partial \dot{\mathbf r}} \right]^T \delta\mathbf r\,dt
           = \int_{t_1}^{t_2} \left[ \frac{\partial f}{\partial \mathbf r}
               - \frac{d}{dt}\frac{\partial f}{\partial \dot{\mathbf r}} \right]^T \delta\mathbf r\,dt.

Euler's equation: a system of second order ODEs,

  \frac{d}{dt}\frac{\partial f}{\partial \dot{\mathbf r}} - \frac{\partial f}{\partial \mathbf r} = \mathbf 0
  \quad\text{or}\quad
  \frac{d}{dt}\frac{\partial f}{\partial \dot r_i} - \frac{\partial f}{\partial r_i} = 0 \text{ for each } i.

Functionals of functions of several variables:

  I[u(x, y)] = \int\!\!\int_D f(x, y, u, u_x, u_y)\,dx\,dy.

Euler's equation:

  \frac{\partial}{\partial x}\frac{\partial f}{\partial u_x}
  + \frac{\partial}{\partial y}\frac{\partial f}{\partial u_y}
  - \frac{\partial f}{\partial u} = 0.

Moving boundaries

Revision of the basic case: allowing non-zero δy(x_1), δy(x_2).
At an end-point, \frac{\partial f}{\partial y'}\,\delta y has to vanish for arbitrary δy(x), so \frac{\partial f}{\partial y'} vanishes at the boundary:

  the Euler boundary condition, or natural boundary condition.

Equality constraints and isoperimetric problems

Minimize I = \int_{x_1}^{x_2} f(x, y, y')\,dx subject to J = \int_{x_1}^{x_2} g(x, y, y')\,dx = J_0.
In another level of generalization, a constraint φ(x, y, y') = 0.
Operate with  \bar f(x, y, y', \lambda) = f(x, y, y') + \lambda(x)\,g(x, y, y').

Direct Methods

Finite difference method

With given boundary values y(a) and y(b),

  I[y(x)] = \int_a^b f[x, y(x), y'(x)]\,dx.

Represent y(x) by its values over x_i = a + ih, with i = 0, 1, 2, \ldots, N, where b - a = Nh.
Approximate the functional by

  I[y(x)] \approx \Phi(y_1, y_2, y_3, \ldots, y_{N-1})
    = \sum_{i=1}^{N} f(\bar x_i, \bar y_i, \bar y_i')\,h,

where \bar x_i = (x_i + x_{i-1})/2, \bar y_i = (y_i + y_{i-1})/2 and \bar y_i' = (y_i - y_{i-1})/h.

Minimize Φ(y_1, y_2, y_3, \ldots, y_{N-1}) with respect to the y_i; for example, by solving \partial\Phi/\partial y_i = 0 for all i.

Exercise: Show that \partial\Phi/\partial y_i = 0 is equivalent to Euler's equation.
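A compact sketch of this finite difference direct method on an assumed test functional (not from the slides), I[y] = ∫_0^1 (½ y'^2 - y) dx with y(0) = y(1) = 0, whose Euler equation y'' = -1 has the exact solution y = x(1 - x)/2; SciPy is assumed available.

# Finite difference direct method: minimize the discrete approximation Phi.
import numpy as np
from scipy.optimize import minimize

a, b, N = 0.0, 1.0, 50
h = (b - a) / N
x = np.linspace(a, b, N + 1)

def Phi(y_interior):
    y = np.concatenate(([0.0], y_interior, [0.0]))   # boundary values built in
    y_mid = 0.5 * (y[1:] + y[:-1])                   # midpoint values  y_bar_i
    dydx = (y[1:] - y[:-1]) / h                      # slopes           y_bar_i'
    return np.sum((0.5 * dydx**2 - y_mid) * h)       # sum f(x_bar, y_bar, y_bar') h

res = minimize(Phi, np.zeros(N - 1), method='BFGS')
y_num = np.concatenate(([0.0], res.x, [0.0]))

# For this quadratic test case the nodal values are exact up to optimizer tolerance.
print(np.max(np.abs(y_num - x * (1 - x) / 2)))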

Rayleigh-Ritz method

In terms of a set of basis functions, express the solution as

  y(x) = \sum_{i=1}^{N} \alpha_i w_i(x).

Represent the functional I[y(x)] as a multivariate function Φ(α). Optimize Φ(α) to determine the α_i's.

Note: As N → ∞, the numerical solution approaches exactitude. For a particular tolerance, one can truncate appropriately.

Observation: With these direct methods, there is no need to reduce the variational (optimization) problem to Euler's equation!
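A minimal Rayleigh-Ritz sketch for the same assumed test functional, I[y] = ∫_0^1 (½ y'^2 - y) dx with y(0) = y(1) = 0, using the assumed basis w_i(x) = x^i (1 - x), which satisfies the boundary conditions; since the exact minimizer x(1 - x)/2 lies in the span of w_1, the method recovers it exactly.

# Rayleigh-Ritz: substitute y = sum alpha_i w_i, build Phi(alpha), set its gradient to zero.
import sympy as sp

x = sp.symbols('x')
N = 3
alphas = sp.symbols('alpha1:4')                       # alpha1, alpha2, alpha3
w = [x**i * (1 - x) for i in range(1, N + 1)]         # basis functions

y = sum(a * wi for a, wi in zip(alphas, w))
Phi = sp.integrate(sp.Rational(1, 2) * sp.diff(y, x)**2 - y, (x, 0, 1))

sol = sp.solve([sp.diff(Phi, a) for a in alphas], alphas)   # stationarity conditions
print(sp.simplify(y.subs(sol)))                       # x*(1 - x)/2 (possibly expanded)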

Question: Is it possible to reformulate a BVP as a variational problem and then use a direct method?

The inverse problem: From

  I[y(x)] \approx \Phi(\alpha)
    = \int_a^b f\!\left(x, \sum_{i=1}^N \alpha_i w_i(x), \sum_{i=1}^N \alpha_i w_i'(x)\right) dx,

  \frac{\partial \Phi}{\partial \alpha_i}
    = \int_a^b \left[ \frac{\partial f}{\partial y}\!\left(x, \textstyle\sum_i \alpha_i w_i, \sum_i \alpha_i w_i'\right) w_i(x)
      + \frac{\partial f}{\partial y'}\!\left(x, \textstyle\sum_i \alpha_i w_i, \sum_i \alpha_i w_i'\right) w_i'(x) \right] dx.

Integrating the second term by parts and using w_i(a) = w_i(b) = 0,

  \frac{\partial \Phi}{\partial \alpha_i}
    = \int_a^b R\!\left[\sum_{i=1}^N \alpha_i w_i\right] w_i(x)\,dx,

where R[y] \equiv \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0 is the Euler's equation of the variational problem.

Definition: R[z(x)] is the residual of the differential equation R[y] = 0 operated over the function z(x).

  The residual of the Euler's equation of a variational problem, operated upon the solution obtained by the Rayleigh-Ritz method, is orthogonal to the basis functions w_i(x).

Galerkin method

Question: What if we cannot find a corresponding variational problem for the differential equation?
Answer: Work with the residual directly and demand

  \int_a^b R[z(x)]\,w_i(x)\,dx = 0.

Freedom to choose two different families of functions, as basis functions φ_j(x) and trial functions w_i(x):

  \int_a^b R\!\left[\sum_j \alpha_j \phi_j(x)\right] w_i(x)\,dx = 0.

A singular case of the Galerkin method: delta functions, at discrete points, as trial functions.
Satisfaction of the differential equation exactly at the chosen points, known as collocation points: the collocation method.
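A small weighted-residual (Galerkin) sketch for an assumed BVP (not from the slides): y'' + y + x = 0 with y(0) = y(1) = 0, whose exact solution is sin x / sin 1 - x, using φ_j(x) = x^j (1 - x) both as basis and as trial functions.

# Galerkin: make the residual of the ODE orthogonal to the trial functions.
import sympy as sp

x = sp.symbols('x')
alphas = sp.symbols('alpha1:3')                      # two-term approximation
phi = [x**j * (1 - x) for j in range(1, 3)]          # satisfy the boundary conditions

z = sum(a * p for a, p in zip(alphas, phi))
R = sp.diff(z, x, 2) + z + x                         # residual of y'' + y + x = 0

eqs = [sp.integrate(R * w, (x, 0, 1)) for w in phi]  # orthogonality conditions
sol = sp.solve(eqs, alphas)
z_gal = sp.expand(z.subs(sol))

y_exact = sp.sin(x) / sp.sin(1) - x
print(z_gal)
# Values at x = 1/2 agree to about three decimals (roughly 0.069 for both).
print(sp.N(z_gal.subs(x, sp.Rational(1, 2))), sp.N(y_exact.subs(x, sp.Rational(1, 2))))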

Finite element methods

  discretization of the domain into elements of simple geometry
  basis functions of low order polynomials with local scope
  design of basis functions so as to achieve enough order of continuity or smoothness across element boundaries
  piecewise continuous/smooth basis functions for the entire domain, with a built-in sparse structure
  some weighted residual method to frame the algebraic equations
  solution gives coefficients which are actually the nodal values
  (a minimal one-dimensional sketch of these ingredients follows below)

Suitability of finite element analysis in software environments:
  effectiveness and efficiency
  neatness and modularity
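The promised sketch: a minimal one-dimensional finite element illustration under assumed data (the model problem -u'' = 1 with u(0) = u(1) = 0, linear "hat" elements and the mesh size are all assumptions, not the slides' example). Galerkin weighting with the local linear basis leads to the familiar tridiagonal, sparse stiffness system, and the coefficients solved for are the nodal values.

# 1-D linear finite elements for -u'' = 1, u(0) = u(1) = 0; exact u = x(1-x)/2.
import numpy as np

n_el = 8                           # number of elements of simple geometry
n_nodes = n_el + 1
x = np.linspace(0.0, 1.0, n_nodes)
h = x[1] - x[0]

K = np.zeros((n_nodes, n_nodes))   # global stiffness matrix (tridiagonal here)
F = np.zeros(n_nodes)              # global load vector

# Element contributions for linear elements: k_e = (1/h)[[1,-1],[-1,1]],
# f_e = (h/2)[1, 1] for a unit right-hand side.
for e in range(n_el):
    idx = [e, e + 1]
    K[np.ix_(idx, idx)] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    F[idx] += 0.5 * h

# Impose u(0) = u(1) = 0 by restricting the system to interior nodes.
interior = np.arange(1, n_nodes - 1)
u = np.zeros(n_nodes)
u[interior] = np.linalg.solve(K[np.ix_(interior, interior)], F[interior])

print(np.max(np.abs(u - x * (1 - x) / 2)))   # nodal values are exact for this problem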

Points to note

  Optimization with respect to a function
  Concept of a functional
  Euler's equation
  Rayleigh-Ritz and Galerkin methods
  Optimization and equation-solving in the infinite-dimensional function space: practical methods and connections

Necessary Exercises: 1, 2, 4, 5

Epilogue

Source for further information:
  http://home.iitk.ac.in/ dasgupta/MathBook
Destination for feedback:
  dasgupta@iitk.ac.in

Some general courses in immediate continuation:
  Advanced Mathematical Methods
  Scientific Computing
  Advanced Numerical Analysis
  Optimization
  Advanced Differential Equations
  Partial Differential Equations
  Finite Element Methods

Some specialized courses in immediate continuation:
  Linear Algebra and Matrix Theory
  Approximation Theory
  Variational Calculus and Optimal Control
  Advanced Mathematical Physics
  Geometric Modelling
  Computational Geometry
  Computer Graphics
  Signal Processing
  Image Processing

Selected References

F. S. Acton, Numerical Methods that usually Work, The Mathematical Association of America (1990).
C. M. Bender and S. A. Orszag, Advanced Mathematical Methods for Scientists and Engineers, Springer-Verlag (1999).
G. Birkhoff and G.-C. Rota, Ordinary Differential Equations, John Wiley and Sons (1989).
G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press (1983).
M. T. Heath, Scientific Computing, Tata McGraw-Hill Co. Ltd (2000).
E. Kreyszig, Advanced Engineering Mathematics, John Wiley and Sons (2002).
E. V. Krishnamurthy and S. K. Sen, Numerical Algorithms, Affiliated East-West Press Pvt Ltd (1986).
D. G. Luenberger, Linear and Nonlinear Programming, Addison-Wesley (1984).
P. V. O'Neil, Advanced Engineering Mathematics, Thomson Books (2004).
W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery, Numerical Recipes, Cambridge University Press (1998).
G. F. Simmons, Differential Equations with Applications and Historical Notes, Tata McGraw-Hill Co. Ltd (1991).
J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Springer-Verlag (1993).
C. R. Wylie and L. C. Barrett, Advanced Engineering Mathematics, Tata McGraw-Hill Co. Ltd (2003).
