
Lecture Notes on

Numerical Analysis
MATH 435

Professor Biswa Nath Datta

Department of Mathematical Sciences


Northern Illinois University
DeKalb, IL. 60115 USA

E-mail: dattab@math.niu.edu
URL: www.math.niu.edu/~dattab
PART I.
Numerical Methods for the Root-Finding Problem
1 PART I: Numerical Methods for the Root-Finding Problem
1.1 Introduction
In this lecture, we will discuss numerical methods for the Root-Finding Problem. As the title suggests,
the Root-Finding Problem is the problem of finding a root of the equation f (x) = 0, where f (x) is a function
of a single variable x. Specifically, the problem is stated as follows:

The Root-Finding Problem

Given a function f (x), find x = ξ such that f (ξ) = 0.

The number ξ is called a root of f (x) = 0 or a zero of the function f (x). The function f (x) may be algebraic or trigonometric. The well-known examples of algebraic functions are polynomials.
A polynomial of degree n is written as Pn(x) = a0 + a1 x + a2 x^2 + ... + an x^n . The numbers ai , i = 0, · · · , n are called the coefficients of the polynomial. If an = 1, the polynomial Pn(x) is called monic.

Applications of the Root-Finding Problem


The root-finding problem is one of the most important computational problems. It arises in a wide variety
of practical applications in physics, chemistry, biosciences, engineering, etc. As a matter of fact,
determination of any unknown appearing implicitly in scientific or engineering formulas gives rise to a root-
finding problem. We consider one such simple application here. For other applications of the root-finding
problem, see Section.

(INCOMPLETE)

1.2 The Bisection Method
As the title suggests, the method is based on repeated bisections of an interval containing the root. The
basic idea is very simple.
Basic-Idea:
Suppose f (x) = 0 is known to have a real root x = ξ in an interval [a, b].
• Then bisect the interval [a, b], and let c = (a + b)/2 be the middle point of [a, b]. If c is the root, then we are done. Otherwise, one of the intervals [a, c] or [c, b] will contain the root.

• Find the one that contains the root and bisect that interval again.

• Continue the process of bisections until the root is trapped in an interval as small as warranted by the desired accuracy.

To implement the idea, we need to know which of the two intervals in each iteration contains the root of
f (x) = 0. The Intermediate Mean-Value Theorem of calculus can help us identify the interval in each
iteration. For a proof of this theorem, see any calculus book (e.g., ).

Intermediate Mean-Value Theorem

Let f (x) be a continuous function defined on [a, b], such that

f(a) and f (b) are of opposite signs (i.e., f (a)f (b) < 0).

Then there is a root x = c of f (x) = 0 in [a, b].

Algorithm 1.1 The Bisection Method for Root-Finding

Inputs: (i) f (x) - The given function

(ii) a0 , b0 - The two numbers such that f (a0 )f (b0 ) < 0.

Output: An approximation of the root of f (x) = 0 in [a0 , b0 ].


For k = 0, 1, 2, · · · , do until satisfied:
• Compute ck = (ak + bk)/2.
• Test if ck is the desired root, if so, stop.
• If ck is not the desired root, test if f (ck )f (ak ) < 0. If so, set bk+1 = ck and ak+1 = ak .
Otherwise, set ak+1 = ck , bk+1 = bk .

End.
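
For readers who wish to experiment, the following is a minimal Python sketch of Algorithm 1.1; the function name bisection and the default tolerance and iteration cap are our own illustrative choices, not part of the notes.

def bisection(f, a, b, eps=1e-6, N=100):
    """Approximate a root of f(x) = 0 in [a, b], assuming f(a)f(b) < 0."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for k in range(N):
        c = (a + b) / 2.0                      # c_k = (a_k + b_k)/2
        if f(c) == 0 or (b - a) / 2.0 < eps:   # interval small enough (Criterion 3)
            return c
        if f(c) * f(a) < 0:                    # root lies in [a, c]
            b = c
        else:                                  # root lies in [c, b]
            a = c
    return (a + b) / 2.0

# Example 1.1: f(x) = x^3 - 6x^2 + 11x - 6 has the root x = 3 in [2.5, 4]
print(bisection(lambda x: x**3 - 6*x**2 + 11*x - 6, 2.5, 4))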

Example 1.1 f (x) = x^3 − 6x^2 + 11x − 6.


Let a0 = 2.5, b0 = 4. Then f (a0)f (b0) < 0, so there is a root in [2.5, 4].

Iteration 1. k = 0 :
c0 = (a0 + b0)/2 = (4 + 2.5)/2 = 6.5/2 = 3.25.
Since f (c0)f (a0) = f (3.25)f (2.5) < 0, set b1 = c0 , a1 = a0 .

Iteration 2. k = 1 :
c1 = (3.25 + 2.5)/2 = 2.8750.
Since f (c1)f (a1) > 0, set a2 = 2.875, b2 = b1 .

Iteration 3. k = 2 :
c2 = (a2 + b2)/2 = (2.875 + 3.250)/2 = 3.0625.
Since f (c2)f (a2) = f (3.0625)f (2.875) < 0, set b3 = c2 , a3 = a2 .
Iteration 4. k = 3:
c3 = (a3 + b3)/2 = (2.875 + 3.0625)/2 = 2.9688.
It is clear that the iterations are converging towards the root x = 3.
Notes:
1. From the statement of the Bisection algorithm, it is clear that the algorithm always converges.
2. The example above shows that the convergence, however, can be very slow.

Stopping Criteria
Since this is an iterative method, we must determine some stopping criteria that will allow the iteration to
stop. Here are some commonly used stopping criteria.

Let ε be the tolerance; that is, we would like to obtain the root with an error of at most ε. Then
Accept x = ck as a root of f (x) = 0 if any of the following criteria is satisfied:

1. |f (ck)| ≤ ε (the functional value is less than or equal to the tolerance).

2. |ck − ck−1|/|ck| ≤ ε (the relative change is less than or equal to the tolerance).

3. (b − a)/2^k ≤ ε (the length of the interval after k iterations is less than or equal to the tolerance).

4. The number of iterations k is greater than or equal to a predetermined number, say N .

Comments:
1. Criterion 1 can be misleading, since it is possible for |f (ck)| to be very small even when ck is not close to the root. (Construct an example to convince yourself that this is true.)
2. Criterion 3 is based on the fact that after k steps the root will be computed with an error of at most (b0 − a0)/2^k.

Number of Iterations Needed in the Bisection Method to Achieve Certain Accuracy


Let's now find the minimum number of iterations N needed with the Bisection method to achieve a certain desired accuracy.
Criterion 3 can be used to answer this. The interval length after N iterations is (b0 − a0)/2^N.
So, to obtain an accuracy of ε, using Criterion 3, we must have (b0 − a0)/2^N ≤ ε.
That is, 2^−N (b0 − a0) ≤ ε
or 2^−N ≤ ε/(b0 − a0)
or −N log10 2 ≤ log10 (ε/(b0 − a0))
or N log10 2 ≥ − log10 (ε/(b0 − a0))
or N ≥ − log10 (ε/(b0 − a0)) / log10 2
or N ≥ [log10 (b0 − a0) − log10 (ε)] / log10 2.

Theorem 1.1 The number of iterations N needed in the Bisection Method to obtain an accuracy of ε is given by

N ≥ [log10 (b0 − a0) − log10 (ε)] / log10 2.



Example 1.2 Suppose we would like to determine a priori the minimum number of iterations needed in the Bisection Algorithm, given a0 = 2.5, b0 = 4, and ε = 10^−3. By Theorem 1.1, we have
N ≥ [log10 (1.5) − log10 (10^−3)] / log10 2 = [log10 (1.5) + 3] / log10 (2) = 10.5507.
Thus, a minimum of 11 iterations will be needed to obtain the desired accuracy using the Bisection Method.
Remarks: Since the number of iterations N needed to achieve a certain accuracy depends upon the initial
length of the interval containing the root, it is desirable to choose the initial interval [a0 , b0 ] as small as
possible.
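
As a quick check of Theorem 1.1 and Example 1.2, the bound can be evaluated directly in Python (a small illustrative computation of our own, not part of the original notes):

import math

a0, b0, eps = 2.5, 4.0, 1e-3
N = (math.log10(b0 - a0) - math.log10(eps)) / math.log10(2)
print(N)               # 10.5507...
print(math.ceil(N))    # 11 iterations suffice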

1.3 Fixed-Point Iteration


A number ξ is a fixed point of a function g(x) if g(ξ) = ξ.
Suppose that the equation f (x) = 0 is written in the form x = g(x); that is, f (x) = x − g(x) = 0. Then any fixed point ξ of g(x) is a root of f (x) = 0, because f (ξ) = ξ − g(ξ) = ξ − ξ = 0. Thus a root of f (x) = 0 can be found by finding a fixed point of the equation x = g(x) that corresponds to f (x) = 0.

Finding a root of f (x) = 0 by finding a fixed point of x = g(x) immediately suggests an iterative procedure
of the following type.
Start with an initial guess x0 of the root, and form a sequence {xk } defined by

xk+1 = g(xk ), k = 0, 1, 2, · · ·

If the sequence {xk} converges, then lim_{k→∞} xk = ξ will be a root of f (x) = 0.

The question therefore arises:

Given f (x) = 0, how to write f (x) = 0 in the form x = g(x),

so that starting with any x0 in [a, b], the sequence {xk } defined

by xk+1 = g(xk ) is guaranteed to converge?

The simplest way to write f (x) = 0 in the form x = g(x) is to add x on both sides, that is,

x = f (x) + x = g(x).

But this very often does not work.


To convince yourself, consider Example 1.1 again. Here

f (x) = x^3 − 6x^2 + 11x − 6 = 0.

Define g(x) = x + f (x) = x^3 − 6x^2 + 12x − 6. We know that there is a root of f (x) in [2.5, 4]; namely x = 3.
Let’s start the iteration xk+1 = g(xk ) with x0 = 3.5,

then we have: x1 = g(x0 ) = g(3.5) = 5.3750


x2 = g(x1) = g(5.3750) = 40.4434
x3 = g(x2) = g(40.4434) = 5.6817 × 10^4
x4 = g(x3) = g(5.6817 × 10^4) = 1.8340 × 10^14

The sequence {xk } is clearly diverging.

The convergence and divergence of the fixed-point iteration are illustrated by the following graphs.

Figure 1.1: Convergence of the Fixed-Point Iteration



Figure 1.2: Divergence of the Fixed-Point Iteration

The following theorem gives a sufficient condition on g(x) which ensures the convergence of the sequence
{xk }.

Theorem 1.2 (Fixed-Point Iteration Theorem): Let f (x) = 0 be written in the form x = g(x).
Assume that g(x) has the following properties:

1. For all x in [a, b], g(x) ∈ [a, b]; that is, g(x) maps [a, b] into itself.
2. g′(x) exists on (a, b) with the property that there exists a positive constant r < 1 such that
|g′(x)| ≤ r
for all x in (a, b).

Then

(i) there is a unique fixed point x = ξ of g(x) in [a, b].
(ii) For any x0 in [a, b], the sequence {xk } defined by

xk+1 = g(xk ), k = 0, 1, · · ·

converges to the fixed point x = ξ; that is, to the root ξ of f (x) = 0.

The proof of the Theorem requires the Mean Value Theorem of Calculus;

The Mean Value Theorem

Let f (x) be a continuous function on [a, b], and be differentiable on (a, b).
Then there is a number c in (a, b) such that (f (b) − f (a))/(b − a) = f′(c).

Proof of the Fixed-Point Theorem:


The proof consists of three parts: Existence, Uniqueness, and Convergence. We first prove that there
is a root of f (x) = 0 in [a, b] and it is unique. Since a fixed point of x = g(x) is a root of f (x) = 0, this
amounts to proving that there is a fixed point in [a, b] and it is unique. We then prove that the sequence
xk+1 = g(xk ) converges to the root.

Existence:
If a = g(a), then a is a fixed point. If b = g(b), then b is a fixed point. If not, because of the assumption
1, g(a) > a, and g(b) < b. Thus, f (a) = g(a) − a > 0, and f (b) = g(b) − b < 0. Also f (x) = g(x) − x is
continuous on [a, b].
Thus, by the Intermediate Mean Value Theorem, there is a root of f (x) in [a, b]. This proves the
existence.
Uniqueness:
To prove uniqueness, suppose ξ1 and ξ2 are two fixed points in [a, b], with ξ1 ≠ ξ2 .
Then by the Mean Value Theorem, there is a number c in (a, b) such that

(g(ξ1) − g(ξ2)) / (ξ1 − ξ2) = g′(c).
Since g(ξ1) = ξ1 and g(ξ2) = ξ2 , we have
(ξ1 − ξ2) / (ξ1 − ξ2) = g′(c).
That is, g′(c) = 1, which contradicts Assumption 2. Thus, ξ1 cannot be different from ξ2 .

Convergence:
Let ξ be the root of f (x) = 0. Then the absolute error at step k + 1 is given by ek+1 = |xk+1 − ξ|. To prove that the sequence converges, we need to show that lim_{k→∞} ek+1 = 0.
Now ek+1 = |xk+1 − ξ| = |g(xk) − g(ξ)| (note that ξ = g(ξ)).
Thus, by the Mean Value Theorem, we have ek+1 = |g′(c)| |xk − ξ| = |g′(c)| ek , where c is in (xk , ξ).
Since |g′(c)| ≤ r, we have ek+1 ≤ r ek .
Similarly, we have ek ≤ r ek−1 . Thus ek+1 ≤ r^2 ek−1 .

Continuing in this way, we finally have ek+1 ≤ r^(k+1) e0 , where e0 is the initial error (that is, e0 = |x0 − ξ|). Since r < 1, we have r^(k+1) → 0 as k → ∞.
Thus,
lim_{k→∞} ek+1 = lim_{k→∞} |xk+1 − ξ| ≤ lim_{k→∞} r^(k+1) e0 = 0.
This proves that the sequence {xk} converges to x = ξ. But x = ξ is the only fixed point of g(x).

In practice it is not easy to verify Assumption 1. The following corollary of the fixed-point theorem is more useful in practice, because its conditions are more easily verifiable. We leave the proof of the corollary as an [Exercise].

Corollary 1.1 Let g(x) be continuously differentiable in some open interval containing the fixed point ξ, and let |g′(ξ)| < 1. Then there exists a number ε > 0 such that the iteration xk+1 = g(xk) converges whenever x0 is chosen with |x0 − ξ| ≤ ε.

Example 1.3 Let f (x) = x^3 − 6x^2 + 11x − 6 = 0. Choose a = 2.5, b = 4. Let's write f (x) = 0 in the form x = g(x) as follows: x = (−x^3 + 6x^2 + 6)/11 = g(x).
Then
g′(x) = (−3x^2 + 12x)/11.
It is easy to verify graphically and analytically that |g′(x)| < 1 in a neighborhood of the root x = 3 (note that g′(3) = 9/11 ≈ 0.8182 and g′(4) = 0, while g′(2.5) = 1.0227).

So, by Corollary 1.1, the iteration xk+1 = g(xk) should converge to the root x = 3 for x0 chosen sufficiently close to it. This is verified as follows:

x0 = 3.5
x1 = g(x0 ) = 3.3295
x2 = g(x1 ) = 3.2368
x3 = g(x2 ) = 3.1772
x4 = g(x3 ) = 3.1359
x5 = g(x4 ) = 3.1059
x6 = g(x5 ) = 3.0835
x7 = g(x6 ) = 3.0664
x8 = g(x7 ) = 3.0531
x9 = g(x8 ) = 3.0426
x10 = g(x9 ) = 3.0344
x11 = g(x10 ) = 3.0278
The sequence is clearly converging to the root x = 3. This can be justified from the above theorem.
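
A minimal Python sketch of the fixed-point iteration, applied to Example 1.3 (the stopping rule based on the relative change is our own choice):

def fixed_point(g, x0, eps=1e-6, N=100):
    """Iterate x_{k+1} = g(x_k) until the relative change is below eps."""
    x = x0
    for k in range(N):
        x_new = g(x)
        if abs(x_new - x) <= eps * abs(x_new):
            return x_new
        x = x_new
    return x

# Example 1.3: g(x) = (-x^3 + 6x^2 + 6)/11; iterates from x0 = 3.5 approach x = 3
print(fixed_point(lambda x: (-x**3 + 6*x**2 + 6) / 11, 3.5))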
Example 1.4 Find a root of f (x) = x − cos x = 0 in [0, π/2].

Let's write x − cos x = 0 as x = cos x = g(x). Then |g′(x)| = |sin x| < 1 in (0, π/2). Thus, by Corollary 1.1, the iteration xk+1 = cos xk should converge. Indeed,

x0 = 0
x1 = cos x0 = 1
x2 = cos x1 = 0.54
x3 = cos x2 = 0.86
...
x17 = 0.73956
x18 = cos x17 = 0.73956

1.4 The Newton-Raphson Method


The Newton-Raphson method, described below, shows that there is a special choice of g(x) that will allow
us to readily use the results of Corollary 1.1.
Assume that f″(x) exists and is continuous on [a, b], and that ξ is a simple root of f (x); that is, f (ξ) = 0 and f′(ξ) ≠ 0.

Choose
g(x) = x − f (x)/f′(x).
Then
g′(x) = 1 − [(f′(x))^2 − f (x)f″(x)] / (f′(x))^2 = f (x)f″(x) / (f′(x))^2.
Thus, g′(ξ) = f (ξ)f″(ξ) / (f′(ξ))^2 = 0, since f (ξ) = 0 and f′(ξ) ≠ 0.

Since g′(x) is continuous, this means that there exists a small neighborhood around the root x = ξ such that for all points x in that neighborhood, |g′(x)| < 1.

Thus if g(x) is chosen as above and the starting approximation x0 is chosen sufficiently close to the
root x = ξ, then the fixed-point iteration is guaranteed to converge, according to the corollary above.

This leads to the following well-known classical method, known as the Newton-Raphson method.

(Remark: In many textbooks and in the literature, the Newton-Raphson Method is simply called the Newton Method. But the title “The Newton-Raphson Method” is justified. See an article by [ ] in SIAM Review (1996).)

Algorithm 1.2 The Newton-Raphson Method

Inputs: f (x) - The given function


x0 - The initial approximation
ε - The error tolerance
N - The maximum number of iterations

Output: An approximation to the root x = ξ or a message of failure.

Assumption: x = ξ is a simple root of f (x).


For k = 0, 1, · · · do until convergence or failure.

• Compute f (xk) and f′(xk).


• Compute xk+1 = xk − f (xk)/f′(xk).
• Test for convergence or failure (stopping criteria):
If |f (xk)| < ε, or |xk+1 − xk| / |xk| < ε, or k > N, Stop.

End

Definition 1.1 The iterations xk+1 = xk − f (xk)/f′(xk) are called Newton's iterations.

Remarks:
1. If none of the above criteria is satisfied within a predetermined number of iterations, say N, then the method has failed after the prescribed number of iterations. In that case, one could try the method again with a different x0 .

2. A judicious choice of x0 can sometimes be obtained by drawing the graph of f (x), if possible. However,
there does not seem to exist a clear-cut guideline of how to choose a right starting point x0 that
guarantees the convergence of the Newton-Raphson Method to a desired root.
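
A minimal Python sketch of Algorithm 1.2; the particular combination of stopping tests below is one reasonable reading of the algorithm, not the only one.

def newton_raphson(f, fprime, x0, eps=1e-8, N=100):
    """Newton's iterations x_{k+1} = x_k - f(x_k)/f'(x_k)."""
    x = x0
    for k in range(N):
        fx = f(x)
        if abs(fx) < eps:                      # |f(x_k)| < eps
            return x
        x_new = x - fx / fprime(x)
        if abs(x_new - x) < eps * abs(x_new):  # relative-change test
            return x_new
        x = x_new
    raise RuntimeError("no convergence within N iterations")

# Example 1.5 below: square root of 2 via f(x) = x^2 - 2
print(newton_raphson(lambda x: x**2 - 2, lambda x: 2 * x, 1.5))  # 1.41421...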
Some Familiar Computations Using the Newton-Raphson Method

1. Computing the Square Root of a Positive Number A: Compute √A, where A > 0.
Computing √A is equivalent to solving x^2 − A = 0. The number √A, thus, may be computed by applying the Newton-Raphson Method to f (x) = x^2 − A.
Since f′(x) = 2x, we have the following Newton iterations to generate {xk}:


Newton-Raphson Iterations for Computing √A

Input: A - A positive number
Output: An approximation to √A.
Step 1. Guess an initial approximation x0 to √A.
Step 2. Compute the successive approximations {xk} as follows:
For k = 0, 1, 2, . . . , do until convergence
xk+1 = xk − (xk^2 − A)/(2xk) = (xk^2 + A)/(2xk)

End

Example 1.5 Let A = 2, x0 = 1.5.

Iteration 1. x1 = (x0^2 + A)/(2x0) = ((1.5)^2 + 2)/3 = 1.4167
Iteration 2. x2 = (x1^2 + A)/(2x1) = 1.4142
Iteration 3. x3 = (x2^2 + A)/(2x2) = 1.4142

2. Computing the nth Root

It is easy to see that the above Newton-Raphson Method to compute √A can be easily generalized to compute the nth root of A. In this case f (x) = x^n − A, f′(x) = n x^(n−1).
Thus Newton's iterations in this case are:
xk+1 = xk − (xk^n − A)/(n xk^(n−1)) = ((n − 1) xk^n + A)/(n xk^(n−1)).

3. Finding the Reciprocal of a Nonzero Number A

The problem here is to compute 1/A, where A ≠ 0. Again, the Newton-Raphson Method can be applied with f (x) = 1/x − A = 0. We have f′(x) = −1/x^2. The sequence {xk} is given by
xk+1 = xk − f (xk)/f′(xk) = xk − (1/xk − A)/(−1/xk^2) = xk (2 − A xk).

The Newton-Raphson Iteration for Computing 1/A

Input: A - A nonzero number
Output: An approximation to 1/A.
Step 1. Guess an initial approximation x0 to 1/A.
Step 2. For k = 0, 1, 2, . . . do until convergence
xk+1 = xk (2 − A xk)
End

Example 1.6 Let A = 3.

The problem is to find 1/3. Here
f (x) = 1/x − 3.
Take x0 = 0.25.
Then x1 = x0 (2 − A x0) = 0.3125
x2 = x1 (2 − A x1) = 0.3320
x3 = x2 (2 − A x2) = 0.3333

Note that 0.3333 is the 4-digit approximation of 1/3 (correct up to 4 decimal figures).
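
This division-free update is easy to try directly (a small Python sketch of our own reproducing Example 1.6):

A, x = 3, 0.25            # approximating 1/3 from the initial guess 0.25
for k in range(3):
    x = x * (2 - A * x)   # x_{k+1} = x_k (2 - A x_k); no division used
    print(x)              # 0.3125, 0.33203125, 0.33332824...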
A Geometric Interpretation of the Newton-Raphson Method

• The first approximation x1 can be viewed as the x-intercept of the tangent line to the graph of f (x)
at (x0 , f (x0 )).

• The second approximation x2 is the x-intercept of the tangent line to the graph of f (x) at (x1 , f (x1 )).
And so on.

Figure 1.3: Graphical Representation of the Newton-Raphson Method.

1.5 The Secant Method


A major disadvantage of the Newton-Raphson Method is the requirement of finding the value of the derivative of f (x) at each approximation. There are functions for which this job is either extremely difficult (if not impossible) or time consuming. A way out is to approximate the derivative from the values of the function at the current and the previous approximations. Knowing f (xk) and f (xk−1), we can approximate f′(xk) as:
f′(xk) ≈ (f (xk) − f (xk−1)) / (xk − xk−1).
Then, the Newton iterations in this case are:
xk+1 = xk − f (xk)/f′(xk) ≈ xk − f (xk)(xk − xk−1) / (f (xk) − f (xk−1)).

This consideration leads to the Secant method.

Algorithm 1.3 The Secant Method

Inputs: f (x) - The given function


x0 , x1 - The two initial approximations of the root
ε - The error tolerance
N - The maximum number of iterations

Output: An approximation of the exact solution ξ or a message of failure.

For k = 1, 2, · · · do
• Compute f (xk) and f (xk−1).
• Compute the next approximation: xk+1 = xk − f (xk)(xk − xk−1) / (f (xk) − f (xk−1)).
• Test for convergence or maximum number of iterations: If |xk+1 − xk| < ε or if k > N, Stop.
End
Note: The secant method needs two approximations x0 , x1 to start with, whereas the Newton-Raphson
method just needs one, namely, x0 .
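
A minimal Python sketch of Algorithm 1.3 (function name and defaults are our own choices):

def secant(f, x0, x1, eps=1e-8, N=100):
    """Secant iterations from two starting approximations."""
    for k in range(N):
        f0, f1 = f(x0), f(x1)
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < eps:
            return x2
        x0, x1 = x1, x2        # slide the window of approximations
    raise RuntimeError("no convergence within N iterations")

# Example 1.7 below: f(x) = x^2 - 2 with x0 = 1.5, x1 = 1
print(secant(lambda x: x**2 - 2, 1.5, 1))   # 1.41421...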
Example 1.7 f (x) = x^2 − 2 = 0.

Let x0 = 1.5, x1 = 1.

Iteration 1. k = 1.
x2 = x1 − f (x1)(x1 − x0) / (f (x1) − f (x0)) = 1 − (−1)(−0.5)/(−1 − 0.25) = 1 + 0.5/1.25 = 1.4

Iteration 2.
x3 = x2 − f (x2)(x2 − x1) / (f (x2) − f (x1)) = 1.4 − (−0.04)(0.4)/((−0.04) − (−1)) = 1.4 + 0.0167 = 1.4167

Iteration 3.
x4 = x3 − f (x3)(x3 − x2) / (f (x3) − f (x2)) = 1.4142

Iteration 4.
x5 = x4 − f (x4)(x4 − x3) / (f (x4) − f (x3)) = 1.4142

Remarks: By comparing the results of this example with those obtained by the Newton-Raphson method (Example 1.5), we see that the secant method took four iterations to obtain the 4-digit accuracy √2 ≈ 1.4142, whereas the Newton-Raphson method obtained this accuracy after just 2 iterations.
In general, the Newton-Raphson method converges faster than the secant method. The exact rates of convergence of these two and the other methods will be discussed in the next section.

A Geometric Interpretation of the Secant Method

• x2 is the x-intercept of the secant line passing through (x0 , f (x0 )) and (x1 , f (x1 )).
• x3 is the x-intercept of the secant-line passing through (x1 , f (x1 )) and (x2 , f (x2 )).
• And so on.

Note: That is why the method is called the secant method.


Figure 1.4: Geometric Interpretation of the Secant Method.

1.6 The Method of False Position (The Regula Falsi Method)


In the Bisection method, every interval under consideration is guaranteed to contain the desired root x = ξ. That is why the Bisection method is often called a bracket method: every interval brackets the root. However, the Newton-Raphson method and the secant method are not bracket methods in that sense, because there is no guarantee that two successive approximations will bracket the root. Here is an example.

Example 1.8 Consider solving f (x) = tan(πx) − 6 = 0, using the Secant Method.
Take x0 = 0, x1 = 0.48.

(Note that ξ = 0.44743154 is a root of f (x) = 0 lying in the interval [0, 0.48]).

Iteration 1
x2 = x1 − f (x1)(x1 − x0) / (f (x1) − f (x0)) = 0.181194
Iteration 2
x0 = 0.48, x1 = 0.181194

x2 = 0.286187

Iteration 3
x0 = 0.181194, x1 = 0.286187
x2 = 1.091987. Clearly, the iterations are diverging.

However, the secant method can easily be modified to make it a bracket method. The idea is as follows:

• Choose the initial approximations x0 and x1 such that f (x0 ) f (x1 ) < 0, ensuring that the root is in
[x0 , x1 ].

• Compute x2 as in the Secant method; that is, take x2 as the x-intercept of the secant line passing through (x0 , f (x0)) and (x1 , f (x1)).

• Compute x3 from the three approximations x0 , x1 , and x2 as follows:


First, compute f (x1)f (x2); if it is negative, then [x1 , x2] brackets the root, and x3 is the x-intercept of the secant line joining (x1 , f (x1)) and (x2 , f (x2)) (as in the Secant method). Otherwise, x3 is computed as the x-intercept of the line joining (x0 , f (x0)) and (x2 , f (x2)).

• Continue the process.

• This yields a method called the Method of False Position or the Method of Regula Falsi.

Algorithm 1.4 The Method of False Position


Input: (i) f (x) - The given function
(ii) x0 , x1 - The two initial approximations, chosen so that f (x0)f (x1) < 0
(iii) ε - The tolerance
(iv) N - The maximum number of iterations
Output: An approximation to the root x = ξ.

Step 0. Set i = 1.
Step 1. Compute xi+1 = xi − f (xi)(xi − xi−1) / (f (xi) − f (xi−1)).
If |(xi+1 − xi)/xi| < ε or i ≥ N, Stop.
Step 2. If f (xi)f (xi+1) < 0, then set xi−1 ≡ xi and xi ≡ xi+1 .
Otherwise, set xi ≡ xi+1 , leaving xi−1 unchanged. Increase i by 1 and return to Step 1. End
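
A Python sketch of Algorithm 1.4 (our own reading of the relabeling in Step 2, done here with tuple assignment):

import math

def false_position(f, x0, x1, eps=1e-8, N=100):
    """Secant-type steps that always keep the root bracketed."""
    if f(x0) * f(x1) >= 0:
        raise ValueError("initial approximations must bracket the root")
    for i in range(N):
        x2 = x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
        if abs((x2 - x1) / x1) < eps:
            return x2
        if f(x1) * f(x2) < 0:      # root in [x1, x2]
            x0, x1 = x1, x2
        else:                      # root still in [x0, x2]
            x1 = x2
    return x1

# Example 1.9 below: f(x) = tan(pi x) - 6 on [0, 0.48]; converges to 0.44743...
print(false_position(lambda x: math.tan(math.pi * x) - 6, 0, 0.48))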

Example 1.9 Consider again the previous example: (Example 1.8)

f (x) = tan(πx) − 6 = 0.

Let the interval be [0, 0.48].

Then ξ = 0.44743154 is the root of f (x) in [0, 0.48].


Let’s start with the same initial approximations:

x0 = 0, x1 = 0.48; f (x0)f (x1) < 0

Iteration 1
x2 = x1 − f (x1)(x1 − x0) / (f (x1) − f (x0)) = 0.181194
f (x2)f (x1) < 0
Set x0 = 0.48, x1 = 0.181194

Iteration 2
x2 = x1 − f (x1)(x1 − x0) / (f (x1) − f (x0)) = 0.286187
f (x2)f (x1) > 0
Set x0 = 0.48, x1 = 0.286187

Iteration 3
x2 = 0.348981
f (x2)f (x1) > 0
Set x0 = 0.48, x1 = 0.348981

Iteration 4
x2 = 0.387051

Iteration 5
x2 = 0.410304

1.7 Convergence Analysis of the Iterative Methods


We have seen that the Bisection method always converges, and each of the methods: the Newton-Raphson
Method, the Secant method, and the method of False Position, converges under certain conditions. The
question now arises: when a method converges, how fast does it converge? In other words, what is the rate of convergence? To this end, we first define:

Definition 1.2 (Rate of Convergence of an Iterative Method). Suppose that the sequence {xk} converges to ξ. Then the sequence {xk} is said to converge to ξ with order of convergence α if there exists a positive constant p such that
lim_{k→∞} |xk+1 − ξ| / |xk − ξ|^α = lim_{k→∞} ek+1 / ek^α = p.

Thus if α = 1, the convergence is linear. If α = 2, the convergence is quadratic; and so on. The number
α is called the convergence factor.
Using the above definition, we will now show that the rate of convergence for the Fixed Point Iteration
method is usually linear but that of the Newton-Raphson method is quadratic, and that of the Secant method
is superlinear.

Rate of Convergence of the Fixed-Point Iteration Method


Recall that in the proof of the Fixed-Point Iteration Theorem, we established that
ek+1 = |g′(c)| ek ,
where c is a number between xk and ξ.
So, lim_{k→∞} ek+1/ek = lim_{k→∞} |g′(c)|.
Since {xk} → ξ, and c is in (xk , ξ), it follows that lim_{k→∞} |g′(c)| = |g′(ξ)|.
This gives lim_{k→∞} ek+1/ek = |g′(ξ)|. Thus α = 1.
We, therefore, have
We, therefore, have

The fixed-point iteration

xk+1 = g(xk ), k = 0, 1, · · ·

converges linearly.

Rate of Convergence of the Newton-Raphson Method


To study the rate of convergence of the Newton-Raphson Method, we need Taylor’s Theorem.

Taylor's Theorem of order n

Suppose that the function f (x) possesses continuous derivatives of order up to (n + 1) in the interval [a, b], and p is a point in this interval. Then for every x in this interval, there exists a number c (depending upon x) between p and x such that

f (x) = f (p) + f′(p)(x − p) + f″(p)/2! (x − p)^2 + · · · + f^(n)(p)/n! (x − p)^n + Rn(x),

where Rn(x), called the remainder after n terms, is given by:

Rn(x) = f^(n+1)(c)/(n + 1)! (x − p)^(n+1)

Let's choose a small interval around the root x = ξ. Then for any x in this interval, we have, by Taylor's theorem of order 1, the following expansion of the function g(x):
g(x) = g(ξ) + (x − ξ)g′(ξ) + (x − ξ)^2/2! g″(η),
where η lies between x and ξ.

Now, for the Newton-Raphson Method, we have seen that g′(ξ) = 0.
So, g(x) = g(ξ) + (x − ξ)^2/2! g″(η).
Putting x = xk (with ηk denoting the corresponding intermediate point):
g(xk) = g(ξ) + (xk − ξ)^2/2! g″(ηk),
or g(xk) − g(ξ) = (xk − ξ)^2/2! g″(ηk).
Since g(xk) = xk+1 and g(ξ) = ξ, this gives
xk+1 − ξ = (xk − ξ)^2/2! g″(ηk).
That is, |xk+1 − ξ| = |xk − ξ|^2/2 |g″(ηk)|,
or ek+1/ek^2 = |g″(ηk)|/2.
Since ηk lies between xk and ξ for every k, it follows that lim_{k→∞} ηk = ξ.
So, we have lim_{k→∞} ek+1/ek^2 = lim_{k→∞} |g″(ηk)|/2 = |g″(ξ)|/2.
This proves
The Newton-Raphson Method Converges Quadratically.

The quadratic convergence roughly means that the number of correct digits doubles at each iteration.
The Rate of Convergence of the Secant Method.
It can be shown [Exercise] that if the Secant method converges, then the convergence factor α is approximately 1.6 for sufficiently large k. This is expressed by saying that “the rate of convergence of the Secant method is superlinear”.
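
These rates can also be observed numerically. The sketch below is an illustration of our own (the helper order_estimate is not from the notes): it uses ek+1 ≈ p ek^α, so that α ≈ log(ek+1/ek) / log(ek/ek−1).

import math

def order_estimate(xs, root):
    """Estimate the order alpha from consecutive absolute errors."""
    e = [abs(x - root) for x in xs]
    return [math.log(e[k + 1] / e[k]) / math.log(e[k] / e[k - 1])
            for k in range(1, len(e) - 1)]

# Newton iterates for f(x) = x^2 - 2 starting at x0 = 1.5
xs = [1.5]
for k in range(3):
    xs.append(xs[-1] - (xs[-1]**2 - 2) / (2 * xs[-1]))
print(order_estimate(xs, math.sqrt(2)))   # estimates approach 2 (quadratic)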

1.8 A Modified Newton-Raphson Method for Multiple Roots


Recall that the most important underlying assumption in the proof of quadratic convergence of the Newton-Raphson method was that f′(x) is not zero at the approximations x = xk , and in particular f′(ξ) ≠ 0. This means that ξ is a simple root of f (x) = 0.

The question, therefore, is:


How can the Newton-Raphson method be modified for a multiple root so that the quadratic
convergence can be retained?

Note that if f (x) has a root ξ of multiplicity m, then
f (ξ) = f′(ξ) = f″(ξ) = · · · = f^(m−1)(ξ) = 0,
where f^(m−1)(ξ) denotes the (m − 1)th derivative of f (x) at x = ξ. In this case f (x) can be written as f (x) = (x − ξ)^m h(x), where h(ξ) ≠ 0 (for a polynomial f (x) of degree n, h(x) is a polynomial of degree n − m).

In case f (x) has a root of multiplicity m, it was suggested by Ralston and Rabinowitz (1978) that the following modified iteration will ensure quadratic convergence of the Newton-Raphson method, under certain conditions [Exercise].

The Newton-Raphson Iterations for Multiple Roots:


Method I
xi+1 = xi − m f (xi)/f′(xi)

Since in practice the multiplicity m is not known a priori, another useful modification is to apply the Newton-Raphson iteration to the function
u(x) = f (x)/f′(x)
in place of f (x).
Thus, in this case, the modified Newton-Raphson iteration becomes xi+1 = xi − u(xi)/u′(xi).
Note that, since f (x) = (x − ξ)^m h(x), u(x) has a simple zero at x = ξ, so that u′(ξ) ≠ 0.

Since u′(x) = [f′(x)f′(x) − f (x)f″(x)] / [f′(x)]^2,
we have

The Newton-Raphson Iterations for Multiple Roots


Method II
xi+1 = xi − f (xi)f′(xi) / ([f′(xi)]^2 − f (xi)f″(xi))

Example:
f (x) = e^x − x − 1
f′(x) = e^x − 1
f″(x) = e^x

Figure 1.5: The Graph of e^x − x − 1.


The function has a root of multiplicity two at x = 0 (see the graph above). So, m = 2.
Method I
xi+1 = xi − 2f (xi)/f′(xi)
x0 = 0.5
Iteration 1. x1 = 0.5 − 2(e^0.5 − 0.5 − 1)/(e^0.5 − 1) = 0.0415 = 4.15 × 10^−2

Iteration 2. x2 = 0.0415 − 2(e^0.0415 − 0.0415 − 1)/(e^0.0415 − 1) = 0.00028703 = 2.8703 × 10^−4.
Method II
f (x) = e^x − x − 1.
xi+1 = xi − f (xi)f′(xi) / ((f′(xi))^2 − f (xi)f″(xi))
x0 = 0.5
Iteration 1. x1 = −0.0493 = −4.93 × 10^−2

Iteration 2. x2 = −3.9848 × 10^−4
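
Both modifications are easy to sketch in Python (our own illustration; the small |f| cutoff guards against division by a vanishing f′ near the multiple root):

import math

def modified_newton(f, fp, fpp, x0, m=None, N=20, cutoff=1e-10):
    """Method I (multiplicity m given) or Method II (via u = f/f')."""
    x = x0
    for i in range(N):
        if abs(f(x)) < cutoff:   # f' also vanishes at the root; stop early
            break
        if m is not None:        # Method I: x - m f(x)/f'(x)
            x = x - m * f(x) / fp(x)
        else:                    # Method II: x - f f' / (f'^2 - f f'')
            x = x - f(x) * fp(x) / (fp(x)**2 - f(x) * fpp(x))
    return x

f   = lambda x: math.exp(x) - x - 1      # double root at x = 0
fp  = lambda x: math.exp(x) - 1
fpp = lambda x: math.exp(x)
print(modified_newton(f, fp, fpp, 0.5, m=2))   # Method I
print(modified_newton(f, fp, fpp, 0.5))        # Method II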

1.9 The Newton-Raphson Method for Polynomials


A major requirement of all the methods that we have discussed so far is the evaluation of f (x) at successive
approximations xk .
If f (x) happens to be a polynomial Pn(x) (which is the case in many practical applications), then there is an extremely simple classical technique, known as Horner's Method, to compute Pn(x) at x = z. The value Pn(z) can be obtained recursively in n steps from the coefficients of the polynomial Pn(x). Furthermore, as a by-product, one can get the value of the derivative of Pn(x) at x = z, that is, P′n(z). The scheme can then be incorporated in the Newton-Raphson Method to compute a zero of Pn(x).
(Note that the Newton-Raphson Method requires evaluation of Pn(x) and its derivative at successive iterations.)
Horner’s Method
Let Pn(x) = a0 + a1 x + a2 x^2 + · · · + an x^n and let x = z be given. We need to compute Pn(z) and P′n(z).
Let's write Pn(x) as:
Pn(x) = (x − z) Qn−1(x) + b0 ,
where Qn−1(x) = bn x^(n−1) + bn−1 x^(n−2) + · · · + b2 x + b1 .
Thus, Pn(z) = b0 .
Let's see how to compute b0 recursively from the coefficients of Pn(x).
From above, we have
a0 + a1 x + a2 x^2 + · · · + an x^n = (x − z)(bn x^(n−1) + bn−1 x^(n−2) + · · · + b1) + b0 .
Comparing the coefficients of like powers of x on both sides, we obtain
bn = an
bn−1 − bn z = an−1 ⇒ bn−1 = an−1 + bn z
and so on. In general, bk − bk+1 z = ak ⇒ bk = ak + bk+1 z, k = n − 1, n − 2, ..., 1, 0.

Thus, knowing the coefficients an , an−1 , ..., a1 , a0 of Pn(x), the coefficients bn , bn−1 , ..., b1 of Qn−1(x) can be computed recursively starting with bn = an , as shown above: bk = ak + bk+1 z, k = n − 1, n − 2, . . . , 1, 0.
Again, note that Pn (z) = b0 . So, Pn (z) = b0 = a0 + b1 z.

That is, if we know the coefficients of a polynomial, we can compute the value of the polynomial at a given
point out of these coefficients recursively as shown above.

Algorithm 1.5 Evaluating a Polynomial Pn(x) = a0 + a1 x + a2 x^2 + . . . + an x^n at x = z.

Inputs: (1) a0 , a1 , ..., an - The coefficients of the polynomial Pn(x).

(2) z - The number at which Pn(x) has to be evaluated.

Output: b0 = Pn(z).

Step 1. Set bn = an .

Step 2. For k = n − 1, n − 2, . . . , 1, 0 do.

Compute bk = ak + bk+1 z

Step 3. Set Pn (z) = b0

End
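
Algorithm 1.5 translates directly into a few lines of Python (a sketch; here the coefficients are passed lowest degree first, matching a0, ..., an):

def horner(a, z):
    """Evaluate P_n(z) from coefficients a = [a0, a1, ..., an]."""
    b = a[-1]                            # b_n = a_n
    for k in range(len(a) - 2, -1, -1):
        b = a[k] + b * z                 # b_k = a_k + b_{k+1} z
    return b

# Example 1.10 below: P3(x) = x^3 - 7x^2 + 6x + 5 at z = 2
print(horner([5, 6, -7, 1], 2))          # -3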

It is interesting to note that as a by-product of the above, we also obtain P′n(z), as the following computations show:

Finding P′n(z)

Pn(x) = (x − z)Qn−1(x) + b0 .
Thus, P′n(x) = Qn−1(x) + (x − z)Q′n−1(x).
So, P′n(z) = Qn−1(z).
This means that knowing the coefficients of Qn−1(x), we can compute P′n(z) recursively from those coefficients. The coefficients of the polynomial Qn−1(x) are processed exactly as above, replacing ak , k = n, · · · , 1 by bk , k = n, n − 1, · · · , 1, and labeling the results c's instead of b's. Then Qn−1(z) = P′n(z) = c1 . Note that in computing Pn(z), we generated the numbers b0 , b1 , · · · , bn from the coefficients a0 , a1 , · · · , an of Pn(x). So, in computing P′n(z), we need to generate the numbers c1 , c2 , · · · , cn from the coefficients b1 , b2 , · · · , bn of the polynomial Qn−1(x), which have already been computed by Algorithm 1.5.

Then, we have the following scheme to compute Pn0 (z):

Computing P′n(z)

Inputs: (1) b1 , ..., bn - The coefficients of the polynomial Qn−1(x) obtained from the previous algorithm
(2) A number z.

Output: c1 = Qn−1(z) = P′n(z).

Step 1. Set cn = bn .
Step 2. For k = n − 1, n − 2, ..., 2, 1 do
Compute ck = bk + ck+1 z
End
Step 3. Set P′n(z) = c1 .

Example 1.10 Let P3(x) = x^3 − 7x^2 + 6x + 5, and let z = 2.
Here a0 = 5, a1 = 6, a2 = −7, a3 = 1.
Generate the sequence {bk} from {ak}:
b3 = a3 = 1
b2 = a2 + b3 z = −7 + 2 = −5
b1 = a1 + b2 z = 6 − 10 = −4
b0 = a0 + b1 z = 5 − 8 = −3

So, P3(2) = b0 = −3.

Generate the sequence {ck} from {bk}:
c3 = b3 = 1
c2 = b2 + c3 z = −5 + 2 = −3
c1 = b1 + c2 z = −4 − 6 = −10

So, P′3(2) = c1 = −10.

The Newton-Raphson Method with Horner’s Scheme


We now describe the Newton-Raphson Method for a polynomial using Horner's scheme. Recall that the Newton-Raphson iteration for finding a root of f (x) = 0 is:
xk+1 = xk − f (xk)/f′(xk).
In case f (x) is a polynomial Pn(x) of degree n: f (x) = Pn(x) = a0 + a1 x + a2 x^2 + · · · + an x^n, the above iteration becomes:
xk+1 = xk − Pn(xk)/P′n(xk).
If the sequences {bk} and {ck} are generated using Horner's Scheme at each iteration, we then have
xk+1 = xk − b0/c1 .
(Note that at iteration k, Pn(xk) = b0 and P′n(xk) = c1 .)

Algorithm 1.6 The Newton-Raphson Method with Horner’s Scheme

Input: a0 , a1 , · · · , an - The coefficients of Pn(x) = a0 + a1 x + a2 x^2 + · · · + an x^n .

x0 - The initial approximation
ε - The tolerance
N - The maximum number of iterations to be performed.

Output: An approximation of the root x = ξ or a failure message.

For k = 0, 1 , 2, · · · , do

Step 1. z = xk , bn ≡ an , cn ≡ bn

Step 2. Computing b1 and c1

For j = n − 1, n − 2, · · · , 1 do

bj ≡ aj + z bj+1

cj ≡ bj + z cj+1

End.

Step 3. b0 = a0 + z b1

Step 4. xk+1 = xk − b0 /c1

Step 5. If |xk+1 − xk| < ε or k > N, stop.

End

Notes:
1. The outer loop corresponds to the Newton-Raphson Method.
2. The inner loop together with the statement b0 = a0 + z b1 corresponds to the evaluation of Pn(x) and its derivative P′n(x) at successive approximations.
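
Here is a Python sketch of Algorithm 1.6, combining the two recurrences of Steps 1-3 in a single inner loop (the function names are our own):

def horner_with_derivative(a, z):
    """Return (P_n(z), P_n'(z)) = (b0, c1) from a = [a0, a1, ..., an]."""
    n = len(a) - 1
    b = a[n]
    c = b
    for j in range(n - 1, 0, -1):
        b = a[j] + z * b       # b_j = a_j + z b_{j+1}
        c = b + z * c          # c_j = b_j + z c_{j+1}
    return a[0] + z * b, c     # b0 = a0 + z b1, together with c1

def newton_horner(a, x0, eps=1e-8, N=50):
    x = x0
    for k in range(N):
        b0, c1 = horner_with_derivative(a, x)
        x_new = x - b0 / c1
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    return x

# Example 1.11 below: root of x^3 - 7x^2 + 6x + 5 near x0 = 2
print(newton_horner([5, 6, -7, 1], 2))   # 1.6872...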

Example 1.11 Find a root of P3(x) = x^3 − 7x^2 + 6x + 5 = 0, using the Newton-Raphson method and Horner's scheme.
P3(x) has a root between 1.5 and 2. Let x0 = 2.

Iteration 1. k = 0:
b0 = −3, c1 = −10
x1 = x0 − b0/c1 = 2 − 3/10 = 17/10 = 1.7

Iteration 2. k = 1:
z = x1 = 1.7, b3 = 1, c3 = b3 = 1.

b2 = a2 + z b3 = −7 + 1.7 = −5.3

c2 = b2 + z c3 = −5.3 + 1.7 = −3.6

b1 = a1 + z b2 = 6 + 1.7(−5.3) = −3.0100

c1 = b1 + z c2 = −3.01 + 1.7(−3.6) = −9.130

b0 = a0 + z b1 = 5 + 1.7(−3.0100) = −0.1170
x2 = x1 − b0/c1 = 1.7 − 0.1170/9.130 = 1.6872
(The exact root, correct up to four decimal places, is 1.6872.)

1.10 The Müller Method: Finding a Pair of Complex Roots

The Müller Method is an extension of the Secant Method. It is conveniently used to approximate a complex root of a real equation.
Note that none of the methods discussed so far (the Newton-Raphson, the Secant, and the Fixed-Point Iteration methods) can produce an approximation of a complex root starting from real approximations. The Müller Method is capable of doing so.

• Recall that starting with two initial approximations x0 and x1 , the Secant Method finds the next approximation x2 as the x-intercept of the line passing through (x0 , f (x0)) and (x1 , f (x1)).

• In the Müller Method, starting with three initial approximations x0 , x1 , and x2 , the next approximation x3 is obtained as the x-intercept of the parabola P passing through (x0 , f (x0)), (x1 , f (x1)), and (x2 , f (x2)).

• Since a parabola can intersect the x-axis at two points, the one that is closer to x2 is taken as x3 .

• So, in Müller's method, x3 is obtained as follows:

1. Find the parabola P passing through (x0 , f (x0)), (x1 , f (x1)), and (x2 , f (x2)).
2. Solve the quadratic equation representing the parabola P and take x3 as the root closest to x2 .

(Note that even when x0 , x1 , and x2 are real, x3 can be complex.)

The process, of course, can be continued to obtain further approximations. That is, x4 is obtained by solving the quadratic equation representing the parabola P passing through (x1 , f (x1)), (x2 , f (x2)), and (x3 , f (x3)), and taking x4 as the zero that is closest to x3 ; and so on.
Remark: Note that it is advisable to relabel the approximations at the beginning of each iteration so that we need to store only 3 approximations per iteration.
Geometric Interpretation of the Müller Method


We now give the details of how to obtain x3 starting from x0 , x1 , and x2 .

Let the equation of the parabola P be P (x) = a(x − x2)^2 + b(x − x2) + c.
Since P (x) passes through (x0 , f (x0)), (x1 , f (x1)), and (x2 , f (x2)), we have
P (x0) = f (x0) = a(x0 − x2)^2 + b(x0 − x2) + c
P (x1) = f (x1) = a(x1 − x2)^2 + b(x1 − x2) + c
P (x2) = f (x2) = c.
Knowing c = f (x2), we can now obtain a and b by solving the first two equations:
a(x0 − x2)^2 + b(x0 − x2) = f (x0) − c
a(x1 − x2)^2 + b(x1 − x2) = f (x1) − c.
Or, in matrix form:
[ (x0 − x2)^2   (x0 − x2) ] [ a ]   [ f (x0) − c ]
[ (x1 − x2)^2   (x1 − x2) ] [ b ] = [ f (x1) − c ].
Knowing a and b from above, we can now obtain x3 by solving P (x3) = 0:
P (x3) = a(x3 − x2)^2 + b(x3 − x2) + c = 0.

To avoid catastrophic cancellation, instead of using the traditional formula, we use the following formula for solving the quadratic equation ax^2 + bx + c = 0.

Solving the Quadratic Equation

ax^2 + bx + c = 0
x = −2c / (b ± √(b^2 − 4ac))

The sign in the denominator is chosen so that the denominator is the largest in magnitude, guaranteeing that x3 is closest to x2 .
In our case, x3 − x2 = −2c / (b ± √(b^2 − 4ac)),

or x3 = x2 − 2c / (b ± √(b^2 − 4ac)).
Note: one can also give explicit expressions for a and b, as follows:
b = [(x0 − x2)^2 (f (x1) − f (x2)) − (x1 − x2)^2 (f (x0) − f (x2))] / [(x0 − x2)(x1 − x2)(x0 − x1)]
a = [(x1 − x2)(f (x0) − f (x2)) − (x0 − x2)(f (x1) − f (x2))] / [(x0 − x2)(x1 − x2)(x0 − x1)].
However, for computational accuracy, it may be easier to solve the 2 × 2 linear system above to obtain a and b, once c = f (x2) is computed.

Algorithm 1.7 The Müler Method

Inputs: (i) f (x) - The given function
(ii) x0 , x1 , x2 - Initial approximations
(iii) ε - Tolerance
(iv) N - The maximum number of iterations.
Output: An approximation of the root x = ξ
Step 1: Evaluate f (x0 ), f (x1 ), and f (x2 ).
Step 2: Compute a, b, and c, as follows:
2.1 c = f (x2 ).
2.2 Solve the 2 × 2 linear system for a and b:
[ (x0 − x2)^2   (x0 − x2) ] [ a ]   [ f (x0) − c ]
[ (x1 − x2)^2   (x1 − x2) ] [ b ] = [ f (x1) − c ].

Step 3: Compute x3 = x2 − 2c / (b ± √(b^2 − 4ac)), choosing + or − so that the denominator is the largest in magnitude.

Step 4: Test if |x3 − x2 | <  or if the number of iterations exceeds N . If so, stop. Otherwise go
to Step 5.

Step 5: Relabel x3 as x2 , x2 as x1 , and x1 as x0 , and return to Step 1.
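
A Python sketch of Algorithm 1.7. Complex arithmetic (cmath) lets real starting values lead to complex roots; for a and b we use the explicit expressions given earlier rather than a linear-system solver (our own choice).

import cmath

def muller(f, x0, x1, x2, eps=1e-8, N=50):
    for k in range(N):
        c = f(x2)
        d0, d1 = x0 - x2, x1 - x2
        g0, g1 = f(x0) - c, f(x1) - c
        den = d0 * d1 * (d0 - d1)          # (x0-x2)(x1-x2)(x0-x1)
        b = (d0**2 * g1 - d1**2 * g0) / den
        a = (d1 * g0 - d0 * g1) / den
        disc = cmath.sqrt(b * b - 4 * a * c)
        # pick the sign that makes the denominator largest in magnitude
        q = b + disc if abs(b + disc) >= abs(b - disc) else b - disc
        x3 = x2 - 2 * c / q
        if abs(x3 - x2) < eps:
            return x3
        x0, x1, x2 = x1, x2, x3            # relabel (Step 5)
    return x2

# Example 1.12 below: x^3 - 2x^2 - 5 has the complex pair -0.3453 +/- 1.3187i
print(muller(lambda x: x**3 - 2*x**2 - 5, -1, 0, 1))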

Example 1.12 f (x) = x^3 − 2x^2 − 5

The roots of f (x) are {2.6906, −0.3453 ± 1.3187i}. We will try to approximate the pair of complex roots here.

Iteration 1: Initial Approximations: x0 = −1, x1 = 0, x2 = 1


Step 1: f (x0 ) = −8, f (x1 ) = −5, f (x2 ) = −6.
Step 2: Compute a, b, and c:
2.1 c = f (x2) = −6
2.2 a and b are computed by solving the 2 × 2 linear system:

4a − 2b = −2
a − b = 1
This gives a = −2, b = −3.

Step 3: Compute x3 = x2 − 2c/(b − √(b^2 − 4ac)) = 1 + 12/(−3 − √(9 − 48)) = 0.25 + 1.5612i

Iteration 2. Relabeling x3 as x2 , x2 as x1 , and x1 as x0 , we have


x0 = 0, x1 = 1, x2 = 0.25 + 1.5612i
Step 1: f (x0) = −5, f (x1) = −6, and f (x2) = −2.0625 − 5.0741i

Step 2: c = f (x2) = −2.0625 − 5.0741i


a and b are obtained by solving the 2 × 2 linear system:
[ (x0 − x2)^2   (x0 − x2) ] [ a ]   [ f (x0) − c ]
[ (x1 − x2)^2   (x1 − x2) ] [ b ] = [ f (x1) − c ],
which gives a = −0.75 + 1.5612i, b = −5.4997 − 3.1224i.

Step 3: x3 = x2 − 2c/(b − √(b^2 − 4ac)) = x2 − (0.8388 + 0.3702i) = −0.5888 + 1.1910i

Iteration 3: Again, relabeling x3 as x2 , x2 as x1 , and x1 as x0 , we have


x0 = 1, x1 = 0.25 + 1.5612i, x2 = −0.5888 + 1.1910i
Step 1: f (x0 ) = −6, f (x1 ) = −2.0627 − 5.073i, f (x2 ) = −0.5549 + 2.3542i

Step 2: c = f (x2 ) = −0.5549 + 2.3542i


a and b are obtained by solving the system:
[ (x0 − x2)^2   (x0 − x2) ] [ a ]   [ f (x0) − c ]
[ (x1 − x2)^2   (x1 − x2) ] [ b ] = [ f (x1) − c ],
which gives a = −1.3888 + 2.7522i, b = −2.6339 − 8.5606i.

Step 3: x3 = x2 − 2c/(b − √(b^2 − 4ac)) = −0.3664 + 1.3508i

Iteration 4
x3 = −0.3563 + 1.3238i

Iteration 5
x3 = −0.3492 + 1.3184i

Iteration 6
x3 = −0.3464 + 1.3180i

Iteration 7
x3 = −0.3455 + 1.3183i
Since |(x3 − x2)/x2| < 6.8738 × 10^−4, we stop here.

Thus, an approximate pair of complex roots is: α1 = −0.3455 + 1.3183i, α2 = −0.3455 − 1.3183i.

Verify:
(x − α1)(x − α2)(x − 2.6906) = x^3 − 2.006x^2 − 4.9975.

Example 1.13 f (x) = x^3 − 7x^2 + 6x + 5

All three roots are real: 5.8210, 1.6872, −0.5090. We will try to approximate the root 1.6872.

Iteration 1.
x0 = 0, x1 = 1, x2 = 2

Step 1: f (x0 ) = 5, f (x1 ) = 5, f (x2 ) = −3.

Step 2: Compute c = f (x2 ) = −3.


Then a and b are obtained by solving the 2 × 2 linear system:
[ (x0 − x2)^2   (x0 − x2) ] [ a ]   [ 5 − c ]
[ (x1 − x2)^2   (x1 − x2) ] [ b ] = [ 5 − c ]
or
[ 4   −2 ] [ a ]   [ 8 ]
[ 1   −1 ] [ b ] = [ 8 ], which gives a = −4, b = −12.

Step 3: Compute x3 = x2 − 2c/(b − √(b^2 − 4ac)) = x2 − 0.2753 = 1.7247 (choosing the negative sign in the denominator).

Iteration 2.
Relabeling: x0 = x1 = 1, x1 = x2 = 2, x2 = x3 = 1.7247.

Step 1: f (x0) = 5, f (x1) = −3, f (x2) = −0.3441

Step 2: Compute c = −0.3441.

Then a and b are obtained by solving the 2 × 2 linear system:
[ 0.5253   −0.7247 ] [ a ]   [ 5 − c ]
[ 0.0758    0.2753 ] [ b ] = [ −3 − c ],
which gives a = −2.2753, b = −9.0227.

Step 3: Compute x3 = x2 − 2c/(b − √(b^2 − 4ac)) = x2 − 0.0385 = 1.6862

1.11 Sensitivity of the Roots of a Polynomial


The roots of a polynomial can be very sensitive to small perturbations of the coefficients of the polynomial. For example, take
p(x) = x^2 − 2x + 1.
The roots are x1 = 1, x2 = 1. Now perturb only the coefficient −2 to −1.9999, leaving the other two coefficients unchanged. The roots of the perturbed polynomial are (approximately) 1 ± 0.01i. Thus, both roots become complex. One might think that this happened because the roots of p(x) are multiple roots. It is true that multiple roots are prone to perturbations, but the roots of a polynomial with well-separated roots can be very sensitive too!
The well-known example of this is the celebrated Wilkinson polynomial:

p(x) = (x − 1)(x − 2) · · · (x − 20).

The roots of this polynomial equation are 1, 2, ..., 20 and thus very well-separated. But a small perturbation of the coefficient of x^19, which is −210, will change the roots completely; some of them will even become complex. Try this yourself using the MATLAB function roots.
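
In Python, the same experiment can be run with numpy (a sketch assuming numpy is available; np.poly builds the coefficients and np.roots plays the role of MATLAB's roots):

import numpy as np

p = np.poly(np.arange(1, 21))   # coefficients of (x-1)(x-2)...(x-20)
p_pert = p.copy()
p_pert[1] += 1e-7               # tiny perturbation of the x^19 coefficient (-210)
print(np.roots(p))              # approximately 1, 2, ..., 20
print(np.roots(p_pert))         # many roots move; several become complex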


Figure 1.6: Roots of the Wilkinson Polynomial

1.12 Deflation Technique


Deflation is a technique for computing the remaining roots of f (x) = 0 once an approximate real root or pair of complex roots has been computed. If x = α is an approximate root of P (x) = 0, where P (x) is a polynomial of degree n, then we can write
P (x) = (x − α)Qn−1(x),
where Qn−1(x) is a polynomial of degree n − 1.
Similarly, if x = α ± iβ is an approximate pair of complex conjugate roots, then
P (x) = (x − α − iβ)(x − α + iβ)Qn−2(x),
where Qn−2(x) is a polynomial of degree n − 2. Moreover, the zeros of Qn−1(x) in the first case and those of Qn−2(x) in the second case are also zeros of P (x). The coefficients of Qn−1(x) or Qn−2(x) can be generated using synthetic division, as in Horner's method, and then any of the root-finding procedures described before can be applied to compute a root or a pair of roots of Qn−1(x) = 0 or Qn−2(x) = 0. The procedure can be continued until all the roots are found.

Example 1.14 Consider finding the roots of f (x) = x^4 + 2x^3 + 3x^2 + 2x + 2 = 0. It is easy to see that x = ±i is a pair of complex conjugate roots of f (x) = 0. Then we can write

f (x) = (x + i)(x − i)(b3 x^2 + b2 x + b1)

or
x^4 + 2x^3 + 3x^2 + 2x + 2 = (x^2 + 1)(b3 x^2 + b2 x + b1).

Equating coefficients of x^4, x^3, and x^2 on both sides, we obtain
b3 = 1, b2 = 2, b1 + b3 = 3,
giving b3 = 1, b2 = 2, b1 = 2.
Thus, x^4 + 2x^3 + 3x^2 + 2x + 2 = (x^2 + 1)(x^2 + 2x + 2).
The zeros of the deflated polynomial Q2(x) = x^2 + 2x + 2 can now be found by using any of the methods described earlier.

PART II.
Interpolation
2 Interpolation
2.1 Problem Statement and Applications
Consider the following table:

x0   f0
x1   f1
x2   f2
...  ...
xk   fk
...  ...
xn   fn

In the above table, fk , k = 0, · · · , n are assumed to be the values of a certain function f (x), evaluated at xk , k = 0, · · · , n, in an interval containing these points. Note that only the functional values are known, not the function f (x) itself. The problem is to find fu corresponding to a nontabulated intermediate value x = xu .

Such a problem is called an Interpolation Problem. The numbers x0 , x1 , · · · , xn are called the nodes.

Interpolation Problem

Given (n + 1) points: (x0 , f0 ), (x1 , f1 ), · · · (xn , fn ), find fu


corresponding to xu , where x0 < xu < xn ; assuming that
f0 , f1 , · · · , fn are the values of a certain function f (x)
at x = x0 , x1 , · · · , xn , respectively.

The Interpolation problem is also a classical problem; it dates back to the time of Newton and Kepler, who needed to solve such a problem in analyzing data on the positions of stars and planets. It is also of interest in numerous other practical applications.

2.2 Existence and Uniqueness


It is well-known that a continuous function f (x) on [a, b] can be approximated arbitrarily closely by a polynomial. Specifically, for each ε > 0, there exists a polynomial P (x) such that |f (x) − P (x)| < ε for all x in [a, b]. This is a classical result, known as the Weierstrass Approximation Theorem.
Knowing that fk , k = 0, · · · , n are the values of a certain function at xk , the most obvious thing then
to do is to construct a polynomial Pn (x) of degree at most n that passes through the (n + 1) points:
(x0 , f0 ), (x1 , f1 ), · · · , (xn , fn ).
Indeed, if the nodes x0 , x1 , ..., xn are assumed to be distinct, then such a polynomial always
does exist and is unique, as can be seen from the following.

Let Pn (x) = a0 + a1 x + a2 x2 + · · · + an xn be a polynomial of degree at most n. If Pn (x) interpolates at


x0 , x1 , · · · , xn , we must have, by definition

Pn(x0) = f0 = a0 + a1 x0 + a2 x0^2 + · · · + an x0^n
Pn(x1) = f1 = a0 + a1 x1 + a2 x1^2 + · · · + an x1^n
...
Pn(xn) = fn = a0 + a1 xn + a2 xn^2 + · · · + an xn^n    (2.1)
These equations can be written in matrix form:

[ 1  x0  x0^2  · · ·  x0^n ] [ a0 ]   [ f0 ]
[ 1  x1  x1^2  · · ·  x1^n ] [ a1 ]   [ f1 ]
[ .   .    .           .   ] [ .. ] = [ .. ]    (2.2)
[ 1  xn  xn^2  · · ·  xn^n ] [ an ]   [ fn ]

Because x0 , x1 , · · · , xn are distinct, it can be shown [Exercise] that the matrix of the above system is
nonsingular. Thus, the linear system for the unknowns a0 , a1 , · · · , an has a unique solution, in view of the
following well-known result, available in any linear algebra text book.

The n × n algebraic linear system Ax = b has a unique solution


for every b if and only if A is nonsingular.

This means that Pn (x) exists and is unique.

Theorem 2.1 (Existence and Uniqueness Theorem for Polynomial Interpolation)


Given (n + 1) distinct points x0 , x1 , · · · , xn and the associated values f0 , f1 , · · · , fn of a function
f (x) at these points (that is, f (xi ) = fi , i = 0, 1, · · · , n), there is a unique polynomial Pn (x) of
degree at most n such that Pn (xi ) = fi , i = 0, 1, · · · , n. The coefficients of this polynomial can be
obtained by solving the (n + 1) × (n + 1) linear system (2.2).

The polynomial Pn (x) in Theorem 2.1 is called the interpolating polynomial.

2.3 The Lagrange Interpolation


Once we know that the interpolating polynomial exists and is unique, the problem then becomes how to
construct an interpolating polynomial; that is, how to construct a polynomial Pn (x) of degree at most n,
such that
Pn (xi ) = fi , i = 0, 1, · · · , n.
It is natural to obtain the polynomial by solving the linear system (2.2) of the previous section. Unfortunately,
the matrix of this linear system, known as the Vandermonde Matrix, is usually highly ill-conditioned,
and the solution of such an ill-conditioned system, even by the use of a stable method, may not be accurate. There are, however, several other ways to construct such a polynomial that do not require the solution of a Vandermonde system. We describe one such way in the following.
Suppose n = 1; that is, suppose that we have only two points (x0 , f0) and (x1 , f1). Then it is easy to see that the linear polynomial
P1(x) = (x − x1)/(x0 − x1) f0 + (x − x0)/(x1 − x0) f1
is an interpolating polynomial, because
P1(x0) = f0 , P1(x1) = f1 .
For convenience, we shall write the polynomial P1(x) in the form
P1(x) = L0(x) f0 + L1(x) f1 ,
where L0(x) = (x − x1)/(x0 − x1) and L1(x) = (x − x0)/(x1 − x0).
Note that both L0(x) and L1(x) are polynomials of degree 1. The concept generalizes easily to polynomials of higher degree. To generate polynomials of higher degrees, let's define the set of polynomials {Lk(x)} as follows:

Lk(x) = [(x − x0)(x − x1) · · · (x − xk−1)(x − xk+1) · · · (x − xn)] / [(xk − x0)(xk − x1) · · · (xk − xk−1)(xk − xk+1) · · · (xk − xn)], k = 0, 1, 2, · · · , n.    (2.3)
We will now show that the polynomial Pn (x) defined by

Pn (x) = L0 (x)f0 + L1 (x)f1 + · · · + Ln (x)fn (2.4)

is an interpolating polynomial.
To see this, note that

L0(x) = [(x − x1) · · · (x − xn)] / [(x0 − x1) · · · (x0 − xn)]
L1(x) = [(x − x0)(x − x2) · · · (x − xn)] / [(x1 − x0)(x1 − x2) · · · (x1 − xn)]
...
Ln(x) = [(x − x0)(x − x1) · · · (x − xn−1)] / [(xn − x0)(xn − x1) · · · (xn − xn−1)]
Also, note that

L0(x0) = 1, L0(x1) = L0(x2) = · · · = L0(xn) = 0
L1(x1) = 1, L1(x0) = L1(x2) = · · · = L1(xn) = 0


In general
Lk(xk) = 1 and Lk(xi) = 0, i ≠ k.

Thus
Pn (x0 ) = L0 (x0 )f0 + L1 (x0 )f1 + · · · + Ln (x0 )fn = f0

Pn (x1 ) = L0 (x1 )f0 + L1 (x1 )f1 + · · · + Ln (x1 )fn = 0 + f1 + · · · + 0 = f1


..
.
Pn (xn ) = L0 (xn )f0 + L1 (xn )f1 + · · · + Ln (xn )fn = 0 + 0 + · · · + 0 + fn = fn
That is, the polynomial Pn (x) has the property that Pn (xk ) = fk , k = 0, 1, · · · , n.
The polynomial Pn(x) defined by (2.4) is known as the Lagrange Interpolating Polynomial.

Example 2.1 Interpolate f (x) from the following table, and find an approximation to f (3):

x    f (x)
0    7
1    13
2    21
4    43

L0(x) = (x − 1)(x − 2)(x − 4) / [(−1)(−2)(−4)]
L1(x) = (x − 0)(x − 2)(x − 4) / [1 · (−1)(−3)]
L2(x) = (x − 0)(x − 1)(x − 4) / [2 · 1 · (−2)]
L3(x) = (x − 0)(x − 1)(x − 2) / [4 · 3 · 2]

P3(x) = 7L0(x) + 13L1(x) + 21L2(x) + 43L3(x).
Now, L0(3) = 1/4, L1(3) = −1, L2(3) = 3/2, L3(3) = 1/4.
So, P3(3) = 7L0(3) + 13L1(3) + 21L2(3) + 43L3(3) = 31.
Verify: Note that f (x) in this case is f (x) = x^2 + 5x + 7, and the exact value of f (x) at x = 3 is 31.
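
The Lagrange form translates directly into Python (a sketch of our own; it recomputes each Lk(u) from the product formula (2.3)):

def lagrange_interpolate(xs, fs, u):
    """Evaluate P_n(u) = sum_k f_k L_k(u) for the nodes xs."""
    total = 0.0
    for k, xk in enumerate(xs):
        Lk = 1.0
        for i, xi in enumerate(xs):
            if i != k:
                Lk *= (u - xi) / (xk - xi)   # one factor of L_k(u)
        total += fs[k] * Lk
    return total

# Example 2.1: nodes 0, 1, 2, 4 with values 7, 13, 21, 43; P3(3) = 31
print(lagrange_interpolate([0, 1, 2, 4], [7, 13, 21, 43], 3))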
Example 2.2 Given

i    xi     f (xi)
0    2      1/2
1    2.5    1/2.5
2    4      1/4

we want to find f (3).

L0(x) = (x − x1)(x − x2) / [(x0 − x1)(x0 − x2)] = (x − 2.5)(x − 4) / [(−0.5)(−2)] = (x − 2.5)(x − 4)
L1(x) = (x − x0)(x − x2) / [(x1 − x0)(x1 − x2)] = (x − 2)(x − 4) / [(0.5)(−1.5)] = −(x − 2)(x − 4)/0.75
L2(x) = (x − x0)(x − x1) / [(x2 − x0)(x2 − x1)] = (x − 2)(x − 2.5)/3

P2(x) = f (x0)L0(x) + f (x1)L1(x) + f (x2)L2(x) = (1/2)L0(x) + (1/2.5)L1(x) + (1/4)L2(x)
P2(3) = (1/2)L0(3) + (1/2.5)L1(3) + (1/4)L2(3) = (1/2)(−0.5) + (1/2.5)(1/0.75) + (1/4)(0.5/3) = 0.3250

Verify: The exact value of f (x) at x = 3 is 1/3 ≈ 0.3333.

2.4 Error in Interpolation


If f (x) is approximated by an interpolating polynomial Pn(x), we would like to obtain an expression for the error of interpolation at a given intermediate point, say x̄.
That is, we would like to calculate E(x̄) = f (x̄) − Pn(x̄).
Note that, since Pn(xi) = f (xi), we have E(xi) = 0, i = 0, 1, 2, · · · , n; that is, there is no error of interpolation at a tabulated point.
Here is a result giving the error expression E(x̄).

Theorem 2.2 (Interpolation-Error Theorem) Let Pn (x) be the interpolating polynomial that interpo-
lates at (n + 1) distinct numbers in [a, b], and let f (x) be (n + 1) times continuously differentiable on [a, b].
Then for every x̄ in [a, b], there exists a number ξ = ξ(x̄) in (a, b) (depending on x̄) such that

En(x̄) = f (x̄) − Pn(x̄) = f^(n+1)(ξ(x̄))/(n + 1)! ∏_{i=0}^{n} (x̄ − xi).    (2.5)

Proof: If x̄ is one of the numbers x0 , x1 , · · · , xn , then the result follows trivially, because the error in this case is zero, and the result holds for an arbitrary ξ.
Next, assume that x̄ is not one of the numbers x0 , x1 , · · · , xn .
Define a function g(t) in the variable t on [a, b]:
g(t) = f (t) − Pn(t) − [f (x̄) − Pn(x̄)] · [(t − x0)(t − x1) · · · (t − xn)] / [(x̄ − x0)(x̄ − x1) · · · (x̄ − xn)].    (2.6)

Then, noting that f (xk) = Pn(xk) for k = 0, 1, 2, · · · , n, we have

g(xk) = f (xk) − Pn(xk) − [f (x̄) − Pn(x̄)] · [(xk − x0) · · · (xk − xn)] / [(x̄ − x0) · · · (x̄ − xn)]
      = Pn(xk) − Pn(xk) − [f (x̄) − Pn(x̄)] × 0 = 0.    (2.7)

(Note that the numerator of the fraction appearing above contains the factor (xk − xk) = 0.)
Furthermore,
g(x̄) = f (x̄) − Pn(x̄) − [f (x̄) − Pn(x̄)] · [(x̄ − x0) · · · (x̄ − xn)] / [(x̄ − x0) · · · (x̄ − xn)]
      = f (x̄) − Pn(x̄) − f (x̄) + Pn(x̄) = 0.    (2.8)
Thus, g(t) vanishes at (n + 2) distinct points: x0 , x1 , · · · , xn , and x̄. Furthermore, g(t) is (n + 1) times continuously differentiable, since f (x) is so.
Therefore, by the generalized Rolle's theorem [ ], there exists a number ξ = ξ(x̄) in (a, b) such that g^(n+1)(ξ) = 0.
Let's compute g^(n+1)(t) now. From (2.6) we have

g^(n+1)(t) = f^(n+1)(t) − Pn^(n+1)(t) − [f (x̄) − Pn(x̄)] d^(n+1)/dt^(n+1) { [(t − x0)(t − x1) · · · (t − xn)] / [(x̄ − x0)(x̄ − x1) · · · (x̄ − xn)] }.    (2.9)

Then
d^(n+1)/dt^(n+1) { [(t − x0)(t − x1) · · · (t − xn)] / [(x̄ − x0)(x̄ − x1) · · · (x̄ − xn)] }
= 1/[(x̄ − x0)(x̄ − x1) · · · (x̄ − xn)] · d^(n+1)/dt^(n+1) [(t − x0)(t − x1) · · · (t − xn)]
= (n + 1)! / [(x̄ − x0)(x̄ − x1) · · · (x̄ − xn)]

(note that the expression being differentiated in (2.9) is a polynomial of degree (n + 1)).

Also, Pn^(n+1)(t) = 0, because Pn is a polynomial of degree at most n. Thus, Pn^(n+1)(ξ) = 0.

So,
g^(n+1)(ξ) = f^(n+1)(ξ) − (f (x̄) − Pn(x̄)) (n + 1)! / [(x̄ − x0) · · · (x̄ − xn)].    (2.10)
Since g^(n+1)(ξ) = 0, from (2.10) we have En(x̄) = f (x̄) − Pn(x̄) = f^(n+1)(ξ)/(n + 1)! (x̄ − x0) · · · (x̄ − xn).

Remark: To obtain the error of interpolation using the above theorem, we need to know the (n + 1)th derivative of f (x), or at least its maximum absolute value on the interval [a, b]. Since in practice this value is hardly ever known, this error formula is of limited practical use.

Example 2.3 Let's compute the maximum absolute error for Example 2.2. Here n = 2.

E2(x̄) = f (x̄) − P2(x̄) = f^(3)(ξ)/3! (x̄ − x0)(x̄ − x1)(x̄ − x2)

To bound E2(x̄), we need to know f^(3)(x). Let's compute it:
f (x) = 1/x, f′(x) = −1/x^2, f″(x) = 2/x^3, f^(3)(x) = −6/x^4.
So, |f^(3)(ξ)| ≤ 6/2^4 = 6/16 for 2 ≤ x ≤ 4.
Since x̄ = 3, x0 = 2, x1 = 2.5, x2 = 4, we have
|E2(x̄)| ≤ (6/16)(1/6)|(3 − 2)(3 − 2.5)(3 − 4)| = 0.0625 × 0.5 = 0.03125.
Note that in four-digit arithmetic, the difference between the value obtained by interpolation and the exact value is 0.3333 − 0.3250 = 0.0083.

2.5 Simplification of the Error Bound for Equidistant Nodes


The error formula in Theorem 2.2 can be simplified in case the tabulated points (nodes) are equally spaced, because in this case it is possible to obtain a nice bound for the expression (x̄ − x0)(x̄ − x1) · · · (x̄ − xn).
Suppose the nodes are equally spaced with spacing h; that is, xi+1 − xi = h, with a = x0 and b = xn.
Then it can be shown [Exercise] that
|(x̄ − x0)(x̄ − x1) · · · (x̄ − xn)| ≤ (h^(n+1)/4) n!
If we also assume that |f^(n+1)(x)| ≤ M, then we have
|En(x̄)| = |f (x̄) − Pn(x̄)| ≤ M/(n + 1)! · (h^(n+1)/4) n! = M h^(n+1) / (4(n + 1)).    (2.11)

Example 2.4 Suppose a table of values of f (x) = cos x is to be prepared on [0, 2π] with equally spaced nodes of spacing h, to be used with linear interpolation, with an error of interpolation of at most 5 × 10^−8. How small should h be?
Here n = 1.
f (x) = cos x, f′(x) = −sin x, f″(x) = −cos x
max |f″(x)| = 1 for 0 ≤ x ≤ 2π.
Thus M = 1.

So, by (2.11) above we have

|E1(x̄)| = |f (x̄) − P1(x̄)| ≤ h^2/8.
Since the maximum error has to be at most 5 × 10^−8, we must have:
h^2/8 ≤ 5 × 10^−8; that is, h ≤ 6.3246 × 10^−4.

Example 2.5 Suppose a table is to be prepared for the function f(x) = √x on [1, 2]. Determine the
spacing h in such a table so that interpolation with a polynomial of degree 2 will give an accuracy of
ε = 5 × 10⁻⁸.

We first compute the maximum absolute error.

Since f^(3)(x) = (3/8) x^(−5/2), we have

M = max |f^(3)(x)| ≤ 3/8 for 1 ≤ x ≤ 2.

Thus, taking n = 2 in (2.11), the maximum (absolute) error is (3/8) · h³/(4 · 3) = h³/32.

Thus, to have an accuracy of ε = 5 × 10⁻⁸, we must have h³/32 ≤ 5 × 10⁻⁸, or h³ ≤ 160 × 10⁻⁸.

This means that a spacing of about h = (160 × 10⁻⁸)^(1/3) = 0.0117 will be needed in the table to guarantee
the accuracy of 5 × 10⁻⁸.
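As a sanity check on both examples, here is a small Python sketch (ours, not part of the notes) that
simply solves the bound (2.11) for h:

eps = 5e-8

# Example 2.4: f = cos x, linear interpolation (n = 1), M = max|f''| = 1.
h_cos = (eps * 4 * (1 + 1) / 1.0) ** 0.5
print(h_cos)        # about 6.3246e-04

# Example 2.5: f = sqrt(x) on [1, 2], quadratic interpolation (n = 2), M = 3/8.
h_sqrt = (eps * 4 * (2 + 1) / (3.0 / 8.0)) ** (1.0 / 3.0)
print(h_sqrt)       # about 0.0117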

2.6 Divided Differences and the Newton-Interpolation Formula


A major difficulty with Lagrange interpolation is that one is not sure in advance about the degree of
interpolating polynomial needed to achieve a certain accuracy. Thus, if the accuracy is not good enough
with a polynomial of a certain degree, one needs to increase the degree, and the computations must be
started all over again.

Furthermore, computing the various Lagrange polynomials is an expensive procedure. It is, indeed, desirable
to have a formula which makes use of Pk−1(x) in computing Pk(x).
The following form of interpolation, known as Newton's interpolation, allows us to do so.
The idea is to obtain the interpolating polynomial Pn (x) in the following form:

Pn (x) = a0 + a1 (x − x0 ) + a2 (x − x0 )(x − x1 ) + · · · + an (x − x0 )(x − x1 ) · · · (x − xn−1 ) (2.12)


The constants a0 through an can be determined as follows:
For x = x0:  Pn(x0) = a0 = f0.
For x = x1:  Pn(x1) = a0 + a1(x1 − x0) = f1,
which gives

a1 = (f1 − a0)/(x1 − x0) = (f1 − f0)/(x1 − x0).
The other numbers ai, i = 2, ..., n can be obtained similarly. It is convenient at this point to introduce
the following notation, since we will show how the numbers a0, ..., an can be expressed in terms of it:

f[xi] = f(xi)   and   f[xi, xi+1] = (f[xi+1] − f[xi])/(xi+1 − xi).

Similarly, for higher orders,

f[xi, xi+1, ..., xi+k−1, xi+k] = (f[xi+1, ..., xi+k−1, xi+k] − f[xi, xi+1, ..., xi+k−1])/(xi+k − xi).
With these notations, we then have

a0 = f0 = f(x0) = f[x0],
a1 = (f1 − f0)/(x1 − x0) = (f(x1) − f(x0))/(x1 − x0) = (f[x1] − f[x0])/(x1 − x0) = f[x0, x1].
Continuing this, it can be shown that [Exercise]

ak = f [x0 , x1 , · · · , xk ]. (2.13)

The number f [x0 , x1 , · · · , xk ] is called the k-th divided difference.

Substituting these expressions for ak from (2.13) into (2.12), the interpolating polynomial Pn(x) can now
be written in terms of the divided differences:

Pn(x) = f[x0] + f[x0, x1](x − x0) + f[x0, x1, x2](x − x0)(x − x1) + ··· +
        f[x0, x1, ..., xn](x − x0)(x − x1)···(x − xn−1).   (2.14)

Notes:
(i) Each divided difference can be obtained from two previous ones of lower order.
For example, f[x0, x1, x2] can be computed from f[x0, x1] and f[x1, x2], and so on. Indeed, the divided
differences can be arranged in the form of a table, as shown below:

Table of Divided Differences (n = 4)

x    f(x)   1st Divided Diff.   2nd Divided Diff.   3rd Divided Diff.     4th Divided Diff.
x0   f0
            f[x0, x1]
x1   f1                         f[x0, x1, x2]
            f[x1, x2]                               f[x0, x1, x2, x3]
x2   f2                         f[x1, x2, x3]                             f[x0, x1, x2, x3, x4]
            f[x2, x3]                               f[x1, x2, x3, x4]
x3   f3                         f[x2, x3, x4]
            f[x3, x4]
x4   f4

(ii) Note that in computing Pn (x) we need only the diagonal entries of the above table; that is, we need
only f [x0 ], f [x0 , x1 ], · · · , f [x0 , x1 , · · · , xn ].
(iii) Since the divided differences are generated recursively, the interpolating polynomials of successively
higher degrees can also be generated recursively. Thus the work done previously can be used
gainfully.
For example,
P1(x) = f[x0] + f[x0, x1](x − x0)
P2(x) = f[x0] + f[x0, x1](x − x0) + f[x0, x1, x2](x − x0)(x − x1)
      = P1(x) + f[x0, x1, x2](x − x0)(x − x1).

Similarly, P3 (x) = P2 (x) + f [x0 , x1 , x2 , x3 ](x − x0 )(x − x1 )(x − x2 )


P4 (x) = P3 (x) + f [x0 , x1 , x2 , x3 , x4 ](x − x0 )(x − x1 )(x − x2 )(x − x3 )
and so on.
Thus, in computing P2 (x), P1 (x) has been gainfully used; in computing P3 (x), P2 (x) has been gainfully
used, etc.
Example 2.6 Interpolate at x = 2.5 using the following table, with polynomials of degree 3 and 4.

i   xi    fi        1st diff.   2nd diff.    3rd diff.    4th diff.   5th diff.
0   1.0   0
                    0.3522
1   1.5   0.17609               −0.1023
                    0.2499                   0.0265533
2   2.0   0.30103               −0.0491933                −0.006408
                    0.1761                   0.0105333                0.001412
3   3.0   0.47712               −0.0281267                −0.002172
                    0.1339                   0.0051033
4   3.5   0.54407               −0.01792
                    0.11598
5   4.0   0.60206
From (2.14), with n = 3, we have

P3(2.5) = 0 + (2.5 − 1.0)(0.3522) + (2.5 − 1.0)(2.5 − 1.5)(−0.1023)
          + (2.5 − 1.0)(2.5 − 1.5)(2.5 − 2.0)(0.0265533)
        = 0.52827 − 0.15345 + 0.019915
        = 0.394735.

Similarly, with n = 4, we have

P4(2.5) = P3(2.5) + (2.5 − 1.0)(2.5 − 1.5)(2.5 − 2.0)(2.5 − 3.0)(−0.006408)
        = 0.394735 + 0.002403 = 0.397138.

Note that P4(2.5) − P3(2.5) = 0.002403.

Verification. The above is a table for log10(x). The exact value of log10(2.5) (correct up to 5 decimal
places) is 0.39794.

(Note that in computing P4(2.5), the previously computed P3(2.5) has been gainfully used.)

If we use the notation Di,j = f[xi−j, ..., xi], the divided differences can be generated as follows.

Algorithm 2.1 Algorithm for Generating Divided Differences

Inputs: The distinct numbers x0, x1, ..., xn and the values f0, f1, ..., fn.

Outputs: The divided differences D0,0, D1,1, ..., Dn,n, where Dk,k = f[x0, x1, ..., xk].

Step 1: (Initialization). Set

        Di,0 = fi, i = 0, 1, 2, ..., n.

Step 2: For i = 1, 2, ..., n do
            For j = 1, 2, ..., i do
                Di,j = (Di,j−1 − Di−1,j−1)/(xi − xi−j)
End
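Below is one possible Python realization of Algorithm 2.1, together with a nested-multiplication
evaluation of the Newton form (2.14). The function names are ours, not the notes'; run on the data of
Example 2.6, it reproduces P3(2.5) ≈ 0.3947 and P4(2.5) ≈ 0.3971.

import numpy as np

def divided_differences(x, f):
    """Build the divided-difference table of Algorithm 2.1.

    Returns D, where D[i, j] = f[x_{i-j}, ..., x_i]; the diagonal
    D[k, k] = f[x_0, ..., x_k] gives the Newton coefficients a_k."""
    n = len(x)
    D = np.zeros((n, n))
    D[:, 0] = f                      # Step 1: D[i, 0] = f_i
    for j in range(1, n):            # Step 2: fill column by column
        for i in range(j, n):
            D[i, j] = (D[i, j - 1] - D[i - 1, j - 1]) / (x[i] - x[i - j])
    return D

def newton_eval(x, coef, t):
    """Evaluate (2.14) by nested multiplication (Horner-like)."""
    p = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        p = p * (t - x[k]) + coef[k]
    return p

# Data of Example 2.6 (a table of log10 x).
xs = np.array([1.0, 1.5, 2.0, 3.0, 3.5, 4.0])
fs = np.log10(xs)
a = np.diag(divided_differences(xs, fs))   # coefficients f[x0, ..., xk]

print(newton_eval(xs[:4], a[:4], 2.5))     # P3(2.5), about 0.3947
print(newton_eval(xs[:5], a[:5], 2.5))     # P4(2.5), about 0.3971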
A Relationship Between nth Divided Difference and the nth Derivative
The following theorem shows how the nth derivative of a function f(x) is related to the nth divided
difference. The proof is omitted; it can be found in any advanced numerical analysis textbook (e.g.,
Atkinson (1978), p. 144).

Theorem 2.3 Suppose f is n times continuously differentiable and x0, x1, ..., xn are (n + 1) distinct
numbers in [a, b]. Then there exists a number ξ in (a, b) such that

f[x0, x1, ..., xn] = f^(n)(ξ)/n!.
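As a quick numerical illustration of Theorem 2.3 (a sketch with f(x) = e^x and three nodes of our
choosing), the second divided difference should equal f″(ξ)/2! = e^ξ/2 for some ξ inside the node
interval:

import math

# Nodes 0, 0.1, 0.2 (n = 2) for f(x) = e^x.
x = [0.0, 0.1, 0.2]
f = [math.exp(t) for t in x]
f01 = (f[1] - f[0]) / (x[1] - x[0])
f12 = (f[2] - f[1]) / (x[2] - x[1])
f012 = (f12 - f01) / (x[2] - x[0])

# Solve e^xi / 2 = f[x0, x1, x2] for xi; it should lie in (0, 0.2).
xi = math.log(2 * f012)
print(f012, xi)      # xi is about 0.1, indeed inside (0, 0.2)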

The Newton Interpolation with Equally Spaced Nodes
Suppose again that x0, ..., xn are equally spaced with spacing h; that is, xi+1 − xi = h,
i = 0, 1, 2, ..., n − 1. Write x − x0 = sh. Then

x − x0 = sh
x − x1 = x − x0 + x0 − x1 = (x − x0) − (x1 − x0) = sh − h = (s − 1)h,

and in general, x − xi = (s − i)h. So,

Pn(x) = Pn(x0 + sh) = f[x0] + f[x0, x1](x − x0) + ··· + f[x0, x1, ..., xn](x − x0)···(x − xn−1)
      = f[x0] + sh f[x0, x1] + s(s − 1)h² f[x0, x1, x2] + ··· + s(s − 1)···(s − n + 1)h^n f[x0, ..., xn]

      = Σ (k = 0 to n) s(s − 1)···(s − k + 1) h^k f[x0, ..., xk].   (2.15)

Invoke now the binomial-coefficient notation

C(s, k) = s(s − 1)···(s − k + 1)/k!

We can then write

Pn(x) = Pn(x0 + sh) = Σ (k = 0 to n) C(s, k) k! h^k f[x0, ..., xk].   (2.16)

The Newton Forward-Difference Formula


Let's introduce the notation

Δfi = f(xi+1) − f(xi).

Then Δf0 = f(x1) − f(x0). So,

f[x0, x1] = (f(x1) − f(x0))/(x1 − x0) = Δf0/h.   (2.17)
Also, note that

Δ²f0 = Δ(Δf0) = Δ(f(x1) − f(x0))
     = Δf(x1) − Δf(x0)
     = f(x2) − f(x1) − (f(x1) − f(x0))                                   (2.18)
     = f(x2) − 2f(x1) + f(x0).

So,

f[x0, x1, x2] = (f[x1, x2] − f[x0, x1])/(x2 − x0)
              = [ (f(x2) − f(x1))/(x2 − x1) − (f(x1) − f(x0))/(x1 − x0) ] / (x2 − x0)   (2.19)
              = (f(x2) − 2f(x1) + f(x0))/(h · 2h) = Δ²f0/(2h²).
In general, we have

Theorem 2.4

f[x0, x1, ..., xk] = Δ^k f0/(k! h^k).   (2.20)

The proof is by induction on k.

For k = 0, the result is trivially true. We have also proved the result for k = 1 and k = 2. Assume now
that the result is true for k = m. We then need to show that the result is also true for k = m + 1. Now,

f[x0, ..., xm+1] = (f[x1, ..., xm+1] − f[x0, ..., xm])/(xm+1 − x0)
                 = (Δ^m f1/(m! h^m) − Δ^m f0/(m! h^m))/((m + 1)h)
                 = Δ^m(f1 − f0)/(m!(m + 1)h^(m+1))
                 = Δ^(m+1) f0/((m + 1)! h^(m+1)).

The numbers Δ^k fi are called the kth-order forward differences of f at x = xi.


We now show how the interpolating polynomial Pn(x) given by (2.16) can be written using forward
differences:

Pn(x) = Pn(x0 + sh) = f[x0] + sh f[x0, x1] + s(s − 1)h² f[x0, x1, x2]
                      + ··· + s(s − 1)···(s − n + 1)h^n f[x0, x1, ..., xn]

      = Σ (k = 0 to n) s(s − 1)···(s − k + 1) h^k f[x0, x1, ..., xk]

      = Σ (k = 0 to n) C(s, k) k! h^k f[x0, x1, ..., xk] = Σ (k = 0 to n) C(s, k) Δ^k f0,  using (2.20).

Newton's Forward-Difference Interpolation Formula

Let x0, x1, ..., xn be (n + 1) equidistant points with spacing h; that is, xi+1 − xi = h. Then
Newton's interpolating polynomial Pn(x) of degree at most n, using the forward differences
Δ^k f0, k = 0, 1, 2, ..., n, is given by

Pn(x) = Σ (k = 0 to n) C(s, k) Δ^k f0,   where s = (x − x0)/h.

Illustration: The Forward Difference Table with n = 4

x    f(x)   Δf     Δ²f     Δ³f     Δ⁴f
x0   f0
            Δf0
x1   f1            Δ²f0
            Δf1            Δ³f0
x2   f2            Δ²f1            Δ⁴f0
            Δf2            Δ³f1
x3   f3            Δ²f2
            Δf3
x4   f4

Example 2.7 Let f (x) = ex .

x   f         Δf        Δ²f       Δ³f        Δ⁴f
0   1
              1.7183
1   2.7183              2.9525
              4.6708               5.0731
2   7.3891              8.0256                8.7177
              12.6964              13.7908
3   20.0855             21.8163
              34.5127
4   54.5982
Let x = 1.5. Then s = (x − x0)/h = (1.5 − 0)/1 = 1.5, and

P4(1.5) = C(1.5, 0)·1 + C(1.5, 1)(1.7183) + C(1.5, 2)(2.9525) + C(1.5, 3)(5.0731) + C(1.5, 4)(8.7177)
        = 1 + 1.5(1.7183) + 0.3750(2.9525) + (−0.0625)(5.0731) + (0.0234)(8.7177) ≈ 4.5718.

The correct answer up to 4 decimal digits is e^1.5 = 4.4817.
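The value above can be checked mechanically with the following sketch (ours; numpy assumed), which
builds the forward differences of e^x and accumulates the coefficients C(s, k) recursively:

import math
import numpy as np

fs = np.exp(np.arange(5.0))      # f = e^x at x = 0, 1, 2, 3, 4 (h = 1)
s = 1.5                          # s = (x - x0)/h for x = 1.5

# Forward differences Delta^k f0, k = 0, ..., 4.
deltas, d = [fs[0]], fs.copy()
for _ in range(4):
    d = np.diff(d)
    deltas.append(d[0])

# P4(1.5) = sum over k of C(s, k) * Delta^k f0.
p, binom = 0.0, 1.0
for k, dk in enumerate(deltas):
    p += binom * dk
    binom *= (s - k) / (k + 1)   # C(s, k+1) = C(s, k) * (s - k)/(k + 1)

print(p, math.exp(1.5))          # about 4.5718 versus 4.4817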

Interpolation using Newton-Backward Differences


While interpolating at some value of x near the end of the difference table, it is logical to reorder the nodes
so that the end-differences can be used in computation. The backward differences allow us to do so.

The backward differences are defined by

∇fn = fn − fn−1 , n = 1, 2, 3, . . . ,

and

∇k fn = ∇(∇k−1 fn ), k = 2, 3, . . .

Thus, ∇2 fn = ∇(∇fn ) = ∇(fn − fn−1 )


= fn − fn−1 − (fn−1 − fn−2 )
= fn − 2fn−1 + fn−2 ,
and so on.
The following relationship between the backward differences and the divided differences can be obtained
[Exercise]:

f[xn−k, ..., xn−1, xn] = ∇^k fn/(k! h^k).

Using these backward differences, we can write the Newton interpolation formula as:

Pn(x) = fn + s∇fn + (s(s + 1)/2!)∇²fn + ··· + (s(s + 1)···(s + n − 1)/n!)∇^n fn,

where s = (x − xn)/h. Again, using the notation

C(−s, k) = (−s)(−s − 1)···(−s − k + 1)/k!,

we can rewrite the above formula as:

Newton's Backward-Difference Interpolation Formula

Pn(x) = Σ (k = 0 to n) (−1)^k C(−s, k) ∇^k fn.
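A mirror-image sketch for the backward-difference formula (again ours, not from the notes),
interpolating e^x near the right end of the table with s = (x − xn)/h:

import numpy as np

xs = np.arange(5.0)              # nodes 0, 1, 2, 3, 4 (h = 1)
fs = np.exp(xs)
x = 3.5
s = (x - xs[-1]) / 1.0           # s = (x - xn)/h = -0.5

# Backward differences nabla^k f_n: last entry of each difference column.
nablas, d = [fs[-1]], fs.copy()
for _ in range(4):
    d = np.diff(d)
    nablas.append(d[-1])

# Pn(x) = f_n + s*nabla f_n + s(s+1)/2! * nabla^2 f_n + ...
p, coef = 0.0, 1.0
for k, nk in enumerate(nablas):
    p += coef * nk
    coef *= (s + k) / (k + 1)    # next factor s(s+1)...(s+k)/(k+1)!

print(p, np.exp(3.5))            # about 33.41 versus e^3.5 = 33.12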

2.7 Spline-Interpolation
So far we have been considering interpolation by means of a single polynomial over the entire range.
Let's now consider interpolation using different polynomials (but of the same degree) on different
subintervals.

Let the function f(x) be defined at the nodes a = x0, x1, x2, ..., xn = b. The problem now is to construct
piecewise polynomials Sj(x) on each interval [xj, xj+1], j = 0, 1, 2, ..., n − 1, so that the resulting
function S(x) is an interpolant for the function f(x).

The simplest such polynomials are, of course, linear polynomials (straight lines). The interpolant in this
case is called a linear spline. The two main disadvantages of a linear spline are that (i) its convergence
is rather slow, and (ii) it is not suitable for applications demanding smooth approximations, since linear
splines have corners at the knots.

"Just imagine a cross section of an airplane wing in the form of a linear spline and you quickly decide to
go by rail." (Stewart (1998), p. 93.)

Likewise, quadratic splines also have certain disadvantages.
The most common and widely used splines are cubic splines. Assume that the cubic polynomial Sj(x) has the
following form:

Sj(x) = aj + bj(x − xj) + cj(x − xj)² + dj(x − xj)³.
Since Sj (x) contains four unknowns, to construct n cubic polynomials S0 (x), S1 (x), ..., Sn−1 (x), we need 4n
conditions. To have these 4n conditions, a cubic spline S(x) for the function f (x) can be conveniently defined
as follows:
Cubic Spline Interpolant

A function S(x), denoted by Sj (x) over the interval [xj , xj+1 ], j = 0, 1, ..., n − 1 is called a cubic
spline interpolant if the following conditions hold:

(i) Sj(xj) = fj, j = 0, 1, 2, ..., n.

(ii) Sj+1(xj+1) = fj+1, j = 0, 1, 2, ..., n − 2.

(iii) S′j+1(xj+1) = S′j(xj+1), j = 0, 1, 2, ..., n − 2.

(iv) S″j+1(xj+1) = S″j(xj+1), j = 0, 1, 2, ..., n − 2.

Then, since for each j there are four unknowns aj, bj, cj, and dj, and there are n such polynomials to be
determined, there are 4n unknowns altogether. However, conditions (i)-(iv) above give only (4n − 2)
equations: (i) gives n + 1, and each of (ii)-(iv) gives n − 1. So, to completely determine the cubic spline,
we need two more equations. To obtain these two additional equations, we can invoke boundary conditions.

• If the second derivative of S(x) can be approximated, then we can take: (i) S″(x0) = S″(xn) = 0
(free or natural boundary).
• On the other hand, if only the first derivative can be estimated, we can use the following boundary
condition: (ii) S′(x0) = f′(x0) and S′(xn) = f′(xn) (clamped boundary).
• We will discuss only the clamped cubic spline here.
From the condition (i), it immediately follows that

aj = fj , j = 0, 1, 2, ..., n.

Set hj = xj+1 − xj. With some mathematical manipulations, it can then be shown (see, e.g., Burden and
Faires, pp. 144-145) that once the aj's are known, the coefficients cj, j = 0, 1, 2, ..., n can be found by
solving the following linear algebraic system of equations:

Ax = r,

where A is the (n + 1) × (n + 1) symmetric tridiagonal matrix

        2h0    h0
        h0     2(h0 + h1)    h1
A =            h1            2(h1 + h2)    h2
                   ...            ...              ...
                              hn−2          2(hn−2 + hn−1)    hn−1
                                            hn−1              2hn−1

(all other entries are zero), x = (c0, c1, ..., cn)ᵀ, and

        (3/h0)(a1 − a0) − 3f′(a)
        (3/h1)(a2 − a1) − (3/h0)(a1 − a0)
r =     ...
        3f′(b) − (3/hn−1)(an − an−1).
Once c0 , c1 , ..., cn are known by solving the above system of equations, the quantities bj and dj can be
computed as follows:

bj = (aj+1 − aj)/hj − hj(cj+1 + 2cj)/3,   j = n − 1, n − 2, ..., 0,

dj = (cj+1 − cj)/(3hj),   j = n − 1, n − 2, ..., 0.

Note:
The matrix A is a symmetric, strictly diagonally dominant matrix, and the system can be solved in O(n)
flops, as described before. (See the book Numerical Linear Algebra and Applications, by Biswa Nath Datta,
Brooks/Cole Publishing Co., Pacific Grove, California, 1995.)

Algorithm 2.2 Computing the Clamped Cubic Spline

Inputs:
(i) The nodes x0, x1, ..., xn.
(ii) The functional values f(xi) = fi, i = 0, 1, ..., n. (Note that a = x0 and b = xn.)
(iii) f′(x0) = f′(a) and f′(xn) = f′(b).

Outputs: The coefficients a0, ..., an; b0, ..., bn−1; c0, c1, ..., cn; and d0, d1, ..., dn−1 of the n
polynomials S0(x), S1(x), ..., Sn−1(x) of which the cubic spline interpolant S(x) is composed.

Step 1. For i = 0, 1, ..., n − 1 do
            hi = xi+1 − xi
        End
Step 2. For i = 0, 1, ..., n do
            ai = f(xi)
        End
Step 3. Compute the coefficients c0, c1, ..., cn by solving the system Ax = r, where A, x, and r are as
        defined above.
Step 4. Compute the coefficients b0, ..., bn−1 and d0, d1, ..., dn−1 as given in the box above.
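The following Python sketch is one way to realize Algorithm 2.2. It is illustrative only: it uses a
dense numpy solve where the notes recommend an O(n) tridiagonal solve, and the test function
f(x) = sin x with 11 nodes is our choice, not the notes'.

import numpy as np

def clamped_cubic_spline(x, f, fpa, fpb):
    """Coefficients (a, b, c, d) of a clamped cubic spline, per Algorithm 2.2."""
    n = len(x) - 1
    h = np.diff(x)                     # Step 1: h_i = x_{i+1} - x_i
    a = np.asarray(f, dtype=float)     # Step 2: a_i = f_i

    # Step 3: assemble and solve the (n+1) x (n+1) system A c = r.
    A = np.zeros((n + 1, n + 1))
    r = np.zeros(n + 1)
    A[0, 0], A[0, 1] = 2 * h[0], h[0]
    r[0] = 3 * (a[1] - a[0]) / h[0] - 3 * fpa
    for i in range(1, n):
        A[i, i - 1 : i + 2] = h[i - 1], 2 * (h[i - 1] + h[i]), h[i]
        r[i] = 3 * (a[i + 1] - a[i]) / h[i] - 3 * (a[i] - a[i - 1]) / h[i - 1]
    A[n, n - 1], A[n, n] = h[n - 1], 2 * h[n - 1]
    r[n] = 3 * fpb - 3 * (a[n] - a[n - 1]) / h[n - 1]
    c = np.linalg.solve(A, r)          # dense solve for brevity

    # Step 4: recover b_j and d_j from the c_j.
    b = (a[1:] - a[:-1]) / h - h * (c[1:] + 2 * c[:-1]) / 3
    d = (c[1:] - c[:-1]) / (3 * h)
    return a, b, c, d

# Quick check on f(x) = sin x over [0, pi] with clamped slopes 1 and -1.
xs = np.linspace(0, np.pi, 11)
a, b, c, d = clamped_cubic_spline(xs, np.sin(xs), 1.0, -1.0)
j = 4                                  # evaluate S_4 at the midpoint of [x4, x5]
t = (xs[j] + xs[j + 1]) / 2
s_val = a[j] + b[j]*(t - xs[j]) + c[j]*(t - xs[j])**2 + d[j]*(t - xs[j])**3
print(s_val, np.sin(t))                # spline value versus exact value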

Exercises

1. Prove by induction that

   d^(n+1)/dt^(n+1) [ Π (i = 0 to n) (t − xi) / Π (i = 0 to n) (x − xi) ] = (n + 1)! / Π (i = 0 to n) (x − xi).

2. Consider the following table


x f (x)
1 0
1.01 0.01
1.02 0.0198
1.03 0.0296

(a) Find an approximation of f (1.025) using


i. The associated Vandermonde linear system
ii. Lagrangian interpolating polynomial of degree 3
iii. Newton interpolating polynomial of degree 3
(b) Compute the maximum error in interpolation using the interpolating error formula.
(c) Compare the error made in each case (i)-(iii) with the maximum error.

3. Suppose that a table is to be prepared for the function f(x) = ln x on [1, 3.5] with equally spaced
   nodes such that interpolation with a third-degree polynomial will give an accuracy of ε = 5 × 10⁻⁸.
   Determine how small h has to be to guarantee this accuracy.

4. Prove by induction that

   (a) f[x0, x1, ..., xk] = Δ^k f0/(k! h^k).

   (b) ak = f[x0, x1, ..., xk], where the ak's are the coefficients of the Newton interpolating polynomial.

5. (a) Find an approximate value of log10 (5) using Newton’s forward difference formula with x0 =
1, x1 = 2.5, x2 = 4, and x3 = 5.5.

(b) Repeat part (a) using Newton’s backward differences.

(c) Compare the results.

6. Using the data of problem 2, estimate f (1.025) with linear and quadratic splines and compare your
results with those obtained there.

7. Show that f [x0 , x1 , · · · , xn ] does not change if the nodes are rearranged.

8. Let ψn(x) = (x − x0)(x − x1)···(x − xn), and let xi+1 − xi = h, i = 0, 1, ..., n − 1. Prove that

   (a) max |ψ1(x)| over x0 ≤ x ≤ x1 is h²/4.

   (b) max |ψ2(x)| over x0 ≤ x ≤ x2 is (2√3/9) h³.

       [Hint: To bound |ψ2(x)|, shift the polynomial along the x-axis: consider
       ψ̂2(x) = (x + h)x(x − h) and obtain a bound for ψ̂2(x).]

   (c) max |ψ3(x)| over x0 ≤ x ≤ x3 is h⁴, while
       max |ψ3(x)| over x1 ≤ x ≤ x2 is (9/16) h⁴.

   (d) Using the results of (a), (b), and (c), obtain the maximum absolute error in each case for the
       functions:

       f(x) = e^x, 0 ≤ x ≤ 2
       f(x) = sin x, 0 ≤ x ≤ π/2

   (e) The interpretation of the result of Part (c) is that if the interpolation nodes are chosen so that
       the interpolation point x lies close to the midpoint of [x0, x3], then the interpolation error is
       smaller than if x lies anywhere else between x0 and x3. Verify this statement with the data of
       Problem #2.
