Rootfinding
f(x) = 0        (7.1)
We will explore some simple numerical methods for solving this equation, and also consider some possible difficulties.
Assume f ∈ C[a, b] and f(a)f(b) < 0.        (7.2)
Then f(x) changes sign on [a, b], and f(x) = 0 has at least one root on the interval.
Definition
The simplest numerical procedure for finding a root is to repeatedly halve the
interval [a, b], keeping the half for which f (x) changes sign. This procedure is
called the bisection method, and is guaranteed to converge to a root,
denoted here by α.
Suppose that we are given an interval [a, b] satisfying (7.2) and an error
tolerance ε > 0.
The bisection method consists of the following steps:
B1. Define c = (a + b)/2.
B2. If b − c ≤ ε, then accept c as the root and stop.
B3. If sign[f(b)] · sign[f(c)] ≤ 0, then set a := c; otherwise set b := c. Return to step B1.
The interval [a, b] is halved with each loop through steps B1 to B3. The test B2 will be satisfied eventually, and with it the condition |α − c| ≤ ε will be satisfied.
Notice that in step B3 we test sign[f(b)] · sign[f(c)] rather than the product f(b)f(c), in order to avoid the possibility of underflow or overflow in the multiplication of f(b) and f(c).
Example
Find the largest root of
f(x) ≡ x⁶ − x − 1 = 0        (7.3)
Use bisect.m
n     a        b        c        b − c      f(c)
1    1.0000   2.0000   1.5000   0.5000     8.8906
2    1.0000   1.5000   1.2500   0.2500     1.5647
3    1.0000   1.2500   1.1250   0.1250    -0.0977
4    1.1250   1.2500   1.1875   0.0625     0.6167
5    1.1250   1.1875   1.1562   0.0312     0.2333
6    1.1250   1.1562   1.1406   0.0156     0.0616
7    1.1250   1.1406   1.1328   0.0078    -0.0196
8    1.1328   1.1406   1.1367   0.0039     0.0206
9    1.1328   1.1367   1.1348   0.0020     0.0004
10   1.1328   1.1348   1.1338   0.00098   -0.0096
The number of iterations n needed to guarantee |α − c| ≤ ε satisfies
n ≥ log((b − a)/ε) / log 2.
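The loop B1–B3 can be sketched in a few lines (a minimal Python stand-in for the course's bisect.m; the function and interval are those of the example above):

```python
import math

def bisect(f, a, b, eps):
    """Bisection: repeatedly halve [a, b], keeping the half on which f changes sign."""
    if f(a) * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while True:
        c = (a + b) / 2                     # step B1
        if b - c <= eps:                    # step B2: then |alpha - c| <= eps
            return c
        if math.copysign(1, f(b)) * math.copysign(1, f(c)) <= 0:
            a = c                           # step B3: sign change on [c, b]
        else:
            b = c                           # sign change on [a, c]

root = bisect(lambda x: x**6 - x - 1, 1.0, 2.0, 1e-6)
```

Note the sign test is done on sign[f(b)]·sign[f(c)], as in step B3, rather than on the product f(b)f(c).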
The line tangent to the graph of y = f(x) at (x_0, f(x_0)) is the graph of the linear Taylor polynomial
p_1(x) = f(x_0) + f′(x_0)(x − x_0).
The root of p_1(x) = 0 defines the next iterate, i.e.,
x_1 = x_0 − f(x_0)/f′(x_0),
and, repeating with x_1 in place of x_0,
x_2 = x_1 − f(x_1)/f′(x_1).
Example
Using Newton’s method, solve (7.3) used earlier for the bisection method.
Here
f(x) = x⁶ − x − 1,  f′(x) = 6x⁵ − 1,
and the iteration is
x_{n+1} = x_n − (x_n⁶ − x_n − 1) / (6x_n⁵ − 1),  n ≥ 0.        (7.7)
The true root is α ≈ 1.134724138, and x_6 = α to nine significant digits.
Newton’s method may converge slowly at first. However, as the iterates come
closer to the root, the speed of convergence increases.
Use newton.m
Compare these results with the results for the bisection method.
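Iteration (7.7) is easy to reproduce (a minimal Python sketch standing in for newton.m; the stopping test uses the standard error estimate α − x_n ≈ x_{n+1} − x_n):

```python
def newton(f, fprime, x0, eps, itmax=100):
    """Newton's method, stopping when the last correction is below eps."""
    x = x0
    for _ in range(itmax):
        dx = f(x) / fprime(x)
        x = x - dx
        if abs(dx) <= eps:      # error estimate: alpha - x_n ~ x_{n+1} - x_n
            return x
    raise RuntimeError("Newton's method did not converge")

root = newton(lambda x: x**6 - x - 1, lambda x: 6 * x**5 - 1, 2.0, 1e-10)
```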
Example
One way to compute a/b on early computers (which had hardware arithmetic only for addition, subtraction, and multiplication) was to multiply a by 1/b, with 1/b approximated by Newton's method. Apply Newton's method to
f(x) ≡ b − 1/x = 0,
where we assume b > 0. The root is α = 1/b, the derivative is
f′(x) = 1/x²,
and Newton's method is given by
x_{n+1} = x_n − (b − 1/x_n) / (1/x_n²),
i.e.,
x_{n+1} = x_n(2 − b·x_n),  n ≥ 0.        (7.8)
Setting r_n = 1 − b·x_n, the iteration gives r_{n+1} = r_n², so the method converges if and only if |1 − b·x_0| < 1, equivalently
0 < x_0 < 2/b.        (7.10)
3. Rootfinding Math 1070
1
The iteration (7.8), xn+1 = xn (2 − bxn ), n ≥ 0, converges to α = b if and
only if the initial guess x0 satisfies
0 < x0 < 2b
1
Figure: The iterative solution of b − x =0
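A quick numerical check of iteration (7.8) and the condition 0 < x_0 < 2/b (a sketch; b = 3 and x_0 = 0.5 are arbitrary test values, not from the text):

```python
def reciprocal(b, x0, n=10):
    """Approximate 1/b with the division-free iteration x_{k+1} = x_k * (2 - b * x_k)."""
    x = x0
    for _ in range(n):
        x = x * (2 - b * x)
    return x

# any x0 in (0, 2/b) works; the residual 1 - b*x squares at each step
approx = reciprocal(3.0, 0.5)
```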
Error analysis
Assume f has two continuous derivatives near the root α, and that
f′(α) ≠ 0,        (7.12)
i.e., the graph of y = f(x) is not tangent to the x-axis when the graph intersects it at x = α. The case in which f′(α) = 0 is treated in Section 3.5.
Note that combining (7.12) with the continuity of f′(x) implies that f′(x) ≠ 0 for all x near α.
By Taylor's theorem,
f(α) = f(x_n) + (α − x_n)f′(x_n) + ½(α − x_n)² f″(c_n),
with c_n an unknown point between α and x_n. Note that f(α) = 0 by assumption; dividing by f′(x_n) gives
0 = f(x_n)/f′(x_n) + (α − x_n) + (α − x_n)² f″(c_n)/(2f′(x_n)).
Since x_{n+1} = x_n − f(x_n)/f′(x_n), this yields
α − x_{n+1} = (α − x_n)² · [−f″(c_n) / (2f′(x_n))].        (7.13)
This formula says that the error in xn+1 is nearly proportional to the
square of the error in xn .
When the initial error is sufficiently small, this shows that the error in
the succeeding iterates will decrease very rapidly, just as in (7.11).
Example
For the earlier iteration (7.7), i.e., x_{n+1} = x_n − (x_n⁶ − x_n − 1)/(6x_n⁵ − 1), n ≥ 0, we have f″(x) = 30x⁴. If we are near the root α, then
−f″(c_n)/(2f′(c_n)) ≈ −f″(α)/(2f′(α)) = −30α⁴/(2(6α⁵ − 1)) ≈ −2.42 ≡ M.
Thus,
α − x_{n+1} ≈ M(α − x_n)²,  n ≥ 0,
and, multiplying by M,
M(α − x_{n+1}) ≈ [M(α − x_n)]².
Assuming that all of the iterates are near α, we can then show inductively that
M(α − x_n) ≈ [M(α − x_0)]^{2ⁿ},  n ≥ 0.
Convergence therefore requires |M(α − x_0)| < 1, i.e.,
|α − x_0| < 1/|M| = |2f′(α)/f″(α)|.        (7.16)
If the quantity |M| is very large, then x_0 will have to be chosen very close to α to obtain convergence. In such a situation, the bisection method is probably an easier method to use.
The choice of x0 can be very important in determining whether
Newton’s method will converge.
α − xn ≈ xn+1 − xn (7.17)
This is the standard error estimation formula for Newton’s method, and it is
usually fairly accurate.
However, this formula is not valid if f 0 (α) = 0, a case that is discussed in Section 3.5.
Example
Consider the error in the entry x3 of the previous table.
α − x_3 ≈ −4.73E−3,
x_4 − x_3 ≈ −4.68E−3.
Example
Use Newton’s Method to find a root of f (x) = x2 .
x_{n+1} = x_n − f(x_n)/f′(x_n) = x_n − x_n²/(2x_n) = x_n/2.
So the method converges to the root α = 0, but the convergence is only linear:
e_{n+1} = e_n/2.
Example
Use Newton’s Method to find a root of f (x) = xm .
x_{n+1} = x_n − x_n^m/(m·x_n^{m−1}) = ((m − 1)/m)·x_n.
The method converges to the root α = 0, again with linear convergence:
e_{n+1} = ((m − 1)/m)·e_n.
Theorem
Assume f ∈ C^{m+1}[a, b] has a root α of multiplicity m. Then Newton's Method is locally convergent to α, and the absolute error e_n satisfies
lim_{n→∞} e_{n+1}/e_n = (m − 1)/m.        (7.18)
Example
Find the multiplicity of the root α = 0 of f (x) = sin x + x2 cos x − x2 − x,
and estimate the number of steps in NM for convergence to 6 correct decimal
places (use x0 = 1).
To see where the ratio (m − 1)/m comes from, expand f about a root α of multiplicity m:
f(x_n) = (x_n − α)^m f^(m)(c_n)/m!,  f′(x_n) = (x_n − α)^{m−1} f^(m)(ξ_n)/(m − 1)!,
with c_n, ξ_n between α and x_n. Then
α − x_{n+1} = α − x_n + f(x_n)/f′(x_n) = (α − x_n)·[1 − f^(m)(c_n)/(m·f^(m)(ξ_n))] → (α − x_n)(1 − 1/m).
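A numerical check of the linear rate (m − 1)/m for this example (a sketch; the root is α = 0 with multiplicity m = 3, since f(0) = f′(0) = f″(0) = 0 and f‴(0) ≠ 0):

```python
import math

f  = lambda x: math.sin(x) + x**2 * math.cos(x) - x**2 - x
fp = lambda x: math.cos(x) + 2 * x * math.cos(x) - x**2 * math.sin(x) - 2 * x - 1

x = 1.0
errs = [abs(x)]                  # the root is alpha = 0
for _ in range(25):
    x = x - f(x) / fp(x)
    errs.append(abs(x))

ratios = [errs[i + 1] / errs[i] for i in range(12, 20)]
# the ratios approach (m - 1)/m = 2/3, confirming linear convergence
```

Since each step shrinks the error by roughly 2/3, reaching 6 correct decimal places from x_0 = 1 takes on the order of log(0.5·10⁻⁶)/log(2/3) ≈ 36 steps.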
Example
Apply Newton’s Method to f (x) = −x4 + 3x2 + 2 with starting guess x0 = 1.
(Figure: graph of f(x) = −x⁴ + 3x² + 2 with the tangent line at x_0 and the iterate x_1.)
Assume that two initial guesses to α are known, denoted by x_0 and x_1.
They may occur
on opposite sides of α, or
on the same side of α.
The secant line through (x_{n−1}, f(x_{n−1})) and (x_n, f(x_n)) replaces the tangent line of Newton's method, giving the iteration
x_{n+1} = x_n − f(x_n) · (x_n − x_{n−1}) / (f(x_n) − f(x_{n−1})),  n ≥ 1.        (7.20)
(Figure: the secant iterates x_2, x_3 produced from x_0 and x_1.)
Use secant.m
Example
We solve the equation f (x) ≡ x6 − x − 1 = 0.
n    x_n           f(x_n)     x_n − x_{n−1}   α − x_{n−1}
0    2.0           61.0
1    1.0           -1.0       -1.0
2    1.01612903    -9.15E-1    1.61E-2         1.35E-1
3    1.19057777     6.57E-1    1.74E-1         1.19E-1
4    1.11765583    -1.68E-1   -7.29E-2        -5.59E-2
5    1.13253155    -2.24E-2    1.49E-2         1.71E-2
6    1.13481681     9.54E-4    2.29E-3         2.19E-3
7    1.13472365    -5.07E-6   -9.32E-5        -9.27E-5
8    1.13472414    -1.13E-9    4.92E-7         4.92E-7
The iterate x8 equals α rounded to nine significant digits.
As with the Newton method (7.7) for this equation, the initial iterates do not
converge rapidly. But as the iterates become closer to α, the speed of
convergence increases.
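The table can be reproduced with a short script (a Python stand-in for secant.m):

```python
def secant(f, x0, x1, eps, itmax=100):
    """Secant method: Newton's formula with f'(x_n) replaced by a difference quotient."""
    f0, f1 = f(x0), f(x1)
    for _ in range(itmax):
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) <= eps:
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)   # shift the two most recent iterates
    raise RuntimeError("secant method did not converge")

root = secant(lambda x: x**6 - x - 1, 2.0, 1.0, 1e-9)
```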
The error formula for the secant method is
α − x_{n+1} = (α − x_n)(α − x_{n−1}) · [−f″(ξ_n) / (2f′(ζ_n))].        (7.21)
The secant method amounts to Newton's method with the derivative replaced by the difference quotient
f′(x_n) ≈ [f(x_n) − f(x_{n−1})] / (x_n − x_{n−1}).
Check that the use of this in the Newton formula (7.6) will yield (7.20).
The formula (7.21) can be used to obtain the further error result that
if x0 and x1 are chosen sufficiently close to α, then we have
convergence and
lim_{n→∞} |α − x_{n+1}| / |α − x_n|^r = |f″(α) / (2f′(α))|^{r−1} ≡ c,
where r = (√5 + 1)/2 ≈ 1.62. Thus, c = |M|^{r−1}.
The restriction (7.16) on the initial guess for Newton’s method can be
replaced by a similar one for the secant iterates, but we omit it.
Finally, the result (7.22) can be used to justify the error estimate
α − xn−1 ≈ xn − xn−1
Example
For the iterate x5 in the previous Table
α − x_5 ≈ 2.19E−3,
x_6 − x_5 ≈ 2.29E−3.        (7.23)
General remarks
The derivations of both the Newton and secant methods illustrate a general principle of numerical analysis.
When trying to solve a problem for which there is no direct or simple method
of solution, approximate it by another problem that you can solve more easily.
GENERAL OBSERVATION
When dealing with problems involving differentiable functions f (x),
move to a nearby problem by approximating each such f (x) with a
linear problem.
The linearization of mathematical problems is common throughout applied
mathematics and numerical analysis.
There are three generalizations of the Secant Method that are also
important. The Method of False Position, or Regula Falsi, is similar
to the Bisection Method, but where the midpoint is replaced by a
Secant Method-like approximation. Given an interval [a, b] that
brackets a root (assume that f (a)f (b) < 0), define the next point
c = [b·f(a) − a·f(b)] / [f(a) − f(b)]
as in the Secant Method, but unlike the Secant Method, the new point
is guaranteed to lie in [a, b], since the points (a, f (a)) and (b, f (b)) lie
on separate sides of the x-axis. The new interval, either [a, c] or [c, b],
is chosen according to whether f (a)f (c) < 0 or f (c)f (b) < 0,
respectively, and still brackets a root.
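A minimal sketch of the Method of False Position as just described (the function and interval are those of the example below):

```python
def false_position(f, a, b, itmax=60):
    """Method of False Position: secant-like point c, but the bracket [a, b] is kept."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "interval must bracket a root"
    for _ in range(itmax):
        c = (b * fa - a * fb) / (fa - fb)   # secant point, guaranteed to lie in [a, b]
        fc = f(c)
        if fa * fc < 0:
            b, fb = c, fc                   # new bracket [a, c]
        else:
            a, fa = c, fc                   # new bracket [c, b]
    return c

root = false_position(lambda x: x**3 - 2 * x**2 + 1.5 * x, -1.0, 1.0)
```

Unlike bisection, one endpoint may get "stuck", so convergence can be only linear, but the root always stays bracketed.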
Example
Apply the Method of False Position on the initial interval [−1, 1] to find the root r = 0 of f(x) = x³ − 2x² + (3/2)x.
(Figures: the false-position iterates x_2, x_3, x_4 generated from x_0 = −1 and x_1 = 1, plotted against y = f(x) on [−1.5, 1.5].)
Recall Newton's method:
x_{n+1} = x_n − f(x_n)/f′(x_n),  n = 0, 1, 2, …
This has the form x_{n+1} = g(x_n), an example of fixed point iteration.
Definition
An iteration x_{n+1} = g(x_n) for solving x = g(x) is called a fixed point iteration. The solution α of x = g(x) is called a fixed point of g.
Example
As a motivational example, consider solving the equation
x² − 5 = 0        (7.24)
for the root α = √5 ≈ 2.2361. We give four methods to solve this equation (the general form for x² − a = 0 is shown on the right):
I1. x_{n+1} = 5 + x_n − x_n²        [x = x + c(x² − a), c ≠ 0]
I2. x_{n+1} = 5/x_n                 [x = a/x]
I3. x_{n+1} = 1 + x_n − x_n²/5      [x = x + c(x² − a), c ≠ 0]
I4. x_{n+1} = (x_n + 5/x_n)/2       [x = (x + a/x)/2]
All four iterations have the property that if the sequence {xn : n ≥ 0} has a
limit α, then α is a root of (7.24). For each equation, check this as follows:
Replace x_n and x_{n+1} by α, and then show that this implies α = ±√5.
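The four iterations can be compared numerically (a sketch; x_0 = 2.5 matches the later table for I3, and I1/I2 are left out of the checks since, as discussed later, they fail to converge):

```python
import math

g = {
    "I1": lambda x: 5 + x - x**2,
    "I2": lambda x: 5 / x,
    "I3": lambda x: 1 + x - x**2 / 5,
    "I4": lambda x: (x + 5 / x) / 2,   # this is Newton's method for x^2 - 5 = 0
}

def fixed_point(gfun, x0, n):
    x = x0
    for _ in range(n):
        x = gfun(x)
    return x

alpha = math.sqrt(5)
x_i3 = fixed_point(g["I3"], 2.5, 30)   # converges linearly
x_i4 = fixed_point(g["I4"], 2.5, 10)   # converges quadratically
```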
Lemma
Let g ∈ C[a, b]. Assume that g([a, b]) ⊂ [a, b], i.e.,
∀x ∈ [a, b], g(x) ∈ [a, b].
Then x = g(x) has at least one solution α in the interval [a, b].
(Figure: the fixed point α at the intersection of y = g(x) and y = x.)
Lipschitz continuous
Definition
Given g : [a, b] → R, g is called Lipschitz continuous with constant λ > 0 (denoted g ∈ Lip_λ[a, b]) if
|g(x) − g(y)| ≤ λ|x − y|  for all x, y ∈ [a, b].
Definition
g : [a, b] → R is called a contraction mapping if g ∈ Lip_λ[a, b] with λ < 1.
Lemma
Let g ∈ Lipλ [a, b] with λ < 1 and g([a, b]) ⊂ [a, b].
Then x = g(x) has exactly one solution α ∈ [a, b]. Moreover, the iterates x_{n+1} = g(x_n) satisfy x_n → α for any x_0 ∈ [a, b], and
|α − x_n| ≤ λⁿ/(1 − λ) · |x_1 − x_0|.
Proof.
Existence follows from the previous Lemma.
Uniqueness: assume α and β are both solutions, α = g(α) and β = g(β). Then |α − β| = |g(α) − g(β)| ≤ λ|α − β|, which forces α = β since λ < 1.
If x_n ∈ [a, b], then g(x_n) = x_{n+1} ∈ [a, b], so {x_n}_{n≥0} ⊂ [a, b]. Linear convergence with rate λ:
|α − x_{n+1}| = |g(α) − g(x_n)| ≤ λ|α − x_n|.
Error estimate:
|α − x_n| ≤ 1/(1 − λ) · |x_n − x_{n+1}|,
and hence
|α − x_{n+1}| ≤ λ/(1 − λ) · |x_{n+1} − x_n|.
Define
λ = max_{x∈[a,b]} |g′(x)|.
Theorem 2.6
Assume g ∈ C¹[a, b], g([a, b]) ⊂ [a, b], and max_{x∈[a,b]} |g′(x)| < 1. Then
1. x = g(x) has a unique solution α in [a, b];
2. x_n → α for every x_0 ∈ [a, b];
3. |α − x_n| ≤ λⁿ/(1 − λ) · |x_1 − x_0|;
4. lim_{n→∞} (α − x_{n+1})/(α − x_n) = g′(α).
Theorem 2.7
Assume α solves x = g(x), with g ∈ C¹(I_α) for some interval I_α ∋ α, and |g′(α)| < 1.
Then Theorem 2.6 holds for x_0 close enough to α.
Proof.
Since |g′(α)| < 1, by continuity there are an interval I_α and a λ < 1 with |g′(x)| ≤ λ for x ∈ I_α.
Take x_0 ∈ I_α close enough to α; then x_1 ∈ I_α and, by induction, x_n ∈ I_α for all n.
Hence Theorem 2.6 holds with [a, b] = I_α.
Examples
Recall α = √5.
1. g(x) = 5 + x − x²;  g′(x) = 1 − 2x,  g′(α) = 1 − 2√5, so |g′(α)| > 1. Thus the iteration will not converge to √5.
2. g(x) = 5/x;  g′(x) = −5/x²,  g′(α) = −5/(√5)² = −1. From this alone we cannot conclude whether the iteration converges or diverges; from the Table, it is clear that the iterates will not converge to α.
3. g(x) = 1 + x − x²/5;  g′(x) = 1 − 2x/5,  g′(α) = 1 − 2√5/5 ≈ 0.106, i.e., the iteration will converge. Also,
|α − x_{n+1}| ≈ 0.106 |α − x_n|.
Consider the iteration x_{n+1} = x_n + c(x_n² − 3) for computing α = √3, i.e., g(x) = x + c(x² − 3), so that
g′(x) = 1 + 2cx.
We need |g′(α)| < 1:
−1 < 1 + 2c√3 < 1.
Optimal choice: 1 + 2c√3 = 0, i.e., c = −1/(2√3).
The possible behaviour of the fixed point iterates xn for various sizes
of g 0 (α).
To see the convergence, consider the case of x1 = g(x0 ), the height of
the graph of y = g(x) at x0 .
We bring the number x1 back to the x-axis by using the line y = x
and the height y = x1 .
We continue this with each iterate, obtaining a stairstep behaviour
when g 0 (α) > 0.
When g 0 (α) < 0, the iterates oscillate around the fixed point α, as can
be seen.
We need a more precise way to deal with the concept of the speed of
convergence of an iteration method.
Definition
We say that a sequence {x_n : n ≥ 0} converges to α with order of convergence p ≥ 1 if
|α − x_{n+1}| ≤ c|α − x_n|^p,  n ≥ 0,
for some c > 0 (with c < 1 when p = 1). For the linearly convergent fixed point iteration above,
|α − x_{n+1}| ≈ |g′(α)| · |α − x_n|.
Theorem 2.8
Assume g ∈ C^p(I_α) for some interval I_α containing α, and
g′(α) = g″(α) = ⋯ = g^{(p−1)}(α) = 0,  p ≥ 2.
Then, for x_0 close enough to α, x_n → α and
lim_{n→∞} (α − x_{n+1}) / (α − x_n)^p = (−1)^{p−1} g^{(p)}(α)/p!,
i.e., the convergence is of order p.
Proof: expand g(x_n) about α:
x_{n+1} = g(x_n) = g(α) + (x_n − α)g′(α) + ⋯ + [(x_n − α)^{p−1}/(p−1)!]·g^{(p−1)}(α) + [(x_n − α)^p/p!]·g^{(p)}(ξ_n),
where g(α) = α and the derivative terms through order p − 1 all vanish. Hence
α − x_{n+1} = −[(x_n − α)^p / p!]·g^{(p)}(ξ_n),
and
(α − x_{n+1}) / (α − x_n)^p = (−1)^{p−1} g^{(p)}(ξ_n)/p! → (−1)^{p−1} g^{(p)}(α)/p!.
Newton's method is a fixed point iteration:
x_{n+1} = x_n − f(x_n)/f′(x_n) = g(x_n),  g(x) = x − f(x)/f′(x).
Then
g′(x) = f(x)f″(x) / (f′(x))²,  so g′(α) = 0,
and
g″(x) = [f′(x)f″(x) + f(x)f‴(x)]/(f′(x))² − 2f(x)(f″(x))²/(f′(x))³,  so g″(α) = f″(α)/f′(α).
A related method keeps the slope fixed at a constant a:
x_{n+1} = x_n − f(x_n)/a = g(x_n),  e.g., a = f′(x_0), giving
x_{n+1} = x_n − f(x_n)/f′(x_0).
Here g′(α) = 1 − f′(α)/a, and we need |g′(α)| < 1 for convergence:
|1 − f′(α)/a| < 1.
This gives linear convergence with rate |1 − f′(α)/a| (Theorem 2.6). If a = f′(x_0) and x_0 is close enough to α, then |1 − f′(α)/a| is small and the convergence is rapid.
Recall from Theorem 2.6 that for x_{n+1} = g(x_n) with x_n → α,
(α − x_{n+1}) / (α − x_n) → g′(α).
From (7.28)–(7.29), α − x_n = g′(ξ_{n−1})(α − x_{n−1}), with ξ_{n−1} between α and x_{n−1}. Writing α − x_{n−1} = (α − x_n) + (x_n − x_{n−1}) and solving for α − x_n:
α − x_n = [g′(ξ_{n−1}) / (1 − g′(ξ_{n−1}))] (x_n − x_{n−1}),
and, for x_n near α,
g′(ξ_{n−1}) / (1 − g′(ξ_{n−1})) ≈ g′(α) / (1 − g′(α)).
We need an estimate for g′(α). Define
λ_n = (x_n − x_{n−1}) / (x_{n−1} − x_{n−2}).
Then
λ_n = [(α − x_{n−1}) − (α − x_n)] / [(α − x_{n−2}) − (α − x_{n−1})]
    = [(α − x_{n−1}) − g′(ξ_{n−1})(α − x_{n−1})] / [(α − x_{n−1})/g′(ξ_{n−2}) − (α − x_{n−1})]
    = g′(ξ_{n−2}) · [1 − g′(ξ_{n−1})] / [1 − g′(ξ_{n−2})],
so λ_n → g′(α) as ξ_n → α, i.e., λ_n ≈ g′(α). Therefore
α − x_n ≈ λ_n/(1 − λ_n) · (x_n − x_{n−1}).        (7.30)
From (7.30),
α ≈ x_n + λ_n/(1 − λ_n) · (x_n − x_{n−1}).        (7.31)
Define the
Aitken Extrapolation Formula
x̂_n = x_n + λ_n/(1 − λ_n) · (x_n − x_{n−1}).        (7.32)
Example
Repeat the example for I3.
The Table contains the differences x_n − x_{n−1}, the ratios λ_n, and the estimated error from α − x_n ≈ λ_n/(1 − λ_n)·(x_n − x_{n−1}), given in the column Estimate. Compare the column Estimate with the error column in the previous Table.
n xn xn − xn−1 λn Estimate
0 2.5
1 2.25 -2.50E-1
2 2.2375 -1.25E-2 0.0500 -6.58E-4
3 2.23621875 -1.28E-3 0.1025 -1.46E-4
4 2.23608389 -1.35E-4 0.1053 -1.59E-5
5 2.23606966 -1.42E-5 0.1055 -1.68E-6
6 2.23606815 -1.50E-6 0.1056 -1.77E-7
7 2.23606800 -1.59E-7 0.1056 -1.87E-8
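The λ_n column and the Estimate column of this table can be recomputed directly (a sketch using iteration I3):

```python
import math

g = lambda x: 1 + x - x**2 / 5          # iteration I3 for sqrt(5)
alpha = math.sqrt(5)

xs = [2.5]
for _ in range(7):
    xs.append(g(xs[-1]))

estimates = {}
for n in range(2, 8):
    lam = (xs[n] - xs[n - 1]) / (xs[n - 1] - xs[n - 2])   # lambda_n ~ g'(alpha)
    estimates[n] = lam / (1 - lam) * (xs[n] - xs[n - 1])  # estimate of alpha - x_n
```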
Quasi-Newton Iterates
To solve f(x) = 0, choose an initial guess x_0 and iterate
x_{k+1} = x_k − f(x_k)/a_k,  k = 0, 1, …,
with a_k ≈ f′(x_k). Two choices:
1. a_k = [f(x_k + f(x_k)) − f(x_k)] / f(x_k) ⇒ Steffensen's Method. This is the Finite Difference Method with h_k = f(x_k) ⇒ quadratic convergence.
2. a_k = [f(x_k) − f(x_{k′})] / (x_k − x_{k′}), where k′ is the largest index < k such that f(x_k)f(x_{k′}) < 0 ⇒ Regula Falsi.
Regula Falsi needs x_0, x_1 with f(x_0)f(x_1) < 0; e.g.,
x_2 = x_1 − f(x_1) · (x_1 − x_0)/(f(x_1) − f(x_0)),
x_3 = x_2 − f(x_2) · (x_2 − x_0)/(f(x_2) − f(x_0)).
If g″(α) ≠ 0, then this formula shows that the iteration x_{n+1} = g(x_n) is of order 2, i.e., quadratically convergent.
Example.
(a) f (x) = (x − 1)2 (x + 2) has two roots. The root α = 1 has
multiplicity 2, and α = −2 is a simple root.
(b) f(x) = x³ − 3x² + 3x − 1 has α = 1 as a root of multiplicity 3. To see this, note that f(x) = (x − 1)³.
When the Newton and secant methods are applied to the calculation of a multiple root α, the convergence of α − x_n to zero is much slower than it would be for a simple root. In addition, there is a large interval of uncertainty as to where the root actually lies, because of the noise in evaluating f(x).
The large interval of uncertainty for a multiple root is the most serious
problem associated with numerically finding such a root.
The computer used is decimal with six digits in the significand, and it uses
rounding. The function f (x) is evaluated in the nested form of (7.39), and
f 0 (x) is evaluated similarly. The results are given in the Table.
n xn f (xn ) α − xn Ratio
0 0.800000 0.03510 0.300000
1 0.892857 0.01073 0.207143 0.690
2 0.958176 0.00325 0.141824 0.685
3 1.00344 0.00099 0.09656 0.681
4 1.03486 0.00029 0.06514 0.675
5 1.05581 0.00009 0.04419 0.678
6 1.07028 0.00003 0.02972 0.673
7 1.08092 0.0 0.01908 0.642
and the error decreases at about a constant rate. In our example, λ = 2/3, since the root has multiplicity m = 3, which corresponds to the values in the last column of the table. The error formula (7.42) implies a much slower rate of convergence than is usual for Newton's method.
With any root of multiplicity m ≥ 2, the number λ ≥ 1/2; thus, the bisection method is always at least as fast as Newton's method for multiple roots. Of course, m must be an odd integer to have f(x) change sign at x = α, thus permitting the bisection method to be applied.
Apply Newton's method
x_{k+1} = x_k − f(x_k)/f′(x_k)
to a function with a root of multiplicity p,
f(x) = (x − α)^p h(x),  p ≥ 1,  h(α) ≠ 0.
The iteration function is
g(x) = x − (x − α)h(x) / [p·h(x) + (x − α)h′(x)].
Differentiating,
g′(x) = 1 − h(x)/[p·h(x) + (x − α)h′(x)] − (x − α) · d/dx { h(x)/[p·h(x) + (x − α)h′(x)] },
and
g′(α) = 1 − 1/p = (p − 1)/p.
Quasi-Newton Iterates
E.g., for p = 2, (p − 1)/p = 1/2, so Newton's method is only linearly convergent. If the multiplicity p is known, modify the iteration:
x_{k+1} = x_k − p·f(x_k)/f′(x_k) = g(x_k),  g(x) = x − p·f(x)/f′(x).
Then
g′(α) = 1 − p·(1/p) = 0,
and quadratic convergence is restored:
lim_{k→∞} (α − x_{k+1}) / (α − x_k)² = g″(α)/2.
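A quick check of the modified iteration on f(x) = (x − 1)³, where p = 3 (a sketch; for this particular f the modified step happens to land on the root exactly):

```python
f  = lambda x: (x - 1) ** 3          # root alpha = 1 of multiplicity p = 3
fp = lambda x: 3 * (x - 1) ** 2

def iterate(step, x0, n):
    x = x0
    for _ in range(n):
        x = x - step(x)
    return x

plain    = iterate(lambda x: f(x) / fp(x), 2.0, 10)      # Newton: linear, rate (p-1)/p = 2/3
modified = iterate(lambda x: 3 * f(x) / fp(x), 2.0, 1)   # p*f/f': one exact step for this f
```

After 10 plain Newton steps the error is still about (2/3)¹⁰ ≈ 0.017, while the p-scaled step removes it immediately.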
Roots of polynomials
We seek the roots of p(x) = 0, where
p(x) = a_0 + a_1 x + ⋯ + a_n xⁿ,  a_n ≠ 0.
By the fundamental theorem of algebra,
p(x) = a_n(x − z_1)(x − z_2)⋯(x − z_n),  z_1, …, z_n ∈ C.
Cauchy's bound: every root ζ_i satisfies
|ζ_i| ≤ 1 + max_{0≤i≤n−1} |a_i/a_n|.
Related bounds localize the roots in an annulus: ρ_1 ≤ |ζ_j| ≤ ρ_2.
p(x) = a_0 + a_1 x + a_2 x² + ⋯ + a_n xⁿ        (7.43)
can be evaluated by nested multiplication:
p(x) = a_0 + x(a_1 + x(a_2 + ⋯ + x(a_{n−1} + a_n x)⋯)).        (7.44)
Given ζ, define
b_n = a_n,  b_k = a_k + ζ·b_{k+1},  k = n−1, n−2, …, 0.
Consider
q(x) = b_1 + b_2 x + ⋯ + b_n x^{n−1}.
Claim:
p(x) = b_0 + (x − ζ)q(x).
Proof.
b_0 + (x − ζ)q(x) = b_0 + (x − ζ)(b_1 + b_2 x + ⋯ + b_n x^{n−1})
 = (b_0 − ζb_1) + (b_1 − ζb_2)x + ⋯ + (b_{n−1} − ζb_n)x^{n−1} + b_n xⁿ
 = a_0 + a_1 x + ⋯ + a_n xⁿ = p(x).
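The coefficients b_k and the value p(ζ) = b_0 can be computed in one pass (a sketch; the test polynomial x² − 3x + 2 = (x − 1)(x − 2) is a hypothetical example, not from the text):

```python
def horner(a, zeta):
    """One pass of nested multiplication on a = [a0, a1, ..., an].
    Returns (b0, q) with p(x) = b0 + (x - zeta) * q(x) and q = [b1, ..., bn]."""
    n = len(a) - 1
    b = [0.0] * (n + 1)
    b[n] = a[n]
    for k in range(n - 1, -1, -1):
        b[k] = a[k] + zeta * b[k + 1]
    return b[0], b[1:]

# p(x) = 2 - 3x + x^2 = (x - 1)(x - 2); deflating at zeta = 1 should leave x - 2
value, q = horner([2.0, -3.0, 1.0], 1.0)
```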
Deflation
Apply Newton's method to the polynomial:
x_{k+1} = x_k − p(x_k)/p′(x_k),  k = 0, 1, 2, …
To evaluate p and p′ at x = ζ, use the nested scheme:
p(ζ) = b_0,
and differentiating p(x) = b_0 + (x − ζ)q(x) gives
p′(x) = q(x) + (x − ζ)q′(x),  so p′(ζ) = q(ζ).
Given: a = (a_0, a_1, …, a_n).
Output: b = (b_1, b_2, …, b_n), the coefficients of the deflated polynomial q(x), and the root.

Newton(a, n, x_0, ε, itmax, root, b, ierr)
  itnum := 1
  1. ζ := x_0; b_n := a_n; c := a_n
     for k = n−1, …, 1: b_k := a_k + ζ·b_{k+1}; c := b_k + ζ·c     [on exit, c = p′(ζ)]
     b_0 := a_0 + ζ·b_1                                            [b_0 = p(ζ)]
     if c = 0: ierr := 2, exit                                     [p′(ζ) = 0]
     x_1 := x_0 − b_0/c                                            [x_1 = x_0 − p(x_0)/p′(x_0)]
     if |x_0 − x_1| ≤ ε: ierr := 0, root := x_1, exit
     if itnum = itmax: ierr := 1, exit
     itnum := itnum + 1; x_0 := x_1; go to 1.
Conditioning
p(x) = a_0 + a_1 x + ⋯ + a_n xⁿ, with roots ζ_1, …, ζ_n.
Perturbation polynomial: q(x) = b_0 + b_1 x + ⋯ + b_n xⁿ.
Perturbed polynomial:
p(x; ε) = p(x) + ε·q(x) = (a_0 + εb_0) + (a_1 + εb_1)x + ⋯ + (a_n + εb_n)xⁿ,
with roots ζ_j(ε), continuous functions of ε satisfying ζ_j(0) = ζ_j.
(Absolute) condition number:
K_{ζ_j} = lim_{ε→0} |ζ_j(ε) − ζ_j| / |ε|.
Example
(x − 1)³ = 0 has the triple root ζ_1 = ζ_2 = ζ_3 = 1. Perturb it:
(x − 1)³ − ε = 0  (q(x) = −1).
Set y = x − 1 and a = ε^{1/3}. Then
p(x; ε) = y³ − ε = y³ − a³ = (y − a)(y² + ya + a²),
with roots y_1 = a and
y_{2,3} = (−a ± √(−3a²))/2 = a·(−1 ± i√3)/2.
Writing ω = (−1 + i√3)/2, a primitive cube root of unity with |ω| = 1, we get y_2 = aω and y_3 = aω², so
ζ_1(ε) = 1 + ε^{1/3},  ζ_2(ε) = 1 + ω·ε^{1/3},  ζ_3(ε) = 1 + ω²·ε^{1/3}.
Hence
|ζ_j(ε) − 1| = ε^{1/3},  and  |ζ_j(ε) − 1| / ε = ε^{1/3}/ε → ∞ as ε → 0:
the condition number of a triple root is infinite.
General argument
Assume p(ζ) = 0 with
p′(ζ) ≠ 0,
i.e., ζ is a simple root, and let ζ(ε) denote the corresponding root of p(x; ε). Expand
ζ(ε) = ζ + Σ_{ℓ=1}^{∞} γ_ℓ ε^ℓ = ζ + γ_1 ε + γ_2 ε² + ⋯,
where the γ_1 ε term is what matters and the remaining terms are negligible if ε is small:
[ζ(ε) − ζ]/ε = γ_1 + γ_2 ε + ⋯ → γ_1 as ε → 0.
To find γ_1 = ζ′(0), differentiate the identity p(ζ(ε), ε) = 0, i.e., p(ζ(ε)) + ε·q(ζ(ε)) = 0:
p′(ζ(ε))·ζ′(ε) + q(ζ(ε)) + ε·q′(ζ(ε))·ζ′(ε) = 0.
Setting ε = 0:
p′(ζ)·ζ′(0) + q(ζ) = 0  ⟹  γ_1 = −q(ζ)/p′(ζ).
Hence
K_ζ = |γ_1| = |q(ζ)/p′(ζ)|,
which is large if p′(ζ) is close to zero.
Example
p(x) = W_7 = ∏_{i=1}^{7} (x − i),  q(x) = x⁶,  ε = −0.002.
The roots are ζ_j = j, and
p′(ζ_j) = ∏_{ℓ=1, ℓ≠j}^{7} (j − ℓ),
so
K_{ζ_j} = |q(ζ_j)/p′(ζ_j)| = j⁶ / |∏_{ℓ=1, ℓ≠j}^{7} (j − ℓ)|.
In particular,
ζ_j(ε) ≈ j + ε·q(ζ_j)/p′(ζ_j) = j + δ(j).
f_1(x_1, …, x_m) = 0,
f_2(x_1, …, x_m) = 0,
  ⋮                        (7.45)
f_m(x_1, …, x_m) = 0.
If we denote
F(x) = [f_1(x), f_2(x), …, f_m(x)]ᵀ : Rᵐ → Rᵐ,
then (7.45) is equivalent to writing
F(x) = 0.        (7.46)
Fixed point form: x = G(x), with G : Rᵐ → Rᵐ. A solution α, with α = G(α), is called a fixed point of G.
Example: F(x) = 0 can be rewritten as
x = x − A·F(x) =: G(x),  for some nonsingular matrix A ∈ R^{m×m}.
Iteration: choose an initial guess x_0 and set
x_{n+1} = G(x_n),  n = 0, 1, 2, …
Recall, for x ∈ Rᵐ:
‖x‖_p = (Σ_{i=1}^{m} |x_i|^p)^{1/p},  1 ≤ p < ∞,
‖x‖_∞ = max_{1≤i≤m} |x_i|.
Proof
‖x_k − x_0‖ = ‖Σ_{i=0}^{k−1} (x_{i+1} − x_i)‖ ≤ Σ_{i=0}^{k−1} ‖x_{i+1} − x_i‖
 ≤ Σ_{i=0}^{k−1} λⁱ ‖x_1 − x_0‖ = (1 − λᵏ)/(1 − λ) · ‖x_1 − x_0‖
 < 1/(1 − λ) · ‖x_1 − x_0‖.
A similar bound holds for ‖x_k − x_ℓ‖ for all k, ℓ, so {x_n} is a Cauchy sequence and converges to some α. Letting n → ∞ in
x_{n+1} = G(x_n)
gives α = G(α), so α is a fixed point.
Uniqueness: assume β = G(β). Then ‖α − β‖ = ‖G(α) − G(β)‖ ≤ λ‖α − β‖, which forces α = β since λ < 1.
Jacobian matrix
Definition
F : Rᵐ → Rᵐ is continuously differentiable (F ∈ C¹(Rᵐ)) if, for every x ∈ Rᵐ, all the partial derivatives
∂f_i(x)/∂x_j,  i, j = 1, …, m,
exist. The Jacobian matrix is the m × m matrix
F′(x) := [ ∂f_1(x)/∂x_1 ⋯ ∂f_1(x)/∂x_m ; ⋮ ; ∂f_m(x)/∂x_1 ⋯ ∂f_m(x)/∂x_m ],
i.e.,
(F′(x))_{ij} = ∂f_i(x)/∂x_j,  i, j = 1, …, m.
For a scalar function f, the mean value theorem gives
f(x) − f(y) = ∇f(z)ᵀ(x − y) = ∂f(z)/∂x_1 · (x_1 − y_1) + ⋯ + ∂f(z)/∂x_m · (x_m − y_m)
for some z on the segment joining x and y. Note: applying this componentwise to F(x) = [f_1(x), …, f_m(x)]ᵀ,
f_i(x) − f_i(y) = ∇f_i(z_i)ᵀ(x − y),  i = 1, …, m,
with possibly different points z_i. For the fixed point map G this gives a matrix J_n, with rows ∇g_i(z_i)ᵀ, such that α − x_{n+1} = J_n(α − x_n); if x_n → α, then
J_n → [∇g_1(α)ᵀ; ⋮; ∇g_m(α)ᵀ] = G′(α).
The size of G′(α) will affect convergence.
Theorem 2.9
Assume:
- D is a closed, bounded, convex subset of Rᵐ;
- G ∈ C¹(D);
- G(D) ⊂ D;
- λ = max_{x∈D} ‖G′(x)‖_∞ < 1.
Then
(i) x = G(x) has a unique solution α ∈ D;
(ii) for every x_0 ∈ D, the iterates x_{n+1} = G(x_n) converge to α;
(iii) ‖α − x_{n+1}‖_∞ ≤ (‖G′(α)‖_∞ + ε_n)·‖α − x_n‖_∞, where ε_n → 0 as n → ∞.
Proof: for x, y ∈ D,
‖α − x_{n+1}‖_∞ ≤ ‖J_n‖_∞ ‖α − x_n‖_∞ ≤ (‖J_n − G′(α)‖_∞ + ‖G′(α)‖_∞)·‖α − x_n‖_∞,
where ε_n := ‖J_n − G′(α)‖_∞ → 0 as n → ∞.
Example (p. 104)
Solve
f_1 ≡ 3x_1² + 4x_2² − 1 = 0,
f_2 ≡ x_2³ − 8x_1³ − 1 = 0,
for the root α near (x_1, x_2) = (−0.5, 0.25). Iterate
[x_{1,n+1}; x_{2,n+1}] = [x_{1,n}; x_{2,n}] − [.016  −.17; .52  −.26] · [3x_{1,n}² + 4x_{2,n}² − 1; x_{2,n}³ − 8x_{1,n}³ − 1],
i.e.,
x_{n+1} = x_n − A·F(x_n) =: G(x_n),  A = [F′(x_0)]^{−1}.
One finds
‖G′(α)‖_∞ ≈ 0.04  and  ‖α − x_{n+1}‖_∞ / ‖α − x_n‖_∞ → 0.04.
Why? G′(x) = I − A·F′(x), so G′(α) = I − A·F′(α). We need
‖G′(α)‖_∞ ≈ 0,  i.e.,  A ≈ [F′(α)]^{−1};  here A = [F′(x_0)]^{−1}.
This is the m-dimensional Parallel Chords Method:
x_{n+1} = x_n − [F′(x_0)]^{−1} F(x_n).
Newton's method for systems is x_{n+1} = x_n − [F′(x_n)]^{−1} F(x_n). In practice:
1. solve the linear system F′(x_n)·δ_n = −F(x_n);
2. set x_{n+1} = x_n + δ_n.
For Newton's method, ‖G′(α)‖_∞ = 0. If G′ ∈ C¹(B_r(α)), where B_r(α) = {y : ‖y − α‖ ≤ r}, then by continuity ‖G′(x)‖_∞ < 1 for x ∈ B_r(α), for some r.
By Theorem 2.9 with D = B_r(α), we get (at least) linear convergence.
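A sketch of the two-step "solve, then update" form of Newton's method for the 2×2 example system above (the linear solve is written out by hand via Cramer's rule to keep the example self-contained):

```python
def F(x1, x2):
    """The system f1 = f2 = 0 from the example."""
    return (3 * x1**2 + 4 * x2**2 - 1, x2**3 - 8 * x1**3 - 1)

def Fprime(x1, x2):
    """Jacobian of F."""
    return ((6 * x1, 8 * x2),
            (-24 * x1**2, 3 * x2**2))

def newton_system(x1, x2, itmax=20, tol=1e-12):
    for _ in range(itmax):
        f1, f2 = F(x1, x2)
        (a, b), (c, d) = Fprime(x1, x2)
        det = a * d - b * c
        # solve F'(x_n) * delta = -F(x_n) by Cramer's rule (2x2 only)
        d1 = (-f1 * d + f2 * b) / det
        d2 = (-f2 * a + f1 * c) / det
        x1, x2 = x1 + d1, x2 + d2
        if abs(d1) + abs(d2) <= tol:
            break
    return x1, x2

root = newton_system(-0.5, 0.25)
```

For larger m the hand-written solve would be replaced by a general linear solver; the structure (solve for δ_n, then update) stays the same.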
Theorem
Assume:
- F ∈ C¹(D);
- there exists α ∈ D such that F(α) = 0;
- F′ ∈ Lip_γ(D);
- [F′(α)]^{−1} exists, with ‖[F′(α)]^{−1}‖ ≤ β.
Then there exists ε > 0 such that if ‖x_0 − α‖ < ε, the Newton iterates x_{n+1} = x_n − [F′(x_n)]^{−1}F(x_n) converge to α, and
‖x_{n+1} − α‖ ≤ βγ·‖x_n − α‖².
Quasi-Newton iterates:
x_{n+1} = x_n − A_n^{−1} F(x_n),  A_n ≈ F′(x_n).
Ex.: Finite Difference Newton,
(A_n)_{ij} = [f_i(x_n + h_n e_j) − f_i(x_n)] / h_n ≈ ∂f_i(x_n)/∂x_j,
with h_n ≈ √δ and e_j = (0, …, 0, 1, 0, …, 0)ᵀ (the 1 in the j-th position).
Global Convergence
Newton's method with a line search (damping):
x_{n+1} = x_n + s_n d_n,
where d_n is the Newton direction and s_n > 0 is a step length. If the full Newton step s_n = 1 is not satisfactory, e.g. ‖F(x_{n+1})‖_2 > ‖F(x_n)‖_2, the step length is reduced.
In software, the system is defined through a user function fun, starting from the vector x0 as the initial guess; the function fun returns the n values f_1(x), …, f_n(x) for any value of the input vector x.
Unconstrained minimization: given f(x_1, x_2, …, x_n) : Rⁿ → R, find
min_{x∈Rⁿ} f(x).
Solve the critical point equations
∇f(x) = 0,  with ∇f = [∂f/∂x_1, …, ∂f/∂x_n]ᵀ.        (7.48)
Hessian
H_{ij} = ∂²f / (∂x_i ∂x_j).
Newton's method for (7.48):
x_{n+1} = x_n − H(x_n)^{−1} ∇f(x_n).
Taylor expansion: the quadratic model
m_n(x) = f(x_n) + ∇f(x_n)ᵀ(x − x_n) + ½(x − x_n)ᵀ H(x_n)(x − x_n)
satisfies m_n(x) ≈ f(x) for x near x_n. Newton's method minimizes this model; note that
∇²m_n(x_n) = H(x_n).
Modify
x_{n+1} = x_n − H̃(x_n)^{−1} ∇f(x_n),
where
H̃(x_n) = H(x_n) + μ_n I,
for some μ_n ≥ 0. If λ_1, …, λ_n are the eigenvalues of H(x_n), then the eigenvalues λ̃_1, …, λ̃_n of H̃(x_n) are
λ̃_i = λ_i + μ_n.
Gershgorin Circles
(Figure: Gershgorin circles in the complex plane, with radii r_1 = 3, r_2 = 2, r_3 = 1.)
Descent Methods
Definition
d is a descent direction for f(x) at the point x_0 if
f(x_0) > f(x_0 + αd)  for 0 < α < α_0, for some α_0 > 0.
Lemma
d is a descent direction if and only if ∇f(x_0)ᵀ d < 0.