Professional Documents
Culture Documents
1 Introduction
We know that the quadratic equations of the form
ax2 + bx + cx = 0
x3 + x2 − 3x = 3.
xey = 1,
− x2 + y = 1.
NB!
1
2 Scalar equations
In this section we discuss the solution of scalar equations. The techniques we will use are known
from previous courses. When they are repeated here, it is because the techniques used to develop
and analyse these methods can, at least in principle, be extended to systems of equations. We will
also emphasise the error analysis of the methods.
A scalar equation is given by
f ( x ) = 0, (1)
where f is a continuous function defined on some interval [ a, b]. A solution r of the equation is
called a zero or a root of f . A nonlinear equation may have one, more than one, or no roots.
Example 1: Given the function f ( x ) = x3 + x2 − 3x − 3 on the interval [−2, 2]. Plot the function
on the interval, and locate possible roots.
import numpy as np
from numpy import pi
from numpy.linalg import solve, norm # Solve linear systems and compute norms
import matplotlib.pyplot as plt
2
According to the plot, the function f has three real roots in the interval [−2, 2]. The function
can be rewritten as
f ( x ) = x3 + x2 − 3x − 3 = ( x + 1)( x2 − 3).
√
Thus the roots of f are −1 and ± 3. We will use this example as a test problem later on.
• If f ∈ C [ a, b] with f ( a) and f (b) of opposite sign, then there exist at least one r ∈ ( a, b) such
that f (r ) = 0.
The first condition guarantees that the graph of f will pass the x-axis at some point r, the
second guarantees that the function is either strictly increasing or strictly decreasing.
We note that the condition that f ( a) and f (b) are of opposite sign can be summarised in the
condition that f ( a) · f (b) < 0.
3
Bisection method.
• Given the function f and the interval I0 := [ a, b], such that f ( a) · f (b) < 0.
• Set a0 = a, b0 = b.
• For k = 0, 1, 2, 3, . . .
a k + bk
• ck =
2
(
[ ak , ck ] if f ( ak ) · f (ck ) ≤ 0
• Ik+1 := [ ak+1 , bk+1 ] =
[ck , bk ] if f (bk ) · f (ck ) ≤ 0
Use ck as the approximation to the root. Since ck is the midpoint of the interval [ ak , bk ], the error
satisfies |ck − r | 6 (bk − ak )/2. The loop is terminated when (bk − ak )/2 is smaller than some user
specified tolerance. Since the length of interval Ik satisfies (bk − ak ) = 2−k (b − a), we also have an
a priori estimate of the error after k bisections,
1 1
| c k − r | 6 ( bk − a k ) 6 k + 1 ( b − a )
2 2
Consequently, we can estimate how many bisections we have to perform to compute a root up to
a given tolerance tol by simply requiring that
b−a
1 b−a log
(b − a) 6 tol ⇔ 6 2k ⇔ 2 tol 6 k.
2k +1 2 tol log 2
2.2.1 Implementation
The algorithm is implemented in the function bisection().
fa = f(a)
fb = f(b)
if fa*fb > 0:
print('Error: f(a)*f(b)>0, there may be no root in the interval.')
return 0, 0
for k in range(max_iter):
4
c = 0.5*(a+b) # The midpoint
fc = f(c)
print('k ={:3d}, a = {:.6f}, b = {:.6f}, c = {:10.6f}, f(c) = {:10.3e}'
.format(k, a, b, c, fc))
if abs(f(c)) < 1.e-14 or (b-a) < 2*tol: # The zero is found!
break
elif fa*fc <= 0:
b = c # There is a root in [a, c]
else:
a = c # There is a root in [c, b]
return c, k
Example 2: Use the code above to find the root of
f ( x ) = x3 + x2 − 3x − 3
in the interval [1.5, 2]. Use 10−6 as the tolerance.
In [8]: # Example 2
def f(x): # Define the function
return x**3+x**2-3*x-3
a, b = 1.5, 2 # The interval
c, nit = bisection(f, a, b, tol=1.e-6) # Apply the bisection method
k = 0, a = 1.500000, b = 2.000000, c = 1.750000, f(c) = 1.719e-01
k = 1, a = 1.500000, b = 1.750000, c = 1.625000, f(c) = -9.434e-01
k = 2, a = 1.625000, b = 1.750000, c = 1.687500, f(c) = -4.094e-01
k = 3, a = 1.687500, b = 1.750000, c = 1.718750, f(c) = -1.248e-01
k = 4, a = 1.718750, b = 1.750000, c = 1.734375, f(c) = 2.203e-02
k = 5, a = 1.718750, b = 1.734375, c = 1.726562, f(c) = -5.176e-02
k = 6, a = 1.726562, b = 1.734375, c = 1.730469, f(c) = -1.496e-02
k = 7, a = 1.730469, b = 1.734375, c = 1.732422, f(c) = 3.513e-03
k = 8, a = 1.730469, b = 1.732422, c = 1.731445, f(c) = -5.728e-03
k = 9, a = 1.731445, b = 1.732422, c = 1.731934, f(c) = -1.109e-03
k = 10, a = 1.731934, b = 1.732422, c = 1.732178, f(c) = 1.201e-03
k = 11, a = 1.731934, b = 1.732178, c = 1.732056, f(c) = 4.596e-05
k = 12, a = 1.731934, b = 1.732056, c = 1.731995, f(c) = -5.317e-04
k = 13, a = 1.731995, b = 1.732056, c = 1.732025, f(c) = -2.429e-04
k = 14, a = 1.732025, b = 1.732056, c = 1.732040, f(c) = -9.845e-05
k = 15, a = 1.732040, b = 1.732056, c = 1.732048, f(c) = -2.624e-05
k = 16, a = 1.732048, b = 1.732056, c = 1.732052, f(c) = 9.860e-06
k = 17, a = 1.732048, b = 1.732052, c = 1.732050, f(c) = -8.192e-06
k = 18, a = 1.732050, b = 1.732052, c = 1.732051, f(c) = 8.340e-07
5
2.3 Exercise 1: Bisection method
a) Compute the solution(s) of x2 + sin( x ) − 0.5 = 0 by the bisection method.
c) Given a root in the interval [1.5, 2]. How many iterations are required to guarantee that the
error is less than 10−4 .
• Rewrite the equation in fixed point form x = g( x ) such that the root r of f is a fixed point of g,
that is, r satisfies r = g(r ).
• For k = 0, 1, 2, 3, . . .
x k +1 = g ( x k )
2.4.1 Implementation
The fixed point scheme is implemented in the function fixedpoint. The iterations are terminated
when either the error estimate | xk+1 − xk | is less than a given tolerance tol, or the number of
iterations reaches some maximal number max_iter.
6
err = abs(x-x_old) # Error estimate
print('k ={:3d}, x = {:14.10f}'.format(k+1, x))
if err < tol: # The solution is accepted
break
return x, k+1
x3 + x2 − 3
x3 + x2 − 3x − 3 = 0 can be rewritten as x= .
3
x 3 + x 2 −3
The fixed points are the intersections of the two graphs y = x and y = 3 , as can be demon-
strated by the following script:
In [11]: # Example 3
x = np.linspace(-2, 2, 101)
plt.plot(x, (x**3+x**2-3)/3,'b', x, x, '--r' )
plt.axis([-2, 3, -2, 2])
plt.xlabel('x')
plt.grid('True')
plt.legend(['g(x)','x']);
We observe that the fixed points of g are the same as the zeros of f .
Apply fixed point iterations on g( x ). Aim for√the fixed point between 1 and 2, so choose
x0 = 1.5. Do the iterations converge to the root r = 3?
7
# Define the function
def g(x):
return (x**3+x**2-3)/3
k = 0, x = 1.5000000000
k = 1, x = 0.8750000000
k = 2, x = -0.5214843750
k = 3, x = -0.9566232041
k = 4, x = -0.9867682272
k = 5, x = -0.9957053567
k = 6, x = -0.9985807218
k = 7, x = -0.9995282492
k = 8, x = -0.9998428981
k = 9, x = -0.9999476491
k = 10, x = -0.9999825515
k = 11, x = -0.9999941841
k = 12, x = -0.9999980614
k = 13, x = -0.9999993538
k = 14, x = -0.9999997846
Resultat:
x = -0.9999997845980656, antall iterasjoner=14
− x2 + 3x + 3
x = g2 ( x ) = ,
p x2
3
x = g3 ( x ) = 3 + 3x − x2 ,
r
3 + 3x − x2
x = g4 ( x ) =
x
Use x0 = 1.5 in your experiments, you may well experiment with other values as well.
2.6 Theory
Let us first state some existence and uniqueness results. Apply Theorem 1 in this note on the
equation f ( x ) = x − g( x ) = 0. The following is then given (the details are left to the reader): * If
g ∈ C [ a, b] and a < g( x ) < b for all x ∈ [ a, b] then g has at least one fixed point r ∈ ( a, b).
• If in addition g ∈ C1 [ a, b] and | g0 ( x )| < 1 for all x ∈ [ a, b] then the fixed point is unique.
8
In the following, we will write the assumption a < g( x ) < b for all x ∈ [ a, b] as g([ a, b]) ⊂
( a, b).
In this section we will discuss the convergence properties of the fixed point iterations, under
the conditions for existence and uniqueness given above.
The error after k iterations is given by ek = r − xk . The iterations converge when ek → 0 as
k → ∞. Under which conditions is this the case?
This is the trick: For every k we have:
Take the difference between those and use the mean value theorem (Result 3 in section 5 in Pre-
liminaries), and finally take the absolute value of each expression in the sequence of equalities:
Here ξ k is some unknown value between xk (known) and r (unknown). We can now use the
assumptions from the existence and uniqueness result. Note here that the condition g([ a, b]) ⊂
( a, b) guarantees that if x0 ∈ [ a, b] then xk ∈ ( a, b) for k = 1, 2, 3, . . ..
The condition | g0 ( x )| ≤ L < 1 guarantees convergence towards the unique fixed point r, since
| e k +1 | ≤ L | e k | ⇒ |ek | ≤ Lk |e0 | → 0 as k → ∞,
| x k +1 − x k | ≤ L | x k − x k −1 | .
| x k +1 − x k | ≤ L k | x 1 − x 0 | .
L L k +1
| x k +1 − r | ≤ | x k +1 − x k | ≤ | x1 − x0 |.
1−L 1−L
In summary we have
9
The fixed point theorem. If there is an interval [ a, b] such that g ∈ C1 [ a, b], g([ a, b]) ⊂ ( a, b) and
there exist a positive constant L < 1 such that | g0 ( x )| ≤ L < 1 for all x ∈ [ a, b] then the following
hold:
• g has a unique fixed point r in ( a, b).
• The fixed point iteration xk+1 = g( xk ) converges towards r for all starting values x0 ∈ [ a, b].
• The error ek+1 = r − xk+1 after iteration k + 1 satisfies the estimates
• | e k +1 | 6 L | e k | . . . error reduction rate,
L k +1
• | e k +1 | 6 | x1 − x0 | . . . a-priori error estimate.
1−L
L
• | e k +1 | 6 |x − xk | . . . a-posteriori error estimate.
L − 1 k +1
From the discussion above, we can draw some conclusions:
• The smaller the constant L, the faster the convergence.
• If | g0 (r )| < 1 then there will always be an interval (r − δ, r + δ) around r for some δ > 0 on
which all the conditions are satisfied, meaning that the iterations will always converge if x0
is sufficiently close to r.
• If | g0 (r )| > 1, a similar argumentation shows that the fixed point iterates move away from
the point r. In practice, this means that the fixed point iteration in this case will never con-
verge towards r.
• If the constant L is known, the a-priori error estimate can be used in order to estimate the
number of steps it takes in order to obtain an error that is smaller than some specified toler-
ance tol. To that end, we simply have to solve the inequality
L k +1
| x1 − x0 | ≤ tol
1−L
for k.
• The a-posteriori estimate can be used to estimate the actual error based on the size of the last
update | xk+1 − xk |.
• If the constant L is not known, one can estimate it using the ideas found in
preliminaries.ipynb. Then one can use this estimate of L in the a-posteriori error estimate
in order to estimate the actual error.
Example 3 revisited: Use the theory above to analyse the iteration scheme from Example 3,
where
x3 + x2 − 3 3x2 + 2x
g( x ) = , g0 ( x ) =
3 3
√
It is clear that g is differentiable. We already
√ know that
√ g has three fixed√points, r = ±
√ 3 and
r = −1. For the first two, we have that g0 ( 3) = 3 + 32 3 = 4.15 and g0 (− 3) = 3 − 32 3 = 1.85,
so the fixed point iterations will never converges towards those roots. But g0 (−1) = 1/3, so
we get convergence towards this root, given sufficiently good starting values. The figure below
demonstrates that the assumptions of the theorem are satisfied in some region around x = −1, for
example [−1.3, −0.7].
10
In [13]: plt.rcParams['figure.figsize'] = 10, 5 # Resize the figure for a nicer plot
11
In [15]: # run the fixed point iteration starting at x_0 = -4/3
x, nit = fixedpoint(g, x0=-2/3, tol=1.e-6, max_iter=30)
k = 0, x = -0.6666666667
k = 1, x = -0.9506172840
k = 2, x = -0.9851247206
k = 3, x = -0.9951879923
k = 4, x = -0.9984113972
k = 5, x = -0.9994721469
k = 6, x = -0.9998242347
k = 7, x = -0.9999414321
k = 8, x = -0.9999804797
k = 9, x = -0.9999934935
k = 10, x = -0.9999978312
k = 11, x = -0.9999992771
k = 12, x = -0.9999997590
Resultat:
x = -0.9999997590221911, antall iterasjoner=12
The plot to the left demonstrates the assumption g([ a, b]) ⊂ ( a, b), as the graph y = g( x ) enters
at the left boundary and exits at the right and does not leave the region [ a, b] anywhere in between.
The plot to the right shows that | g0 ( x )| ≤ | g0 ( a)| = L < 1 is satisfied in the interval.
b) Do a similar analysis on the three other iteration schemes suggested above, and confirm the
results numerically. The schemes are:
− x2 + 3x + 3
x = g2 ( x ) = ,
p x2
3
x = g3 ( x ) = 3 + 3x − x2 ,
r
3 + 3x − x2
x = g4 ( x ) =
x
Example 3 revisited: The fixed point theorem stated that |ek+1 | 6 L|ek | with L < 1 and thus
the fixed point iteration is linearly convergent.
12
# list of iterates
# errs: list of computed errors
# Output:
# list of computed eocs
# Print
for k in range(len(ps)):
print("k = {:3d}, eoc(k) = {:3.3f}".format(k+2, ps[k]))
return ps
We will also modify the fixedpoint() function so that it returns not only the last iteration, but
all iterations, so that we can compute the estimated order of convergence.
Now we can rerun example 3 with g( x ) = ( x3 + x2 − 3)/3 and compute the estimated order
of convergence:
In [19]: ## Compute the eoc for the fixed point method when solving example 3.
13
# Apply the fixed point scheme
x = fixedpoint_v2(g, x0=-2/3, tol=1.e-6, max_iter=30)
# Compute x_k - r,
r = -1 # exact solution
errs = x - r # error for each iterate
k = 0, x = -0.6666666667
k = 1, x = -0.9506172840
k = 2, x = -0.9851247206
k = 3, x = -0.9951879923
k = 4, x = -0.9984113972
k = 5, x = -0.9994721469
k = 6, x = -0.9998242347
k = 7, x = -0.9999414321
k = 8, x = -0.9999804797
k = 9, x = -0.9999934935
k = 10, x = -0.9999978312
k = 11, x = -0.9999992771
k = 12, x = -0.9999997590
14
ies, section 4) of g( xk ) around r:
1
ek+1 = r − xk+1 = g(r ) − g( xk ) = g(r ) − g(r − ek ) = − g0 (r )ek + g00 (ξ k )e2k .
2
If g0 (r ) = 0 and there is a constant M such that | g00 ( x )|/2 ≤ M, then
| e k +1 | ≤ M | e k | 2
and quadratic convergence has been achieved, see Preliminaries, section 3.1 on "Convergence of an
iterative process."
Question: Given the equation f ( x ) = 0 with an unknown solution r. Can we find a g with r as
a fixed point, satisfying g0 (r ) = 0?
Idea: Let
g ( x ) = x − h ( x ) f ( x ).
for some function h( x ). Independent of the choice of h( x ), r will be a fixed point of g. Choose h( x )
such that g0 (r ) = 0, that is
g0 ( x ) = 1 − h0 ( x ) f ( x ) − h( x ) f 0 ( x ) ⇒ g 0 (r ) = 1 − h (r ) f 0 (r )
That is, we have to choose h( x ) in such a way that h(r ) = 1/ f 0 (r ). There are several choices for h
with this property, but the most obvious one is the choose h( x ) = 1/ f 0 ( x ). The result is
Newton’s method.
• For k = 0, 1, 2, 3, . . .
f ( xk )
x k +1 = x k −
f 0 ( xk )
2.8.1 Implementation
The method is implemented in the function newton(). The iterations are terminated when | f ( xk )|
is less than a given tolerance.
15
break
x = x - fx/df(x) # Newton-iteration
print('k ={:3d}, x = {:18.15f}, f(x) = {:10.3e}'.format(k+1, x, f(x)))
return x, k+1
In [28]: # Example 4
def f(x): # The function f
return x**3+x**2-3*x-3
Result:
x=-1.0, number of iterations=7
| e k +1 | ≤ M | e k | 2 ,
where ek = r − xk is the error after k iterations. But under which conditions is this true, and can
we say something about the size of the constant M?
By a Taylor expansion of f (r ) around xk , we get:
1 00
0 = f (r ) = f ( xk ) + f 0 ( xk )(r − xk ) + f (ξ k )(r − xk )2 (Taylor series)
2
0 = f ( xk ) + f 0 ( xk )( xk+1 − xk ) (Newton’s method)
16
So we obtain quadratic convergence if f is twice differentiable around r, f 0 ( xk ) 6= 0, and x0 is
chosen sufficiently close to r. More precisely:
Theorem: Convergence of Newton iterations. Assume that the function f has a root r, and let
Iδ = [r − δ, r + δ] for some δ > 0. Assume further that
• f ∈ C2 ( Iδ ).
• There is a positive constant M such that
00
f (y)
• 0
≤ 2M, for all x, y ∈ Iδ .
f (x)
In this case, Newton’s iterations converge quadratically with
| e k +1 | ≤ M | e k | 2
for all starting values satisfying | x0 − r | < min{1/M, δ}.
Here the condition | x0 − r | ≤ 1/M guarantees that the error actually decreases in each step,
since we have this condition that |ek+1 | ≤ M|ek |2 < |ek |.
f 1 ( x1 , x2 , . . . , x n ) = 0
f 2 ( x1 , x2 , . . . , x n ) = 0
..
.
f n ( x1 , x2 , . . . , x n ) = 0
or short by
f(x) = 0.
where f : → Rn Rn .
Example 5: We are given the two equations
1
x13 − x2 + = 0,
4
x12 + x22 − 1 = 0.
This is illustrated in the following figure.
17
In [22]: plt.rcParams['figure.figsize'] = 6, 6 # Resize the figure for a nicer plot
The solutions of the two equations are the intersections of the two graphs. So there are two
solutions, one in the first and one in the third quadrant. We will use this as a test example in the
sequel.
18
3.1 Newton’s method for systems of equations
The idea of fixed point iterations can be extended to systems of equations. But it is in general
hard to find convergent schemes. So we will concentrate on the extension of Newton’s method to
systems of equations. And for the sake of illustration, we only discuss systems of two equations
and two unknowns written as
f ( x, y) = 0,
g( x, y) = 0,
to avoid getting completely lost in indices.
Let r = [r x , ry ] T be a solution to these equations and some x̂ = [ x̂, ŷ] T a known approximation
to r. We search for a better approximation. This can be done by replacing the nonlinear equation
f(x) = 0 by its linear approximation. This can be found by a multidimensional Taylor expansion
around x̂:
∂f ∂f
f ( x, y) = f ( x̂, ŷ) + ( x̂, ŷ)( x − x̂ ) + ( x̂, ŷ)(y − ŷ) + . . .
∂x ∂y
∂g ∂g
g( x, y) = g( x̂, ŷ) + ( x̂, ŷ)( x − x̂ ) + ( x̂, ŷ)(y − ŷ) + . . .
∂x ∂y
where the . . . represent higher order terms, which are small if x̂ ≈ x. By ignoring these terms we
get a linear approximation to f(x), and rather than solving the nonlinear original system, we can
solve its linear approximation:
∂f ∂f
f ( x̂, ŷ) + ( x̂, ŷ)( x − x̂ ) + ( x̂, ŷ)(y − ŷ) = 0
∂x ∂y
∂g ∂g
g( x̂, ŷ) + ( x̂, ŷ)( x − x̂ ) + ( x̂, ŷ)(y − ŷ) = 0
∂x ∂y
or more compact
f(x̂) + J (x̂)(x − x̂) = 0,
where the Jacobian J (x) is given by
∂f ∂f
!
∂x ( x, y ) ∂y ( x, y )
J (x) = ∂g ∂g
∂x ( x, y ) ∂y ( x, y )
It is to be hoped that the solution of the linear equation x provides a better approximation to r
than our initial guess x̂, so the process can be repeated, resulting in
• For k = 0, 1, 2, 3, . . .
• Let xk+1 = xk + ∆k .
19
The strategy can be generalized to systems of n equations in n unknowns, in which case the Jaco-
bian is given by:
∂ f1 ∂ f1 ∂ f1
∂x1 ( x ) ∂x2 ( x ) · · · ∂xn ( x )
∂ f2 ∂f ∂f
∂x1 (x) ∂x22 (x) · · · ∂x2n (x)
J (x) =
.. .. ..
.
. . .
∂ fn ∂ fn ∂ fn
∂x1 ( x ) ∂x2 ( x ) · · · ∂xn ( x )
3.1.1 Implementation
Newton’s method for system of equations is implemented in the function newton_system. The nu-
merical solution is accepted when all components of f(xk ) are smaller than a tolerance in absolute
value, that means when |f(xk)|{∞} < tol. See Preliminaries, section 1 for a description of norms.
Example 6: Solve the equations from Example 5 by Newton’s method, that is, the system
1
x13 − x2 + = 0,
4
x12 + x22 − 1 = 0.
The vector valued function f and the Jacobian J are in this case
x1 − x2 + 41 3x12 −1
3
f(x) = and J (x) =
x12 + x22 − 1 2x1 2x2
We already know that the system has 2 solutions, one in the first and one in the third quadrant. To
find the first one, choose x0 = [1, 1] T as starting value.
In [33]: # Example 6
20
# The vector valued function. Notice the indexing.
def f(x):
y = np.array([x[0]**3-x[1]+0.25,
x[0]**2+x[1]**2-1])
return y
# The Jacobian
def jac(x):
J = np.array([[3*x[0]**2, -1],
[2*x[0], 2*x[1]]])
return J
print('\nTest: f(x)={}'.format(f(x)))
if nit == max_iter-1:
print('Warning: Convergence has not been achieved')
k = 0, x = [ 1. 1.]
k = 1, x = [ 0.8125 0.6875]
k = 2, x = [ 0.750687815833801 0.663959854014599]
k = 3, x = [ 0.746302675769953 0.665623251157924]
k = 4, x = [ 0.746281278080405 0.665630719318386]
k = 5, x = [ 0.746281277575054 0.665630719499142]
xey = 1,
using x0 = y0 = −1.
− x2 + y = 1,
21
If n is large, each iteration is computationally expensive since the Jacobian is evaluated in each
iteration, and a linear system has to be solved. In practice, some modified and more efficient
version of Newton’s method will be used, maybe together with more robust but slow algorithms
for finding sufficiently good starting values.
22