
COS2633/LN2/3/2020

Tutorial letter LN2/3/2020

Numerical Methods I
COS2633

Semesters 1 & 2

Department of Mathematical Sciences

IMPORTANT INFORMATION:
Nonlinear equations:
This tutorial letter contains Lesson 2, additional notes on
solutions of nonlinear equations. Please read this information
along with Chapter 2 and Section 10.2 to supplement the
textbook.

1 Lesson 2: Solutions of Non-linear Equations
Reading: Chapter 2 and Section 10.2 of the textbook
1.1 Objectives and Outcomes
1.1.1 Objectives
The objective of this lesson is to highlight aspects of the study of solutions of equations in one or
more variables, supplementing Chapter 2 and Section 10.2 of the syllabus:

• To highlight the key points of the fixed point method;

• To highlight the key points of the Bisection method;

• To highlight the key points of Newton’s method for one equation, as well as a system of
equations;

• To highlight the key points of the Secant method and its variations;

• In studying each method to highlight key aspects of investigating convergence and error of
these methods;

• To highlight the key points of other methods used for specific purposes in the study of solutions of nonlinear equations.

1.1.2 Learning Outcomes


At the end of this lesson you should be able to:

• Understand how the following methods/algorithms/schemes are formulated (the theory):

– Bisection method;
– Fixed point method;
– Newton’s method and its extensions: Secant method; Regula falsi method;
– Horner’s method and Muller’s method;
– Newton’s method for systems of two equations in two unknowns.

• Illustrate the difference between the various methods geometrically.

• Implement the associated algorithms to solve nonlinear equations.

• Analyse the error and convergence of these methods.

• Identify the strengths and weaknesses of each algorithm.

• Identify and apply methods used to accelerate convergence.


1.2 Introduction
In this part of the course, the question of solving non-linear equations, which arise in many situations
where analytic methods cannot be used, is addressed by presenting a number of numerical
schemes/algorithms for this purpose. In most cases there are no exact analytical formulas for
solving such equations. Various schemes have been formulated to generate approximate solutions,
starting with simple ones and proceeding to other methods formulated for their advantages in
convergence and computational efficiency.

While the methods discussed in Chapter 2 are for equations in one variable, it is useful to view
Section 10.2 as an extension of Chapter 2, since it deals with extending Newton’s method to solve
a system of non-linear equations instead of one equation. While other methods can also be extended
to the case of functions of more than one variable, the focus of this module is on extending
Newton’s method only.

It is very important to note that the notes presented here do not in any way replace the need
to read the relevant chapter/section in the textbook. Similar summaries are in fact given on
the companion website of the textbook.

1.3 Nature of the problem


At the start of this section of study it is of paramount importance to understand the terminology used
for the discussion at hand. Note the distinction between the two problem/equation formulations:

f (x) = 0 (1)

and
x = g(x) (2)
also respectively referred to as the root problem and the fixed-point problem. The root problem
involves rearranging the given equation so that one side of the equation is 0. A solution of this
problem is the usual x-intercept of the graph of f(x). It is also worth noting the use of the terms
root, zero and fixed point in the context of these two equations.

For the fixed-point problem, one side of the equation (obtainable in a number of ways) is x. Many
of the techniques involve finding an expression g(x) that ’hopefully’ converges to a fixed point x
(or p, as per the terminology of the textbook). The solution process involves iterative computation of
g(x) to find successive approximations to the solution.

Geometrically, this would be the point of intersection of the graph of y = g(x) and the line y = x. The
iteration formula g(x) may converge to the required solution or diverge away from it, sometimes to
another solution if it exists, or simply away in an unbounded manner. The conditions for convergence
become part of the analysis. Such convergence sometimes depends on the choice of the
initial value of the approximation.

So in discussing each method, the elements of the process are in focus: formulation, implementation,
and analysis of error and convergence.

1.3.1 Other resources
The textbook authors have created a companion website that, in particular, has good animations
that demonstrate how the various methods of this chapter work. It is worth visiting this website for
an appreciation of the progression of each method from an initial point to the required solution. The
address of the companion website is given in Section 1.4 (p38) of the textbook:
https://sites.google.com/site/numericalanalysis1burden/
In what follows, a quick overview of the various methods for solving the root equation f (x) = 0 is
given.

1.4 Bisection Method


This method is for solving the problem in the form of equation (1) and only requires that the chosen
initial interval (a, b) contains the root, which is confirmed via the Intermediate Value Theorem.
The root p is approximated simply by halving the current interval. The calculation is then
followed by a decision of which endpoint, a or b, to discard, by reapplying the Intermediate Value
Theorem. The simple test based on the Intermediate Value Theorem for an interval (a, b) is that the
product f(a) · f(b) < 0.

So the Bisection method starts by identifying a suitable initial interval, applying the Intermediate
Value Theorem to a pair of values. It proceeds by taking the next iterate as the midpoint of the
current interval, checking the sign of the midpoint’s function value, and discarding the endpoint
whose function value has the same sign. This procedure is repeated successively; the subsequent
intervals get smaller and always enclose the desired root. Example 1 on p50 illustrates
the implementation of this method. Note that the function f(x) is only used in the intermediate
value test.
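A minimal Python sketch of the procedure just described may help; the function name, tolerance and iteration cap below are illustrative assumptions, not textbook prescriptions:

```python
def bisection(f, a, b, tol=1e-6, max_iter=100):
    """Bisection method for f(x) = 0 on [a, b], assuming f(a)*f(b) < 0."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        p = (a + b) / 2              # midpoint of the current interval
        if f(p) == 0 or (b - a) / 2 < tol:
            return p
        if f(a) * f(p) < 0:          # sign change in (a, p): discard b
            b = p
        else:                        # sign change in (p, b): discard a
            a = p
    return (a + b) / 2

# The cubic of Example 1 (p50): f(x) = x^3 + 4x^2 - 10 on [1, 2]
print(bisection(lambda x: x**3 + 4*x**2 - 10, 1, 2))   # ~1.365230
```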

1.4.1 Convergence
The nice thing about the Bisection Method is that the repeated application of the intermediate value
test after each calculation of the midpoint guarantees that the iterations converge to the root,
because the root remains bracketed by the endpoints at every step.

Furthermore, the convergence rate of the method is simple and given in Theorem 2.1.

1.4.2 Termination criterion


Since the next iterate is obtained simply as the midpoint of the current interval, it is possible to
determine the number of iterations n required to attain a specified tolerance ε by solving the inequality

    (b − a)/2^n < ε,   b > a    (3)

for n.

Example 2 on p52 illustrates this calculation.


Note that this calculation does not involve the function f (x).
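Solving (3) for n gives n > log2((b − a)/ε), so the smallest sufficient n is the ceiling of that value. A quick check with illustrative values:

```python
import math

a, b, tol = 1.0, 2.0, 1e-3                 # illustrative interval and tolerance
n = math.ceil(math.log2((b - a) / tol))    # smallest n with (b - a)/2**n <= tol
print(n)                                   # 10 iterations suffice here
```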


1.5 Fixed-point Iteration


Fixed-point iteration is the core of the discussion of the methods presented in Chapter 2. In simple
terms, a fixed-point iteration formula involves rearranging equation (1) in various ways to obtain
equation (2). The different ways of rearranging (1) lead to different expressions of g(x) in (2).

Theorem 2.3 states a sufficient condition for the existence and uniqueness of a fixed point:

(i) Existence: g ∈ C[a, b] and g(x) ∈ [a, b] for all x ∈ [a, b];
i.e. g is continuous on [a, b] and maps [a, b] into itself.

(ii) Uniqueness: |g′(x)| ≤ k < 1 for all x ∈ (a, b); i.e. the slope of g is bounded by a constant smaller than 1 on (a, b).

Note that both hypotheses of Theorem 2.3 must be satisfied.

Example
The root equation f(x) = 2x^3 − 7x + 2 = 0 can be rearranged into various expressions
of x = g(x), i.e. of (2), as shown in the table below. The initial value and the result
of repeated iterations are also given.

  Iteration formula                         Initial value   Result
  x_{i+1} = 2(x_i^3 + 1)/7                  1               Converges to 0.292893
                                            2               Diverges
  x_{i+1} = (7x_i − 2)/(2x_i^2)             1               Converges to 1.707106
                                            2               Converges to 1.707106
  x_{i+1} = ((7x_i − 2)/(2x_i))^(1/2)       1               Converges to 1.707106
                                            2               Converges to 1.707106
  x_{i+1} = ((7x_i − 2)/2)^(1/3)            1               Converges to 1.707106
                                            2               Converges to 1.707106

Since the given function f(x) is a polynomial of degree 3, it has 3 roots. The initial values chosen
led to two of the 3 roots. Other choices might have led to the other root. Also worth noting is that
you can think of other iteration formulae (2) and try them using different initial values.
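A minimal Python sketch of the fixed-point iteration, used here to reproduce two rows of the table (the function name and stopping rule are illustrative assumptions):

```python
def fixed_point(g, x0, tol=1e-6, max_iter=100):
    """Iterate x = g(x) from x0 until successive iterates agree to tol."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

g1 = lambda x: 2 * (x**3 + 1) / 7          # first rearrangement in the table
g2 = lambda x: (7 * x - 2) / (2 * x**2)    # second rearrangement

print(fixed_point(g1, 1))   # ~0.292893
print(fixed_point(g2, 1))   # ~1.707106
# fixed_point(g1, 2) diverges (the floats overflow), matching the table
```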

In a sense, the rest of the methods discussed for solving nonlinear equations entail finding different
expressions of g(x).

1.5.1 Convergence Criterion


The question of convergence for a method or formula that can involve different expressions of the
function g(x) may depend on the formula itself and on the initial value chosen for the iterations, as the
above example illustrates. Thus a common way of testing whether a given formula converges in a
given interval is required.

The Fixed-point Theorem (Theorem 2.4) is a simple test for convergence that can be applied to an
iterative formula. Simply stated, for a function g(x) that is continuous on [a, b], maps the given
interval into itself, and is differentiable on the given interval, the iteration converges if

    |g′(x)| ≤ k < 1,   for all x ∈ (a, b).    (4)

Note that the test is performed for a given interval (a, b).

Order of Convergence

The speed or rate of convergence of an iterative scheme is the next question.

Let ε_i be the error in the i-th iteration; i.e. ε_i = p_i − p.

If g(x) and its derivatives are known at p, then g(p_i) can be expanded in a Taylor series about p
to obtain

    g(p_i) = g(p) + g′(p)(p_i − p) + (g″(p)/2!)(p_i − p)^2 + (g‴(p)/3!)(p_i − p)^3 + · · ·    (5)
           = g(p) + g′(p)ε_i + (g″(p)/2!)ε_i^2 + · · ·    (6)

or equivalently

    p_{i+1} − p = g′(p)ε_i + (g″(p)/2!)ε_i^2 + (g‴(p)/3!)ε_i^3 + · · · = ε_{i+1}

since g(p_i) = p_{i+1}, g(p) = p, p_i − p = ε_i.

We observe that the error in the (i + 1)-th iterate has been expressed in terms of the i-th error.
Hence if ε_i is fairly small, ε_i^n gets smaller for higher powers n, causing the first term to be
dominant over the other terms. However, if ε_i is large, little can be said about convergence.

If g′(p) ≠ 0 then ε_{i+1} ∝ ε_i. If on the other hand g′(p) = 0 and g″(p) ≠ 0, then ε_{i+1} ∝ ε_i^2.

Definition: The order of convergence is the order of the lowest non-zero derivative of g(x) at x = p.

The order of convergence of an iterative method is not unique but depends on the iterative formula
being used. Thus the speed of convergence varies with the choice of the formula g(x). The higher
the order, the faster the convergence.

It therefore seems that the choice of the iteration function g(x) is crucial in finding the root of
equation (1).


Example

For
    f(x) = x^2 − 5x + 4 = 0, with roots p = 1, 4,
and
    x = g(x) = (x^2 + 4)/5, so that g′(x) = 2x/5,
giving
    g′(1) = 2/5 ≠ 0 and g′(4) = 8/5 ≠ 0,
hence the process is first order convergent.

A geometric test for convergence

We recall that geometrically, the solution of x = g(x) is essentially the value of x at the intersection
of the curve y = g(x) and the line y = x.
Geometrically, on the superimposed graphs of y = x and y = g(x), the test involves starting at the
point x_0 and first moving vertically until the curve y = g(x) is reached. From this point the movement
is horizontal until the line y = x is reached, then vertical to y = g(x) again, and so on, as shown in the
figures on p59 of the textbook.

The relationship between convergence and the slope g′(x) of g(x) is illustrated in Figure 2.6(b)
(monotonic) and Figure 2.6(a) (oscillating) in the textbook
(see also the animations on the textbook companion website).

The methods discussed starting from Section 2.3 are different expressions of the iterative formula
x = g(x) where the right-hand side g(x) is obtained in a systematic way.

1.6 Newton’s Method


Newton’s Method (also known simply as the Newton-Raphson Method) is popular for its fast convergence.
It makes use of the function f(x) from equation (1) and the iteration equation g(x) = x − f(x)/f′(x),
leading to the iterative formula

    x_{i+1} = x_i − f(x_i)/f′(x_i).    (7)
Geometrical Application

Geometrically, the procedure involves the following steps on the graph of f (x):

1. Picking an initial guess, x0 = p0 ;

2. Obtaining a new value x_1 = p_1 by drawing the tangent to the curve at the point x = x_0, y = f(x_0),
and extending it to the x-axis;

3. Repeating step 2 for another value x_2, using x_1 as the new guess. Other subsequent approximations
are obtained in the same way.

The result of carrying out these steps is illustrated in Figure 2.7 of the textbook.

Another formulation of Newton’s method is as follows.

Let ∆x = x0 − x1 so that x1 = x0 − ∆x.


The slope of the tangent to the curve at the point (x0 , f (x0 )) is

    f′(x_0) = f(x_0)/∆x

and hence

    x_1 = x_0 − f(x_0)/f′(x_0).
In general the equation of a line through (xk , f (xk )) with slope m is given by

y = f(x_k) + m(x − x_k).

Using this, the approximation of the root of f (x) = 0 satisfies

    0 = f(x_k) + m(x_{k+1} − x_k).

Solving for x_{k+1} yields

    x_{k+1} = x_k − f(x_k)/m = x_k − f(x_k)/f′(x_k) = g(x_k),

as in (7) above.
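A minimal Python sketch of the Newton iteration (7), with the derivative supplied explicitly (names and defaults are illustrative assumptions):

```python
def newton(f, df, x0, tol=1e-10, max_iter=50):
    """Newton's method: x_{i+1} = x_i - f(x_i)/f'(x_i)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / df(x)     # iteration formula (7)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# The quadratic from the earlier example: f(x) = x^2 - 5x + 4, root p = 1
print(newton(lambda x: x**2 - 5*x + 4, lambda x: 2*x - 5, x0=0))   # ~1.0
```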

1.6.1 Convergence
Note the special expression of the iterative formula x = g(x) in equation (7). The choice of g(x) is
such that its first derivative at x̄ (or p) vanishes. Hence the method is a second order method.

(See a more formal discussion of convergence using Newton’s Method on p69 of the textbook.)

1.6.2 Pitfalls of Newton’s Method


While Newton’s Method converges fast, it may sometimes fail to converge, or oscillate between two
values, for one of the following reasons:

1. There is no real root.

2. An iterate corresponds to a turning point (or is very close to one), as illustrated in Figure 1.

3. f(x) is symmetrical about the root, as shown in Figure 2.

4. x_0 is not close enough to p, so that some other part of the function "catches" the iteration.
This is illustrated in Figure 3.

[Figure 1: Flat tangent]

[Figure 2: Symmetric iterations]
5. The roots are too close to each other: the slope may be (nearly) horizontal, and the tangent then
fails to intersect the x-axis anywhere near the point.

If any of the above situations occurs, it may be necessary to use another criterion to stop the iterations.
One alternative is to stop after a predetermined number of iterations. Another criterion is to
stop when |x_{i+1} − x_i| < ε, where ε is some tolerance value, and f(x_i) = 0, or at least |f(x_i)| < ε.

[Figure 3: Runaway iterations]

1.7 Secant Method


This method is a modification of Newton’s method obtained by replacing the derivative in equation
(7) by the limit definition of f′(x) at x_k, assuming that x_k and x_{k−1} are close enough for the limit to
be dropped. The method involves two initial approximations x_0 and x_1 (or p_0 and p_1).

The iterative formula of the Secant method thus involves the two previous iterates x_{k−1} and x_k:

    x_{k+1} = x_k − f(x_k)(x_k − x_{k−1}) / (f(x_k) − f(x_{k−1})).
Geometrically, instead of using the tangent at a guess point xk , the method uses the secant line
through two current guess points, (xk , f (xk )) and (xk−1 , f (xk−1 )). Otherwise the idea of using the
intersection of this line with the x-axis as the next approximation is the same as for Newton’s
method.
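A minimal Python sketch of the Secant iteration based on the formula above (names and defaults are illustrative assumptions):

```python
def secant(f, x0, x1, tol=1e-10, max_iter=50):
    """Secant method: Newton's method with f'(x_k) replaced by the slope
    of the line through the last two iterates."""
    for _ in range(max_iter):
        fx0, fx1 = f(x0), f(x1)
        x2 = x1 - fx1 * (x1 - x0) / (fx1 - fx0)   # secant step
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    raise RuntimeError("no convergence within max_iter iterations")

print(secant(lambda x: 2*x**3 - 7*x + 2, 1, 2))   # ~1.707107
```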

1.7.1 Convergence
Although this method is efficient, convergence to the required root is not guaranteed. The next
method addresses this shortfall of the Secant method.

1.8 Method of False Position


To improve on the uncertainty of the Secant Method, the method of false position (also called the
Regula Falsi Method) uses the Intermediate Value Theorem to ensure that the next choice of the
endpoints of the interval brackets (i.e. contains) the root being sought.

Geometrically, the application process involves the following steps:


1. Choosing two initial guesses, say x_0 = a and x_1 = b, such that f(a) and f(b) have opposite
signs; i.e. f(a)f(b) < 0;

2. Joining the points (a, f(a)) and (b, f(b)) with a straight line (the secant line);

3. Identifying the point x_M at the x-intercept of the secant line in 2 above;

4. Examining the value of f(x_M), comparing it with f(a) and f(b), and if

(a) f(x_M)f(a) > 0, setting a = x_M;

(b) f(x_M)f(b) > 0, setting b = x_M;

5. Repeating the process using x_M and either a or b as the new (endpoint) values.

Note that the Secant and Regula Falsi methods have the same iterative formula, but the latter has
the bracketing test as an extra step.
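A minimal Python sketch of the Regula Falsi loop just described, keeping the root bracketed at every step (names, tolerance and the test function are illustrative assumptions):

```python
def false_position(f, a, b, tol=1e-10, max_iter=100):
    """Regula Falsi: secant step, but the root stays bracketed in [a, b]."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        x_m = b - f(b) * (b - a) / (f(b) - f(a))   # x-intercept of the secant line
        if abs(f(x_m)) < tol:
            return x_m
        if f(x_m) * f(a) > 0:    # root lies in (x_m, b)
            a = x_m
        else:                    # root lies in (a, x_m)
            b = x_m
    raise RuntimeError("no convergence within max_iter iterations")

print(false_position(lambda x: 2*x**3 - 7*x + 2, 1, 2))   # ~1.707107
```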

The Bisection and Regula Falsi methods have the following disadvantages to be aware of when
using them:

(i) Calculation of the intermediate point becomes tricky when the other two approximations on
either side are close together;

(ii) Round-off error may cause problems as the intermediate point approaches the root: since
f(x_M) may be slightly in error, it may be positive when it should be negative, or vice versa.

(iii) When the three current points are close to the root, their functional values may become very
small; i.e. the evaluations may result in an underflow when testing for signs by multiplication
of two small numbers, hence the test may fail.

Newton’s method may often be used to refine (or speed up) an answer obtained by other methods
such as the Bisection method.

1.9 A Note on Convergence


While a brief discussion of convergence has been included for some of the methods discussed,
Section 2.4 of the textbook discusses the notion of order of convergence for iterative methods
in general. A definition of order of convergence is given in terms of a general sequence. Two
orders are highlighted, linear (order 1) and quadratic (order 2), as perhaps the most common orders of
convergence, with quadratic convergence giving satisfactorily fast convergence where it holds. A
few theorems are also given for testing convergence of the iterative formula p_n = g(p_{n−1}) and for
the existence of a root p of f(x) in an interval (a, b). This section is worth reading in detail and can
be useful in determining the existence of roots and analysing the convergence of iterative methods
whose convergence analysis is not specifically covered.

1.9.1 Speeding up Convergence


Having introduced the notion of convergence in the above section, Section 2.5 presents two methods,
namely Aitken’s and Steffensen’s methods, that can be used to speed up the convergence of
methods known to converge only linearly. Of particular note is that the formulae given refine
the n-th iterate p_n, and hence are in terms of p_n, p_{n+1}, p_{n+2}, n ≥ 0. Thus their application
necessitates starting with three values.
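For instance, Aitken’s ∆² formula refines p_n from the three iterates p_n, p_{n+1}, p_{n+2}. A minimal sketch, applied to the linearly convergent fixed-point iteration from the earlier example (names and the starting value are illustrative assumptions):

```python
def aitken(p0, p1, p2):
    """Aitken's delta-squared refinement of p0 from three successive iterates."""
    return p0 - (p1 - p0) ** 2 / (p2 - 2 * p1 + p0)

# Three successive fixed-point iterates of g(x) = 2(x^3 + 1)/7 from x = 0.5
g = lambda x: 2 * (x**3 + 1) / 7
p0 = 0.5
p1 = g(p0)
p2 = g(p1)
p3 = g(p2)
print(p3)                   # plain iteration: ~0.293064
print(aitken(p1, p2, p3))   # refined: ~0.292875, closer to the limit 0.292893
```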

1.10 Methods for Polynomial Functions


Horner’s method and Muller’s method have been presented as useful techniques for handling polynomial
functions, with specific mention of the possible existence of complex roots for Muller’s method.

1.10.1 Horner’s Method
A major strength of Horner’s method is the computational economy embedded in it. The main
objective of using this method is to economise on computations by using synthetic division to
evaluate P(x_i) and P′(x_i), which uses only the coefficients of P(x).

It is used in the evaluation of the polynomial values when Newton’s method is applied to a polynomial
equation P(x) = 0, whose Newton iterative formula would be

    x_{i+1} = x_i − P(x_i)/P′(x_i).

Example 2 on p93 illustrates the use of this method.
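As an illustration, a minimal Python sketch of synthetic division that evaluates both P(x_0) and P′(x_0) from the coefficients alone (the function name and the descending-powers coefficient convention are illustrative assumptions):

```python
def horner(coeffs, x0):
    """Evaluate P(x0) and P'(x0) by synthetic division.
    coeffs = [a_n, ..., a_1, a_0], descending powers."""
    p = coeffs[0]     # running value of P via nested multiplication
    dp = 0.0          # running value of P' built up alongside
    for a in coeffs[1:]:
        dp = dp * x0 + p
        p = p * x0 + a
    return p, dp

# P(x) = 2x^3 - 7x + 2 at x = 2: P(2) = 4, P'(2) = 17
print(horner([2, 0, -7, 2], 2))
```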

1.10.2 Muller’s Method


Although Muller’s method is particularly useful in handling complex roots where the root equation
is a polynomial equation, it can be used for any root-finding problem.

Unlike the other methods, which are based on the intersection of a line with the x-axis to find the next
approximation, Muller’s method uses the intersection of a parabola with the x-axis to find the next
approximation. Intuitively, a quadratic approximation of a polynomial is better than a linear
approximation, so Muller’s method is considered a stronger choice for approximating the next iterate.

Its formulation is based on using a quadratic polynomial of the form

    P(x) = a(x − p_2)^2 + b(x − p_2) + c

that passes through the points corresponding to three approximate points x_{i−2}, x_{i−1} and x_i,
i = 2, 3, . . . at a time. This requirement is used to obtain the values of the coefficients a, b and c. The
iterative formula uses these values, and has the form (see top of p97)

    p_3 = p_2 − 2c / (b + sgn(b)·√(b^2 − 4ac)).
The iterative process continues by discarding x_{i−2} and using the new iterate together with the other
two to repeat the computation of new values of a, b and c.
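A minimal sketch of a single Muller step, fitting the parabola through the three current points and applying the sgn(b) rule in the denominator; complex arithmetic is used so that complex roots can emerge (names and the test polynomial are illustrative assumptions):

```python
import cmath

def muller_step(f, p0, p1, p2):
    """One Muller iteration: coefficients of a(x - p2)^2 + b(x - p2) + c
    via divided differences, then the root of the parabola nearest p2."""
    h1, h2 = p1 - p0, p2 - p1
    d1, d2 = (f(p1) - f(p0)) / h1, (f(p2) - f(p1)) / h2
    a = (d2 - d1) / (h2 + h1)
    b = a * h2 + d2
    c = f(p2)
    disc = cmath.sqrt(b * b - 4 * a * c)
    # sgn(b) rule: pick the sign that maximises |denominator|
    denom = b + disc if abs(b + disc) > abs(b - disc) else b - disc
    return p2 - 2 * c / denom

f = lambda x: 2 * x**3 - 7 * x + 2
p0, p1, p2 = 0.5, 1.0, 1.5
for _ in range(6):
    p0, p1, p2 = p1, p2, muller_step(f, p0, p1, p2)
print(p2)   # ~(1.707107+0j): the real root, with zero imaginary part
```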

1.11 Newton’s Method for Multi-variable Functions


In Chapter 2 the focus of solving nonlinear equations is on functions of a single variable. While
Newton’s method is still fresh, it is worth looking at how it is applied in the case where the function
whose root is sought involves several variables. Section 10.2 addresses this issue.

We highlight the similarities and differences in Newton’s formula in the two cases, using a function
F(x, y) or F(x_1, x_2) or F(x), where x = (x_1, x_2)^T.


While Newton’s method for f(x) = 0 uses the iteration (fixed-point) formula

    x = g(x) = x − φ(x)f(x), with φ(x) = 1/f′(x),

Newton’s method for a function F(x_1, x_2) = 0 of two variables,

    F(x_1, x_2) = [ F_1(x_1, x_2) ]
                  [ F_2(x_1, x_2) ],

uses

    x = G(x) = x − J_F(x)^{-1} F(x)    (8)
where J_F(x)^{-1} is now the inverse of a matrix of partial derivatives with respect to each of the
variables x_1 and x_2. This matrix of derivatives is called the Jacobian of the function F(x_1, x_2), and
is defined by

    J_F(x_1, x_2) = [ ∂F_1(x_1, x_2)/∂x_1    ∂F_1(x_1, x_2)/∂x_2 ]
                    [ ∂F_2(x_1, x_2)/∂x_1    ∂F_2(x_1, x_2)/∂x_2 ]

That is,

    J_F(x)^{-1} = [ ∂F_1(x_1, x_2)/∂x_1    ∂F_1(x_1, x_2)/∂x_2 ]^{-1}
                  [ ∂F_2(x_1, x_2)/∂x_1    ∂F_2(x_1, x_2)/∂x_2 ]

Finding the inverse of this matrix follows standard methods used in linear algebra.

Briefly, for a function of two variables this is a 2 × 2 matrix whose inverse is relatively easy to find
using techniques for finding the inverse of a general 2 × 2 matrix

    A = [ a  b ]
        [ c  d ].

For such a matrix

    A^{-1} = (1/|A|) [ d  −b ]  =  (1/(ad − bc)) [ d  −b ]
                     [ −c  a ]                   [ −c  a ]

For a discussion of the iterative equation (8) and the Jacobian of a general n-variable function
F(x_1, x_2, . . . , x_n), see p652. The inverse of such a matrix can be computed using computations
similar to those for the inverse of an n × n matrix in linear algebra.
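A minimal sketch of Newton’s method for a 2 × 2 system, applying the explicit 2 × 2 inverse formula above at each step (the test system, names and defaults are illustrative assumptions):

```python
def newton_2x2(F1, F2, J, x1, x2, tol=1e-10, max_iter=50):
    """Newton's method for F(x) = 0 in two variables:
    x_new = x - J(x)^{-1} F(x), with the 2x2 inverse written out."""
    for _ in range(max_iter):
        a, b, c, d = J(x1, x2)           # Jacobian entries, row by row
        det = a * d - b * c
        f1, f2 = F1(x1, x2), F2(x1, x2)
        dx1 = (d * f1 - b * f2) / det    # first component of J^{-1} F
        dx2 = (-c * f1 + a * f2) / det   # second component of J^{-1} F
        x1, x2 = x1 - dx1, x2 - dx2
        if abs(dx1) + abs(dx2) < tol:
            return x1, x2
    raise RuntimeError("no convergence within max_iter iterations")

# Illustrative system: x1^2 + x2^2 - 4 = 0 and x1*x2 - 1 = 0
F1 = lambda x1, x2: x1**2 + x2**2 - 4
F2 = lambda x1, x2: x1 * x2 - 1
J = lambda x1, x2: (2*x1, 2*x2, x2, x1)   # dF1/dx1, dF1/dx2, dF2/dx1, dF2/dx2
print(newton_2x2(F1, F2, J, 2.0, 0.5))    # ~(1.931852, 0.517638)
```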

1.12 Notations and Terminology


Each chapter concludes with a section on numerical software and a chapter review. The notion of
an ’algorithm’ has been alluded to in previous tutorial letters. It is worth emphasizing that the given
algorithms are a general guide for implementing a technique using any software. They are to be
translated accordingly to suit the language of the software of your choice.

Finally, it is worth noting some notations and terminology used in this chapter that may recur
in subsequent chapters. An understanding of these will make subsequent reading and study more
fluent.

• {p_n}_{n=1}^∞ denotes a sequence whose terms are p_n for n starting from 1 to ∞.

• TOL is the textbook notation for ’tolerance’ or desired error.

• O(·), usually used in discussing convergence, denotes the order (or rate) of convergence.

• sgn(x) is the sign of x (see p52).

• ∆p_n denotes the forward difference: the current value subtracted from the next, i.e. ∆p_n = p_{n+1} − p_n.
• p_i^(k) denotes the value of the iterate p_i in the k-th iteration/step.

• x = (x1 , x2 , · · · , xn ) denotes an n-component vector.

• |A| = det(A) is the determinant of the n × n matrix A.

• ‖x‖_m denotes the m-norm of the vector x.

This list of terms and notation may not be exhaustive, but it is included here to emphasise the
importance of understanding the notation used in the text (or any text, for that matter) so that it does
not get in your way when studying.
