CHAPTER
TWO

ROOTS OF NON-LINEAR EQUATIONS

Contents
2.1 Categories of Non-Linear Equations . . . . . . . . . . 13
2.2 Methods to find Roots of Non-Polynomial Equations . . . . . . . . . . 14
2.3 Bracketing Methods . . . . . . . . . . 14
2.4 Open Methods . . . . . . . . . . 17
2.5 Roots of Polynomial Equations . . . . . . . . . . 20

Important Note: Dear Students, these lecture notes are “Supplementary”, which means they are not meant to replace your textbook, but only provide supporting material to the classroom lectures, in a summarized manner. You should gain enough knowledge from other sources to be in a position to elaborate on the topics and material presented here.¹
Of course, you know it! You remember it from your school days. It is the famous Quadratic Equation. A Quadratic Equation has two solutions:

x = (−b ± √(b² − 4ac)) / (2a)    (2.2)

These two solutions are known as the Roots of Equation 2.1, or the Zeros of the Equation.
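As a quick check, Equation 2.2 can be evaluated directly. The sketch below is illustrative only; the function name and the sample coefficients are my own choices (the sample polynomial is the x² − 3x − 7 solved again later in this chapter):

```python
import math

def quadratic_roots(a, b, c):
    """Both roots of a*x**2 + b*x + c = 0 via Equation 2.2.
    Assumes real roots, i.e. b**2 - 4*a*c >= 0."""
    disc = math.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

r1, r2 = quadratic_roots(1, -3, -7)   # roots of x**2 - 3x - 7 = 0
print(round(r1, 4), round(r2, 4))     # approximately 4.5414 and -1.5414
```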
Quadratic Equation 2.1 falls into the broad category of Polynomial Equations. More precisely, it should be called a Second Order Polynomial Equation. Similarly, another famous form of Polynomial Equation is the Third Order Polynomial Equation (which is a Cubic Polynomial Equation):

f(x) = ax³ + bx² + cx + d = 0    (2.3)
So, an nth-Order Polynomial Equation (a.k.a. nth-Degree Polynomial Equation) can be generalized as:

f(x) = c₁xⁿ + c₂xⁿ⁻¹ + c₃xⁿ⁻² + · · · + cₙx + cₙ₊₁ = 0    (2.4)

Hence, as per this definition, a First Order Polynomial Equation should take the form:

f(x) = ax + b = 0    (2.5)

It is commonly known as a Linear Equation, and written in the form:

y = mx + c    (2.6)
Other than the Polynomial Equations there are many other forms of equations. These include equations involving trigonometric functions, logarithmic functions, exponential functions, and many other less common functions. Examples of such equations are:

f(x) = e⁻ˣ − x    (2.7)
f(x) = sin² x + cos² x    (2.8)
f(x) = ln |x³| + e⁻ˣ    (2.9)

Equations 2.7, 2.8, 2.9, 2.10, and similar, fall into the category of Non-Polynomial Equations (a.k.a. Transcendental Equations).
As can be observed easily, the solution to Linear Equation 2.5 is obvious (i.e., x = −b/a) and does not require any considerable effort. But any other Polynomial Equation (of degree 2, 3, ..., n) and all Non-Polynomial Equations demand at least some minimum effort to calculate the roots. We shall call these two categories Non-Linear Equations.

Hence, in the rest of this chapter we will consider the solution of all Polynomial Equations of degree 2 or higher, and of all Non-Polynomial Equations. In other words, in the rest of this chapter, we will solve Non-Linear Equations.
Bracketing Methods employ two initial guesses to reach the solution of the equation. These two guesses must bound the root of the equation. In other words, the two guesses must lie on opposite sides of the root of the equation. Once the guesses are correctly suggested, as per this requirement, Bracketing Methods will SURELY converge to the root of the equation.

The other category, Open Methods, does not carry any such requirement with regard to initial guesses. Hence, with regard to initial guesses, Open Methods are more flexible than Bracketing Methods. But this flexibility also results in a drawback of Open Methods: there is NO SURETY that the method will converge to the solution. However, this is not a serious drawback, because it can easily be overcome by restarting the method with a different initial guess.

Open Methods are also superior to Bracketing Methods in another aspect: several Open Methods need a single guess, as opposed to the two guesses needed in Bracketing Methods.
Finally, it is worth mentioning that, although there exist special faster and/or easier methods to find the roots of Polynomial Equations (like Muller’s Method, which we will study at the end of this chapter), both Bracketing Methods and Open Methods are equally well-suited to find the roots of Polynomial Equations too. Hence, we will apply these methods to all Non-Linear Equations.

It is equally worth mentioning that both Bracketing Methods and Open Methods would (normally) find a single root of an equation. To find multiple roots, the initial guesses must be changed, so that the method converges to a different root.

(Note: Open Methods can be modified and adapted to find multiple roots, to solve Systems of Linear Equations, as well as to solve Systems of Non-Linear Equations, but, because it is outside our syllabus, we will not study those extended methods in our course.)
All the Bracketing Methods follow the same algorithm, except for the Transformation step, where the method decides how to calculate c from a and b.

2.3.1 Bisection Method

Example 2.1: Find the root of equation x cos(x) = −1, within a tolerance = 10⁻², using Bisection Method.
Solution: To find the root precisely, up to the second decimal place, a good initial guess is required. This can be done easily by first plotting the graph of the equation (as shown in Figure 2.2 on page 15). The graph shows that the function has a root in the interval [−2, 4].

Hence, set a = −2 and b = 4. Calculate the mid-point c = (a + b)/2 = ((−2) + (4))/2 = 1. Calculate the ordinates for a, b, and c: f(a) = (−2) cos(−2) + 1 = 1.8323, f(b) = (4) cos(4) + 1 = −1.6146, and f(c) = (1) cos(1) + 1 = 1.5403. To determine whether the root lies in the range [a, c] or [b, c], calculate the products f(a)f(c) and f(b)f(c): f(a)f(c) = 2.8223 > 0, f(b)f(c) = −2.4869 < 0. f(b)f(c) < 0 indicates that the root lies in the range [b, c] and not in the range [a, c]. Hence, discard the range [a, c] by considering the new range as a = c = 1 and b = 4, i.e. [a, b] = [1, 4]. Calculate the error as: ε = (b − a)/2 = (4 − (−2))/2 = 3. Because the error (ε) is more than the tolerance = 10⁻², repeat the whole process.

[Figure 2.1 – Concept of Bisection Method]
Calculate the mid-point c = (a + b)/2 = ((1) + (4))/2 = 2.5. Calculate the ordinates for a, b, and c: f(a) = (1) cos(1) + 1 = 1.5403, f(b) = (4) cos(4) + 1 = −1.6146, and f(c) = (2.5) cos(2.5) + 1 = −1.0029. To determine whether the root lies in the range [a, c] or [b, c], calculate the products f(a)f(c) and f(b)f(c): f(a)f(c) = −1.5447 < 0, f(b)f(c) = 1.6192 > 0. f(a)f(c) < 0 indicates that the root lies in the range [a, c] and not in the range [b, c]. Hence, discard the range [b, c] by considering the new range as a = 1 and b = c = 2.5, i.e. [a, b] = [1, 2.5]. Calculate the error as: ε = (b − a)/2 = (4 − 1)/2 = 1.5. Because the error (ε) is still more than the tolerance = 10⁻², once again repeat the whole process.

[Figure 2.2 – Graph of x cos(x) + 1]
B.Sc. 2nd Yr. (COE) · astrax111@gmail.com · ahsan.academic@gmail.com · Page 15 of 38
Instead of doing individual calculations it is much more intuitive to prepare Table 2.1 (as shown on page 16). In each iteration, calculate the approximate error ε = (b − a)/2, and compare it with the given tolerance = 10⁻². As soon as ε < tolerance, the procedure stops.
As can be seen from Table 2.1 (on page 16), the procedure attains the value of c = 2.0723 after 10 iterations, which has the error ε = 0.0059, which is within our tolerance level of 10⁻².

Table 2.1 – Bisection Method: Solve x cos(x) = −1

Iteration   a         b        c        f(a)     f(b)     f(c)     f(a)f(c)   f(b)f(c)   ε
1           -2.0000   4.0000   1.0000   1.8323   -1.6146   1.5403   2.8223    -2.4869    3.0000
2            1.0000   4.0000   2.5000   1.5403   -1.6146  -1.0029  -1.5447     1.6192    1.5000
3            1.0000   2.5000   1.7500   1.5403   -1.0029   0.6881   1.0598    -0.6900    0.7500
4            1.7500   2.5000   2.1250   0.6881   -1.0029  -0.1183  -0.0814     0.1187    0.3750
5            1.7500   2.1250   1.9375   0.6881   -0.1183   0.3053   0.2101    -0.0361    0.1875
6            1.9375   2.1250   2.0313   0.3053   -0.1183   0.0973   0.0297    -0.0115    0.0938
7            2.0313   2.1250   2.0781   0.0973   -0.1183  -0.0096  -0.0009     0.0011    0.0469
8            2.0313   2.0781   2.0547   0.0973   -0.0096   0.0441   0.0043    -0.0004    0.0234
9            2.0547   2.0781   2.0664   0.0441   -0.0096   0.0173   0.0008    -0.0002    0.0117
10           2.0664   2.0781   2.0723   0.0173   -0.0096   0.0038   0.0001     0.0000    0.0059

It can also be observed from Table 2.1 (on page 16) that the error gets exactly halved in each iteration. Because of this fact, it is possible to estimate, ahead of the procedure, the number of iterations required to achieve a desired precision (i.e., a root within tolerance). That is, in iteration 1, the error is ε = (b − a)/2. In iteration 2, the error is ε = (b − a)/2². In iteration 3, the error is ε = (b − a)/2³. Hence, to generalize, in iteration N, the error is ε = (b − a)/2ᴺ. To stop the procedure, the condition to be satisfied is:
ε < tol  ⟹  tol > ε  ⟹  tol > (b − a)/2ᴺ

Solving the inequality for N yields:

N > (ln(b − a) − ln(tol)) / ln(2)

Hence, if the procedure starts with initial estimates a and b, and the desired tolerance is tol, then the number of iterations, N, after which the desired precision is achieved by the Bisection Method, is given by:

N = (ln(b − a) − ln(tol)) / ln(2) = log₂((b − a)/tol)

In the example, N = (ln(6) − ln(10⁻²)) / ln(2) ≈ 9.23, so N = 10 iterations are required, which agrees with Table 2.1.
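The whole Bisection procedure above can be sketched in a few lines of Python (a sketch; the function and variable names are my own):

```python
import math

def f(x):
    # the example equation x*cos(x) = -1 rewritten as f(x) = x*cos(x) + 1 = 0
    return x * math.cos(x) + 1

def bisection(f, a, b, tol):
    """Bisection: halve [a, b] until the approximate error (b - a)/2 < tol.
    Requires that a and b bracket a root, i.e. f(a)*f(b) < 0."""
    assert f(a) * f(b) < 0, "initial guesses must bracket the root"
    n = 0
    while True:
        n += 1
        c = (a + b) / 2          # mid-point
        eps = (b - a) / 2        # approximate error, halved every iteration
        if f(a) * f(c) < 0:      # root lies in [a, c]
            b = c
        else:                    # root lies in [c, b]
            a = c
        if eps < tol:
            return c, n

root, iters = bisection(f, -2, 4, 1e-2)
print(round(root, 4), iters)     # reproduces c = 2.0723 in 10 iterations
```

Starting from [−2, 4] with tol = 10⁻², this reproduces the c = 2.0723 of Table 2.1 in exactly the 10 iterations predicted by the formula for N.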
2.3.2 False-Position Method

The False-Position Method is another bracketing method to find a root of f(x) = 0. Once again, it is assumed that f(x) is continuous on an interval [a, b] and has a root there, so that f(a) and f(b) have opposite signs, or, equivalently, f(a)f(b) < 0.

The procedure is geometrical in nature, and is described as follows. Let [a₁, b₁] = [a, b] be the initial range that brackets the root. Connect the points A : (a₁, f(a₁)) and B : (b₁, f(b₁)) by a straight line, as shown in Figure 2.3 (on page 16). Let c₁ be this line’s x-intercept. Then, if f(a₁)f(c₁) < 0, the range [a₁, c₁] brackets the root. Otherwise, the root is in the range [c₁, b₁]. In Figure 2.3 (on page 16), it just so happens that [a₁, c₁] brackets the root. Hence, treat the range [a₁, c₁] as [a, b] for the next iteration and repeat the process by calculating the new x-intercept c₂.

[Figure 2.3 – Concept of False-Position Method]

Continuing this process generates a sequence c₁, c₂, c₃, · · · that eventually converges to the root.
Analytically, the procedure can be described as follows. The equation of the line connecting points A and B is:

(y − f(b₁)) / (x − b₁) = (f(a₁) − f(b₁)) / (a₁ − b₁)

Generalizing this result, the sequence of points that converges to the root is generated via

cₙ = (aₙ f(bₙ) − bₙ f(aₙ)) / (f(bₙ) − f(aₙ)),   n = 1, 2, 3, · · ·    (2.11)
Example 2.2: Find the root of equation x cos(x) = −1, within a tolerance = 10⁻², using False-Position Method.
Solution: The procedure is exactly the same as that used in the Bisection Method, except that the calculation of c is different; here, Equation 2.11 is used to calculate c, and ε is calculated as the difference of two consecutive values of c.

Table 2.2 – False-Position Method: Solve x cos(x) = −1

Iteration   a         b        c        f(a)     f(b)      f(c)      f(a)f(c)   f(b)f(c)   ε
1           -2.0000   4.0000   1.1895   1.8323   -1.6146    1.4426    2.6434    -2.3293    -
2            1.1895   4.0000   2.5157   1.4426   -1.6146   -1.0389   -1.4987     1.6773    1.3262
3            1.1895   2.5157   1.9605   1.4426   -1.0389    0.2552    0.3681    -0.2651    0.5552
4            1.9605   2.5157   2.0700   0.2552   -1.0389    0.0091    0.0023    -0.0094    0.1095
5            2.0700   2.5157   2.0738   0.0091   -1.0389    0.0002    0.0000    -0.0002    0.0039

As can be seen from Table 2.2 (on page 17), the procedure attains a value of c = 2.0738 after 5 iterations, which has an error of ε = 0.0039, and is within the prescribed tolerance level of 10⁻².
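A sketch of the same procedure in Python (names are my own; Equation 2.11 supplies c, and ε is the difference of two consecutive c values):

```python
import math

def false_position(f, a, b, tol=1e-2, max_iter=50):
    """False-Position: c is the x-intercept of the chord through
    (a, f(a)) and (b, f(b)); stop when two consecutive c values agree."""
    c_old = None
    for n in range(1, max_iter + 1):
        c = (a * f(b) - b * f(a)) / (f(b) - f(a))   # Equation 2.11
        if c_old is not None and abs(c - c_old) < tol:
            return c, n
        if f(a) * f(c) < 0:    # root lies in [a, c]
            b = c
        else:                  # root lies in [c, b]
            a = c
        c_old = c
    raise RuntimeError("did not converge within max_iter iterations")

root, iters = false_position(lambda x: x * math.cos(x) + 1, -2.0, 4.0)
print(round(root, 4), iters)
```

Starting from [−2, 4] this reaches the root in 5 iterations, far fewer than the 10 needed by Bisection for the same tolerance.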
2.4 Open Methods

As opposed to Bracketing Methods, the Open Methods do not guarantee convergence to a root. But still, each of the Open Methods demands some pre-conditions for convergence; if these conditions are fulfilled, then they do converge to the root.

Most of the Open Methods require only one initial guess (as opposed to the two initial guesses required in Bracketing Methods), but this initial guess should be close enough to the root, so that the procedure converges to the root. As such, the flexibility to choose the initial guess is greater than in Bracketing Methods. (The Secant Method is an Open Method which requires two initial guesses, but, unlike the requirement of Bracketing Methods that the two guesses should always fall on opposite sides of the root, the Secant Method, like all the other Open Methods, does not impose any such condition.)

Lastly, the Rate of Convergence of Open Methods is usually higher when compared to any of the Bracketing Methods. In other words, the total number of iterations required to converge to a root using Open Methods is much less than that required using Bracketing Methods.
It is observed that g(x) has only one fixed point, which is the only root of the original equation. It should be noted that for a given equation f(x) = 0 there usually exists more than one Iteration Function. For instance, e^(−x/2) − x = 0 can also be rewritten as x = −2 ln(x), so that g(x) = −2 ln(x).
The Fixed-Point of g(x) is found numerically via the Fixed-Point Iteration:

xₙ₊₁ = g(xₙ),   where n = 1, 2, 3, · · · and x₁ is the initial guess

The procedure begins with an initial guess x₁ near the Fixed-Point. The next point x₂ is found by evaluating g(x₁). Similarly, x₃ is found by evaluating g(x₂), then x₄, x₅, and so on. This continues until convergence is observed, that is, until two successive points are within a prescribed distance of each other:

ε = |xₙ₊₁ − xₙ| < tol

[Figure 2.5 – Fixed-Point Iteration Method: Types of Convergence, (a) Monotone, (b) Oscillatory]
In the oscillatory case, the elements bounce from one side of the Fixed-Point to the other as they approach it.

Convergence of Fixed-Point Iteration: It can be shown that if |g′(x)| < 1 near a Fixed-Point of g(x), then convergence is guaranteed. Note that this is a sufficient, and not a necessary, condition for convergence.
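The iteration xₙ₊₁ = g(xₙ) with this stopping test can be sketched as follows (a minimal sketch; names are my own, and the g(x) = 2⁻ˣ used here previews the next example):

```python
def fixed_point(g, x, tol=1e-4, max_iter=100):
    """Fixed-Point Iteration: repeat x <- g(x) until |g(x) - x| < tol."""
    for n in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    raise RuntimeError("did not converge within max_iter iterations")

# solve x = 2**(-x), i.e. x - 2**(-x) = 0, starting from x = 0
root, iters = fixed_point(lambda x: 2 ** (-x), 0.0)
print(round(root, 4))
```

Here |g′(x)| = |−ln(2)·2⁻ˣ| ≈ 0.44 near the Fixed-Point, so the sufficient condition above holds and the iteration converges.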
Example 2.3: Solve x − 2⁻ˣ = 0 using Fixed-Point Iteration Method

Find a root of equation x − 2⁻ˣ = 0, within a tolerance = 10⁻⁴, using Fixed-Point Iteration Method.

[Figure 2.6 – Graph of x − 2⁻ˣ]
Solution: Rewrite the equation as x = 2⁻ˣ so that g(x) = 2⁻ˣ. The Fixed-Point can be roughly located as in Figure 2.6 (on page 18). To proceed, we can start with an initial guess of x = 0, and prepare, as before, a table of iterations as shown in Table 2.3 (on page 18).

Table 2.3 – Fixed-Point Iteration Method: Solve x − 2⁻ˣ = 0

Iteration   x        g(x)     ε = |g(x) − x|
1           0.0000   1.0000   1.0000
2           1.0000   0.5000   0.5000
3           0.5000   0.7071   0.2071
4           0.7071   0.6125   0.0946
5           0.6125   0.6540   0.0415
6           0.6540   0.6355   0.0185
7           0.6355   0.6437   0.0082
8           0.6437   0.6401   0.0037
9           0.6401   0.6417   0.0016
10          0.6417   0.6410   0.0007
11          0.6410   0.6413   0.0003
12          0.6413   0.6411   0.0001
13          0.6411   0.6412   0.0000

The final answer (within the tolerance = 10⁻⁴) is found as 0.6412, after 13 iterations.

2.4.2 Newton-Raphson Method

The most commonly used open method to solve f(x) = 0, where f(x) is continuous, is the Newton-Raphson Method.

Consider the graph of f(x) in Figure 2.7 (on page 19). Start with an initial point x₁ and locate the point (x₁, f(x₁)) on the curve. Draw the tangent line to the curve at that point, and let its x-intercept be x₂. Locate (x₂, f(x₂)), draw the tangent line to the curve there, and let x₃ be its x-intercept. Repeat this until convergence is observed. In general, two consecutive elements xₙ and xₙ₊₁ are related via

xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ),   n = 1, 2, 3, · · · and x₁ is the initial guess    (2.14)
Example 2.4: Find the root of x cos(x) = −1 using Newton-Raphson Method with a tolerance of 10⁻⁴.

Table 2.4 – Newton-Raphson Method: Solve x cos(x) = −1

Iteration   xₙ       f(xₙ)     f′(xₙ)    xₙ₊₁     ε = |xₙ₊₁ − xₙ|
1           1.0000    1.5403   -0.3012   6.1144   5.1144
2           6.1144    7.0275    2.0128   2.6230   3.4914
3           2.6230   -1.2782   -2.1686   2.0336   0.5894
4           2.0336    0.0920   -2.2662   2.0742   0.0406
5           2.0742   -0.0007   -2.2993   2.0739   0.0003
6           2.0739    0.0000   -2.2991   2.0739   0.0000

Example 2.5: Solve x² − 3x − 7 = 0 using Newton-Raphson Method with a tolerance of 10⁻⁴.

Solution: Start by plotting the graph of f(x) = x² − 3x − 7 to find the approximate locations of its roots (see Figure 2.8 on page 19). Inspired by Figure 2.8, we prepare two tables (see Table 2.5 and Table 2.6 on page 19), one for each root. The initial guess for the first root is x₁ = −2 and, for the second root, x₁ = 4. Accordingly, the two roots found are −1.5414 and 4.5414. (In the tables we employed f′(x) = 2x − 3.)
A
Newton-Raphson Method
Following points are note-worthy
am
that converges rapidly to the
intended root.
, Several
M oh factors may cause
Newton-Raphson Method to
fail.
1. The initial point x1 is
not sufficiently close to
the intended root.
2. At some point in the
iterations, f (x) may be
close to or equal to zero.
3. The iteration halts (usually
k0£)¡ MWS]N – u¡'©ªRx{¨¤©ª t¡¤©~_ l{¨¤ ©¢ f (x) ≡ x2 − 3x − 7 = 0 because of point of discontinuity).
4. The sequence diverges
, If f (x), f (x), and f (x) are continuous, f (root) = 0, and the initial point x
1 is close to the root, then the
sequence generated by Newton-Raphson Method converges to the root.
Table 2.5 – Newton-Raphson Method: Solve x² − 3x − 7 = 0, First Root

Iteration   xₙ        f(xₙ)    f′(xₙ)    xₙ₊₁      ε = |xₙ₊₁ − xₙ|
1           -2.0000   3.0000   -7.0000   -1.5714   0.4286
2           -1.5714   0.1837   -6.1429   -1.5415   0.0299
3           -1.5415   0.0009   -6.0831   -1.5414   0.0001
4           -1.5414   0.0000   -6.0828   -1.5414   0.0000

Table 2.6 – Newton-Raphson Method: Solve x² − 3x − 7 = 0, Second Root

Iteration   xₙ       f(xₙ)     f′(xₙ)   xₙ₊₁     ε = |xₙ₊₁ − xₙ|
1           4.0000   -3.0000   5.0000   4.6000   0.6000
2           4.6000    0.3600   6.2000   4.5419   0.0581
3           4.5419    0.0034   6.0839   4.5414   0.0006
4           4.5414    0.0000   6.0828   4.5414   0.0000
• A downside of the Newton-Raphson Method is that it requires the expression for f′(x), which can, at times, be difficult to obtain.
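Equation 2.14 translates almost directly into code (a sketch; the function names are my own):

```python
def newton_raphson(f, df, x, tol=1e-4, max_iter=50):
    """Newton-Raphson: x_{n+1} = x_n - f(x_n)/f'(x_n)  (Equation 2.14)."""
    for n in range(1, max_iter + 1):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    raise RuntimeError("did not converge within max_iter iterations")

f = lambda x: x**2 - 3*x - 7
df = lambda x: 2*x - 3               # the derivative used in the tables

r1, _ = newton_raphson(f, df, -2.0)  # initial guess for the first root
r2, _ = newton_raphson(f, df, 4.0)   # initial guess for the second root
print(round(r1, 4), round(r2, 4))    # -1.5414 and 4.5414, as in Tables 2.5/2.6
```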
2.4.3 Secant Method

Example 2.6: Solve x cos(x) = −1 using Secant Method

Find the first positive root of x cos(x) = −1 using Secant Method with a tolerance of 10⁻⁴.

Solution: This equation was previously tackled. Using a procedure similar to that followed earlier, we proceed by preparing Table 2.7 (as shown on page 20). Because we know that there is a root near x = 2, we can safely start with initial guesses of x₁ = 1 and x₂ = 1.5. The root of the equation is found to be 2.0739.
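A sketch of the Secant Method in Python, using the standard secant update xₙ₊₁ = xₙ − f(xₙ)(xₙ − xₙ₋₁)/(f(xₙ) − f(xₙ₋₁)) (names are my own):

```python
import math

def secant(f, x0, x1, tol=1e-4, max_iter=50):
    """Secant Method: like Newton-Raphson, but f' is replaced by the
    finite-difference slope through the last two points."""
    for n in range(1, max_iter + 1):
        x2 = x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
        if abs(x2 - x1) < tol:
            return x2, n
        x0, x1 = x1, x2
    raise RuntimeError("did not converge within max_iter iterations")

root, iters = secant(lambda x: x * math.cos(x) + 1, 1.0, 1.5)
print(round(root, 4), iters)
```

Note that, unlike the Bracketing Methods, x₀ = 1 and x₁ = 1.5 lie on the same side of the root, and the method still converges.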
Table 2.7 – Secant Method: Solve x cos(x) = −1

Iter   xₙ₋₁      xₙ        f(xₙ₋₁)    f(xₙ)      xₙ₊₁      ε = |xₙ₊₁ − xₙ|
1      1.00000   1.50000   1.54030    1.10611    2.77374   1.27374
2      1.50000   2.77374   1.10611   -1.58818    2.02292   0.75082
3      2.77374   2.02292  -1.58818    0.11624    2.07412   0.05120
4      2.02292   2.07412   0.11624   -0.00044    2.07393   0.00019
5      2.07412   2.07393  -0.00044    0.00000    2.07393   0.00000

2.5 Roots of Polynomial Equations

The methods discussed in the former sections were equally suitable for solving Polynomial Equations as well as Transcendental Equations. But consideration of some aspects of Polynomial Equations forces us to develop other methods which are more suitable for solving such equations.
The roots of polynomials follow these rules:

• For an nth-order equation, there exist n roots, out of which some or all may be repeated (i.e., multiple roots with the same solution value).
• Out of the n roots, m (0 < m < n) roots may be real and the remaining n − m roots may be complex.

Bracketing Methods as well as Open Methods are very viable methods if only real roots exist. However, when complex roots exist, Bracketing Methods cannot be used because of the obvious problem that the criterion for defining a bracket (that is, a sign change) does not translate to complex guesses. Also, the problem of finding good initial guesses complicates both the bracketing and the open methods. Furthermore, the open methods can be susceptible to divergence.
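For reference, general-purpose polynomial root-finders exist in standard numerical libraries; for example, NumPy computes all n roots (real and complex) directly from the coefficient list of Equation 2.4:

```python
import numpy as np

# coefficients c1, c2, ..., c(n+1) of Equation 2.4, here for x**3 - 13x - 12
coeffs = [1, 0, -13, -12]
roots = np.roots(coeffs)       # eigenvalues of the companion matrix
print(sorted(roots.real))      # all three roots of this cubic are real
```

This cubic factors as (x + 3)(x + 1)(x − 4), so the roots are −3, −1, and 4; the same polynomial is solved with Muller's Method in Example 2.7.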
In this section, we will discuss methods which especially excel in finding the roots of Polynomial Equations.

Muller’s Method starts with three initial guesses and passes a parabola through these three points (see Figure 2.10 on page 21). Naturally, the equation of this parabola is a quadratic equation.

Muller’s Method consists of deriving the coefficients of the parabola that goes through the three guess points, x₀, x₁, and x₂. These coefficients can then be substituted into the quadratic formula of the parabola to obtain the point where the parabola intercepts the x-axis, that is, the estimated root, or next guess (call it x₃). The equation of the parabola passing through the point (x₂, y₂) should take the form:

y = a(x − x₂)² + b(x − x₂) + c    (2.17)

Substituting each of the three points (x₀, y₀), (x₁, y₁), and (x₂, y₂) into Equation 2.17 gives:

c = y₂    (2.21)
y₀ − y₂ = a(x₀ − x₂)² + b(x₀ − x₂)    (2.22)
y₁ − y₂ = a(x₁ − x₂)² + b(x₁ − x₂)    (2.23)
Equations 2.22 and 2.23 can be solved to give the values of the coefficients a and b as:

a = + [(x₀ − x₂)(y₁ − y₂) − (x₁ − x₂)(y₀ − y₂)] / [(x₀ − x₁)(x₁ − x₂)(x₂ − x₀)]    (2.24)

b = − [(x₀ − x₂)²(y₁ − y₂) − (x₁ − x₂)²(y₀ − y₂)] / [(x₀ − x₁)(x₁ − x₂)(x₂ − x₀)]    (2.25)
The parabola also passes through the point (x₃, y₃) = (x₃, 0). Substituting this point into the Equation of Parabola 2.17 gives:

0 = a(x₃ − x₂)² + b(x₃ − x₂) + c    (2.26)

As Equation 2.26 is a quadratic equation, it can be solved for x₃ − x₂ as:

x₃ − x₂ = −2c / (b ± √(b² − 4ac))    (2.27)

or

x₃ = x₂ − 2c / (b ± √(b² − 4ac))    (2.28)

Both Equations 2.27 and 2.28 are useful because they allow us to calculate the error:

ε = |(x₃ − x₂) / x₃|    (2.29)
The above procedure completes a single iteration. To proceed to the next iteration, substitute x₀ = x₁, x₁ = x₂, and x₂ = x₃, and calculate the new values of x₃ and ε. Repeat the process while ε > tol (in other words, stop the process as soon as ε < tol).

A very important point (which may have been overlooked and ignored) with regard to the calculation of the values of y₀, y₁, and y₂ is that their values are calculated by substituting the values of x₀, x₁, and x₂ in the given function f(x), respectively. This is possible because the points (x₀, y₀), (x₁, y₁), and (x₂, y₂) lie simultaneously on the parabola as well as on the curve of the given function f(x), and hence satisfy the equations of both curves (i.e., the parabola and the given function).

The use of a parabola, compared to the secant line, results in very rapid convergence of Muller’s Method.
Example 2.7: Solve x³ − 13x − 12 = 0 using Muller’s Method

Use Muller’s Method with initial guesses of x₀ = 4.5, x₁ = 5.5, and x₂ = 5, to determine a root of the equation x³ − 13x − 12 = 0.

Solution: To solve the equation, prepare Table 2.8 (as shown on page 22), and repeat the process until ε falls sufficiently below a reasonable accuracy (for example, below 10⁻⁴). In the table, columns d₁ and d₂ pertain to the two denominators of x₃ − x₂:

d₁ = b + √(b² − 4ac)   and   d₂ = b − √(b² − 4ac)    (2.30)

While calculating the values of x₃ − x₂ and x₃ we always pick either d₁ or d₂, whichever has the larger absolute value.

Table 2.8 – Muller’s Method: Solve x³ − 13x − 12 = 0

Iter  x₀       x₁       x₂       y₀        y₁        y₂        a         b         c         d₁        d₂        x₃ − x₂   x₃       ε
1     4.50000  5.50000  5.00000  20.62500  82.87500  48.00000  15.00000  62.25000  48.00000  93.79461  30.70539  -1.02351  3.97649  0.25739
2     5.50000  5.00000  3.97649  82.87500  48.00000  -0.81633  14.47649  32.87801  -0.81633  66.46721  -0.71119   0.02456  4.00105  0.00614
3     5.00000  3.97649  4.00105  48.00000  -0.81633   0.03678  12.97754  35.04975   0.03678  70.07226   0.02725  -0.00105  4.00000  0.00026
4     3.97649  4.00105  4.00000  -0.81633   0.03678   0.00002  11.97754  35.00004   0.00002  70.00007   0.00002   0.00000  4.00000  0.00000

[Figure 2.10 – Concept of Muller’s Method]
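Equations 2.21 to 2.28, together with the largest-denominator rule of Equation 2.30, can be sketched as follows (names are my own; a real discriminant is assumed, as holds throughout this example):

```python
import math

def muller(f, x0, x1, x2, tol=1e-4, max_iter=50):
    """Muller's Method: fit a parabola through three points and take
    its x-intercept as the next guess (Equations 2.21-2.28)."""
    for _ in range(max_iter):
        y0, y1, y2 = f(x0), f(x1), f(x2)
        denom = (x0 - x1) * (x1 - x2) * (x2 - x0)
        a = ((x0 - x2) * (y1 - y2) - (x1 - x2) * (y0 - y2)) / denom
        b = -((x0 - x2) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y0 - y2)) / denom
        c = y2
        disc = math.sqrt(b * b - 4 * a * c)    # assumed real here
        # Equation 2.30: pick the denominator with the larger absolute value
        d = b + disc if abs(b + disc) > abs(b - disc) else b - disc
        x3 = x2 - 2 * c / d                    # Equation 2.28
        if abs((x3 - x2) / x3) < tol:          # Equation 2.29
            return x3
        x0, x1, x2 = x1, x2, x3                # shift the three guesses

root = muller(lambda x: x**3 - 13*x - 12, 4.5, 5.5, 5.0)
print(round(root, 4))
```

With the guesses of Example 2.7 this converges to the root 4.0000 in four iterations, matching Table 2.8.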
CHAPTER
THREE

MATRICES AND SYSTEMS OF EQUATIONS

Contents
3.1 Vectors and Matrices . . . . . . . . . . 23
3.2 Elementary Properties of Matrices . . . . . . . . . . 24
3.3 Orthogonality and Orthonormality of Vectors and Matrices . . . . . . . . . . 27
3.4 Norm of Vectors and Matrices . . . . . . . . . . 28
3.5 Linear Equations . . . . . . . . . . 30
3.6 Systems of Linear Equations . . . . . . . . . . 30
3.7 Numerical Methods to solve Systems of Linear Equations . . . . . . . . . . 32
3.8 Numerical Methods to solve Systems of Non-Linear Equations . . . . . . . . . . 35
Important Note: Dear Students, these lecture notes are “Supplementary”, which means they are not meant to replace your textbook, but only provide supporting material to the classroom lectures, in a summarized manner. You should gain enough knowledge from other sources to be in a position to elaborate on the topics and material presented here.¹
3.1 Vectors and Matrices
A vector v is an ordered set of n scalars, v₁, v₂, . . . , vₙ, written as

      ⎡ v₁ ⎤
      ⎢ v₂ ⎥
v =   ⎢ ⋮  ⎥
      ⎣ vₙ ⎦

More precisely, this is a column-vector (a.k.a. column-matrix). Instead, if the scalars v₁, v₂, . . . , vₙ are written in a row,

v = [ v₁  v₂  · · ·  vₙ ]

then it would become a row-vector (a.k.a. row-matrix). The count of scalars in the vector is known as the order of the vector or the size of the vector. Hence, in the example above, the size of vector v is n. The individual scalars of the vector are known as elements or members or terms of the vector. Hence, v₁ is the first element of vector v, and vₙ is the nth element.
When consecutive vectors are arranged sideways or stacked, it makes a matrix:

      ⎡ a₁₁ a₁₂ · · · a₁ₙ ⎤                             ⎡  2  3  3 ⎤
A =   ⎢ a₂₁ a₂₂ · · · a₂ₙ ⎥     e.g. a 3 × 3 matrix: P = ⎢ -1  2  4 ⎥
      ⎢  ⋮   ⋮   ⋱   ⋮  ⎥                             ⎣  5  0 -3 ⎦
      ⎣ aₘ₁ aₘ₂ · · · aₘₙ ⎦

This is a Regular Matrix, where every row of the matrix has the same number of elements, and, similarly, every column has the same number of elements. In a matrix, if the size of at least one row (or column) is different from the size of another row (or column) of the matrix, then such a matrix falls into the category of Sparse Matrices. Sparse matrices have special applications and they are not of interest to us in this course. Hence, we will concentrate on regular matrices, and, as such, in this text the term “matrices” will refer to “regular matrices” only (unless otherwise mentioned explicitly).
The size of a matrix or order of a matrix having m rows and n columns is m × n. In a matrix A, the element at the ith row and jth column is denoted by aᵢⱼ. (For example, in the above matrix P, element p₃₂ = 0.) Also, m is the first-dimension of matrix A (i.e., the count of rows in the matrix) and n is the second-dimension (i.e., the count of columns), and as such, A is a two-dimensional (or 2D) matrix. Higher dimensional matrices, such as 3D matrices, are possible, but, in this course, we will limit ourselves to 2D matrices.
A 2D matrix of size n × n is known as a Square Matrix. Consequently, saying “matrix of order n” means the same as “matrix of order n × n”. A non-square matrix is known as a Rectangular Matrix. The elements a₁₁, a₂₂, . . . , aₙₙ in a square matrix of size n form the Principal Diagonal of the matrix:

      ⎡ a₁₁ a₁₂ · · · a₁ₙ ⎤
A =   ⎢ a₂₁ a₂₂ · · · a₂ₙ ⎥
      ⎢  ⋮   ⋮   ⋱   ⋮  ⎥
      ⎣ aₙ₁ aₙ₂ · · · aₙₙ ⎦

The sum of the elements of the principal diagonal of a matrix is known as the Trace of the matrix.
A diagonal matrix is a square matrix which has all its elements as zeros, except that at least one of the elements on its principal diagonal is non-zero. A diagonal matrix of order n which has all the elements on its principal diagonal as ones is called the Identity Matrix of order n, denoted by Iₙ. A null-matrix (a.k.a. zero-matrix) has all its elements as zeros, and a unit matrix has all its elements as ones.
Diagonal Matrix (of order n):

⎡ a₁₁  0    0   · · ·  0  ⎤
⎢ 0    a₂₂  0   · · ·  0  ⎥
⎢ 0    0    a₃₃ · · ·  0  ⎥
⎢ ⋮    ⋮    ⋮   ⋱     ⋮  ⎥
⎣ 0    0    0   · · ·  aₙₙ ⎦

Identity Matrix Iₙ (of order n):

⎡ 1 0 0 · · · 0 ⎤
⎢ 0 1 0 · · · 0 ⎥
⎢ 0 0 1 · · · 0 ⎥
⎢ ⋮ ⋮ ⋮ ⋱    ⋮ ⎥
⎣ 0 0 0 · · · 1 ⎦

Null Matrix (all elements zero):

⎡ 0 0 0 · · · 0 ⎤
⎢ ⋮ ⋮ ⋮ ⋱    ⋮ ⎥
⎣ 0 0 0 · · · 0 ⎦

Unit Matrix (all elements one):

⎡ 1 1 1 · · · 1 ⎤
⎢ ⋮ ⋮ ⋮ ⋱    ⋮ ⎥
⎣ 1 1 1 · · · 1 ⎦
If O is the m × n zero matrix and A is any m × n matrix, then A + O = A. Thus, the Null Matrix O is the Additive Identity for matrix addition. Now that the Additive Identity for matrix addition is defined, we can observe that the matrix −A, which can be obtained as (−1) × A, is the Additive Inverse of A, in the sense that A + (−A) = (−A) + A = O.
A square matrix of order n which has all the elements above its principal diagonal as zeros is known as a Lower Triangular Matrix (shown as matrix L below). Similarly, a square matrix of order n which has all the elements below its principal diagonal as zeros is known as an Upper Triangular Matrix (shown as matrix U below):

      ⎡ a₁₁  0    0   · · ·  0  ⎤          ⎡ a₁₁ a₁₂ a₁₃ · · · a₁ₙ ⎤
      ⎢ a₂₁  a₂₂  0   · · ·  0  ⎥          ⎢  0  a₂₂ a₂₃ · · · a₂ₙ ⎥
L =   ⎢ a₃₁  a₃₂  a₃₃ · · ·  0  ⎥     U =  ⎢  0   0  a₃₃ · · · a₃ₙ ⎥
      ⎢  ⋮    ⋮    ⋮   ⋱     ⋮  ⎥          ⎢  ⋮   ⋮   ⋮   ⋱    ⋮  ⎥
      ⎣ aₙ₁  aₙ₂  aₙ₃ · · ·  aₙₙ ⎦          ⎣  0   0   0  · · · aₙₙ ⎦
The operation of exchanging the rows of a matrix with its columns (and vice-versa) is known as Transposition of the matrix, and the new matrix produced is called the transpose of the original matrix. The transpose of matrix A is denoted by Aᵀ or A′. Hence

      ⎡ a₁₁ a₁₂ · · · a₁ₙ ⎤                ⎡ a₁₁ a₂₁ · · · aₘ₁ ⎤
A =   ⎢ a₂₁ a₂₂ · · · a₂ₙ ⎥     ⟹   Aᵀ =  ⎢ a₁₂ a₂₂ · · · aₘ₂ ⎥
      ⎢  ⋮   ⋮   ⋱   ⋮  ⎥                ⎢  ⋮   ⋮   ⋱   ⋮  ⎥
      ⎣ aₘ₁ aₘ₂ · · · aₘₙ ⎦                ⎣ a₁ₙ a₂ₙ · · · aₘₙ ⎦
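Transpose and trace are one-line operations in NumPy; the small example below is an illustration only (the matrices shown are my own):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # a 2 x 3 matrix
print(A.T)                     # its transpose, a 3 x 2 matrix

P = np.array([[2, 3],
              [5, 7]])
print(np.trace(P))             # trace = sum of the principal diagonal = 2 + 7
```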
[Display: example matrices in Row Echelon Form]
The first non-zero entry of each row is known as the Pivot Value or Corner Value. A matrix in Row Echelon Form which has all its Pivot Values as 1 is said to be in Reduced Row Echelon Form, or simply Reduced Form. Hence, all the Identity Matrices are in Reduced Row Echelon Form.

Elementary Row Operations (ERO): The three EROs that can be performed on any matrix are as follows:

1. EROs: Row Swaps: interchange two rows.
2. EROd: Row Dilations: multiply a row by a non-zero real scalar.
3. EROt: Row Transvections: add a real scalar multiple of one row to another row.
The following example demonstrates the transformation of a matrix into its equivalent reduced form (Rx denotes Row x):

[2 8 7; 4 5 6; 3 2 9]
→ (EROd: (1/2)R₁) → [1 4 7/2; 4 5 6; 3 2 9]
→ (EROt: R₂ − 4R₁, R₃ − 3R₁) → [1 4 7/2; 0 -11 -8; 0 -10 -3/2]
→ (EROd: −(1/11)R₂) → [1 4 7/2; 0 1 8/11; 0 -10 -3/2]
→ (EROt: R₃ + 10R₂) → [1 4 7/2; 0 1 8/11; 0 0 127/22]
→ (EROd: (22/127)R₃) → [1 4 7/2; 0 1 8/11; 0 0 1]
The Elementary Matrix Es to perform EROs on rows i and j is obtained by modifying Im (the m × m Identity Matrix) such that now aᵢᵢ = 0, aⱼⱼ = 0, aᵢⱼ = 1, and aⱼᵢ = 1. (If matrix A is a square matrix of size n × n, then Es is obtained by modifying the Identity Matrix Iₙ in the same way.)

Hence, if we perform the matrix multiplication EsA we obtain a new matrix A₁ as:

      ⎡ a₁₁ a₁₂ · · · a₁ᵢ · · · a₁ⱼ · · · a₁ₙ ⎤
      ⎢ a₂₁ a₂₂ · · · a₂ᵢ · · · a₂ⱼ · · · a₂ₙ ⎥
      ⎢  ⋮   ⋮       ⋮       ⋮       ⋮  ⎥
      ⎢ aⱼ₁ aⱼ₂ · · · aⱼᵢ · · · aⱼⱼ · · · aⱼₙ ⎥
A₁ =  ⎢  ⋮   ⋮       ⋮       ⋮       ⋮  ⎥
      ⎢ aᵢ₁ aᵢ₂ · · · aᵢᵢ · · · aᵢⱼ · · · aᵢₙ ⎥
      ⎢  ⋮   ⋮       ⋮       ⋮       ⋮  ⎥
      ⎣ aₘ₁ aₘ₂ · · · aₘᵢ · · · aₘⱼ · · · aₘₙ ⎦

which is a transformation of the original matrix A such that rows i and j of the original matrix A are swapped (or interchanged) to produce the new transformed matrix A₁.
am
To perform EROd (Row Dilation) on row i, with a factor of d, we can multiply matrix A with the matrix Ed ,
h
as given below:
Ed =
M o
1 0 ···
0 1 · · ·
.. ..
. . · · ·
0
0
..
.
··· 0
· · · 0
· · ·
..
.
0 0 · · · aii = d · · · 0
. . .. ..
.. .. · · · . · · · .
0 0 ··· 0 ··· 1
The Elementary Matrix Ed to perform EROd on row i, with a multiplication factor of d, is obtained by modifying
Im (the m × m Identity Matrix) such that now aii = d. If matrix A is a square matrix of size n × n then
Elementary Matrix Ed to perform EROd on row i, with a factor of d, is obtained by modifying the Identity
Matrix In such that now aii = d.
Hence, if we perform the matrix multiplication Ed A we obtain a new matrix A2 as:

     [ a11    a12    ···  a1n   ]
     [ a21    a22    ···  a2n   ]
     [  ⋮      ⋮           ⋮    ]
A2 = [ d·ai1  d·ai2  ···  d·ain ]
     [  ⋮      ⋮           ⋮    ]
     [ am1    am2    ···  amn   ]

which is a transformation on the original matrix A, such that row i of the original matrix A is dilated with a factor of d to produce the new transformed matrix A2.
To perform EROt (Row Transvection) on row i using row j, with a factor of t, we can multiply matrix A
with the matrix Et , as given below:
     [ 1  0  ···  0        ···  0        ···  0 ]
     [ 0  1  ···  0        ···  0        ···  0 ]
     [ ⋮  ⋮       ⋮             ⋮             ⋮ ]
     [ 0  0  ···  aii = 1  ···  aij = t  ···  0 ]
Et = [ ⋮  ⋮       ⋮             ⋮             ⋮ ]
     [ 0  0  ···  aji = 0  ···  ajj = 1  ···  0 ]
     [ ⋮  ⋮       ⋮             ⋮             ⋮ ]
     [ 0  0  ···  0        ···  0        ···  1 ]
The Elementary Matrix Et to perform EROt on row i, using row j, with a multiplication factor of t, is obtained
by modifying Im (the m × m Identity Matrix) such that now aij = t. If matrix A is a square matrix of size n × n
then Elementary Matrix Et to perform EROt on row i, using row j, with a factor of t, is obtained by modifying
the Identity Matrix In such that now aij = t.
Hence, if we perform the matrix multiplication Et A we obtain a new matrix A3 as:
     [ a11         a12         ···  a1n         ]
     [ a21         a22         ···  a2n         ]
     [  ⋮           ⋮                ⋮          ]
A3 = [ ai1 + t·aj1 ai2 + t·aj2 ···  ain + t·ajn ]   ← row i
     [  ⋮           ⋮                ⋮          ]
     [ aj1         aj2         ···  ajn         ]   ← row j (unchanged)
     [  ⋮           ⋮                ⋮          ]
     [ am1         am2         ···  amn         ]

which is a transformation on the original matrix A, such that row i of the original matrix A is replaced by (row i + t × row j) to produce the new transformed matrix A3.
Following example demonstrates the same transformation into the equivalent reduced form, this time by multiplying with the corresponding elementary matrices (Rx denotes Row x):

Ed on R1 with d = 1/2:
[ 1/2  0  0 ] [ 2  8  7 ]   [ 1  4  7/2 ]
[ 0    1  0 ] [ 4  5  6 ] = [ 4  5  6   ]
[ 0    0  1 ] [ 3  3  9 ]   [ 3  3  9   ]

Et on R2 using R1 with t = −4, and Et on R3 using R1 with t = −3:
[ 1   0  0 ] [ 1  4  7/2 ]   [ 1   4    7/2  ]
[ −4  1  0 ] [ 4  5  6   ] = [ 0  −11   −8   ]
[ −3  0  1 ] [ 3  3  9   ]   [ 0  −9    −3/2 ]

Ed on R2 with d = −1/11:
[ 1  0      0 ] [ 1   4    7/2  ]   [ 1   4    7/2  ]
[ 0  −1/11  0 ] [ 0  −11   −8   ] = [ 0   1    8/11 ]
[ 0  0      1 ] [ 0  −9    −3/2 ]   [ 0  −9    −3/2 ]

Et on R3 using R2 with t = 9:
[ 1  0  0 ] [ 1   4    7/2  ]   [ 1  4  7/2    ]
[ 0  1  0 ] [ 0   1    8/11 ] = [ 0  1  8/11   ]
[ 0  9  1 ] [ 0  −9   −3/2  ]   [ 0  0  111/22 ]

Ed on R3 with d = 22/111:
[ 1  0  0      ] [ 1  4  7/2    ]   [ 1  4  7/2  ]
[ 0  1  0      ] [ 0  1  8/11   ] = [ 0  1  8/11 ]
[ 0  0  22/111 ] [ 0  0  111/22 ]   [ 0  0  1    ]
It is not hard to understand that the above six elementary matrices can be combined as a product into a single Elementary Reduction Matrix. Since each successive operation multiplies from the left, the product is taken in reverse order:

[ 1  0  0      ][ 1  0  0 ][ 1  0      0 ][ 1   0  0 ][ 1   0  0 ][ 1/2  0  0 ]   [ 1/2   0      0      ]
[ 0  1  0      ][ 0  1  0 ][ 0  −1/11  0 ][ 0   1  0 ][ −4  1  0 ][ 0    1  0 ] = [ 2/11  −1/11  0      ]
[ 0  0  22/111 ][ 0  9  1 ][ 0  0      1 ][ −3  0  1 ][ 0   0  1 ][ 0    0  1 ]   [ 1/37  −6/37  22/111 ]

So, the complete reduction can be performed in a single product as:

[ 1/2   0      0      ] [ 2  8  7 ]   [ 1  4  7/2  ]
[ 2/11  −1/11  0      ] [ 4  5  6 ] = [ 0  1  8/11 ]
[ 1/37  −6/37  22/111 ] [ 3  3  9 ]   [ 0  0  1    ]
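The chain of elementary row operations above can be checked numerically. A minimal NumPy sketch (the third row of A is taken as (3, 3, 9), which is what makes the pivots −9 and 111/22 of this example come out):

```python
import numpy as np

A = np.array([[2., 8., 7.],
              [4., 5., 6.],
              [3., 3., 9.]])   # third row (3, 3, 9), consistent with pivots -9 and 111/22

E1 = np.diag([1/2, 1, 1])                                # EROd: (1/2) R1
E2 = np.array([[1, 0, 0], [-4, 1, 0], [0, 0, 1.]])       # EROt: R2 - 4 R1
E3 = np.array([[1, 0, 0], [0, 1, 0], [-3, 0, 1.]])       # EROt: R3 - 3 R1
E4 = np.diag([1, -1/11, 1])                              # EROd: (-1/11) R2
E5 = np.array([[1, 0, 0], [0, 1, 0], [0, 9, 1.]])        # EROt: R3 + 9 R2
E6 = np.diag([1, 1, 22/111])                             # EROd: (22/111) R3

# Later operations act later, so they multiply on the LEFT:
E = E6 @ E5 @ E4 @ E3 @ E2 @ E1
print(np.round(E @ A, 4))        # reduced form with unit pivots
```

The printed matrix is the reduced form [[1, 4, 7/2], [0, 1, 8/11], [0, 0, 1]] obtained step by step above.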
to every vector.
A set of non-zero vectors {v1, v2, · · · , vk} is said to be Mutually Orthogonal if vi · vj = 0 for all i ≠ j. For example, the standard basis vectors e1, e2, e3 are mutually orthogonal.
Unit Vector: A Unit Vector is a vector of length 1. If its length is 1, then the square of its length is also 1. So, v is a unit vector ⇐⇒ v · v = 1. For example, all the standard basis vectors e1, e2, e3 are unit vectors.
Normalization of Vector: The process of replacing a vector by a unit vector in its direction is called Normalization of Vector. An arbitrary non-zero vector w can be normalized by calculating:

    ŵ = (1 / ||w||2) w

For example, if v = (x, y, z) then v̂ = (1 / ||v||2) v = v / √(x² + y² + z²).
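Normalization is one line in NumPy (the sample vector here is arbitrary, chosen so the norm is a whole number):

```python
import numpy as np

v = np.array([1.0, 2.0, 2.0])
v_hat = v / np.linalg.norm(v)    # divide by the l2-norm; here ||v|| = sqrt(1+4+4) = 3

print(v_hat)                     # (1/3, 2/3, 2/3)
print(np.linalg.norm(v_hat))     # length of the normalized vector is 1
```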
For example, the following matrices are orthogonal:

[ 1  0  0 ]    [ cos θ  −sin θ ]    [ 1/2  −1/2   1/2  −1/2 ]
[ 0  1  0 ] ,  [ sin θ   cos θ ] ,  [ 1/2   1/2  −1/2  −1/2 ]
[ 0  0  1 ]                         [ 1/2  −1/2  −1/2   1/2 ]
                                    [ 1/2   1/2   1/2   1/2 ]

The matrix

[ 1   1 ]
[ 1  −1 ]

is not an Orthogonal Matrix. But we can adjust it to make it an Orthogonal Matrix:

(1/√2) [ 1   1 ]
       [ 1  −1 ]

(Such matrices are known as Matrices with Equi-norm columns. Observe that each column-vector in this matrix has the same l2-norm.)
3.4 Norm of Vectors and Matrices
Norm of Vector x, denoted as ||x||, is any real number which satisfies the following properties:

1. ||x|| > 0, if x ≠ 0 (i.e. at least one element of x is non-zero)
2. ||kx|| = |k| ||x||, for any real scalar k
3. ||x + y|| ≤ ||x|| + ||y||, for any two vectors x and y

As you can see, based on this definition, Norm of Vector is just a real number associated with a given vector. As such, there are many possibilities to calculate Norm of Vector. But, out of those many, we usually use one of the following three norms: (a) l1-Norm (b) l2-Norm (c) l∞-Norm. For a vector x of size n, these values are defined as follows, respectively:

- ||x||1 = Σi=1..n |xi|, where |xi| denotes the absolute value of xi
- ||x||2 = +√(xᵀ · x) = +√(Σi=1..n xi²), where xᵀ denotes the transpose of vector x
- ||x||∞ = max over 1 ≤ i ≤ n of |xi|

As an example, the three norms for the vector v = (1, −2, 3, −4) are:

l1 = |1| + |−2| + |3| + |−4| = 10
l2 = +√((1)² + (−2)² + (3)² + (−4)²) = 5.477
l∞ = max(|1|, |−2|, |3|, |−4|) = 4
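The three vector norms are available directly through `numpy.linalg.norm`, selected by its `ord` argument:

```python
import numpy as np

v = np.array([1., -2., 3., -4.])

l1   = np.linalg.norm(v, 1)        # sum of absolute values
l2   = np.linalg.norm(v)           # Euclidean norm (the default)
linf = np.linalg.norm(v, np.inf)   # largest absolute value

print(l1, round(l2, 3), linf)      # 10.0 5.477 4.0
```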
Norm of Square Matrix A, denoted as ||A||, is any real number which satisfies the following properties:

1. ||A|| ≥ 0
2. ||A|| = 0, if and only if A = 0 (i.e. A is a zero matrix)
3. ||kA|| = |k| ||A||, for any real scalar k
4. ||A + B|| ≤ ||A|| + ||B||, for any two matrices A and B
5. ||AB|| ≤ ||A|| ||B||, for any two matrices A and B

Just as in case of vectors, based on this definition, Norm of Matrix is just a real number associated with a given matrix. As such, there are many possibilities to calculate Norm of Matrix. But, out of those many, we usually use one of the following three norms: (a) l1-Norm (b) l2-Norm (c) l∞-Norm.
For a square matrix A of size n × n, these values are defined as follows, respectively:
- ||A||1 = max over 1 ≤ j ≤ n of (Σi=1..n |aij|), where |aij| denotes the absolute value of aij, the element in the ith row and jth column of A
- ||A||2 = +√(Σi=1..n Σj=1..n aij²)
- ||A||∞ = max over 1 ≤ i ≤ n of (Σj=1..n |aij|)

Hence, l1-Norm is the “maximum absolute column sum”, which means, we sum the absolute values along each column and then take the largest answer. And, l∞-Norm is the “maximum absolute row sum”, which means, we sum the absolute values along each row and then take the largest answer.
l2-Norm is also known as Euclidean Norm, and as such, also denoted as lE-Norm (or, for a given matrix A, as ||A||E).
As an example, for the matrix

A = [ 1  −2   3 ]
    [ −4  5  −6 ]
    [ 7  −8   9 ]

the three norms are:

l1 = max(|1| + |−4| + |7|, |−2| + |5| + |−8|, |3| + |−6| + |9|) = 18
l2 = +√((1)² + (−2)² + (3)² + (−4)² + (5)² + (−6)² + (7)² + (−8)² + (9)²) = 16.882
l∞ = max(|1| + |−2| + |3|, |−4| + |5| + |−6|, |7| + |−8| + |9|) = 24
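These matrix norms can also be read off `numpy.linalg.norm`; note that for the Euclidean norm used here the right `ord` value is `'fro'` (Frobenius), because `ord=2` for a matrix means the spectral norm (largest singular value), a different quantity:

```python
import numpy as np

A = np.array([[ 1., -2.,  3.],
              [-4.,  5., -6.],
              [ 7., -8.,  9.]])

l1   = np.linalg.norm(A, 1)        # maximum absolute column sum
lE   = np.linalg.norm(A, 'fro')    # Euclidean (Frobenius) norm, as in the text
linf = np.linalg.norm(A, np.inf)   # maximum absolute row sum

print(l1, round(lE, 3), linf)      # 18.0 16.882 24.0
```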
Matrix Inversion
A square matrix whose rank is less than the count of rows in the matrix is known as a Singular Matrix.
For any non-singular matrix M of size n there exists another non-singular matrix M⁻¹ of size n such that M⁻¹M = In. We say M⁻¹ is the Inverse of matrix M.
For a given non-singular matrix M the corresponding inverse M⁻¹ can be found by systematic application of Elementary Row Operations on M. The procedure is best illustrated with an example.
Example: Find the inverse of

A = [ −1   1   1 ]
    [  3   2   4 ]
    [  1  −1   0 ]

To start, we write (A|I):

[ −1   1   1 |  1  0  0 ]
[  3   2   4 |  0  1  0 ]
[  1  −1   0 |  0  0  1 ]

Now, we perform a series of Elementary Row Operations on both the matrices, simultaneously, on either side of the middle bar, as follows:

Ed on R1 with d = −1:
[ 1  −1  −1 | −1  0  0 ]
[ 3   2   4 |  0  1  0 ]
[ 1  −1   0 |  0  0  1 ]

Et on R2 using R1 with t = −3; Et on R3 using R1 with t = −1:
[ 1  −1  −1 | −1  0  0 ]
[ 0   5   7 |  3  1  0 ]
[ 0   0   1 |  1  0  1 ]

Ed on R2 with d = 1/5:
[ 1  −1  −1   | −1    0    0 ]
[ 0   1   7/5 |  3/5  1/5  0 ]
[ 0   0   1   |  1    0    1 ]

Et on R1 using R2 with t = 1:
[ 1  0  2/5 | −2/5  1/5  0 ]
[ 0  1  7/5 |  3/5  1/5  0 ]
[ 0  0  1   |  1    0    1 ]

Et on R1 using R3 with t = −2/5; Et on R2 using R3 with t = −7/5:
[ 1  0  0 | −4/5  1/5  −2/5 ]
[ 0  1  0 | −4/5  1/5  −7/5 ]
[ 0  0  1 |  1    0     1   ]

As can be observed, the aim was to reduce the left matrix (i.e. A) to the Identity Matrix I. While reducing the left matrix A to I, the same EROs transformed the original I on the right side of the bar into A⁻¹.

Hence, the inverse of matrix

A = [ −1   1   1 ]           [ −4/5  1/5  −2/5 ]
    [  3   2   4 ]  is  A⁻¹ = [ −4/5  1/5  −7/5 ]
    [  1  −1   0 ]           [  1    0     1   ]
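The result can be cross-checked with a library routine; a minimal NumPy sketch:

```python
import numpy as np

A = np.array([[-1.,  1., 1.],
              [ 3.,  2., 4.],
              [ 1., -1., 0.]])

A_inv = np.linalg.inv(A)
print(np.round(A_inv, 4))       # matches [[-4/5, 1/5, -2/5], [-4/5, 1/5, -7/5], [1, 0, 1]]
print(np.round(A_inv @ A, 4))   # product recovers the identity matrix
```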
· · · + anxn = b, we call it a Linear Equation.
Otherwise, the system is said to be Non-homogeneous System of Linear Equations .
Hence, a system of linear equations can be represented as Ax = b, where A is the Coefficient Matrix, x is the Variables Vector, and b is the Constants Vector:

    [ a11  a12  ···  a1n ]        [ x1 ]        [ b1 ]
A = [ a21  a22  ···  a2n ] ,  x = [ x2 ] ,  b = [ b2 ]
    [  ⋮    ⋮         ⋮  ]        [  ⋮ ]        [  ⋮ ]
    [ am1  am2  ···  amn ]        [ xn ]        [ bm ]
The Augmented Coefficient Matrix is written and defined as:

        [ a11  a12  ···  a1n | b1 ]
(A|b) = [ a21  a22  ···  a2n | b2 ]
        [  ⋮    ⋮         ⋮  |  ⋮ ]
        [ am1  am2  ···  amn | bm ]
On the contrary, Consistent Systems (where the ranks of the coefficient matrix and the augmented matrix are equal) have one or more solutions. For example, consider the following system of linear equations:

[ 1  2  3 ] [ x1 ]   [ 1 ]
[ 4  5  6 ] [ x2 ] = [ 2 ]
[ 7  8  9 ] [ x3 ]   [ 3 ]

The reduced form of these equations is:

[ 1  0  −1 ] [ x1 ]   [ −1/3 ]
[ 0  1   2 ] [ x2 ] = [  2/3 ]
[ 0  0   0 ] [ x3 ]   [  0   ]

Here, the rank of the coefficient matrix as well as the augmented matrix is 2. Hence, this is a consistent system and has more than one solution (actually, infinitely many). To find a solution, we must express two variables of the system in terms of the third variable. Suppose we decide to express x2 and x3 in terms of x1; then the solution is x2 = −2x1 and x3 = x1 + 1/3, ∀x1.
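The rank test for consistency can be carried out numerically; a short NumPy sketch for the system above:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
b = np.array([1., 2., 3.])

rank_A  = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
print(rank_A, rank_Ab)               # both 2: consistent, infinitely many solutions

# One member of the solution family (x1 = 0 gives x2 = 0, x3 = 1/3):
print(A @ np.array([0., 0., 1/3]))   # reproduces the constants vector (1, 2, 3)
```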
Practically, we are much more interested in systems of linear equations where the coefficient matrix is of size n × n. When the coefficient matrix is a square matrix then its rank can reach a maximum value of n (and, similarly, the rank of the augmented matrix also can reach a maximum value of n). Hence, for a given system of linear equations, if the rank of the coefficient matrix is n then, obviously, the rank of the augmented matrix is also n. Such a matrix falls into the category of Non-Singular Matrix (recall that the determinant of a Singular Matrix is zero). If the coefficient matrix is non-singular, then not only does a solution exist, it is unique.
For example, consider the following system of linear equations:

[ 1  8  7 ] [ x1 ]   [ 5 ]
[ 2  9  6 ] [ x2 ] = [ 3 ]
[ 3  4  5 ] [ x3 ]   [ 1 ]
The reduced form of these equations is:

[ 1  8  7    ] [ x1 ]   [ 5 ]
[ 0  1  8/7  ] [ x2 ] = [ 1 ]
[ 0  0  48/7 ] [ x3 ]   [ 6 ]
Here, the rank of the coefficient matrix as well as the augmented matrix is 3, which is also the size of the matrix. Hence, this is a consistent system and has a unique solution. To find the solution, we solve the reduced form of the equations, to get x1 = −9/8, x2 = 0 and x3 = 7/8.
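The unique solution can be confirmed with a library solver:

```python
import numpy as np

A = np.array([[1., 8., 7.],
              [2., 9., 6.],
              [3., 4., 5.]])
b = np.array([5., 3., 1.])

x = np.linalg.solve(A, b)
print(x)                       # close to (-9/8, 0, 7/8)
```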
Ill-Conditioned Systems of Linear Equations: We have seen that Singular Matrices rule out a unique solution. But equally dangerous are Ill-conditioned Matrices (which consequently imply Ill-conditioned Systems of Linear Equations), whose determinant is close to zero. Such systems are very sensitive to small changes in the values of their coefficients: even a small change in a single coefficient can change the computed solution drastically. As such, it becomes very important to check whether a given solution produced by some numerical method is acceptable or not.
To measure this sensitivity we define the Condition Number, κ, for a given system of linear equations (in terms of its coefficient matrix A) as follows:

    κ(A) = ||A|| ||A⁻¹||

The condition number is always greater than or equal to 1 (∵ κ(A) = ||A|| ||A⁻¹|| ≥ ||A A⁻¹|| = ||I|| = 1). Values of the condition number close to 1 indicate a well-conditioned matrix, whereas large values indicate an ill-conditioned matrix.
To illustrate the concept of Ill-conditioning, let us take the following example of a system of linear equations:

[ 1   10⁴ ] [ x1 ]   [ 10⁴ ]
[ −1  2   ] [ x2 ] = [ 1   ]
To make the situation practical, let us assume that our computer has precision for first three significant
digits only (and it neglects the fourth digit and above as round-off error.)
Now, the reduced form of these equations is:

[ 1  10⁴ ] [ x1 ]   [ 10⁴ ]
[ 0  10⁴ ] [ x2 ] = [ 10⁴ ]
This happened because our computer approximated the calculation 2 + 10⁴ ≈ 10⁴ to the third significant digit (instead of the correct value 2 + 10⁴ = 10002, in which precision is involved at the fourth significant digit). Solving the reduced set of equations yields x1 = 0 and x2 = 1. For sure, this is a poor approximation of the true solution.
So, let us see what the condition number is for this set of equations (using the l1-norm):

κ(A) = ||A||1 ||A⁻¹||1

     = || [ 1   10⁴ ] ||   || (1/(2 + 10⁴)) [ 2  −10⁴ ] ||
       || [ −1  2   ] ||1  ||               [ 1   1   ] ||1

     = (2 + 10⁴) × (1/(2 + 10⁴)) (1 + 10⁴)

     = 10001
The large value of condition number is clearly indicating that the system is ill-conditioned.
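The same number falls out of `numpy.linalg.cond`, which with `p=1` computes exactly the product ||A||1 ||A⁻¹||1 used above:

```python
import numpy as np

A = np.array([[ 1., 1e4],
              [-1., 2. ]])

kappa = np.linalg.cond(A, 1)   # condition number in the l1-norm
print(kappa)                   # approximately 10001 -> badly ill-conditioned
```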
A direct method computes the solution by performing a predetermined number of operations. These
methods transform the original system into an equivalent system in which the coefficient matrix becomes an upper-triangular matrix, a lower-triangular matrix, or a diagonal matrix, thereby making the new system much easier to solve.
Indirect methods use iterations to approximate the solution. The iteration process begins with an initial vector and generates successive approximations that eventually converge to the actual solution. Unlike direct methods, the number of operations is not known in advance. We will cover Cramer's Method, Gauss Elimination Method, Newton's Method and Fixed-Point Iteration Method.
3.7.1 Cramer’s Method
Cramer’s Method is systematic approach to solving system of linear equations. It can solve only non-
singular system of equations.
M o
To use Cramer’s Method write the given system of equations as a1 x1 + a2 x2 + · · · + an xn = b, where
xk (∀k, 1 k n) are the variables of the equations, and ak (∀k, 1 k n) are the column-vectors for
coefficients of corresponding variable xk (∀k, 1 k n), and b is the column-vector for constants (RHS of
equations). Call matrix A as [a1 a2 · · · an ], matrix A1 as [b a2 · · · an ], matrix A2 as [a1 b · · · an ],
and so on. To generalize, call matrix Ak as [a1 a2 · · · ak−1 b ak+1 an ], which is the modified matrix
A in which the column corresponding to ak ,the coefficient column-vector of xk , is replaced by constant column-
vector b.
The solution to the system of linear equations is given by:

x1 = |A1|/|A| ,  x2 = |A2|/|A| ,  · · ·  xk = |Ak|/|A| ,  · · ·  xn = |An|/|A|

where |Ak| denotes the determinant of matrix Ak.
y = |A2| / |A|

  = det [ 2  1   4 ]  /  det [ 2  −3   4 ]
        [ 1  2  −1 ]         [ 1   1  −1 ]
        [−1  1   1 ]         [−1   0   1 ]

  = [(2)(2)(1) + (1)(−1)(−1) + (4)(1)(1) − (4)(2)(−1) − (1)(1)(1) − (2)(−1)(1)]
    / [(2)(1)(1) + (−3)(−1)(−1) + (4)(1)(0) − (4)(1)(−1) − (−3)(1)(1) − (2)(−1)(0)]

  = 18/6 = 3

z = |A3| / |A|

  = det [ 2  −3  1 ]  /  det [ 2  −3   4 ]
        [ 1   1  2 ]         [ 1   1  −1 ]
        [−1   0  1 ]         [−1   0   1 ]

  = [(2)(1)(1) + (−3)(2)(−1) + (1)(1)(0) − (1)(1)(−1) − (−3)(1)(1) − (2)(2)(0)]
    / [(2)(1)(1) + (−3)(−1)(−1) + (4)(1)(0) − (4)(1)(−1) − (−3)(1)(1) − (2)(−1)(0)]

  = 12/6 = 2
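The determinant expansions above correspond to the system 2x − 3y + 4z = 1, x + y − z = 2, −x + z = 1 (recovered from the column-vectors in the determinants). A minimal NumPy sketch of Cramer's Method for this system:

```python
import numpy as np

A = np.array([[ 2., -3.,  4.],
              [ 1.,  1., -1.],
              [-1.,  0.,  1.]])
b = np.array([1., 2., 1.])

det_A = np.linalg.det(A)        # |A| = 6, non-singular
x = []
for k in range(3):
    Ak = A.copy()
    Ak[:, k] = b                # replace k-th column with the constants vector
    x.append(np.linalg.det(Ak) / det_A)

print(np.round(x, 4))           # x = 1, y = 3, z = 2
```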
3.7.2 Gauss Elimination Method

Gauss Elimination Method uses a systematic approach to transform the Augmented Coefficient Matrix of a System of Linear Equations into its equivalent Row Echelon form. This process is known as the Forward Elimination Process.
Once in it’s Reduced Form we can start solving for the equation with single variable, then substitute the
variable’s value in the equation with two variables and solve for the second variable. By substituting these
two variables in the next equation (with three variables) the third variable can be solved. Continuing this
Backward_____________
process results in resolution of all the variables. This step is known as ___________ Substitution.
Hence, solution using Gauss Elimination Process involves the two steps : Forward Elimination process and
Backward Substitution process.
y − (6/5)z = 3/5      ⇒   y = 3/5 − (−6/5)(2) = 3

x − (3/2)y + 2z = 1/2 ⇒   x = 1/2 − (−3/2)(3) − (2)(2) = 1
Solve the following system of equations using Gauss Elimination Method:

2a − 5b + 4c − 9d       = −7
 a +  b +  c +  d −   e = 14
 a + 2b + 3c + 4d − 10e = 36
8a − 5b − 3c       +  e = 10
 a − 3b + 3c − 7d −   e = −4
Solution: Write down the Augmented Coefficient Matrix and start the Forward Elimination Process:

[ 2  −5   4  −9    0 |  −7 ]
[ 1   1   1   1   −1 |  14 ]
[ 1   2   3   4  −10 |  36 ]
[ 8  −5  −3   0    1 |  10 ]
[ 1  −3   3  −7   −1 |  −4 ]

Ed on R1 with d = 1/2:
[ 1  −5/2   2  −9/2    0 | −7/2 ]
[ 1   1     1   1     −1 |  14  ]
[ 1   2     3   4    −10 |  36  ]
[ 8  −5    −3   0      1 |  10  ]
[ 1  −3     3  −7     −1 |  −4  ]

Et on R2 using R1 with t = −1; Et on R3 using R1 with t = −1; Et on R4 using R1 with t = −8; Et on R5 using R1 with t = −1:
[ 1  −5/2    2   −9/2    0  | −7/2 ]
[ 0   7/2   −1   11/2   −1  | 35/2 ]
[ 0   9/2    1   17/2  −10  | 79/2 ]
[ 0   15   −19   36      1  |  38  ]
[ 0  −1/2    1   −5/2   −1  | −1/2 ]

Ed on R2 with d = 2/7:
[ 1  −5/2    2    −9/2    0   | −7/2 ]
[ 0   1    −2/7   11/7  −2/7  |  5   ]
[ 0   9/2    1    17/2  −10   | 79/2 ]
[ 0   15   −19    36      1   |  38  ]
[ 0  −1/2    1    −5/2   −1   | −1/2 ]

Et on R3 using R2 with t = −9/2; Et on R4 using R2 with t = −15; Et on R5 using R2 with t = 1/2:
[ 1  −5/2     2      −9/2     0    | −7/2 ]
[ 0   1     −2/7     11/7   −2/7   |  5   ]
[ 0   0     16/7     10/7  −61/7   |  17  ]
[ 0   0   −103/7     87/7   37/7   | −37  ]
[ 0   0      6/7    −12/7   −8/7   |  2   ]

Ed on R3 with d = 7/16:
[ 1  −5/2     2     −9/2     0     | −7/2   ]
[ 0   1     −2/7    11/7   −2/7    |  5     ]
[ 0   0      1       5/8  −61/16   | 119/16 ]
[ 0   0   −103/7    87/7    37/7   | −37    ]
[ 0   0      6/7   −12/7    −8/7   |  2     ]

Et on R4 using R3 with t = 103/7; Et on R5 using R3 with t = −6/7:
[ 1  −5/2    2    −9/2      0      | −7/2    ]
[ 0   1    −2/7   11/7    −2/7     |  5      ]
[ 0   0     1      5/8   −61/16    | 119/16  ]
[ 0   0     0    173/8  −813/16    | 1159/16 ]
[ 0   0     0     −9/4     17/8    | −35/8   ]

Ed on R4 with d = 8/173:
[ 1  −5/2    2    −9/2      0       | −7/2     ]
[ 0   1    −2/7   11/7    −2/7      |  5       ]
[ 0   0     1      5/8   −61/16     | 119/16   ]
[ 0   0     0      1    −813/346    | 1159/346 ]
[ 0   0     0     −9/4     17/8     | −35/8    ]

Et on R5 using R4 with t = 9/4:
[ 1  −5/2    2    −9/2      0       | −7/2     ]
[ 0   1    −2/7   11/7    −2/7      |  5       ]
[ 0   0     1      5/8   −61/16     | 119/16   ]
[ 0   0     0      1    −813/346    | 1159/346 ]
[ 0   0     0      0    −547/173    | 547/173  ]

Ed on R5 with d = −173/547:
[ 1  −5/2    2    −9/2      0       | −7/2     ]
[ 0   1    −2/7   11/7    −2/7      |  5       ]
[ 0   0     1      5/8   −61/16     | 119/16   ]
[ 0   0     0      1    −813/346    | 1159/346 ]
[ 0   0     0      0       1        |  −1      ]
Now start the Backward Substitution Process:
e = −1

d − (813/346)e = 1159/346        ⇒   d = 1159/346 + (813/346)(−1) = 1

c + (5/8)d − (61/16)e = 119/16   ⇒   c = 119/16 − (5/8)(1) + (61/16)(−1) = 3

b − (2/7)c + (11/7)d − (2/7)e = 5 ⇒  b = 5 + (2/7)(3) − (11/7)(1) + (2/7)(−1) = 4

a − (5/2)b + 2c − (9/2)d = −7/2  ⇒   a = −7/2 + (5/2)(4) − (2)(3) + (9/2)(1) = 5

Hence, the solution is a = 5, b = 4, c = 3, d = 1, e = −1.
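The hand computation can be cross-checked with a library solver, which internally performs essentially the same elimination (LU factorization with partial pivoting):

```python
import numpy as np

A = np.array([[2., -5.,  4., -9.,   0.],
              [1.,  1.,  1.,  1.,  -1.],
              [1.,  2.,  3.,  4., -10.],
              [8., -5., -3.,  0.,   1.],
              [1., -3.,  3., -7.,  -1.]])
b = np.array([-7., 14., 36., 10., -4.])

x = np.linalg.solve(A, b)
print(x)                      # close to (a, b, c, d, e) = (5, 4, 3, 1, -1)
```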
3.8 Numerical Methods to solve Systems of Non-Linear Equations

Systems of Non-Linear Equations can be solved numerically by using either Newton's Method (for small systems) or the Fixed-Point Iteration Method (for large systems).

3.8.1 Newton's Method to solve System of Non-Linear Equations
Newton’s Method to find roots of equation was discussed in Section 2.4.2 (on page 18). An extension of that
technique can be used for solving a system of nonlinear equations. Let us solve first system of two nonlinear
M o
equations, followed by a general system of n nonlinear equations.
f1 (x, y) =0
f2 (x, y) =0
Suppose (xa , ya ) denotes the actual solution so that f1 (xa , ya ) = 0 and f2 (xa , ya ) = 0. We begin with an
initial estimate of solution as (xe , ye ). If xe is sufficiently close to xa , and ye is sufficiently close to ya , then, by
Taylor’s series expansion (ignoring second and higher order terms):
∂f1 ∂f1
f1 (xa , ya ) = f1 (xe , ye ) + ∆x + ∆y
∂x (xe ,ye ) ∂y (xe ,ye )
(3.1)
∂f2 ∂f2
f2 (xa , ya ) = f2 (xe , ye ) + ∆x + ∆y
∂x (xe ,ye ) ∂y (xe ,ye )
where ∆x = xa − xe and ∆y = ya − ye .
After substituting f1 (xa , ya ) = 0, f2 (xa , ya ) = 0, and rearranging the terms, the above two equations can be
written in matrix form as:
∂f1 ∂f1
∂x ∂y ∆x −f1
∂f ∂f2 =
2 ∆y −f2 (x ,y )
e e
∂x ∂y (xe ,ye )
Now, the system of nonlinear equations is transformed into an equivalent system of linear equations! Hence, Cramer's Method or Gauss Elimination Method can be employed to find the values of ∆x and ∆y.
This means that we should be able to find the correct solution (xa, ya) from the approximate guess (xe, ye) by substituting (xa, ya) = (xe + ∆x, ye + ∆y). But, because we did not consider higher order terms while expanding f1(xa, ya) using Taylor's series, we will get only a better approximate guess, (xa, ya) ≈ (xe + ∆x, ye + ∆y).
But, also, surely (xe + ∆x, ye + ∆y) is nearer to the correct solution (xa, ya) as compared to our initial guess (xe, ye). This suggests that we can repeat the process iteratively to find even better guesses, and reach a final guess which is close enough to the correct value within an allowed tolerance.
Hence, re-initialize the next guess as (xe, ye) = (xa, ya) and calculate a new (xa, ya) using Equation 3.1 (on page 35). In every iteration calculate ε = ||(∆x, ∆y)|| = +√(∆x² + ∆y²) and repeat while ε > tolerance.
Example 3.10: Newton's Method to solve system of non-linear equations
Example: Solve the following set of nonlinear equations using Newton's Method with a tolerance of 10⁻³.

3.2x³ + 1.8y² + 24.43 = 0
−2x² + 3y³ = 5.92

Solution: Let us name our equations as follows:

p = 3.2x³ + 1.8y² + 24.43        q = 2x² − 3y³ + 5.92

Also, let us name the derivatives as follows:

a = ∂p/∂x = 9.6x²    b = ∂p/∂y = 3.6y    c = ∂q/∂x = 4x    d = ∂q/∂y = −9y²

So, our aim is to solve the matrix equation:

[ a  b ]         [ ∆x ]   [ −p ]
[ c  d ](xe,ye)  [ ∆y ] = [ −q ](xe,ye)
For easier reference let us define the following three matrices:

C = [ a  b ]     Cx = [ −p  b ]     Cy = [ a  −p ]
    [ c  d ]          [ −q  d ]          [ c  −q ]

It is critical for convergence of the solution to start with an initial guess as close to the actual solution as possible. A good approach for the initial guess is to plot the graphs of the functions and take the initial value approximately at their point of intersection (any software application, such as MATLAB, Microsoft Excel or FreeMat, will do the job of plotting). In this case, the graphs show that the solution is somewhere near (−2, 2).
m
am
p = 3.2x3 + 1.8y 2 + 24.43 q = 2x2 − 3y 3 + 5.92
z{|§¡ MXSVN – j¬{«¨§¡_ u¡'©ªL t¡¤©~ © ©§(¡ ¡« ©¢ ª©ª§0ª¡{ ¡){0©ª
H xp
o
yph a b c d p q |C| |Cx | |Cy | ∆x ∆y xn yn
M
UU
YU
WU
UU
UU
U
U
V
UU
UU
UU
YW
]V
]U
UU
UU
UU
U^
ZZ
U^
ZU
]X
UU
U]
UU
WU
UX
\Y
W\
V
YS
SY
SU
SU
SV
SW
SV
YS
]S
[S
US
XW
WS
\S
[S
VS
US
X]
RW
R]
RU
RU
RW
VY
XX
RX
RV
RV
YV
VX
XY
\
W
V
V
YW
^X
^U
VW
W]
UW
]^
YU
U^
X[
VY
UX
YX
UU
X]
U\
\Y
W\
UU
\U
W
[S
S\
SU
UY
SV
SY
SV
SV
SU
SV
\S
US
VV
VS
[S
US
VS
YW
Y]
RW
R]
RU
RV
RU
RW
US
RW
RV
RV
\
ZZ
\V
V
UV
UW
]\
V]
VV
^^
VW
YW
V\
WW
UV
UU
VW
X]
UY
SW
VU
YU
UW
VX
UU
U^
\U
VW
UU
WU
UU
\U
UU
X
SX
V
[S
S
S
UZ
VS
[S
US
VS
US
VS
US
YW
RW
R]
RU
RU
RU
RW
RW
RV
Z
\
[
\
Z]
RU
RU
RU
RU
RU
RU
UZ
^
U
^
UU
UU
UV
U]
[^
j
[j
Wj
XX
^^
^^
^^
j
UV
[
[^
ZX
Y]
\U
\U
VW
UU
Y
^S
SX
^]
XX
^\
SU
SX
SU
[S
UY
UX
[U
^W
VS
[S
US
YW
VS
RW
R]
RW
S^
S^
S]
RW
RV
WS
[S
\S
RV
R[
R\
As shown in Table 3.1 (on page 36), we were able to converge to the actual solution (−2.0999, 1.7000) in 4 iterations with a tolerance of 0.001.
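The iteration of Table 3.1 can be reproduced with a short script. This sketch uses np.linalg.solve in place of the explicit Cramer determinants; it solves the same 2 × 2 linear system at each step:

```python
import numpy as np

def F(v):                          # residual vector (p, q)
    x, y = v
    p = 3.2*x**3 + 1.8*y**2 + 24.43
    q = 2*x**2 - 3*y**3 + 5.92
    return np.array([p, q])

def J(v):                          # matrix C of the example: [[a, b], [c, d]]
    x, y = v
    return np.array([[9.6*x**2, 3.6*y],
                     [4*x,     -9*y**2]])

v = np.array([-2.0, 2.0])          # initial guess read off the plotted intersection
for it in range(1, 50):
    delta = np.linalg.solve(J(v), -F(v))
    v = v + delta
    if np.sqrt(np.sum(delta**2)) <= 1e-3:   # epsilon = ||(dx, dy)||
        break

print(it, np.round(v, 4))          # converges near (-2.1, 1.7)
```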
Following the same approach as used in the case of two variables, and choosing (xe1, xe2, · · · , xen) as the initial estimate, we arrive at:

[ ∂f1/∂x1  ∂f1/∂x2  ···  ∂f1/∂xn ]                   [ ∆x1 ]   [ −f1 ]
[ ∂f2/∂x1  ∂f2/∂x2  ···  ∂f2/∂xn ]                   [ ∆x2 ] = [ −f2 ]
[    ⋮        ⋮             ⋮    ]                   [  ⋮  ]   [  ⋮  ]
[ ∂fn/∂x1  ∂fn/∂x2  ···  ∂fn/∂xn ](xe1,xe2,···,xen)  [ ∆xn ]   [ −fn ](xe1,xe2,···,xen)
We can solve this equation to obtain the vector [∆xk], 1 ≤ k ≤ n, which can be used to find the next estimate as:

[ xa1 ]   [ xe1 ]   [ ∆x1 ]
[ xa2 ] = [ xe2 ] + [ ∆x2 ]
[  ⋮  ]   [  ⋮  ]   [  ⋮  ]
[ xan ]   [ xen ]   [ ∆xn ]

The process can be iterated while ε = ||(∆x1, ∆x2, · · · , ∆xn)|| = +√(∆x1² + ∆x2² + · · · + ∆xn²) > tolerance, every time substituting (xe1, xe2, · · · , xen) = (xa1, xa2, · · · , xan).
The determinant:

                        | ∂f1/∂x1  ∂f1/∂x2  ···  ∂f1/∂xn |
J(f1, f2, · · · , fn) = | ∂f2/∂x1  ∂f2/∂x2  ···  ∂f2/∂xn |
                        |    ⋮        ⋮             ⋮    |
                        | ∂fn/∂x1  ∂fn/∂x2  ···  ∂fn/∂xn |

is known as the Jacobian of (f1, f2, · · · , fn).
Convergence of Newton's Method is not guaranteed, but it is expected if these conditions hold:

- f1, f2, · · · , fn and their partial derivatives are continuous and bounded near the actual solution.
- J(f1, f2, · · · , fn) ≠ 0 near the solution.
- The initial solution estimate is sufficiently close to the actual solution.
3.8.2 Fixed-Point Iteration Method to solve System of Non-Linear Equations

The Fixed-Point Iteration Method to solve a single equation can be extended to handle a System of Non-Linear Equations, by rewriting the system as a set of Auxiliary Functions as follows:

x1 = g1(x1, x2, · · · , xn)
x2 = g2(x1, x2, · · · , xn)
 ⋮
xn = gn(x1, x2, · · · , xn)
Choose (xp1, xp2, · · · , xpn) as the initial estimate and substitute into the right sides of the Set of Auxiliary Equations. The updated estimates are calculated as:

xg1 = g1(xp1, xp2, · · · , xpn)
xg2 = g2(xp1, xp2, · · · , xpn)
 ⋮
xgn = gn(xp1, xp2, · · · , xpn)

For the next iteration, these new values are then re-used in the right sides of the Set of Auxiliary Equations to generate the new updates, and so on. The process continues until convergence is observed (i.e. ε = ||(∆x1, ∆x2, · · · , ∆xn)|| ≤ tolerance, where ∆xk = xgk − xpk, ∀ 1 ≤ k ≤ n).
Example 3.11: Fixed-Point Iteration Method to solve System of Non-Linear Equations
Using the Fixed-Point Iteration Method, with a tolerance of 0.001, solve the following nonlinear system of equations:

3.2x³ + 1.8y² + 24.43 = 0
−2x² + 3y³ = 5.92
or, to be precise:

xg = g1(xp, yp) = −((1.8yp² + 24.43) / 3.2)^(1/3)

yg = g2(xp, yp) = ((2xp² + 5.92) / 3)^(1/3)

Using these auxiliary functions we iterate to generate Table 3.2 (shown on page 38):
Table (3.2) – Example: Fixed-Point Iteration Method to solve nonlinear equations

#   xp        yp       xg = g1(xp, yp)   yg = g2(xp, yp)   ε
1   −2.0000   2.0000   −2.1461           1.6679            0.3628
2   −2.1461   1.6679   −2.0953           1.7150            0.0692
3   −2.0953   1.7150   −2.1021           1.6985            0.0178
4   −2.1021   1.6985   −2.0997           1.7007            0.0032
5   −2.0997   1.7007   −2.1000           1.7000            0.0008
As seen in Table 3.2, we were able to converge to the actual solution (−2.1000, 1.7000) in 5 iterations, with a tolerance of 0.001.
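Table 3.2 can be regenerated with a few lines of code, using the two auxiliary functions defined above:

```python
import numpy as np

g1 = lambda x, y: -((1.8*y**2 + 24.43) / 3.2) ** (1/3)
g2 = lambda x, y: ((2*x**2 + 5.92) / 3) ** (1/3)

x, y = -2.0, 2.0                       # initial estimate (xp, yp)
for it in range(1, 100):
    xn, yn = g1(x, y), g2(x, y)        # updated estimates (xg, yg)
    eps = np.hypot(xn - x, yn - y)     # epsilon = ||(dx, dy)||
    x, y = xn, yn
    if eps <= 1e-3:
        break

print(it, round(x, 4), round(y, 4))    # 5 iterations, close to (-2.1, 1.7)
```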
Convergence of Fixed-Point Iteration Method

As was the case for convergence of Newton's Method, convergence of the Fixed-Point Iteration Method is not guaranteed, but it is expected if these conditions hold:

- The auxiliary functions g1, g2, · · · , gn and their partial derivatives with respect to x1, x2, · · · , xn are continuous near the actual solution.
- |∂g1/∂x1| + |∂g1/∂x2| + · · · + |∂g1/∂xn| < 1
  |∂g2/∂x1| + |∂g2/∂x2| + · · · + |∂g2/∂xn| < 1
   ⋮
  |∂gn/∂x1| + |∂gn/∂x2| + · · · + |∂gn/∂xn| < 1
  (near the actual solution)
- The initial estimate (xp1, xp2, · · · , xpn) is sufficiently close to the actual solution.