MA 214: Introduction To Numerical Analysis: Shripad M. Garge. IIT Bombay (Shripad@math - Iitb.ac - In)

MA 214: Introduction to numerical analysis
Lecture 38
Shripad M. Garge.
IIT Bombay
(shripad@math.iitb.ac.in)
2021-2022
1/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
System of linear equations
We are interested in solving the system of linear equations. We

have the system
a11 x1 ` a12 x2 ` ¨ ¨ ¨ ` a1n xn “ b1

a21 x1 ` a22 x2 ` ¨ ¨ ¨ ` a2n xn “ b2
..
.
an1 x1 ` an2 x2 ` ¨ ¨ ¨ ` ann xn “ bn
which is written as A ¨ x “ b. The matrices A and b have real

entries and we want to solve for the matrix x.
As we noted before, this system has a unique solution if the matrix
A is invertible, this is so if and only if the determinant of A is
non-zero.
2/71
Gaußian elimination method
We noted that the case where the matrix A is upper triangular is

easier to solve. In that case the system is
a11 x1 ` a12 x2 ` ¨ ¨ ¨ ` a1n xn “ b1

a22 x2 ` ¨ ¨ ¨ ` a2n xn “ b2
..
.
ann xn “ bn
We can then solve for the variable xn using the last equation.
Once the value of xn is found, we use it in the second-last equation
an´1,n´1 xn´1 ` ann xn “ bn´1
to solve for xn´1 and so on.
3/71
The procedure will fail if the element aii is zero in the step of
eliminating xi because then the operation
ˆ ˙
aji
Ej ´ Ei Ñ pEj q
aii
can not be performed.
The system may still have a solution, but the technique for finding
the solution must be altered. Let us understand this through an
example: ¨ ˛¨ ˛ ¨ ˛
1 ´1 2 x1 ´6
˝2 ´2 3‚˝x2 ‚ “ ˝´14‚.
1 1 1 x3 0
We perform pE2 ´ 2E1 q Ñ pE2 q and pE3 ´ E1 q Ñ pE3 q.
4/71
The resulting system is

¨ ˛¨ ˛ ¨ ˛
1 ´1 2 x1 ´6
˝0 0 ´1‚˝x2 ‚ “ ˝´2‚.
0 2 ´1 x3 6
The diagonal entry a22 is now zero. So the procedure can not
continue in its present form.
We observe that a32 “ 2 ‰ 0, hence we perform the operation

pE2 q Ø pE3 q to obtain the system
¨ ˛¨ ˛ ¨ ˛
1 ´1 2 x1 ´6
˝0 2 ´1‚˝x2 ‚ “ ˝ 6 ‚.
0 0 ´1 x3 ´2
We then have x3 “ 2, x2 “ 4 and x1 “ ´6.

5/71
The above example illustrates what is to be done if akk “ 0 for

some k.
We search for the first nonzero entry below akk in the k-th column.
If apk ‰ 0 for some k ă p ď n, then the operation pEk q Ø pEp q is

performed to obtain the new matrix A.
The procedure can then be continued.
If apk “ 0 for each p, then the matrix A is not invertible and hence
the system does not have a unique solution. The procedure then
stops.
Finally, if ann “ 0, the linear system does not have a unique

solution, and again the procedure stops.
6/71
Operation count
After learning the method, we are interested in finding the total

number of arithmetic operations performed in the method.
In general, the amount of time required to perform a multiplication

or division on a computer is approximately the same and is
considerably greater than that required to perform an addition or
subtraction.
The actual differences in execution time, however, depend on the

particular computing system.
To demonstrate the counting operations for a given method, we

will count the operations required to solve a typical linear system
of n equations in n unknowns using GEM.
We will keep the count of the additions/subtractions separate from

the count of the multiplications/divisions because of the time
differential. 7/71
Operation count
Assume that the variables x1 , . . . , xi´1 have been eliminated from

the appropriate equations.
For eliminating xi from the equations Ei`1 onwards, we define

mji “ aji {aii and perform pEj ´ mji Ei q Ñ pEj q.
Here pn ´ iq divisions are performed, then mji is multiplied to each

term of Ei , thus resulting in pn ´ iqpn ´ i ` 1q multiplications, and
then each term of the resulting equation is subtracted from the
equation Ej , requiring pn ´ iqpn ´ i ` 1q subtractions.
Mult/Div: pn ´ iq ` pn ´ iqpn ´ i ` 1q “ pn ´ iqpn ´ i ` 2q.
Add/Sub: pn ´ iqpn ´ i ` 1q.
The total number of operations required is obtained by summing

the operation counts for each i.
8/71
Operation count
Mult/Div:
n´1
ÿ n´1
ÿ n´1
ÿ
pn ´ iqpn ´ i ` 2q “ pn ´ iq2 ` 2 pn ´ iq
i“1 i“1 i“1
n´1
ÿ n´1
ÿ
“ i2 ` 2 i
i“1 i“1
pn ´ 1qnp2n ´ 1q pn ´ 1qn
“ `2
6 2
2n3 ` 3n2 ´ 5n
“ .
6
This is the count of multiplication or division operations.
9/71
Operation count
Add/Sub:
n´1
ÿ n´1
ÿ n´1
ÿ
pn ´ iqpn ´ i ` 1q “ pn ´ iq2 ` pn ´ iq
i“1 i“1 i“1
n´1
ÿ n´1
ÿ
“ i2 ` i
i“1 i“1
pn ´ 1qnp2n ´ 1q pn ´ 1qn
“ `
6 2
n3 ´ n
“ .
3
This is the count of addition or subtraction operations.
These are the operations required to make the matrix

upper-triangular.
10/71
Operation count
We now count the number of operations required in the backward

substitutions.
Each such step requires pn ´ iq multiplications, pn ´ i ´ 1q

additions for each summation term followed by one subtraction
and finally one division.
Mult/Div:
n´1
ÿ n´1
ÿ
` ˘
1` pn ´ iq ` 1 “ 1` pn ´ iq ` pn ´ 1q
i“1 i“1
n´1
ÿ n´1
ÿ
“ pn ´ iq ` n “ i `n
i“1 i“1
npn ´ 1q n2 ` n
“ `n “ .
2 2
11/71
Operation count
Finally, we count the number of additions/subtractions:

n´1
ÿ ` ˘ n´1
ÿ n´1
ÿ n2 ´ n
pn ´ i ´ 1q ` 1 “ pn ´ iq “ i“ .
i“1 i“1 i“1
2
The total number of operations in the GEM is
Mult/Div:
2n3 ` 3n2 ´ 5n n2 ` n n3 n
` “ ` n2 ´
6 2 3 3
Add/Sub:
n3 ´ n n2 ´ n n3 n2 5n
` “ ` ´ .
3 2 3 2 6
12/71
Operation count
For large n, the total number of multiplications and divisions is

approximately n3 {3, as is the total number of additions and
subtractions.
Thus the amount of computation and the time required increases

with n in proportion to n3 , as shown below:
n Mult/Div Add/Sub
3 17 11
10 430 375
50 44, 150 42, 875
100 3, 43, 400 3, 38, 250
13/71
Lecture 39
Shripad M. Garge.
IIT Bombay
2021-2022
14/71
Pivoting
In GEM, we found that a row interchange was needed when one of

the pivot elements akk is 0.
This row interchange has the form pEk q Ø pEp q, where p is the
smallest integer greater than k with apk ‰ 0.
To reduce the round-off error, it is often necessary to perform row

interchanges even when the pivot elements are not zero.
If akk is small in magnitude compared to ajk , then the magnitude

of the multiplier mjk “ ajk {akk will be much larger than 1.
Round-off error introduced in the computation of one of the terms

akl is multiplied by mjk when computing ajl , which compounds the
original error.
15/71
Pivoting
Also, when performing the backward substitution for
ak,n`1 xn`1 ´ nj“k`1 akj xj

ř
xk “
akk
with a small value of akk , any error in the numerator can be
dramatically increased because of the division by akk .
In our next example, we will see that even for small systems,
round-off error can dominate the calculations.
Consider the system:

E1 : 0.003x1 ` 59.14x2 “ 59.17
E2 : 5.291x1 ´ 6.13x2 “ 46.78
We use four-digit rounding arithmetic and compare the results to
the exact solution x1 “ 10 and x2 “ 1.
16/71
Pivoting
The first pivot element, a11 “ 0.003, is small and then

5.291
m21 “ “ 1763.6
0.003
rounds to 1764.
Performing pE2 ´ m21 E1 q Ñ pE2 q and the appropriate rounding

gives the system
0.003x1 ` 59.14x2 “ 59.17
´104300x2 « ´104400
instead of the exact system.
The disparity in the magnitudes of m21 a13 and a23 has introduced
a round-off error, but the round-off error has not yet been
propagated.
17/71
Pivoting
Backward substitution yields
x2 « 1.001
which is a close approximation to the actual value, x2 “ 1.
However, because of the small pivot a11 “ 0.003
59.17 ´ p59.14qp1.001q
x1 « “ ´10
0.003
59.14
contains the small error of 0.001 multiplied by « 20000.
0.003
This ruins the approximation to the actual value x1 “ 10.
This is clearly a contrived example but it does give an idea on how

the round-off error can alter the results.
18/71
Partial pivoting
For larger systems it is much more difficult to predict in advance

when devastating round-off error might occur.
To avoid this problem, pivoting is performed by selecting an

element apq with a larger magnitude as the pivot, and
interchanging the k-th and p-th rows. This can be followed by the
interchange of the k-th and q-th columns, if necessary.
The simplest strategy is to select an element in the same column

that is below the diagonal and has the largest absolute value;
specifically, we determine the smallest p ě k such that
|apk | “ max |aik |

kďiďn
and perform pEk q Ø pEp q.
In this case no interchange of columns is used.

19/71
Partial pivoting
Consider the system:

E1 : 0.003x1 ` 59.14x2 “ 59.17
E2 : 5.291x1 ´ 6.13x2 “ 46.78
We use partial pivoting and four-digit rounding arithmetic and
compare the results to the exact solution x1 “ 10 and x2 “ 1.
The partial pivoting procedure first requires finding

( (
max |a11 |, |a21 | “ max |0.003|, |5.291| “ |5.291| “ |a21 |.
Then we perform pE2 q Ø pE1 q to get the system
E1 : 5.291x1 ´ 6.13x2 “ 46.78

E2 : 0.003x1 ` 59.14x2 “ 59.17
20/71
Partial pivoting
The multiplier for this system is

a21 0.003
m21 “ “ “ 0.0005670
a11 5.291
and the operation pE2 ´ m21 E1 q Ñ pE2 q reduces the system to
E1 : 5.291x1 ´ 6.13x2 “ 46.78

E2 : 59.14x2 « 59.14
The four-digit answers resulting from the backward substitution are
the correct values x1 “ 10 and x2 “ 1.
Each multiplier mji in the partial pivoting algorithm has magnitude

less than or equal to 1.
Although this strategy is sufficient for many linear systems,

situations do arise when it is inadequate.
21/71
Scaled partial pivoting
The linear system:

E1 : 30x1 ` 591400x2 “ 591700
E2 : 5.291x1 ´ 6.13x2 “ 46.78
is the same as that in the above examples, except that all the
entries in the first equation, E1 , have been multiplied by 104 .
The partial pivoting procedure with four-digit rounding arithmetic

leads to the same results as obtained in the first example. The
maximal value in the first column is 30, and the multiplier
5.291
m21 “ “ 0.1764 leads to the system
30
E1 : 30x1 ` 591400x2 “ 591700
E2 : ´104300x2 “ ´104400
which has the same inaccurate solutions x2 « 1.001 and x1 « ´10.
22/71
Scaled partial pivoting (or scaled-column pivoting) is needed for

the system illustrated above.
It places the element in the pivot position that is largest relative to

the entries in its row.
The first step in this procedure is to define a scale factor si for

each row as
si “ max |aij |.
1ďjďn
Assuming that si ą 0, the appropriate row interchange to place

zeros in the first column is determined by choosing the least
integer p with
|ap1 | |ak1 |
“ max
sp 1ďkďn s1
and performing pE1 q Ø pEp q.

23/71
The effect of scaling is to ensure that the largest element in each

row has a relative magnitude of 1 before the comparison for row
interchange is performed.
In a similar manner, before eliminating the variable xi using the

operations Ek ´ mki Ei , for k “ i ` 1, . . . , n, we select the smallest
integer p ě i with
|api | |aki |
“ max
sp iďkďn sk
and perform the row interchange pEi q Ø pEp q if i ‰ p.
The scale factors s1 , . . . , sn are computed only once, at the start of

the procedure.
They are row dependent, so they must also be interchanged when

row interchanges are performed.
24/71
We now apply scaled partial pivoting to the previous example. It

gives
s1 “ maxt|30.00|, |591400|u “ 591400, s2 “ 6.130.
Consequently,
|a11 | 30 |a21 | 5.291

“ “ 0.5073 ˆ 10´4 , “ “ 0.8631
s1 591400 s2 6.130
and the interchange pE1 q Ø pE2 q is made.
Applying GEM to the new system

E1 : 5.291x1 ´ 6.13x2 “ 46.78
E2 : 30x1 ` 591400x2 “ 591700
produces the correct results: x1 “ 10 and x2 “ 1.
25/71
Operation count
The first additional computations required for scaled partial

pivoting result from the determination of the scale factors; there
are pn ´ 1q comparisons for each of the n rows, for a total of
npn ´ 1q comparisons.
To determine the correct first interchange, n divisions are

performed, followed by n ´ 1 comparisons. So the first interchange
determination adds n divisions and n ´ 1 comparisons.
The scaling factors are computed only once, so the second step
requires pn ´ 1q divisions and pn ´ 2q comparisons.
We proceed in a similar manner until there are zeros below the

main diagonal in all but the n-th row.
The final step requires that we perform 2 divisions and 1

comparison.
26/71
Operation count
As a consequence, scaled partial pivoting adds a total of

n´1
ÿ npn ´ 1q 3
npn ´ 1q ` k “ npn ´ 1q ` “ npn ´ 1q
k“1
2 2
comparisons and
n
ÿ npn ` 1q 1
k“ ´ 1 “ pn ´ 1qpn ` 2q
k“2
2 2
divisions to the GEM.
The time required to perform a comparison is about the same as

an addition/subtraction.
27/71
Operation count
The total time to perform the basic GEM is Opn3 {3q

multiplications/divisions and Opn3 {3q additions/subtractions.
Hence, scaled partial pivoting does not add significantly to the

computational time required to solve a system for large values of n.
To emphasize the importance of choosing the scale factors only

once, imagine if the procedure were modified so that new scale
factors were determined each time a row interchange was made.
In this case, the term npn ´ 1q would be replaced by

n
ÿ 1
kpk ´ 1q “ npn2 ´ 1q.
k“2
3
As a consequence, this pivoting technique would add Opn3 {3q

comparisons, in addition to the rnpn ` 1q{2s ´ 1 divisions.
28/71
Lecture 40
Shripad M. Garge.
IIT Bombay
2021-2022
29/71
LU decomposition
The Gaußian elimination method is the principal tool in solving

linear systems of equations, so it should be no surprise that it
appears in other guises.
We now see that the steps used to solve a system of the form
Ax “ b can be used to factor the matrix A.
The factorisation is particularly useful when it has the form

A “ LU, where L is lower triangular and U is upper triangular.
Although not all matrices have this type of representation, many

do that occur frequently in the application of numerical techniques.
To see which matrices have an LU factorisation and to find how it

is determined, first suppose that Gaußian elimination method can
be performed on the system Ax “ b without row interchanges.
This is equivalent to having nonzero pivot elements aii , for each i.

30/71
LU decomposition
The first step in the Gaussian elimination method consists of

performing, for each j “ 2, 3, ..., n, the operations
aj1
pEj ´ mj1 E1 q Ñ pEj q, where mj1 “ .
a11
These operations transform the system into one in which all the
entries in the first column below the diagonal are zero.
The above system of operations can be viewed in another way.
It is simultaneously accomplished by multiplying the original matrix

A on the left by the matrix
¨ ˛
1 0 ¨¨¨ 0
˚´m21 1 0‹
M p1q “ ˚ . ‹.
˚ ‹
˝ .. ..
. 0‚
´mn1 0 ¨ ¨ ¨ 1
31/71
LU decomposition
It is a matrix which differs from the identity matrix only in the first
column, below the diagonal element, where the entry 0 is replaced
by the negatives of multipliers, ´mj1 .
We then have
M p1q Ax “ M p1q b which gives Ap2q x “ b p2q .
In a similar manner, we construct M p2q , which is obtained by

replacing the entries below the diagonal in the second column of
the identity matrix by the negatives of the multipliers, ´mj2 .
This changes the system by
M p2q Ap2q x “ M p2q b p2q giving Ap3q x “ b p3q .
32/71
LU decomposition
In general, with
Apkq x “ b pkq
already formed, we multiply by the matrix M pkq .
This is the matrix which differs from the identity matrix only in the
k-th column, below the diagonal entry, where 0 is replaced by
´mjk to obtain the next system
Apk`1q x “ M pkq Apkq x “ M pkq b pkq “ b pk`1q .
The process ends with the formation of Apnq x “ b pnq , where Apnq is
the upper triangular matrix, the result of the Gaußian elimination
method.
We have
Apnq “ M pn´1q M pn´2q ¨ ¨ ¨ M p1q A.
33/71
LU decomposition
The matrix Apnq is an upper triangular matrix.
This is the U matrix in our desired decomposition A “ LU.
The matrix L then has to be L “ AU ´1 but we will obtain it using

the above matrices M pkq .
Note that each of the M pkq is lower triangular and so is the inverse
of each M pkq .
The matrix M pkq effects the row operation pEj ´ mjk Ek q Ñ pEj q
and its inverse Lpkq should reverse this operation, hence should
perform the operation pEj ` mjk Ek q Ñ pEj q.
Thus, the matrix Lpkq “ pM pkq q´1 differs from the identity matrix
only in the k-th column, below the diagonal entry, where 0 is
replaced by the multiplier, mjk , for j “ k ` 1, . . . , n.
34/71
LU decomposition
Therefore L “ Lp1q Lp2q ¨ ¨ ¨ Lpn´1q is a lower triangular matrix and

one has
` ˘` ˘
A “ LU “ Lp1q Lp2q ¨ ¨ ¨ Lpn´1q M pn´1q M pn´2q ¨ ¨ ¨ M p1q A .
Thus, if the GEM can be performed on a system Ax “ b without

any row changes, then the matrix A has a factorisation A “ LU as
a product of a lower triangular matrix and an upper triangular
matrix.
Consider the system

¨ ˛¨ ˛ ¨ ˛
1 1 0 3 x1 1
˚2 1 ´1 1 ‹ ˚x2 ‹ ˚ 1 ‹
˚ ‹˚ ‹ “ ˚ ‹.
˝3 ´1 ´1 2 ‚˝x3 ‚ ˝´3‚
´1 2 3 ´1 x4 4
35/71
An example
We obtain the LU factorisation. The first operations are:
pE2 ´ 2E1 q Ñ pE2 q, pE3 ´ 3E1 q Ñ pE3 q, pE4 ´ p´1qE1 q Ñ pE4 q.
This gives
¨ ˛¨ ˛
1 0 0 0 1 1 0 3
˚2 1 0 0‹ ˚0 ´1 ´1 ´5‹ ` ˘` ˘
A“˚ ‹˚ ‹ “ Lp1q M p1q A .
˝3 0 1 0‚˝0 ´4 ´1 ´7‚
´1 0 0 1 0 3 3 2
The next operations are: pE3 ´ 4E2 q Ñ pE3 q,

pE4 ´ p´3qE2 q Ñ pE4 q which gives
¨ ˛¨ ˛¨ ˛
1 0 0 0 1 0 0 0 1 1 0 3
˚2 1 0 0‹ ˚0 1 0 0‹ ˚0 ´1 ´1 ´5 ‹
A“˚
˝3
‹˚ ‹˚ ‹
0 1 0‚˝0 4 1 0‚˝0 0 3 13 ‚
´1 0 0 1 0 ´3 0 1 0 0 0 ´13
` ˘` ˘
A “ Lp1q Lp2q M p2q M p1q A .
36/71
The example, continued
The resulting matrix M p2q M p1q A is upper triangular now. Hence

U “ M p2q M p1q A and L “ Lp1q Lp2q . We thus get
¨ ˛¨ ˛
1 0 0 0 1 1 0 3
˚2 1 0 0‹ ‹ ˚0 ´1 ´1 ´5 ‹ .
˚ ‹
A“˚ ˝3 4 1 0 ‚ ˝ 0 0 3 13 ‚
´1 ´3 0 1 0 0 0 ´13
To solve the system Ax “ LUx “ b, we introduce the variable y ,

defined by y “ Ux. We solve for y first from the system Ly “ b:
¨ ˛¨ ˛ ¨ ˛
1 0 0 0 y1 8
˚2 1 0 0‹ ˚y2 ‹ ˚ 7 ‹
‹ ˚ ‹ ˚
˚ “ ‹.
˝3 4 1 0‚˝y3 ‚ ˝ 14 ‚
´1 ´3 0 1 y4 ´7
This gives y1 “ 8, y2 “ ´9, y3 “ 26 and y4 “ ´26.

37/71
The example, continued
We then solve for Ux “ y

¨ ˛¨ ˛ ¨ ˛
1 1 0 3 x1 8
˚0 ´1 ´1 ´5 ‹‹ ˚x2 ‹ “ ˚ ´9 ‹ .
˚ ‹ ˚ ‹
˚
˝0 0 3 13 ‚˝x3 ‚ ˝ 26 ‚
0 0 0 ´13 x4 ´26
Using backward substitution, we obtain x4 “ 2, x3 “ 0, x2 “ ´1

and x1 “ 3.
Observe that we had the diagonal entry in the matrix L, lii “ 1 for
all i in this method.
The methods differ by putting conditions on the diagonal elements

of L. The one we studied just now had lii “ 1. Some other method
demands that uii “ 1 and yet another demands that lii “ uii .
38/71
Permutation matrices
Now, we worry ourselves with the question about row interchange.
If the row interchange is not needed then A can be factored as

A “ LU.
Performing a row interchange pEi q Ø pEj q of the system Ax “ b is

achieved by multiplying to the system on the left by a certain
matrix P.
This matrix P differs from the identity matrix only in 4 places,

pii “ pjj “ 0 and pji “ pji “ 1.
Such a matrix is called a permutation matrix.
In general, a permutation matrix is a matrix whose every row and

every column has only one non-zero entry which is equal to 1.
39/71
Permutation matrices
We write the permutation matrices of size 2.

ˆ ˙ ˆ ˙
1 0 0 1
, .
0 1 1 0
There will be 6 permutation matrices of size 3, and in general n! of

size n.
Note that the product of two permutation matrices is a

permutation matrix, the determinant of a permutation matrix is
˘1 and the inverse of a permutation matrix is again a permutation
matrix.
If we knew the row interchanges that would be needed to solve a

particular system by GEM, then we could arrange the original
equations in an order that would ensure that no row operations are
required.
40/71
PLU decomposition
This means that for an invertible matrix A, a permutation matrix

P exists such that the system PAx “ Pb does not require row
interchanges.
Then the Gaußian elimination method can be applied to it.
As a consequence, PA can be factored as LU where L is a lower

triangular matrix and U is an upper triangular matrix.
Then we get
A “ P ´1 LU “ P 1 LU
which is a decomposition of A as a product of a permutation
matrix P 1 , a lower triangular matrix L and an upper triangular
matrix M, in that order.
This is called a PLU decomposition.
41/71
Lecture 41
Shripad M. Garge.
IIT Bombay
2021-2022
42/71
PLU decomposition
Given a system of linear equations Ax “ b, if we knew the row

interchanges that would be needed to solve it by GEM, then we
could arrange the original equations in an order that would ensure
that no row operations are required.
This means that for an invertible matrix A, a permutation matrix

P exists such that the system PAx “ Pb does not require row
interchanges. Then the matrix PA can be factored as LU where L
is a lower triangular matrix and U is an upper triangular matrix.
Then we get A “ P ´1 LU “ P 1 LU which is a decomposition of A

as a product of a permutation matrix P 1 , a lower triangular matrix
L and an upper triangular matrix M, in that order.
This is called a PLU decomposition.
43/71
PLU decomposition
Consider the matrix

¨ ˛
0 0 ´1 1
˚1 1 ´1 2‹
A“˚
˝´1 ´1 2 0‚.
‹
1 2 0 2
This matrix can not have factorisation as A “ LU because a11 “ 0.
We therefore perform pE1 q Ø pE2 q and then pE3 ` E1 q Ñ pE3 q,

pE4 ´ E1 q Ñ pE4 q to get the matrix
¨ ˛
1 1 ´1 2
˚0 0 ´1 1‹
˝0 0 1 2‚.
˚ ‹
0 1 1 0
This necessitates a row interchange pE2 q Ø pE4 q. 44/71

PLU decomposition
After that we perform pE4 ` E3 q Ñ pE4 q to get the matrix

¨ ˛
1 1 ´1 2
˚0 1 1 0 ‹
˚ ‹
˝0 0 1 2 ‚
0 0 0 3
which is upper triangular.
The permutation matrix associated to the row interchanges

pE1 q Ø pE2 q followed by pE2 q Ø pE4 q is
¨ ˛
0 1 0 0
˚0 0 0 1‹
P“˚ ˝0 0 1 0‚.
‹
1 0 0 0
45/71
PLU decomposition
We have ¨ ˛
1 1 ´1 2
˚1 2 0 2‹
PA “ ˚
˝´1 ´1 2
‹.
0‚
0 0 ´1 1
This is the matrix on which we can perform the Gaußian
elimination, the operations being the same as the ones above:
pE2 ´ E1 q Ñ pE2 q, pE3 ` E1 q Ñ pE3 q and pE4 ` E3 q Ñ pE4 q.
The nonzero multipliers for PA are consequently, m21 “ 1,

m31 “ ´1 and m43 “ ´1.
This gives the LU factorisation for the matrix PA.
46/71
PLU decomposition
¨ ˛¨ ˛
1 0 0 0 1 1 ´1 2
˚1 1 0 0‹ ˚0
‹ ˚ 1 1 0‹
PA “ ˚ ‹ “ LU.
˝´1 0 1 0‚˝0 0 1 2‚
0 0 ´1 1 0 0 0 3
The final decomposition for A as a product of a permutation
matrix, a lower triangular matrix and an upper triangular matrix
can be obtained as A “ P ´1 LU:
¨ ˛¨ ˛¨ ˛
0 0 0 1 1 0 0 0 1 1 ´1 2
˚1 0 0 1‹ ˚ 1 1 0 0‹ ˚0 1 1 0‹
˝0 0 1 0‚˝´1 0 1 0‚˝0 0 1 2‚.
˚ ‹˚ ‹˚ ‹
0 1 0 0 0 0 ´1 1 0 0 0 3
While it would be of interest to some to write A “ PLU, it is

desirable to know the matrices which admit GEM without row
interchanges. 47/71
Diagonally dominant matrices
We now turn attention to two classes of matrices for which

Gaussian elimination can be performed effectively without row
interchanges.
An n ˆ n matrix A is said to be diagonally dominant when

n
ÿ
|aii | ě |aij |
j “1
j ‰i
holds for each i “ 1, 2, . . . , n.
A diagonally dominant matrix is said to be strictly diagonally

dominant when the above inequality is strict for each n.
48/71
Diagonally dominant matrices
A strictly diagonally dominant matrix A is invertible.
Moreover, in this case, the Gaußian elimination method can be

performed on any linear system of the form Ax “ b to obtain its
unique solution without row interchanges.
Furthermore, the computations will be stable with respect to the

growth of round-off errors.
The above statement is a serious theorem, which we will not prove

in this course, but we need to understand that this is the reason
the strictly dominant matrices are useful.
49/71
Positive definite matrices
A matrix A is called positive definite if A is symmetric and if

x t Ax ą 0 for every 0 ‰ x.
If A “ raij s and the column vector x is given by x “ rx1 , . . . , xn st

then ˜ ¸
n ÿ
ÿ n
x t Ax “ aij xi xj .
i“1 j“1
The matrix ¨ ˛
2 ´1 0
A “ ˝´1 2 ´1‚
0 ´1 2
is a positive definite matrix.
To begin with, A is a symmetric matrix.
50/71
Further,
x t Ax “ 2x12 ´ 2x1 x2 ` 2x22 ´ 2x2 x3 ` 2x32

“ x12 ` px1 ´ x2 q2 ` px2 ´ x3 q2 ` x32 .
If px1 , x2 , x3 q ‰ p0, 0, 0q then it follows that
x t Ax ą 0
and hence A is a positive definite matrix.
It is clear from this example that using the definition to determine

if a matrix is positive definite is difficult.
Fortunately, there are more easily verified criteria, which will be

presented later in our course, for identifying members of this
important class.
51/71
The next result provides some necessary conditions that can be

used to eliminate certain matrices from consideration.
If A “ raii s is a positive definite matrix then
A is invertible,
aii ą 0 for each i,
paij q2 ă aii ajj for each i ‰ j,
max |akj | ď max |aii |.
1ďk,jďn 1ďiďn
We should note, however, that a matrix A may satisfy all these

conditions and may still not be positive definite!
We now give a necessary and sufficient criterion for a matrix to be

positive definite.
52/71
A leading principal submatrix of a matrix A is a matrix of the form

¨ ˛
a11 a12 ¨ ¨ ¨ a1k
˚a21 a22 ¨ ¨ ¨ a2k ‹
˚ ‹
˚ .. .. .. .. ‹
˝ . . . . ‚
ak1 ak2 ¨ ¨ ¨ akk
for some 1 ď k ď n.
A matrix A is positive definite if and only if each leading principal

submatrix of A has positive determinant.
Consider the matrix discussed in the above example:

¨ ˛
2 ´1 0
A “ ˝´1 2 ´1‚.
0 ´1 2
53/71
We verify that it is positive definite using the criteria.

ˆ ˙
2 ´1
A1 “ p2q, A2 “ .
´1 2
We then have detpA1 q “ 2 and detpA2 q “ 3.
Since detpAq “ 4, we have that A is positive definite.
We now record the results that make positive definite matrices

interesting.
A symmetric matrix A is positive definite if and only if Gaußian

elimination method can be applied to any system Ax “ b without
interchanging rows with all pivot elements positive.
Moreover, in this case, the computations are stable with respect to

the growth of round-off errors.
54/71
The positive definite matrices admit interesting factorisations.
A matrix A is positive definite if and only if A “ LDLt where L is

lower triangular with 1’s on the diagonal and D is a diagonal
matrix with positive diagonal entries.
A matrix A is positive definite if and only if A “ LLt where L is a

lower triangular matrix.
A factorisation of a matrix A, A “ LU with lii “ uii is called a

Cholesky factorisation.
The factorisation A “ Lt L is one such.
Note that being positive definite is not necessary to admit a

Cholesky factorisation. The zero matrix admits one, for instance.
55/71
Cholesky factorisation
Determine the Cholesky factorisation of the positive definite matrix

¨ ˛
4 ´1 1
A “ ˝´1 4.25 2.75‚.
1 2.75 3.5
If
¨ ˛ ¨ ˛¨ ˛
4 ´1 1 l11 0 0 l11 l21 l31
A “ ˝´1 4.25 2.75‚ “ ˝l21 l22 0 ‚˝ 0 l22 l32 ‚
1 2.75 3.5 l31 l32 l33 0 0 l33
then we get
¨ ˛ ¨ 2 ˛
4 ´1 1 l11 l11 l21 l11 l31
˝´1 4.25 2.75‚ “ ˝l11 l21 2 ` l2
l21 l21 l31 ` l22 l32 ‚
22
1 2.75 3.5 2 ` l2 ` l2
l11 l31 l21 l31 ` l22 l32 l31 32 33
56/71
Cholesky factorisation
We can solve for lij :
a11 : 2
4 “ l11 ùñ l11 “ 2,
a21 : ´1 “ l11 l21 ùñ l21 “ ´0.5,
a31 : 1 “ l11 l31 ùñ l31 “ 0.5,
a22 2 ` l2
: 4.25 “ l21 ùñ l22 “2
22
a32 : 2.75 “ l21 l31 ` l22 l32 ùñ l32 “ 1.5
a33 2 ` l2 ` l2
: 3.5 “ l31 ùñ l33 “1
32 33
and hence
¨ ˛ ¨ ˛¨ ˛
4 ´1 1 2 0 0 2 ´0.5 0.5
A “ ˝´1 4.25 2.75‚ “ ˝´0.5 2 0‚˝0 2 1.5‚ “ LLt .
1 2.75 3.5 0.5 1.5 1 0 0 1
57/71
Lecture 42
Shripad M. Garge.
IIT Bombay
2021-2022
58/71
Iterative methods
The methods we used to solve systems Ax “ b until the last

lecture are called the direct methods.
Now we want to try the analogues of iterative techniques which

were so useful in finding solutions of equations in one variable.
An initial approximation (or approximations) was found, and new

approximations are then determined based on how well the
previous approximations satisfied the equation.
The objective is to find a way to minimise the difference between

the approximations and the exact solution.
To discuss iterative methods for solving linear systems, we first

need to determine a way to measure the distance between
n-dimensional column vectors.
59/71
Distance in Rn
Our underlying set is Rn , the vector space of all column vectors of

size n ˆ 1.
We define two types of distances on pairs of vectors in Rn . Let

x, y P Rn with
x “ px1 , . . . , xn qt and y “ py1 , . . . , yn qt .
The l2 -distance:
˜ ¸1{2
n
ÿ
2
}x ´ y}2 “ pxi ´ yi q .
i“1
The l8 -distance:
}x ´ y}8 “ max |xi ´ yi |.

1ďiďn
60/71
Distance in Rn
Consider the system of linear equations:
3.3330x1 ` 15920x2 ´ 10.333x3 “ 15913,

2.2220x1 ` 16.710x2 ` 9.6120x3 “ 28.544,
1.5611x1 ` 5.1791x2 ` 1.6852x3 “ 8.4254
which has the exact solution x “ p1, 1, 1qt but the Gaußian
elimination method performed using five-digit rounding arithmetic
and partial pivoting produces the approximate solution
x̃ “ p1.2001, 0.99991, 0.92538qt .
We now compute the distance of x̃ in both the distances, l2 and l8 .
61/71
Distance in Rn
We have
˘1{2
p0.2001q2 ` p0.00009q2 ` p0.07462q2
`
}x ´ x̃}2 “
“ 0.21356
and
}x ´ x̃}8 “ maxt0.2001, 0.00009, 0.07462u
“ 0.2001.
Here, the second and the third components are good
approximations but the first component is a poor approximation
and that error dominates the computations of both the above
distances.
The above notions of distance can be used to define convergence

of sequences in Rn .
?
Note that }x ´ y}8 ď }x ´ y}2 ď n}x ´ y}8 .
62/71
Norms on matrices
We want to generalise the notion of absolute value in R to the

space of n ˆ n matrices.
Note that we also have the analogues of absolute value on Rn ,

}x}2 “ }x ´ 0}2 and }x}8 “ }x ´ 0}8 .
They are called the l2 and l8 norms on Rn .
We now define:
}A}2 “ max }Ax}2 , }A}8 “ max }Ax}8

}x}2 “1 }x}8 “1
The l8 norm on matrices can be computed quite easily. If

A “ paij q then ÿ
}A}8 “ max |aij |.
i
j
63/71
Norms on matrices
Compute the l8 norm of

¨ ˛
1 2 ´1
A “ ˝0 3 ´1‚.
5 ´1 1
We have
3
ÿ 3
ÿ
|a1j | “ 1 ` 2 ` 1 “ 4, |a2j | “ 0 ` 3 ` 1 “ 4
j“1 j“1
and
3
ÿ
|a3j | “ 5 ` 1 ` 1 “ 7.
j“1
Hence }A}8 “ 7.
64/71
Eigenvalues, eigenvectors
An n ˆ n matrix A acts as a function from Rn to Rn .
The eigenvectors of A are the vectors that are sent to their

multiples by the matrix A.
More precisely, a nonzero vector v P Rn is an eigenvector for A if

there is a λ P R such that Av “ λv.
A real number λ P R is called an eigenvalue of A if there is a

non-zero v P Rn with Av “ λv.
Now, if λ is an eigenvalue of A then A ´ λI is not invertible,

because it sends a nonzero v to zero. Hence detpA ´ λI q “ 0.
Thus, the eigenvalues of A are the roots of the characteristic

polynomial of A which is the detpA ´ λI q.
65/71
Determine the eigenvalues and eigenvectors of

¨ ˛
2 0 0
A “ ˝1 1 2‚.
1 ´1 4
Note that the characteristic polynomial of A is
´pλ3 ´ 7λ2 ` 16λ ´ 12q “ ´pλ ´ 3qpλ ´ 2q2 .
Thus the eigenvalues of A are 2 and 3.
Now we compute the corresponding eigenvectors.
The eigenvectors for λ are the vectors v with pA ´ λI qpvq “ 0.
66/71
λ “ 3: pA ´ 3I qpvq “ 0 gives
¨ ˛¨ ˛ ¨ ˛
´1 0 0 v1 0
˝ 1 ´2 2‚˝v2 ‚ “ ˝0‚.
1 ´1 1 v3 0
This gives
v1 “ 0 and v2 “ v3
hence an eigenvector for eigenvalue 3 is of the form cp0, 1, 1qt for
0 ‰ c P R.
λ “ 2: A similar analysis as above tells us that an eigenvector for

eigenvalue 2 is a non-zero vector v satisfying v1 ´ v2 ` 2v3 “ 0.
In particular, it is of the form c1 p0, 2, 1qt ` c2 p´2, 0, 1qt where

pc1 , c2 q ‰ p0, 0q.
67/71
Spectral radius
Note that even though our A has real entries, some roots of the
characteristic polynomial of A may be (non-real) complex numbers.
The spectral radius of A, ρpAq, is defined by
ρpAq “ max |λ|
where λ varies over all roots of the characteristic polynomial of A.
If A is an n ˆ n matrix then
“ ‰1{2
}A}2 “ ρpAt Aq
ρpAq ď }A}2 and
ρpAq ď }A}8 .
Let us now compute l2 -norm of a matrix.

68/71
Spectral radius
Let ¨ ˛
1 1 0
A “ ˝ 1 2 1‚.
´1 1 2
We apply the above result, and for that we need to compute
ρpAt Aq.
Here
¨ ˛¨ ˛ ¨ ˛
1 1 ´1 1 1 0 3 2 ´1
At A “ ˝1 2 1 ‚˝ 1 2 1‚ “ ˝ 2 6 4 ‚
0 1 2 ´1 1 2 ´1 4 5
?
and the eigenvalues of At A are 0, 7 ˘ 7. Hence
b
a ?
t
}A}2 “ ρpA Aq “ 7 ` 7 « 3.10576.
69/71
Convergent matrices
In studying iterative matrix techniques, it is of particular

importance to know when powers of a matrix become small, that
is, when all the entries approach zero.
Matrices of this type are called convergent.
An n ˆ n matrix A is called convergent if for each 1 ď i, j ď n
lim pAk qij “ 0.

kÑ8
For instance, if
ˆ ˙ ˆ ˙
1{2 0 k 1{2k 0
A“ then A “ .
1{4 1{2 k{2k`1 1{2k
It then follows that A is a convergent matrix.
70/71
Convergent matrices
The following statements are equivalent:
1 A is a convergent matrix.
2 limnÑ8 }An }2 “ 0.
3 limnÑ8 }An }8 “ 0.
4 ρpAq ă 1.
5 limnÑ8 An v “ 0 for every vector v.
71/71

MA 214: Introduction To Numerical Analysis: Shripad M. Garge. IIT Bombay (Shripad@math - Iitb.ac - In)

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

MA 214: Introduction To Numerical Analysis: Shripad M. Garge. IIT Bombay (Shripad@math - Iitb.ac - In)

Uploaded by

Copyright:

Available Formats

MA 214: Introduction to numerical analysis

We are interested in solving the system of linear equations. We

a11 x1 ` a12 x2 ` ¨ ¨ ¨ ` a1n xn “ b1

which is written as A ¨ x “ b. The matrices A and b have real

We noted that the case where the matrix A is upper triangular is

a11 x1 ` a12 x2 ` ¨ ¨ ¨ ` a1n xn “ b1

an´1,n´1 xn´1 ` ann xn “ bn´1

to solve for xn´1 and so on.

can not be performed.

The resulting system is

We observe that a32 “ 2 ‰ 0, hence we perform the operation

We then have x3 “ 2, x2 “ 4 and x1 “ ´6.

The above example illustrates what is to be done if akk “ 0 for

If apk ‰ 0 for some k ă p ď n, then the operation pEk q Ø pEp q is

The procedure can then be continued.

Finally, if ann “ 0, the linear system does not have a unique

After learning the method, we are interested in finding the total

In general, the amount of time required to perform a multiplication

The actual differences in execution time, however, depend on the

To demonstrate the counting operations for a given method, we

We will keep the count of the additions/subtractions separate from

Assume that the variables x1 , . . . , xi´1 have been eliminated from

For eliminating xi from the equations Ei`1 onwards, we define

Here pn ´ iq divisions are performed, then mji is multiplied to each

Mult/Div: pn ´ iq ` pn ´ iqpn ´ i ` 1q “ pn ´ iqpn ´ i ` 2q.

Add/Sub: pn ´ iqpn ´ i ` 1q.

The total number of operations required is obtained by summing

These are the operations required to make the matrix

We now count the number of operations required in the backward

Each such step requires pn ´ iq multiplications, pn ´ i ´ 1q

Finally, we count the number of additions/subtractions:

The total number of operations in the GEM is

For large n, the total number of multiplications and divisions is

Thus the amount of computation and the time required increases

In GEM, we found that a row interchange was needed when one of

To reduce the round-off error, it is often necessary to perform row

If akk is small in magnitude compared to ajk , then the magnitude

Round-off error introduced in the computation of one of the terms

Also, when performing the backward substitution for

ak,n`1 xn`1 ´ nj“k`1 akj xj

Consider the system:

The first pivot element, a11 “ 0.003, is small and then

Performing pE2 ´ m21 E1 q Ñ pE2 q and the appropriate rounding

Backward substitution yields

which is a close approximation to the actual value, x2 “ 1.

However, because of the small pivot a11 “ 0.003

This is clearly a contrived example but it does give an idea on how

For larger systems it is much more difficult to predict in advance

To avoid this problem, pivoting is performed by selecting an

The simplest strategy is to select an element in the same column

|apk | “ max |aik |

and perform pEk q Ø pEp q.

In this case no interchange of columns is used.

Consider the system:

The partial pivoting procedure first requires finding

Then we perform pE2 q Ø pE1 q to get the system

E1 : 5.291x1 ´ 6.13x2 “ 46.78

The multiplier for this system is

E1 : 5.291x1 ´ 6.13x2 “ 46.78

Each multiplier mji in the partial pivoting algorithm has magnitude

Although this strategy is sufficient for many linear systems,

The linear system:

The partial pivoting procedure with four-digit rounding arithmetic