You are on page 1of 71

MA 214: Introduction to numerical analysis

Lecture 38

Shripad M. Garge.
IIT Bombay

(shripad@math.iitb.ac.in)

2021-2022

1/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
System of linear equations

We are interested in solving the system of linear equations. We


have the system

a11 x1 ` a12 x2 ` ¨ ¨ ¨ ` a1n xn “ b1


a21 x1 ` a22 x2 ` ¨ ¨ ¨ ` a2n xn “ b2
..
.
an1 x1 ` an2 x2 ` ¨ ¨ ¨ ` ann xn “ bn

which is written as A ¨ x “ b. The matrices A and b have real


entries and we want to solve for the matrix x.
As we noted before, this system has a unique solution if the matrix
A is invertible, this is so if and only if the determinant of A is
non-zero.

2/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Gaußian elimination method

We noted that the case where the matrix A is upper triangular is


easier to solve. In that case the system is

a11 x1 ` a12 x2 ` ¨ ¨ ¨ ` a1n xn “ b1


a22 x2 ` ¨ ¨ ¨ ` a2n xn “ b2
..
.
ann xn “ bn

We can then solve for the variable xn using the last equation.
Once the value of xn is found, we use it in the second-last equation

an´1,n´1 xn´1 ` ann xn “ bn´1

to solve for xn´1 and so on.

3/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Gaußian elimination method

The procedure will fail if the element aii is zero in the step of
eliminating xi because then the operation
ˆ ˙
aji
Ej ´ Ei Ñ pEj q
aii

can not be performed.

The system may still have a solution, but the technique for finding
the solution must be altered. Let us understand this through an
example: ¨ ˛¨ ˛ ¨ ˛
1 ´1 2 x1 ´6
˝2 ´2 3‚˝x2 ‚ “ ˝´14‚.
1 1 1 x3 0
We perform pE2 ´ 2E1 q Ñ pE2 q and pE3 ´ E1 q Ñ pE3 q.

4/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Gaußian elimination method

The resulting system is


¨ ˛¨ ˛ ¨ ˛
1 ´1 2 x1 ´6
˝0 0 ´1‚˝x2 ‚ “ ˝´2‚.
0 2 ´1 x3 6

The diagonal entry a22 is now zero. So the procedure can not
continue in its present form.

We observe that a32 “ 2 ‰ 0, hence we perform the operation


pE2 q Ø pE3 q to obtain the system
¨ ˛¨ ˛ ¨ ˛
1 ´1 2 x1 ´6
˝0 2 ´1‚˝x2 ‚ “ ˝ 6 ‚.
0 0 ´1 x3 ´2

We then have x3 “ 2, x2 “ 4 and x1 “ ´6.


5/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Gaußian elimination method

The above example illustrates what is to be done if akk “ 0 for


some k.

We search for the first nonzero entry below akk in the k-th column.

If apk ‰ 0 for some k ă p ď n, then the operation pEk q Ø pEp q is


performed to obtain the new matrix A.

The procedure can then be continued.

If apk “ 0 for each p, then the matrix A is not invertible and hence
the system does not have a unique solution. The procedure then
stops.

Finally, if ann “ 0, the linear system does not have a unique


solution, and again the procedure stops.

6/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Operation count

After learning the method, we are interested in finding the total


number of arithmetic operations performed in the method.

In general, the amount of time required to perform a multiplication


or division on a computer is approximately the same and is
considerably greater than that required to perform an addition or
subtraction.

The actual differences in execution time, however, depend on the


particular computing system.

To demonstrate the counting operations for a given method, we


will count the operations required to solve a typical linear system
of n equations in n unknowns using GEM.

We will keep the count of the additions/subtractions separate from


the count of the multiplications/divisions because of the time
differential. 7/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Operation count

Assume that the variables x1 , . . . , xi´1 have been eliminated from


the appropriate equations.

For eliminating xi from the equations Ei`1 onwards, we define


mji “ aji {aii and perform pEj ´ mji Ei q Ñ pEj q.

Here pn ´ iq divisions are performed, then mji is multiplied to each


term of Ei , thus resulting in pn ´ iqpn ´ i ` 1q multiplications, and
then each term of the resulting equation is subtracted from the
equation Ej , requiring pn ´ iqpn ´ i ` 1q subtractions.

Mult/Div: pn ´ iq ` pn ´ iqpn ´ i ` 1q “ pn ´ iqpn ´ i ` 2q.

Add/Sub: pn ´ iqpn ´ i ` 1q.

The total number of operations required is obtained by summing


the operation counts for each i.

8/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Operation count

Mult/Div:
n´1
ÿ n´1
ÿ n´1
ÿ
pn ´ iqpn ´ i ` 2q “ pn ´ iq2 ` 2 pn ´ iq
i“1 i“1 i“1
n´1
ÿ n´1
ÿ
“ i2 ` 2 i
i“1 i“1
pn ´ 1qnp2n ´ 1q pn ´ 1qn
“ `2
6 2
2n3 ` 3n2 ´ 5n
“ .
6
This is the count of multiplication or division operations.

9/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Operation count

Add/Sub:
n´1
ÿ n´1
ÿ n´1
ÿ
pn ´ iqpn ´ i ` 1q “ pn ´ iq2 ` pn ´ iq
i“1 i“1 i“1
n´1
ÿ n´1
ÿ
“ i2 ` i
i“1 i“1
pn ´ 1qnp2n ´ 1q pn ´ 1qn
“ `
6 2
n3 ´ n
“ .
3
This is the count of addition or subtraction operations.

These are the operations required to make the matrix


upper-triangular.
10/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Operation count

We now count the number of operations required in the backward


substitutions.

Each such step requires pn ´ iq multiplications, pn ´ i ´ 1q


additions for each summation term followed by one subtraction
and finally one division.

Mult/Div:
n´1
ÿ n´1
ÿ
` ˘
1` pn ´ iq ` 1 “ 1` pn ´ iq ` pn ´ 1q
i“1 i“1
n´1
ÿ n´1
ÿ
“ pn ´ iq ` n “ i `n
i“1 i“1
npn ´ 1q n2 ` n
“ `n “ .
2 2
11/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Operation count

Finally, we count the number of additions/subtractions:


n´1
ÿ ` ˘ n´1
ÿ n´1
ÿ n2 ´ n
pn ´ i ´ 1q ` 1 “ pn ´ iq “ i“ .
i“1 i“1 i“1
2

The total number of operations in the GEM is

Mult/Div:

2n3 ` 3n2 ´ 5n n2 ` n n3 n
` “ ` n2 ´
6 2 3 3

Add/Sub:
n3 ´ n n2 ´ n n3 n2 5n
` “ ` ´ .
3 2 3 2 6
12/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
Operation count

For large n, the total number of multiplications and divisions is


approximately n3 {3, as is the total number of additions and
subtractions.

Thus the amount of computation and the time required increases


with n in proportion to n3 , as shown below:

n Mult/Div Add/Sub
3 17 11
10 430 375
50 44, 150 42, 875
100 3, 43, 400 3, 38, 250

13/71
shripad@math.iitb.ac.in MA214 (2021-2022) L38
MA 214: Introduction to numerical analysis
Lecture 39

Shripad M. Garge.
IIT Bombay

(shripad@math.iitb.ac.in)

2021-2022

14/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Pivoting

In GEM, we found that a row interchange was needed when one of


the pivot elements akk is 0.

This row interchange has the form pEk q Ø pEp q, where p is the
smallest integer greater than k with apk ‰ 0.

To reduce the round-off error, it is often necessary to perform row


interchanges even when the pivot elements are not zero.

If akk is small in magnitude compared to ajk , then the magnitude


of the multiplier mjk “ ajk {akk will be much larger than 1.

Round-off error introduced in the computation of one of the terms


akl is multiplied by mjk when computing ajl , which compounds the
original error.

15/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Pivoting

Also, when performing the backward substitution for

ak,n`1 xn`1 ´ nj“k`1 akj xj


ř
xk “
akk
with a small value of akk , any error in the numerator can be
dramatically increased because of the division by akk .

In our next example, we will see that even for small systems,
round-off error can dominate the calculations.

Consider the system:


E1 : 0.003x1 ` 59.14x2 “ 59.17
E2 : 5.291x1 ´ 6.13x2 “ 46.78
We use four-digit rounding arithmetic and compare the results to
the exact solution x1 “ 10 and x2 “ 1.
16/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Pivoting

The first pivot element, a11 “ 0.003, is small and then


5.291
m21 “ “ 1763.6
0.003
rounds to 1764.

Performing pE2 ´ m21 E1 q Ñ pE2 q and the appropriate rounding


gives the system
0.003x1 ` 59.14x2 “ 59.17
´104300x2 « ´104400
instead of the exact system.

The disparity in the magnitudes of m21 a13 and a23 has introduced
a round-off error, but the round-off error has not yet been
propagated.
17/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Pivoting

Backward substitution yields

x2 « 1.001

which is a close approximation to the actual value, x2 “ 1.

However, because of the small pivot a11 “ 0.003

59.17 ´ p59.14qp1.001q
x1 « “ ´10
0.003
59.14
contains the small error of 0.001 multiplied by « 20000.
0.003
This ruins the approximation to the actual value x1 “ 10.

This is clearly a contrived example but it does give an idea on how


the round-off error can alter the results.
18/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Partial pivoting

For larger systems it is much more difficult to predict in advance


when devastating round-off error might occur.

To avoid this problem, pivoting is performed by selecting an


element apq with a larger magnitude as the pivot, and
interchanging the k-th and p-th rows. This can be followed by the
interchange of the k-th and q-th columns, if necessary.

The simplest strategy is to select an element in the same column


that is below the diagonal and has the largest absolute value;
specifically, we determine the smallest p ě k such that

|apk | “ max |aik |


kďiďn

and perform pEk q Ø pEp q.

In this case no interchange of columns is used.


19/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Partial pivoting

Consider the system:


E1 : 0.003x1 ` 59.14x2 “ 59.17
E2 : 5.291x1 ´ 6.13x2 “ 46.78
We use partial pivoting and four-digit rounding arithmetic and
compare the results to the exact solution x1 “ 10 and x2 “ 1.

The partial pivoting procedure first requires finding


( (
max |a11 |, |a21 | “ max |0.003|, |5.291| “ |5.291| “ |a21 |.

Then we perform pE2 q Ø pE1 q to get the system

E1 : 5.291x1 ´ 6.13x2 “ 46.78


E2 : 0.003x1 ` 59.14x2 “ 59.17

20/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Partial pivoting

The multiplier for this system is


a21 0.003
m21 “ “ “ 0.0005670
a11 5.291
and the operation pE2 ´ m21 E1 q Ñ pE2 q reduces the system to

E1 : 5.291x1 ´ 6.13x2 “ 46.78


E2 : 59.14x2 « 59.14
The four-digit answers resulting from the backward substitution are
the correct values x1 “ 10 and x2 “ 1.

Each multiplier mji in the partial pivoting algorithm has magnitude


less than or equal to 1.

Although this strategy is sufficient for many linear systems,


situations do arise when it is inadequate.
21/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Scaled partial pivoting

The linear system:


E1 : 30x1 ` 591400x2 “ 591700
E2 : 5.291x1 ´ 6.13x2 “ 46.78
is the same as that in the above examples, except that all the
entries in the first equation, E1 , have been multiplied by 104 .

The partial pivoting procedure with four-digit rounding arithmetic


leads to the same results as obtained in the first example. The
maximal value in the first column is 30, and the multiplier
5.291
m21 “ “ 0.1764 leads to the system
30
E1 : 30x1 ` 591400x2 “ 591700
E2 : ´104300x2 “ ´104400
which has the same inaccurate solutions x2 « 1.001 and x1 « ´10.
22/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Scaled partial pivoting

Scaled partial pivoting (or scaled-column pivoting) is needed for


the system illustrated above.

It places the element in the pivot position that is largest relative to


the entries in its row.

The first step in this procedure is to define a scale factor si for


each row as
si “ max |aij |.
1ďjďn

Assuming that si ą 0, the appropriate row interchange to place


zeros in the first column is determined by choosing the least
integer p with
|ap1 | |ak1 |
“ max
sp 1ďkďn s1

and performing pE1 q Ø pEp q.


23/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Scaled partial pivoting

The effect of scaling is to ensure that the largest element in each


row has a relative magnitude of 1 before the comparison for row
interchange is performed.

In a similar manner, before eliminating the variable xi using the


operations Ek ´ mki Ei , for k “ i ` 1, . . . , n, we select the smallest
integer p ě i with
|api | |aki |
“ max
sp iďkďn sk

and perform the row interchange pEi q Ø pEp q if i ‰ p.

The scale factors s1 , . . . , sn are computed only once, at the start of


the procedure.

They are row dependent, so they must also be interchanged when


row interchanges are performed.

24/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Scaled partial pivoting

We now apply scaled partial pivoting to the previous example. It


gives

s1 “ maxt|30.00|, |591400|u “ 591400, s2 “ 6.130.

Consequently,

|a11 | 30 |a21 | 5.291


“ “ 0.5073 ˆ 10´4 , “ “ 0.8631
s1 591400 s2 6.130
and the interchange pE1 q Ø pE2 q is made.

Applying GEM to the new system


E1 : 5.291x1 ´ 6.13x2 “ 46.78
E2 : 30x1 ` 591400x2 “ 591700
produces the correct results: x1 “ 10 and x2 “ 1.
25/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Operation count

The first additional computations required for scaled partial


pivoting result from the determination of the scale factors; there
are pn ´ 1q comparisons for each of the n rows, for a total of
npn ´ 1q comparisons.

To determine the correct first interchange, n divisions are


performed, followed by n ´ 1 comparisons. So the first interchange
determination adds n divisions and n ´ 1 comparisons.

The scaling factors are computed only once, so the second step
requires pn ´ 1q divisions and pn ´ 2q comparisons.

We proceed in a similar manner until there are zeros below the


main diagonal in all but the n-th row.

The final step requires that we perform 2 divisions and 1


comparison.
26/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Operation count

As a consequence, scaled partial pivoting adds a total of


n´1
ÿ npn ´ 1q 3
npn ´ 1q ` k “ npn ´ 1q ` “ npn ´ 1q
k“1
2 2

comparisons and
n
ÿ npn ` 1q 1
k“ ´ 1 “ pn ´ 1qpn ` 2q
k“2
2 2

divisions to the GEM.

The time required to perform a comparison is about the same as


an addition/subtraction.

27/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
Operation count

The total time to perform the basic GEM is Opn3 {3q


multiplications/divisions and Opn3 {3q additions/subtractions.

Hence, scaled partial pivoting does not add significantly to the


computational time required to solve a system for large values of n.

To emphasize the importance of choosing the scale factors only


once, imagine if the procedure were modified so that new scale
factors were determined each time a row interchange was made.

In this case, the term npn ´ 1q would be replaced by


n
ÿ 1
kpk ´ 1q “ npn2 ´ 1q.
k“2
3

As a consequence, this pivoting technique would add Opn3 {3q


comparisons, in addition to the rnpn ` 1q{2s ´ 1 divisions.
28/71
shripad@math.iitb.ac.in MA214 (2021-2022) L39
MA 214: Introduction to numerical analysis
Lecture 40

Shripad M. Garge.
IIT Bombay

(shripad@math.iitb.ac.in)

2021-2022

29/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
LU decomposition

The Gaußian elimination method is the principal tool in solving


linear systems of equations, so it should be no surprise that it
appears in other guises.

We now see that the steps used to solve a system of the form
Ax “ b can be used to factor the matrix A.

The factorisation is particularly useful when it has the form


A “ LU, where L is lower triangular and U is upper triangular.

Although not all matrices have this type of representation, many


do that occur frequently in the application of numerical techniques.

To see which matrices have an LU factorisation and to find how it


is determined, first suppose that Gaußian elimination method can
be performed on the system Ax “ b without row interchanges.

This is equivalent to having nonzero pivot elements aii , for each i.


30/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
LU decomposition

The first step in the Gaussian elimination method consists of


performing, for each j “ 2, 3, ..., n, the operations
aj1
pEj ´ mj1 E1 q Ñ pEj q, where mj1 “ .
a11
These operations transform the system into one in which all the
entries in the first column below the diagonal are zero.

The above system of operations can be viewed in another way.

It is simultaneously accomplished by multiplying the original matrix


A on the left by the matrix
¨ ˛
1 0 ¨¨¨ 0
˚´m21 1 0‹
M p1q “ ˚ . ‹.
˚ ‹
˝ .. ..
. 0‚
´mn1 0 ¨ ¨ ¨ 1
31/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
LU decomposition

It is a matrix which differs from the identity matrix only in the first
column, below the diagonal element, where the entry 0 is replaced
by the negatives of multipliers, ´mj1 .

We then have

M p1q Ax “ M p1q b which gives Ap2q x “ b p2q .

In a similar manner, we construct M p2q , which is obtained by


replacing the entries below the diagonal in the second column of
the identity matrix by the negatives of the multipliers, ´mj2 .

This changes the system by

M p2q Ap2q x “ M p2q b p2q giving Ap3q x “ b p3q .

32/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
LU decomposition

In general, with
Apkq x “ b pkq
already formed, we multiply by the matrix M pkq .

This is the matrix which differs from the identity matrix only in the
k-th column, below the diagonal entry, where 0 is replaced by
´mjk to obtain the next system

Apk`1q x “ M pkq Apkq x “ M pkq b pkq “ b pk`1q .

The process ends with the formation of Apnq x “ b pnq , where Apnq is
the upper triangular matrix, the result of the Gaußian elimination
method.

We have
Apnq “ M pn´1q M pn´2q ¨ ¨ ¨ M p1q A.
33/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
LU decomposition

The matrix Apnq is an upper triangular matrix.

This is the U matrix in our desired decomposition A “ LU.

The matrix L then has to be L “ AU ´1 but we will obtain it using


the above matrices M pkq .

Note that each of the M pkq is lower triangular and so is the inverse
of each M pkq .

The matrix M pkq effects the row operation pEj ´ mjk Ek q Ñ pEj q
and its inverse Lpkq should reverse this operation, hence should
perform the operation pEj ` mjk Ek q Ñ pEj q.

Thus, the matrix Lpkq “ pM pkq q´1 differs from the identity matrix
only in the k-th column, below the diagonal entry, where 0 is
replaced by the multiplier, mjk , for j “ k ` 1, . . . , n.

34/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
LU decomposition

Therefore L “ Lp1q Lp2q ¨ ¨ ¨ Lpn´1q is a lower triangular matrix and


one has
` ˘` ˘
A “ LU “ Lp1q Lp2q ¨ ¨ ¨ Lpn´1q M pn´1q M pn´2q ¨ ¨ ¨ M p1q A .

Thus, if the GEM can be performed on a system Ax “ b without


any row changes, then the matrix A has a factorisation A “ LU as
a product of a lower triangular matrix and an upper triangular
matrix.

Consider the system


¨ ˛¨ ˛ ¨ ˛
1 1 0 3 x1 1
˚2 1 ´1 1 ‹ ˚x2 ‹ ˚ 1 ‹
˚ ‹˚ ‹ “ ˚ ‹.
˝3 ´1 ´1 2 ‚˝x3 ‚ ˝´3‚
´1 2 3 ´1 x4 4

35/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
An example
We obtain the LU factorisation. The first operations are:
pE2 ´ 2E1 q Ñ pE2 q, pE3 ´ 3E1 q Ñ pE3 q, pE4 ´ p´1qE1 q Ñ pE4 q.
This gives
¨ ˛¨ ˛
1 0 0 0 1 1 0 3
˚2 1 0 0‹ ˚0 ´1 ´1 ´5‹ ` ˘` ˘
A“˚ ‹˚ ‹ “ Lp1q M p1q A .
˝3 0 1 0‚˝0 ´4 ´1 ´7‚
´1 0 0 1 0 3 3 2

The next operations are: pE3 ´ 4E2 q Ñ pE3 q,


pE4 ´ p´3qE2 q Ñ pE4 q which gives
¨ ˛¨ ˛¨ ˛
1 0 0 0 1 0 0 0 1 1 0 3
˚2 1 0 0‹ ˚0 1 0 0‹ ˚0 ´1 ´1 ´5 ‹
A“˚
˝3
‹˚ ‹˚ ‹
0 1 0‚˝0 4 1 0‚˝0 0 3 13 ‚
´1 0 0 1 0 ´3 0 1 0 0 0 ´13
` ˘` ˘
A “ Lp1q Lp2q M p2q M p1q A .
36/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
The example, continued

The resulting matrix M p2q M p1q A is upper triangular now. Hence


U “ M p2q M p1q A and L “ Lp1q Lp2q . We thus get
¨ ˛¨ ˛
1 0 0 0 1 1 0 3
˚2 1 0 0‹ ‹ ˚0 ´1 ´1 ´5 ‹ .
˚ ‹
A“˚ ˝3 4 1 0 ‚ ˝ 0 0 3 13 ‚
´1 ´3 0 1 0 0 0 ´13

To solve the system Ax “ LUx “ b, we introduce the variable y ,


defined by y “ Ux. We solve for y first from the system Ly “ b:
¨ ˛¨ ˛ ¨ ˛
1 0 0 0 y1 8
˚2 1 0 0‹ ˚y2 ‹ ˚ 7 ‹
‹ ˚ ‹ ˚
˚ “ ‹.
˝3 4 1 0‚˝y3 ‚ ˝ 14 ‚
´1 ´3 0 1 y4 ´7

This gives y1 “ 8, y2 “ ´9, y3 “ 26 and y4 “ ´26.


37/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
The example, continued

We then solve for Ux “ y


¨ ˛¨ ˛ ¨ ˛
1 1 0 3 x1 8
˚0 ´1 ´1 ´5 ‹‹ ˚x2 ‹ “ ˚ ´9 ‹ .
˚ ‹ ˚ ‹
˚
˝0 0 3 13 ‚˝x3 ‚ ˝ 26 ‚
0 0 0 ´13 x4 ´26

Using backward substitution, we obtain x4 “ 2, x3 “ 0, x2 “ ´1


and x1 “ 3.

Observe that we had the diagonal entry in the matrix L, lii “ 1 for
all i in this method.

The methods differ by putting conditions on the diagonal elements


of L. The one we studied just now had lii “ 1. Some other method
demands that uii “ 1 and yet another demands that lii “ uii .

38/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
Permutation matrices

Now, we worry ourselves with the question about row interchange.

If the row interchange is not needed then A can be factored as


A “ LU.

Performing a row interchange pEi q Ø pEj q of the system Ax “ b is


achieved by multiplying to the system on the left by a certain
matrix P.

This matrix P differs from the identity matrix only in 4 places,


pii “ pjj “ 0 and pji “ pji “ 1.

Such a matrix is called a permutation matrix.

In general, a permutation matrix is a matrix whose every row and


every column has only one non-zero entry which is equal to 1.

39/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
Permutation matrices

We write the permutation matrices of size 2.


ˆ ˙ ˆ ˙
1 0 0 1
, .
0 1 1 0

There will be 6 permutation matrices of size 3, and in general n! of


size n.

Note that the product of two permutation matrices is a


permutation matrix, the determinant of a permutation matrix is
˘1 and the inverse of a permutation matrix is again a permutation
matrix.

If we knew the row interchanges that would be needed to solve a


particular system by GEM, then we could arrange the original
equations in an order that would ensure that no row operations are
required.
40/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
PLU decomposition

This means that for an invertible matrix A, a permutation matrix


P exists such that the system PAx “ Pb does not require row
interchanges.

Then the Gaußian elimination method can be applied to it.

As a consequence, PA can be factored as LU where L is a lower


triangular matrix and U is an upper triangular matrix.

Then we get
A “ P ´1 LU “ P 1 LU
which is a decomposition of A as a product of a permutation
matrix P 1 , a lower triangular matrix L and an upper triangular
matrix M, in that order.

This is called a PLU decomposition.

41/71
shripad@math.iitb.ac.in MA214 (2021-2022) L40
MA 214: Introduction to numerical analysis
Lecture 41

Shripad M. Garge.
IIT Bombay

(shripad@math.iitb.ac.in)

2021-2022

42/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
PLU decomposition

Given a system of linear equations Ax “ b, if we knew the row


interchanges that would be needed to solve it by GEM, then we
could arrange the original equations in an order that would ensure
that no row operations are required.

This means that for an invertible matrix A, a permutation matrix


P exists such that the system PAx “ Pb does not require row
interchanges. Then the matrix PA can be factored as LU where L
is a lower triangular matrix and U is an upper triangular matrix.

Then we get A “ P ´1 LU “ P 1 LU which is a decomposition of A


as a product of a permutation matrix P 1 , a lower triangular matrix
L and an upper triangular matrix M, in that order.

This is called a PLU decomposition.

43/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
PLU decomposition

Consider the matrix


¨ ˛
0 0 ´1 1
˚1 1 ´1 2‹
A“˚
˝´1 ´1 2 0‚.

1 2 0 2

This matrix can not have factorisation as A “ LU because a11 “ 0.

We therefore perform pE1 q Ø pE2 q and then pE3 ` E1 q Ñ pE3 q,


pE4 ´ E1 q Ñ pE4 q to get the matrix
¨ ˛
1 1 ´1 2
˚0 0 ´1 1‹
˝0 0 1 2‚.
˚ ‹

0 1 1 0

This necessitates a row interchange pE2 q Ø pE4 q. 44/71


shripad@math.iitb.ac.in MA214 (2021-2022) L41
PLU decomposition

After that we perform pE4 ` E3 q Ñ pE4 q to get the matrix


¨ ˛
1 1 ´1 2
˚0 1 1 0 ‹
˚ ‹
˝0 0 1 2 ‚
0 0 0 3

which is upper triangular.

The permutation matrix associated to the row interchanges


pE1 q Ø pE2 q followed by pE2 q Ø pE4 q is
¨ ˛
0 1 0 0
˚0 0 0 1‹
P“˚ ˝0 0 1 0‚.

1 0 0 0

45/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
PLU decomposition

We have ¨ ˛
1 1 ´1 2
˚1 2 0 2‹
PA “ ˚
˝´1 ´1 2
‹.
0‚
0 0 ´1 1
This is the matrix on which we can perform the Gaußian
elimination, the operations being the same as the ones above:

pE2 ´ E1 q Ñ pE2 q, pE3 ` E1 q Ñ pE3 q and pE4 ` E3 q Ñ pE4 q.

The nonzero multipliers for PA are consequently, m21 “ 1,


m31 “ ´1 and m43 “ ´1.

This gives the LU factorisation for the matrix PA.

46/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
PLU decomposition

¨ ˛¨ ˛
1 0 0 0 1 1 ´1 2
˚1 1 0 0‹ ˚0
‹ ˚ 1 1 0‹
PA “ ˚ ‹ “ LU.
˝´1 0 1 0‚˝0 0 1 2‚
0 0 ´1 1 0 0 0 3
The final decomposition for A as a product of a permutation
matrix, a lower triangular matrix and an upper triangular matrix
can be obtained as A “ P ´1 LU:
¨ ˛¨ ˛¨ ˛
0 0 0 1 1 0 0 0 1 1 ´1 2
˚1 0 0 1‹ ˚ 1 1 0 0‹ ˚0 1 1 0‹
˝0 0 1 0‚˝´1 0 1 0‚˝0 0 1 2‚.
˚ ‹˚ ‹˚ ‹

0 1 0 0 0 0 ´1 1 0 0 0 3

While it would be of interest to some to write A “ PLU, it is


desirable to know the matrices which admit GEM without row
interchanges. 47/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Diagonally dominant matrices

We now turn attention to two classes of matrices for which


Gaussian elimination can be performed effectively without row
interchanges.

An n ˆ n matrix A is said to be diagonally dominant when


n
ÿ
|aii | ě |aij |
j “1
j ‰i

holds for each i “ 1, 2, . . . , n.

A diagonally dominant matrix is said to be strictly diagonally


dominant when the above inequality is strict for each n.

48/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Diagonally dominant matrices

A strictly diagonally dominant matrix A is invertible.

Moreover, in this case, the Gaußian elimination method can be


performed on any linear system of the form Ax “ b to obtain its
unique solution without row interchanges.

Furthermore, the computations will be stable with respect to the


growth of round-off errors.

The above statement is a serious theorem, which we will not prove


in this course, but we need to understand that this is the reason
the strictly dominant matrices are useful.

49/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Positive definite matrices

A matrix A is called positive definite if A is symmetric and if


x t Ax ą 0 for every 0 ‰ x.

If A “ raij s and the column vector x is given by x “ rx1 , . . . , xn st


then ˜ ¸
n ÿ
ÿ n
x t Ax “ aij xi xj .
i“1 j“1

The matrix ¨ ˛
2 ´1 0
A “ ˝´1 2 ´1‚
0 ´1 2
is a positive definite matrix.

To begin with, A is a symmetric matrix.

50/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Positive definite matrices

Further,

x t Ax “ 2x12 ´ 2x1 x2 ` 2x22 ´ 2x2 x3 ` 2x32


“ x12 ` px1 ´ x2 q2 ` px2 ´ x3 q2 ` x32 .

If px1 , x2 , x3 q ‰ p0, 0, 0q then it follows that

x t Ax ą 0

and hence A is a positive definite matrix.

It is clear from this example that using the definition to determine


if a matrix is positive definite is difficult.

Fortunately, there are more easily verified criteria, which will be


presented later in our course, for identifying members of this
important class.
51/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Positive definite matrices

The next result provides some necessary conditions that can be


used to eliminate certain matrices from consideration.

If A “ raii s is a positive definite matrix then

A is invertible,
aii ą 0 for each i,
paij q2 ă aii ajj for each i ‰ j,
max |akj | ď max |aii |.
1ďk,jďn 1ďiďn

We should note, however, that a matrix A may satisfy all these


conditions and may still not be positive definite!

We now give a necessary and sufficient criterion for a matrix to be


positive definite.
52/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Positive definite matrices

A leading principal submatrix of a matrix A is a matrix of the form


¨ ˛
a11 a12 ¨ ¨ ¨ a1k
˚a21 a22 ¨ ¨ ¨ a2k ‹
˚ ‹
˚ .. .. .. .. ‹
˝ . . . . ‚
ak1 ak2 ¨ ¨ ¨ akk

for some 1 ď k ď n.

A matrix A is positive definite if and only if each leading principal


submatrix of A has positive determinant.

Consider the matrix discussed in the above example:


¨ ˛
2 ´1 0
A “ ˝´1 2 ´1‚.
0 ´1 2
53/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Positive definite matrices

We verify that it is positive definite using the criteria.


ˆ ˙
2 ´1
A1 “ p2q, A2 “ .
´1 2

We then have detpA1 q “ 2 and detpA2 q “ 3.

Since detpAq “ 4, we have that A is positive definite.

We now record the results that make positive definite matrices


interesting.

A symmetric matrix A is positive definite if and only if Gaußian


elimination method can be applied to any system Ax “ b without
interchanging rows with all pivot elements positive.

Moreover, in this case, the computations are stable with respect to


the growth of round-off errors.
54/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Positive definite matrices

The positive definite matrices admit interesting factorisations.

A matrix A is positive definite if and only if A “ LDLt where L is


lower triangular with 1’s on the diagonal and D is a diagonal
matrix with positive diagonal entries.

A matrix A is positive definite if and only if A “ LLt where L is a


lower triangular matrix.

A factorisation of a matrix A, A “ LU with lii “ uii is called a


Cholesky factorisation.

The factorisation A “ Lt L is one such.

Note that being positive definite is not necessary to admit a


Cholesky factorisation. The zero matrix admits one, for instance.

55/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Cholesky factorisation

Determine the Cholesky factorisation of the positive definite matrix


¨ ˛
4 ´1 1
A “ ˝´1 4.25 2.75‚.
1 2.75 3.5

If
¨ ˛ ¨ ˛¨ ˛
4 ´1 1 l11 0 0 l11 l21 l31
A “ ˝´1 4.25 2.75‚ “ ˝l21 l22 0 ‚˝ 0 l22 l32 ‚
1 2.75 3.5 l31 l32 l33 0 0 l33

then we get
¨ ˛ ¨ 2 ˛
4 ´1 1 l11 l11 l21 l11 l31
˝´1 4.25 2.75‚ “ ˝l11 l21 2 ` l2
l21 l21 l31 ` l22 l32 ‚
22
1 2.75 3.5 2 ` l2 ` l2
l11 l31 l21 l31 ` l22 l32 l31 32 33
56/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
Cholesky factorisation

We can solve for lij :

a11 : 2
4 “ l11 ùñ l11 “ 2,
a21 : ´1 “ l11 l21 ùñ l21 “ ´0.5,
a31 : 1 “ l11 l31 ùñ l31 “ 0.5,
a22 2 ` l2
: 4.25 “ l21 ùñ l22 “2
22
a32 : 2.75 “ l21 l31 ` l22 l32 ùñ l32 “ 1.5
a33 2 ` l2 ` l2
: 3.5 “ l31 ùñ l33 “1
32 33

and hence
¨ ˛ ¨ ˛¨ ˛
4 ´1 1 2 0 0 2 ´0.5 0.5
A “ ˝´1 4.25 2.75‚ “ ˝´0.5 2 0‚˝0 2 1.5‚ “ LLt .
1 2.75 3.5 0.5 1.5 1 0 0 1

57/71
shripad@math.iitb.ac.in MA214 (2021-2022) L41
MA 214: Introduction to numerical analysis
Lecture 42

Shripad M. Garge.
IIT Bombay

(shripad@math.iitb.ac.in)

2021-2022

58/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Iterative methods

The methods we used to solve systems Ax “ b until the last


lecture are called the direct methods.

Now we want to try the analogues of iterative techniques which


were so useful in finding solutions of equations in one variable.

An initial approximation (or approximations) was found, and new


approximations are then determined based on how well the
previous approximations satisfied the equation.

The objective is to find a way to minimise the difference between


the approximations and the exact solution.

To discuss iterative methods for solving linear systems, we first


need to determine a way to measure the distance between
n-dimensional column vectors.

59/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Distance in Rn

Our underlying set is Rn , the vector space of all column vectors of


size n ˆ 1.

We define two types of distances on pairs of vectors in Rn . Let


x, y P Rn with

x “ px1 , . . . , xn qt and y “ py1 , . . . , yn qt .

The l2 -distance:
˜ ¸1{2
n
ÿ
2
}x ´ y}2 “ pxi ´ yi q .
i“1

The l8 -distance:

}x ´ y}8 “ max |xi ´ yi |.


1ďiďn
60/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Distance in Rn

Consider the system of linear equations:

3.3330x1 ` 15920x2 ´ 10.333x3 “ 15913,


2.2220x1 ` 16.710x2 ` 9.6120x3 “ 28.544,
1.5611x1 ` 5.1791x2 ` 1.6852x3 “ 8.4254

which has the exact solution x “ p1, 1, 1qt but the Gaußian
elimination method performed using five-digit rounding arithmetic
and partial pivoting produces the approximate solution

x̃ “ p1.2001, 0.99991, 0.92538qt .

We now compute the distance of x̃ in both the distances, l2 and l8 .

61/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Distance in Rn

We have
˘1{2
p0.2001q2 ` p0.00009q2 ` p0.07462q2
`
}x ´ x̃}2 “
“ 0.21356
and
}x ´ x̃}8 “ maxt0.2001, 0.00009, 0.07462u
“ 0.2001.
Here, the second and the third components are good
approximations but the first component is a poor approximation
and that error dominates the computations of both the above
distances.

The above notions of distance can be used to define convergence


of sequences in Rn .
?
Note that }x ´ y}8 ď }x ´ y}2 ď n}x ´ y}8 .
62/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Norms on matrices

We want to generalise the notion of absolute value in R to the


space of n ˆ n matrices.

Note that we also have the analogues of absolute value on Rn ,


}x}2 “ }x ´ 0}2 and }x}8 “ }x ´ 0}8 .

They are called the l2 and l8 norms on Rn .

We now define:

}A}2 “ max }Ax}2 , }A}8 “ max }Ax}8


}x}2 “1 }x}8 “1

The l8 norm on matrices can be computed quite easily. If


A “ paij q then ÿ
}A}8 “ max |aij |.
i
j

63/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Norms on matrices

Compute the l8 norm of


¨ ˛
1 2 ´1
A “ ˝0 3 ´1‚.
5 ´1 1

We have
3
ÿ 3
ÿ
|a1j | “ 1 ` 2 ` 1 “ 4, |a2j | “ 0 ` 3 ` 1 “ 4
j“1 j“1

and
3
ÿ
|a3j | “ 5 ` 1 ` 1 “ 7.
j“1

Hence }A}8 “ 7.

64/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Eigenvalues, eigenvectors

An n ˆ n matrix A acts as a function from Rn to Rn .

The eigenvectors of A are the vectors that are sent to their


multiples by the matrix A.

More precisely, a nonzero vector v P Rn is an eigenvector for A if


there is a λ P R such that Av “ λv.

A real number λ P R is called an eigenvalue of A if there is a


non-zero v P Rn with Av “ λv.

Now, if λ is an eigenvalue of A then A ´ λI is not invertible,


because it sends a nonzero v to zero. Hence detpA ´ λI q “ 0.

Thus, the eigenvalues of A are the roots of the characteristic


polynomial of A which is the detpA ´ λI q.

65/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Eigenvalues, eigenvectors

Determine the eigenvalues and eigenvectors of


¨ ˛
2 0 0
A “ ˝1 1 2‚.
1 ´1 4

Note that the characteristic polynomial of A is

´pλ3 ´ 7λ2 ` 16λ ´ 12q “ ´pλ ´ 3qpλ ´ 2q2 .

Thus the eigenvalues of A are 2 and 3.

Now we compute the corresponding eigenvectors.

The eigenvectors for λ are the vectors v with pA ´ λI qpvq “ 0.

66/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Eigenvalues, eigenvectors

λ “ 3: pA ´ 3I qpvq “ 0 gives
¨ ˛¨ ˛ ¨ ˛
´1 0 0 v1 0
˝ 1 ´2 2‚˝v2 ‚ “ ˝0‚.
1 ´1 1 v3 0

This gives
v1 “ 0 and v2 “ v3
hence an eigenvector for eigenvalue 3 is of the form cp0, 1, 1qt for
0 ‰ c P R.

λ “ 2: A similar analysis as above tells us that an eigenvector for


eigenvalue 2 is a non-zero vector v satisfying v1 ´ v2 ` 2v3 “ 0.

In particular, it is of the form c1 p0, 2, 1qt ` c2 p´2, 0, 1qt where


pc1 , c2 q ‰ p0, 0q.
67/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Spectral radius

Note that even though our A has real entries, some roots of the
characteristic polynomial of A may be (non-real) complex numbers.

The spectral radius of A, ρpAq, is defined by

ρpAq “ max |λ|

where λ varies over all roots of the characteristic polynomial of A.

If A is an n ˆ n matrix then
“ ‰1{2
}A}2 “ ρpAt Aq
ρpAq ď }A}2 and
ρpAq ď }A}8 .

Let us now compute l2 -norm of a matrix.


68/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Spectral radius

Let ¨ ˛
1 1 0
A “ ˝ 1 2 1‚.
´1 1 2
We apply the above result, and for that we need to compute
ρpAt Aq.

Here
¨ ˛¨ ˛ ¨ ˛
1 1 ´1 1 1 0 3 2 ´1
At A “ ˝1 2 1 ‚˝ 1 2 1‚ “ ˝ 2 6 4 ‚
0 1 2 ´1 1 2 ´1 4 5
?
and the eigenvalues of At A are 0, 7 ˘ 7. Hence
b
a ?
t
}A}2 “ ρpA Aq “ 7 ` 7 « 3.10576.
69/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Convergent matrices

In studying iterative matrix techniques, it is of particular


importance to know when powers of a matrix become small, that
is, when all the entries approach zero.

Matrices of this type are called convergent.

An n ˆ n matrix A is called convergent if for each 1 ď i, j ď n

lim pAk qij “ 0.


kÑ8

For instance, if
ˆ ˙ ˆ ˙
1{2 0 k 1{2k 0
A“ then A “ .
1{4 1{2 k{2k`1 1{2k

It then follows that A is a convergent matrix.

70/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42
Convergent matrices

The following statements are equivalent:

1 A is a convergent matrix.
2 limnÑ8 }An }2 “ 0.
3 limnÑ8 }An }8 “ 0.
4 ρpAq ă 1.
5 limnÑ8 An v “ 0 for every vector v.

71/71
shripad@math.iitb.ac.in MA214 (2021-2022) L42

You might also like