CEF352 / E. Wamba
Contents
1 Introductory notes
1.1 Matrix form and augmented matrix of a system of equations
1.2 Transpose, determinant and minors of a matrix
1.2.1 Transpose of a matrix
1.2.2 Determinant and minors of a matrix
1.3 Comatrix and inverse of a matrix
1.3.1 Cofactor
1.3.2 Comatrix
1.3.3 Adjoint matrix
1.3.4 Inverse matrix
3 Direct methods
3.1 Inverse matrix method
3.2 Elimination methods
3.2.1 Gaussian elimination
3.2.2 Gauss-Jordan elimination
3.2.3 Problems with elimination methods and techniques for improving the solution procedure
3.3 Matrix decompositions
3.3.1 LU decomposition and application
3.3.2 Cholesky method
1 Introductory notes
1.1 Matrix form and augmented matrix of a system of equations
Suppose you model an engineering problem and obtain an algebraic system of n linear equations in n unknowns xi, i = 1, 2, ..., n. It has the general form

( Σ_{j=1}^{n} aij xj = bi )_{i=1,2,...,n}  ⇐⇒  a11 x1 + a12 x2 + · · · + a1n xn = b1
                                               a21 x1 + a22 x2 + · · · + a2n xn = b2
                                               ..
                                               an1 x1 + an2 x2 + · · · + ann xn = bn,

or, in matrix form,
A x = b.
• Exercise: The ages of a kid and of his dad add up to 35 years. The dad is 6 times as old
as the kid.
1. Formulate this problem as a set of linear equations.
2. Write it down into the matrix form.
3. Deduce the coefficient matrix and the augmented coefficient matrix.
Note that you might obtain different but equivalent answers.
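As a sketch of the modelling step, the exercise above can be put into matrix form and solved directly; the variable names and the 2-by-2 solver below are my own choices, not part of the handout:

```python
# Kid's age k and dad's age d:  k + d = 35  and  d = 6k, i.e.  -6k + d = 0.
# Coefficient matrix A and right-hand side b of A [k, d]^T = b;
# the augmented matrix is [A | b].
A = [[1.0, 1.0],
     [-6.0, 1.0]]
b = [35.0, 0.0]

# Solve the 2x2 system explicitly (det must be nonzero).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
k = (b[0] * A[1][1] - A[0][1] * b[1]) / det
d = (A[0][0] * b[1] - b[0] * A[1][0]) / det
print(k, d)  # kid is 5, dad is 30
```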
Example: Consider the matrix
A = [ −1 2 4
       0 5 10 ].
Then its transpose is
AT = [ −1 0
        2 5
        4 10 ].
Note: The transpose of a column vector is a row vector, and vice versa.
a) Determinant of a 2 × 2 matrix. For a 2 × 2 matrix,
detA = det[ a11 a12; a21 a22 ] = a11 a22 − a21 a12.
b) Minors of a matrix. When we delete the ith row and the jth column of a determinant, the
remaining determinant is called the minor of the (i, j)th element, and is denoted Mij.
Example. Compute all the minors of the matrix
A = [ −1 2 5
       0 4 1
      −2 3 0 ].
Solution. We have
M11 = det[4 1; 3 0] = −3,  M12 = det[0 1; −2 0] = 2,  M13 = det[0 4; −2 3] = 8,
M21 = det[2 5; 3 0] = −15,  M22 = det[−1 5; −2 0] = 10,  M23 = det[−1 2; −2 3] = 1,
M31 = det[2 5; 4 1] = −18,  M32 = det[−1 5; 0 1] = −1,  M33 = det[−1 2; 0 4] = −4.
c) Higher-order determinants (n > 2). Higher-order determinants are computed based on the
minors. We can take any arbitrary row i (or column j) and evaluate an n × n determinant |A| in terms
of minors as
detA = Σ_{j=1}^{n} (−1)^(i+j) aij Mij (expansion along row i), or detA = Σ_{i=1}^{n} (−1)^(i+j) aij Mij (expansion along column j).
• Example with a 3 × 3 matrix. Compute the following determinant:
detA = det[ −1 2 5
             0 4 1
            −2 3 0 ].
1.3.2 Comatrix
The cofactor matrix (or matrix of cofactors, or comatrix) of a square matrix A is the matrix formed
by all the cofactors Aij = (−1)^(i+j) Mij of the matrix A. For a 3 × 3 matrix, we have:
Cof(A) = [ A11 A12 A13
           A21 A22 A23
           A31 A32 A33 ].
• Example. Compute all the cofactors of the following matrix, and deduce the comatrix of A.
A = [ −1 2 5
       0 4 1
      −2 3 0 ].
Solution. We have
A11 = +det[4 1; 3 0] = −3,  A12 = −det[0 1; −2 0] = −2,  A13 = +det[0 4; −2 3] = 8,
A21 = −det[2 5; 3 0] = 15,  A22 = +det[−1 5; −2 0] = 10,  A23 = −det[−1 2; −2 3] = −1,
A31 = +det[2 5; 4 1] = −18,  A32 = −det[−1 5; 0 1] = 1,  A33 = +det[−1 2; 0 4] = −4.
Hence
Cof(A) = [ −3 −2 8
           15 10 −1
          −18  1 −4 ].
We can compute adjoint matrices using Matlab's adjoint command, provided that the matrix is generated in symbolic mode, e.g., adjoint(sym(A)) (the command does not work for input arguments of type 'double'):
ans =
[ 4, -3]
[ 1, 2]
A matrix B is called the inverse of the square matrix A if
AB = BA = In .
We denote B = A−1.
A−1 exists if detA ≠ 0; otherwise A has no inverse. A−1 can be computed with one of the following
methods:
— Cofactor method: [A−1]ij = (1/detA) Aji ⇐⇒ A−1 = (1/detA) Cof(A)T = A†/detA.
— Gauss-Jordan method: apply elementary row operations to the augmented matrix [A | In] until it becomes [In | B]. Then B = A−1.
Solution. We have
detA = det[ −1 2 5
             0 4 1
            −2 3 0 ] = −1[4(0) − 3(1)] − 0 + (−2)[2(1) − 4(5)] = 39.
Hence
A−1 = (1/detA) Cof(A)T = (1/39) [ −3 15 −18
                                  −2 10  1
                                   8 −1 −4 ].
In Matlab, inv(sym(A)) returns the same matrix, which is A−1.
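A quick sanity check of the inverse just obtained is to multiply it back against A; a small verification sketch using exact rational arithmetic:

```python
from fractions import Fraction

A = [[-1, 2, 5], [0, 4, 1], [-2, 3, 0]]
# Transposed cofactor matrix divided by det A = 39, as computed above.
Ainv = [[Fraction(v, 39) for v in row]
        for row in [[-3, 15, -18], [-2, 10, 1], [8, -1, -4]]]

# prod should be the 3x3 identity matrix.
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
print(prod == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```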
2 Solving a small set of equations: Cramer’s rule
• Exercise. Solve the system
3x1 + 2x2 = 18
−x1 + 2x2 = 2
• Solution with Matlab. Using the Matlab commands A = [3 2; -1 2]; B=[18; 2]; linsolve(A,B)
the output is
ans =
     4
     3
that is, x1 = 4 and x2 = 3.
• Solution by substitution. From the second equation,
−x1 + 2x2 = 2 =⇒ x1 = 2x2 − 2,
and substituting into the first,
3x1 + 2x2 = 18 =⇒ 3(2x2 − 2) + 2x2 = 18
=⇒ 8x2 = 24
=⇒ x2 = 3
=⇒ x1 = 2(3) − 2 = 4.
D = det[3 2; −1 2] = 8 is the determinant of the system. We define the other determinants, D1 and D2, by replacing columns
1 and 2, respectively, by the constants in the rhs of the system of equations. Then Cramer’s rule states
that the solution reads
x1 = D1/D = 32/8 = 4,  x2 = D2/D = 24/8 = 3.
Cramer’s rule is a determinant-based method for solving a matrix equation A x = b of n linear equations. It applies when detA ≠ 0. Cramer’s rule is not very suitable for numerical simulations, as
it involves the computation of determinants, which is very time consuming at higher order. It is best
suited to small numbers of equations.
Cramer’s rule states that each unknown in a system of linear algebraic equations may be expressed
as a fraction of two determinants, with the denominator being detA and with the numerator obtained
from detA by replacing the column of coefficients of the unknown in question by the constants in b.
We have
xi = Di/D,  e.g.,  x2 = det[ a11 b1 a13 · · · a1n
                             a21 b2 a23 · · · a2n
                             ..   ..  ..  . .  ..
                             an1 bn an3 · · · ann ] / |A|.
Single solution =⇒ D ≠ 0.
No solution =⇒ D = 0 and b ≠ 0.
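The rule above can be sketched as a general n-unknown routine (illustrative only, since determinants become very expensive at higher order; function names are my own):

```python
def det(A):
    # Cofactor expansion along the first row.
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([r[:j] + r[j+1:] for r in A[1:]]) for j in range(len(A)))

def cramer(A, b):
    """Solve A x = b by Cramer's rule; requires det(A) != 0."""
    D = det(A)
    if D == 0:
        raise ValueError("det A = 0: Cramer's rule does not apply")
    xs = []
    for i in range(len(A)):
        # D_i: replace column i of A by the constants b.
        Ai = [row[:i] + [b[k]] + row[i+1:] for k, row in enumerate(A)]
        xs.append(det(Ai) / D)
    return xs

print(cramer([[3, 2], [-1, 2]], [18, 2]))  # [4.0, 3.0]
```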
Homework
Using Cramer’s rule and smartly applying Kirchhoff’s laws of current and voltage, determine the current
in the 15 Ω, 10 Ω (between nodes 2 and 5), and 5 Ω resistors of the accompanying resistor circuit.
3 Direct methods
3.1 Inverse matrix method
This method is used for nonsingular square matrices. Such matrices are invertible because they have
nonzero determinant. Suppose we want to solve
Ax = b.
Since A is invertible, there exists a matrix A−1 such that
A−1 A = AA−1 = I.
Multiplying both sides of Ax = b by A−1, we can write
x = A−1 b.
It follows that, in the second phase of the method, the solution vector is found by a simple multiplication of the inverse matrix
with the vector of free terms. The determination of the inverse matrix is the essential, and most
difficult, problem to be solved in the first phase.
Solved exercise. Solve the following system of equations using the method of the inverse matrix.
[ 1 −2  3 ] [x1]   [12]
[−1  1  2 ] [x2] = [ 8]
[ 2 −1 −1 ] [x3]   [ 4]
Phase 1: compute the inverse matrix. The determinant is
detA = det[ 1 −2 3; −1 1 2; 2 −1 −1 ] = 1(−1 + 2) − (−2)(1 − 4) + 3(1 − 2) = −8.
Matlab solution. The problem can be solved using any of the following Matlab commands
A=[1 -2 3; -1 1 2; 2 -1 -1]; b=[12;8;4]; x=linsolve(A,b)
A=[1 -2 3; -1 1 2; 2 -1 -1]; b=[12;8;4]; x=A\b
A=[1 -2 3; -1 1 2; 2 -1 -1]; b=[12;8;4]; x=inv(A)*b
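The same two-phase computation (invert, then multiply) can be sketched in Python; the cofactor-based `inverse` below is an illustration for small matrices, not an efficient method:

```python
from fractions import Fraction

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([r[:j] + r[j+1:] for r in A[1:]]) for j in range(len(A)))

def inverse(A):
    """Phase 1: A^{-1} via the cofactor formula, in exact rational arithmetic."""
    n, d = len(A), Fraction(det(A))
    def cof(i, j):
        M = [r[:j] + r[j+1:] for k, r in enumerate(A) if k != i]
        return (-1) ** (i + j) * det(M)
    # Entry (i, j) of A^{-1} is cofactor A_{ji} / det A.
    return [[cof(j, i) / d for j in range(n)] for i in range(n)]

A = [[1, -2, 3], [-1, 1, 2], [2, -1, -1]]
b = [12, 8, 4]
Ainv = inverse(A)
# Phase 2: x = A^{-1} b.
x = [sum(Ainv[i][k] * b[k] for k in range(3)) for i in range(3)]
print(x)  # x1 = 7, x2 = 5, x3 = 5
```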
• Forward elimination. This phase consists in keeping the ith equation (ith row of the matrix)
unchanged, then eliminating the ith unknown, xi, from the subsequent equations (from the (i + 1)th to the
nth), progressively for i running from 1 to n − 1 (i.e., turning the corresponding elements of the matrix to zero). Let the coefficients resulting from the kth step be called aij^(k). Then, we have
- Step 1: Keep the 1st equation unchanged, then eliminate the first unknown, x1, in the following
equations (from the 2nd to the last). This is done through the elementary row operations
Rowi ← Rowi + (−fi1) Row1, for i = 2, · · · , n, yielding new coefficients aij^(1), where
fi1 = ai1^(0)/a11^(0) is the elimination factor.
- Step 2: Keep the 2nd equation unchanged, then eliminate the second unknown, x2, from the 3rd
to the last equations. This step is done through the elementary row operations
Rowi ← Rowi + (−fi2) Row2, for i = 3, · · · , n, yielding new coefficients aij^(2), where
fi2 = ai2^(1)/a22^(1) is the elimination factor.
- Step 3: Keep the 3rd equation unchanged, then eliminate the third unknown, x3, from the 4th to
the last equations. This step is done through the elementary row operations
Rowi ← Rowi + (−fi3) Row3, for i = 4, · · · , n, yielding new coefficients aij^(3), where
fi3 = ai3^(2)/a33^(2) is the elimination factor.
After the (n − 1)th step, the resulting system has become upper triangular.
• Back(ward) substitution. In this phase, the last unknown is directly calculated as
xn = bn^(n−1) / ann^(n−1).
Then the second-to-last unknown, xn−1, is calculated by substituting xn into the (n−1)th equation. Following similar steps, we recursively use substitution to compute xn−2, xn−3, · · · , x2, x1
as
xk = ( bk − Σ_{j=k+1}^{n} akj xj ) / akk.
Illustration of Gaussian elimination with a set of 3 equations.
For step 2, the pivot is a22' = 0.4/3. We have to perform the operation row3 ← row3 + (−0.38/0.4) row2:
[ 0.3 0.52   1      | −0.01   ]   [ 0.3 0.52   1      | −0.01  ]
[ 0   0.4/3  0.7/3  | 2.06/3  ] ∼ [ 0   0.4/3  0.7/3  | 2.06/3 ]   (4)
[ 0   0.38/3 0.5/3  | −1.31/3 ]   [ 0   0      −0.055 | −1.089 ]
Note: Because elimination methods may lead to errors when a division by small numbers occurs,
this basic form is sometimes termed the naive Gaussian elimination. The matrix obtained at the end of Gaussian
elimination is upper triangular, i.e., in row echelon form. Therefore, Gaussian elimination can be
used for
- putting a matrix into row echelon form (ref);
- computing determinants, simply as the product of the diagonal elements of the triangularized matrix: D = a11 a22^(1) · · · ann^(n−1) (the determinant is not changed by forward elimination).
Algorithm for Gaussian elimination. The method presented above can be implemented through
the following pseudocode:
Input:
n % matrix order
a % matrix coefficients
b % vector of constants (rhs)
% Elimination phase
for k=1:(n-1)
for i=(k+1):n % Going through the subsequent rows
f=a(i,k)/a(k,k);
a(i,k)=0;
for j=(k+1):n % Update the elements of a given subsequent row
a(i,j)=a(i,j)-f*a(k,j);
end
b(i)=b(i)-f*b(k);
end
end
% Back substitution phase
x(n)=0;
for i=1:n
k=n+1-i; S=0;
for j=(k+1):n
S=S+a(k,j)*x(j);
end
x(k)=(b(k)-S)/a(k,k);
end
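A direct Python translation of this pseudocode (list-based, no pivoting, i.e., the naive method above) might look like:

```python
def gauss_solve(a, b):
    """Naive Gaussian elimination with back substitution (no pivoting)."""
    n = len(b)
    a = [row[:] for row in a]  # work on copies
    b = b[:]
    # Elimination phase
    for k in range(n - 1):
        for i in range(k + 1, n):          # subsequent rows
            f = a[i][k] / a[k][k]
            a[i][k] = 0.0
            for j in range(k + 1, n):      # update the rest of row i
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    # Back substitution phase
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(a[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (b[k] - s) / a[k][k]
    return x

print(gauss_solve([[3.0, 2.0], [-1.0, 2.0]], [18.0, 2.0]))  # approximately [4.0, 3.0]
```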
3.2.2 Gauss-Jordan elimination
- First part (Gauss’ forward elimination): use elementary row operations to transform the augmented
coefficient matrix into the echelon form, where the matrix is upper triangular. Then pursue the elimination
to turn the matrix into a diagonal one.
- Second part: normalize the rows (turning the diagonal elements to 1) by dividing each row by its pivot.
The unknowns are then directly the updated elements of b:
Rowi ← Rowi / aii^(n−1) =⇒ xi = bi^(n).
It should be noted that the major difference with Gaussian elimination is the fact that when an
unknown is eliminated in the Gauss-Jordan method, it is eliminated from all other equations rather
than just the subsequent ones. In addition, all rows are normalized by dividing them by their pivot
elements.
Example. Solve the system
2x1 + x2 − x3 = 2
5x1 + 2x2 + 2x3 = −4
3x1 + x2 + x3 = 5
Step 1: the pivot is a11 = 2 =⇒ row2 ← row2 − (5/2) row1 and row3 ← row3 − (3/2) row1:
[ 2 1 −1 |  2 ]   [ 2  1    −1  |  2 ]
[ 5 2  2 | −4 ] ∼ [ 0 −0.5  4.5 | −9 ]   (5)
[ 3 1  1 |  5 ]   [ 0 −0.5  2.5 |  2 ]
Step 2: the pivot is a22' = −0.5 =⇒ row3 ← row3 − row2 and row1 ← row1 + 2 row2:
[ 2  1    −1  |  2 ]   [ 2  0    8   | −16 ]
[ 0 −0.5  4.5 | −9 ] ∼ [ 0 −0.5  4.5 | −9  ]   (6)
[ 0 −0.5  2.5 |  2 ]   [ 0  0   −2   |  11 ]
Step 3: the pivot is a33'' = −2 =⇒ row2 ← row2 + (4.5/2) row3 and row1 ← row1 + (8/2) row3:
[ 2  0    8   | −16 ]   [ 2  0    0 |  28   ]
[ 0 −0.5  4.5 | −9  ] ∼ [ 0 −0.5  0 | 15.75 ]   (7)
[ 0  0   −2   |  11 ]   [ 0  0   −2 |  11   ]
Step 4: normalize =⇒ row1 ← row1 /2, row2 ← row2 /(−0.5) and row3 ← row3 /(−2):
[ 2  0    0 |  28   ]   [ 1 0 0 |  14   ]
[ 0 −0.5  0 | 15.75 ] ∼ [ 0 1 0 | −31.5 ]   (8)
[ 0  0   −2 |  11   ]   [ 0 0 1 | −5.5  ]
Hence x1 = 14, x2 = −31.5, x3 = −5.5.
Algorithm for Gauss-Jordan elimination. The method can be implemented through the following
pseudocode:
Input:
n % matrix order
a % matrix coefficients
b % vector of constants (rhs)
% Elimination phase
for i=2:n % Going through the subsequent rows
f=a(i,1)/a(1,1);
a(i,1)=0;
for j=2:n % Update the elements of a given subsequent row
a(i,j)=a(i,j)-f*a(1,j);
end
b(i)=b(i)-f*b(1);
end
for k=2:n
for i=[1:(k-1), (k+1):n] % Going through all rows except the pivot row
f=a(i,k)/a(k,k);
a(i,k)=0;
for j=(k+1):n % Update the elements of a given subsequent row
a(i,j)=a(i,j)-f*a(k,j);
end
b(i)=b(i)-f*b(k);
end
end
% Results
for i=1:n
x(i)=b(i)/a(i,i);
end
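The same pseudocode in runnable Python (again a naive sketch without pivoting), checked against the worked example above:

```python
def gauss_jordan_solve(a, b):
    """Gauss-Jordan elimination: eliminate above and below each pivot."""
    n = len(b)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        for i in range(n):
            if i == k:
                continue                   # keep the pivot row unchanged
            f = a[i][k] / a[k][k]
            a[i][k] = 0.0
            for j in range(k + 1, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    # Normalization: the matrix is now diagonal
    return [b[i] / a[i][i] for i in range(n)]

x = gauss_jordan_solve([[2.0, 1.0, -1.0],
                        [5.0, 2.0, 2.0],
                        [3.0, 1.0, 1.0]], [2.0, -4.0, 5.0])
# x approximately [14.0, -31.5, -5.5]
```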
3.2.3 Problems with elimination methods and techniques for improving the solution
procedure
a) Division by zero and small numbers. The main reason why Gaussian elimination is
termed ”naive” is that, during both the elimination and the back-substitution phases, a division by zero
may occur. An example is the system
x2 + 2x3 = 4
1.1x1 + 2x2 + 4x3 = 10.4
3.1x1 − 3x2 + 4x3 = 5.1.
• Solution: the pivoting technique. Partial pivoting consists in switching the rows so that the largest
element (in absolute value) of the pivot column becomes the pivot element. If both rows and columns are switched, it is called complete
pivoting. Complete pivoting is rarely used, since switching columns changes the order of the
unknowns/basic variables and can yield a wrongly ordered final solution if not tracked. The pivoting technique was developed to avoid the division
by zero (and by small numbers).
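Partial pivoting slots directly into the elimination loop; a sketch applied to the system above, whose first pivot is zero (the solver name is my own):

```python
def gauss_pivot_solve(a, b):
    """Gaussian elimination with partial pivoting (row switching)."""
    n = len(b)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n - 1):
        # Partial pivoting: bring the largest |a[i][k]|, i >= k, up to row k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if p != k:
            a[k], a[p] = a[p], a[k]
            b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(a[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (b[k] - s) / a[k][k]
    return x

A = [[0.0, 1.0, 2.0], [1.1, 2.0, 4.0], [3.1, -3.0, 4.0]]
b = [4.0, 10.4, 5.1]
x = gauss_pivot_solve(A, b)  # the naive method would divide by a11 = 0 here
```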
Problem. Use Gaussian elimination to solve the system
0.3 i1 + 0.52 i2 = −0.01
0.5 i1 + i2 = 0.67
using 4 and then 6 significant digits, and compare with the exact results given by Cramer’s rule.
With 4 significant digits, the elimination yields 0.133 i2 = 0.687, so the solution is
i2 = 0.687/0.133 = 5.165,
0.3 i1 + 0.52(5.165) = −0.01 =⇒ i1 = −(2.686 + 0.01)/0.3 = −8.987.
With 6 significant digits, 0.5/0.3 ≈ 1.66667. Then
[ 0.3 0.52 | −0.01 ] ∼ [ 0.3 0.52    | −0.01   ]   (10)
[ 0.5 1    |  0.67 ]   [ 0   0.13333 | 0.68667 ]
The solution is
i2 = 0.68667/0.13333 = 5.15015,
0.3 i1 + 0.52(5.15015) = −0.01 =⇒ i1 = −(2.67808 + 0.01)/0.3 = −8.96027.
The exact result is
i1 = (−0.01(1) − 0.67(0.52)) / (0.3(1) − 0.5(0.52)) = −0.3584/0.04 = −8.96,
i2 = (0.3(0.67) − 0.5(−0.01)) / (0.3(1) − 0.5(0.52)) = 0.206/0.04 = 5.15.
Even though the approximate solution with 6 significant digits is close to the exact answer, there
is a small deviation, amounting to a relative error on i2 of
δ = |5.15015 − 5.15| / 5.15 × 100 = 0.0029%.
It is clear that the error would be reduced further if we used more significant figures. If we had used
fractions instead of decimals (and consequently avoided round-off), the answers would have been exact.
However, computers carry only a limited number of significant figures, and thus round-off errors are
inevitable and must be considered when evaluating the results. The problem of round-off error can
become particularly important for large numbers of equations, because every result depends
on previous results; consequently, an error in the early steps will tend to propagate, that
is, it will induce errors in subsequent steps.
• Solution: use of more significant figures. Using more significant digits, or extended-precision
arithmetic, improves the calculation results, especially in the case of ill-conditioning. Avoiding
intermediate rounding where possible also improves the results.
• Other solution: scaling. Scaling consists in dividing each row by its largest coefficient, so that the highest coefficient in each row is set to 1. This
allows us to standardize the size of the determinant and minimize round-off errors.
For scaled systems, the determinant is always small for ill-conditioned systems.
c) Ill-conditioning. Ill-conditioned systems are those where small changes in the coefficients result
in large changes in the solution. This is for instance the case when at least two of the equations are
nearly identical.
Example. Consider the system
x1 + 2x2 = 10
1.1x1 + 2x2 = 10.4.
Its solution is x1 = 4, x2 = 3. Perturb the system, for instance by taking 1.05 as coefficient instead of 1.1 in the second equation, and
solve again. The solution becomes x1 = 8, x2 = 1: a change of under 5% in a single coefficient has drastically changed the solution.
Graphically, an ill-conditioned system of equations will lead to lines with slopes so close that it
would be difficult to visually detect the intersection point.
d) Singular systems. These are the systems with zero determinant (the product of the diagonal
elements after Gaussian elimination). This happens when two or more of the equations are identical.
In such a case, we lose one degree of freedom and face the impossible problem
of solving n − 1 independent equations for n unknowns. Singularity can be detected during Gaussian elimination, as a
zero diagonal element is created during the elimination stage for all possible row permutations. In the
algorithm, the program should be made to terminate the calculation in such an event.
- The decomposition stage requires writing the matrix A as a product of lower and upper triangular
matrices, A = LU, where (for a 3 × 3 matrix)
L = [ l11 0   0
      l21 l22 0
      l31 l32 l33 ],  U = [ u11 u12 u13
                            0   u22 u23
                            0   0   u33 ].
- The substitution stage is a two-step determination of the solution x for arbitrary b’s as follows:
Ax = b ⇐⇒ { Ly = b, Ux = y }.
In the first step, we solve Ly = b by forward substitution to get the intermediate solution y.
The second step consists in inserting y into Ux = y and applying backward substitution to get
the solution x.
b) With nonzero pivots. This is the version of LU decomposition that uses Gaussian elimination.
When the pivots used during Gauss elimination are all nonzero, the coefficient matrix A can be straightforwardly
written as a product of matrices L and U, where U is the final matrix of the elimination phase,
and L the matrix of elimination factors with ones on the diagonal. For a 3-by-3 matrix, that reads
A = LU, with U = [ a11 a12  a13
                   0   a22' a23'
                   0   0    a33'' ],  L = [ 1   0   0
                                            f21 1   0
                                            f31 f32 1 ].
Proof: Eliminating the (2,1) entry of A gives
A = [ a11 a12 a13; a21 a22 a23; a31 a32 a33 ] ∼ A' = [ a11 a12 a13; 0 a22' a23'; a31 a32 a33 ] =⇒ A = L1 A', with L1 = [ 1 0 0; f21 1 0; 0 0 1 ].
Eliminating the (3,1) entry then gives
A' ∼ A'' = [ a11 a12 a13; 0 a22' a23'; 0 a32' a33' ] =⇒ A' = L2 A'' =⇒ A = L1 L2 A'', with L2 = [ 1 0 0; 0 1 0; f31 0 1 ].
Finally, eliminating the (3,2) entry gives
A'' ∼ U = [ a11 a12 a13; 0 a22' a23'; 0 0 a33'' ] =⇒ A'' = L3 U =⇒ A = L1 L2 L3 U, with L3 = [ 1 0 0; 0 1 0; 0 f32 1 ].
Hence
L = L1 L2 L3 = [ 1   0   0
                 f21 1   0
                 f31 f32 1 ].
Note that inverting each elementary factor Li simply changes the sign of its fij entry.
Example: Reconsider the system of the Gaussian-elimination illustration above, and solve it using LU decomposition.
Forward substitution:
Ly = b =⇒ [ 1   0    0
            5/3 1    0
            1/3 0.95 1 ] [ y1; y2; y3 ] = [ −0.01; 0.67; −0.44 ]   (13)
y1 = −0.01,
(5/3)(−0.01) + y2 = 0.67 =⇒ y2 = 2.06/3,
(1/3)(−0.01) + 0.95(2.06/3) + y3 = −0.44 =⇒ y3 = −1.089.
Back substitution:
Ux = y =⇒ [ 0.3 0.52  1
            0   0.4/3 0.7/3
            0   0     −0.055 ] [ x1; x2; x3 ] = [ −0.01; 2.06/3; −1.089 ]   (14)
x3 = −1.089/(−0.055) = 19.8,
(0.4/3)x2 + (0.7/3)(19.8) = 2.06/3 =⇒ x2 = −29.5,
0.3x1 + 0.52(−29.5) + 19.8 = −0.01 =⇒ x1 = −14.9.
Note: That particular LU decomposition where L has 1’s on the diagonal is formally referred to
as a Doolittle decomposition. There exists an alternative approach which involves a U matrix with 1’s
on the diagonal; it is called Crout decomposition.
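A compact Doolittle-style sketch (L with unit diagonal), obtained from naive elimination by storing the factors fij, applied to the matrix whose factors appear in the example above:

```python
def lu_doolittle(a):
    """Return (L, U) with A = L U, L unit lower triangular (no pivoting)."""
    n = len(a)
    U = [row[:] for row in a]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            f = U[i][k] / U[k][k]
            L[i][k] = f                      # store the elimination factor
            for j in range(k, n):
                U[i][j] -= f * U[k][j]
    return L, U

def lu_solve(L, U, b):
    n = len(b)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    # Back substitution: U x = y
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[0.3, 0.52, 1.0], [0.5, 1.0, 1.9], [0.1, 0.3, 0.5]]
L, U = lu_doolittle(A)
x = lu_solve(L, U, [-0.01, 0.67, -0.44])  # approximately (-14.9, -29.5, 19.8)
```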
c) With at least one zero pivot. This version of LU decomposition is sometimes called the PALU
decomposition. When a zero pivot occurs in the course of Gauss elimination, row exchanges
(permutations) are necessary to complete the elimination. Indeed, even if the pivot is not exactly
zero, a very small value can lead to big round-off errors; for very big matrices, one can easily lose all
accuracy in the solution. Row permutations then allow us to place the largest element (in absolute
value) in the pivot position.
In that case, the coefficient matrix A can be factorized in such a way that
P A = LU,
where P is a permutation matrix. We denote by Pnm the permutation matrix that interchanges the
nth and mth rows or columns of an arbitrary matrix when acted on. Pnm is obtained from the identity
matrix I by permuting the nth and mth rows or columns of I.
Example: The matrix
P12 = [ 0 1 0
        1 0 0
        0 0 1 ]
is obtained from the identity matrix I by permuting its 1st and 2nd rows (or columns). Acting on an arbitrary matrix A, it interchanges the first two rows (left multiplication) or the first two columns (right multiplication):
P12 A = [ a21 a22 a23
          a11 a12 a13
          a31 a32 a33 ],  A P12 = [ a12 a11 a13
                                    a22 a21 a23
                                    a32 a31 a33 ].
Solved exercise. Using PALU decomposition, solve the system with coefficient matrix
A = [ −2  2 −1
       6 −6  7
       3 −8  4 ].
• Step 1: Since 6 is the largest element (in absolute value) of column 1, we exchange rows 1 and 2 before performing the first elimination:
A ∼ A' = [ 6 −6 7; −2 2 −1; 3 −8 4 ] =⇒ A' = P12 A, with P12 = [ 0 1 0; 1 0 0; 0 0 1 ].
Eliminating column 1 (f21 = −1/3, f31 = 1/2) gives
A' ∼ A'' = [ 6 −6 7; 0 0 4/3; 0 −5 1/2 ] =⇒ A' = L1 A'', with L1 = [ 1 0 0; −1/3 1 0; 1/2 0 1 ].
• Step 2: Since |−5| > 0 (in the remaining part of column 2), we exchange rows 2 and 3 before pursuing
the elimination process:
A'' ∼ A''' = [ 6 −6 7; 0 −5 1/2; 0 0 4/3 ] =⇒ A''' = P23 A'', with P23 = [ 1 0 0; 0 0 1; 0 1 0 ].
No further elimination needs to be carried out, because A''' accidentally turns out to be upper triangular:
A''' = U.
Hence
A''' = P23 A'' ⇐⇒ U = P23 A'' = P23 L1^(−1) A' = P23 L1^(−1) P12 A = P23 L1^(−1) (P23 P23) P12 A = (P23 L1^(−1) P23)(P23 P12) A,
where we used P23 P23 = I. Therefore
(P23 L1^(−1) P23)^(−1) U = P23 P12 A.
Let P = P23 P12 and L = (P23 L1^(−1) P23)^(−1). Then we have
P A = LU.
L can be obtained by exchanging both the 2nd and 3rd rows and columns of L1^(−1) and inverting the
result:
L1^(−1) = [ 1 0 0; +1/3 1 0; −1/2 0 1 ] =⇒ L1^(−1) P23 = [ 1 0 0; +1/3 0 1; −1/2 1 0 ] =⇒ P23 L1^(−1) P23 = [ 1 0 0; −1/2 1 0; +1/3 0 1 ].
The net operation on L1^(−1) is an interchange of the nondiagonal elements −1/2 and 1/3. Finally,
L = [ 1    0 0
      +1/2 1 0
      −1/3 0 1 ].
Obviously, we could also have written L = P23 L1 P23 = (P23 L1^(−1) P23)^(−1), since P23 is its own inverse.
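The whole PALU procedure can be sketched as follows (partial pivoting at every step; the permutation is returned as a row-order list rather than a matrix, a representation choice of my own):

```python
def palu(a):
    """Return (perm, L, U) such that (PA)[i] = A[perm[i]] and PA = L U."""
    n = len(a)
    U = [row[:] for row in a]
    L = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n - 1):
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        if p != k:                      # exchange rows (and past factors)
            U[k], U[p] = U[p], U[k]
            L[k], L[p] = L[p], L[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            f = U[i][k] / U[k][k]
            L[i][k] = f
            for j in range(k, n):
                U[i][j] -= f * U[k][j]
    for i in range(n):
        L[i][i] = 1.0
    return perm, L, U

A = [[-2.0, 2.0, -1.0], [6.0, -6.0, 7.0], [3.0, -8.0, 4.0]]
perm, L, U = palu(A)
# Up to round-off, this matches the worked example:
# perm = [1, 2, 0], L = [[1,0,0],[1/2,1,0],[-1/3,0,1]],
# U = [[6,-6,7],[0,-5,1/2],[0,0,4/3]].
```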
Once the Cholesky decomposition has been done, the solution can be obtained as in the LU
decomposition.
In general, this method leads to an execution error if A is not positive definite, since a negative number then appears under a square root. However, the Cholesky algorithm finds wide application, because many symmetric matrices encountered in
engineering are, in fact, positive definite. Note that a matrix A is positive definite when xT A x is greater
than zero for every nonzero vector x.
From where, for the symmetric matrix A = [ 4 10 20; 10 41 74; 20 74 145 ],
l11 = √4 = 2,
l21 = 10/2 = 5,  l22 = √(41 − 5²) = 4,
l31 = 20/2 = 10,  l32 = (74 − (5)(10))/4 = 6,  l33 = √(145 − (10² + 6²)) = 3.
Hence A = L LT, with
L = [ 2  0 0
      5  4 0
      10 6 3 ],  i.e.,  A = [ 2 0 0; 5 4 0; 10 6 3 ] [ 2 5 10; 0 4 6; 0 0 3 ].
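The recurrences used above generalize to the usual element-by-element algorithm; a sketch, checked against the example:

```python
from math import sqrt

def cholesky(a):
    """Lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                # sqrt raises ValueError when A is not positive definite
                L[i][j] = sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

A = [[4.0, 10.0, 20.0], [10.0, 41.0, 74.0], [20.0, 74.0, 145.0]]
print(cholesky(A))  # [[2.0, 0.0, 0.0], [5.0, 4.0, 0.0], [10.0, 6.0, 3.0]]
```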
x1^(i) = ( b1 − a12 x2^(i−1) − a13 x3^(i−1) ) / a11,
x2^(i) = ( b2 − a21 x1^(i−1) − a23 x3^(i−1) ) / a22,
x3^(i) = ( b3 − a31 x1^(i−1) − a32 x2^(i−1) ) / a33,
for i > 1. As new values are generated, they are not immediately used but rather are retained for the
next iteration.
As for the Gauss-Seidel method (see next section), Jacobi’s method is guaranteed to converge to the correct solution
for any initial approximation when the coefficient matrix is strictly diagonally dominant.
4.3 Gauss-Seidel’s method
4.3.1 Overview of the method
The Gauss-Seidel method is particularly well suited to large numbers of equations, where elimination methods
would accumulate larger round-off errors. Round-off error is not an issue with this method, as the error is
controlled by the number of iterations. The Gauss-Seidel method is an improved version of Jacobi’s
method, and represents the most commonly used iterative method.
Start with an estimate x1^(0), x2^(0), x3^(0). As each new xi value is computed for the Gauss-Seidel
method, it is immediately used in the next equation to determine another x value. Then the iteration
scheme is
x1^(i) = ( b1 − a12 x2^(i−1) − a13 x3^(i−1) ) / a11,
x2^(i) = ( b2 − a21 x1^(i) − a23 x3^(i−1) ) / a22,
x3^(i) = ( b3 − a31 x1^(i) − a32 x2^(i) ) / a33,
for 1 ≤ i ≤ N. Convergence can be checked using the criterion that the relative error between the
present and past iterates is less than a given precision.
Convergence is guaranteed when the diagonal coefficient of each equation dominates its row, i.e., |aii| > Σ_{j≠i} |aij| for every i. This criterion is sufficient but not necessary for convergence. Systems where the condition holds
are said to be strictly diagonally dominant.
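The scheme, together with the relative-error stopping criterion, can be sketched as follows (the tolerance and iteration cap are choices of my own; the test matrix is the strictly diagonally dominant system of the solved exercise below):

```python
def gauss_seidel(a, b, tol=1e-8, max_iter=200):
    """Gauss-Seidel iteration; new values are used as soon as available."""
    n = len(b)
    x = [0.0] * n                       # initial estimate
    for _ in range(max_iter):
        x_old = x[:]
        for i in range(n):
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / a[i][i]
        # Stop when the relative change of every component is below tol
        if all(abs(x[i] - x_old[i]) <= tol * max(abs(x[i]), 1.0) for i in range(n)):
            break
    return x

A = [[1.9, 1.0, 0.5], [0.3, 1.0, 0.52], [0.1, 0.3, 0.5]]
b = [-0.01, 0.67, -0.44]
x = gauss_seidel(A, b)  # approximately (-0.448, 1.767, -1.850)
```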
Example. Check the diagonal dominance of the coefficient matrices of the two systems
3x1 − x2 = −4
2x1 + 5x2 = 2
and
4x1 + 2x2 − x3 = −1
2x1 + x2 + 6x3 = −4
3x1 − 5x2 + x3 = 3.
For the 2-by-2 system, the largest entries are located in the main diagonal, thus A is diagonally dominant.
For the 3-by-3 system, the coefficient matrix is
A = [ 4  2 −1
      2  1  6
      3 −5  1 ].
Since |a11| > |a12| + |a13|, but |a22| ≯ |a21| + |a23| and |a33| ≯ |a31| + |a32|, the condition holds for the first
row only. Thus A is not diagonally dominant.
Remember that row permutation does not change the solution of a system of equations. If we
interchange rows 2 and 3, then the coefficient matrix becomes
A' = [ 4  2 −1
       3 −5  1
       2  1  6 ].
In this case |a11'| > |a12'| + |a13'|, |a22'| > |a21'| + |a23'| and |a33'| > |a31'| + |a32'|: the condition now holds for
all rows. Thus A' is strictly diagonally dominant, and consequently iterative methods will converge.
Hence interchanging rows allows us to achieve convergence.
Relaxation modifies each unknown after it has been computed,
xi_new ← λ xi_new + (1 − λ) xi_old,
where λ is a weighting factor that is assigned a value between 0 and 2. The value of λ distinguishes
three regimes:
• λ = 1: no relaxation; the new value is kept unchanged.
• λ ∈ [0, 1[: underrelaxation, as less weight is put on the new result. It is typically employed to make
a nonconvergent system converge, or to hasten convergence by damping out oscillations.
• λ ∈ ]1, 2[: overrelaxation, as extra weight is put on the new result. This implicitly assumes
that the new value is moving in the right direction towards the exact solution, but too slowly (before
relaxation).
Solved exercise. The coefficient matrix and right-hand-side vector of a system of equations are as
follows:
A = [ 1.9 1.0 0.5
      0.3 1.0 0.52
      0.1 0.3 0.5 ],  b = [ −0.01
                             0.67
                            −0.44 ].
Using the initial guess (0, 0, 0), compute the first 10 iterates of the solution through the Gauss-Seidel
and relaxation methods with λ = 0.9 and λ = 1.2. Compare the results for the solution x2.
Solution. The conditions of convergence are satisfied, so we can apply those methods. The iterates of x2
are given in the table below.
Iteration        0   1      2      3      4      5      6      7      8      9      10
Gauss-Seidel     0   0.672  1.343  1.625  1.723  1.753  1.763  1.766  1.766  1.767  1.767
Overrelaxation   0   0.672  1.520  1.806  1.794  1.768  1.765  1.766  1.767  1.767  1.767
Underrelaxation  0   0.671  1.259  1.535  1.663  1.721  1.746  1.758  1.763  1.765  1.766
The final estimate for the solution is (−0.448, 1.767, −1.850). Note that this final estimate is reached
by the overrelaxation, Gauss-Seidel, and underrelaxation methods at the 8th, 9th,
and 13th iterations, respectively. For the comparison of the estimates of x2, see Figure 1.
Figure 1: Left: solution x1 , Right: solution x2 .
5.1 Norms
Consider an n-dimensional vector x. Its Euclidean norm is computed as
‖x‖e = √( Σ_{i=1}^{n} xi² ).
Consider an n × n matrix A; similarly to the Euclidean norm, we can compute the so-called Frobenius
norm as
‖A‖F = √( Σ_{i=1}^{n} Σ_{j=1}^{n} aij² ) = √( Tr(A*T A) ).
The row-sum norm: the sum of the absolute values of the elements is computed for each row, and the largest of these sums is
taken as the norm,
‖A‖∞ = max_{1≤i≤n} Σ_{j=1}^{n} |aij|.
The column-sum norm is defined as
‖A‖1 = max_{1≤j≤n} Σ_{i=1}^{n} |aij|.
The spectral norm ‖A‖2 is given by the largest singular value of A, i.e., the square root of the largest eigenvalue of the matrix
A*T A, where A*T is the conjugate transpose of A.
Exercise. Compute the Frobenius, spectral, row-sum and column-sum norms of the matrix
A = [ 4  2 −1
      3 −5  1
      2  1  6 ].
Solved exercise. Use the row-sum norm to estimate the matrix condition number for the 3-by-3
Hilbert matrix
A = [ 1   1/2 1/3
      1/2 1/3 1/4
      1/3 1/4 1/5 ].
Solution. First, the matrix is normalized (scaled) so that the maximum element in each row is 1:
B = [ 1 1/2 1/3
      1 2/3 1/2
      1 3/4 3/5 ].
Summing each of the rows gives 1.8333, 2.1667, and 2.35. Thus, the third row has the largest sum and
the row-sum norm is ‖B‖∞ = 2.35.
The inverse of the scaled matrix is
B−1 = [  9  −18  10
       −36   96 −60
        30  −90  60 ],
whose row-sum norm is ‖B−1‖∞ = |−36| + |96| + |−60| = 192. The estimated condition number is therefore
Cond(B) = ‖B‖∞ ‖B−1‖∞ = 2.35 × 192 = 451.2.