
UNIT 2: MATRICES, DETERMINANTS & SYSTEMS OF LINEAR EQUATIONS

2.1 Definitions and Basic Matrix Operations

A. Definition of a Matrix and Examples

In mathematics and other fields of study where numbers are widely used, it is sometimes necessary to
arrange a set of numbers in some special form to make them convenient for a certain purpose. For instance, we
could arrange them in ascending order, descending order, triangular form, rectangular form, etc. A matrix is a
rectangular arrangement of numbers, enclosed within square brackets or parentheses. More formally,

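Definition 1: Let F be a field. Let m and n be natural numbers. An array of numbers in F of the form

\[ A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} \]

is called a matrix over F. More precisely, we say "A is an m by n matrix over F".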

Remarks:

a) We can write the matrix above briefly as A = [a_ij]_{m×n}.
b) The expression "m by n" (also written as "m × n") is called the size (shape, order) of the matrix.
c) The objects that belong to a matrix are called its entries. (Contrast this term with the terms members of
a set, components of a vector, coordinates of a point, terms of a sequence).
d) For i ∈ {1, …, m}, the vector $A_i := \langle a_{i1}, a_{i2}, \dots, a_{in} \rangle$ is called the ith row of A. Similarly, for j ∈ {1, …, n}, the vector
${}_jA := \begin{bmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{bmatrix}$ is called the jth column of A. Thus the above matrix has m rows and n columns.
e) The entry aij is called the (i, j) entry of A. It is located at the intersection of Ai and jA.
f) A matrix A over a field F is said to be a real matrix if F = ℝ and a complex matrix if F= ℂ.
g) An m by 1 matrix is a vector; it is called a column matrix (column vector).
h) A 1 by n matrix is also a vector; it is called a row matrix (row vector).
i) A 1 by 1 matrix is just a scalar.

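Example 1: Let
\[ A = \begin{bmatrix} 2 & -5 \\ 3 & -1 \\ 0 & 8 \end{bmatrix}, \quad B = \begin{pmatrix} 8 & 2 & -6 \end{pmatrix}, \quad C = \begin{bmatrix} 3 \\ -5 \\ \sqrt{2} \end{bmatrix} \quad \text{and} \quad D = (8). \]
Then

- A is a 3 × 2 matrix, B is a 1 × 3 matrix, C is a 3 by 1 matrix and D is a 1 by 1 matrix.
- The (1, 2) entry of A is a_12 = −5 and the (2, 1) entry of A is a_21 = 3.
- The second row of A is A_2 = [3 −1].
- The second column of A is ${}_2A = \begin{bmatrix} -5 \\ -1 \\ 8 \end{bmatrix}$.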

Definition 2: Two matrices A and B are said to be equal, written 𝐴 = 𝐵, if

a) A and B have the same size and

b) The corresponding entries of A and B are equal, i.e., a_ij = b_ij for every i and j.

Example 2: Find x, y, z, w so that $\begin{bmatrix} x+y & 2z+w \\ x-y & z-w \end{bmatrix} = \begin{bmatrix} 3 & 5 \\ 1 & 4 \end{bmatrix}$.

Solution: Since corresponding entries must be equal,

\[ \begin{cases} x + y = 3 \\ x - y = 1 \\ 2z + w = 5 \\ z - w = 4 \end{cases} \]

Solving this system of equations yields x = 2, y = 1, z = 3 and w = −1.

Special Types of Matrices

There are different types of matrices. We list some of them in this section and defer others to later sections
because we are not yet equipped with the required language.

Definition 3: An 𝑚 by 𝑛 matrix 𝐴 is said to be:

a) a zero matrix if every entry of A is 0. (An m by n zero matrix is denoted by 0_{m×n}.)

b) a square matrix if m = n. (For an n by n square matrix A, the entries a_11, a_22, …, a_nn are called the
(main) diagonal entries of A and the entries a_1n, a_2,n−1, …, a_n1 are called the secondary diagonal entries.)

c) a diagonal matrix if A is a square matrix and all non-diagonal entries are zero, i.e., a_ij = 0 for i ≠ j.
(An n by n diagonal matrix is denoted by diag(a_11, a_22, …, a_nn).)

d) a scalar matrix if A is a diagonal matrix and all of its diagonal entries are equal.

e) an identity (unit) matrix if A is a diagonal matrix and each diagonal entry is 1. (An n by n unit matrix is
denoted by I_n.)

f) an upper triangular matrix if A is a square matrix and a_ij = 0 for i > j.

g) a lower triangular matrix if A is a square matrix and a_ij = 0 for i < j.

h) a triangular matrix if A is either an upper triangular or a lower triangular matrix.

i) a strictly upper triangular matrix if A is a square matrix and a_ij = 0 for i ≥ j.

j) a strictly lower triangular matrix if A is a square matrix and a_ij = 0 for i ≤ j.

k) a strictly triangular matrix if it is either a strictly upper triangular or a strictly lower triangular matrix.

Example 3:

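a) $0_{3\times 2} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$ and $0_{2\times 3} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ are zero matrices.
b) $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ are identity matrices. But $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$ are not identity
matrices.
c) $A = \begin{bmatrix} 8 & -6 & 0 \\ 0 & 7 & -4 \\ 0 & 0 & 5 \end{bmatrix}$ is an upper triangular matrix but it is not strictly upper triangular.
d) The matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -6 \end{bmatrix}$ is a diagonal matrix. We can abbreviate it as diag(1, 3, −6).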
e) A diagonal matrix is both upper triangular and lower triangular. But it need not be strictly upper
triangular or strictly lower triangular.
f) A zero matrix is both strictly upper triangular and strictly lower triangular.

B. Basic Matrix Operations: Addition, Subtraction, Scalar Multiplication and Multiplication

Definition 4 (addition, subtraction and multiplication by a scalar):


Let A = [a_ij]_{m×n} and B = [b_ij]_{m×n} be real/complex matrices and let λ be a real/complex number. Then
we define the sum A + B, the difference A − B and the scalar multiple λA by

a) A + B := [c_ij]_{m×n} where c_ij := a_ij + b_ij for all i and j.

b) A − B := [c_ij]_{m×n} where c_ij := a_ij − b_ij for all i and j.

c) λA := [c_ij]_{m×n} where c_ij := λ a_ij for all i and j.

For clarity, the definition of the sum is elaborated as follows. Let A and B be m by n matrices. Then
C = A + B if and only if C is m by n and the (i, j) entry of C equals the (i, j) entry of A plus the (i, j) entry of B for all indices i
and j.

Note that A + B is defined only if A and B have the same size. Two matrices A and B are said to be
conformable for addition if they have the same size.

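Definition 5: If A = [a_ij] is any m × n matrix and B = [b_ij] is any n × p matrix, then the product AB (or A × B)
is the m by p matrix defined as follows:
AB = [c_ij]_{m×p} where c_ij := A_i ∙ jB = a_i1 b_1j + a_i2 b_2j + … + a_in b_nj,
that is, c_ij is the dot product of the ith row of A with the jth column of B.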

Note:
- AB is defined only if # columns (A) = # rows (B). A is said to be conformable to B for multiplication if
AB is defined.
- Unlike the previous matrix operations, matrix multiplication is not defined entry-wise. The motivation
behind this is explained in Section 2.5.
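Example 4: Let $A = \begin{bmatrix} 2 & 3 & -5 \\ 3 & -6 & 8 \end{bmatrix}$, $B = \begin{bmatrix} 8 & -9 & 10 \\ 10 & 5 & -2 \end{bmatrix}$ and $C = \begin{bmatrix} -5 & -2 \\ 0 & 8 \end{bmatrix}$.
Then compute (if possible)
a) A + B    b) A + C    c) 4A    d) 0A
e) 5A − 2B    f) AB    g) AC    h) CA
Solution:
a) $A + B = \begin{bmatrix} 2 + 8 & 3 + (-9) & -5 + 10 \\ 3 + 10 & -6 + 5 & 8 + (-2) \end{bmatrix} = \begin{bmatrix} 10 & -6 & 5 \\ 13 & -1 & 6 \end{bmatrix}$
b) A + C is not defined, since A is 2 by 3 and C is 2 by 2.
c) $4A = \begin{bmatrix} 4(2) & 4(3) & 4(-5) \\ 4(3) & 4(-6) & 4(8) \end{bmatrix} = \begin{bmatrix} 8 & 12 & -20 \\ 12 & -24 & 32 \end{bmatrix}$
d) 0A = 0_{2×3}
e) $5A - 2B = \begin{bmatrix} 5(2) & 5(3) & 5(-5) \\ 5(3) & 5(-6) & 5(8) \end{bmatrix} - \begin{bmatrix} 2(8) & 2(-9) & 2(10) \\ 2(10) & 2(5) & 2(-2) \end{bmatrix} = \begin{bmatrix} -6 & 33 & -45 \\ -5 & -40 & 44 \end{bmatrix}$
f) AB is undefined, since A has 3 columns but B has only 2 rows.
g) AC is undefined, since A has 3 columns but C has only 2 rows.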
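h) CA is a 2 by 3 matrix. Let $CA = \begin{bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{bmatrix}$. Then

\[
\begin{aligned}
x_{11} &= C_1 \cdot {}_1A = (-5, -2) \cdot \begin{bmatrix} 2 \\ 3 \end{bmatrix} = -16, &
x_{12} &= C_1 \cdot {}_2A = (-5, -2) \cdot \begin{bmatrix} 3 \\ -6 \end{bmatrix} = -3, \\
x_{13} &= C_1 \cdot {}_3A = (-5, -2) \cdot \begin{bmatrix} -5 \\ 8 \end{bmatrix} = 9, &
x_{21} &= C_2 \cdot {}_1A = (0, 8) \cdot \begin{bmatrix} 2 \\ 3 \end{bmatrix} = 24, \\
x_{22} &= C_2 \cdot {}_2A = (0, 8) \cdot \begin{bmatrix} 3 \\ -6 \end{bmatrix} = -48, &
x_{23} &= C_2 \cdot {}_3A = (0, 8) \cdot \begin{bmatrix} -5 \\ 8 \end{bmatrix} = 64.
\end{aligned}
\]

Therefore, $CA = \begin{bmatrix} -16 & -3 & 9 \\ 24 & -48 & 64 \end{bmatrix}$.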
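
The operations of Definitions 4 and 5 can be sketched in a few lines of Python. This is only an illustrative sketch: matrices are stored as lists of rows, and the helper names `mat_add`, `scalar_mul` and `mat_mul` are our own, not part of any library.

```python
def mat_add(A, B):
    """Entry-wise sum; defined only for matrices of the same size."""
    if len(A) != len(B) or len(A[0]) != len(B[0]):
        raise ValueError("A + B needs matrices of the same size")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scalar_mul(k, A):
    """Multiply every entry of A by the scalar k."""
    return [[k * a for a in row] for row in A]

def mat_mul(A, B):
    """Product AB: entry (i, j) is (row i of A) dot (column j of B)."""
    if len(A[0]) != len(B):
        raise ValueError("AB needs #columns(A) == #rows(B)")
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

# The matrices of Example 4
A = [[2, 3, -5], [3, -6, 8]]
B = [[8, -9, 10], [10, 5, -2]]
C = [[-5, -2], [0, 8]]

print(mat_add(A, B))                                   # part (a)
print(mat_add(scalar_mul(5, A), scalar_mul(-2, B)))    # part (e): 5A - 2B
print(mat_mul(C, A))                                   # part (h)
```

Calling `mat_mul(A, C)` raises `ValueError`, which mirrors the statement that AC is undefined.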

Notation: The set of all m by n matrices over a field F is denoted by Mat_{m×n}(F). If F = ℝ, we can also use
ℝ^{m×n}.

Theorem 1(Properties of matrix addition): Let A, B and C be m by n matrices over the same field F. Then

A1 : 𝐴 + 𝐵 = 𝐵 + 𝐴

A2 : (𝐴 + 𝐵) + 𝐶 = 𝐴 + (𝐵 + 𝐶)

A3: 𝐴+0 = 𝐴

A4: 𝐴 + (−𝐴) = 0 (where – A: = (−1)A)

Theorem 2 (Properties of Scalar multiplication): Let A, B be m by n matrices over the same field F and let λ,
μ be scalars.

M1: λ(A + B) = λA + λB (Distributivity)

M2: (λ + μ)A = λA + μA (Distributivity)

M3: (λμ)A = λ(μA) (Associativity)

M4: 1A = A (property of 1)

Note: It follows from Theorems 1 and 2 that Mat_{m×n}(F) is a vector space over F. In Section 2.4 we shall consider
some subspaces of this vector space.

Theorem 3 (Properties of Matrix Multiplication): Let A, B and C be matrices over the same field F and
conformable for the indicated sums and products. Let λ be a scalar.

a) (A B) C = A (BC) (Associativity)

b) (𝜆A) B = A (𝜆𝐵) = 𝜆(𝐴𝐵) (Associativity)

c) A (B + C) = AB + AC and (A + B) C = AC + BC (Distributive property)

d) AI= A = IA (property of I)

For real numbers 𝑎, 𝑏 and 𝑐 we know that the following properties hold:

a. 𝑎𝑏=𝑏𝑎
b. 𝑎 ≠ 0 and 𝑎𝑏 = 𝑎𝑐 implies 𝑏 = 𝑐.
c. 𝑎𝑏 = 0 implies 𝑎 = 0 𝑜𝑟 𝑏 = 0.
d. 𝑎2 = 1 implies 𝑎 = 1 𝑜𝑟 𝑎 = −1.

The counterparts of these properties do not hold for matrices in general.

Remark [Exceptional Properties of Matrix Multiplication]

a) AB ≠ BA in general.
b) A ≠ 0 and AB = AC does not imply B = C.
c) AB = 0 does not imply that A = 0 or B = 0.
d) AA = I does not imply A = I or A = −I.

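We prove remark (b) by a counterexample and defer the proofs of the remaining remarks to Exercise 2.1.

Let $A := \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$, $B := \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $C := \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$. Then A ≠ 0 and $AB = AC = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$, but B ≠ C.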
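
The counterexample is easy to check by machine. The following is a minimal sketch, assuming matrices are stored as lists of rows; the helper `mat_mul` is a hypothetical name introduced only for this illustration.

```python
def mat_mul(A, B):
    """Row-by-column product of two conformable matrices (lists of rows)."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

A = [[1, 1], [0, 0]]
B = [[0, 1], [0, 0]]
C = [[0, 0], [0, 1]]

AB, AC, BA = mat_mul(A, B), mat_mul(A, C), mat_mul(B, A)

print(AB == AC)   # the two products coincide ...
print(B == C)     # ... although B and C are different matrices
print(AB, BA)     # the same pair also shows that AB and BA can differ
```

For this particular pair, BA turns out to be the zero matrix even though neither A nor B is zero.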

C. Powers of Matrices

Definition 6: Let A be a square matrix. Then we define matrix powers inductively as follows:
A^0 := I,
A^n := A^{n−1} A for n ≥ 1, so that A^n = A A ⋯ A (n factors).

Properties: For nonnegative integers r and s,
a) A^{r+s} = A^r A^s
b) (A^r)^s = A^{rs}

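Example 5: Let $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. Compute A^85.
Solution: Calculating A^n for the first few natural numbers n suggests the following formula:
\[ A^n = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix}. \]

We use the principle of mathematical induction to prove the formula.

Step 1: Since A^1 := A^0 A = IA = A, we have $A^1 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. Thus the formula works for n = 1.
Step 2: Assume that the formula works for n = k, i.e., $A^k = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$. We prove that it works for n = k + 1:
\[ A^{k+1} = A^k A = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & k+1 \\ 0 & 1 \end{bmatrix}. \]
Thus the formula works for n = k + 1, too. By the principle
of mathematical induction the formula works for every natural number n. In particular, $A^{85} = \begin{bmatrix} 1 & 85 \\ 0 & 1 \end{bmatrix}$.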
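
The inductive definition of A^n translates directly into a loop. The sketch below is illustrative only; `mat_mul` and `mat_pow` are names chosen here, and the code simply multiplies by A repeatedly rather than using any faster exponentiation scheme.

```python
def mat_mul(A, B):
    """Row-by-column product of two conformable matrices (lists of rows)."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

def mat_pow(A, n):
    """A^0 = I and A^n = A^(n-1) A, following Definition 6."""
    size = len(A)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, A)
    return result

A = [[1, 1], [0, 1]]
print(mat_pow(A, 85))   # [[1, 85], [0, 1]]
```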

We shall define negative integral exponents later on. In this course we do not consider non-integral exponents
of matrices.

D. Transpose of a Matrix

Definition 7 (transpose): Let A = [a_ij]_{m×n}. The n by m matrix whose (i, j) entry is the (j, i) entry of A for
each i and each j is called the transpose of A. It is denoted by A^t.
Note that A_i = i(A^t) for each i, i.e., the ith row of A is the ith column of A^t.
Example 6: Let $A = \begin{bmatrix} 2 & -5 \\ 3 & -1 \\ 0 & 8 \end{bmatrix}$. Then $A^t = \begin{bmatrix} 2 & 3 & 0 \\ -5 & -1 & 8 \end{bmatrix}$ and (A^t)^t = A.

Theorem 4 (Properties of Matrix Transpose): Let A and B be matrices over the same field F,
conformable for the indicated sums and products, and let k be a scalar. Then

a) (A + B)^t = A^t + B^t

b) (A^t)^t = A

c) (kA)^t = k(A^t)

d) (AB)^t = B^t A^t

Definition 8: A square matrix A is said to be


a) Symmetric if A = A^t
b) Skew symmetric if A = −A^t

Example 7: Show that every square matrix A can be written as a sum of a symmetric matrix and a skew
symmetric matrix.

Solution: Put $B := \frac{A + A^t}{2}$ and $C := \frac{A - A^t}{2}$. Then $B^t = \frac{A^t + A}{2} = B$ and $C^t = \frac{A^t - A}{2} = -C$. Thus B is
symmetric, C is skew symmetric and A = B + C.
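
The construction in Example 7 can be tried out on a concrete matrix. The sketch below is an illustration under our own assumptions: the function names are hypothetical, the sample matrix is arbitrary, and exact fractions are used so that division by 2 introduces no rounding.

```python
from fractions import Fraction

def transpose(A):
    """Rows of A become the columns of the result."""
    return [list(col) for col in zip(*A)]

def sym_skew_parts(A):
    """Return B = (A + A^t)/2 and C = (A - A^t)/2 for a square matrix A."""
    At = transpose(A)
    n = len(A)
    B = [[Fraction(A[i][j] + At[i][j], 2) for j in range(n)] for i in range(n)]
    C = [[Fraction(A[i][j] - At[i][j], 2) for j in range(n)] for i in range(n)]
    return B, C

# An arbitrary sample matrix chosen for illustration
A = [[1, 2], [4, 3]]
B, C = sym_skew_parts(A)
print(B)   # symmetric part
print(C)   # skew symmetric part
```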

Exercises 2.1:
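1. Let $A = \begin{bmatrix} 2 & -1 & 3 \\ 1 & 5 & 0 \\ 0 & 2 & 4 \end{bmatrix}$. Find a matrix B such that 3B^t − A^t = 2I_3.
2. Perform the following operation:
\[ \begin{bmatrix} 2 & 4 & 1 \\ 5 & 3 & -2 \\ -3 & -1 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ 4 \\ 1 \end{bmatrix} + \begin{bmatrix} -7 \\ 4 \\ 11 \end{bmatrix}. \]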
3. Prove the following assertions by counter examples.
a) If 𝐴 and 𝐵 are matrices, then 𝐴𝐵 ≠ 𝐵𝐴 in general.
b) 𝐴𝐵 = 0 does not necessarily imply 𝐴 = 0 𝑜𝑟 𝐵 = 0.
c) For square matrices A and B of the same size, (A + B)^2 ≠ A^2 + 2AB + B^2 in general.
4. Prove: For matrices A, B and C that are conformable for multiplication, (AB)C = A(BC).
5. Let A be a 2 by 2 matrix. Are all of the entries of A^2 necessarily nonnegative? Justify!

6. Show that the set of all 𝑛 × 𝑛 symmetric matrices is a subspace of the set of all 𝑛 × 𝑛 matrices.
7. Show that the set of all 𝑛 × 𝑛 skew-symmetric matrices is a subspace of the set of all 𝑛 × 𝑛 matrices.
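8. a) Show that every square matrix A can be written as a sum of a symmetric matrix B and a skew
symmetric matrix C.
b) Write the matrix $A = \begin{bmatrix} 0 & 2 & 1 & -3 \\ 5 & 3 & -2 & 0 \\ 0 & 2 & 1 & 4 \\ 1 & 0 & -1 & 0 \end{bmatrix}$ as a sum of a symmetric matrix and a skew symmetric matrix.

9. Find the values of a, b and c for which the matrix $A = \begin{bmatrix} -3 & 6a - c & 6a + 2b \\ a & 2 & 4 \\ a + 7b & c & 0 \end{bmatrix}$ is symmetric.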
10. Prove that 𝑡𝑟(𝐴𝐵) = 𝑡𝑟(𝐵𝐴).

2.2 Elementary Row Operations, Echelon Forms and Rank

Two systems of linear equations are said to be equivalent if they have exactly the same solution set. Given a
system of m linear equations E_1, …, E_m, an equivalent system can be obtained by applying a finite
sequence of the following operations:

- Interchanging two equations (say the ith equation and the jth equation) : Ei↔ Ej
- Multiplying an equation by a nonzero scalar (say the ith equation by 𝛼 ) : Ei → 𝛼Ei
- Adding a multiple of an equation to another equation (say adding 𝛼 times the ith equation to the jth
equation ): Ej → 𝐸𝑗 + 𝛼Ei

We can perform similar actions on the rows and columns of a matrix.

A. Elementary Row Operations(ERO) and Elementary column operations

Elementary row operations are actions we carry out on the rows of a matrix.

Definition 1: Let 𝐴 be an 𝑚 by 𝑛 matrix. We can obtain a new matrix 𝐵 from 𝐴 by performing the following
actions on the rows of 𝐴. These actions are called elementary row operations on 𝐴.

a) Interchanging two rows of A (say interchanging Ai and Aj)

Notation: Ai↔ Aj

b) Multiplying a row by a nonzero scalar (say multiplying A_i by α)

Notation: A_i → αA_i

c) Adding a multiple of a row to another row (say adding αA_i to A_j)

Notation: A_j → A_j + αA_i

Similar actions can be taken on columns, and in this case the operations are called elementary column operations
(ECOs). The ECOs corresponding to the EROs in (a), (b) and (c) are respectively denoted by iA ↔ jA, iA
→ α iA and jA → jA + α iA.

Definition 2: Let 𝐴 and 𝐵 be 𝑚 by 𝑛 matrices. 𝐴 is said to be:

a) row equivalent to B, written A R≡ B, if B can be obtained from A by a finite sequence of EROs.
(Column equivalence can be defined in an analogous fashion.)

b) equivalent to B, written A ≅ B, if B can be obtained from A by a finite sequence of elementary
operations (EROs or ECOs).

Definition 3(elementary matrices)

a) A matrix obtained by applying a single ERO to an identity matrix In is called an elementary row matrix
(ERM).

b) A matrix obtained by applying a single ECO to an identity matrix I_n is called an elementary column matrix
(ECM).

c) A matrix is said to be an elementary matrix if it is either an ERM or an ECM.

Notation: The elementary row matrices obtained by applying the EROs (a), (b) and (c) to I_n are denoted
respectively by R_n(i, j), R_n(αi) and R_n(j + αi). The corresponding elementary column matrices are C_n(i, j),
C_n(αi) and C_n(j + αi). For example, $R_3(2,3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$ and $R_2((5)(1)) = \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix}$.

Theorem 1:

a) The application of an ERO to a matrix A can be effected by pre-multiplication of A by the
corresponding elementary row matrix.

b) The application of an ECO to a matrix A can be effected by post-multiplication of A by the
corresponding elementary column matrix.

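Example 1: Let $A = \begin{bmatrix} 1 & -1 & 0 \\ 2 & 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 0 & 1 \\ 0 & 3 & 1 \end{bmatrix}$.

a) Find the matrices obtained by applying the following operations on A and B.

i) A_1 ↔ A_2    ii) 1B → (1/3) 1B    iii) A_2 → A_2 + 2A_1

b) Show that A and B are row equivalent.

Solution (a):

i. Interchange the first and second rows of A: $\begin{bmatrix} 2 & 1 & 1 \\ 1 & -1 & 0 \end{bmatrix}$
ii. Multiply the first column of B by 1/3: $\begin{bmatrix} 1 & 0 & 1 \\ 0 & 3 & 1 \end{bmatrix}$
iii. Add 2 times row 1 of A to row 2 of A: $\begin{bmatrix} 1 & -1 & 0 \\ 4 & -1 & 1 \end{bmatrix}$

Solution (b):
\[
A = \begin{bmatrix} 1 & -1 & 0 \\ 2 & 1 & 1 \end{bmatrix}
\underset{R_2 \to R_2 + (-2)R_1}{\equiv} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 3 & 1 \end{bmatrix}
\underset{R_1 \to 3R_1}{\equiv} \begin{bmatrix} 3 & -3 & 0 \\ 0 & 3 & 1 \end{bmatrix}
\underset{R_1 \to R_1 + R_2}{\equiv} \begin{bmatrix} 3 & 0 & 1 \\ 0 & 3 & 1 \end{bmatrix} = B.
\]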
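
Theorem 1(a) can be illustrated on Solution (b): each ERO is replaced by pre-multiplication with the matching elementary row matrix. The following is a sketch under our own conventions; the function names are hypothetical and row indices are 0-based, as is usual in Python.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Row-by-column product of two conformable matrices (lists of rows)."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

def swap_rows_matrix(n, i, j):
    """Elementary matrix for: row i <-> row j."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def scale_row_matrix(n, i, alpha):
    """Elementary matrix for: row i -> alpha * row i (alpha nonzero)."""
    E = identity(n)
    E[i][i] = alpha
    return E

def add_row_matrix(n, j, i, alpha):
    """Elementary matrix for: row j -> row j + alpha * row i (i != j)."""
    E = identity(n)
    E[j][i] = alpha
    return E

A = [[1, -1, 0], [2, 1, 1]]
step1 = mat_mul(add_row_matrix(2, 1, 0, -2), A)      # R2 -> R2 + (-2)R1
step2 = mat_mul(scale_row_matrix(2, 0, 3), step1)    # R1 -> 3R1
step3 = mat_mul(add_row_matrix(2, 0, 1, 1), step2)   # R1 -> R1 + R2
print(step3)                                         # equals B = [[3, 0, 1], [0, 3, 1]]
```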

B. Echelon Forms

Definition 4: The first nonzero entry we get as we go across a row of a matrix from left to right is called the
leading entry of the row.

Definition 5: A matrix is said to be in reduced row echelon form (RREF) if it satisfies the following.

a) If there are any rows consisting entirely of zeros, then they are grouped together at the bottom. Such
rows are called zero rows.

b) In any two successive nonzero rows, the leading entry of the lower row occurs to the right of
the leading entry of the higher row.

c) The leading entry of every nonzero row is 1. (In this case the leading entry is called the leading 1.)

d) In each column that contains a leading 1, all other entries in that column are zero.

Definition 6:
A matrix is said to be in row echelon form (REF) if it satisfies conditions (a) and (b) of Definition 5.

Column echelon forms could also be defined by replacing the term ‘row’ by ‘column’.

Theorem 6:
a) Every m by n matrix 𝐴 is row equivalent to some matrix 𝐵 in row echelon form.
b) Every m by n matrix A is row equivalent to a unique matrix B in reduced row echelon form. The
reduced row echelon form of A is denoted by RREF(A).

We can transform a given matrix A to one in REF or RREF by successive applications of the EROs, as
illustrated below.

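Example 2: Find RREF(A) where
\[ A = \begin{bmatrix} 1 & 2 & -1 & 4 \\ 3 & 2 & 0 & 2 \\ 0 & 1 & 3 & 2 \\ 3 & 3 & 3 & 4 \end{bmatrix}. \]

Solution:
\[
\begin{aligned}
A &\equiv \begin{bmatrix} 1 & 2 & -1 & 4 \\ 0 & -4 & 3 & -10 \\ 0 & 1 & 3 & 2 \\ 0 & -3 & 6 & -8 \end{bmatrix} && R_2 \to R_2 + (-3)R_1,\ R_4 \to R_4 + (-3)R_1 \\
&\equiv \begin{bmatrix} 1 & 2 & -1 & 4 \\ 0 & 1 & 3 & 2 \\ 0 & -4 & 3 & -10 \\ 0 & -3 & 6 & -8 \end{bmatrix} && R_2 \leftrightarrow R_3 \\
&\equiv \begin{bmatrix} 1 & 2 & -1 & 4 \\ 0 & 1 & 3 & 2 \\ 0 & 0 & 15 & -2 \\ 0 & 0 & 15 & -2 \end{bmatrix} && R_3 \to R_3 + (4)R_2,\ R_4 \to R_4 + (3)R_2 \\
&\equiv \begin{bmatrix} 1 & 2 & -1 & 4 \\ 0 & 1 & 3 & 2 \\ 0 & 0 & 15 & -2 \\ 0 & 0 & 0 & 0 \end{bmatrix} && R_4 \to R_4 + (-1)R_3 \\
&\equiv \begin{bmatrix} 1 & 2 & -1 & 4 \\ 0 & 1 & 3 & 2 \\ 0 & 0 & 1 & -\frac{2}{15} \\ 0 & 0 & 0 & 0 \end{bmatrix} && R_3 \to \tfrac{1}{15}R_3 \\
&\equiv \begin{bmatrix} 1 & 2 & 0 & \frac{58}{15} \\ 0 & 1 & 0 & \frac{12}{5} \\ 0 & 0 & 1 & -\frac{2}{15} \\ 0 & 0 & 0 & 0 \end{bmatrix} && R_1 \to R_1 + R_3,\ R_2 \to R_2 + (-3)R_3 \\
&\equiv \begin{bmatrix} 1 & 0 & 0 & -\frac{14}{15} \\ 0 & 1 & 0 & \frac{12}{5} \\ 0 & 0 & 1 & -\frac{2}{15} \\ 0 & 0 & 0 & 0 \end{bmatrix} && R_1 \to R_1 + (-2)R_2
\end{aligned}
\]
The last matrix is RREF(A).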
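
The reduction above can be automated. The sketch below is one possible implementation, not the only one: the name `rref` is our own, exact fractions avoid rounding, and the pivoting order may differ from the hand computation, which is harmless because the reduced row echelon form is unique.

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M using the three EROs."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]        # interchange rows
        lead = A[pivot_row][col]
        A[pivot_row] = [x / lead for x in A[pivot_row]]        # make the leading entry 1
        for r in range(rows):                                  # clear the rest of the column
            if r != pivot_row and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
    return A

A = [[1, 2, -1, 4], [3, 2, 0, 2], [0, 1, 3, 2], [3, 3, 3, 4]]
for row in rref(A):
    print([str(x) for x in row])
```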

C. Rank of a Matrix

Just as human beings are characterized by numbers describing their height, weight, etc., there are certain
numbers that characterize matrices. One of these is the rank of a matrix.

Definition 7: Let 𝐴 be a matrix and 𝐵, 𝐶 be its row echelon and column echelon forms respectively.

a) The number of nonzero rows of B is called the row rank of A.


b) The number of nonzero columns of C is called the column rank of A.

Theorem 7: The row rank and the column rank of a matrix are equal.

Definition 8: The row rank/column rank of a matrix is called the rank of the matrix.

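Example 3: Find the rank of $A = \begin{bmatrix} 1 & 2 & -1 & 4 \\ 3 & 2 & 0 & 2 \\ 0 & 1 & 3 & 2 \\ 3 & 3 & 3 & 4 \end{bmatrix}$.

Solution: From Example 2, $RREF(A) = \begin{bmatrix} 1 & 0 & 0 & -\frac{14}{15} \\ 0 & 1 & 0 & \frac{12}{5} \\ 0 & 0 & 1 & -\frac{2}{15} \\ 0 & 0 & 0 & 0 \end{bmatrix}$. Since the RREF has three
nonzero rows, rank(A) = 3.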

Properties of rank

a) Only the zero matrix has rank 0.
b) The rank of an n by n identity matrix is n.
c) If A is an m by n matrix, then rank(A) ≤ min{m, n}.
d) An n by n matrix A is invertible if and only if rank(A) = n.
e) The row rank of a matrix is equal to the maximum number of linearly independent (LI) rows of the matrix.
f) The column rank of a matrix is equal to the maximum number of linearly independent (LI) columns of the matrix.

Exercise 2.2

1. Find an example of matrices A and B such that RREF(AB) ≠ RREF(A) RREF(B).
2. Find RREF(A) where $A = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 2 & 4 & 1 & 4 \\ 3 & 6 & 3 & 9 \end{pmatrix}$.

2.3 Inverse of a Matrix and its Properties
A. Definition and Properties
Definition 1: A multiplicative inverse (or in short an inverse) of matrix 𝐴 is any matrix 𝐵 for which

𝐴𝐵 = 𝐵𝐴 = 𝐼.

If a matrix A has an inverse, it is called invertible (nonsingular). Otherwise it is called singular (non-invertible).

Example 1: Show that $B = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix}$ is an inverse of $A = \begin{bmatrix} 1 & 1 \\ 3 & 4 \end{bmatrix}$.

Solution:
\[ BA = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 4(1) + (-1)(3) & 4(1) + (-1)(4) \\ (-3)(1) + 1(3) & (-3)(1) + 1(4) \end{bmatrix} = I_2. \]

Similarly, it can be shown that AB = I_2. Since AB = I_2 = BA, B is an inverse of A.

2 6 
Example 2: Given A =   , find an inverse of A.
4 10

a c 
Solution: Let B =   be an inverse of A. Then
b d 

2a + 6b = 1
4a + 10b = 0 5 3 −1
𝐴𝐵 = 𝐼2 = 𝐵𝐴 ⇒{ ⇒ 𝑎 = − , 𝑏 = 1, 𝑐 = , 𝑑 =
2c + 6d = 0 2 2 2
4c + 10d = 1

5 3

2 2
Therefore B = [ 1].
1 −2

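Example 3: Show that $A = \begin{bmatrix} 1 & 2 \\ -1 & -2 \end{bmatrix}$ has no inverse.

Solution: Suppose A is invertible and let $B = \begin{bmatrix} x & y \\ w & z \end{bmatrix}$ be an inverse of A. Then
$\begin{bmatrix} 1 & 2 \\ -1 & -2 \end{bmatrix} \begin{bmatrix} x & y \\ w & z \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. Comparing the entries in the first column gives x + 2w = 1 and −x − 2w = 0, i.e., x + 2w = 0, which in turn implies 0 = 1. This is
a contradiction, and hence our supposition that A is invertible is wrong.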

Theorem 1 [A necessary Condition for invertibility]:

If A is invertible, then A is a square matrix. (i.e., every invertible matrix is square.)

Remark: The converse of Theorem 1 is not true.

Proof: Let A be an invertible matrix and let B be an inverse of A. Let A be m by n and B be p by q. Since
AB and BA must be defined, n = p and q = m. The sizes of AB and BA are respectively m by q and p by
n. Since AB = BA, m = p and q = n. It follows that m = n. Therefore, A is a square matrix. ∎

Theorem 2: The inverse of a matrix, if it exists, is unique.

Proof: Let B and C be inverses of A. Then 𝐴𝐵 = 𝐼 = 𝐵𝐴 and 𝐴𝐶 = 𝐼 = 𝐶𝐴. Now, 𝐵 = 𝐵𝐼 = 𝐵(𝐴𝐶) =


(𝐵𝐴)𝐶 = 𝐼𝐶 = 𝐶. Therefore, 𝐵 = 𝐶. Thus any two inverses of A are identical. ∎

Notation: The inverse of a matrix A is denoted by A^{-1}. Thus, A A^{-1} = A^{-1} A = I.

Theorem 3 [Elementary properties of inverses]: Let A and B be invertible matrices. Then:

a) (A^{-1})^{-1} = A
b) (A^t)^{-1} = (A^{-1})^t
c) (AB)^{-1} = B^{-1} A^{-1}
d) (kA)^{-1} = (1/k) A^{-1} (k is a nonzero constant)

B. Finding matrix Inverse by Elementary Operations

Theorem 4: Elementary matrices are nonsingular

Theorem 5 [A Necessary and Sufficient Condition for Invertibility]:


An n by n matrix A is nonsingular if and only if 𝑟𝑟𝑒𝑓(𝐴) = 𝐼𝑛 .

Theorem 6: If an n × n matrix A is reducible to I_n by a sequence of elementary row operations, then the same
sequence of elementary row operations reduces I_n to A^{-1}.

Remark: The above theorem suggests the following method for finding the inverse of a matrix.

Procedure for finding A^{-1} by the method of EROs:
Step 1: Transform A to RREF by applying EROs. If RREF(A) ≠ I_n, then A is singular (by Theorem 5).
Step 2: Apply the EROs in Step 1 to I_n, in the same order. The resulting matrix is A^{-1}.

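Example 4: Given $A = \begin{bmatrix} 1 & -2 & 3 \\ -2 & -1 & 0 \\ 4 & -2 & 5 \end{bmatrix}$. Compute A^{-1} by the method of EROs.
Solution:
Step 1: Find the reduced row echelon form of A.
\[
\begin{aligned}
A &\equiv \begin{bmatrix} 1 & -2 & 3 \\ 0 & -5 & 6 \\ 0 & 6 & -7 \end{bmatrix} && R_2 \to R_2 + 2R_1,\ R_3 \to R_3 + (-4)R_1 \\
&\equiv \begin{bmatrix} 1 & -2 & 3 \\ 0 & -5 & 6 \\ 0 & 0 & \frac{1}{5} \end{bmatrix} && R_3 \to R_3 + \left(\tfrac{6}{5}\right)R_2 \\
&\equiv \begin{bmatrix} 1 & -2 & 3 \\ 0 & 1 & -\frac{6}{5} \\ 0 & 0 & 1 \end{bmatrix} && R_2 \to -\tfrac{1}{5}R_2,\ R_3 \to 5R_3 \\
&\equiv \begin{bmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} && R_1 \to R_1 + (-3)R_3,\ R_2 \to R_2 + \left(\tfrac{6}{5}\right)R_3 \\
&\equiv \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} && R_1 \to R_1 + 2R_2
\end{aligned}
\]

Step 2: Apply the EROs in Step 1 on I_3 in the same order.
\[
\begin{aligned}
I_3 &\equiv \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} && R_2 \to R_2 + 2R_1,\ R_3 \to R_3 + (-4)R_1 \\
&\equiv \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -\frac{8}{5} & \frac{6}{5} & 1 \end{bmatrix} && R_3 \to R_3 + \left(\tfrac{6}{5}\right)R_2 \\
&\equiv \begin{bmatrix} 1 & 0 & 0 \\ -\frac{2}{5} & -\frac{1}{5} & 0 \\ -8 & 6 & 5 \end{bmatrix} && R_2 \to -\tfrac{1}{5}R_2,\ R_3 \to 5R_3 \\
&\equiv \begin{bmatrix} 25 & -18 & -15 \\ -10 & 7 & 6 \\ -8 & 6 & 5 \end{bmatrix} && R_1 \to R_1 + (-3)R_3,\ R_2 \to R_2 + \left(\tfrac{6}{5}\right)R_3 \\
&\equiv \begin{bmatrix} 5 & -4 & -3 \\ -10 & 7 & 6 \\ -8 & 6 & 5 \end{bmatrix} && R_1 \to R_1 + 2R_2
\end{aligned}
\]

Hence, $A^{-1} = \begin{bmatrix} 5 & -4 & -3 \\ -10 & 7 & 6 \\ -8 & 6 & 5 \end{bmatrix}$. //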
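
In practice the two steps of the procedure are usually carried out at once by row reducing the block matrix [A | I]. The sketch below follows that idea; it is an illustration under our own naming (`inverse_by_eros` is hypothetical) and uses exact fractions. The order of the row operations differs from the hand computation, but the resulting inverse is the same because inverses are unique.

```python
from fractions import Fraction

def inverse_by_eros(M):
    """Row reduce [M | I] to [I | M^-1]; raise ValueError if M is singular."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a usable pivot in this column, at or below the diagonal.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

A = [[1, -2, 3], [-2, -1, 0], [4, -2, 5]]
print([[int(x) for x in row] for row in inverse_by_eros(A)])
```

Applied to the matrix of Example 3, `inverse_by_eros([[1, 2], [-1, -2]])` raises `ValueError`, in agreement with the fact that this matrix has no inverse.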

Exercise 2.3
1. Show that $A = \begin{bmatrix} 1 & -1 \\ -2 & 2 \end{bmatrix}$ has no inverse.
2. Let A be an 𝑛 × 𝑛 nonsingular matrix. Prove that the transpose of A is also nonsingular.

3. Prove that the 2 × 2 matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible if and only if ad ≠ bc, and find a formula for the
inverse.
4. Find the values of x to make A singular:

5.

2.4 Determinant of a Matrix and its Properties; Adjoints of a Matrix


A. Definitions and Properties
Definition 1: Let 𝐴 be a square matrix of order less than or equal to 2. The determinant of A is a number
denoted by |𝐴| , 𝐷(𝐴)or 𝑑𝑒𝑡 (𝐴) and defined as follows:
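a) If A = [a_11], then det(A) := a_11.
b) If $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$, then det(A) := a_11 a_22 − a_12 a_21.

Example 1: If $A = \begin{bmatrix} 4 & -3 \\ -5 & 9 \end{bmatrix}$, then det(A) = (4)(9) − (−3)(−5) = 21.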
Definition 2: Let A = [a_ij]_{n×n} and let A_ij be the (n − 1) × (n − 1) sub-matrix of A obtained by deleting
the ith row A_i and the jth column jA. Then the number
a) M_ij := det(A_ij) is called the minor of a_ij;
b) Δ_ij := (−1)^{i+j} det(A_ij) is called the cofactor of a_ij.

Theorem 1: $\sum_{j=1}^{n} a_{ij}\Delta_{ij} = \sum_{j=1}^{n} a_{kj}\Delta_{kj}$ for every i and k.

Theorem 2: $\sum_{i=1}^{n} a_{ij}\Delta_{ij} = \sum_{j=1}^{n} a_{ij}\Delta_{ij}$ for every i and j, where j is fixed in the left-hand sum and i is fixed in the right-hand sum.

Definition 3: Let A be an n × n matrix (n ≥ 3). The determinant of A, denoted by |A|, D(A) or det(A), is
the number defined by
\[ \det(A) = \sum_{j=1}^{n} a_{ij}\Delta_{ij}, \]
where i ∈ {1, …, n} is fixed.
This sum is called the Laplace expansion of the determinant of A by the ith row.

Remarks:
1. In view of Theorem 2, we can also define det(A) by
\[ \det(A) = \sum_{i=1}^{n} a_{ij}\Delta_{ij}, \quad \text{where } j \in \{1, \dots, n\} \text{ is fixed.} \]
This sum is called the Laplace expansion of the determinant of A by the jth column.
2. From Definition 3 and Remark 1, it follows that the determinant of an n by n matrix can be computed by
using 2n different formulas.
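Example 2: Let $A = \begin{bmatrix} 2 & 1 & 0 \\ -1 & 1 & 4 \\ 3 & 2 & 5 \end{bmatrix}$. Compute det(A).
Solution: Let us use the Laplace expansion of det(A) by the first row. Then
\[
\begin{aligned}
\det(A) &= \sum_{j=1}^{3} a_{1j}\Delta_{1j} = a_{11}\Delta_{11} + a_{12}\Delta_{12} + a_{13}\Delta_{13} \\
&= a_{11}M_{11} - a_{12}M_{12} + a_{13}M_{13} \\
&= (2)\begin{vmatrix} 1 & 4 \\ 2 & 5 \end{vmatrix} - (1)\begin{vmatrix} -1 & 4 \\ 3 & 5 \end{vmatrix} + (0)\begin{vmatrix} -1 & 1 \\ 3 & 2 \end{vmatrix} \\
&= 2(5 - 8) - (-5 - 12) + 0 = 11. \ //
\end{aligned}
\]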
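
The Laplace expansion lends itself to a short recursive routine. The sketch below is for illustration only (the names `submatrix` and `det_laplace` are ours, and rows are indexed from 0); the method takes on the order of n! steps, so it is practical only for small matrices.

```python
def submatrix(A, i, j):
    """Delete row i and column j of A (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det_laplace(A, i=0):
    """Determinant by Laplace expansion along row i."""
    n = len(A)
    if n == 1:
        return A[0][0]
    if n == 2:
        return A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # Sum of a_ij times its cofactor (-1)^(i+j) * det(A_ij) over the chosen row.
    return sum((-1) ** (i + j) * A[i][j] * det_laplace(submatrix(A, i, j))
               for j in range(n))

A = [[2, 1, 0], [-1, 1, 4], [3, 2, 5]]
print(det_laplace(A))                            # 11
print([det_laplace(A, i) for i in range(3)])     # every row gives the same value
```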
Notations:
a) Let 𝐴 be an 𝑛 by 𝑛 matrix. Then the determinant of 𝐴 can also be denoted in terms of rows and
columns of A by det(A1,…., An) or det(1A,…., nA). These notations are used when we investigate the
effect of manipulation of rows or/and columns of a matrix on the determinant of the matrix.
b) If A is an n by n matrix and u is an n dimensional column vector, then (1A,…, k-1A, u , k+1A,…, nA)
means a matrix whose columns are the same as those of A except that kA is replaced by u. Similarly
(1A,…, k-1A, 𝜆 kA , k+1A, ,…, nA) is the matrix whose columns are those of A except that the kth column
is replaced by 𝜆 times itself. We can also have other column/row manipulation mechanisms.
Theorem 3 (Properties of Determinants): Let A and B be n by n matrices, u and v be n dimensional
vectors and λ be a scalar. Then
a) det(A_1, …, A_{k−1}, u + v, A_{k+1}, …, A_n) = det(A_1, …, A_{k−1}, u, A_{k+1}, …, A_n) + det(A_1, …, A_{k−1}, v, A_{k+1}, …, A_n).
b) det(A_1, …, A_{i−1}, A_k, A_{i+1}, …, A_{k−1}, A_i, A_{k+1}, …, A_n) = −det(A) for i < k. (If two rows of a matrix A are
interchanged to produce a matrix B, then det(B) = −det(A).)
c) det(A_1, …, A_{k−1}, λA_k, A_{k+1}, …, A_n) = λ det(A). (If one row of A is multiplied by λ to produce a
matrix B, then det(B) = λ det(A).)
d) det(A_1, …, A_{k−1}, λA_i + A_k, A_{k+1}, …, A_n) = det(A) for i ≠ k. (If a multiple of one row of A is added to another
row of A to produce a matrix B, then det(B) = det(A).)
e) If A is triangular, then the determinant of A is the product of the diagonal entries of A.
f) det(A^t) = det(A)

g) 𝑑𝑒𝑡 (𝐴𝐵) = 𝑑𝑒𝑡(𝐴) 𝑑𝑒𝑡 (𝐵)

The following corollary follows from property (c).


Corollary 1: det(λA) = λ^n det(A)

The next corollary is a consequence of property (e).

Corollary 2: det(diag(a_11, …, a_nn)) = a_11 a_22 ⋯ a_nn. In particular, det(I_n) = 1 and det(0_{n×n}) = 0.

The following corollary is a consequence of property (g).

Corollary 3: If det(A) ≠ 0, then det(A^{-1}) = 1/det(A).

We can use determinants to test whether a set of vectors is linearly dependent, as given in the next corollary.

Corollary 4: Let S = {V_1, …, V_n} ⊆ ℝ^n and let A be the matrix having the vectors in S as its rows (or columns).
Then S is linearly dependent if and only if det(A) = 0.

Remark:
1. Properties (a) to (d) are valid if the given row operations are replaced by column operations.
2. The row operations in Properties (b), (c) and (d) can be used along with property (e) to effectively
compute determinants, as illustrated in Example 5.

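Example 3: Use determinants to decide if the following vectors are linearly dependent:
\[ V_1 = \begin{bmatrix} 3 \\ 5 \\ -6 \\ 4 \end{bmatrix}, \quad V_2 = \begin{bmatrix} 2 \\ -6 \\ 0 \\ 7 \end{bmatrix}, \quad V_3 = \begin{bmatrix} -2 \\ -1 \\ 3 \\ 0 \end{bmatrix} \quad \text{and} \quad V_4 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ -3 \end{bmatrix}. \]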
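
Example 3 is stated without a worked solution. One way to check it, relying on Corollary 4, is to place the vectors as the rows of a matrix and evaluate its determinant. The sketch below does this with a plain Laplace expansion; the function names are our own.

```python
def submatrix(A, i, j):
    """Delete row i and column j of A (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j)) for j in range(len(A)))

V1 = [3, 5, -6, 4]
V2 = [2, -6, 0, 7]
V3 = [-2, -1, 3, 0]
V4 = [0, 0, 0, -3]

d = det([V1, V2, V3, V4])   # vectors used as rows; by Theorem 3(f) columns give the same value
print(d)
print("linearly dependent" if d == 0 else "linearly independent")
```

The determinant evaluates to 0, so by Corollary 4 the four vectors are linearly dependent.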

Example 4: Let A be a 5 by 5 matrix with |A| = −1. Find |2A^{-1}(A^t)^4|.

Solution: Using Corollaries 1 and 3 and Theorem 3 (f) and (g), we get
\[ |2A^{-1}(A^t)^4| = 2^5\,|A^{-1}|\,|A^t|^4 = 32 \times \frac{1}{|A|} \times |A|^4 = 32 \times (-1) \times 1 = -32. \]

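Example 5: Evaluate $\begin{vmatrix} 2 & -1 & 4 & 1 \\ -3 & 3 & 1 & 6 \\ 8 & 4 & 5 & 2 \\ 1 & -3 & 8 & 4 \end{vmatrix}$.

Solution:
\[
\begin{aligned}
\begin{vmatrix} 2 & -1 & 4 & 1 \\ -3 & 3 & 1 & 6 \\ 8 & 4 & 5 & 2 \\ 1 & -3 & 8 & 4 \end{vmatrix}
&= -\begin{vmatrix} 1 & -3 & 8 & 4 \\ -3 & 3 & 1 & 6 \\ 8 & 4 & 5 & 2 \\ 2 & -1 & 4 & 1 \end{vmatrix} && \text{(interchanging two rows changes the sign of the determinant)} \\
&= -\begin{vmatrix} 1 & -3 & 8 & 4 \\ 0 & -6 & 25 & 18 \\ 0 & 28 & -59 & -30 \\ 0 & 5 & -12 & -7 \end{vmatrix} && R_2 \to R_2 + 3R_1,\ R_3 \to R_3 - 8R_1,\ R_4 \to R_4 - 2R_1 \\
&= -\begin{vmatrix} 1 & -3 & 8 & 4 \\ 0 & -6 & 25 & 18 \\ 0 & 0 & \frac{346}{6} & 54 \\ 0 & 0 & \frac{53}{6} & 8 \end{vmatrix} && R_3 \to R_3 + \tfrac{28}{6}R_2,\ R_4 \to R_4 + \tfrac{5}{6}R_2 \\
&= -\begin{vmatrix} 1 & -3 & 8 & 4 \\ 0 & -6 & 25 & 18 \\ 0 & 0 & \frac{346}{6} & 54 \\ 0 & 0 & 0 & \frac{(-53)(54) + 8(346)}{346} \end{vmatrix} && R_4 \to R_4 - \tfrac{53}{346}R_3 \\
&= -(1)(-6)\left(\frac{346}{6}\right)\left(\frac{2768 - 2862}{346}\right) \\
&= -94.
\end{aligned}
\]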
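
The strategy of Example 5 (reduce to triangular form, track sign changes, multiply the diagonal) can be sketched as follows. This is an illustrative implementation under our own naming; it uses only row interchanges and row additions, so by Theorem 3 (b) and (d) the only correction needed is the sign.

```python
from fractions import Fraction

def det_by_elimination(M):
    """Determinant via reduction to upper triangular form."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    sign = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)              # no pivot in this column: determinant is 0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            sign = -sign                    # each interchange flips the sign
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    product = Fraction(sign)
    for i in range(n):
        product *= A[i][i]                  # product of the diagonal entries
    return product

M = [[2, -1, 4, 1], [-3, 3, 1, 6], [8, 4, 5, 2], [1, -3, 8, 4]]
print(det_by_elimination(M))   # -94
```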

B. Determinants and Matrix Inverse; adjoint of a matrix


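Definition 4: The adjoint of an n × n matrix A, denoted adj(A), is the matrix given by
\[ adj(A) = \begin{bmatrix} \Delta_{11} & \Delta_{12} & \dots & \Delta_{1n} \\ \Delta_{21} & \Delta_{22} & \dots & \Delta_{2n} \\ \vdots & \vdots & & \vdots \\ \Delta_{n1} & \Delta_{n2} & \dots & \Delta_{nn} \end{bmatrix}^{t} \]
where Δ_ij denotes the cofactor of a_ij.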
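Example 6: Calculate adj(A) where $A = \begin{bmatrix} 1 & -2 & 3 \\ -2 & -1 & 0 \\ 4 & -2 & 5 \end{bmatrix}$.
Solution:
\[
\begin{aligned}
\Delta_{11} &= \begin{vmatrix} -1 & 0 \\ -2 & 5 \end{vmatrix} = -5, &
\Delta_{21} &= -\begin{vmatrix} -2 & 3 \\ -2 & 5 \end{vmatrix} = 4, &
\Delta_{31} &= \begin{vmatrix} -2 & 3 \\ -1 & 0 \end{vmatrix} = 3, \\
\Delta_{12} &= -\begin{vmatrix} -2 & 0 \\ 4 & 5 \end{vmatrix} = 10, &
\Delta_{22} &= \begin{vmatrix} 1 & 3 \\ 4 & 5 \end{vmatrix} = -7, &
\Delta_{32} &= -\begin{vmatrix} 1 & 3 \\ -2 & 0 \end{vmatrix} = -6, \\
\Delta_{13} &= \begin{vmatrix} -2 & -1 \\ 4 & -2 \end{vmatrix} = 8, &
\Delta_{23} &= -\begin{vmatrix} 1 & -2 \\ 4 & -2 \end{vmatrix} = -6, &
\Delta_{33} &= \begin{vmatrix} 1 & -2 \\ -2 & -1 \end{vmatrix} = -5.
\end{aligned}
\]

Therefore, $adj(A) = \begin{bmatrix} -5 & 4 & 3 \\ 10 & -7 & -6 \\ 8 & -6 & -5 \end{bmatrix}$. //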

Theorem 4: If A is a square matrix of order n, then
A adj(A) = det(A) I_n = adj(A) A.

Corollary: If det(A) ≠ 0, then $A^{-1} = \frac{1}{\det(A)}\, adj(A)$.

Proof: From Theorem 4, $A \left[\frac{1}{\det(A)}\, adj(A)\right] = I = \left[\frac{1}{\det(A)}\, adj(A)\right] A$. Hence, $A^{-1} = \frac{1}{\det(A)}\, adj(A)$.

The corollary provides a method for computing the inverse. This method is called the adjoint method or the
determinant method.
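Example 7: Given $A = \begin{bmatrix} 1 & -2 & 3 \\ -2 & -1 & 0 \\ 4 & -2 & 5 \end{bmatrix}$. Compute A^{-1} by the adjoint method.
Solution:
Step 1: Find the determinant of A.
\[ \det(A) = (1)\begin{vmatrix} -1 & 0 \\ -2 & 5 \end{vmatrix} - (-2)\begin{vmatrix} -2 & 0 \\ 4 & 5 \end{vmatrix} + (3)\begin{vmatrix} -2 & -1 \\ 4 & -2 \end{vmatrix} = -5 - 20 + 24 = -1. \]
Step 2: Find the adjoint of A. This was computed in Example 6:
\[ adj(A) = \begin{bmatrix} -5 & 4 & 3 \\ 10 & -7 & -6 \\ 8 & -6 & -5 \end{bmatrix}. \]
Step 3: Apply the formula for computing the inverse:
\[ A^{-1} = \frac{adj(A)}{\det(A)} = \begin{bmatrix} 5 & -4 & -3 \\ -10 & 7 & 6 \\ -8 & 6 & 5 \end{bmatrix}. \]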
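
The three steps of the adjoint method can be mirrored in code. The following sketch is illustrative only: the helper names are ours, cofactors are computed by Laplace expansion, and exact fractions are used for the final division by det(A).

```python
from fractions import Fraction

def submatrix(A, i, j):
    """Delete row i and column j of A (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j)) for j in range(len(A)))

def adjoint(A):
    """Transpose of the matrix of cofactors of A."""
    n = len(A)
    cof = [[(-1) ** (i + j) * det(submatrix(A, i, j)) for j in range(n)] for i in range(n)]
    return [list(col) for col in zip(*cof)]

def inverse_by_adjoint(A):
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular (det = 0)")
    return [[Fraction(x, d) for x in row] for row in adjoint(A)]

A = [[1, -2, 3], [-2, -1, 0], [4, -2, 5]]
print(det(A))                                               # Step 1
print(adjoint(A))                                           # Step 2
print([[int(x) for x in row] for row in inverse_by_adjoint(A)])   # Step 3
```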
Theorem 5: A square matrix A is invertible if and only if its determinant is different from zero.

Theorem 6:
a) A diagonal matrix is invertible if and only if each diagonal entry is different from zero.
b) The inverse of diag(a11,.⋯ ,ann) is diag(1/a11,.⋯ ,1/ann).

Theorem 7: If A is a nonsingular matrix of order n, then det(adj(A)) = (det A)^{n−1}.

Theorem 8: If A and B are nonsingular matrices of order n, then
adj(AB) = adj(B) adj(A).

Exercise 2.4:
1. For the 3 by 3 matrices A and B let 𝑑𝑒𝑡(𝐴) = 12 and 𝑑𝑒𝑡(𝐵) = 24. Compute 𝑑𝑒𝑡(−2𝐴𝑡 𝐵−1 ).
1 1
1 1
1 1
1 1
2. Find 𝑑𝑒𝑡(𝐴), where[ ].
1 1
1 0
1 1
0 0
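3. Find det(A), where $A = \begin{bmatrix} -1 & 1 & 2 & 3 \\ 0 & 1 & 3 & 2 \\ 1 & 1 & -1 & 2 \\ 1 & 1 & -1 & 4 \end{bmatrix}$.
4. Compute $\begin{vmatrix} -1 & 0 & 0 & 7 & 3 \\ 1 & 0 & 2 & -1 & 0 \\ 2 & 3 & 8 & -2 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & -1 \end{vmatrix}$.
5. Solve $\begin{vmatrix} 1 & 1 & 1 \\ 1 & x & 1 \\ 1 & 1 & x^2 \end{vmatrix} = 0$.
6. If $\begin{vmatrix} x & y & z \\ 3 & 0 & 2 \\ 1 & 1 & 1 \end{vmatrix} = 5$, calculate $\begin{vmatrix} 2x & 2y & 2z \\ \frac{3}{2} & 0 & 1 \\ 1 & 1 & 1 \end{vmatrix}$.
7. If $\begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = 2$, calculate $\begin{vmatrix} (a+1)^2 & (b+1)^2 & (c+1)^2 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix}$.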
8. Show that the adjoint of
a) a diagonal matrix is diagonal.
b) a triangular matrix is triangular.
c) a symmetric matrix is symmetric.
d) a Hermitian matrix is Hermitian.

9. Show that adj(AB) = adj(B) adj(A) for nonsingular matrices A and B of the same order.

2.5 Systems of Linear equations

A. Matrix representation of Linear Systems

The title above bears three key words: equation, linear and system. A linear equation is an equation of the form
a_1 x_1 + a_2 x_2 + … + a_n x_n = b where a_1, …, a_n and b are fixed numbers (constants) and x_1, …, x_n are unknowns
(place holders). The constants a_1, …, a_n are called the coefficients of the unknowns x_1, …, x_n respectively and
b is called the constant term of the equation. Note that the unknowns in a linear equation cannot be multiplied
with each other and they are not arguments of other functions.

A system of linear equations is a set or collection of linear equations that are considered simultaneously, i.e., for
which a common solution is sought. A general system of m linear equations in n unknowns x_1, …, x_n has the
form:

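\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2 \\
&\ \,\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &= b_m
\end{aligned}
\qquad (*)
\]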

where 𝑎𝑖𝑗 and 𝑏𝑖 are given constants. In short, we call (*) an m by n linear system.

Definition 1: The linear system (*) is said to be

a) homogeneous if 𝑏𝑖 = 0 for all i. Otherwise it is called non-homogeneous (in-homogeneous).

b) consistent if it has at least one solution. Otherwise, it is called inconsistent.

c) redundant if it is consistent and has more than one solution.

Definition 2:

a) A vector 𝑢 = 〈𝑢1 , 𝑢2 , … , 𝑢𝑛 〉 is said to be a solution of the linear system (*) if it satisfies each of the
equations in (*), i.e., if ai1 u1 + a i2 u2 +…+ a in u n = 𝑏𝑖 is true for each 𝑖 = 1, … , 𝑚.

b) The set of all solutions of (*) is called the solution set or general solution of (*).

c) The process of finding all solutions of a linear system is called solving the system.

Theorem 1: A linear system can have either no solution, only one (unique) solution or infinitely many solutions.

A homogeneous system of linear equations has at least one solution, namely the zero vector ⟨0, …, 0⟩. This
is called the trivial solution. Any other solution, if it exists, is called a nontrivial solution. Thus, there are only two
kinds of homogeneous systems: those having a unique solution and those with infinitely many solutions.

Definition 3: Two linear systems are said to be equivalent if their solution sets are equal.

Given a linear system, say (*), we can construct an equivalent linear system (**) by using the following
operations called elementary equation operations.

a) Interchanging two equations (say the ith equation and the jth equation) : Ei↔ Ej
b) Multiplying an equation by a nonzero scalar (say the ith equation by 𝛼 ) : Ei → 𝛼Ei
c) Adding a multiple of an equation to another equation (say adding 𝛼 times the ith equation to the
jth equation ): Ej → 𝐸𝑗 + 𝛼Ei

Definition 4: A system of linear equations is said to be redundant if removal of one or more equations doesn’t
affect the solution set.

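Definition 5: Consider the linear system (*).

a) The matrix

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}
\]

is called the matrix of coefficients of the system.

b) The matrix

\[
X = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
\]

is called the matrix of unknowns of the system.

c) The matrix

\[
B = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}
\]

is called the matrix of constants of the system.

d) The matrix

\[
(A|B) = \begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n} & b_1 \\
a_{21} & a_{22} & \dots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \dots & a_{mn} & b_m
\end{bmatrix}
\]

is called the augmented matrix of the system.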

Using the matrix notations above we can represent (*) by the matrix equation

AX = B

Another representation for (*) is the vector equation below:

\[
{}_1A\,x_1 + \dots + {}_nA\,x_n = B,
\]

where ⱼA denotes the jth column of A.

It is easy to see that both equations above are equivalent to (*). (The reader can verify this by computing the
indicated multiplications and additions.)

B. Characterization of Solutions: Existence and Uniqueness Criteria
There are three important questions in the study of any kind of equation:

Q1. Does it have a solution?

Q2. If it does, is the solution unique?

Q3. How can it be solved?

We shall address all of these questions for linear systems in the sequel. We answer Q1 and Q2 as follows.

Theorem 2: Let 𝐴𝑥 = 𝑏 be a system of m linear equations in n unknowns. This system is consistent if and
only if 𝑟𝑎𝑛𝑘(𝐴|𝑏) = 𝑟𝑎𝑛𝑘(𝐴).

Theorem 3: Let 𝐴𝑥 = 𝑏 be a system of m linear equations in n unknowns. This system has a unique solution
if and only if 𝑟𝑎𝑛𝑘(𝐴|𝑏) = 𝑟𝑎𝑛𝑘(𝐴) = 𝑛.
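
The two rank criteria can be tried out on small systems. The following is a minimal sketch, not part of the original notes: the helper names `rank` and `classify` are illustrative, and exact rational arithmetic (`fractions.Fraction`) is assumed so that no rounding affects the rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # index of the next pivot row
    for col in range(len(m[0]) if m else 0):
        # look for a nonzero entry in this column at or below row r
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * p for a, p in zip(m[i], m[r])]
        r += 1
    return r

def classify(A, b):
    """Apply Theorems 2 and 3 to the system Ax = b."""
    n = len(A[0])
    augmented = [row + [bi] for row, bi in zip(A, b)]
    rank_A, rank_Ab = rank(A), rank(augmented)
    if rank_A != rank_Ab:
        return "no solution"
    return "unique solution" if rank_A == n else "infinitely many solutions"

print(classify([[1, 1], [1, -1]], [2, 0]))  # unique solution
print(classify([[1, 1], [2, 2]], [2, 4]))   # infinitely many solutions
print(classify([[1, 1], [2, 2]], [2, 5]))   # no solution
```

The three sample systems are made up for illustration; each one exercises a different branch of the criteria.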

Example 2: Determine the value(s) of 𝑘 ≠ 0 such that the system

𝑥 + 𝑦 + 𝑘𝑧 = 0
{𝑥 + 𝑘𝑦 + 𝑧 = 0
𝑘𝑥 + 𝑦 + 𝑧 = 0

has a nontrivial solution.
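
One possible way to work Example 2 (a sketch, not taken from the original notes): by Theorem 3, a square homogeneous system has only the trivial solution exactly when rank(A) = n, which for a square matrix is the same as det(A) ≠ 0. So a nontrivial solution exists exactly when the determinant of the coefficient matrix is zero. Expanding along the first row:

```latex
\det\begin{pmatrix} 1 & 1 & k \\ 1 & k & 1 \\ k & 1 & 1 \end{pmatrix}
  = 1\,(k - 1) - 1\,(1 - k) + k\,(1 - k^2)
  = -(k^3 - 3k + 2)
  = -(k - 1)^2 (k + 2).
```

Under this reasoning the system has a nontrivial solution exactly when k = 1 or k = −2, and both values satisfy the condition k ≠ 0.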

Example 3: Determine the value(s) of k ≠ 0 such that the system

𝑘𝑥 + 𝑦 + 𝑧 = 1
{ 𝑥 + 𝑘𝑦 + 𝑧 = 1
𝑘𝑥 + 𝑦 + 𝑘𝑧 = 1

has

i) a unique solution
ii) no solution
iii) infinite solutions
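
A possible outline for Example 3 (a sketch under the rank criteria of Theorems 2 and 3, not taken from the original notes). The determinant of the coefficient matrix, expanded along the first row, is:

```latex
\det\begin{pmatrix} k & 1 & 1 \\ 1 & k & 1 \\ k & 1 & k \end{pmatrix}
  = k\,(k^2 - 1) - 1\,(k - k) + 1\,(1 - k^2)
  = (k^2 - 1)(k - 1)
  = (k - 1)^2 (k + 1).
% k \ne 1 and k \ne -1: det(A) \ne 0, so rank(A) = rank(A|b) = 3 and the solution is unique.
% k = 1:  all three equations read x + y + z = 1, so rank(A) = rank(A|b) = 1 < 3.
% k = -1: adding the 2nd and 3rd equations gives 0 = 2, so rank(A) \ne rank(A|b).
```

Read this way, the system has a unique solution for k ∉ {1, −1}, infinitely many solutions for k = 1, and no solution for k = −1.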

C. Methods of Solving Linear Systems
In general, the methods used to solve linear systems can be classified as analytic methods and numerical methods.
Analytic methods find the exact solution within a finite number of steps if infinite precision arithmetic is used (i.e.,
if rounding off is not applied). The most widely used analytic methods are the following.
- Gaussian elimination method
- Gauss Jordan reduction method
- Cramer’s rule
- Matrix factorization (decomposition) methods
- Matrix inversion methods

Here we deal with the first three only.

Theorem 4: Let 𝐴𝑥 = 𝑏 and 𝐶𝑥 = 𝑑 be 𝑚 by 𝑛 linear systems. If (𝐴|𝑏) and (𝐶|𝑑) are row equivalent,
then the two systems are equivalent (i.e, have the same solution sets).

Corollary: If 𝐴 and 𝐶 are row equivalent matrices, then the systems 𝐴𝑥 = 0 and 𝐶𝑥 = 0 have exactly the
same solutions.

Remark: The above theorem provides the following powerful methods for solving linear systems.

Gaussian Elimination Method (GEM): The Gaussian elimination procedure for solving a system of m linear
equations in n unknowns is as follows:

Step 0: Change every equation in the system to the form a1x1 + … + anxn = b (i.e., take the constant term to
the RHS and the rest to the LHS).

Step 1: Form the augmented matrix [A|B] of the system.

Step 2: Transform [A|B] to REF. Let [C|D] be a REF of [A|B].

Step 3: Solve CX = D. When you solve the latter system you may encounter one of 3 possibilities:

a) If 𝑟𝑎𝑛𝑘(𝐶) ≠ 𝑟𝑎𝑛𝑘(𝐶|𝐷), the system has no solution. (Row equivalent matrices have the same rank,
so 𝑟𝑎𝑛𝑘(𝐶) = 𝑟𝑎𝑛𝑘(𝐴) and 𝑟𝑎𝑛𝑘(𝐶|𝐷) = 𝑟𝑎𝑛𝑘(𝐴|𝐵).)

b) If 𝑟𝑎𝑛𝑘(𝐶) = 𝑟𝑎𝑛𝑘(𝐶|𝐷) = 𝑛, the system has a unique solution. To get the solution, first find
xn using the last nonzero row of [C|D]. Then find xn-1, ⋯, x1 by back substitution.

c) If 𝑟𝑎𝑛𝑘(𝐶) = 𝑟𝑎𝑛𝑘(𝐶|𝐷) < 𝑛, then the system has infinitely many solutions. In this case
certain columns of 𝐶 have no leading entries. The unknowns corresponding to such columns are
called non-basic variables, free variables or independent variables, and the rest are called basic variables.
Assuming that no equation is redundant, the number of basic variables equals the number of equations.
Let XB1, …, XBm be the basic variables and XN1, …, XN,n-m be the free variables. Assign arbitrary values
to the free variables and solve the system for the basic variables, as illustrated in Example 6.
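
The steps above can be sketched in code. This is an illustrative implementation only, not part of the original notes: the function name `gaussian_elimination` is hypothetical, exact fractions are assumed in place of floating point, and the sketch returns a solution only in the unique case, raising an error otherwise.

```python
from fractions import Fraction

def gaussian_elimination(A, b):
    """Solve Ax = b by GEM when the solution is unique.

    Returns the solution as a list of Fractions; raises ValueError when
    the system has no solution or infinitely many solutions.
    """
    n = len(A[0])
    # Step 1: augmented matrix [A|B] with exact rational entries
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    # Step 2: reduce to a row echelon form using elementary row operations
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue                      # no leading entry in this column
        M[r], M[pivot] = M[pivot], M[r]   # interchange two rows
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * p for a, p in zip(M[i], M[r])]  # R_i -> R_i - factor * R_r
        r += 1
    # Case a: a row (0 ... 0 | d) with d != 0 means rank(C) != rank(C|D)
    if any(all(v == 0 for v in row[:n]) and row[n] != 0 for row in M):
        raise ValueError("inconsistent system: no solution")
    # Case c: fewer than n leading entries means free variables exist
    if r < n:
        raise ValueError("infinitely many solutions")
    # Case b: back substitution, starting from the last nonzero row
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

# The system of Example 4 in this section
solution = gaussian_elimination([[1, -1, 2], [3, 4, 15], [2, -1, 1]], [-5, 2, 1])
print([str(v) for v in solution])
```

The sketch picks the first nonzero entry in each column as the pivot, so its intermediate matrices may differ from a hand computation that interchanges rows to get a convenient leading entry; by Theorem 4 the solution set is the same either way.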

Example 4: Solve by GEM:

𝑥 − 𝑦 + 2𝑧 = −5
{ 3𝑥 + 4𝑦 + 15𝑧 = 2
2𝑥 − 𝑦 + 𝑧 = 1

Solution:

Step 1: The augmented matrix of the system is

1 −1 2 −5
(3 4 15 2)
2 −1 1 1

Step 2: Find a row echelon form of the matrix in step 1.

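\[
\begin{pmatrix} 1 & -1 & 2 & -5 \\ 3 & 4 & 15 & 2 \\ 2 & -1 & 1 & 1 \end{pmatrix}
\underset{\substack{R_2 \to R_2 + (-3)R_1 \\ R_3 \to R_3 + (-2)R_1}}{\equiv}
\begin{pmatrix} 1 & -1 & 2 & -5 \\ 0 & 7 & 9 & 17 \\ 0 & 1 & -3 & 11 \end{pmatrix}
\underset{R_2 \leftrightarrow R_3}{\equiv}
\begin{pmatrix} 1 & -1 & 2 & -5 \\ 0 & 1 & -3 & 11 \\ 0 & 7 & 9 & 17 \end{pmatrix}
\underset{R_3 \to R_3 + (-7)R_2}{\equiv}
\begin{pmatrix} 1 & -1 & 2 & -5 \\ 0 & 1 & -3 & 11 \\ 0 & 0 & 30 & -60 \end{pmatrix}
\]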

Thus an REF of the augmented matrix is the matrix

1 −1 2 −5
(0 1 −3 11 )
0 0 30 −60

Moreover, an REF of the matrix of coefficients is


1 −1 2
(0 1 −3)
0 0 30

We see that the two matrices have the same rank, namely 3. Therefore, this system is consistent. Moreover,
the number of unknowns is also 3. Therefore, the system has a unique solution.

Step 3: Solve the system of equations corresponding to the augmented matrix in step 2, i.e., solve

\[
\begin{cases}
x - y + 2z = -5 & (1) \\
y - 3z = 11 & (2) \\
30z = -60 & (3)
\end{cases}
\]

From equation (3), 𝑧 = −2. Substituting 𝑧 = −2 in equation (2) gives 𝑦 = 11 + 3𝑧 = 5. Finally, using equation (1)
and the values of 𝑦 and 𝑧 we get 𝑥 = −5 + 𝑦 − 2𝑧 = 4. Therefore, the solution set of the given system is

\[
S = \left\{ \begin{bmatrix} 4 \\ 5 \\ -2 \end{bmatrix} \right\}.
\]

Example 5: Solve by GEM:

𝑥 + 𝑦 + 𝑧 = 150
{ 𝑥 + 2𝑦 = 100 − 3𝑧
2𝑥 + 3𝑦 + 4𝑧 = 200

Solution: Transfer all unknowns to the left hand side before setting up the augmented matrix of the system. The
remaining steps are done in the same fashion as Example 4.

Step 1: The augmented matrix of the system is

1 1 1 150
(1 2 3 100)
2 3 4 200

Step 2: Find a row echelon form of the matrix in step 1.

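\[
\begin{pmatrix} 1 & 1 & 1 & 150 \\ 1 & 2 & 3 & 100 \\ 2 & 3 & 4 & 200 \end{pmatrix}
\underset{\substack{R_2 \to R_2 + (-1)R_1 \\ R_3 \to R_3 + (-2)R_1}}{\equiv}
\begin{pmatrix} 1 & 1 & 1 & 150 \\ 0 & 1 & 2 & -50 \\ 0 & 1 & 2 & -100 \end{pmatrix}
\underset{R_3 \to R_3 + (-1)R_2}{\equiv}
\begin{pmatrix} 1 & 1 & 1 & 150 \\ 0 & 1 & 2 & -50 \\ 0 & 0 & 0 & -50 \end{pmatrix}
\]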

Thus an REF of the augmented matrix is the matrix

\[
\begin{pmatrix} 1 & 1 & 1 & 150 \\ 0 & 1 & 2 & -50 \\ 0 & 0 & 0 & -50 \end{pmatrix}.
\]

On the other hand, an REF of the matrix of coefficients is

\[
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{pmatrix}.
\]

The rank of the matrix of coefficients is 2 and the rank of the augmented matrix is 3; thus they are different.
Hence, the given system is inconsistent and so the solution set is the empty set.

Example 6: Solve by GEM:

\[
\begin{cases}
x_1 + x_2 + 2x_3 + 2x_4 + x_5 = 1 \\
2x_1 + 2x_2 + 4x_3 + 4x_4 + 3x_5 = 1 \\
2x_1 + 2x_2 + 4x_3 + 4x_4 + 2x_5 = 2 \\
3x_1 + 5x_2 + 8x_3 + 6x_4 + 5x_5 = 3
\end{cases}
\]

Solution:

Step 1: The augmented matrix of the system is

\[
\begin{pmatrix}
1 & 1 & 2 & 2 & 1 & 1 \\
2 & 2 & 4 & 4 & 3 & 1 \\
2 & 2 & 4 & 4 & 2 & 2 \\
3 & 5 & 8 & 6 & 5 & 3
\end{pmatrix}
\]

Step 2: Find a row echelon form of the matrix in step 1.

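\[
\begin{pmatrix}
1 & 1 & 2 & 2 & 1 & 1 \\
2 & 2 & 4 & 4 & 3 & 1 \\
2 & 2 & 4 & 4 & 2 & 2 \\
3 & 5 & 8 & 6 & 5 & 3
\end{pmatrix}
\underset{\substack{R_2 \to R_2 + (-2)R_1 \\ R_3 \to R_3 + (-2)R_1 \\ R_4 \to R_4 + (-3)R_1}}{\equiv}
\begin{pmatrix}
1 & 1 & 2 & 2 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2 & 2 & 0 & 2 & 0
\end{pmatrix}
\underset{\substack{R_2 \leftrightarrow R_4 \\ R_3 \leftrightarrow R_4}}{\equiv}
\begin{pmatrix}
1 & 1 & 2 & 2 & 1 & 1 \\
0 & 2 & 2 & 0 & 2 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]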

The last matrix is in row echelon form, and the ranks of the matrix of coefficients and of the augmented matrix
are equal. So, the system is consistent. The common rank is 3, whereas the number of unknowns is 5.
Hence, the system has infinitely many solutions. Since the columns corresponding to 𝑥₃ and 𝑥₄ have no
leading entries, these unknowns are free (non-basic) variables.

The given system is equivalent to

\[
\begin{cases}
x_1 + x_2 + 2x_3 + 2x_4 + x_5 = 1 \\
2x_2 + 2x_3 + 2x_5 = 0 \\
x_5 = -1 \\
0 = 0
\end{cases}
\]

Let 𝑥₃ = 𝑠 and 𝑥₄ = 𝑡 be arbitrary values. Solving for the remaining unknowns, we obtain 𝑥₅ = −1,
𝑥₂ = −𝑥₃ − 𝑥₅ = 1 − 𝑠 and 𝑥₁ = 1 − 𝑥₂ − 2𝑥₃ − 2𝑥₄ − 𝑥₅ = 1 − (1 − 𝑠) − 2𝑠 − 2𝑡 + 1 = 1 − 𝑠 − 2𝑡.

The solution set is

𝑆 = {〈1 − 𝑠 − 2𝑡, 1 − 𝑠, 𝑠, 𝑡, −1〉ᵗ : 𝑠, 𝑡 ∈ ℝ}
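
A parametric answer like this one is easy to spot-check by substituting it back into the original equations. The sketch below is only a sanity check on a handful of sample values of 𝑠 and 𝑡, not a proof; the names `candidate` and `satisfies` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Coefficients and constants of the system in Example 6
A = [[1, 1, 2, 2, 1],
     [2, 2, 4, 4, 3],
     [2, 2, 4, 4, 2],
     [3, 5, 8, 6, 5]]
b = [1, 1, 2, 3]

def candidate(s, t):
    """Solution of Example 6 with free variables x3 = s and x4 = t."""
    return [1 - s - 2 * t, 1 - s, s, t, -1]

def satisfies(x):
    """True if x satisfies every equation of the system."""
    return all(sum(a * xi for a, xi in zip(row, x)) == bi for row, bi in zip(A, b))

samples = [Fraction(-3), Fraction(0), Fraction(1, 2), Fraction(7)]
all_ok = all(satisfies(candidate(s, t)) for s, t in product(samples, repeat=2))
print(all_ok)
```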

Gauss-Jordan Elimination Method (GJEM): The Gauss-Jordan elimination procedure for solving a system of
m linear equations in n unknowns is as follows:

Step 0: Change every equation in the system to the form a1x1 + … + anxn = b (i.e., take the constant term to
the RHS and the rest to the LHS).

Step 1: Form the augmented matrix [A|B] of the system.

Step 2: Transform [A|B] to RREF. Let [C|D] be the RREF of [A|B].

Step 3: Solve CX = D. When you solve the latter system you may encounter one of 3 possibilities:

a) If 𝑟𝑎𝑛𝑘(𝐶) ≠ 𝑟𝑎𝑛𝑘(𝐶|𝐷), the system has no solution.

b) If 𝑟𝑎𝑛𝑘(𝐶) = 𝑟𝑎𝑛𝑘(𝐶|𝐷) = 𝑛, the system has a unique solution. In this case each equation in
[C|D] has exactly one unknown with coefficient one and that unknown does not appear in any other
equation of [C|D]. The value of each unknown is then the constant on the RHS of its respective
equation.

c) If 𝑟𝑎𝑛𝑘(𝐶) = 𝑟𝑎𝑛𝑘(𝐶|𝐷) < 𝑛, then the system has infinitely many solutions. The solution set
can be found in the same way as the corresponding case for GEM.
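
Step 2 of GJEM, the reduction to RREF, can be sketched as follows. This is an illustrative helper rather than the notes' own algorithm: the name `rref` is hypothetical and exact fractions are assumed.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix (list of rows)."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0  # index of the next pivot row
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        lead = M[r][col]
        M[r] = [v / lead for v in M[r]]            # make the leading entry 1
        for i in range(len(M)):
            if i != r and M[i][col] != 0:          # clear the rest of the column
                factor = M[i][col]
                M[i] = [a - factor * p for a, p in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return M

# Augmented matrix of the system x - y + 2z = -5, 3x + 4y + 15z = 2, 2x - y + z = 1
for row in rref([[1, -1, 2, -5], [3, 4, 15, 2], [2, -1, 1, 1]]):
    print([str(v) for v in row])
```

In the unique-solution case the last column of the result lists the values of the unknowns, which matches case b) above.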

Example 7: Use GJEM to solve

𝑥 − 𝑦 + 2𝑧 = −5
{ 3𝑥 + 4𝑦 + 15𝑧 = 2
2𝑥 − 𝑦 + 𝑧 = 1

Solution:

Step 1: The augmented matrix of the system is

1 −1 2 −5
(3 4 15 2)
2 −1 1 1

Step 2: Find the reduced row echelon form of the matrix in step 1. From the solution of Example 4, an REF of
the augmented matrix is the matrix

1 −1 2 −5
(0 1 −3 11 )
0 0 30 −60

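Now

\[
\begin{pmatrix} 1 & -1 & 2 & -5 \\ 0 & 1 & -3 & 11 \\ 0 & 0 & 30 & -60 \end{pmatrix}
\underset{R_3 \to \frac{1}{30}R_3}{\equiv}
\begin{pmatrix} 1 & -1 & 2 & -5 \\ 0 & 1 & -3 & 11 \\ 0 & 0 & 1 & -2 \end{pmatrix}
\underset{\substack{R_2 \to R_2 + (3)R_3 \\ R_1 \to R_1 + (-2)R_3}}{\equiv}
\begin{pmatrix} 1 & -1 & 0 & -1 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & -2 \end{pmatrix}
\underset{R_1 \to R_1 + R_2}{\equiv}
\begin{pmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & -2 \end{pmatrix}
\]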

It is evident from the RREF that x=4, y=5 and z=-2.

Theorem 5 [Cramer’s Rule]: Let 𝐴𝑥 = 𝑏 be a system of n linear equations in n unknowns. For each
k = 1, …, n, let Cₖ be the matrix obtained from A by replacing the kth column of A by 𝑏.

a) The system has a unique solution if and only if det(A) ≠ 0.

b) If 𝑑𝑒𝑡(𝐴) ≠ 0, then the unique solution is given by

\[
x_k = \frac{\det(C_k)}{\det(A)}, \qquad k = 1, \dots, n.
\]

That is, the value of xₖ is the quotient of the determinant of Cₖ by the determinant of A.

Remarks:

(1) If 𝑑𝑒𝑡(𝐴) = 0, then the system 𝐴𝑋 = 𝐵 does not have a unique solution. This means that it has
either no solution or infinitely many solutions. For a homogeneous system only the latter is possible.
(2) In particular, if 𝑑𝑒𝑡(𝐴) = 0, then the homogeneous system 𝐴𝑋 = 0 has infinitely many solutions.
(3) If 𝑑𝑒𝑡(𝐴) ≠ 0, then 𝐴⁻¹ exists and 𝐴𝑋 = 𝐵 ⟺ 𝑋 = 𝐴⁻¹𝐵. This is called the matrix inversion method.

Example 8: Use Cramer’s rule to solve

𝑥 − 𝑦 + 2𝑧 = −5
{ 3𝑥 + 4𝑦 + 15𝑧 = 2
2𝑥 − 𝑦 + 𝑧 = 1

Solution:

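\[
X_1 = \frac{\begin{vmatrix} -5 & -1 & 2 \\ 2 & 4 & 15 \\ 1 & -1 & 1 \end{vmatrix}}
           {\begin{vmatrix} 1 & -1 & 2 \\ 3 & 4 & 15 \\ 2 & -1 & 1 \end{vmatrix}}
    = \frac{-120}{-30} = 4, \qquad
X_2 = \frac{\begin{vmatrix} 1 & -5 & 2 \\ 3 & 2 & 15 \\ 2 & 1 & 1 \end{vmatrix}}
           {\begin{vmatrix} 1 & -1 & 2 \\ 3 & 4 & 15 \\ 2 & -1 & 1 \end{vmatrix}}
    = \frac{-150}{-30} = 5, \qquad
X_3 = \frac{\begin{vmatrix} 1 & -1 & -5 \\ 3 & 4 & 2 \\ 2 & -1 & 1 \end{vmatrix}}
           {\begin{vmatrix} 1 & -1 & 2 \\ 3 & 4 & 15 \\ 2 & -1 & 1 \end{vmatrix}}
    = \frac{60}{-30} = -2.
\]

Therefore the solution set is {(4, 5, −2)}.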
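
Cramer's rule translates directly into a short program. The sketch below is illustrative only: the names `det` and `cramer` are hypothetical, integer entries are assumed, and the determinant is computed by cofactor expansion along the first row, which is fine for small matrices but slow for large ones.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def cramer(A, b):
    """Solve the n-by-n system Ax = b by Cramer's rule (needs det(A) != 0)."""
    d = det(A)
    if d == 0:
        raise ValueError("det(A) = 0: no unique solution, Cramer's rule does not apply")
    solution = []
    for k in range(len(A)):
        # C_k: replace the k-th column of A by b
        C_k = [row[:k] + [bi] + row[k + 1:] for row, bi in zip(A, b)]
        solution.append(Fraction(det(C_k), d))
    return solution

# The system of Example 8
x = cramer([[1, -1, 2], [3, 4, 15], [2, -1, 1]], [-5, 2, 1])
print([str(v) for v in x])
```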

Exercise 2.5:

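1. Solve the following systems of equations by GEM:

\[
\text{a) } \begin{cases} 3x - 7y = 2 \\ 2x - 5y = -1 \end{cases}
\qquad
\text{b) } \begin{cases} -6x + 6y = 0 \\ 5x - 5y = 0 \end{cases}
\qquad
\text{c) } \begin{cases} -x + y = 1 \\ x - y = 1 \end{cases}
\]

\[
\text{d) } \begin{cases} x + y + 2z - w = -2 \\ 2x + y - 2z - 2w = -2 \\ 3x - 3w = -3 \end{cases}
\qquad
\text{e) } \begin{cases} w + 3x + 2y + 2z = 0 \\ w + 4x + y = 0 \\ 3w + 5x + 10y + 14z = 0 \\ 2w + 5x + 5y + 6z = 0 \end{cases}
\]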

2. Which of the linear systems in problem 1 has a redundant equation?


3. Solve the system of equations in Problem 1 by Cramer’s rule, if possible.
4. Solve the system of equations in problem (1) by matrix inversion method, if possible.
5. Find the value of h that makes the following system consistent.

𝑥 + ℎ𝑦 = −5
{
2𝑥 − 8𝑦 = 6

2.6 Eigen values and Eigen vectors


Let 𝐴 be an 𝑛 × 𝑛 matrix and 𝑥 be an 𝑛 dimensional vector. Then 𝐴𝑥 is an 𝑛 dimensional vector. Let

𝐴𝑥 = 𝑦

We can view this equation as a transformation that maps a given vector 𝑥 into another vector 𝑦. In particular,
nonzero vectors that are transformed into their scalar multiples play an important role in many applications.
To find such vectors we set 𝑦 = 𝜆𝑥 where 𝜆 is a scalar, and seek nonzero solutions 𝑥 of 𝐴𝑥 = 𝜆𝑥. Thus,
we shall study how to find a nonzero vector 𝑥 and a scalar 𝜆 such that 𝐴𝑥 = 𝜆𝑥.

Definition 1: Let 𝐴 be an 𝑛 × 𝑛 matrix. A nonzero 𝑛 dimensional vector 𝑥 is said to be an eigen vector of 𝐴 if
there exists a scalar 𝜆 such that 𝐴𝑥 = 𝜆𝑥. In this case, the scalar 𝜆 is called the eigen value of
𝐴 corresponding to 𝑥.

Remarks

1. Eigen vectors are also called proper vectors, characteristic vectors or latent vectors.
Eigen values are also called proper values, characteristic values or latent values. The word ‘eigen’ is German and
it means ‘proper’ in English.

2. Eigen vectors are by definition nonzero vectors but Eigen values can be zero.
3. The problem of finding the eigen vectors and eigen values of a matrix is called eigen value problem. One of
the three basic problems of linear algebra is the Eigen value problem; the other two being vector algebra
and systems of equations. Eigen value problems arise in several fields of study.
4. The concept of Eigen value/vector of a matrix is generalized to linear mappings. Eigen value theory of
matrices (and maps) is now developed into a more general theory called spectral theory of operators, which
is by itself a special branch of operator theory.
5. If 𝑥 is an eigen vector of 𝐴 so is 𝑟𝑥 for any nonzero scalar 𝑟.

Methods for Computing Eigen values and Eigen Vectors

Let A be an 𝑛 × 𝑛 matrix. If 𝜆 is an eigenvalue of 𝐴 and 𝑥 is a corresponding eigenvector, then 𝐴𝑥 = 𝜆𝑥.

Moreover, 𝐴𝑥 − 𝜆𝑥 = 0 ⟺ 𝐴𝑥 − 𝜆𝐼ₙ𝑥 = 0 ⟺ [𝐴 − 𝜆𝐼ₙ]𝑥 = 0.

The last equation is a homogeneous system of equations in 𝑥, so it always has a solution. But it has a nontrivial
solution iff 𝑑𝑒𝑡(𝐴 − 𝜆𝐼ₙ) = 0. Solving this equation for 𝜆 gives all the eigen values. This observation
suggests the following method for computing eigenvalues and eigenvectors.

Procedures for Finding the eigen values and eigen vectors of matrix A:

Step 1: Solve the equation

det (A- 𝜆𝐼𝑛 ) = 0

for 𝜆. The solutions are eigen values of A.

Step 2: For each eigen value 𝜆, solve the equation

(𝐴 − 𝜆𝐼ₙ)𝑥 = 𝟎

for 𝑥. The nontrivial solutions are the eigen vectors of 𝐴 corresponding to 𝜆.

Caution: The Eigen values of a matrix can be complex numbers.
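
For a 2 by 2 matrix, Step 1 reduces to a quadratic equation, since det(A − 𝜆I₂) = 𝜆² − tr(A)𝜆 + det(A), where tr(A) is the sum of the diagonal entries. The sketch below (an illustration only; the function name `eigenvalues_2x2` is hypothetical) solves it with the quadratic formula, using `cmath.sqrt` so that complex eigenvalues come out as well. The two sample matrices are the ones used in Examples 1 and 3 of this section.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of det(A - lambda*I) = lambda^2 - tr(A)*lambda + det(A) for a 2x2 matrix."""
    (a, b), (c, d) = A
    trace = a + d
    determinant = a * d - b * c
    # complex square root, so a negative discriminant is handled too
    root = cmath.sqrt(trace * trace - 4 * determinant)
    return (trace - root) / 2, (trace + root) / 2

print(eigenvalues_2x2([[1, 6], [5, 2]]))    # real eigenvalues
print(eigenvalues_2x2([[1, 1], [-1, 1]]))   # complex conjugate pair
```

The results are returned as Python complex numbers even when the eigenvalues are real, in which case the imaginary part is zero.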

Definition 2: Let A be 𝑛 × 𝑛 matrix and 𝜆 be a scalar. Then:

a. The polynomial 𝑑𝑒𝑡 (𝐴 − 𝜆𝐼𝑛 ) is called the characteristic polynomial of 𝐴.

b. The equation 𝑑𝑒𝑡 (𝐴 − 𝜆𝐼𝑛 ) = 0 is called the characteristic equation of 𝐴.

Example 1 (2 by 2 matrix with distinct real eigen values): Let

\[
A = \begin{bmatrix} 1 & 6 \\ 5 & 2 \end{bmatrix}.
\]

a) Find the Eigen values and Eigen vectors of A.

b) Find the characteristic polynomial of A.

Solution:

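a) Step 1: Find the Eigen values of A by solving 𝑑𝑒𝑡(𝐴 − 𝜆𝐼₂) = 0 for 𝜆.

\[
\begin{aligned}
\det(A - \lambda I_2) = 0
&\;\Longrightarrow\; \left| \begin{bmatrix} 1 & 6 \\ 5 & 2 \end{bmatrix} - \lambda I_2 \right| = 0 \\
&\;\Longrightarrow\; \begin{vmatrix} 1 - \lambda & 6 \\ 5 & 2 - \lambda \end{vmatrix} = 0 \\
&\;\Longrightarrow\; (1 - \lambda)(2 - \lambda) - 30 = 0 \\
&\;\Longrightarrow\; \lambda^2 - 3\lambda - 28 = 0 \\
&\;\Longrightarrow\; \lambda = -4 \text{ or } \lambda = 7.
\end{aligned}
\]

Thus the eigen values of A are 𝜆 = −4 and 𝜆 = 7. (For part b, the characteristic polynomial of A is
𝑑𝑒𝑡(𝐴 − 𝜆𝐼₂) = 𝜆² − 3𝜆 − 28.)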

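Step 2: Find the Eigen vectors of A corresponding to each Eigen value by solving (𝐴 − 𝜆𝐼₂)𝑋 = 𝟎 for X.

For 𝜆 = −4

\[
(A - (-4)I_2)X = \mathbf{0}
\;\Longrightarrow\; \begin{bmatrix} 1 + 4 & 6 \\ 5 & 2 + 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\;\Longrightarrow\; \begin{cases} 5x_1 + 6x_2 = 0 \\ 5x_1 + 6x_2 = 0 \end{cases}
\;\Longrightarrow\; X = t \begin{bmatrix} -\tfrac{6}{5} \\ 1 \end{bmatrix},
\]

where 𝑡 is any real number.

Thus every nonzero scalar multiple of

\[
\begin{bmatrix} -\tfrac{6}{5} \\ 1 \end{bmatrix}
\]

is an Eigen vector of A corresponding to −4.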

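For 𝜆 = 7

\[
(A - 7I_2)X = \mathbf{0}
\;\Longrightarrow\; \begin{bmatrix} 1 - 7 & 6 \\ 5 & 2 - 7 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\;\Longrightarrow\; \begin{cases} -6x_1 + 6x_2 = 0 \\ 5x_1 - 5x_2 = 0 \end{cases}
\;\Longrightarrow\; X = t \begin{bmatrix} 1 \\ 1 \end{bmatrix},
\]

where 𝑡 is any real number.

Thus every nonzero scalar multiple of

\[
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]

is an Eigen vector of A corresponding to 7.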

Example 2 (2 by 2 matrix with repeated eigenvalues): Let

\[
A = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}.
\]

Find the Eigen values and Eigen vectors of A.
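
A possible solution sketch for this example (worked here for illustration, not taken from the original notes), following Steps 1 and 2 of the procedure:

```latex
\det(A - \lambda I_2)
  = \begin{vmatrix} 2 - \lambda & 0 \\ 0 & 2 - \lambda \end{vmatrix}
  = (2 - \lambda)^2 = 0
  \;\Longrightarrow\; \lambda = 2 \text{ (a repeated root)}.
% Step 2: A - 2 I_2 is the zero matrix, so (A - 2 I_2) x = 0 holds for every x.
```

So the only eigen value is 2, and every nonzero vector of ℝ² is an eigen vector corresponding to it; for instance 〈1, 0〉 and 〈0, 1〉 are two linearly independent ones.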

Example 3 (2 by 2 matrix with complex eigen values): Let

\[
A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}.
\]

Find the Eigen values and Eigen vectors of A.
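
A possible solution sketch for this example (again worked for illustration, not taken from the original notes):

```latex
\det(A - \lambda I_2)
  = \begin{vmatrix} 1 - \lambda & 1 \\ -1 & 1 - \lambda \end{vmatrix}
  = (1 - \lambda)^2 + 1 = \lambda^2 - 2\lambda + 2 = 0
  \;\Longrightarrow\; \lambda = 1 \pm i.
% For \lambda = 1 + i: the first row of (A - \lambda I_2) x = 0 reads -i x_1 + x_2 = 0,
% so x_2 = i x_1 and x = t (1, i)^T.
% For \lambda = 1 - i: the first row reads i x_1 + x_2 = 0, so x = t (1, -i)^T.
```

Here 𝑡 is any nonzero complex scalar. The two eigen values are complex conjugates of each other, and so are the corresponding eigen vectors.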

Example 4 (3 by 3 matrix with repeated eigenvalues): Let

\[
A = \begin{pmatrix} 7 & -2 & -4 \\ 3 & 0 & -2 \\ 6 & -2 & -3 \end{pmatrix}.
\]

Find the Eigen values and Eigen vectors of A.
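
For this matrix the characteristic polynomial works out, by a hand computation that is not in the original notes, to −(𝜆 − 1)²(𝜆 − 2), which gives the eigen values 1 (repeated) and 2. The eigen vector candidates in the sketch below were also found by hand from (𝐴 − 𝜆𝐼₃)𝑥 = 𝟎; the code only checks the defining relation 𝐴𝑥 = 𝜆𝑥 for each of them.

```python
A = [[7, -2, -4],
     [3,  0, -2],
     [6, -2, -3]]

def matvec(M, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# (eigenvalue, eigenvector) candidates worked out by hand; they are not from the source text
candidates = [(1, [1, 3, 0]), (1, [0, -2, 1]), (2, [2, 1, 2])]

for lam, v in candidates:
    # A v should equal lam * v for a genuine eigenpair
    print(lam, v, matvec(A, v) == [lam * x for x in v])
```

The eigen value 1 has two linearly independent eigen vectors here, so the repeated root does not reduce the number of independent eigen vectors in this example.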

Exercises 2.6

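1. Find the characteristic polynomial, characteristic equation, Eigen values and Eigen vectors of:

\[
\text{a) } \begin{pmatrix} 3 & 4 \\ 1 & 3 \end{pmatrix}
\qquad
\text{b) } \begin{pmatrix} 0 & -2 \\ 1 & 3 \end{pmatrix}
\qquad
\text{c) } \begin{pmatrix} 2 & 0 & 1 \\ 0 & 1 & 0 \\ -2 & 0 & -1 \end{pmatrix}
\qquad
\text{d) } \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}
\]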
2. Show that the eigen values of a 2 by 2 real symmetric matrix are real.
3. Show that the eigenvalues for a 2 by 2 real skew-symmetric matrix are pure imaginary numbers.
4. Show that the eigenvectors for a 2 by 2 real symmetric matrix which belong to different eigenvalues
form a basis for ℝ².
5. Show that the eigenvectors for a 2 by 2 real symmetric matrix which belong to different eigenvalues are
necessarily perpendicular.

6. A real symmetric 𝑛 × 𝑛 matrix A is called positive definite if 𝑋ᵀ𝐴𝑋 > 0 for all nonzero vectors X in ℝⁿ.
Prove that the eigenvalues of a real symmetric positive-definite matrix A are all positive.
Prove that if eigenvalues of a real symmetric matrix A are all positive, then A is positive-definite.
7. Definition (algebraic and geometric multiplicity): we define the algebraic multiplicity of an eigenvalue to
be the number of times it is a root of the characteristic equation. We define the geometric multiplicity of
an eigenvalue to be the number of linearly independent eigenvectors for the eigenvalue. Find the
algebraic and geometric multiplicities of each eigen value in problem 1.
8. Show: If V is an eigenvector of a real matrix A corresponding to a complex eigenvalue λ, then the complex
conjugate V̄ is an eigenvector corresponding to λ̄.
