Linear Algebra
Fourth Edition
Kunquan Lan
Ryerson University
Contents
Preface
1 Euclidean spaces
  1.1 Euclidean spaces
2 Matrices
  2.1 Matrices
3 Determinants
  3.1 Determinants of 3 × 3 matrices
Index
Preface
This textbook is intended as an aid for students who are attending a course on Linear Algebra in universities and colleges. It is suitable for students who are tackling Linear Algebra for the first time, even those who have little prior knowledge of mathematics.
This is a self-contained textbook that contains full solutions to the questions in the exercises at the end of the book. The aim is to present the core of Linear Algebra in a clear way and to provide many new and simple proofs for some fundamental results. For example, we give a very simple proof of the well-known Cauchy-Schwarz inequality. The goal is to enhance students' abilities to analyze, solve, and generalize problems.
The material covered in this textbook is well organized and can be easily understood by students. The main
topics include vectors, matrices, determinants, linear systems, linear transformations in the Euclidean
spaces, bases and dimensions of spanning spaces in the Euclidean spaces, eigenvalues and
diagonalizability, subspaces of the Euclidean spaces, and general vector spaces. When we discuss linear
transformations, bases, and dimensions, we restrict our attention to the Euclidean spaces, but the ideas and
methods involved can be extrapolated to general vector spaces.
This textbook is based on the third edition of Linear Algebra. Some material has been rewritten for greater
clarity and some sections and chapters have been combined for better understanding. New examples and
exercises have been added. One new chapter introducing vector spaces has also been added. In particular,
subspaces of the Euclidean spaces are discussed in more detail.
I would like to thank Tim Robinson, Melody Vincent, Jared Steuernol, and Corina Wilshire from Pearson for their support during the production of this textbook.
Chapter 1 Euclidean spaces
The set of all ordered n-tuples of real numbers is called n-space (ℝⁿ for short) or n-dimensional Euclidean space and is denoted by
ℝⁿ = {(x₁, x₂, ⋯, xₙ) : xᵢ ∈ ℝ for each i ∈ Iₙ}.
In geometry, ℝ¹ represents the x-axis and we write ℝ¹ = ℝ; ℝ² represents the xy-plane and ℝ³ is the xyz-space. There is no geometric picture of ℝⁿ for n ≥ 4, but these spaces exist. For example, we can describe a particle moving in xyz-space by the 4-dimensional space {(x, y, z, t) : x, y, z, t ∈ ℝ}, where (x, y, z, t) denotes the position (x, y, z) of the particle at time t.
The element (x₁, x₂, ⋯, xₙ) is called a point in ℝⁿ and x₁, x₂, ⋯, xₙ are called the coordinates of the point in ℝⁿ. We use symbols like P(x₁, x₂, ⋯, xₙ) or P to denote the point (x₁, x₂, ⋯, xₙ) (see Figure 1.1 (I)).
Definition 1.1.2.
Let P(x₁, x₂, ⋯, xₙ) be a point in ℝⁿ. The directed line segment starting at the origin O and ending at the point P is called a vector or n-vector, and is denoted by →OP (see Figure 1.1 (II)). The distance between the points O and P is said to be the length of the vector. If P is the origin, then →OP is called the zero vector, and if P is not the origin, then →OP is called a nonzero vector.
A vector →a in ℝⁿ can be written in either a column form, called a column vector (or n-column vector), or a row form, called a row vector (or n-row vector), that is,
→a = (x₁, x₂, ⋯, xₙ)ᵀ (column form) or →a = (x₁, x₂, ⋯, xₙ) (row form),
where the superscript T indicates that the n-tuple is written as a column. For example, the column vector (1, 2, 3, 4)ᵀ has the row form (1, 2, 3, 4).
Note that two vectors with the same components written in different orders may not be the same. For example, the column vectors (1, 2, 3)ᵀ and (3, 2, 1)ᵀ are different vectors. A vector whose components are all zero is a zero vector, while a nonzero vector has at least one nonzero component. For example, (0, 0, 0) is a zero vector in ℝ³ while (1, 0, 0) and (1, 2, 3) are nonzero vectors in ℝ³.
Let
(1.1.1)
→e₁ = (1, 0, ⋯, 0)ᵀ, →e₂ = (0, 1, ⋯, 0)ᵀ, ⋯, →eₙ = (0, 0, ⋯, 1)ᵀ
be vectors in ℝⁿ. Then each vector has n components. These vectors →e₁, →e₂, ⋯, →eₙ are called the standard vectors in ℝⁿ. In particular, →e₁ = (1, 0)ᵀ and →e₂ = (0, 1)ᵀ are the standard vectors in ℝ², and →e₁ = (1, 0, 0)ᵀ, →e₂ = (0, 1, 0)ᵀ, and →e₃ = (0, 0, 1)ᵀ are the standard vectors in ℝ³.
In Definition 1.1.2 , a vector is defined as a directed line segment starting at the origin. The following
definition introduces the notion of a vector whose initial point is not necessarily at the origin.
Definition 1.1.3.
Let P(x₁, x₂, ⋯, xₙ) and Q(y₁, y₂, ⋯, yₙ) be points in ℝⁿ, and let R be the point (y₁ − x₁, y₂ − x₂, ⋯, yₙ − xₙ). The vector starting at P and ending at Q is defined by
(1.1.2)
→PQ = →OR = (y₁ − x₁, y₂ − x₂, ⋯, yₙ − xₙ).
By Definition 1.1.3, we see that the vector →PQ has different initial and terminating points than the vector →OR, but its mathematical expression is the same as that of →OR; both are (y₁ − x₁, y₂ − x₂, ⋯, yₙ − xₙ). Hence, if you move a nonzero vector →a anywhere in the xy-plane without changing its direction, the resulting vector is equal to →a (see Figure 1.1 (IV)). The same vector may have different initial and terminal points.
Mathematically, Definition 1.1.3 does not introduce any new vectors. It establishes a one-to-one correspondence between points and vectors in ℝⁿ: for each point P(x₁, x₂, ⋯, xₙ) in ℝⁿ, there is a unique vector →OP corresponding to the point P(x₁, x₂, ⋯, xₙ). Conversely, for each vector →a = (x₁, x₂, ⋯, xₙ), there is a unique point (x₁, x₂, ⋯, xₙ) corresponding to →a. Hence, the space ℝⁿ defined in (1.1.1) can be treated as the set of all vectors in ℝⁿ.
Example 1.1.1.
1. Let P(1, 2), Q(3, 4), P₁(0, −2, 3), and Q₁(1, 0, 2). Find →PQ and →P₁Q₁.
Solution
By (1.1.2), →PQ = (3 − 1, 4 − 2) = (2, 2) and →P₁Q₁ = (1 − 0, 0 − (−2), 2 − 3) = (1, 2, −1).
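The componentwise subtraction in (1.1.2) is easy to check numerically. The sketch below is our own illustration (the helper name `vector_between` is not from the text).

```python
def vector_between(P, Q):
    """Return the vector PQ = Q - P, componentwise, as in formula (1.1.2)."""
    return tuple(q - p for p, q in zip(P, Q))

print(vector_between((1, 2), (3, 4)))         # (2, 2)
print(vector_between((0, -2, 3), (1, 0, 2)))  # (1, 2, -1)
```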
Operations on vectors
For real numbers, there are four common operations: addition, subtraction, multiplication, and division. In this section, we generalize the first two operations, addition and subtraction, to ℝⁿ in a natural way, and introduce scalar multiplication and the dot product of two vectors in ℝⁿ. The dot product is a real number, not a vector. The componentwise multiplication and division of two vectors are not useful, so we do not introduce such operations. However, in Section 6.1, we introduce a cross product of two vectors in ℝ³, which is a vector in ℝ³.
Definition 1.1.4.
Two vectors are said to be equal if the following two conditions are satisfied:
1. They have the same number of components, that is, they are in the same space.
2. Their corresponding components are equal.
If →a, →b are equal, we write →a = →b. Otherwise, we write →a ≠ →b.
In Definition 1.1.3, we see that the two vectors →PQ and →OR in (1.1.2) are equal although they have different initial and terminating points.
Let →a = (a₁, a₂, ⋯, aₘ) and →b = (b₁, b₂, ⋯, bₙ). According to Definition 1.1.4, →a = →b if and only if m = n, that is, →a and →b must be in the same space, and aᵢ = bᵢ for each i ∈ Iₙ. Moreover, →a ≠ →b if and only if either m ≠ n or there exists some i ∈ Iₙ such that aᵢ ≠ bᵢ.
Example 1.1.2.
1. Let →a = (x + y, x − y, z)ᵀ and →b = (1, 3, 1)ᵀ. Find all x, y, z ∈ ℝ such that →a = →b.
2. Let →a = (1, x²)ᵀ and →b = (1, 4)ᵀ. Find all x ∈ ℝ such that →a ≠ →b.
3. Let →a = (1, 2, 3)ᵀ and →b = (1, 2)ᵀ. Identify whether →a and →b are equal.
4. Let P(1, 1), Q(3, 4), P₁(−2, 3), and Q₁(0, 6) be points in ℝ². Show that →PQ = →P₁Q₁.
Solution
1. Because the numbers of components of →a and →b are equal, by Definition 1.1.4, in order that →a = →b, the corresponding components of →a and →b must be equal, that is, x, y, z satisfy the following system of equations:
x + y = 1
x − y = 3
z = 1.
Adding and subtracting the first two equations gives x = 2 and y = −1; hence x = 2, y = −1, and z = 1.
Definition 1.1.5.
Let →a = (a₁, a₂, ⋯, aₙ) and →b = (b₁, b₂, ⋯, bₙ). We define the following operations in a natural way.
Addition: →a + →b = (a₁ + b₁, a₂ + b₂, ⋯, aₙ + bₙ).
Subtraction: →a − →b = (a₁ − b₁, a₂ − b₂, ⋯, aₙ − bₙ).
Scalar multiplication: k→a = (ka₁, ka₂, ⋯, kaₙ).
Remark 1.1.1.
The addition and subtraction of vectors are defined only within the same space ℝⁿ. If two vectors are in different spaces, say, one is in ℝᵐ and the other in ℝⁿ, where m ≠ n, then the addition and subtraction of the two vectors are not defined. For example, (1, 2)ᵀ + (3, 3, 5)ᵀ is not defined.
When n = 2, the addition of two vectors is illustrated in Figure 1.2. If →a = (x₁, y₁) and →b = (x₂, y₂), then Figure 1.2 (I) shows that
→a + →b = (x₁ + x₂, y₁ + y₂).
Figure 1.2 (II) shows that if we place the initial point of the vector →b at the terminating point of the vector →a, then the vector →a + →b is the vector starting at the initial point of →a and ending at the terminating point of →b. In other words, the vector →OA plus the vector →AB is equal to the vector →OB:
(1.1.3)
→OA + →AB = →OB.
The subtraction of two vectors is illustrated in Figure 1.2 (III), which can be derived from Figure 1.2 (II) by the relation (1.1.3).
By Definition 1.1.5, the following results on the vector operations can be easily proved, so we leave the proofs to the reader.
Theorem 1.1.1.
Let →a, →b, →c ∈ ℝⁿ and let α, β ∈ ℝ. Then
1. →a + →0 = →a.
2. 0→a = →0.
3. →a + →b = →b + →a.
4. (→a + →b) + →c = →a + (→b + →c).
5. α(→a + →b) = α→a + α→b.
6. (α + β)→a = α→a + β→a.
7. (αβ)→a = α(β→a).
Example 1.1.3.
Let →a = (1, 2, 0)ᵀ and →b = (3, 6, 5)ᵀ. Compute
i. →a + →b;
ii. →a − →b;
iii. 5→a;
iv. −→a;
v. 2→a − 4→b.
Solution
i. →a + →b = (1 + 3, 2 + 6, 0 + 5)ᵀ = (4, 8, 5)ᵀ.
ii. →a − →b = (1 − 3, 2 − 6, 0 − 5)ᵀ = (−2, −4, −5)ᵀ.
iii. 5→a = ((5)(1), (5)(2), (5)(0))ᵀ = (5, 10, 0)ᵀ.
iv. −→a = ((−1)(1), (−1)(2), (−1)(0))ᵀ = (−1, −2, 0)ᵀ.
v. 2→a − 4→b = (2, 4, 0)ᵀ − (12, 24, 20)ᵀ = (−10, −20, −20)ᵀ.
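The componentwise operations of Definition 1.1.5 translate directly into code. The sketch below (the helper names `add`, `sub`, and `scale` are our own) reproduces parts of Example 1.1.3.

```python
def add(u, v):   return tuple(x + y for x, y in zip(u, v))   # Definition 1.1.5, addition
def sub(u, v):   return tuple(x - y for x, y in zip(u, v))   # subtraction
def scale(k, u): return tuple(k * x for x in u)              # scalar multiplication

a, b = (1, 2, 0), (3, 6, 5)
print(add(a, b))                      # (4, 8, 5)
print(sub(a, b))                      # (-2, -4, -5)
print(scale(5, a))                    # (5, 10, 0)
print(sub(scale(2, a), scale(4, b)))  # (-10, -20, -20)
```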
Example 1.1.4.
Let →a = (1, 2, 0)ᵀ, →b = (3, 6, 5)ᵀ, and →c = (4, 6, 7)ᵀ. Compute →a − →b + 2→c.
Solution
By Theorem 1.1.1, we have
→a − →b + 2→c = (→a − →b) + 2→c = [(1, 2, 0)ᵀ − (3, 6, 5)ᵀ] + 2(4, 6, 7)ᵀ = (6, 8, 9)ᵀ.
Example 1.1.5.
Find →a if
(1/5)[4→a − (9, 3)ᵀ] = (1, 5)ᵀ − 2→a.
Solution
From the given equation, we see that →a ∈ ℝ² and
4→a − (9, 3)ᵀ = 5[(1, 5)ᵀ − 2→a] = (5, 25)ᵀ − 10→a.
Hence,
4→a + 10→a = (5, 25)ᵀ + (9, 3)ᵀ = (14, 28)ᵀ
and 14→a = (14, 28)ᵀ. Hence, →a = (1/14)(14, 28)ᵀ = (1, 2)ᵀ.
Parallel vectors
Definition 1.1.6.
Two nonzero vectors →a and →b in ℝⁿ are said to be parallel if there exists a number k ∈ ℝ such that →a = k→b. When k > 0, →a and →b have the same direction, and when k < 0, their directions are opposite.
For example, because (2, 4, 6) = 2(1, 2, 3), the two vectors (2, 4, 6) and (1, 2, 3) are parallel and have
the same direction. Similarly, because ( − 2, − 4, − 6) = − 2(1, 2, 3), the two vectors ( − 2, − 4, − 6)
and (1, 2, 3) are parallel and have the opposite direction.
If →a and →b in ℝⁿ are parallel, we write →a ǁ →b.
Example 1.1.6.
Let P(1, −2, −3), Q(2, 0, −1), P₁(−2, −7, −4), and Q₁(1, −1, 2). Show that →PQ ǁ →P₁Q₁.
Solution
By (1.1.2), →PQ = (2 − 1, 0 − (−2), −1 − (−3)) = (1, 2, 2) and
→P₁Q₁ = (1 − (−2), −1 − (−7), 2 − (−4)) = (3, 6, 6).
It follows that →P₁Q₁ = (3, 6, 6) = 3(1, 2, 2) = 3→PQ. By Definition 1.1.6, →PQ ǁ →P₁Q₁.
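Parallelism in the sense of Definition 1.1.6 can be tested numerically by looking for a single scalar k with →a = k→b. The helper below is our own sketch, not from the text.

```python
def parallel(a, b, tol=1e-12):
    """Return True if a = k*b for some scalar k (both vectors assumed nonzero)."""
    # Determine the candidate k from the first nonzero component of b,
    # then verify the same k works for every component.
    k = next(x / y for x, y in zip(a, b) if y != 0)
    return all(abs(x - k * y) <= tol for x, y in zip(a, b))

print(parallel((3, 6, 6), (1, 2, 2)))  # True, since (3, 6, 6) = 3(1, 2, 2)
print(parallel((1, 2, 3), (1, 2, 4)))  # False
```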
Dot product
The scalar multiplication, addition, and subtraction of vectors in ℝⁿ result in vectors in ℝⁿ. We do not define the multiplication and division of →a and →b componentwise as (a₁b₁, a₂b₂, ⋯, aₙbₙ) and (a₁/b₁, a₂/b₂, ⋯, aₙ/bₙ) because such operations are not useful. Instead, we introduce the notion of a product of two vectors, called the dot product, whose usefulness will be seen later. The dot product of two vectors is a real number, not a vector in ℝⁿ. In Section 6.1, we shall introduce another product of two vectors in ℝ³, called the cross product, which is a vector in ℝ³.
Definition 1.1.7.
Let →a = (a₁, a₂, ⋯, aₙ) and →b = (b₁, b₂, ⋯, bₙ). The dot product of →a and →b, denoted by →a ⋅ →b, is defined by
(1.1.4)
→a ⋅ →b = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ.
It is convenient to write the dot product of →a and →b in the following row times column form:
(1.1.5)
→a ⋅ →b = (a₁, a₂, ⋯, aₙ)(b₁, b₂, ⋯, bₙ)ᵀ = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ.
Hence, →a ⋅ →b is the sum of the products of the corresponding components of →a and →b. Sometimes, we use sigma notation to write →a ⋅ →b, that is,
(1.1.6)
→a ⋅ →b = ∑ᵢ₌₁ⁿ aᵢbᵢ.
Example 1.1.7.
Let →a = (−1, 3, 5, −7) and →b = (5, −4, 7, 0). Find →a ⋅ →b.
Solution
→a ⋅ →b = (−1)(5) + (3)(−4) + (5)(7) + (−7)(0) = 18.
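Formula (1.1.4) is a one-line sum in code. The following sketch (the helper name `dot` is ours) checks Example 1.1.7.

```python
def dot(a, b):
    """Dot product of two n-vectors, formula (1.1.4)."""
    return sum(x * y for x, y in zip(a, b))

print(dot((-1, 3, 5, -7), (5, -4, 7, 0)))  # 18
```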
Theorem 1.1.2.
Let →a, →b, →c ∈ ℝⁿ and α ∈ ℝ. Then
1. →a ⋅ →0 = 0.
2. →a ⋅ →b = →b ⋅ →a. (Commutative law for the dot product)
3. →a ⋅ (→b + →c) = →a ⋅ →b + →a ⋅ →c. (Distributive law for the dot product)
4. (α→a) ⋅ →b = →a ⋅ (α→b) = α(→a ⋅ →b).
Example 1.1.8.
Let →a = (1, 0, 4, 2)ᵀ, →b = (2, 5, 0, 1)ᵀ, and →c = (1, 2, 0, 1)ᵀ. Calculate →a ⋅ (→b + →c).
Solution
→a ⋅ (→b + →c) = →a ⋅ →b + →a ⋅ →c = (1, 0, 4, 2)(2, 5, 0, 1)ᵀ + (1, 0, 4, 2)(1, 2, 0, 1)ᵀ = 4 + 3 = 7.
Exercises
1. Write a 3-column zero vector and a 3-column nonzero vector, and a 5-column zero vector and a 5-
column nonzero vector.
2. Suppose that the buyer for a manufacturing plant must order different quantities of oil, paper, steel,
and plastics. He will order 40 units of oil, 50 units of paper, 80 units of steel, and 20 units of plastics.
Write the quantities in a single vector.
3. Suppose that a student’s course marks for quiz 1, quiz 2, test 1, test 2, and the final exam are 70, 85,
80, 75, and 90, respectively. Write his marks as a column vector.
4. Let →a = (x − 2y, 2x − y, 2z)ᵀ and →b = (2, −2, 1)ᵀ. Find all x, y, z ∈ ℝ such that →a = →b.
5. Let →a = (|x|, y²)ᵀ and →b = (1, 4)ᵀ. Find all x, y ∈ ℝ such that →a = →b.
6. Let →a = (x − y, 4)ᵀ and →b = (x + y, 2)ᵀ. Find all x, y ∈ ℝ such that →a − →b is a nonzero vector.
7. Let P(1, x²), Q(3, 4), P₁(4, 5), and Q₁(6, 1). Find all possible values of x ∈ ℝ such that →PQ = →P₁Q₁.
8. A company with 553 employees lists each employee's salary as a component of a vector →a in ℝ⁵⁵³. If a 6% salary increase has been approved, find the vector involving →a that gives all the new salaries.
9. Let →a = (110, 88, 40)ᵀ denote the current prices of three items at a store. Suppose that the store announces a sale so that the price of each item is reduced by 20%.
a. Find a 3-vector that gives the price changes for the three items.
b. Find a 3-vector that gives the new prices of the three items.
10. Let →a = (−1, 2, 3)ᵀ, →b = (2, −2, 0)ᵀ, →c = (3, 0, −1)ᵀ, and α ∈ ℝ. Compute
i. 2→a − →b + 5→c;
ii. 4→a + α→b − 2→c.
11. Find x, y, and z such that (9, 4y, 2z)ᵀ + (3x, 8, −6)ᵀ = (0, 0, 0)ᵀ.
12. Solve the following equation for →u:
(1/2)[2→u − 3→v + →a] = 2→w + →u − 2→a.
13. Determine whether →a ǁ →b.
1. →a = (1, 2, 3) and →b = (−2, −4, −6).
2. →a = (−1, 0, 1, 2) and →b = (−3, 0, 3, 6).
3. →a = (1, −1, 1, 2) and →b = (−2, 2, 2, 3).
14. Let x, y, a, b ∈ ℝ with x ≠ y. Let P(x, 2x), Q(y, 2y), P₁(a, 2a), and Q₁(b, 2b) be points in ℝ². Show that →PQ ǁ →P₁Q₁.
15. Let P(x, 0), Q(3, x²), P₁(2x, 1), and Q₁(6, x²). Find all possible values of x ∈ ℝ such that →PQ ǁ →P₁Q₁.
16. Let →a = (−1, 2, 3)ᵀ, →b = (2, −2, 0)ᵀ, →c = (3, 0, −1)ᵀ, and α ∈ ℝ. Compute
a. →a ⋅ →b;
b. →a ⋅ →c;
c. →b ⋅ →c;
d. →a ⋅ (→b + →c).
18. Assume that the percentages for homework, test 1, test 2, and the final exam for a course are 10%,
25%, 25%, and 40%, respectively. The total marks for homework, test 1, test 2 and the final exam are
10, 50, 50, and 90, respectively. A student’s corresponding marks are 8, 46, 48, and 81, respectively.
What is the student’s final mark out of 100?
19. Assume that the percentages for homework, test 1, test 2, and the final exam for a course are 14%,
20%, 20%, and 46%, respectively. The total marks for homework, test 1, test 2, and the final exam are
14, 60, 60, and 80, respectively. A student’s corresponding marks are 12, 45, 51, and 60, respectively.
What is the student’s final mark out of 100?
20. A manufacturer produces three different types of products. The demand for the products is denoted by the vector →a = (10, 20, 30). The price per unit for the products is given by the vector →b = ($200, $150, $100). If the demand is met, how much money will the manufacturer receive?
21. A company pays three groups of employees a salary. The numbers of employees in the three groups are given by the vector →a = (5, 20, 40). The payments for the groups are given by the vector →b = ($100,000, $80,000, $60,000). Use the dot product to calculate the total amount of money the company paid its employees.
Figure 1.3:
22. There are three students who may buy a Calculus or Algebra book. Use the dot product to find the
total number of students who buy both Calculus and Algebra books.
23. There are n students who may buy a Calculus or Algebra book (n ≥ 2). Use the dot product to find the
total number of students who buy both Calculus and Algebra books.
24. Assume that a person A has contracted a contagious disease and has direct contact with four people: P1, P2, P3, and P4. We denote the contacts by a vector →a := (a₁₁, a₁₂, a₁₃, a₁₄), where a₁ⱼ = 1 if the person A has made contact with the person Pj, and a₁ⱼ = 0 if the person A has made no contact with the person Pj. Now we suppose that the four people then have had a variety of direct contacts with another individual B, which we denote by a vector →b := (b₁₁, b₂₁, b₃₁, b₄₁), where bⱼ₁ = 1 if the person Pj has made contact with the person B, and bⱼ₁ = 0 if the person Pj has made no contact with the person B.
i. Find the total number of indirect contacts between A and B.
ii. If →a = (1, 0, 1, 1) and →b = (1, 1, 1, 1), find the total number of indirect contacts between A and B.
1.2 Norm and angle
In ℝ¹, it is known that the distance between a real number x and zero is |x|, the absolute value of x. In ℝ², by the Pythagorean theorem, the distance between a point P(x₁, x₂) and the origin O(0, 0) is given by
(1.2.1)
√(x₁² + x₂²)
(see Figure 1.2), which is the length of the vector →OP. Now, we generalize the formula (1.2.1) for the length of a vector from ℝ² to ℝⁿ, where the length of a vector is called the norm of the vector.
Definition 1.2.1.
Let →a = (x₁, x₂, ⋯, xₙ) ∈ ℝⁿ. The norm of →a is defined by
(1.2.2)
ǁ→aǁ = √(x₁² + x₂² + ⋯ + xₙ²).
Example 1.2.1.
Let →a = (1, 3, −2). Find ǁ→aǁ.
Solution
ǁ→aǁ = √(1² + 3² + (−2)²) = √14.
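Definition 1.2.1 is straightforward to compute; the sketch below (our own helper `norm`) checks Example 1.2.1.

```python
import math

def norm(a):
    """Norm of a vector, formula (1.2.2)."""
    return math.sqrt(sum(x * x for x in a))

print(norm((1, 3, -2)))                          # ≈ 3.7417, i.e. √14
print(math.isclose(norm((1, 3, -2)) ** 2, 14))   # True
```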
The following result gives the basic properties of norms of vectors.
Theorem 1.2.1.
Let →a, →b ∈ ℝⁿ. Then
i. ǁ→aǁ² = →a ⋅ →a.
ii. ǁk→aǁ = |k| ǁ→aǁ for k ∈ ℝ, and ǁk→aǁ = k ǁ→aǁ for k ≥ 0.
iii. ǁ→a + →bǁ² = ǁ→aǁ² + 2(→a ⋅ →b) + ǁ→bǁ².
iv. ǁ→a − →bǁ² = ǁ→aǁ² − 2(→a ⋅ →b) + ǁ→bǁ².
v. (→a − →b) ⋅ (→a + →b) = ǁ→aǁ² − ǁ→bǁ².
Proof
We only prove (iii). By the result (i) and Theorem 1.1.2, we obtain
ǁ→a + →bǁ² = (→a + →b) ⋅ (→a + →b) = →a ⋅ →a + →a ⋅ →b + →b ⋅ →a + →b ⋅ →b = ǁ→aǁ² + 2(→a ⋅ →b) + ǁ→bǁ².
Example 1.2.2.
Let →a ∈ ℝⁿ be a nonzero vector. Show that
(1.2.3)
(1/ǁ→aǁ) →a
is a unit vector, that is, a vector of norm 1.
Solution
Because →a ∈ ℝⁿ is a nonzero vector, ǁ→aǁ ≠ 0. Let k = 1/ǁ→aǁ. Then k > 0. By Theorem 1.2.1 (ii), we have
ǁ(1/ǁ→aǁ)→aǁ = ǁk→aǁ = kǁ→aǁ = (1/ǁ→aǁ) ǁ→aǁ = 1.
Definition 1.2.2.
A vector in ℝⁿ whose norm is 1 is called a unit vector.
By Example 1.2.2, we see that every nonzero vector can be normalized to a unit vector by (1.2.3).
Example 1.2.3.
Let →a = (1, 2, 3, √2). Normalize →a to a unit vector.
Solution
Because ǁ→aǁ = √(1² + 2² + 3² + (√2)²) = √16 = 4, the unit vector is
(1/ǁ→aǁ)→a = (1/4)(1, 2, 3, √2) = (1/4, 1/2, 3/4, √2/4).
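The normalization (1.2.3) can be checked numerically; the sketch below (our own helper `normalize`) reproduces Example 1.2.3 and verifies that the result has norm 1.

```python
import math

def normalize(a):
    """Scale a nonzero vector to a unit vector, as in (1.2.3)."""
    n = math.sqrt(sum(x * x for x in a))
    return tuple(x / n for x in a)

u = normalize((1, 2, 3, math.sqrt(2)))
print(u)  # approximately (0.25, 0.5, 0.75, 0.3536)
print(math.isclose(sum(x * x for x in u), 1.0))  # True
```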
Gram determinant
The Gram determinant can be used to obtain the Cauchy-Schwarz inequality and the formula for the area of a parallelogram in ℝⁿ.
A 2 × 2 matrix A is a rectangular array of four numbers a, b, c, d ∈ ℝ arranged in two rows and two columns:
(1.2.4)
A = (a b; c d),
where a, b form the first row and c, d form the second row. The determinant of A, denoted by |A|, is the number
(1.2.5)
|A| = |a b; c d| = ad − bc.
Example 1.2.4.
Compute the following determinants:
i. |2 3; 4 5|;
ii. |−1 2; −3 6|;
iii. |1 0; −2 5|.
Solution
i. |2 3; 4 5| = (2)(5) − (3)(4) = 10 − 12 = −2.
ii. |−1 2; −3 6| = (−1)(6) − (2)(−3) = −6 + 6 = 0.
iii. |1 0; −2 5| = (1)(5) − (0)(−2) = 5 − 0 = 5.
From Example 1.2.4, we see that determinants can be negative, zero, or positive.
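Formula (1.2.5) is a single arithmetic expression; the sketch below (our own helper `det2`) checks the first two determinants of Example 1.2.4.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows (a, b) and (c, d), formula (1.2.5)."""
    return a * d - b * c

print(det2(2, 3, 4, 5))    # -2
print(det2(-1, 2, -3, 6))  # 0
```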
Definition 1.2.3.
Let →a, →b ∈ ℝⁿ. The number
(1.2.6)
G(→a, →b) = |→a ⋅ →a  →a ⋅ →b; →b ⋅ →a  →b ⋅ →b| = ǁ→aǁ²ǁ→bǁ² − (→a ⋅ →b)²
is called the Gram determinant of →a and →b.
Lemma 1.2.1.
Let →a, →b ∈ ℝⁿ. Then
(1.2.7)
G(→a, →b) ǁ→aǁ² = ǁ ǁ→aǁ² →b − (→a ⋅ →b) →a ǁ².
Proof
Let
(1.2.8)
→g = ǁ→aǁ² →b − (→a ⋅ →b) →a.
By Theorem 1.1.2 (2), (4) and Theorem 1.2.1 (i), (iv), we have
ǁ→gǁ² = ǁ ǁ→aǁ² →b ǁ² − 2(ǁ→aǁ² →b) ⋅ ((→a ⋅ →b) →a) + ǁ(→a ⋅ →b) →aǁ²
= ǁ→aǁ⁴ǁ→bǁ² − 2ǁ→aǁ²(→a ⋅ →b)² + (→a ⋅ →b)²ǁ→aǁ²
= ǁ→aǁ⁴ǁ→bǁ² − ǁ→aǁ²(→a ⋅ →b)² = ǁ→aǁ²[ǁ→aǁ²ǁ→bǁ² − (→a ⋅ →b)²].
This implies (1.2.7).
Using Lemma 1.2.1, we prove the Cauchy-Schwarz inequality, which provides the relation between the dot product and the norms of two vectors.
Theorem 1.2.2.
Let →a, →b ∈ ℝⁿ. Then
|→a ⋅ →b| ≤ ǁ→aǁ ǁ→bǁ.
Proof
If →a = →0, then we see that the Cauchy-Schwarz inequality holds. If →a ≠ →0, then because ǁ ǁ→aǁ² →b − (→a ⋅ →b) →a ǁ² ≥ 0, by (1.2.7), G(→a, →b) ≥ 0. It follows that
(→a ⋅ →b)² ≤ ǁ→aǁ²ǁ→bǁ²
and
|→a ⋅ →b| = √((→a ⋅ →b)²) ≤ √(ǁ→aǁ²ǁ→bǁ²) = ǁ→aǁǁ→bǁ.
In terms of the components of →a = (x₁, x₂, ⋯, xₙ) and →b = (y₁, y₂, ⋯, yₙ), the Cauchy-Schwarz inequality is equivalent to the following inequality.
(1.2.9)
|x₁y₁ + x₂y₂ + ⋯ + xₙyₙ| ≤ √(x₁² + x₂² + ⋯ + xₙ²) √(y₁² + y₂² + ⋯ + yₙ²).
If →a = →0 or →b = →0, then equality holds in the Cauchy-Schwarz inequality. The following result shows that if →a ≠ →0 and →b ≠ →0, then equality holds if and only if the two vectors are parallel.
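The inequality (1.2.9) and its equality case can be checked numerically. The sketch below (our own helpers, not from the text) tests a generic pair and a parallel pair.

```python
import math

def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a):   return math.sqrt(dot(a, a))

# Cauchy-Schwarz: |a.b| <= ||a|| ||b||
a, b = (1, 2, 1), (0, 2, -2)
print(abs(dot(a, b)) <= norm(a) * norm(b))  # True

# Equality holds for parallel vectors: c = 2a
c = (2, 4, 2)
print(math.isclose(abs(dot(a, c)), norm(a) * norm(c)))  # True
```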
Theorem 1.2.3.
Let →a, →b ∈ ℝⁿ be nonzero vectors. Then the following assertions are equivalent.
i. |→a ⋅ →b| = ǁ→aǁ ǁ→bǁ.
ii. →b = ((→a ⋅ →b)/ǁ→aǁ²) →a.
iii. →a ǁ →b.
Proof
(i) implies (ii). Assume that (i) holds. Let →g be the same as in (1.2.8). By (1.2.7), we have
ǁ→gǁ² = ǁ→aǁ²[ǁ→aǁ²ǁ→bǁ² − (→a ⋅ →b)²] = ǁ→aǁ²[ǁ→aǁ²ǁ→bǁ² − (ǁ→aǁǁ→bǁ)²] = 0.
Hence, →g = ǁ→aǁ² →b − (→a ⋅ →b) →a = →0 and
→b = ((→a ⋅ →b)/ǁ→aǁ²) →a.
(ii) implies (iii). Assume that (ii) holds. By Definition 1.1.6, →a ǁ →b.
(iii) implies (i). Assume that (iii) holds. By Definition 1.1.6, there exists k ∈ ℝ such that →a = k→b. Then
|→a ⋅ →b| = |(k→b) ⋅ →b| = |k| ǁ→bǁ²  and  ǁ→aǁǁ→bǁ = ǁk→bǁǁ→bǁ = |k| ǁ→bǁ².
Using Theorem 1.2.2, we prove the triangle inequality, which is illustrated by Figure 1.5 (I).
Theorem 1.2.4.
Let →a, →b ∈ ℝⁿ. Then
ǁ→a + →bǁ ≤ ǁ→aǁ + ǁ→bǁ.
Moreover, equality holds if →a = →0 or →b = →0 or →a and →b are parallel with the same direction.
Proof
By Theorem 1.2.1 (iii) and Theorem 1.2.2, we have
ǁ→a + →bǁ² = ǁ→aǁ² + 2(→a ⋅ →b) + ǁ→bǁ² ≤ ǁ→aǁ² + 2ǁ→aǁǁ→bǁ + ǁ→bǁ² = (ǁ→aǁ + ǁ→bǁ)².
This implies that ǁ→a + →bǁ ≤ ǁ→aǁ + ǁ→bǁ. If →a and →b are parallel with the same direction, then →a = k→b with k > 0, so →a ⋅ →b = kǁ→bǁ² > 0 and, by Theorem 1.2.3, →a ⋅ →b = |→a ⋅ →b| = ǁ→aǁǁ→bǁ. Hence
ǁ→a + →bǁ² = (ǁ→aǁ + ǁ→bǁ)²
and equality holds. It is obvious that equality holds if →a = →0 or →b = →0.
Let P(x₁, x₂, ⋯, xₙ) and Q(y₁, y₂, ⋯, yₙ) be two points in ℝⁿ. By (1.1.2) and Definition 1.2.1, we obtain the formula for the distance between P and Q:
(1.2.10)
ǁ→PQǁ = √((y₁ − x₁)² + (y₂ − x₂)² + ⋯ + (yₙ − xₙ)²).
Example 1.2.5.
Let P₁(2, −1, 5) and P₂(4, −3, 1) be points in ℝ³. Find the distance between P₁ and P₂.
Solution
ǁ→P₁P₂ǁ = √((4 − 2)² + (−3 + 1)² + (1 − 5)²) = 2√6.
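The distance formula is easily computed; the sketch below (our own helper `distance`) checks Example 1.2.5.

```python
import math

def distance(P, Q):
    """Distance between points P and Q in R^n: the norm of the vector PQ."""
    return math.sqrt(sum((q - p) ** 2 for p, q in zip(P, Q)))

d = distance((2, -1, 5), (4, -3, 1))
print(math.isclose(d, 2 * math.sqrt(6)))  # True
```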
Now, we derive the formula for points between two points in ℝ 2. In particular, we obtain the midpoint formula.
Theorem 1.2.5.
Let A(x₁, y₁) and B(x₂, y₂) be two different points in ℝ². Then the following assertions hold.
1. If C(x, y) is a point on the line segment joining A and B, then there exists t ∈ [0, 1] such that
(1.2.11)
(x, y)ᵀ = (1 − t)(x₁, y₁)ᵀ + t(x₂, y₂)ᵀ.
2. For each t ∈ [0, 1], the point C(x, y) given by (1.2.11) is on the line segment joining A and B.
Proof
1. Because C is a point on the line segment joining A and B, →AC is parallel to →AB and they have the same direction. By Definition 1.1.6, there exists t ≥ 0 such that
(1.2.12)
→AC = t→AB.
It follows that
(x, y)ᵀ − (x₁, y₁)ᵀ = t[(x₂, y₂)ᵀ − (x₁, y₁)ᵀ]
and
(x, y)ᵀ = (x₁, y₁)ᵀ + t[(x₂, y₂)ᵀ − (x₁, y₁)ᵀ] = (1 − t)(x₁, y₁)ᵀ + t(x₂, y₂)ᵀ.
By (1.2.12), we have
(1.2.13)
ǁ→ACǁ = ǁt→ABǁ = tǁ→ABǁ.
Because C is between A and B, ǁ→ACǁ ≤ ǁ→ABǁ. This, together with (1.2.13), implies t ∈ [0, 1].
2. By (1.2.11), we have
→AC = (x, y)ᵀ − (x₁, y₁)ᵀ = t[(x₂, y₂)ᵀ − (x₁, y₁)ᵀ] = t→AB.
This implies that →AC is parallel to →AB. Because the starting point of both →AC and →AB is A, the point C is on the line segment joining A and B.
By Theorem 1.2.5 (2) with t = 1/2, we obtain the following midpoint formula for the line segment joining A and B:
(1.2.14)
((x₁ + x₂)/2, (y₁ + y₂)/2).
Example 1.2.6.
Let A(1, 2), B(−2, 1), and C(−1, 1) be three different points in ℝ². Find the distance between the midpoints of the segments AB and AC.
Solution
By (1.2.14), the two midpoints of the segments AB and AC are
((1 − 2)/2, (2 + 1)/2) = (−1/2, 3/2)  and  ((1 − 1)/2, (2 + 1)/2) = (0, 3/2).
Hence, the distance between the two midpoints is
√((−1/2 − 0)² + (3/2 − 3/2)²) = 1/2.
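Formula (1.2.11) parametrizes the segment, and t = 1/2 gives the midpoint (1.2.14). The sketch below (our own helper `segment_point`) reproduces the two midpoints of Example 1.2.6.

```python
def segment_point(A, B, t):
    """Point (1 - t)A + tB on the segment AB, formula (1.2.11)."""
    return tuple((1 - t) * a + t * b for a, b in zip(A, B))

mid_AB = segment_point((1, 2), (-2, 1), 0.5)
mid_AC = segment_point((1, 2), (-1, 1), 0.5)
print(mid_AB)  # (-0.5, 1.5)
print(mid_AC)  # (0.0, 1.5)
```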
By the Cauchy-Schwarz inequality, for any two nonzero vectors →a, →b ∈ ℝⁿ,
−1 ≤ (→a ⋅ →b)/(ǁ→aǁǁ→bǁ) ≤ 1.
Let θ be an angle whose radian measure varies from 0 to π. Then cos θ is strictly decreasing on [0, π]. Hence, for any two nonzero vectors →a, →b ∈ ℝⁿ, there exists a unique angle θ ∈ [0, π] such that
(1.2.15)
cos θ = (→a ⋅ →b)/(ǁ→aǁǁ→bǁ).
If n = 2 or n = 3, then we can position the two vectors so that they have the same initial point. We denote by θ the angle between the two vectors →a and →b, which satisfies 0 ≤ θ ≤ π. By the cosine law, we have
(1.2.16)
ǁ→a − →bǁ² = ǁ→aǁ² + ǁ→bǁ² − 2ǁ→aǁǁ→bǁ cos θ.
By Theorem 1.2.1 (iv), this implies
ǁ→aǁǁ→bǁ cos θ = →a ⋅ →b
and (1.2.15) holds. Hence, it is natural to define the angle given in (1.2.15) as the angle between any two nonzero vectors →a, →b in ℝⁿ.
Definition 1.2.4.
Let →a, →b be nonzero vectors in ℝⁿ. We define the angle between →a and →b, denoted by θ, by (1.2.15).
By (1.2.15), the angle θ is acute if and only if →a ⋅ →b > 0, and θ is obtuse if and only if →a ⋅ →b < 0.
Definition 1.2.5.
Two vectors →a, →b ∈ ℝⁿ are said to be orthogonal (or perpendicular) if θ = π/2. We write →a ⊥ →b.
By (1.2.15), we obtain the following result.
Theorem 1.2.6.
Let →a, →b ∈ ℝⁿ. Then →a ⊥ →b if and only if →a ⋅ →b = 0.
If the two vectors →a and →b in Theorem 1.2.4 are orthogonal, then we have the following Pythagorean theorem.
Theorem 1.2.7.
Let →a, →b ∈ ℝⁿ. Then →a ⊥ →b if and only if
(1.2.17)
ǁ→a + →bǁ² = ǁ→aǁ² + ǁ→bǁ².
Proof
By Theorem 1.2.1 (iii), we have
(1.2.18)
ǁ→a + →bǁ² = ǁ→aǁ² + 2(→a ⋅ →b) + ǁ→bǁ².
If →a ⊥ →b, then →a ⋅ →b = 0. It follows from (1.2.18) that (1.2.17) holds. Conversely, if (1.2.17) holds, then by (1.2.18), →a ⋅ →b = 0 and →a ⊥ →b.
Theorem 1.2.8.
Let →a, →b ∈ ℝⁿ be two nonzero vectors. Then →a ǁ →b if and only if θ = 0 or θ = π.
Proof
By Theorem 1.2.3, →a ǁ →b if and only if |→a ⋅ →b| = ǁ→aǁǁ→bǁ. Because →a, →b ∈ ℝⁿ are two nonzero vectors, ǁ→aǁ ≠ 0 and ǁ→bǁ ≠ 0. This implies that →a ⋅ →b ≠ 0. By (1.2.15),
cos θ = (→a ⋅ →b)/(ǁ→aǁǁ→bǁ) = (→a ⋅ →b)/|→a ⋅ →b|.
If →a ⋅ →b > 0, then |→a ⋅ →b| = →a ⋅ →b, so cos θ = 1 and θ = 0. If →a ⋅ →b < 0, then |→a ⋅ →b| = −(→a ⋅ →b), so cos θ = −1 and θ = π.
Example 1.2.7.
Let →a = (−2, 3, 1) and →b = (1, 2, −4). Show that →a ⊥ →b.
Solution
Because →a ⋅ →b = −2 + 6 − 4 = 0, by Theorem 1.2.6, we have →a ⊥ →b.
Example 1.2.8.
Let →a = (x², 1)ᵀ and →b = (2, −8)ᵀ. Find all x ∈ ℝ such that →a ⊥ →b.
Solution
→a ⋅ →b = (x², 1)(2, −8)ᵀ = 2x² − 8 = 0 if and only if x = 2 or x = −2. Hence, when x = 2 or x = −2, →a ⋅ →b = 0 and →a ⊥ →b.
Example 1.2.9.
Find the angle θ between →a and →b.
i. →a = (2, −1, 1) and →b = (1, 1, 2).
ii. →a = (1, −1, 0, −1) and →b = (1, −1, 1, −1).
iii. →a = (a, a, a), where a > 0 is given, and →b = (a, 0, 0).
iv. →a = (1, −1, 2) and →b = (0, 1, −1).
Solution
i. Because →a ⋅ →b = 3, ǁ→aǁ = √6, and ǁ→bǁ = √6, by (1.2.15),
cos θ = 3/(√6 √6) = 1/2 and θ = π/3.
ii. Because →a ⋅ →b = 3, ǁ→aǁ = √3, and ǁ→bǁ = 2, by (1.2.15),
cos θ = 3/(2√3) = √3/2 and θ = π/6.
iii. Because →a ⋅ →b = a², ǁ→aǁ = √3 a, and ǁ→bǁ = a, by (1.2.15),
cos θ = a²/(√3 a²) = √3/3 and θ = arccos(√3/3) ≈ 54.74°.
iv. Because →a ⋅ →b = −3, ǁ→aǁ = √6, and ǁ→bǁ = √2, by (1.2.15),
cos θ = −3/(√6 √2) = −√3/2 and θ = 5π/6.
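The angle formula (1.2.15) can be evaluated with the inverse cosine. The sketch below (our own helpers, not from the text) checks parts (i) and (iv) of Example 1.2.9.

```python
import math

def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a):   return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle between two nonzero vectors, formula (1.2.15)."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

print(math.isclose(angle((2, -1, 1), (1, 1, 2)), math.pi / 3))       # True
print(math.isclose(angle((1, -1, 2), (0, 1, -1)), 5 * math.pi / 6))  # True
```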
Exercises
1. For each of the following vectors, find its norm and normalize it to a unit vector.
i. →a = (1, 0, −2);
ii. →b = (1, 1, −2);
iii. →c = (2, 2, −2).
2. Let →a, →b ∈ ℝⁿ. Show that the following identities hold.
i. ǁ→a + →bǁ² + ǁ→a − →bǁ² = 2ǁ→aǁ² + 2ǁ→bǁ².
ii. ǁ→a + →bǁ² − ǁ→a − →bǁ² = 4(→a ⋅ →b).
3. Compute the determinants of the following matrices:
A = (1 −2; 4 1), B = (−2 2; 1 −3), C = (2 −1; 3 0), D = (−2 1; −6 3).
4. For each pair of vectors, verify that the Cauchy-Schwarz and triangle inequalities hold.
i. →a = (1, 2), →b = (2, −1);
ii. →a = (1, 2, 1), →b = (0, 2, −2);
iii. →a = (1, −1, 0, 1), →b = (0, −1, 2, 1).
5. Let →a = (2, 1) and →b = (1, x). Use Theorem 1.2.3 to find x ∈ ℝ such that →a ǁ →b.
6. For each pair of points P 1 and P 2, find the distance between the two points.
1. P 1(1, − 2), P 2( − 3, − 1)
2. P 1(3, − 1), P 2( − 2, 1)
3. P 1(1, − 1, 5), P 2(1, − 3, 1)
4. P 1(0, − 1, 2), P 2(1, − 3, 1)
5. P 1( − 1, − 1, 5, 3), P 2(0, − 2, − 3, − 1)
7. Let A( − 1, − 2), B(2, 3), C( − 1, 1), and D(0, 1) be four different points in ℝ 2. Find the distance
between the midpoints of the segments AB and CD.
8. For each pair of vectors, determine whether they are orthogonal.
i. →a = (1, 2), →b = (2, −1);
ii. →a = (1, 1, 1), →b = (0, 2, −2);
iii. →a = (1, −1, −1), →b = (1, 2, −2);
iv. →a = (1, −1, 1), →b = (0, 2, −1);
v. →a = (2, −1, 0, 1), →b = (0, −1, 3, −1).
9. For each pair of vectors, find all values of x ∈ ℝ such that they are orthogonal.
i. →a = (−1, 2), →b = (2, x²);
ii. →a = (1, 1, 1), →b = (x, 2, −2);
iii. →a = (5, x, 0, 1), →b = (0, −1, 3, −1).
10. For each pair of vectors, find cos θ, where θ is the angle between them.
1. →a = (1, −1, 1) and →b = (1, 1, −2);
2. →a = (1, 0, 1) and →b = (−1, −1, −1).
11. Let →a = (6, 8) and →b = (1, x). Find x ∈ ℝ such that
i. →a ⊥ →b;
ii. →a ǁ →b;
iii. the angle between →a and →b is π/4.
1.3 Projection and areas of parallelograms
Definition 1.3.1.
Let →a, →b be nonzero vectors in ℝⁿ. A vector →p in ℝⁿ is said to be a projection of →b on →a if →p satisfies the following two conditions.
1. →p is parallel to →a.
2. →b − →p is perpendicular to →a.
Figure 1.6:
If a vector →p is a projection of →b on →a, we write
→p = Proj_→a →b.
We refer to Figure 1.7 for the relations among the vectors →a, →b, Proj_→a →b, and →b − Proj_→a →b. Moreover, if →a ⋅ →b > 0, then the angle between →a and →b belongs to (0, π/2) and Proj_→a →b and →a have the same direction. If →a ⋅ →b < 0, then the angle between →a and →b belongs to (π/2, π) and Proj_→a →b and →a have opposite directions. If →a ⋅ →b = 0, then →a ⊥ →b and Proj_→a →b = →0.
Figure 1.7:
The following result gives the formula for the projection of b→ on a→.
Theorem 1.3.1.
Let a→, b→ be nonzero vectors in ℝⁿ. Then
(1.3.1)
Proja→ b→ = (a→ ⋅ b→ / ǁa→ǁ²) a→.
Proof
By Definition 1.3.1, Proja→ b→ ǁ a→ and (b→ − Proja→ b→) ⊥ a→. By Definition 1.1.6 and Theorem 1.2.6, there exists k ∈ ℝ such that
(1.3.2)
Proja→ b→ = k a→
and
(1.3.3)
(b→ − Proja→ b→) ⋅ a→ = 0.
Hence, by (1.3.3), Theorem 1.1.2 (2) and (3), (1.3.2), and Theorem 1.2.1 (i),
a→ ⋅ b→ = a→ ⋅ Proja→ b→ = a→ ⋅ (k a→) = k(a→ ⋅ a→) = kǁa→ǁ².
This implies k = (a→ ⋅ b→)/ǁa→ǁ². This, together with (1.3.2), implies (1.3.1).
Example 1.3.1.
Let a→ = (4, −1, 2) and b→ = (2, −1, 3). Find Proja→ b→.
Solution
Because
a→ ⋅ b→ = 4(2) + (−1)(−1) + 2(3) = 15 and ǁa→ǁ² = 16 + 1 + 4 = 21,
by (1.3.1) we obtain
Proja→ b→ = (15/21)(4, −1, 2) = (5/7)(4, −1, 2) = (20/7, −5/7, 10/7).
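The computation in Example 1.3.1 is easy to check numerically. A minimal sketch in plain Python, assuming formula (1.3.1); the helper names `dot` and `proj` are my own, not from the text:

```python
def dot(a, b):
    """Scalar product of two vectors given as lists of components."""
    return sum(x * y for x, y in zip(a, b))

def proj(a, b):
    """Projection of b on a, by formula (1.3.1): (a.b / ||a||^2) a."""
    k = dot(a, b) / dot(a, a)  # dot(a, a) = ||a||^2
    return [k * x for x in a]

a = [4, -1, 2]
b = [2, -1, 3]
p = proj(a, b)  # (15/21)(4, -1, 2), i.e. approximately (2.857, -0.714, 1.429)
print(p)
# By Definition 1.3.1 (2), b - p must be perpendicular to a:
print(abs(dot([y - x for x, y in zip(p, b)], a)) < 1e-12)
```

The final check verifies condition (2) of Definition 1.3.1 up to floating-point rounding.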
Theorem 1.3.2.
Let a→, b→ ∈ ℝⁿ be nonzero vectors. Then the following assertions hold.
i. ǁProja→ b→ǁ = |a→ ⋅ b→| / ǁa→ǁ.
ii. |cos θ| = ǁProja→ b→ǁ / ǁb→ǁ, where θ is the angle between a→ and b→.
Proof
i. Because Proja→ b→ = (a→ ⋅ b→ / ǁa→ǁ²) a→, we have
ǁProja→ b→ǁ = (|a→ ⋅ b→| / ǁa→ǁ²) ǁa→ǁ = |a→ ⋅ b→| / ǁa→ǁ.
ii. By (i) and the definition of the angle θ,
|cos θ| = |a→ ⋅ b→| / (ǁa→ǁǁb→ǁ) = ǁProja→ b→ǁ / ǁb→ǁ.
Moreover:
1. If θ ∈ [0, π/2], then cos θ = ǁProja→ b→ǁ / ǁb→ǁ.
2. If θ ∈ (π/2, π], then cos θ = −ǁProja→ b→ǁ / ǁb→ǁ.
Example 1.3.2.
Let a→ = (1, −1, 2) and b→ = (0, 1, −1).
1. Use Theorem 1.3.2 (i) to find ǁProja→ b→ǁ.
2. Use Theorem 1.3.2 (ii) to find the angle θ between a→ and b→.
Solution
1. Because
a→ ⋅ b→ = (1)(0) + (−1)(1) + (2)(−1) = −3 and ǁa→ǁ = √(1² + (−1)² + 2²) = √6,
by Theorem 1.3.2 (i),
ǁProja→ b→ǁ = |a→ ⋅ b→| / ǁa→ǁ = |−3|/√6 = 3/√6 = √6/2.
2. Because ǁb→ǁ = √(0² + 1² + (−1)²) = √2, by Theorem 1.3.2 (ii),
|cos θ| = ǁProja→ b→ǁ / ǁb→ǁ = (√6/2)/√2 = √3/2.
Because a→ ⋅ b→ = −3 < 0, we have cos θ < 0. Hence,
cos θ = −|cos θ| = −√3/2 and θ = π − π/6 = 5π/6.
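The two-step procedure of Example 1.3.2 (find |cos θ| from the projection norm, then fix the sign from a→ ⋅ b→) can be sketched in plain Python; the function name `angle` is my own:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle between a and b via Theorem 1.3.2 (ii):
    |cos t| = ||Proj_a b|| / ||b||, with the sign of cos t given by a.b."""
    ab = dot(a, b)
    proj_norm = abs(ab) / norm(a)   # Theorem 1.3.2 (i)
    cos_t = proj_norm / norm(b)     # this is |cos t|
    if ab < 0:
        cos_t = -cos_t              # obtuse angle when a.b < 0
    return math.acos(cos_t)

theta = angle([1, -1, 2], [0, 1, -1])
print(theta, 5 * math.pi / 6)  # both approximately 2.61799
```

This reproduces θ = 5π/6 from Example 1.3.2 up to floating-point rounding.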
Areas of parallelograms
Let a→ = (x₁, x₂, ⋯, xₙ) and b→ = (y₁, y₂, ⋯, yₙ) be nonzero vectors in ℝⁿ. By condition (2) of Definition 1.3.1, the vector (b→ − Proja→ b→) is always orthogonal to a→, whether the angle θ between a→ and b→ belongs to (0, π/2] or to (π/2, π). In ℝ², it is easy to see that the norm ǁb→ − Proja→ b→ǁ is the height of the parallelogram determined by a→ and b→ whenever θ ∈ (0, π/2] or θ ∈ (π/2, π).
Hence, we define the norm of (b→ − Proja→ b→) as the height of the parallelogram determined by a→ and b→ in ℝⁿ (see Figure 1.7). We denote by A(a→, b→) the area of the parallelogram determined by a→ and b→. Then
(1.3.4)
A(a→, b→) = ǁa→ǁ ǁb→ − Proja→ b→ǁ.
The formula for the area A(a→, b→) given in (1.3.4) depends heavily on the projection Proja→ b→. By the Pythagorean Theorem 1.2.7 and Theorem 1.3.2, we derive a formula for A(a→, b→) that depends only on the Gram determinant of a→ and b→, so it is easier to use to compute A(a→, b→).
Theorem 1.3.3.
Let a→ and b→ be nonzero vectors in ℝⁿ. Then
(1.3.5)
A(a→, b→) = √G(a→, b→) = √(ǁa→ǁ²ǁb→ǁ² − (a→ ⋅ b→)²).
Proof
Because (b→ − Proja→ b→) ⊥ a→ and a→ ǁ Proja→ b→, we have
(b→ − Proja→ b→) ⊥ Proja→ b→.
By the Pythagorean Theorem 1.2.7 and Theorem 1.3.2 (i),
ǁb→ − Proja→ b→ǁ² = ǁb→ǁ² − ǁProja→ b→ǁ² = ǁb→ǁ² − |a→ ⋅ b→|²/ǁa→ǁ²,
and hence, by (1.3.4),
A(a→, b→)² = ǁa→ǁ²ǁb→ − Proja→ b→ǁ² = ǁa→ǁ²ǁb→ǁ² − |a→ ⋅ b→|² = G(a→, b→).
This implies A(a→, b→) = √G(a→, b→).
If a→ ⊥ b→, then the parallelogram determined by a→ and b→ is called a rectangle. By Theorem 1.3.3, we see that if a→ ⊥ b→, then the area of the rectangle equals ǁa→ǁǁb→ǁ because a→ ⋅ b→ = 0.
To give a formula for A(a→, b→) in terms of the components of a→, b→ ∈ ℝⁿ, we prove Lagrange's identity on the Gram determinant.
Lemma 1.3.1.
(1.3.6)
G(a→, b→) = ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} | xᵢ xⱼ ; yᵢ yⱼ |²,
where | xᵢ xⱼ ; yᵢ yⱼ | denotes the 2 × 2 determinant xᵢyⱼ − xⱼyᵢ.
Proof
By computation, we have
ǁa→ǁ²ǁb→ǁ² = (∑_{i=1}^{n} xᵢ²)(∑_{j=1}^{n} yⱼ²) = ∑_{i=1}^{n} ∑_{j=1}^{n} xᵢ²yⱼ²
= ∑_{i=1}^{n} (xᵢyᵢ)² + ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} xᵢ²yⱼ² + ∑_{j=1}^{n−1} ∑_{i=j+1}^{n} xᵢ²yⱼ²
= ∑_{i=1}^{n} (xᵢyᵢ)² + ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} xᵢ²yⱼ² + ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} xⱼ²yᵢ²
= ∑_{i=1}^{n} (xᵢyᵢ)² + ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} (xᵢ²yⱼ² + xⱼ²yᵢ²)
and
(a→ ⋅ b→)² = (∑_{i=1}^{n} xᵢyᵢ)² = (∑_{i=1}^{n} xᵢyᵢ)(∑_{j=1}^{n} xⱼyⱼ) = ∑_{i=1}^{n} ∑_{j=1}^{n} (xᵢyᵢ)(xⱼyⱼ)
= ∑_{i=1}^{n} (xᵢyᵢ)² + ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} (xᵢyᵢ)(xⱼyⱼ) + ∑_{j=1}^{n−1} ∑_{i=j+1}^{n} (xᵢyᵢ)(xⱼyⱼ)
= ∑_{i=1}^{n} (xᵢyᵢ)² + 2 ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} (xᵢyᵢ)(xⱼyⱼ).
Hence,
G(a→, b→) = ǁa→ǁ²ǁb→ǁ² − (a→ ⋅ b→)²
= [∑_{i=1}^{n} (xᵢyᵢ)² + ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} (xᵢ²yⱼ² + xⱼ²yᵢ²)] − [∑_{i=1}^{n} (xᵢyᵢ)² + 2 ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} (xᵢyᵢ)(xⱼyⱼ)]
= ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} [(xᵢ²yⱼ² + xⱼ²yᵢ²) − 2(xᵢyᵢ)(xⱼyⱼ)] = ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} (xᵢyⱼ − xⱼyᵢ)².
By Theorem 1.3.3 and Lemma 1.3.1, we obtain the following formula for A(a→, b→) in terms of the components of a→, b→ ∈ ℝⁿ:
(1.3.7)
A(a→, b→) = √(∑_{i=1}^{n−1} ∑_{j=i+1}^{n} | xᵢ xⱼ ; yᵢ yⱼ |²).
The formula for A(a→, b→) given in (1.3.7) depends only on the components of a→ and b→; norms and dot products are not involved. However, when n ≥ 4, it contains many 2 × 2 determinants, so we state only the simple cases n = 2 and n = 3 as a corollary. For n ≥ 4, it is simpler to use (1.3.5) to compute A(a→, b→).
Corollary 1.3.1.
1. Let a→ = (x₁, x₂) and b→ = (y₁, y₂). Then
G(a→, b→) = | x₁ x₂ ; y₁ y₂ |² and A(a→, b→) = | x₁y₂ − x₂y₁ |.
2. Let a→ = (x₁, x₂, x₃) and b→ = (y₁, y₂, y₃). Then
G(a→, b→) = | x₁ x₂ ; y₁ y₂ |² + | x₁ x₃ ; y₁ y₃ |² + | x₂ x₃ ; y₂ y₃ |²
and
A(a→, b→) = √(| x₁ x₂ ; y₁ y₂ |² + | x₁ x₃ ; y₁ y₃ |² + | x₂ x₃ ; y₂ y₃ |²).
In Corollary 1.3.1 (2), the three determinants are obtained by deleting the third, second, and first columns of the following matrix, respectively:
( x₁ x₂ x₃ ; y₁ y₂ y₃ ).
Example 1.3.3.
1. Find A(a→, b→), where a→ = (1, 2) and b→ = (0, −1).
2. Find A(a→, b→) and ǁb→ − Proja→ b→ǁ, where a→ = (1, −2, 0) and b→ = (−5, 0, 2).
3. Find A(a→, b→) and ǁb→ − Proja→ b→ǁ, where a→ = (2, −1, 1, −1) and b→ = (0, 1, −1, 2).
Solution
1. By Corollary 1.3.1 (1), A(a→, b→) = | (1)(−1) − (2)(0) | = |−1| = 1.
2. Because
G(a→, b→) = | 1 −2 ; −5 0 |² + | 1 0 ; −5 2 |² + | −2 0 ; 0 2 |² = (−10)² + 2² + (−4)² = 120,
we obtain A(a→, b→) = √120 = 2√30. By (1.3.4), ǁb→ − Proja→ b→ǁ = A(a→, b→)/ǁa→ǁ = 2√30/√5 = 2√6.
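The Gram-determinant route of Theorem 1.3.3 is convenient to compute in any dimension. A minimal sketch in plain Python, checked against Example 1.3.3 (2); the helper names are my own:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram(a, b):
    """Gram determinant G(a, b) = ||a||^2 ||b||^2 - (a.b)^2, as in (1.3.5)."""
    return dot(a, a) * dot(b, b) - dot(a, b) ** 2

def area(a, b):
    """Area of the parallelogram determined by a and b (Theorem 1.3.3)."""
    return math.sqrt(gram(a, b))

a, b = [1, -2, 0], [-5, 0, 2]
print(gram(a, b))  # 120, matching Example 1.3.3 (2)
print(area(a, b))  # sqrt(120) = 2*sqrt(30)
```

For integer components, `gram` involves only exact integer arithmetic, so it is a reliable cross-check on hand computations.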
Exercises
1. For each pair of vectors a→ and b→, find Proja→ b→ and verify that (b→ − Proja→ b→) ⊥ a→.
1. a→ = (−1, 2), b→ = (1, 1);
2. a→ = (2, −1, 1), b→ = (1, −1, 2);
3. a→ = (−1, −2, 2), b→ = (1, 0, −1);
4. a→ = (0, −1, 1, −1), b→ = (1, 1, −1, 2).
6. Find G(a→, b→) and ǁb→ − Proja→ b→ǁ for each pair of vectors.
a. a→ = (−1, 2), b→ = (1, −1);
b. a→ = (2, −1, 1), b→ = (1, −1, 2);
c. a→ = (1, 2, 0, 1), b→ = (1, −1, 2, 3);
d. a→ = (2, −1, 1, 0), b→ = (3, 0, 1, 3).
1.4 Linear combinations and spanning spaces
Let
(1.4.1)
a→₁ = (a₁₁, a₂₁, ⋯, aₘ₁)ᵀ, a→₂ = (a₁₂, a₂₂, ⋯, aₘ₂)ᵀ, ⋯, a→ₙ = (a₁ₙ, a₂ₙ, ⋯, aₘₙ)ᵀ
be vectors in ℝᵐ. Let b→ = (b₁, b₂, ⋯, bₘ)ᵀ and
(1.4.2)
S = { a→₁, a→₂, ⋯, a→ₙ }.
Definition 1.4.1.
If there exist x₁, x₂, ⋯, xₙ ∈ ℝ such that
(1.4.3)
b→ = x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ,
then b→ is said to be a linear combination of a→₁, a→₂, ⋯, a→ₙ or of S.
Example 1.4.1.
Let
a→₁ = (1, 0, 2, 4)ᵀ, a→₂ = (2, 3, 1, 2)ᵀ, a→₃ = (0, 1, 3, 1)ᵀ, a→₄ = (0, 0, 1, 0)ᵀ, a→₅ = (1, 0, 0, 0)ᵀ.
i. Let b→ = (−4, −9, 1, 2)ᵀ. Verify that 2a→₁ − 3a→₂ = b→. Hence, b→ is a linear combination of a→₁, a→₂.
ii. Let b→ = (5, 7, 7, 9)ᵀ. Verify that a→₁ + 2a→₂ + a→₃ = b→. Hence, b→ is a linear combination of a→₁, a→₂, a→₃.
iii. Let b→ = (−1, −7, 6, 7)ᵀ. Verify that 3a→₁ − 2a→₂ − a→₃ + 5a→₄ = b→. Hence, b→ is a linear combination of a→₁, a→₂, a→₃, a→₄.
iv. Let b→ = (−5, 1, 2, 4)ᵀ. Verify that a→₁ + a→₂ − 2a→₃ + 5a→₄ − 8a→₅ = b→. Hence, b→ is a linear combination of a→₁, a→₂, a→₃, a→₄, a→₅.
Solution
i. 2a→₁ − 3a→₂ = 2(1, 0, 2, 4)ᵀ − 3(2, 3, 1, 2)ᵀ = (2 − 6, 0 − 9, 4 − 3, 8 − 6)ᵀ = (−4, −9, 1, 2)ᵀ = b→.
ii. a→₁ + 2a→₂ + a→₃ = (1 + 4 + 0, 0 + 6 + 1, 2 + 2 + 3, 4 + 4 + 1)ᵀ = (5, 7, 7, 9)ᵀ = b→.
iii. Let c→ = 3a→₁ − 2a→₂ − a→₃ + 5a→₄. Then
c→ = 3(1, 0, 2, 4)ᵀ − 2(2, 3, 1, 2)ᵀ − (0, 1, 3, 1)ᵀ + 5(0, 0, 1, 0)ᵀ = (−1, −7, 6, 7)ᵀ = b→.
iv. Let c→ = a→₁ + a→₂ − 2a→₃ + 5a→₄ − 8a→₅. Then
c→ = (1, 0, 2, 4)ᵀ + (2, 3, 1, 2)ᵀ − 2(0, 1, 3, 1)ᵀ + 5(0, 0, 1, 0)ᵀ − 8(1, 0, 0, 0)ᵀ = (−5, 1, 2, 4)ᵀ = b→.
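Verifications like those in Example 1.4.1 are componentwise sums and can be sketched in plain Python; the helper name `lin_comb` is my own:

```python
def lin_comb(coeffs, vectors):
    """Compute x1*a1 + ... + xn*an componentwise, for vectors given as lists."""
    size = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(size)]

a1, a2, a3 = [1, 0, 2, 4], [2, 3, 1, 2], [0, 1, 3, 1]
print(lin_comb([2, -3], [a1, a2]))        # [-4, -9, 1, 2], part (i)
print(lin_comb([1, 2, 1], [a1, a2, a3]))  # [5, 7, 7, 9], part (ii)
```

Each output equals the corresponding b→ from the example, confirming the hand computation.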
In Example 1.4.1, we verified that the vector b→ is a linear combination of some vectors from a→₁, a→₂, a→₃, a→₄, a→₅. For example, in Example 1.4.1 (i), we verified that
b→ = 2a→₁ − 3a→₂.
A natural question is how to find the coefficients 2 and −3 of a→₁ and a→₂ if a→₁, a→₂, and b→ are given. In general, given vectors a→₁, a→₂, ⋯, a→ₙ and a vector b→, how do we show that b→ is a linear combination of a→₁, a→₂, ⋯, a→ₙ? Equivalently, by Definition 1.4.1, how do we find the numbers x₁, x₂, ⋯, xₙ ∈ ℝ such that (1.4.3) holds?
In the following, we give only a simple example showing how to find these numbers; more detailed discussions appear in Section 4.6 after we study approaches to solving systems of linear equations in Sections 4.2–4.4.
Example 1.4.2.
Let a→₁ = (1, 2)ᵀ and a→₂ = (2, 1)ᵀ. Show that b→ = (8, 7)ᵀ is a linear combination of a→₁, a→₂.
Solution
Let x₁a→₁ + x₂a→₂ = b→. Then
x₁(1, 2)ᵀ + x₂(2, 1)ᵀ = (8, 7)ᵀ.
This implies
(x₁ + 2x₂, 2x₁ + x₂)ᵀ = (8, 7)ᵀ,
that is,
x₁ + 2x₂ = 8
2x₁ + x₂ = 7.
Solving the system, we get x₁ = 2 and x₂ = 3. Hence, b→ = 2a→₁ + 3a→₂ and b→ is a linear combination of a→₁ and a→₂.
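For the 2 × 2 case of Example 1.4.2, the coefficients can be found directly. The sketch below uses Cramer's rule as a small stand-in for the elimination methods of Sections 4.2–4.4, which the text develops later; the function name `solve2` is my own:

```python
def solve2(a1, a2, b):
    """Solve x1*a1 + x2*a2 = b for 2-component vectors by Cramer's rule.
    Assumes the determinant of (a1 a2) is nonzero."""
    det = a1[0] * a2[1] - a2[0] * a1[1]
    x1 = (b[0] * a2[1] - a2[0] * b[1]) / det
    x2 = (a1[0] * b[1] - b[0] * a1[1]) / det
    return x1, x2

print(solve2([1, 2], [2, 1], [8, 7]))  # (2.0, 3.0), as in Example 1.4.2
```

The recovered coefficients (2, 3) match the hand solution b→ = 2a→₁ + 3a→₂.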
Now, we discuss some special cases of Definition 1.4.1. In Definition 1.4.1, if n = 1, then b→ is a linear combination of a→₁ if and only if there exists x₁ ∈ ℝ such that
b→ = x₁a→₁.
If b→ ≠ 0→ and a→₁ ≠ 0→, then by Definition 1.1.6, b→ is parallel to a→₁. Hence, if b→ ≠ 0→ and a→₁ ≠ 0→, then b→ is a linear combination of a→₁ if and only if b→ is parallel to a→₁.
Theorem 1.4.1.
The zero vector 0→ ∈ ℝᵐ is a linear combination of S.
Proof
Let x₁ = x₂ = ⋯ = xₙ = 0. Then
0→ = 0a→₁ + 0a→₂ + ⋯ + 0a→ₙ = x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ.
The following result shows that if b→ is a linear combination of S and we add finitely many vectors to S, then b→ is a linear combination of all these vectors.
Theorem 1.4.2.
Assume that b→ is a linear combination of S. Let b→₁, b→₂, ⋯, b→ₖ be vectors in ℝᵐ. Then b→ is a linear combination of T, where
T = { a→₁, a→₂, ⋯, a→ₙ, b→₁, b→₂, ⋯, b→ₖ }.
Proof
To show that b→ is a linear combination of T, we need to find n + k numbers, say y₁, ⋯, yₙ, yₙ₊₁, ⋯, yₙ₊ₖ, such that
(1.4.4)
b→ = y₁a→₁ + y₂a→₂ + ⋯ + yₙa→ₙ + yₙ₊₁b→₁ + ⋯ + yₙ₊ₖb→ₖ.
Because b→ is a linear combination of S, by Definition 1.4.1, there exist {xᵢ ∈ ℝ : i ∈ Iₙ} such that
(1.4.5)
b→ = x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ.
Let
y₁ = x₁, ⋯, yₙ = xₙ, yₙ₊₁ = yₙ₊₂ = ⋯ = yₙ₊ₖ = 0.
Hence, we have
b→ = x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ = x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ + 0b→₁ + ⋯ + 0b→ₖ.
By Definition 1.4.1, b→ is a linear combination of T.
Example 1.4.3.
Let a→₁ = (1, 2, 3)ᵀ. Show that the following assertions hold.
1. 0→ = (0, 0, 0)ᵀ is a linear combination of a→₁.
2. Let b→ = (2, 4, 6)ᵀ. Then b→ is a linear combination of a→₁.
3. Let a→₂ = (1, 0, 1)ᵀ and a→₃ = (1, 1, 1)ᵀ. Then b→ = (2, 4, 6)ᵀ is a linear combination of a→₁, a→₂, and a→₃.
Solution
Result (1) follows from Theorem 1.4.1. Because b→ = 2a→₁, result (2) follows from Definition 1.4.1. Because b→ is a linear combination of a→₁, result (3) follows from Theorem 1.4.2.
Writing out the components of the vectors in (1.4.3), we have
(1.4.6)
b→ = x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ
= x₁(a₁₁, a₂₁, ⋯, aₘ₁)ᵀ + x₂(a₁₂, a₂₂, ⋯, aₘ₂)ᵀ + ⋯ + xₙ(a₁ₙ, a₂ₙ, ⋯, aₘₙ)ᵀ
= (a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ, a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ, ⋯, aₘ₁x₁ + aₘ₂x₂ + ⋯ + aₘₙxₙ)ᵀ.
Hence, (1.4.3) holds if and only if x₁, x₂, ⋯, xₙ satisfy the system
(1.4.7)
a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ = b₁
a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ = b₂
⋮
aₘ₁x₁ + aₘ₂x₂ + ⋯ + aₘₙxₙ = bₘ.
Now we introduce spanning spaces, which will be used in Section 4.6 and Chapters 5, 7, and 8. By Definition 1.4.1, the vector
x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ
is a linear combination of S for each set of x₁, ⋯, xₙ ∈ ℝ. We collect all these linear combinations together as a set called the spanning space of S.
Definition 1.4.2.
The spanning space of S, denoted by span S, is the set of all linear combinations of the vectors in S, that is,
(1.4.8)
span S = { x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ : xᵢ ∈ ℝ, i ∈ Iₙ }.
Theorem 1.4.3.
The following assertions are equivalent.
1. The vector b→ ∈ span S.
2. b→ is a linear combination of a→₁, a→₂, ⋯, a→ₙ.
Let a→₁, a→₂, b→ be the same as in Example 1.4.2. By Example 1.4.2, b→ is a linear combination of a→₁ and a→₂. By Theorem 1.4.3, b→ ∈ span{a→₁, a→₂}.
Example 1.4.4.
Let a→₁ = (6, 3)ᵀ, a→₂ = (−4, −2)ᵀ, and b→ = (2, 1)ᵀ. Show that b→ ∈ span{a→₁, a→₂}.
Solution
Let x₁a→₁ + x₂a→₂ = b→. Then
x₁(6, 3)ᵀ + x₂(−4, −2)ᵀ = (2, 1)ᵀ.
This implies
6x₁ − 4x₂ = 2
3x₁ − 2x₂ = 1.
The first equation is twice the second, so the system is equivalent to the single equation 3x₁ − 2x₂ = 1. Let x₂ = 1. Then 3x₁ − 2x₂ = 1 implies
x₁ = (1 + 2x₂)/3 = (1 + 2)/3 = 1.
Hence, b→ = a→₁ + a→₂ and b→ ∈ span{a→₁, a→₂}.
Example 1.4.5.
Let e→₁ = (1, 0)ᵀ and e→₂ = (0, 1)ᵀ. Show that ℝ² = span{e→₁, e→₂}.
Solution
span{e→₁, e→₂} = { x₁e→₁ + x₂e→₂ : x₁, x₂ ∈ ℝ } = { (x₁, x₂) : x₁, x₂ ∈ ℝ } = ℝ².
Similarly,
(1.4.9)
ℝⁿ = span{e→₁, e→₂, ⋯, e→ₙ},
where e→₁, e→₂, ⋯, e→ₙ are the standard vectors in ℝⁿ given in (1.1.1).
Example 1.4.6.
{ }
→ → → →
Let e 1 = (1, 0, 0) and e 2 = (0, 1, 0). Find span e 1 , e 2 .
Solution
{ }{ }{ }
→ → → →
span e 1 , e 2 = x e 1 + y e 2 : x, y ∈ ℝ = (x, y, 0) : x, y ∈ ℝ .
{ }
→ →
The above example shows that the spanning space span e 1 , e 2 is the xy-plane in ℝ 3.
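For vectors in ℝ², membership in span{a→₁, a→₂} can be tested directly, following the case analysis behind Example 1.4.4 (the spanning vectors there are parallel, so the span is a line). A minimal sketch in plain Python; the function name and tolerance handling are my own:

```python
def in_span(a1, a2, b, eps=1e-12):
    """Check whether b lies in span{a1, a2} for vectors in R^2.
    If det(a1 a2) != 0 the span is all of R^2; otherwise the span is the
    line through a nonzero spanning vector (or just {0})."""
    det = a1[0] * a2[1] - a2[0] * a1[1]
    if abs(det) > eps:
        return True                          # a1, a2 span R^2
    for a in (a1, a2):                       # degenerate case: span is a line
        if any(abs(x) > eps for x in a):
            return abs(a[0] * b[1] - a[1] * b[0]) < eps  # b parallel to a?
    return all(abs(x) < eps for x in b)      # span{0, 0} = {0}

print(in_span([6, 3], [-4, -2], [2, 1]))  # True, as in Example 1.4.4
print(in_span([6, 3], [-4, -2], [1, 0]))  # False: (1, 0) is not on that line
```

The general tool for this question in ℝⁿ is the linear system (1.4.7), studied in Sections 4.2–4.4.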
Exercises
1. Let a→₁ = (1, 0, 1)ᵀ, a→₂ = (1, 3, 2)ᵀ, and a→₃ = (−1, 1, 3)ᵀ.
i. Let b→ = (−1, −9, −4)ᵀ. Verify whether 2a→₁ − 3a→₂ = b→.
ii. Let b→ = (3, 6, 6)ᵀ. Verify whether a→₁ − 2a→₂ = b→.
iii. Let b→ = (2, 7, 7)ᵀ. Verify whether a→₁ − 2a→₂ + a→₃ = b→.
2. Let a→₁ = (1, 2)ᵀ, a→₂ = (1, 1)ᵀ, and b→ = (1, 0)ᵀ. Determine whether b→ is a linear combination of a→₁, a→₂.
3. Let a→₁ = (1, 1)ᵀ and a→₂ = (2, 1)ᵀ. Show whether b→ = (0, 1)ᵀ is a linear combination of a→₁, a→₂.
4. Let a→₁ = (1, −1, 1)ᵀ, a→₂ = (0, 1, 1)ᵀ, a→₃ = (1, 3, 2)ᵀ, and a→₄ = (0, 0, 1)ᵀ.
1. Is 0→ = (0, 0, 0)ᵀ a linear combination of a→₁, a→₂, a→₃, a→₄?
2. Is b→ = (1, 0, 2)ᵀ a linear combination of a→₁, a→₂, a→₃, a→₄?
5. Let a→₁ = (1, 2)ᵀ, a→₂ = (2, −1)ᵀ, and b→ = (4, 3)ᵀ. Show that b→ ∈ span{a→₁, a→₂}.
Chapter 2 Matrices
2.1 Matrices
Definition of a matrix
Definition 2.1.1.
An m × n matrix A is a rectangular array of mn numbers arranged in m rows and n columns:
(2.1.1)
A = ( a₁₁ a₁₂ ⋯ a₁ₙ ; a₂₁ a₂₂ ⋯ a₂ₙ ; ⋮ ; aₘ₁ aₘ₂ ⋯ aₘₙ ).
The ijth component of A, denoted aᵢⱼ, is the number appearing in the ith row and jth column of A. aᵢⱼ is also called the (i, j)-entry of A, or an entry or element of A if there is no confusion. Sometimes it is useful to write A = (aᵢⱼ). Capital letters are usually used to denote matrices. m × n is called the size of A.
Let
(2.1.2)
a→₁ = (a₁₁, a₂₁, ⋯, aₘ₁)ᵀ, a→₂ = (a₁₂, a₂₂, ⋯, aₘ₂)ᵀ, ⋯, a→ₙ = (a₁ₙ, a₂ₙ, ⋯, aₘₙ)ᵀ.
The vectors a→₁, a→₂, ⋯, a→ₙ are called the column vectors of the matrix A. We can rewrite the matrix A as
(2.1.3)
A = (a→₁, a→₂, ⋯, a→ₙ).
Let
(2.1.4)
r→₁ = (a₁₁, a₁₂, ⋯, a₁ₙ),
r→₂ = (a₂₁, a₂₂, ⋯, a₂ₙ),
⋮
r→ₘ = (aₘ₁, aₘ₂, ⋯, aₘₙ).
The vectors r→₁, r→₂, ⋯, r→ₘ are called the row vectors of A. We can rewrite the matrix A as
(2.1.5)
A = ( r→₁ ; r→₂ ; ⋮ ; r→ₘ ).
Note that a→₁, a→₂, ⋯, a→ₙ are in ℝᵐ while r→₁, r→₂, ⋯, r→ₘ are in ℝⁿ.
Example 2.1.1.
The size of the zero matrix
( 0 0 0 ; 0 0 0 ; 0 0 0 ; 0 0 0 )
is 4 × 3.
Example 2.1.2.
Let A = ( 1 0 0 1 ; 2 2 1 1 ; 0 1 2 4 ). Use the column vectors and the row vectors of A to rewrite A.
Solution
Let
a→₁ = (1, 2, 0)ᵀ, a→₂ = (0, 2, 1)ᵀ, a→₃ = (0, 1, 2)ᵀ, and a→₄ = (1, 1, 4)ᵀ.
Then A can be rewritten as A = (a→₁ a→₂ a→₃ a→₄).
Let r→₁ = (1, 0, 0, 1), r→₂ = (2, 2, 1, 1), and r→₃ = (0, 1, 2, 4). Then A can be rewritten as A = ( r→₁ ; r→₂ ; r→₃ ).
Definition 2.1.2.
An m × n matrix all of whose entries are zero is called a zero matrix.
Sometimes, we denote a zero matrix by 0 if there is no confusion. A 1 × n matrix and an m × 1 matrix can be treated as a row vector and a column vector, respectively.
Example 2.1.3.
1. The matrices
A = (0), B = ( 0 0 0 ; 0 0 0 ), and C = ( 0 0 0 ; 0 0 0 ; 0 0 0 )
are zero matrices.
3. (0, 0, 0, 0)ᵀ is a 4 × 1 column matrix and (2, 1, 3)ᵀ is a 3 × 1 column matrix.
Transpose of a matrix
Let A = (aᵢⱼ) be an m × n matrix defined in (2.1.1). The transpose of A, denoted by Aᵀ, is the n × m matrix obtained by interchanging the rows and columns of A. Hence, Aᵀ = (aⱼᵢ), or
(2.1.6)
Aᵀ = ( a₁₁ a₂₁ ⋯ aₘ₁ ; a₁₂ a₂₂ ⋯ aₘ₂ ; ⋮ ; a₁ₙ a₂ₙ ⋯ aₘₙ ).
From (2.1.6), we see that the ith column of Aᵀ and the ith row of A are the same. It is obvious that (Aᵀ)ᵀ = A.
Example 2.1.4.
Find the transposes of
A = (8), B = (1 3 5), and C = ( 7 9 ; 18 31 ; 52 68 ).
Solution
Aᵀ = (8), Bᵀ = (1, 3, 5)ᵀ, and Cᵀ = ( 7 18 52 ; 9 31 68 ).
Operations on matrices
Definition 2.1.3.
Two matrices A and B are said to be equal if the following conditions hold:
1. A and B have the same size;
2. the corresponding entries of A and B are equal.
If A and B are equal, then we write A = B. Let A = (aᵢⱼ) and B = (bᵢⱼ) be two m × n matrices. Then A = B if and only if aᵢⱼ = bᵢⱼ for all i ∈ Iₘ and j ∈ Iₙ.
Example 2.1.5.
Let A = ( 2 1 ; 3 x² ) and B = ( 2 1 ; 3 4 ). Find all x ∈ ℝ such that A = B.
Solution
Note that A and B have the same size. Hence, A = B if and only if x² = 4. This implies that when x = 2 or x = −2, A = B.
Example 2.1.6.
Let A = ( a 2x + y ; b 4x + 3y ) and B = ( 1 1 ; 3 6 ). Find all a, b, x, y ∈ ℝ such that A = B.
Solution
Note that A and B have the same size. Hence, A = B if and only if a = 1, b = 3, and x, y satisfy the following system:
2x + y = 1
4x + 3y = 6.
Solving the system gives x = −3/2 and y = 4.
Example 2.1.7.
Let A = (1 2) and B = (1, 2)ᵀ be two matrices. Determine if A = B.
Solution
The sizes of A and B are 1 × 2 and 2 × 1, respectively. Because A and B have different sizes, A ≠ B.
Note that in Example 2.1.7, if we treat A and B as vectors, then they are equal vectors.
Definition 2.1.4.
Let A = (aᵢⱼ) and B = (bᵢⱼ) be m × n matrices and let k ∈ ℝ.
Addition: A + B = ( a₁₁ + b₁₁  a₁₂ + b₁₂  ⋯  a₁ₙ + b₁ₙ ; a₂₁ + b₂₁  a₂₂ + b₂₂  ⋯  a₂ₙ + b₂ₙ ; ⋮ ; aₘ₁ + bₘ₁  aₘ₂ + bₘ₂  ⋯  aₘₙ + bₘₙ ).
Subtraction: A − B = ( a₁₁ − b₁₁  a₁₂ − b₁₂  ⋯  a₁ₙ − b₁ₙ ; a₂₁ − b₂₁  a₂₂ − b₂₂  ⋯  a₂ₙ − b₂ₙ ; ⋮ ; aₘ₁ − bₘ₁  aₘ₂ − bₘ₂  ⋯  aₘₙ − bₘₙ ).
Scalar multiplication: kA = ( ka₁₁ ka₁₂ ⋯ ka₁ₙ ; ka₂₁ ka₂₂ ⋯ ka₂ₙ ; ⋮ ; kaₘ₁ kaₘ₂ ⋯ kaₘₙ ).
Example 2.1.8.
Let
A = ( 3 4 2 ; 2 −3 0 ) and B = ( −4 1 0 ; 5 −6 1 ).
Compute
i. A + B;
ii. A − B;
iii. −A;
iv. 3A.
Solution
i. A + B = ( 3 − 4  4 + 1  2 + 0 ; 2 + 5  −3 − 6  0 + 1 ) = ( −1 5 2 ; 7 −9 1 ).
ii. A − B = ( 3 − (−4)  4 − 1  2 − 0 ; 2 − 5  −3 − (−6)  0 − 1 ) = ( 7 3 2 ; −3 3 −1 ).
iii. −A = ( (−1)(3) (−1)(4) (−1)(2) ; (−1)(2) (−1)(−3) (−1)(0) ) = ( −3 −4 −2 ; −2 3 0 ).
iv. 3A = ( (3)(3) (3)(4) (3)(2) ; (3)(2) (3)(−3) (3)(0) ) = ( 9 12 6 ; 6 −9 0 ).
The following result can be easily proved and its proof is left to the reader.
Theorem 2.1.1.
Let A, B, and C be m × n matrices, let 0 denote the m × n zero matrix, and let α, β ∈ ℝ. Then
1. A + 0 = A,
2. 0A = 0,
3. A + B = B + A,
4. (A + B) + C = A + (B + C),
5. α(A + B) = αA + αB,
6. (α + β)A = αA + βA,
7. (αβ)A = α(βA),
8. 1A = A,
9. (A + B)ᵀ = Aᵀ + Bᵀ,
10. (A − B)ᵀ = Aᵀ − Bᵀ,
11. (αA)ᵀ = αAᵀ.
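The entrywise operations of Definition 2.1.4 and a property such as (A + B)ᵀ = Aᵀ + Bᵀ from Theorem 2.1.1 can be sketched in plain Python, with matrices as lists of rows; the helper names are my own:

```python
def add(A, B):
    """Entrywise sum of two same-size matrices (Definition 2.1.4)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scal(k, A):
    """Scalar multiplication kA (Definition 2.1.4)."""
    return [[k * x for x in row] for row in A]

def transpose(A):
    """Interchange rows and columns, as in (2.1.6)."""
    return [list(col) for col in zip(*A)]

A = [[3, 4, 2], [2, -3, 0]]
B = [[-4, 1, 0], [5, -6, 1]]
print(add(A, B))  # [[-1, 5, 2], [7, -9, 1]], matching Example 2.1.8 (i)
# Property 9 of Theorem 2.1.1: (A + B)^T = A^T + B^T
print(transpose(add(A, B)) == add(transpose(A), transpose(B)))  # True
```

Checking one identity on one pair of matrices is of course not a proof, only a sanity check; the proofs are the entrywise computations the theorem leaves to the reader.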
Example 2.1.9.
Let
A = ( 1 0 1 ; 2 1 1 ; 0 0 2 ), B = ( 1 2 3 ; 0 0 0 ; 1 4 2 ), and C = ( 1 0 0 ; 0 1 1 ; 0 0 2 ).
Compute 2A − B + C and [2(A + B)]ᵀ.
Solution
2A − B + C = (2A − B) + C
= ( 2 0 2 ; 4 2 2 ; 0 0 4 ) − ( 1 2 3 ; 0 0 0 ; 1 4 2 ) + ( 1 0 0 ; 0 1 1 ; 0 0 2 )
= ( 1 −2 −1 ; 4 2 2 ; −1 −4 2 ) + ( 1 0 0 ; 0 1 1 ; 0 0 2 ) = ( 2 −2 −1 ; 4 3 3 ; −1 −4 4 ).
By Theorem 2.1.1,
[2(A + B)]ᵀ = 2(Aᵀ + Bᵀ) = 2[ ( 1 2 0 ; 0 1 0 ; 1 1 2 ) + ( 1 0 1 ; 2 0 4 ; 3 0 2 ) ]
= 2( 2 2 1 ; 2 1 4 ; 4 1 4 ) = ( 4 4 2 ; 4 2 8 ; 8 2 8 ).
Example 2.1.10.
Find the matrix A such that
[ 2Aᵀ − ( 1 2 ; −1 0 )ᵀ ]ᵀ = ( 0 0 ; −1 1 ).
Solution
Because
[ 2Aᵀ − ( 1 2 ; −1 0 )ᵀ ]ᵀ = (2Aᵀ)ᵀ − [ ( 1 2 ; −1 0 )ᵀ ]ᵀ = 2(Aᵀ)ᵀ − ( 1 2 ; −1 0 ) = 2A − ( 1 2 ; −1 0 ),
we obtain
2A − ( 1 2 ; −1 0 ) = ( 0 0 ; −1 1 )
and
2A = ( 0 0 ; −1 1 ) + ( 1 2 ; −1 0 ) = ( 1 2 ; −2 1 ).
Hence, A = (1/2)( 1 2 ; −2 1 ) = ( 1/2 1 ; −1 1/2 ).
Exercises
1. Find the sizes of the following matrices:
A = (2), B = ( 5 8 ; 1 4 ), C = ( 10 3 8 ; 10 3 4 ), D = ( 1 0 0 ; 0 10 2 ; 0 0 1 ; 6 9 10 ).
2. Use column vectors and row vectors to rewrite each of the following matrices.
A = ( 3 3 2 ; 1 8 1 ; 5 4 10 ), B = ( 1 9 7 2 ; 3 10 8 7 ; 4 3 10 4 ), C = ( 9 8 7 ; 6 5 4 ; 3 2 1 ; 0 −1 −2 ).
3. Find the transpose of each of the following matrices:
A = (4), B = (3 11 2), C = (33, 8, 12)ᵀ, D = ( 3 9 −19 ; −2 8 −7 ; 5 3 −9 ).
4. Let A = ( 9 3 ; 6 x² ) and B = ( 9 3 ; 6 4 ). Find all x ∈ ℝ such that A = B.
5. Let C = ( 12 5 25 ; 19 4 6 ) and D = ( 12 5 x² ; 19 4 6 ). Find all x ∈ ℝ such that C = D.
6. Let E = ( 120 25 122 ; 123 124 125 ; 126 127 128 ) and F = ( 120 x² 122 ; 123 124 x³ ; 26x − 4 127 128 ). Find all x ∈ ℝ such that E = F.
7. Let A = ( a 3x + 2y ; −x + y b ) and B = ( 1 1 ; 3 6 ). Find all a, b, x, y ∈ ℝ such that A = B.
8. Let
C = ( a + b x − 2y ; 5x + 3y 2b − a ) and D = ( 6 8 ; 14 0 ).
Find all a, b, x, y ∈ ℝ such that C = D.
9. Let A = ( −2 3 4 ; 6 −1 −8 ) and B = ( 7 −8 9 ; 0 −1 0 ). Compute
i. A + B;
ii. −A;
iii. 4A − 2B;
iv. 100A + B.
10. Let A = ( 9 5 1 ; 8 0 0 ; 0 3 2 ), B = ( 2 2 2 ; 0 0 0 ; 4 6 8 ), and C = ( 4 7 0 ; 0 3 1 ; 0 0 2 ).
Compute
1. 3A − 2B + C,
2. [3(A + B)]ᵀ,
3. (4A + (1/2)B − C)ᵀ.
11. Find the matrix B such that
(1/2)(B + ( 6 3 ; 8 3 ; 1 4 )ᵀ) − ( −5 6 ; −3 8 ; −4 −9 )ᵀ = ( 23 −16 17 ; −16 26.5 2 ).
2.2 Product of two matrices
Let A be the m × n matrix in (2.1.1) and let X→ = (x₁, x₂, ⋯, xₙ)ᵀ ∈ ℝⁿ. The product AX→ is the vector in ℝᵐ defined by
( a₁₁ a₁₂ ⋯ a₁ₙ ; a₂₁ a₂₂ ⋯ a₂ₙ ; ⋮ ; aₘ₁ aₘ₂ ⋯ aₘₙ )( x₁ ; x₂ ; ⋮ ; xₙ ) = ( a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ ; a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ ; ⋮ ; aₘ₁x₁ + aₘ₂x₂ + ⋯ + aₘₙxₙ ).
Note that AX→ is a vector in ℝᵐ, where m is the number of rows of A, while X→ is a vector in ℝⁿ, where n is the number of columns of A. AX→ is called the image of X→ under A.
We denote by A(ℝⁿ) the set of all the vectors AX→ as X→ varies in ℝⁿ, that is,
(2.2.1)
A(ℝⁿ) = { AX→ ∈ ℝᵐ : X→ ∈ ℝⁿ }.
By Definition 2.2.1, we see that
(2.2.2)
AX→ = ( a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ ; a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ ; ⋮ ; aₘ₁x₁ + aₘ₂x₂ + ⋯ + aₘₙxₙ ) = ( r→₁ ⋅ X→ ; r→₂ ⋅ X→ ; ⋮ ; r→ₘ ⋅ X→ ),
where r→₁, r→₂, ⋯, r→ₘ are the row vectors of A given in (2.1.4).
Example 2.2.1.
Let
A = ( 1 2 0 ; 3 −1 4 ), X→ = (x₁, x₂, x₃)ᵀ, X→₁ = (1, 2, 1)ᵀ, and X→₂ = (0, 2, 2)ᵀ.
Compute AX→, AX→₁, and AX→₂.
Solution
Let r→₁ = (1, 2, 0) and r→₂ = (3, −1, 4) be the row vectors of A. Then
r→₁ ⋅ X→ = (1)(x₁) + (2)(x₂) + (0)(x₃) = x₁ + 2x₂
and
r→₂ ⋅ X→ = (3)(x₁) + (−1)(x₂) + (4)(x₃) = 3x₁ − x₂ + 4x₃.
Hence,
AX→ = ( r→₁ ⋅ X→ ; r→₂ ⋅ X→ ) = ( x₁ + 2x₂ ; 3x₁ − x₂ + 4x₃ ).
We can also write the computation of AX→ in a single step:
AX→ = ( 1 2 0 ; 3 −1 4 )( x₁ ; x₂ ; x₃ ) = ( (1)(x₁) + (2)(x₂) + (0)(x₃) ; (3)(x₁) + (−1)(x₂) + (4)(x₃) ) = ( x₁ + 2x₂ ; 3x₁ − x₂ + 4x₃ ).
Similarly,
AX→₁ = ( (1)(1) + (2)(2) + (0)(1) ; (3)(1) + (−1)(2) + (4)(1) ) = ( 5 ; 5 )
and
AX→₂ = ( (1)(0) + (2)(2) + (0)(2) ; (3)(0) + (−1)(2) + (4)(2) ) = ( 4 ; 6 ).
Example 2.2.2.
Let
A = ( −4 1 ; 5 −1 ; 3 0 ; 2 4 ), X→ = (1, 1)ᵀ, and X→₁ = (0, 4)ᵀ.
Compute AX→ and AX→₁.
Solution
AX→ = ( (−4)(1) + (1)(1) ; (5)(1) + (−1)(1) ; (3)(1) + (0)(1) ; (2)(1) + (4)(1) ) = ( −3 ; 4 ; 3 ; 6 )
and
AX→₁ = ( (−4)(0) + (1)(4) ; (5)(0) + (−1)(4) ; (3)(0) + (0)(4) ; (2)(0) + (4)(4) ) = ( 4 ; −4 ; 0 ; 16 ).
From (2.2.2), a matrix and a vector can be multiplied together only when the number of columns of the matrix equals the number of components of the vector. Hence, if the vectors r→ᵢ and X→ have different numbers of components, then the scalar product r→ᵢ ⋅ X→ is not defined.
Example 2.2.3.
Let
A = ( −1 1 ; 2 −1 ; 3 1 ), X→₁ = (1, 1, −1)ᵀ, X→₂ = (0, 1)ᵀ, and X→₃ = (1, 1, 2, 4)ᵀ.
Determine whether AX→ᵢ is defined for i = 1, 2, 3.
Solution
A has two columns. Hence, AX→₂ is defined, but AX→₁ and AX→₃ are not defined.
Theorem 2.2.1.
Let A be the same as in (2.1.1), X→, Y→ ∈ ℝⁿ, and α, β ∈ ℝ. Then
A(αX→ + βY→) = α(AX→) + β(AY→).
Example 2.2.4.
Let
A = ( 1 1 0 ; 2 0 −1 ; 1 0 1 ), X→ = (1, 2, −1)ᵀ, and Y→ = (0, 1, −1)ᵀ.
Compute A(2X→ + 3Y→).
Solution
A(2X→ + 3Y→) = 2(AX→) + 3(AY→)
= 2( 1 1 0 ; 2 0 −1 ; 1 0 1 )( 1 ; 2 ; −1 ) + 3( 1 1 0 ; 2 0 −1 ; 1 0 1 )( 0 ; 1 ; −1 )
= 2( 3 ; 3 ; 0 ) + 3( 1 ; 1 ; −1 ) = ( 6 ; 6 ; 0 ) + ( 3 ; 3 ; −3 ) = ( 9 ; 9 ; −3 ).
→ → →
→
Let a 1, a 2, ⋯, a n be the column vectors of A given in (2.1.2) . By (1.4.6) and (2.2.2) , AX can be
expressed by a linear combination of these column vectors:
(2.2.3)
→ → →
→
AX = x 1 a 1 + x 2 a 2 + ⋯ + x n a n .
Example 2.2.5.
Let A = ( 1 2 3 ; 4 5 6 ; 7 8 9 ) and X→ = (2, 3, 4)ᵀ. Write AX→ as a linear combination of the column vectors of A.
Solution
AX→ = 2( 1 ; 4 ; 7 ) + 3( 2 ; 5 ; 8 ) + 4( 3 ; 6 ; 9 ).
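The column view (2.2.3) of the product can also be computed directly, which makes it easy to compare against the row view of (2.2.2); the function name is my own:

```python
def matvec_by_columns(A, x):
    """A X via (2.2.3): x1*a1 + ... + xn*an, where ai are the columns of A."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    out = [0] * m
    for xj, col in zip(x, cols):
        out = [o + xj * c for o, c in zip(out, col)]
    return out

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# 2*(1,4,7) + 3*(2,5,8) + 4*(3,6,9), as in Example 2.2.5
print(matvec_by_columns(A, [2, 3, 4]))  # [20, 47, 74]
```

Both views give the same vector, which is exactly the content of (2.2.2) and (2.2.3).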
The following result gives the relation between A(ℝⁿ) and the spanning space.
Theorem 2.2.2.
Let S = { a→₁, a→₂, ⋯, a→ₙ } be the set of the column vectors of A given in (2.1.2). Then
(2.2.4)
A(ℝⁿ) = span S.
Proof
Let b→ = (b₁, b₂, ⋯, bₘ)ᵀ ∈ A(ℝⁿ). By (2.2.1), there exists a vector X→ ∈ ℝⁿ such that AX→ = b→. By (2.2.3) and (1.4.8),
b→ = AX→ = x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ ∈ span S.
This shows that A(ℝⁿ) is a subset of span S. Conversely, if b→ ∈ span S, then by Theorem 1.4.3, there exists a vector X→ ∈ ℝⁿ such that
b→ = x₁a→₁ + x₂a→₂ + ⋯ + xₙa→ₙ.
This, together with (2.2.3) and (2.2.1), implies b→ = AX→ ∈ A(ℝⁿ). This shows that span S is a subset of A(ℝⁿ). Hence, (2.2.4) holds.
Let
B = ( b₁₁ b₁₂ ⋯ b₁ᵣ ; b₂₁ b₂₂ ⋯ b₂ᵣ ; ⋮ ; bₙ₁ bₙ₂ ⋯ bₙᵣ )
be an n × r matrix and let
b→₁ = (b₁₁, b₂₁, ⋯, bₙ₁)ᵀ, b→₂ = (b₁₂, b₂₂, ⋯, bₙ₂)ᵀ, ⋯, b→ᵣ = (b₁ᵣ, b₂ᵣ, ⋯, bₙᵣ)ᵀ
be the column vectors of B.
Definition 2.2.2.
The product AB is the m × r matrix whose (i, j)-entry is r→ᵢ ⋅ b→ⱼ, that is,
(2.2.5)
AB = ( r→₁ ⋅ b→₁  r→₁ ⋅ b→₂  ⋯  r→₁ ⋅ b→ᵣ ; r→₂ ⋅ b→₁  r→₂ ⋅ b→₂  ⋯  r→₂ ⋅ b→ᵣ ; ⋮ ; r→ₘ ⋅ b→₁  r→ₘ ⋅ b→₂  ⋯  r→ₘ ⋅ b→ᵣ ).
Equivalently, in terms of the column vectors of B,
(2.2.6)
AB = ( Ab→₁  Ab→₂  ⋯  Ab→ᵣ ).
In particular, the first row of AB is r→₁ ⋅ b→₁, r→₁ ⋅ b→₂, ⋯, r→₁ ⋅ b→ᵣ, the second row is r→₂ ⋅ b→₁, r→₂ ⋅ b→₂, ⋯, r→₂ ⋅ b→ᵣ, and the mth row is r→ₘ ⋅ b→₁, r→ₘ ⋅ b→₂, ⋯, r→ₘ ⋅ b→ᵣ.
Example 2.2.6.
Let
A = ( 1 3 ; −2 4 ) and B = ( 3 −2 0 ; 5 6 0 ).
Compute AB.
Solution
The first row of AB is
(1 3)( 3 −2 0 ; 5 6 0 ) = ( (1)(3) + (3)(5), (1)(−2) + (3)(6), (1)(0) + (3)(0) ) = (18, 16, 0)
and the second row is
(−2 4)( 3 −2 0 ; 5 6 0 ) = ( (−2)(3) + (4)(5), (−2)(−2) + (4)(6), (−2)(0) + (4)(0) ) = (14, 28, 0).
Hence,
AB = ( 1 3 ; −2 4 )( 3 −2 0 ; 5 6 0 ) = ( 18 16 0 ; 14 28 0 ).
Example 2.2.7.
Let
A = ( 1 2 4 ; 2 6 0 ) and B = ( 4 1 4 ; 0 −1 3 ; 2 7 5 ).
Compute AB.
Solution
The first row of AB is
(1 2 4)( 4 1 4 ; 0 −1 3 ; 2 7 5 ) = ( (1)(4) + (2)(0) + (4)(2), (1)(1) + (2)(−1) + (4)(7), (1)(4) + (2)(3) + (4)(5) ) = (12, 27, 30)
and the second row is
(2 6 0)( 4 1 4 ; 0 −1 3 ; 2 7 5 ) = ( (2)(4) + (6)(0) + (0)(2), (2)(1) + (6)(−1) + (0)(7), (2)(4) + (6)(3) + (0)(5) ) = (8, −4, 26).
Hence,
AB = ( 12 27 30 ; 8 −4 26 ).
Example 2.2.8.
Let
A = ( 2 0 −3 ; 4 1 5 ) and B = ( 7 −1 4 7 ; 2 5 0 −4 ; −3 1 2 3 ).
Compute AB. We always follow the order used in the above two examples to compute AB, but we do not need to mention the order every time.
Solution
AB = ( 2 0 −3 ; 4 1 5 )( 7 −1 4 7 ; 2 5 0 −4 ; −3 1 2 3 ) = ( 23 −5 2 5 ; 15 6 26 39 ).
In (2.2.6), for each i = 1, 2, ⋯, r, the product Ab→ᵢ requires that the number of columns of the matrix A be equal to the number of components of the vector b→ᵢ. Hence, the product of two matrices A and B requires that the number of columns of A be equal to the number of rows of B. Symbolically,
(2.2.7)
(m × n)(n × r) = m × r.
The inner numbers must be the same, and the outer numbers give the size of the product AB. Hence, if the inner numbers are not the same, the product AB is not defined.
Example 2.2.9.
1. Let A, B, and C be matrices such that AB = C. Assume that the sizes of A and C are 5 × 3 and 5 × 4, respectively. Find the size of B.
2. Let
A = ( 1 0 ; 2 1 ; 3 0 ) and B = ( 1 2 ; 3 4 ).
Are AB and BA defined? Compute those that are defined.
3. Let A = ( 1 3 ; −2 4 ) and B = ( 3 −2 ; 5 6 ). Are AB and BA defined?
Solution
1. Because AB = C and the sizes of A and C are 5 × 3 and 5 × 4, respectively, by (2.2.7), the size of B is 3 × 4.
2. The size of A is 3 × 2 and the size of B is 2 × 2. Hence, AB is defined because the number of columns of A and the number of rows of B are the same and equal 2:
AB = ( 1 0 ; 2 1 ; 3 0 )( 1 2 ; 3 4 ) = ( 1 2 ; 5 8 ; 3 6 ).
BA is not defined because the number of columns of B is 2 and the number of rows of A is 3, and they are not the same.
3. AB and BA are both defined and have size 2 × 2:
AB = ( 1 3 ; −2 4 )( 3 −2 ; 5 6 ) = ( 18 16 ; 14 28 )
and
BA = ( 3 −2 ; 5 6 )( 1 3 ; −2 4 ) = ( 7 1 ; −7 39 ).
Hence, AB ≠ BA.
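The row-by-column rule (2.2.5) and the non-commutativity seen in Example 2.2.9 (3) can be sketched in plain Python; the function name `matmul` is my own:

```python
def matmul(A, B):
    """AB via (2.2.5): the (i, j)-entry is r_i . b_j.
    zip(*B) iterates over the columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 3], [-2, 4]]
B = [[3, -2], [5, 6]]
print(matmul(A, B))  # [[18, 16], [14, 28]]
print(matmul(B, A))  # [[7, 1], [-7, 39]]; matrix multiplication is not commutative
```

A single pair with AB ≠ BA is enough to show that no general commutative law can hold for matrix products.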
Following (2.2.6), the following properties can be easily proved. The proofs are left to the reader.
Theorem 2.2.3.
Let A, B, and C be matrices for which the products and sums below are defined. Then
1. A(BC) = (AB)C.
2. A(B + C) = AB + AC and (A + B)C = AC + BC.
3. (AB)ᵀ = BᵀAᵀ.
Example 2.2.10.
Let
A = ( 1 −3 ; 0 2 ), B = ( 2 −1 4 ; 3 1 5 ), and C = ( 0 −2 1 ; 4 3 2 ; −5 0 6 ).
Verify that A(BC) = (AB)C and that (AB)ᵀ = BᵀAᵀ.
Solution
BC = ( 2 −1 4 ; 3 1 5 )( 0 −2 1 ; 4 3 2 ; −5 0 6 ) = ( −24 −7 24 ; −21 −3 35 ),
A(BC) = ( 1 −3 ; 0 2 )( −24 −7 24 ; −21 −3 35 ) = ( 39 2 −81 ; −42 −6 70 ),
AB = ( 1 −3 ; 0 2 )( 2 −1 4 ; 3 1 5 ) = ( −7 −4 −11 ; 6 2 10 ),
(AB)C = ( −7 −4 −11 ; 6 2 10 )( 0 −2 1 ; 4 3 2 ; −5 0 6 ) = ( 39 2 −81 ; −42 −6 70 ).
Hence, A(BC) = (AB)C.
Moreover,
BᵀAᵀ = ( 2 3 ; −1 1 ; 4 5 )( 1 0 ; −3 2 ) = ( 2 − 9  0 + 6 ; −1 − 3  0 + 2 ; 4 − 15  0 + 10 ) = ( −7 6 ; −4 2 ; −11 10 )
and
(AB)ᵀ = ( −7 −4 −11 ; 6 2 10 )ᵀ = ( −7 6 ; −4 2 ; −11 10 ).
Hence, (AB)ᵀ = BᵀAᵀ.
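The identities of Theorem 2.2.3 verified by hand in Example 2.2.10 can also be checked mechanically; the helper names `matmul` and `transpose` are my own:

```python
def matmul(A, B):
    """AB via (2.2.5); zip(*B) iterates over the columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, -3], [0, 2]]
B = [[2, -1, 4], [3, 1, 5]]
C = [[0, -2, 1], [4, 3, 2], [-5, 0, 6]]
print(matmul(A, matmul(B, C)) == matmul(matmul(A, B), C))             # True: A(BC) = (AB)C
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True: (AB)^T = B^T A^T
```

As with Theorem 2.1.1, a numerical check on one triple of matrices illustrates the identities but does not replace the entrywise proofs.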
Exercises
1. Let

   A = ( 2 1 1 ),   X = ( x1 ),   X1 = ( 0 ),   X2 = ( 2 ).
       ( 1 −1 2 )       ( x2 )         ( 1 )         ( 1 )
                        ( x3 )         ( 2 )         ( 1 )

   Compute AX, AX1, and AX2.
2. Let

   A = ( −2 −1 ),   X = ( −a ),   X1 = ( 1 ).
       ( 3 1 )          ( 2a )         ( 2 )
       ( 0 1 )

   Compute AX and AX1.
3. Let

   A = ( 0 1 ),   X1 = ( 1 ),   X2 = ( 1 ),   X3 = ( 1 ).
       ( 1 −1 )        ( 0 )         ( 0 )         ( 1 )
       ( 2 1 )         ( −1 )                      ( 2 )

   Determine whether AXi is defined for each i = 1, 2, 3.
4. Let

   A = ( 0 1 0 ),   X = ( 1 ),   Y = ( 3 ).
       ( 1 0 −1 )       ( 0 )        ( 0 )
       ( 1 1 2 )        ( −1 )       ( −1 )

   Compute A(X − 3Y).
5. Let

   A = ( 1 1 −1 )   and X = ( 1 ).
       ( 1 0 2 )            ( −1 )
       ( 2 1 −1 )           ( 0 )

   Write AX as a linear combination of the column vectors of A.
6. We are interested in predicting the population changes among three cities in a country. It is known that
in the current year, 20% and 30% of the population of City 1 will move to Cities 2 and 3, respectively;
10% and 20% of the population of City 2 will move to Cities 1 and 3, respectively; and 25% and 10%
of the population of City 3 will move to Cities 1 and 2, respectively. Assume that 200,000, 600,000,
and 500,000 people live in Cities 1, 2, and 3, respectively, in the current year. Find the populations
of the three cities in the following year.
7. Let

   A = ( 2 1 )   and B = ( 1 −2 1 ).
       ( −2 3 )          ( 3 4 1 )

   Find AB and the sizes of A, B, and AB.
8. Let

   A = ( 0 1 −1 )   and B = ( 2 1 −1 ).
       ( 2 3 1 )            ( 1 −1 2 )
                            ( 0 1 4 )

   Compute AB and find the sizes of A, B, and AB.
9. Let

   A = ( 1 0 −2 )   and B = ( 0 −1 2 1 ).
       ( 3 2 −1 )           ( −1 2 1 −2 )
                            ( 2 0 0 1 )

   Compute AB and find the sizes of A, B, and AB.
10. Let

    A = ( 1 1 )   and B = ( 2 1 ).
        ( 1 3 )           ( −2 3 )
        ( 3 1 )

    Are AB and BA defined?
11. Let

    A = ( 2 −1 )   and B = ( 0 −2 ).
        ( −2 3 )           ( 4 2 )

    Are AB and BA defined?
12. Let

    A = ( 1 −1 ),   B = ( 0 −1 2 ),   and C = ( 1 −1 1 ).
        ( −1 2 )        ( 1 1 3 )             ( 0 3 2 )
                                              ( −2 0 3 )

    Verify that A(BC) = (AB)C.
13. Suppose that the direct contacts between three people in a first group and four people in a second group are expressed by the matrix

    A = ( 0 1 1 1 ).
        ( 1 1 1 1 )
        ( 1 1 0 1 )

14. Now, assume that the second group has had a variety of direct contacts with five people in a third
    group. The direct contacts between groups 2 and 3 are expressed by a matrix

    B = ( 1 0 0 1 0 ).
        ( 0 0 0 1 0 )
        ( 1 1 0 1 1 )
        ( 1 0 0 1 0 )

    Find the matrix that expresses the indirect contacts between the first and third groups.
15. There are three product items, P1, P2, and P3, for sale from a large company. On the first day, 5,
10, and 100 units, respectively, are sold. The corresponding unit profits are 500, 400, and 20 (in hundreds of
dollars) and the corresponding unit taxes are 3, 2, and 1. Find the total profit and the total taxes for the first
day.
2.3 Square matrices
A square matrix of order n is an n × n matrix

(2.3.1)

    A = ( a11 a12 ⋯ a1n ).
        ( a21 a22 ⋯ a2n )
        ( ⋮   ⋮   ⋱  ⋮  )
        ( an1 an2 ⋯ ann )
The entries a 11, a 22, ⋯, a nn are said to be on the main diagonal of A. The sum of these entries, denoted by
tr(A), is called the trace of A, that is,
(2.3.2)

    tr(A) = Σ_{i=1}^{n} aii = a11 + a22 + ⋯ + ann.
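The trace is straightforward to compute programmatically; a minimal sketch (the `trace` helper is ours, not part of the text):

```python
def trace(A):
    """Sum of the main-diagonal entries of a square matrix."""
    n = len(A)
    assert all(len(row) == n for row in A), "trace is defined only for square matrices"
    return sum(A[i][i] for i in range(n))

# For a 2 x 2 matrix the trace is simply a11 + a22.
assert trace([[15, 21], [44, 25]]) == 40
```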
Example 2.3.1.
Let A = ( 15 21 ). Then A is a square matrix of order 2, the numbers 15 and 25 are on the main
        ( 44 25 )
diagonal of A, and tr(A) = 15 + 25 = 40.
Symmetric matrices
A square matrix A is said to be symmetric if
(2.3.3)
A T = A.
Equivalently, aij = aji for all i, j ∈ In, so that A has the form

    A = ( a11 a12 ⋯ a1n ).
        ( a12 a22 ⋯ a2n )
        ( ⋮   ⋮   ⋱  ⋮  )
        ( a1n a2n ⋯ ann )
From the above matrix, we see that A is symmetric if the ith row and the ith column are the same for each
i ∈ I n.
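This row/column characterization gives a direct programmatic test for symmetry (an illustrative sketch; the helper name is ours):

```python
def is_symmetric(A):
    """True when A equals its transpose, i.e. A[i][j] == A[j][i] for all i, j."""
    n = len(A)
    return all(len(row) == n for row in A) and \
           all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

assert is_symmetric([[1, 4, 5], [4, -3, 0], [5, 0, 7]])
assert not is_symmetric([[7, -3], [2, 0]])
```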
Example 2.3.2.

The following matrices are symmetric:

    ( 7 −3 ),   ( 1 4 5 ),    ( 1 x^2 2 ).
    ( −3 0 )    ( 4 −3 0 )    ( x^2 0 x )
                ( 5 0 7 )     ( 2 x 3 )

The following matrices are not symmetric:

    ( 7 −3 ),   ( 1 4 5 ).
    ( 2 0 )     ( 2 −3 0 )
                ( 5 0 7 )
Definition 2.3.1.

A square matrix is said to be

i. lower triangular if all the entries above the main diagonal are zero;
ii. upper triangular if all the entries below the main diagonal are zero;
iii. triangular if it is lower or upper triangular;
iv. diagonal if it is both lower and upper triangular;
v. an identity matrix if it is a diagonal matrix whose entries on the main diagonal are all 1.
The identity matrix of order n is denoted by In:

(2.3.4)

    In = ( 1 0 ⋯ 0 ).
         ( 0 1 ⋯ 0 )
         ( ⋮ ⋮ ⋱ ⋮ )
         ( 0 0 ⋯ 1 )
Example 2.3.3.

1. The following are lower triangular matrices:

   ( 0 0 ),   ( 1 0 0 ),   ( 0 0 0 ),   ( 1 0 0 ).
   ( 2 0 )    ( 2 0 0 )    ( 0 0 0 )    ( 1 0 0 )
              ( 0 1 1 )    ( 0 0 0 )    ( x 0 0 )

2. The following are upper triangular matrices:

   ( 0 1 ),   ( 1 0 1 ),   ( 0 0 0 ).
   ( 0 0 )    ( 0 0 0 )    ( 0 0 0 )
              ( 0 0 1 )    ( 0 0 0 )

3. The following are diagonal matrices:

   ( 1 0 ),   ( 1 0 0 ),   ( 0 0 0 ).
   ( 0 2 )    ( 0 0 0 )    ( 0 0 0 )
              ( 0 0 1 )    ( 0 0 0 )

4. The following matrices are not triangular:

   ( 1 1 0 ),   ( 1 0 1 ).
   ( 2 0 0 )    ( 1 1 1 )
   ( 0 0 1 )    ( 0 0 0 )

5. The following are identity matrices:

   I1 = (1),   I2 = ( 1 0 ),   I3 = ( 1 0 0 ),   I4 = ( 1 0 0 0 ).
                    ( 0 1 )         ( 0 1 0 )         ( 0 1 0 0 )
                                    ( 0 0 1 )         ( 0 0 1 0 )
                                                      ( 0 0 0 1 )
The following theorem gives some properties of triangular matrices. The proofs are left to the reader.
Theorem 2.3.1.
i. If A is an upper triangular (lower triangular) matrix, then A^T is a lower triangular (upper triangular)
matrix.
ii. If A and B are lower triangular matrices, then AB is a lower triangular matrix.
iii. If A and B are upper triangular matrices, then AB is an upper triangular matrix.
Example 2.3.4.
Let A = ( 1 0 )   and B = ( 3 0 ). Compute AB.
        ( 2 1 )           ( 4 1 )
Solution
AB = ( 1 0 ) ( 3 0 ) = ( 3 0 ).
     ( 2 1 ) ( 4 1 )   ( 10 1 )
The above example illustrates the fact that the product of two lower triangular matrices is a lower triangular matrix.
Let A be a square matrix of order n. The nonnegative integer powers of A are defined by

(2.3.5)

    A^0 = In,  A^2 = A·A,  A^3 = A^2·A,  ⋯,  A^k = A^(k−1)·A.
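The recursive definition in (2.3.5) translates directly into code: start from the identity and multiply by A repeatedly. A sketch with a hand-rolled helper (the `mat_power` name is ours):

```python
def mat_power(A, k):
    """Compute A^k for a square matrix A and integer k >= 0 (A^0 = I)."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # I_n
    for _ in range(k):
        # result <- result * A, following A^k = A^(k-1) * A
        result = [[sum(result[i][p] * A[p][j] for p in range(n))
                   for j in range(n)] for i in range(n)]
    return result

A = [[1, 2], [1, 3]]
assert mat_power(A, 0) == [[1, 0], [0, 1]]
assert mat_power(A, 2) == [[3, 8], [4, 11]]
assert mat_power(A, 3) == [[11, 30], [15, 41]]
```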
Example 2.3.5.
Let A = ( 1 2 ). Find A^0, A^2, and A^3.
        ( 1 3 )
Solution
A^0 = ( 1 2 )^0 = ( 1 0 ),
      ( 1 3 )     ( 0 1 )

A^2 = ( 1 2 ) ( 1 2 ) = ( 3 8 ),
      ( 1 3 ) ( 1 3 )   ( 4 11 )

A^3 = A^2 A = ( 3 8 )  ( 1 2 ) = ( 11 30 ).
              ( 4 11 ) ( 1 3 )   ( 15 41 )
A diagonal matrix with diagonal entries a1, a2, ⋯, an is denoted by

(2.3.6)

    diag(a1, a2, ⋯, an) = ( a1 0 ⋯ 0 ).
                          ( 0 a2 ⋯ 0 )
                          ( ⋮  ⋮  ⋱ ⋮ )
                          ( 0 0 ⋯ an )

For every positive integer k,

    diag(a1, a2, ⋯, an)^k = diag(a1^k, a2^k, ⋯, an^k),

that is,

(2.3.7)

    ( a1 0 ⋯ 0 )^k   ( a1^k 0    ⋯ 0 )
    ( 0 a2 ⋯ 0 )   = ( 0    a2^k ⋯ 0 ).
    ( ⋮  ⋮  ⋱ ⋮ )    ( ⋮    ⋮    ⋱ ⋮ )
    ( 0 0 ⋯ an )     ( 0    0    ⋯ an^k )
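Property (2.3.7) means that the power of a diagonal matrix reduces to powers of its diagonal entries; a small sketch (helper names are ours):

```python
def diag(entries):
    """Build diag(a1, ..., an) as a list of rows."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def diag_power(entries, k):
    """diag(a1, ..., an)^k == diag(a1^k, ..., an^k), so only the
    diagonal entries need to be raised to the k-th power."""
    return diag([a ** k for a in entries])

# The same computation as raising the full matrix to the 5th power.
assert diag_power([1, -2, 2], 5) == diag([1, -32, 32])
```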
Example 2.3.6.
Find A^5 if

    A = ( 1 0 0 ).
        ( 0 −2 0 )
        ( 0 0 2 )
Solution
    A^5 = ( 1^5 0      0   )   ( 1 0   0  )
          ( 0   (−2)^5 0   ) = ( 0 −32 0  ).
          ( 0   0      2^5 )   ( 0 0   32 )
For all nonnegative integers r and s,

(2.3.8)

    A^r · A^s = A^(r+s)   and   (A^r)^s = A^(rs).
Using the power of a square matrix, we introduce the notion of a matrix polynomial, which is a polynomial with a square matrix as its variable. Let

    p(x) = a0 + a1 x + ⋯ + am x^m

be a polynomial. For a square matrix A of order n, we define

(2.3.9)

    p(A) = a0 In + a1 A + a2 A^2 + ⋯ + am A^m.
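Evaluating a matrix polynomial follows the definition term by term, accumulating ai · A^i with a running power of A. An illustrative sketch (the `mat_poly` name and coefficient convention are ours):

```python
def mat_poly(coeffs, A):
    """Evaluate p(A) = a0*I + a1*A + ... + am*A^m, where
    coeffs = [a0, a1, ..., am] and A is a square matrix."""
    n = len(A)
    I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    power, result = I, [[0] * n for _ in range(n)]
    for a in coeffs:
        # result += a * (current power of A)
        result = [[result[i][j] + a * power[i][j] for j in range(n)] for i in range(n)]
        # power <- power * A for the next term
        power = [[sum(power[i][p] * A[p][j] for p in range(n)) for j in range(n)] for i in range(n)]
    return result

# p(x) = 4 - 3x + 2x^2 evaluated at a 2 x 2 matrix
A = [[-1, 2], [0, 3]]
assert mat_poly([4, -3, 2], A) == [[9, 2], [0, 13]]
```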
Example 2.3.7.

Let P(x) = 4 − 3x + 2x^2 and A = ( −1 2 ). Find P(A).
                                 ( 0 3 )

Solution

P(A) = 4 I2 − 3A + 2A^2 = 4 ( 1 0 ) − 3 ( −1 2 ) + 2 ( 1 4 )
                            ( 0 1 )     ( 0 3 )      ( 0 9 )

     = ( 4 0 ) − ( −3 6 ) + ( 2 8 )  = ( 9 2 ).
       ( 0 4 )   ( 0 9 )    ( 0 18 )   ( 0 13 )
Exercises
1. For each of the following matrices, find its trace and determine whether it is symmetric.

   A1 = ( 0 −1 ),   A2 = ( 1 4 5 ),    A3 = ( y x^3 1 ).
        ( −1 0 )         ( 4 −3 −1 )        ( x^3 y x )
                         ( 5 0 7 )          ( 1 x z )
2. Identify which of the following matrices are lower triangular, upper triangular, triangular, diagonal, or an identity matrix.
   A1 = ( 0 2 ),   A2 = ( 1 0 0 ),   A3 = ( 0 0 0 ),
        ( 0 0 )         ( 2 0 0 )         ( 0 0 0 )
                        ( 0 0 1 )         ( 0 0 0 )

   A4 = ( 1 0 x ),   A5 = ( 0 1 ),   A6 = ( 1 0 1 ),
        ( 1 0 0 )         ( 1 0 )         ( 0 0 0 )
        ( x 0 0 )                         ( 0 0 1 )

   A7 = ( 2 0 ),   A8 = ( 1 0 0 ),   A9 = ( 1 0 0 ),
        ( 0 2 )         ( 0 0 0 )         ( 0 2 0 )
                        ( 2 0 1 )         ( −1 1 1 )

   A10 = ( 1 0 1 ),   A11 = ( 0 0 ),   A12 = ( 0 0 1 ).
         ( 2 0 0 )          ( 0 0 )          ( 0 1 0 )
         ( 1 0 1 )          ( 0 0 )          ( 1 0 0 )
3. Let A = ( 1 0 ). Find A^0, A^2, A^3, and A^4.
           ( −1 1 )
4. Let A = ( 2 0 0 ). Find A^6.
           ( 0 −2 0 )
           ( 0 0 3 )
5. Let

   A = ( 3 1 0 )   and B = ( 6 −2 −1 ).
       ( 3 −2 3 )          ( 3 0 2 )
       ( 4 5 −1 )          ( −1 3 4 )
7. Assume that the population of a species has a life span of less than 3 years. The females are divided
into three age groups: group 1 contains those of age less than 1, group 2 contains those of age
between 1 and 2, and group 3 contains those of age 2. Suppose that the females in group 1 do not give
birth and that the average numbers of newborn females produced by one female in groups 2 and 3,
respectively, are 3 and 2. Suppose that 50% of the females in group 1 survive to age 1 and 80% of the
females in group 2 live to age 2. Assume that the numbers of females in groups 1, 2, and 3 are
x1, x2, and x3, respectively.

i. Find the number of females in the three groups in the following year.
ii. Find the number of females in the three groups in n years.
iii. If the current population distribution is (x1, x2, x3) = (1000, 500, 100), use the result of part (ii) to predict the
population distribution in 2 years.
2.4 Row echelon and reduced row echelon matrices
Example 2.4.1.
In the matrix

    ( 1 0 0 ),
    ( 2 1 0 )
    ( 0 0 0 )

the first two rows are nonzero rows and the last row is a zero row.
The first nonzero entry (starting from the left) in a nonzero row is called the leading entry of the row. If a leading entry is 1, it is called a leading one.
Example 2.4.2.

Find the leading entries of

    A = ( 1 6 0 4 ).
        ( 0 2 0 5 )
        ( 3 0 0 10 )
        ( 0 0 0 0 )

Solution

The numbers 1, 2, and 3 are the leading entries of A.
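Finding leading entries can be automated by scanning each row from the left (an illustrative sketch; the helper name is ours):

```python
def leading_entries(A):
    """Return the leading entry (first nonzero value, from the left)
    of every nonzero row of A; zero rows are skipped."""
    leads = []
    for row in A:
        for entry in row:
            if entry != 0:
                leads.append(entry)
                break
    return leads

A = [[1, 6, 0, 4],
     [0, 2, 0, 5],
     [3, 0, 0, 10],
     [0, 0, 0, 0]]
assert leading_entries(A) == [1, 2, 3]
```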
Definition 2.4.1.
An m × n matrix A is said to be a row echelon matrix if it satisfies the following conditions (1) and (2); or to be a reduced row echelon matrix if it satisfies the following
conditions (1), (2), (3), and (4).
1. Each nonzero row (if any) lies above every zero row, that is, all zero rows appear at the bottom of the matrix.
2. For any two successive nonzero rows, the leading entry in the lower row occurs further to the right than the leading entry in the higher row.
3. Each leading entry is 1.
4. For each column containing a leading 1, all entries above the leading 1 are zero.
Remark 2.4.1.
i. For a row echelon matrix, if a column contains a leading entry, then by condition (2), all entries below the leading entry of that column must be zero.
ii. A reduced row echelon matrix must be a row echelon matrix.
iii. For a reduced row echelon matrix, if a column contains a leading 1, then by conditions (2) and (4), all entries except the leading 1 of that column must be zero.
Example 2.4.3.
1. The following matrices are row echelon matrices:

   ( 1 0 ),   ( 2 0 3 5 ),   ( 2 0 1 ),   ( 1 0 0 6 0 ).
   ( 0 1 )    ( 0 0 0 0 )    ( 0 1 0 )    ( 0 0 2 3 0 )
                             ( 0 0 0 )    ( 0 0 0 0 1 )
                                          ( 0 0 0 0 0 )

2. The following matrices are not row echelon matrices:

   ( 1 0 0 ),   ( 0 0 1 ),   ( 0 0 1 0 ),   ( 1 0 0 0 ).
   ( 0 0 1 )    ( 0 0 0 )    ( 1 0 0 0 )    ( 0 0 0 0 )
   ( 0 1 0 )    ( 0 0 1 )    ( 0 0 0 0 )    ( 0 0 0 1 )

3. The following matrices are reduced row echelon matrices:

   ( 1 0 0 1 ),   ( 1 0 0 6 0 ),   ( 0 1 0 0 −1 ).
   ( 0 1 0 0 )    ( 0 0 1 3 0 )    ( 0 0 0 1 2 )
   ( 0 0 1 0 )    ( 0 0 0 0 1 )    ( 0 0 0 0 0 )
                  ( 0 0 0 0 0 )
4. The following matrices are row echelon matrices, but not reduced row echelon matrices:

   ( 2 0 1 ),   ( 1 0 2 6 0 ),   ( 1 0 0 2 0 ).
   ( 0 1 0 )    ( 0 0 1 3 0 )    ( 0 0 2 1 0 )
   ( 0 0 0 )    ( 0 0 0 0 1 )    ( 0 0 0 0 0 )
Solution
We only consider (4). These matrices are row echelon matrices. However, the first matrix contains a leading entry 2 that is not a leading 1; column 3 of the second matrix
has the leading 1 of the second row but also contains a nonzero entry 2; and the third matrix has a leading entry that is not 1. Hence, they are not reduced row echelon
matrices.
Row operations. In many cases, we need to change a matrix to a row echelon matrix or a reduced row echelon matrix and wish that the new matrix and the original matrix
share some common properties. Thus, one can obtain properties of the original matrix by studying the row echelon matrix or the reduced row echelon matrix. The powerful
tools used to change a matrix to a row echelon matrix or reduced row echelon matrix are the following (elementary) row operations.
1. First row operation R_i(c): Multiply the ith row of a matrix by a nonzero number c;
2. Second row operation R i(c) + R j : Add the ith row multiplied by a nonzero number c to the jth row;
3. Third row operation R i , j: Interchange the ith row and the jth row.
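The three row operations above can be sketched in code as follows; rows are indexed from 1 to match the R notation, and the helper names are ours:

```python
def row_scale(A, i, c):
    """R_i(c): multiply row i by a nonzero number c (rows indexed from 1)."""
    assert c != 0
    B = [row[:] for row in A]
    B[i - 1] = [c * x for x in B[i - 1]]
    return B

def row_add(A, i, c, j):
    """R_i(c) + R_j: add c times row i to row j; only row j changes."""
    B = [row[:] for row in A]
    B[j - 1] = [x + c * y for x, y in zip(B[j - 1], B[i - 1])]
    return B

def row_swap(A, i, j):
    """R_{i,j}: interchange rows i and j."""
    B = [row[:] for row in A]
    B[i - 1], B[j - 1] = B[j - 1], B[i - 1]
    return B

A = [[0, 1, 1], [1, 2, -3], [2, 1, 1]]
assert row_scale(A, 2, -2) == [[0, 1, 1], [-2, -4, 6], [2, 1, 1]]
assert row_add(A, 2, -2, 3) == [[0, 1, 1], [1, 2, -3], [0, -3, 7]]
assert row_swap(A, 1, 2) == [[1, 2, -3], [0, 1, 1], [2, 1, 1]]
```

Each helper returns a new matrix, so the original A is untouched; this mirrors the fact that a row operation produces a new matrix from the old one.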
Remark 2.4.2.
The first row operation R i(c) is used to change only the ith row, so other rows remain unchanged. The second row operation R i(c) + R j is used to change only the jth row,
so other rows including the ith row remain unchanged. The third row operation R i , j is used to interchange only the ith row and the jth row, so other rows remain
unchanged.
Now, we give some examples to show how to use the above row operations to change one matrix to another. The row operations in these examples are chosen only for practice, to show how the operations are performed; you can choose other row operations for practice as well. Later, when we want to change a matrix to a row echelon matrix or reduced row echelon matrix, we must choose suitable row operations to obtain its row echelon matrices.
Example 2.4.4.
Let A = ( 0 1 1 ).
        ( 1 2 −3 )
        ( 2 1 1 )

A = ( 0 1 1 )   R_2(−2) →  ( 0 1 1 ).
    ( 1 2 −3 )             ( −2 −4 6 )
    ( 2 1 1 )              ( 2 1 1 )

A = ( 0 1 1 )   R_2(−2)+R_3 →  ( 0 1 1 ).
    ( 1 2 −3 )                 ( 1 2 −3 )
    ( 2 1 1 )                  ( 0 −3 7 )

A = ( 0 1 1 )   R_{1,2} →  ( 1 2 −3 ).
    ( 1 2 −3 )             ( 0 1 1 )
    ( 2 1 1 )              ( 2 1 1 )
Example 2.4.5.
Let
A = ( 1 1 1 ).
    ( 2 −1 −1 )
    ( −3 4 1 )
Solution
To change the number 2 in the first column and second row of A to 0, we use the row operation R_1(−2) + R_2, which changes the entire second row.
A = ( 1 1 1 )   R_1(−2)+R_2 →  ( 1 1 1 ).
    ( 2 −1 −1 )                ( 0 −3 −3 )
    ( −3 4 1 )                 ( −3 4 1 )
Similarly, we use R 1(3) + R 3 to change the entire third row in order to change the number − 3 to 0.
( 1 1 1 )    R_1(3)+R_3 →  ( 1 1 1 ).
( 0 −3 −3 )                ( 0 −3 −3 )
( −3 4 1 )                 ( 0 7 4 )
From the above, we see that row 1 is always kept unchanged when we change rows 2 and 3. Hence, we can combine the process as follows:

A = ( 1 1 1 )   R_1(−2)+R_2, R_1(3)+R_3 →  ( 1 1 1 ).
    ( 2 −1 −1 )                            ( 0 −3 −3 )
    ( −3 4 1 )                             ( 0 7 4 )
Example 2.4.6.

Let B = ( 1 1 1 ). Use the second row operation to change the number 7 in B to 0.
        ( 0 −3 −3 )
        ( 0 7 4 )

Solution

B = ( 1 1 1 )   R_2(7/3)+R_3 →  ( 1 1 1 ).
    ( 0 −3 −3 )                 ( 0 −3 −3 )
    ( 0 7 4 )                   ( 0 0 −3 )

Combining this with Example 2.4.5,

A = ( 1 1 1 )   R_1(−2)+R_2, R_1(3)+R_3 →  ( 1 1 1 )   R_2(7/3)+R_3 →  ( 1 1 1 ) := C.
    ( 2 −1 −1 )                            ( 0 −3 −3 )                 ( 0 −3 −3 )
    ( −3 4 1 )                             ( 0 7 4 )                   ( 0 0 −3 )
Note that the last matrix C is a row echelon matrix of A and is obtained by performing second row operations three times.
Examples 2.4.5 and 2.4.6 give an intuitive idea of how to change a matrix to a row echelon matrix. We now describe how to apply row operations systematically to find a row echelon matrix for a given matrix. This requires several rounds of the same steps, one round for each leading entry.

The first round finds the first leading entry. Here are the key steps.

Step 1. Move all zero rows, if any, to the bottom of the matrix. Look at each row to see whether it has a common factor; if so, use the first row operation to remove it. After all common factors are removed, the entries of the resulting matrix are smaller than or equal (in absolute value) to the corresponding entries of the given matrix, and smaller entries make later computations easier.
Step 2. After removing all common factors, seek the first leading entry in the first nonzero column of the resulting matrix.

Case I. The first nonzero column of the resulting matrix contains the number 1 or −1.

i. If the number 1 or −1 is in the first row, then it can be used as the first leading entry.
ii. If the number 1 or −1 is not in the first row, then use the third row operation to interchange one of the rows containing the number 1 or −1 with row one. The number 1 or −1 is then in the first row and can be used as the first leading entry.

Case II. The first nonzero column of the resulting matrix does not contain the number 1 or −1.

a. Choose the smallest (in absolute value) number in the first nonzero column of the resulting matrix and use the methods in (i) and (ii) to move the row containing it to the first row. This number, in general, can be used as the first leading entry.
b. In some cases, we can continue to change the smallest number to 1 or −1 by using the second row operation. Some examples will be provided later to exhibit this smart method; see, for example, Example 2.4.9.
Step 3. Row one now contains the first leading entry (1, −1, or the smallest number), and some column contains this leading entry. If all the entries below the first leading entry in that column are zero, the first round is complete. If there are nonzero entries below the first leading entry in that column, use the second row operation and row one to change each of them to 0; note that all other entries in a row containing such a nonzero number also change under the second row operation. After all these nonzero entries are changed to 0, the first round is complete.
The second round is to find the second leading entry from the last matrix obtained above.
Step 4. Repeat the above Steps 1-3 to find the second leading entry below row one of the last matrix obtained in the above Step 3, and then continue the process to find the
third, fourth, etc., leading entry, if any, until you get a row echelon matrix.
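Steps 1-3, repeated round by round, amount to forward elimination. The sketch below is a simplified version: it always uses the first available nonzero entry as the leading entry (rather than the smallest), and it uses exact arithmetic via `fractions.Fraction`. Row echelon matrices are not unique, so its output may differ from the one obtained by hand:

```python
from fractions import Fraction

def row_echelon(A):
    """Forward elimination: pick a leading entry in the first available
    nonzero column, move its row up, and zero out the entries below it."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # row where the next leading entry will sit
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]          # third row operation
        for i in range(r + 1, rows):             # second row operation
            if M[i][c] != 0:
                factor = -M[i][c] / M[r][c]
                M[i] = [x + factor * y for x, y in zip(M[i], M[r])]
        r += 1
    return M

A = [[0, 6, -3], [2, -2, 6], [2, 0, 1]]
B = row_echelon(A)
assert B == [[2, -2, 6], [0, 6, -3], [0, 0, -4]]
```

The result is a valid row echelon matrix of A, though not the same one as the hand computation that removes common factors first.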
Example 2.4.7.
Find a row echelon matrix of

A = ( 0 6 −3 ).
    ( 2 −2 6 )
    ( 2 0 1 )
Solution
(1) Step 1. There are no zero rows in A. Row 1 contains a common factor 3 and row 2 contains a common factor 2, so we first use the first row operation to remove the two common factors:

A = ( 0 6 −3 )   R_1(1/3), R_2(1/2) →  ( 0 2 −1 ) := B.
    ( 2 −2 6 )                         ( 1 −1 3 )
    ( 2 0 1 )                          ( 2 0 1 )

Step 2. Check the first nonzero column of B to see if it contains the number 1 or −1. Column one of B is the first nonzero column, and it contains the number 1. The row containing the number 1 is not row one, so we move it to row one by the third row operation: interchanging rows 1 and 2 of B makes the number 1 the first leading entry.

B = ( 0 2 −1 )   R_{1,2} →  ( 1 −1 3 ) := C.
    ( 1 −1 3 )              ( 0 2 −1 )
    ( 2 0 1 )               ( 2 0 1 )

Step 3. There is a nonzero entry below the first leading entry in column one of C, so we use the second row operation and row one to change it to 0.
We use the symbol ↓ to show the direction of changing all nonzero entries below a leading entry to zero.

C = ( 1 −1 3 )   R_1(−2)+R_3 →  ( 1 −1 3 ) := D.
    ( 0 2 −1 )                  ( 0 2 −1 )
    ( 2 0 1 )                   ( 0 2 −5 )
Step 4. Repeat Steps 1-3 to find the second leading entry below row one of the last matrix D. To find it, we consider the matrix below row one of D:

D1 := ( 0 2 −1 ).
      ( 0 2 −5 )

Step 1. Neither row of D1 contains a common factor.

Step 2. Check the first nonzero column of D1. The first nonzero column is the second column of D1, and both of its rows contain the nonzero number 2. The number 2 in the first row of D1 is the second leading entry for D.

Step 3. In D, there is a nonzero number 2 in the third row below the second leading entry 2. We change it to 0 by using row 2, which contains the second leading entry:

D = ( 1 −1 3 )   R_2(−1)+R_3 →  ( 1 −1 3 ).
    ( 0 2 −1 )                  ( 0 2 −1 )
    ( 0 2 −5 )                  ( 0 0 −4 )

The last matrix is a row echelon matrix and the number −4 is the third leading entry. The whole computation is

A = ( 0 6 −3 ) → ( 0 2 −1 ) → ( 1 −1 3 ) → ( 1 −1 3 ) → ( 1 −1 3 ).
    ( 2 −2 6 )   ( 1 −1 3 )   ( 0 2 −1 )   ( 0 2 −1 )   ( 0 2 −1 )
    ( 2 0 1 )    ( 2 0 1 )    ( 2 0 1 )    ( 0 2 −5 )   ( 0 0 −4 )
In the following, we always follow Steps 1-3 in each round without further statement.
Example 2.4.8.

Find a row echelon matrix of

A = ( 1 3 0 2 0 ).
    ( 1 1 1 0 0 )
    ( 2 4 1 −1 0 )
    ( 0 0 0 3 1 )

Solution

A = ( 1 3 0 2 0 )   R_1(−1)+R_2, R_1(−2)+R_3 →  ( 1 3 0 2 0 )    R_2(−1)+R_3 →  ( 1 3 0 2 0 )
    ( 1 1 1 0 0 )                               ( 0 −2 1 −2 0 )                 ( 0 −2 1 −2 0 )
    ( 2 4 1 −1 0 )                              ( 0 −2 1 −5 0 )                 ( 0 0 0 −3 0 )
    ( 0 0 0 3 1 )                               ( 0 0 0 3 1 )                   ( 0 0 0 3 1 )

    R_3(1/3) →  ( 1 3 0 2 0 )    R_3(3)+R_4 →  ( 1 3 0 2 0 ).
                ( 0 −2 1 −2 0 )                ( 0 −2 1 −2 0 )
                ( 0 0 0 −1 0 )                 ( 0 0 0 −1 0 )
                ( 0 0 0 3 1 )                  ( 0 0 0 0 1 )
Remark 2.4.3.

The row echelon matrices of a matrix are not unique.
For example, in Example 2.4.8, we can continue to use row operations to change the above row echelon matrix to another row echelon matrix, such as

( 1 3 0 2 0 )    R_3(−1) →  ( 1 3 0 2 0 ).
( 0 −2 1 −2 0 )             ( 0 −2 1 −2 0 )
( 0 0 0 −1 0 )              ( 0 0 0 1 0 )
( 0 0 0 0 1 )               ( 0 0 0 0 1 )
The following example provides a smart method, which is mentioned in Step 2 of the first round.
Example 2.4.9.

Find a row echelon matrix of

A = ( 2 −1 0 2 0 ).
    ( 3 0 −1 1 1 )
    ( 2 0 1 −1 0 )
    ( 5 1 0 −3 −1 )

Analysis: The first number 2 in the first row of A could be used as the first leading entry. But if we use it to change all the nonzero numbers 3, 2, 5 in column one to 0 by the second row operation and row one, we encounter computations with fractions in rows 2 and 4. To avoid fractions, we can try suitable second row operations to get 1 or −1 in the first nonzero column. For the above matrix A, there are several choices; for example:

i. R_2(−1) + R_1;
ii. R_1(−1) + R_2 and then R_{1,2};
iii. R_1(−2) + R_4 and then R_{1,4}.

In the following, we use (i) to get −1 and take the number −1 as the first leading entry. We also use the smart method to choose the second and third leading entries.

Solution

A = ( 2 −1 0 2 0 )   R_2(−1)+R_1 →  ( −1 −1 1 1 −1 )   R_1(3)+R_2, R_1(2)+R_3, R_1(5)+R_4 →
    ( 3 0 −1 1 1 )                  ( 3 0 −1 1 1 )
    ( 2 0 1 −1 0 )                  ( 2 0 1 −1 0 )
    ( 5 1 0 −3 −1 )                 ( 5 1 0 −3 −1 )

( −1 −1 1 1 −1 )   R_3(−2)+R_2 →  ( −1 −1 1 1 −1 )   R_2(2)+R_3, R_2(4)+R_4 →  ( −1 −1 1 1 −1 )
( 0 −3 2 4 −2 )                   ( 0 1 −4 2 2 )                               ( 0 1 −4 2 2 )
( 0 −2 3 1 −2 )                   ( 0 −2 3 1 −2 )                              ( 0 0 −5 5 2 )
( 0 −4 5 2 −6 )                   ( 0 −4 5 2 −6 )                              ( 0 0 −11 10 2 )

R_3(−2)+R_4 →  ( −1 −1 1 1 −1 )   R_{3,4} →  ( −1 −1 1 1 −1 )   R_3(−5)+R_4 →  ( −1 −1 1 1 −1 ).
               ( 0 1 −4 2 2 )                ( 0 1 −4 2 2 )                    ( 0 1 −4 2 2 )
               ( 0 0 −5 5 2 )                ( 0 0 −1 0 −2 )                   ( 0 0 −1 0 −2 )
               ( 0 0 −1 0 −2 )               ( 0 0 −5 5 2 )                    ( 0 0 0 5 12 )
The basic idea is to use row operations to change a given matrix A to a row echelon matrix B, working from top to bottom, and then to continue using row operations to change to 0 all the nonzero numbers above the leading entries, in the columns that contain the leading entries, working from bottom to top. Denote by l1, l2, ⋯, lr the leading entries of the row echelon matrix B in order. To find the reduced row echelon matrix of A, we first use the second row operation and the row that contains the last leading entry lr to change all the nonzero numbers above lr to 0, from the lowest one to the topmost one. Then we do the same for lr−1, ⋯, l2 in order.
We have studied how to change a matrix to a row echelon matrix, so we first give examples demonstrating how to change a row echelon matrix to a reduced row echelon
matrix and then provide examples to show how to change a matrix to a reduced row echelon matrix.
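The bottom-to-top clearing procedure can be sketched as follows, assuming the input is already a row echelon matrix (helper name is ours; exact arithmetic via `fractions.Fraction`):

```python
from fractions import Fraction

def reduced_row_echelon(B):
    """Turn a row echelon matrix into the reduced row echelon matrix:
    scale each leading entry to 1, then clear the entries above it,
    working from the last leading entry back to the first."""
    M = [[Fraction(x) for x in row] for row in B]
    cols = len(M[0])
    lead_positions = []
    for i, row in enumerate(M):
        c = next((j for j in range(cols) if row[j] != 0), None)
        if c is not None:
            lead_positions.append((i, c))
    for i, c in reversed(lead_positions):
        M[i] = [x / M[i][c] for x in M[i]]        # make the leading entry 1
        for k in range(i):                        # clear the entries above it
            factor = -M[k][c]
            M[k] = [x + factor * y for x, y in zip(M[k], M[i])]
    return M

B = [[1, -1, 3], [0, 2, -1], [0, 0, -4]]
assert reduced_row_echelon(B) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```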
Example 2.4.10.
Find the reduced row echelon matrix of the following row echelon matrix:

B = ( 1 −1 3 ).
    ( 0 2 −1 )
    ( 0 0 −4 )

Solution

(1) Note that the leading entries in order are l1 = 1, l2 = 2, and l3 = −4. The first step is to change all the nonzero numbers above l3 = −4 to zero. The nonzero numbers above l3 = −4 are −1 and 3, which are in the second row and the first row of B, respectively. We use the second row operation and row 3, which contains the leading entry l3 = −4, to change −1 and 3 to 0 in order. We use the symbol ↑ to show the direction of changing all nonzero entries above a leading entry to zero.

B = ( 1 −1 3 )   R_3(−1/4) →  ( 1 −1 3 )   R_3(1)+R_2, R_3(−3)+R_1 →  ( 1 −1 0 ).
    ( 0 2 −1 )                ( 0 2 −1 )                              ( 0 2 0 )
    ( 0 0 −4 )                ( 0 0 1 )                               ( 0 0 1 )

(2) The second step is to change the nonzero number above l2 = 2 to 0 by using row 2, which contains l2 = 2.

( 1 −1 0 )   R_2(1/2) →  ( 1 −1 0 )   R_2(1)+R_1 →  ( 1 0 0 ).
( 0 2 0 )                ( 0 1 0 )                  ( 0 1 0 )
( 0 0 1 )                ( 0 0 1 )                  ( 0 0 1 )
Example 2.4.11.
Find the reduced row echelon matrix of

A = ( 1 3 0 2 0 ).
    ( 1 1 1 0 0 )
    ( 2 4 1 −1 0 )
    ( 0 0 0 3 1 )

Solution

A row echelon matrix B of A is given in Example 2.4.8. We change B to the reduced row echelon matrix:

B = ( 1 3 0 2 0 )   R_3(−1) →  ( 1 3 0 2 0 )    R_3(2)+R_2, R_3(−2)+R_1 →  ( 1 3 0 0 0 )
    ( 0 −2 1 −2 0 )            ( 0 −2 1 −2 0 )                             ( 0 −2 1 0 0 )
    ( 0 0 0 −1 0 )             ( 0 0 0 1 0 )                               ( 0 0 0 1 0 )
    ( 0 0 0 0 1 )              ( 0 0 0 0 1 )                               ( 0 0 0 0 1 )

    R_2(−1/2) →  ( 1 3 0 0 0 )      R_2(−3)+R_1 →  ( 1 0 3/2 0 0 ).
                 ( 0 1 −1/2 0 0 )                  ( 0 1 −1/2 0 0 )
                 ( 0 0 0 1 0 )                     ( 0 0 0 1 0 )
                 ( 0 0 0 0 1 )                     ( 0 0 0 0 1 )
Example 2.4.12.

Find the reduced row echelon matrix of

A = ( 2 1 0 1 ).
    ( 3 0 3 1 )
    ( 4 2 1 3 )

Solution

A = ( 2 1 0 1 )   R_2(−1)+R_1 →  ( −1 1 −3 0 )   R_1(3)+R_2, R_1(4)+R_3 →  ( −1 1 −3 0 )
    ( 3 0 3 1 )                  ( 3 0 3 1 )                               ( 0 3 −6 1 )
    ( 4 2 1 3 )                  ( 4 2 1 3 )                               ( 0 6 −11 3 )

    R_2(−2)+R_3 →  ( −1 1 −3 0 )   R_3(3)+R_1, R_3(6)+R_2 →  ( −1 1 0 3 )
                   ( 0 3 −6 1 )                              ( 0 3 0 7 )
                   ( 0 0 1 1 )                               ( 0 0 1 1 )

    R_2(1/3) →  ( −1 1 0 3 )    R_2(−1)+R_1 →  ( −1 0 0 2/3 )   R_1(−1) →  ( 1 0 0 −2/3 ).
                ( 0 1 0 7/3 )                  ( 0 1 0 7/3 )               ( 0 1 0 7/3 )
                ( 0 0 1 1 )                    ( 0 0 1 1 )                 ( 0 0 1 1 )
In Example 2.4.12 , we employ the smart method used in Example 2.4.9 . We use the row operation R2(−1)+R1 to change row one of A in order to change the leading
entry in row one to −1.
By Remark 2.4.3, we know that the row echelon matrices of a matrix are not unique. But its reduced row echelon matrix is unique. We give this result below without proof.
Theorem 2.4.1.
Any m×n matrix can be changed using row operations to a unique reduced row echelon matrix.
Exercises
1. Which of the following matrices are row echelon matrices?

   A = ( 1 0 ),   B = ( 9 4 3 5 9 ),   C = ( 2 5 0 ),   D = ( 3 0 1 6 0 ),
       ( 0 0 )        ( 0 0 0 7 4 )        ( 0 9 0 )        ( 0 0 8 8 6 )
                                           ( 0 0 8 )        ( 0 0 0 0 1 )
                                                            ( 0 0 0 0 0 )

   E = ( 1 0 0 ),   F = ( 1 4 1 ),   G = ( 0 8 1 7 ),   H = ( 0 0 0 0 ).
       ( 0 0 1 )        ( 0 0 0 )        ( 2 9 8 0 )        ( 4 0 5 8 )
       ( 0 1 0 )        ( 0 6 6 )        ( 0 0 0 0 )        ( 0 9 6 1 )
3. Let

   A = ( 1 0 1 ).
       ( 2 1 −1 )
       ( −3 2 −2 )

   Use the second row operation to change the numbers 2 and −3 in the first column of A to zero.
4. Let

   A = ( 1 0 1 ).
       ( 0 1 −3 )
       ( 0 3 1 )

   Use the second row operation to change the number 3 in A to 0.
5. For each of the following matrices, find a row echelon matrix. Clearly mark every leading entry and use the symbol ↓ to show the direction of eliminating nonzero entries below leading entries.

   A = ( 1 0 1 ),   B = ( 0 1 −2 ),   C = ( 2 1 0 −1 ).
       ( 2 1 −1 )       ( 1 −1 1 )        ( 3 0 3 2 )
       ( −3 2 −2 )      ( −2 1 0 )        ( 5 −2 −1 3 )
6. For each of the following matrices, find its reduced row echelon matrix. Clearly mark every leading entry and use the symbols ↓ and ↑ to show the directions of eliminating nonzero entries below or above leading entries.

   A = ( 2 −4 2 ),   B = ( 1 1 0 1 0 ),    C = ( 0 6 −3 ),   D = ( 2 1 0 1 ),
       ( 0 1 −3 )        ( 0 3 6 −3 −6 )       ( 2 −2 6 )        ( 3 0 3 2 )
       ( 0 0 −6 )        ( 0 0 0 −2 2 )        ( 2 0 1 )         ( 5 −2 −1 3 )
                         ( 0 0 0 0 2 )

   E = ( 1 0 0 −1 ),   F = ( 1 1 2 1 ),    G = ( 1 0 1 1 1 ).
       ( 1 −2 1 1 )        ( 1 1 1 −2 )        ( −3 1 2 −2 −5 )
       ( 2 −2 1 2 )        ( 2 2 1 1 )         ( −2 −1 4 8 0 )
                                               ( 3 1 −3 −7 0 )
2.5 Ranks, nullities, and pivot columns
Definition 2.5.1.

Let A be an m × n matrix and let B be a row echelon matrix of A.

1. The total number of leading entries of the row echelon matrix B is called the rank of A, denoted by
r(A).
2. The number n − r(A) is called the nullity of A, denoted by null(A), that is,

(2.5.1)

null(A) = n − r(A).

3. The columns of A corresponding to the columns of B containing the leading entries are called the
pivot columns (or pivot column vectors) of A.
According to Definition 2.5.1, the rank of A is equal to the rank of each of its row echelon
matrices. The pivot columns of A will be used in Section 7.3 to find a basis of the column space
of A (see Theorem 7.3.1).
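Rank, nullity, and pivot columns can all be read off one pass of forward elimination; a sketch (0-based column indices; the helper name is ours):

```python
from fractions import Fraction

def rank_nullity_pivots(A):
    """Row-reduce A to a row echelon matrix, then read off the rank
    (number of leading entries), the nullity (n - rank), and the
    indices of the pivot columns (0-based)."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r, pivots = 0, []
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            if M[i][c] != 0:
                factor = -M[i][c] / M[r][c]
                M[i] = [x + factor * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return r, cols - r, pivots

B = [[1, 2, -1], [2, -1, 3]]
rank, nullity, pivots = rank_nullity_pivots(B)
assert (rank, nullity, pivots) == (2, 1, [0, 1])
```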
By Definitions 2.4.1 and 2.5.1 , we obtain the following result.
Theorem 2.5.1.
Let A be an m × n matrix and let B be one of its row echelon matrices. Then the following numbers are
equal.
1. The rank of A.
2. The number of leading entries of B.
3. The number of rows of B containing the leading entries.
4. The number of columns of B containing the leading entries.
Remark 2.5.1.
By Theorem 2.5.1 (3), we see that the number of zero rows of B equals m − r(A), and if A contains q
zero rows, then r(A) ≤ m − q. By Theorem 2.5.1 (4), if A contains k zero columns, then r(A) ≤ n − k.
The following result gives the relation between the rank of a matrix and its size.
Theorem 2.5.2.
Let A be an m × n matrix. Then r(A) ≤ min{m, n}, where min{m, n} represents the minimum of m and n.
Proof
By Theorem 2.5.1 (3), the number of rows of B containing the leading entries is smaller than or equal
to m. By Theorem 2.5.1 (1)-(3), r(A) ≤ m. Similarly, by Theorem 2.5.1 (4), the number of columns
of B containing the leading entries is smaller than or equal to n, and thus r(A) ≤ n. This implies that r(A)
is smaller than or equal to the minimum of m and n.
Example 2.5.1.

Let

A = ( 0 4 2 ),    B = ( 1 −1 3 2 ),   C = ( 0 2 1 ).
    ( 0 12 1 )        ( 2 −2 6 4 )        ( 1 1 1 )
    ( 1 2 −1 )                            ( 2 0 1 )
    ( −2 −4 2 )

Use Theorem 2.5.2 to give an upper bound for the rank of each matrix.

Solution

Because A is a 4 × 3 matrix, by Theorem 2.5.2, we have

r(A) ≤ min{4, 3} = 3.

Similarly, r(B) ≤ min{2, 4} = 2 and r(C) ≤ min{3, 3} = 3.
Theorem 2.5.3.
Let A be an m × n matrix. Assume that the last k rows of A are zero rows. Then r(A) ≤ m − k.
Proof
Let C be the (m − k) × n matrix consisting of rows 1 to m − k of A. We use row operations to change C to
a reduced row echelon matrix denoted by F. Let B be the m × n matrix whose first m − k rows are the
same as F and whose other rows are zero rows. Then B is a row echelon matrix of A, and by Theorem
2.5.2, r(B) ≤ m − k. It follows that r(A) ≤ m − k.
Definition 2.5.2.
Two m × n matrices A and B are said to be row equivalent if B can be obtained from A by a finite
sequence of row operations.
The following result shows that the ranks of two row equivalent matrices are the same.
Theorem 2.5.4.

If two matrices A and B are row equivalent, then r(A) = r(B).

The method of finding the rank of a matrix is to follow Steps 1-3 given in Section 2.4 for each round until
a row echelon matrix is obtained, and then to count the leading entries of the row echelon matrix.
Example 2.5.2.
For each of the following matrices, find its rank, nullity, and pivot columns.

A = ( 1 0 0 0 ),   B = ( 1 2 −1 ),   C = ( 0 4 2 1 ).
    ( 0 1 0 0 )        ( 2 −1 3 )        ( 0 2 1 1 )
    ( 0 0 0 0 )                          ( 1 1 1 4 )
                                         ( 2 0 1 3 )
Solution
Because A itself is a reduced row echelon matrix, we do not need to use any row operations. Since A
has 2 leading entries, r(A) = 2. Because A is a 3 × 4 matrix, null(A) = 4 − 2 = 2. Because columns 1
and 2 of A contain the leading entries, the column vectors

( 1 )   and   ( 0 )
( 0 )         ( 1 )
( 0 )         ( 0 )

are the pivot columns of A.
Because B is not a row echelon matrix, we need to use row operations to find its row echelon matrix:

B = ( 1 2 −1 )   R_1(−2)+R_2 →  ( 1 2 −1 ) := B*.
    ( 2 −1 3 )                  ( 0 −5 5 )

The last matrix B* is a row echelon matrix of B with two leading entries. Hence, r(B) = 2. Because B
is a 2 × 3 matrix, null(B) = 3 − 2 = 1. Because columns 1 and 2 of B* contain the leading entries, columns 1 and 2 of B,

( 1 )   and   ( 2 ),
( 2 )         ( −1 )

are the pivot columns of B.
C = ( 0 4 2 1 )   R_{1,3} →  ( 1 1 1 4 )   R_1(−2)+R_4 →  ( 1 1 1 4 )
    ( 0 2 1 1 )              ( 0 2 1 1 )                  ( 0 2 1 1 )
    ( 1 1 1 4 )              ( 0 4 2 1 )                  ( 0 4 2 1 )
    ( 2 0 1 3 )              ( 2 0 1 3 )                  ( 0 −2 −1 −5 )

    R_2(−2)+R_3, R_2(1)+R_4 →  ( 1 1 1 4 )    R_3(−4)+R_4 →  ( 1 1 1 4 ) := C*.
                               ( 0 2 1 1 )                   ( 0 2 1 1 )
                               ( 0 0 0 −1 )                  ( 0 0 0 −1 )
                               ( 0 0 0 −4 )                  ( 0 0 0 0 )

The last matrix is a row echelon matrix of C with three leading entries. Hence, r(C) = 3 and
null(C) = 4 − 3 = 1. Because columns 1, 2, and 4 of C* contain the leading entries, columns 1, 2, and 4 of C,

( 0 ),   ( 4 ),   and   ( 1 ),
( 0 )    ( 2 )          ( 1 )
( 1 )    ( 1 )          ( 4 )
( 2 )    ( 0 )          ( 3 )

are the pivot columns of C.
We state the following result without proof. The result shows that the rank of A T is equal to the rank of A.
Theorem 2.5.5.

Let A be an m × n matrix. Then

1. r(A^T) = r(A);
2. null(A^T) = m − r(A).
For some matrices, you can find the rank of A T and then use Theorem 2.5.5 to get the rank of A.
Example 2.5.3.

Find the rank of

A = ( 1 0 0 0 ).
    ( 1 2 0 0 )
    ( 1 1 1 −1 )
    ( 2 −1 0 0 )
Solution
A^T = ( 1 1 1 2 )    R_3(1)+R_4 →  ( 1 1 1 2 ).
      ( 0 2 1 −1 )                 ( 0 2 1 −1 )
      ( 0 0 1 0 )                  ( 0 0 1 0 )
      ( 0 0 −1 0 )                 ( 0 0 0 0 )

The last matrix is a row echelon matrix and contains three leading entries. Hence, r(A^T) = 3. By Theorem
2.5.5, r(A) = r(A^T) = 3.

In Example 2.5.3, if we used row operations on A to find the rank of A, we would need five
row operations, while we only need one row operation on A^T to get the rank of A.
Exercises
1. For each of the following matrices, find its rank, nullity, and pivot columns. Clearly mark every leading
entry and use the symbol ↓ to show the direction of eliminating nonzero
entries below leading entries.
A = ( 1 0 0 1 ),   B = ( 0 2 −1 ),   C = ( 1 4 2 1 ),
    ( 0 −1 0 0 )       ( 2 −1 3 )        ( 0 2 1 1 )
    ( 0 0 0 0 )                          ( 1 0 1 4 )
                                         ( 2 0 1 −3 )

D = ( −1 1 0 ),   E = ( 1 −2 −1 0 ),
    ( 1 −2 0 )        ( −2 1 1 0 )
    ( 2 1 1 )         ( 0 0 0 1 )
    ( 0 −1 0 )

F = ( 1 −1 0 1 ),    G = ( 0 2 −1 0 −1 ).
    ( 0 1 0 1 )          ( −2 0 3 2 7 )
    ( 1 0 0 2 )          ( −2 3 0 1 2 )
    ( 0 0 −1 −1 )        ( 5 5 −1 1 1 )
2.6 Elementary matrices
Definition 2.6.1.
An m × m matrix is called an elementary matrix if it can be obtained by performing a single row operation
to the m × m identity matrix.
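Following the definition, each kind of elementary matrix can be generated by applying the corresponding row operation to the identity matrix; a sketch with rows indexed from 1 (helper names are ours):

```python
def identity(m):
    """The m x m identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def elementary_scale(m, i, c):
    """E_i(c): apply R_i(c) to I_m."""
    E = identity(m)
    E[i - 1][i - 1] = c
    return E

def elementary_add(m, i, c, j):
    """E_{i(c)+j}: apply R_i(c) + R_j to I_m (entry c lands in row j, column i)."""
    E = identity(m)
    E[j - 1][i - 1] = c
    return E

def elementary_swap(m, i, j):
    """E_{i,j}: apply R_{i,j} to I_m."""
    E = identity(m)
    E[i - 1], E[j - 1] = E[j - 1], E[i - 1]
    return E

assert elementary_scale(3, 1, 2) == [[2, 0, 0], [0, 1, 0], [0, 0, 1]]
assert elementary_add(3, 1, -2, 2) == [[1, 0, 0], [-2, 1, 0], [0, 0, 1]]
assert elementary_swap(3, 2, 3) == [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```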
Example 2.6.1.

Show that the following matrices are elementary.

E1 = ( 2 0 0 ),   E2 = ( 1 0 0 ),    E3 = ( 1 0 0 ),
     ( 0 1 0 )         ( −2 1 0 )         ( 0 0 1 )
     ( 0 0 1 )         ( 0 0 1 )          ( 0 1 0 )

E4 = ( 1 0 0 0 ),   E5 = ( 1 0 4 0 ),   E6 = ( 1 0 0 0 ).
     ( 0 1 0 0 )         ( 0 1 0 0 )         ( 0 0 0 1 )
     ( 0 0 3 0 )         ( 0 0 1 0 )         ( 0 0 1 0 )
     ( 0 0 0 1 )         ( 0 0 0 1 )         ( 0 1 0 0 )
Solution
I3 = ( 1 0 0 )   R_1(2) →  ( 2 0 0 ) = E1.
     ( 0 1 0 )             ( 0 1 0 )
     ( 0 0 1 )             ( 0 0 1 )

I3 = ( 1 0 0 )   R_1(−2)+R_2 →  ( 1 0 0 ) = E2.
     ( 0 1 0 )                  ( −2 1 0 )
     ( 0 0 1 )                  ( 0 0 1 )

I3 = ( 1 0 0 )   R_{2,3} →  ( 1 0 0 ) = E3.
     ( 0 1 0 )              ( 0 0 1 )
     ( 0 0 1 )              ( 0 1 0 )
The matrices E1, E2, and E3 are 3 × 3 elementary matrices. Similarly, E4, E5, and E6 are obtained by
applying the row operations R_3(3), R_3(4)+R_1, and R_{2,4} to I4, respectively, and so they are elementary
matrices.
The elementary matrices E 1 − E 6 in Example 2.6.1 can be changed back to the corresponding identity
matrices by performing a suitable single row operation. For example,
E1 = ( 2 0 0 )   R_1(1/2) →  ( 1 0 0 ) = I3.
     ( 0 1 0 )               ( 0 1 0 )
     ( 0 0 1 )               ( 0 0 1 )

E2 = ( 1 0 0 )    R_1(2)+R_2 →  ( 1 0 0 ) = I3.
     ( −2 1 0 )                 ( 0 1 0 )
     ( 0 0 1 )                  ( 0 0 1 )

E3 = ( 1 0 0 )   R_{2,3} →  ( 1 0 0 ) = I3.
     ( 0 0 1 )              ( 0 1 0 )
     ( 0 1 0 )              ( 0 0 1 )
From the above, we see that the row operations are invertible. Hence, we introduce the definition of inverse
row operation.
Definition 2.6.2.

i. R_i(1/c) is the inverse row operation of R_i(c).
ii. R_i(−c) + R_j is the inverse row operation of R_i(c) + R_j.
iii. R_{i,j} is the inverse row operation of R_{i,j} itself.

We use the symbols E_i(c), E_{i(c)+j}, and E_{i,j} to denote the elementary matrices obtained by performing the row
operations R_i(c), R_i(c) + R_j, and R_{i,j}, respectively, on the m × m identity matrix Im. Then we have

i. Im  R_i(c) →  E_i(c)  R_i(1/c) →  Im.
ii. Im  R_i(c)+R_j →  E_{i(c)+j}  R_i(−c)+R_j →  Im.
iii. Im  R_{i,j} →  E_{i,j}  R_{i,j} →  Im.
Example 2.6.2.

( 1 0   R2(5)→    ( 1 0   R2(1/5)→   ( 1 0
  0 1 )             0 5 )              0 1 ).

( 1 0   R1(2)+R2→ ( 1 0   R1(−2)+R2→ ( 1 0
  0 1 )             2 1 )              0 1 ).

( 1 0   R1,2→     ( 0 1   R1,2→      ( 1 0
  0 1 )             1 0 )              0 1 ).
It is easy to show
(2.6.1)
Ei(c) Ei(1/c) = Im;  Ei(c)+j Ei(−c)+j = Im;  Ei,j Ei,j = Im.
The following result shows that if a single row operation is applied to an m × n matrix A, then the resulting matrix is equal to the product EA, where E is the elementary matrix obtained by performing the same row operation on the m × m identity matrix Im.

Theorem 2.6.1.

Let A be an m × n matrix, let r be a single row operation, and let E be the elementary matrix obtained by applying r to Im. If applying r to A produces the matrix B, then B = EA.
Proof

i.
( 1 ⋯ 0 ⋯ 0   ( a11 a12 ⋯ a1n     (  a11  a12 ⋯  a1n
  ⋮    ⋮    ⋮     ⋮   ⋮       ⋮         ⋮    ⋮        ⋮
  0 ⋯ c ⋯ 0     ai1 ai2 ⋯ ain   =   cai1 cai2 ⋯ cain  .
  ⋮    ⋮    ⋮     ⋮   ⋮       ⋮         ⋮    ⋮        ⋮
  0 ⋯ 0 ⋯ 1 )   am1 am2 ⋯ amn )      am1  am2 ⋯  amn )

ii.
( 1 ⋯ 0 ⋯ 0 ⋯ 0   ( a11 ⋯ a1n     ( a11      ⋯ a1n
  ⋮    ⋮    ⋮    ⋮      ⋮      ⋮         ⋮            ⋮
  0 ⋯ 1 ⋯ 0 ⋯ 0     ai1 ⋯ ain       ai1      ⋯ ain
  ⋮    ⋮    ⋮    ⋮      ⋮      ⋮   =     ⋮            ⋮
  0 ⋯ c ⋯ 1 ⋯ 0     aj1 ⋯ ajn       cai1+aj1 ⋯ cain+ajn
  ⋮    ⋮    ⋮    ⋮      ⋮      ⋮         ⋮            ⋮
  0 ⋯ 0 ⋯ 0 ⋯ 1 )   am1 ⋯ amn )     am1      ⋯ amn )

iii.
( 1 ⋯ 0 ⋯ 0 ⋯ 0   ( a11 ⋯ a1n     ( a11 ⋯ a1n
  ⋮    ⋮    ⋮    ⋮      ⋮      ⋮        ⋮      ⋮
  0 ⋯ 0 ⋯ 1 ⋯ 0     ai1 ⋯ ain       aj1 ⋯ ajn
  ⋮    ⋮    ⋮    ⋮      ⋮      ⋮   =    ⋮      ⋮
  0 ⋯ 1 ⋯ 0 ⋯ 0     aj1 ⋯ ajn       ai1 ⋯ ain
  ⋮    ⋮    ⋮    ⋮      ⋮      ⋮        ⋮      ⋮
  0 ⋯ 0 ⋯ 0 ⋯ 1 )   am1 ⋯ amn )     am1 ⋯ amn )
Example 2.6.3.
Let

I = ( 1 0 0        A = ( 0 1  1
      0 1 0              1 2 −3
      0 0 1 ) and        2 1  1 ).

i. Apply R2(−2) to I and A:

I = ( 1 0 0   R2(−2)→ ( 1  0 0
      0 1 0             0 −2 0
      0 0 1 )           0  0 1 ) = E1

A = ( 0 1  1   R2(−2)→ (  0  1 1
      1 2 −3             −2 −4 6
      2 1  1 )            2  1 1 ) = B1.
Show that E 1A = B 1.
ii. Apply R 2( − 2) + R 3 to I and A:
I = ( 1 0 0   R2(−2)+R3→ ( 1  0 0
      0 1 0                0  1 0
      0 0 1 )              0 −2 1 ) = E2

A = ( 0 1  1   R2(−2)+R3→ ( 0  1  1
      1 2 −3                1  2 −3
      2 1  1 )              0 −3  7 ) = B2.
Show that E 2A = B 2.
iii. Apply R 1 , 2 to I and A:
I = ( 1 0 0   R1,2→ ( 0 1 0
      0 1 0           1 0 0
      0 0 1 )         0 0 1 ) = E3

A = ( 0 1  1   R1,2→ ( 1 2 −3
      1 2 −3           0 1  1
      2 1  1 )         2 1  1 ) = B3.
Show that E 3A = B 3.
Solution
E1A = ( 1  0 0   ( 0 1  1     (  0  1 1
        0 −2 0     1 2 −3   =   −2 −4 6   = B1.
        0  0 1 )   2 1  1 )      2  1 1 )

E2A = ( 1  0 0   ( 0 1  1     ( 0  1  1
        0  1 0     1 2 −3   =   1  2 −3   = B2.
        0 −2 1 )   2 1  1 )     0 −3  7 )

E3A = ( 0 1 0    ( 0 1  1     ( 1 2 −3
        1 0 0      1 2 −3   =   0 1  1   = B3.
        0 0 1 )    2 1  1 )     2 1  1 )
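The verification in Example 2.6.3 can also be carried out numerically; the following NumPy sketch (illustrative, not from the book) checks part (ii), that applying R2(−2) + R3 directly to A gives the same matrix as the product E2A:

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 2, -3],
              [2, 1, 1]], dtype=float)

# E2 from Example 2.6.3: I3 after R2(-2) + R3
E2 = np.eye(3)
E2[2] += -2 * E2[1]

# Applying the row operation directly to A ...
B2 = A.copy()
B2[2] += -2 * B2[1]

# ... gives the same matrix as the product E2 @ A (Theorem 2.6.1).
assert np.allclose(E2 @ A, B2)
print(E2 @ A)
```

The assertion is exactly the statement E2A = B2 of the example.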
By using Theorem 2.6.1, we can prove the following result, which relates a matrix to the matrix obtained from it by applying several row operations in succession.
Theorem 2.6.2.
If we apply k row operations to an m × n matrix A and denote the resulting matrix by B, then

B = EkEk−1⋯E2E1A,

where Ei is the elementary matrix corresponding to the ith row operation for i = 1, ⋯ , k.

Proof

We denote the row operations by r1, r2, ⋯ , rk in order. Then

A r1→ E1A r2→ E2(E1A) = E2E1A ⋯ rk→ Ek(Ek−1⋯E2E1A) = B.
To understand Theorem 2.6.2 , we give the following example. We use two row operations to change A to
B as follows:
A = ( 0  2 −1   R1,2→ ( 1 −1  3   R1(−2)+R3→ ( 1 −1  3
      1 −1  3           0  2 −1                0  2 −1   = B.
      2  0  1 )         2  0  1 )              0  2 −5 )
I = ( 1 0 0   R1,2→      ( 0 1 0
      0 1 0                1 0 0
      0 0 1 )              0 0 1 ) = E1.

I = ( 1 0 0   R1(−2)+R3→ (  1 0 0
      0 1 0                 0 1 0
      0 0 1 )              −2 0 1 ) = E2.
By Theorem 2.6.2 , we obtain B = E 2E 1A. In the following, we directly verify that the result B = E 2E 1A
holds.
E2E1A = (  1 0 0   ( 0 1 0   ( 0  2 −1
           0 1 0     1 0 0     1 −1  3
          −2 0 1 )   0 0 1 )   2  0  1 )

      = (  1 0 0   ( 1 −1  3     ( 1 −1  3
           0 1 0     0  2 −1   =   0  2 −1   = B.
          −2 0 1 )   2  0  1 )     0  2 −5 )
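The composition B = E2E1A is easy to check in NumPy as well (illustrative sketch; the order of the factors matters, since E1 acts on A first):

```python
import numpy as np

A = np.array([[0, 2, -1],
              [1, -1, 3],
              [2, 0, 1]])

E1 = np.eye(3)[[1, 0, 2]]         # R1,2: interchange rows 1 and 2 of I3
E2 = np.eye(3)
E2[2, 0] = -2.0                   # R1(-2) + R3

B = E2 @ E1 @ A                   # E1 acts first, then E2 (Theorem 2.6.2)
print(B)
```

This reproduces the matrix B obtained above by the two row operations.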
We end this section with the ranks of elementary matrices. By Theorem 2.5.4 , we have the following
result.
Corollary 2.6.1.

Every m × m elementary matrix E satisfies r(E) = m.

Proof

Because E and Im are row equivalent and r(Im) = m, it follows from Theorem 2.5.4 that r(E) = m.

By Theorem 2.6.2, we obtain the following result, which shows that the rank of a product of finitely many m × m elementary matrices is m.

Theorem 2.6.3.

If E = EkEk−1⋯E2E1, where each Ei is an m × m elementary matrix, then r(E) = m.
Proof
By Theorem 2.6.2 , E and E1 are row equivalent. By Theorem 2.5.4 and Corollary 2.6.1 ,
r(E) = r(E 1) = m.
Exercises
1. Determine which of the following matrices are elementary matrices.
1. ( 2 0
     0 1 )

2. ( 0 1
     1 0 )

3. ( 1 5
     0 1 )

4. ( 2 0
     0 2 )

5. ( 0 2
     2 0 )

6. (  1 0
     −2 1 )

7. ( 0 0 1
     0 1 0
     1 0 0 )

8. (  1 0 0
     −1 1 0
      0 0 1 )

9. (  1 0 0
     −1 1 0
      1 0 1 )

10. ( 1 0 0 0
      0 1 0 0
      0 0 5 0
      0 0 0 1 )

11. ( 1 0 0 0
      0 1 0 0
      0 0 0 0
      0 0 0 1 )

12. ( 1 0 0 0
      0 0 0 1
      0 0 1 0
      0 1 0 0 )
2. Use row operations to change the above matrices (1), (2), (3), (6), (7), (8), (10), and (12) to an identity
matrix.
3. Let I = ( 1 0
             0 1 ). Verify the following assertions.

1. E2(5)E2(1/5) = I;
2. E1(2)+2 E1(−2)+2 = I;
3. E1,2E1,2 = I.

A = ( 0  2 −1         B = ( 1 −1  3
      1 −1  3 ) and         0  2 −1
      2  0  1 )             0  0 −4 ).
2.7 The inverse of a square matrix
Definition 2.7.1.

Let A be an n × n matrix. If there exists an n × n matrix B such that

AB = In,

then A is said to be invertible (or nonsingular) and B is called the inverse of A. We write B = A−1. If A is not invertible, then A is said to be singular.
We only introduce the notion of inverse matrices for square matrices. Therefore, if a matrix is not a square matrix, then it has no inverses.
By Definition 2.7.1 , if A is invertible, then AA − 1 = I n. Moreover, A is not invertible if and only if for each n × n matrix B, AB ≠ I n.
We state the following theorem without proof, which provides some equivalent conditions for inverse matrices.

Theorem 2.7.1.

Let A and B be n × n matrices. Then the following statements are equivalent.

i. AB = In.
ii. BA = In.
iii. AB = BA = In.
By Theorem 2.7.1 , we can show that if A is invertible, then its inverse is unique. Indeed, assume that there are two n × n matrices B 1 and B 2 satisfying AB 1 = I n and
AB 2 = I n. Then, by Theorem 2.7.1 (ii), B 1A = I n. Hence,
B 1 = B 1I n = B 1(AB 2) = (B 1A)B 2 = I nB 2 = B 2.
Example 2.7.1.

(1) Let

A = (  2 −5        B = ( 3 5
      −1  3 ) and        1 2 ).

Show that B is the inverse of A.

(2) Show that

( 0 0        ( 1 1
  0 0 ) and    1 1 )

are not invertible.
Solution

(1) Because

AB = (  2 −5   ( 3 5     ( 1 0
       −1  3 )   1 2 ) =   0 1 ) = I2,

by Definition 2.7.1, B is the inverse of A.

(2) Let

( b11 b12
  b21 b22 )

be a 2 × 2 matrix. Because

( 0 0   ( b11 b12     ( 0 0     ( 1 0
  0 0 )   b21 b22 ) =   0 0 ) ≠   0 1 ),

by Definition 2.7.1,

( 0 0
  0 0 )

is not invertible. Because

( 1 1   ( b11 b12     ( b11+b21 b12+b22     ( 1 0
  1 1 )   b21 b22 ) =   b11+b21 b12+b22 ) ≠   0 1 ),

by Definition 2.7.1,

( 1 1
  1 1 )

is not invertible.
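Both parts of Example 2.7.1 can be spot-checked in NumPy (illustrative code, not part of the text); `np.linalg.inv` raises `LinAlgError` on a singular matrix, mirroring the argument that no B with AB = I exists:

```python
import numpy as np

A = np.array([[2, -5],
              [-1, 3]])
B = np.array([[3, 5],
              [1, 2]])

# B is the inverse of A exactly when A @ B is the identity matrix.
assert np.array_equal(A @ B, np.eye(2, dtype=int))

# The all-ones matrix of part (2) is singular.
ones = np.ones((2, 2))
try:
    np.linalg.inv(ones)
except np.linalg.LinAlgError:
    print("singular, as expected")
```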
Example 2.7.2.
Let a1, a2, ⋯ , an be nonzero numbers. Show that

( a1 0  ⋯ 0   −1    ( 1/a1  0   ⋯  0
  0  a2 ⋯ 0      =    0    1/a2 ⋯  0
  ⋮  ⋮  ⋱ ⋮           ⋮     ⋮   ⋱  ⋮
  0  0  ⋯ an )        0     0   ⋯ 1/an ).
Solution
Because ai ≠ 0 for each i ∈ In, diag(a1−1, a2−1, ⋯ , an−1) exists. It is easy to verify that

diag(a1, a2, ⋯ , an) diag(a1−1, a2−1, ⋯ , an−1) = In.

In Examples 2.7.1 and 2.7.2, we used Definition 2.7.1 to verify whether a matrix is invertible. But Definition 2.7.1 is not a practical method for deciding whether a square matrix of large size is invertible. Later, we shall study other useful methods, such as ranks and determinants, to determine whether square matrices are invertible.
By Definition 2.7.1 and (2.6.1) , we obtain the following result, which shows that every elementary matrix is invertible.
Theorem 2.7.2.
Every elementary matrix is invertible and its inverse is an invertible elementary matrix.
Theorem 2.7.3.
Assume that B is a row echelon matrix of an n × n matrix A. Then there exists an invertible n × n matrix E = E kE k − 1⋯E 2E 1 such that B = EA, where E i is an n × n
elementary matrix for i = 1, ⋯ , k.
Proof
By Theorem 2.6.2 , there exists an n × n matrix E = E kE k − 1⋯E 2E 1 such that B = EA. By Theorem 2.7.2 , E i is an n × n invertible matrix and by Theorem 2.7.4 , E is
invertible.
By Definition 2.7.1 , we prove the following result, which will be used to prove Theorem 2.7.3 .
Theorem 2.7.4.

Assume that A and B are invertible n × n matrices. Then AB is invertible and

(AB)−1 = B−1A−1.

Proof

Because

(AB)(B−1A−1) = A(BB−1)A−1 = AInA−1 = AA−1 = In,

by Definition 2.7.1, AB is invertible and (AB)−1 = B−1A−1.
Now, we show how to find the inverse matrix of a 2 × 2 matrix A given in (1.2.4) .
Theorem 2.7.5.

Let A = ( a b
          c d ) be the same as in (1.2.4). Then the following assertions hold.

1. A is invertible if and only if |A| ≠ 0.
2. If A is invertible, then
(2.7.1)
A−1 = (1/|A|) (  d −b
                −c  a ).
Proof
Let
(2.7.2)
B = ( x1 x2
      y1 y2 ).

Because

AB = ( a b   ( x1 x2     ( ax1+by1 ax2+by2     ( 1 0
       c d )   y1 y2 ) =   cx1+dy1 cx2+dy2 ) =   0 1 ),

we obtain
(2.7.3)
ax1 + by1 = 1, cx1 + dy1 = 0.
(2.7.4)
ax2 + by2 = 0, cx2 + dy2 = 1.

By (2.7.3), we obtain adx1 + bdy1 = d and bcx1 + bdy1 = 0. Subtracting the above two equations implies (ad − bc)x1 = d, that is, |A|x1 = d. Also, by (2.7.3), we obtain acx1 + bcy1 = c and acx1 + ady1 = 0. Subtracting the above two equations implies |A|y1 = −c. Similarly, applying the above method to (2.7.4), we obtain |A|x2 = −b and |A|y2 = a. Hence, if AB = I, then
(2.7.5)
|A|x1 = d, |A|y1 = −c, |A|x2 = −b, and |A|y2 = a,
that is, the condition (2.7.5) is a necessary condition for AB = I.
(1) Assume that A is invertible. By Example 2.7.1 (2), A is not a zero matrix. So one of a, b, c, d is not zero. This, together with (2.7.5) , implies |A| ≠ 0. Conversely,
assume that |A| ≠ 0. By (2.7.5) ,
(2.7.6)
x1 = d/|A|, y1 = −c/|A|, x2 = −b/|A|, and y2 = a/|A|,

that is,
(2.7.7)
B = ( x1 x2               (  d −b
      y1 y2 ) = (1/|A|)     −c  a ).

It is easy to verify that AB = I2, so A is invertible and (2.7.1) holds.
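Formula (2.7.1) translates directly into code. The following Python sketch (the function name is mine, not the book's) returns `None` when |A| = 0, matching assertion 1 of Theorem 2.7.5:

```python
import numpy as np

def inv2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via formula (2.7.1); None when |A| = 0."""
    det = a * d - b * c
    if det == 0:
        return None  # A is singular
    return np.array([[d, -b], [-c, a]]) / det

print(inv2(2, -4, 1, 3))    # entries 3/10, 2/5, -1/10, 1/5
print(inv2(1, 2, -2, -4))   # None, since the determinant is 0
```

The two calls reproduce Example 2.7.3 below: the first matrix has determinant 10, the second has determinant 0.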
Example 2.7.3.
For each of the following matrices, determine whether it is invertible. If so, calculate its inverse.
A = ( 2 −4         B = (  1  2
      1  3 ) and        −2 −4 ).
Solution
Because

|A| = | 2 −4; 1 3 | = (2)(3) − (−4)(1) = 10 ≠ 0,

by Theorem 2.7.5, A is invertible and

A−1 = (1/|A|) (  3 4             (  3 4     (  3/10 2/5
               −1 2 ) = (1/10)    −1 2 ) =    −1/10 1/5 ).

Because

|B| = | 1 2; −2 −4 | = (1)(−4) − (2)(−2) = 0,

B is not invertible.
Example 2.7.4.
For each of the following matrices, find all x ∈ ℝ such that it is invertible.
A = ( 1 x^3        B = ( x 4
      1 1   ) and        1 x ).
Solution
Because if

|A| = | 1 x^3; 1 1 | = 1 − (x^3)(1) = 1 − x^3 = 0,

then x = 1. Hence, when x ≠ 1, |A| ≠ 0 and by Theorem 2.7.5, A is invertible. Because if

|B| = | x 4; 1 x | = (x)(x) − (4)(1) = x^2 − 4 = 0,

then x = −2 or x = 2. Hence, when x ≠ −2 and x ≠ 2, |B| ≠ 0 and by Theorem 2.7.5, B is invertible.
In Theorem 2.7.5(1), we used the determinant of a 2 × 2 matrix to determine whether it is invertible. The result remains true for n × n matrices; see Theorem 3.4.1. In Corollary 8.1.3, we shall show how to use the eigenvalues of a matrix to determine whether it is invertible. Here we show how to use the rank of an n × n matrix to determine whether it is invertible.

Theorem 2.7.6.

An n × n matrix A is invertible if and only if r(A) = n.
Proof
(1) Assume that A is invertible. Then there exists an n × n matrix B such that AB = I. If r(A) < n, then by Theorem 2.7.3 , there exists an invertible matrix E such that
C = EA, where C is the reduced row echelon matrix of A. By Theorem 2.6.3 , r(E) = n and by Theorem 2.5.4 , r(C) = r(A) < n. This implies that the rows from r(A) + 1
to n of the reduced row echelon matrix C are zero rows and thus, the rows from r(A) + 1 to n of CB are zero rows. By Theorem 2.5.3 , r(CB) ≤ r(A) < n. On the other
hand, because AB = I, we have CB = EAB = E, so r(CB) = r(E) = n, a contradiction.
(2) Assume that r(A) = n and B is the reduced row echelon matrix of A. Then r(B) = n and B = I n. By Theorem 2.7.3 , there exists an invertible matrix E such that
I n = EA and A is invertible.
By Theorem 2.7.5 (1), we see that when n = 2, the determinant of A can be used to justify whether A is invertible. The result remains true when n ≥ 3, see Theorem
3.4.1 (i).
Example 2.7.5.

For each of the following matrices, use row operations to find its rank and determine whether it is invertible.

A = ( 1 1 1        B = ( 1  1 1
      0 1 1              0 −1 0
      1 0 1 ) and        1  0 1 ).

Solution

A = ( 1 1 1   R1(−1)+R3→ ( 1  1 1   R2(1)+R3→ ( 1 1 1
      0 1 1                0  1 1               0 1 1  .
      1 0 1 )              0 −1 0 )             0 0 1 )

Hence, r(A) = 3 and by Theorem 2.7.6, A is invertible.

B = ( 1  1 1   R1(−1)+R3→ ( 1  1 1   R2(−1)+R3→ ( 1  1 1
      0 −1 0                0 −1 0                0 −1 0  .
      1  0 1 )              0 −1 0 )              0  0 0 )

Hence, r(B) = 2 < 3 and by Theorem 2.7.6, B is not invertible.
Corollary 2.7.1.
1. An n × n matrix is invertible if and only if its reduced row echelon matrix is the identity matrix In.
2. An n × n matrix is not invertible if and only if its row echelon matrix contains at least one zero row.
Proof
We only prove (1). Assume that B is the reduced row echelon matrix of A. By Theorem 2.5.4 , r(A) = r(B). If A is invertible, it follows from Theorem 2.7.6 that
r(A) = n. Hence, r(B) = n and B = I n. Conversely, if the reduced row echelon matrix of A is the identity matrix I n, then by Theorem 2.7.3 , there exists an invertible
matrix E such that I n = EA and A is invertible.
Corollary 2.7.2.
An n × n matrix is invertible if and only if there exists a finite sequence of elementary matrices F 1, F 2, ⋯ , F k − 1, F k such that
A = F 1F 2⋯F k − 1F k.
Proof
By Theorem 2.7.3 and Corollary 2.7.1 , A is invertible if and only if there exists an invertible matrix E = E kE k − 1⋯E 2E 1 such that I n = EA, where E i is an elementary
matrix for i = 1, ⋯ , k if and only if
A = E1^(−1) E2^(−1) ⋯ Ek^(−1).

Let Fi = Ei^(−1) for i ∈ {1, 2, ⋯ , k}. By Theorem 2.7.2, Fi is an invertible elementary matrix for i ∈ {1, 2, ⋯ , k}, and A = F1F2⋯Fk−1Fk.
Corollary 2.7.3.
Let A and B be m × n matrices. Assume that P is an m × m invertible matrix such that B = PA. Thenr(A) = r(B).
Proof
Because P is an m × m invertible matrix, by Corollary 2.7.2 , there exists a finite sequence of m × m elementary matrices F 1, F 2, ⋯ , F k − 1, F k such that
P = F 1F 2⋯F k − 1F k and thus, B = F 1F 2⋯F k − 1F kA. This shows that we use k row operations to change A to B and A and B are row equivalent. By Theorem 2.5.4 ,
r(A) = r(B).
The following result shows that the converse of Theorem 2.7.4 holds.
Theorem 2.7.7.
Assume that A and B are n × n matrices and AB is invertible. Then both A and B are invertible.
Proof
Let C = AB. Because AB is invertible, by Theorem 2.7.6 , r(C) = n. Let D be the reduced row echelon matrix of A. By Theorem 2.7.3 , there exists an invertible
matrix E such that D = EA. Hence,
(2.7.8)
EC = EAB = DB.
By Theorem 2.5.4 , r(A) = r(D) and r(DB) = r(C) = n. This implies r(D) = n. In fact, if r(D) < n, then because D is a row echelon matrix, by Corollary 2.7.1 (2), D
contains at least one zero row, so the last row of D must be a zero row. By (2.2.5) , we see that the last row of DB is a zero row. By Theorem 2.5.3 , we have
r(DB) < n, a contradiction. Hence, r(A) = r(D) = n. By Theorem 2.7.6 , A is invertible. Now, because A is invertible, by Corollary 2.7.1 (1), D = I and by (2.7.8) ,
EC = B. By Theorem 2.5.4 , r(B) = r(C) = n and by Theorem 2.7.6 , B is invertible.
If A is invertible and E = EkEk−1⋯E2E1 is an invertible matrix with EA = In (Theorem 2.7.3), then applying the corresponding row operations to the augmented matrix (A | In) gives

E(A | In) = (EA | EIn) = (In | A−1).

Theorem 2.7.8.

i. If (A | In) can be changed by row operations to a matrix of the form (In | B), then A is invertible and A−1 = B.
ii. If (A | In) can be changed by row operations to a reduced row echelon matrix (C | D), where C contains at least one zero row, then A is not invertible.

Proof

1. Assume that we use row operations to change (A | In) to a matrix (In | B). From the left sides of (A | In) and (In | B), we see that A and In are row equivalent. By Theorem 2.5.4, r(A) = r(In) = n. By Theorem 2.7.6, A is invertible. By Theorem 2.7.3, there exists an invertible matrix E = EkEk−1⋯E2E1 such that EA = In. This implies E = A−1. From the right sides of (A | In) and (In | B), we see that EIn = B. This implies B = EIn = E = A−1.

2. Assume that we use row operations to change (A | In) to the reduced row echelon matrix (C | D), where C contains at least one zero row. From the left sides of (A | In) and (C | D), we see that A and C are row equivalent. By Theorems 2.5.4 and 2.5.3, r(A) = r(C) < n. It follows from Theorem 2.7.6 that A is not invertible.
Remark 2.7.1
By Theorem 2.7.8, the method to find the inverse of an n × n matrix A is given below.

i. Change the matrix (A | In) to a row echelon matrix (C | D). If C contains at least one zero row, then A is not invertible.
ii. If C does not contain any zero rows, then A is invertible and we continue changing the row echelon matrix (C | D) to the reduced row echelon matrix, which must be of the form (In | B). Then A−1 = B.
Example 2.7.6.
For each of the following matrices, determine whether it is invertible. If so, find its inverse.
A = (  2 −3        B = ( 1 1 1        C = ( 1 −3 4
      −4  5 )            0 1 1              2 −5 7
                         1 0 1 )            0 −1 1 )
Solution
(A | I2) = (  2 −3 | 1 0   R1(2)+R2→ ( 2 −3 | 1 0   R2(−1)→ ( 2 −3 |  1  0
             −4  5 | 0 1 )             0 −1 | 2 1 )           0  1 | −2 −1 )

R2(3)+R1→ ( 2 0 | −5 −3   R1(1/2)→ ( 1 0 | −5/2 −3/2
            0 1 | −2 −1 )            0 1 |  −2   −1  ) = (I2 | A−1).

Hence, A is invertible and A−1 = ( −5/2 −3/2
                                    −2   −1  ).
(B | I3) = ( 1 1 1 | 1 0 0   R1(−1)+R3→ ( 1  1 1 |  1 0 0   R2(1)+R3→ ( 1 1 1 |  1 0 0
             0 1 1 | 0 1 0                0  1 1 |  0 1 0               0 1 1 |  0 1 0
             1 0 1 | 0 0 1 )              0 −1 0 | −1 0 1 )             0 0 1 | −1 1 1 )

R3(−1)+R1, R3(−1)+R2→ ( 1 1 0 |  2 −1 −1   R2(−1)+R1→ ( 1 0 0 |  1 −1  0
                        0 1 0 |  1  0 −1                0 1 0 |  1  0 −1
                        0 0 1 | −1  1  1 )              0 0 1 | −1  1  1 ) = (I3 | B−1).

Hence, B is invertible and B−1 = (  1 −1  0
                                    1  0 −1
                                   −1  1  1 ).

(C | I3) = ( 1 −3 4 | 1 0 0   R1(−2)+R2→ ( 1 −3  4 |  1 0 0   R2(1)+R3→ ( 1 −3  4 |  1 0 0
             2 −5 7 | 0 1 0                0  1 −1 | −2 1 0               0  1 −1 | −2 1 0  .
             0 −1 1 | 0 0 1 )              0 −1  1 |  0 0 1 )             0  0  0 | −2 1 1 )
Because the left side of the last matrix contains a row of zeros, by Theorem 2.7.8 , C is not invertible.
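The method of Remark 2.7.1 can be sketched in code. The following Python function (my own illustrative implementation, not the book's algorithm verbatim) row-reduces (A | In) and returns the inverse, or `None` when a pivot is missing:

```python
import numpy as np

def inverse_by_row_ops(A):
    """Row-reduce (A | I); return A^-1, or None when A is not invertible."""
    n = len(A)
    M = np.hstack([np.array(A, dtype=float), np.eye(n)])
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r, col] != 0), None)
        if pivot is None:
            return None                       # no pivot -> rank < n
        M[[col, pivot]] = M[[pivot, col]]     # R_col,pivot
        M[col] /= M[col, col]                 # R_col(1/pivot)
        for r in range(n):
            if r != col:
                M[r] -= M[r, col] * M[col]    # R_col(-m) + R_r
    return M[:, n:]

B = [[1, 1, 1], [0, 1, 1], [1, 0, 1]]
print(inverse_by_row_ops(B))   # the matrix B^-1 of Example 2.7.6
C = [[1, -3, 4], [2, -5, 7], [0, -1, 1]]
print(inverse_by_row_ops(C))   # None: C is not invertible
```

The two calls reproduce the conclusions of Example 2.7.6 for B and C.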
By Definition 2.7.1 , it is easy to prove the following result, which provides some properties of inverse matrices. We omit these proofs.
Theorem 2.7.9.

Let A be an invertible n × n matrix (in assertion 7, A is an arbitrary triangular matrix). Then the following assertions hold.

1. (A−1)−1 = A.
2. If k ≠ 0, then kA is invertible and (kA)−1 = (1/k)A−1.
3. A^n is invertible for n ∈ ℕ and (A^n)−1 = (A−1)^n. In this case, we write A^(−n) = (A−1)^n.
4. A^T is invertible and (A^T)−1 = (A−1)^T.
5. If A is symmetric, then A−1 is symmetric.
6. AA^T and A^T A are invertible.
7. If A is triangular, then A is invertible if and only if its diagonal entries are all nonzero.
8. If A is lower triangular, then A−1 is lower triangular. If A is upper triangular, then A−1 is upper triangular.
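Several assertions of Theorem 2.7.9 can be spot-checked numerically for a particular matrix (a check, not a proof; illustrative NumPy code):

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [3.0, 2.0]])          # symmetric and invertible (|A| = -7)
Ainv = np.linalg.inv(A)

assert np.allclose(np.linalg.inv(Ainv), A)          # assertion 1
assert np.allclose(np.linalg.inv(2 * A), Ainv / 2)  # assertion 2 with k = 2
assert np.allclose(np.linalg.inv(A.T), Ainv.T)      # assertion 4
assert np.allclose(Ainv, Ainv.T)                    # assertion 5
print("assertions 1, 2, 4, 5 hold for this A")
```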
Exercises
1. Show that B is an inverse of A if
A = (  1 1         B = ( 1/2 −1/2
      −1 1 ) and         1/2  1/2 ).
3. For each of the following matrices, determine whether it is invertible. If so, find its inverse.
A = ( 1/3  0  0        B = ( 1 0 0 0
       0  1/4 0              0 2 0 0
       0   0  4 )            0 0 3 0
                             0 0 0 4 )

C = ( 3 −4        D = (  2  4
      2  3 )           −1 −2 )
4. Let A = ( 1 x^2
             1 1   ). Find all x ∈ ℝ such that A is invertible.

5. Let A = ( 1 2        and B = ( 3 2
             1 3 )               2 2 ).

Find A−1, B−1, and (AB)−1 and verify (AB)−1 = B−1A−1.
6. For each of the following matrices, use its rank to determine whether it is invertible.
A = ( 0 2 −5        B = (  1  0 0  0        C = (  2  1  4
      1 5 −5               2  1 2  0               1 −2  4
      1 0  8 )            −2 −3 1 −2              −2 −1 −4 )
                           1 −1 0  0 )
7. For each of the following matrices, use row operations to determine if it is invertible. If so, find its inverse.
A = ( −1 2        B = ( −4  5        C = ( −2  4
       4 7 )            1 −3 )             1 −2 )

D = ( 1 −1  1        E = ( 1 1  2        F = ( 0 −8 −9
      0  1 −1              2 0 −5              2  4 −1
      2  0 −1 )            3 2  1 )            1 −2 −5 )
8. Let A = ( −1 2
              1 3 ). Find A^3, A−1, and (A−1)^3, and verify (A^3)−1 = (A−1)^3.

9. Let A = ( −4 1
              3 1 ). Find A−1 and (A^T)−1 and verify that (A^T)−1 = (A−1)^T.
10. Let A = ( 1 3
              3 2 ). Find A−1. Is A−1 symmetric?
Chapter 3 Determinants

(3.1.1)

A = ( a11 a12 a13
      a21 a22 a23
      a31 a32 a33 )

(3.1.2)
|A| = | a11 a12 a13
        a21 a22 a23
        a31 a32 a33 | ,

and
(3.1.3)
|A| = (a11a22a33 + a21a32a13 + a31a12a23) − (a13a22a31 + a23a32a11 + a33a21a12).
Note that the three numbers in each of the terms are from different rows and different columns of A.
Example 3.1.1.

Evaluate

|A| = | 1 −2 −1
        0  2  3
        4 −5 −3 |.

Solution

By (3.1.3), we obtain

|A| = [(1)(2)(−3) + (0)(−5)(−1) + (4)(−2)(3)] − [(−1)(2)(4) + (3)(−5)(1) + (−3)(0)(−2)]
    = −30 − (−23) = −7.
This gives us another way to compute |A | . Indeed, if we write |A| and adjoin it to its first two columns, we get
(3.1.4)

| a11 a12 a13 | a11 a12
| a21 a22 a23 | a21 a22
| a31 a32 a33 | a31 a32
Drawing an arrow from a 11 to a 33 through a 22 (from the top left to the bottom right), we see that the product of
the three numbers a 11, a 22, a 33 on the arrow is the first product in (3.1.3) . Drawing an arrow from a 12 to
a 31 through a 23 and from a 13 to a 32 through a 21, we get the other two products. Similarly, we can get the
three terms a13a22a31, a11a23a32, and a12a21a33 by drawing an arrow from a13 to a31 through a22 (from the top right to the bottom left), from a11 to a32 through a23, and from a12 to a33 through a21.

Example 3.1.2.
Evaluate

|A| = |  2  4 6
        −4  5 6
         7 −8 9 |.
Solution
By (3.1.4) , we obtain
| |
2 4 6 2 4
|A | = − 4 5 6 − 4 5 = [(2)(5)(9) + (4)(6)(7) + (6)( − 4)( − 8)] − [(6)(5)(7) + (2)(6)( − 8) + (4)( − 4)(9)] = 480.
7 −8 9 7 −8
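The six-term formula (3.1.3), often called the rule of Sarrus, is mechanical enough to code directly; a short Python sketch (illustrative, with a function name of my choosing):

```python
def det3_sarrus(A):
    """Determinant of a 3x3 matrix (nested lists) via formula (3.1.3)."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    return (a11 * a22 * a33 + a21 * a32 * a13 + a31 * a12 * a23) \
         - (a13 * a22 * a31 + a23 * a32 * a11 + a33 * a21 * a12)

print(det3_sarrus([[2, 4, 6], [-4, 5, 6], [7, -8, 9]]))    # 480
print(det3_sarrus([[1, -2, -1], [0, 2, 3], [4, -5, -3]]))  # -7
```

The two calls reproduce Examples 3.1.2 and 3.1.1, respectively.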
Before we introduce determinants for matrices of order greater than 3, we reorganize the six terms in (3.1.3). The basic idea is to use determinants of 2 × 2 matrices to compute determinants of 3 × 3 matrices.

(3.1.5)

|A| = | a11 a12 a13
        a21 a22 a23
        a31 a32 a33 |

    = a11 | a22 a23   − a12 | a21 a23   + a13 | a21 a22
            a32 a33 |         a31 a33 |         a31 a32 |

    = a11M11 − a12M12 + a13M13,

where

(3.1.6)

M11 = | a22 a23 ,  M12 = | a21 a23 ,  M13 = | a21 a22 .
        a32 a33 |          a31 a33 |          a31 a32 |
Remark 3.1.1.
M 11 is the determinant of the matrix obtained from A by deleting row 1 and column 1 of A, M 12 is the
determinant of the matrix obtained from A by deleting row 1 and column 2, and M 13 is the determinant of
the matrix obtained from A by deleting row 1 and column 3.
The expression on the right side of (3.1.5) is called an expansion of minors of the determinant |A|
corresponding to the first row of A.
Note that the signs in (3.1.5) alternate and follow the rule:

(−1)^(1+1), (−1)^(1+2), (−1)^(1+3).

These powers are obtained from the subscripts of a11, a12, and a13, respectively.

We define
(3.1.7)
A11 = (−1)^(1+1)M11, A12 = (−1)^(1+2)M12, A13 = (−1)^(1+3)M13.

By (3.1.5), we obtain
(3.1.8)
|A| = a11A11 + a12A12 + a13A13.

Definition 3.1.1.

Let Mij be the determinant of the 2 × 2 matrix obtained from A by deleting the ith row and the jth column of A. Mij is called the minor of aij or the ijth minor of A.
(3.1.9)
A ij = ( − 1) i + jM ij
The expression on the right side of (3.1.8) is called an expansion of cofactors of the determinant |A|
corresponding to the first row of A.
Similarly, by reorganizing the six terms in (3.1.3) , we obtain the expansion of minors or cofactors of the
determinant |A| corresponding to the second row or the third row or each column of A. Each of these
expressions equals |A | .
Theorem 3.1.1.

Let A be a 3 × 3 matrix. Then for each i, j ∈ {1, 2, 3},
(3.1.10)
|A| = ai1Ai1 + ai2Ai2 + ai3Ai3 = a1jA1j + a2jA2j + a3jA3j.
Example 3.1.3.
Let

A = (  2  4 6
      −4  5 6
       7 −8 9 ).

1. Find M11, M12, M13, A11, A12, A13.

Solution
1. By (3.1.6), we have

M11 = | 5 6; −8 9 | = 45 + 48 = 93, M12 = | −4 6; 7 9 | = −36 − 42 = −78, M13 = | −4 5; 7 −8 | = 32 − 35 = −3.

By (3.1.9), we get

A11 = (−1)^(1+1)M11 = M11 = 93.
A12 = (−1)^(1+2)M12 = −M12 = −(−78) = 78.
A13 = (−1)^(1+3)M13 = M13 = −3.
Example 3.1.4.
Let

A = (  2  4 6
      −4  5 6
       7 −8 9 ).

1. Find M21, M22, M23, A21, A22, A23 and compute |A| using the cofactors.
2. Find M31, M32, M33, A31, A32, A33 and compute |A| using the cofactors.
Solution
1. M21 = | 4 6; −8 9 | = 36 + 48 = 84, A21 = (−1)^(2+1)M21 = −M21 = −84.

M22 = | 2 6; 7 9 | = 18 − 42 = −24, A22 = (−1)^(2+2)M22 = M22 = −24.

M23 = | 2 4; 7 −8 | = −16 − 28 = −44, A23 = (−1)^(2+3)M23 = −M23 = 44.

Hence, |A| = a21A21 + a22A22 + a23A23 = (−4)(−84) + (5)(−24) + (6)(44) = 336 − 120 + 264 = 480.

2. M31 = | 4 6; 5 6 | = 24 − 30 = −6, A31 = (−1)^(3+1)M31 = M31 = −6.

M32 = | 2 6; −4 6 | = 12 + 24 = 36, A32 = (−1)^(3+2)M32 = −M32 = −36.

M33 = | 2 4; −4 5 | = 10 + 16 = 26, A33 = (−1)^(3+3)M33 = M33 = 26.

Hence, |A| = a31A31 + a32A32 + a33A33 = (7)(−6) + (−8)(−36) + (9)(26) = −42 + 288 + 234 = 480.
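Examples 3.1.3 and 3.1.4 illustrate that the expansion along every row gives the same determinant. A NumPy spot-check (illustrative code; `minor` is my own helper, with 0-based indices):

```python
import numpy as np

def minor(A, i, j):
    """Delete row i and column j (0-based) and return the determinant."""
    sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

A = np.array([[2, 4, 6], [-4, 5, 6], [7, -8, 9]], dtype=float)

# Cofactor expansion along each of the three rows gives the same value.
for i in range(3):
    expansion = sum(A[i, j] * (-1) ** (i + j) * minor(A, i, j) for j in range(3))
    assert round(expansion) == 480
print("every row expansion gives 480")
```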
(3.1.11)

| a b   = | a c   = ad − bc,
  c d |     b d |

that is, the determinant of a 2 × 2 matrix equals the determinant of its transpose.
Theorem 3.1.2.

Let A be a 3 × 3 matrix. Then |A^T| = |A|.

Proof

By (3.1.4) applied to A^T, we have
(3.1.12)
|A^T| = (a11a22a33 + a21a32a13 + a31a12a23) − (a31a22a13 + a11a32a23 + a21a12a33),

which consists of the same six terms as |A| in (3.1.3). Hence |A^T| = |A|.
It is easy to see that the expansion of cofactors of the determinant |A| corresponding to the ith row of A is
| |
the same as the expansion of minors or cofactors of the determinant A T corresponding to the ith column
of A T. For example, the expansion of cofactors of the determinant |A| corresponding to the first row of A
is

|A| = a11 | a22 a23   − a12 | a21 a23   + a13 | a21 a22
            a32 a33 |         a31 a33 |         a31 a32 | ,

and the expansion of cofactors of the determinant |A^T| corresponding to the first column of A^T is

|A^T| = a11 | a22 a23   − a12 | a21 a23   + a13 | a21 a22
              a32 a33 |         a31 a33 |         a31 a32 | .
The following result shows that the determinants of upper or lower triangular matrices equal the products of
entries on the main diagonals.
Theorem 3.1.3.

| a11 a12 a13     | a11 0   0
  0   a22 a23   =   a21 a22 0     = a11a22a33.
  0   0   a33 |     a31 a32 a33 |
Proof
We only prove the second equality because the first equality follows from Theorem 3.1.2 . By the
expansion of cofactors, we have
| a11 0   0
  a21 a22 0     = a11A11 + 0A12 + 0A13 = a11 | a22 0     = a11a22a33.
  a31 a32 a33 |                                a32 a33 |
Example 3.1.5.
Evaluate

|A| = | 3 −8 9          |B| = | 6  0 0
        0 −2 4 ,                3 −1 0 .
        0  0 4 |                1  0 5 |
Solution
By Theorem 3.1.3 , |A | = 3( − 2)(4) = − 24; | B | = 6( − 1)(5) = − 30.
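Theorem 3.1.3 can be checked numerically for these matrices; for a triangular matrix the determinant is simply the product of the diagonal (illustrative NumPy code):

```python
import numpy as np

A = np.array([[3, -8, 9], [0, -2, 4], [0, 0, 4]], dtype=float)  # upper triangular
B = np.array([[6, 0, 0], [3, -1, 0], [1, 0, 5]], dtype=float)   # lower triangular

assert round(np.linalg.det(A)) == np.prod(np.diag(A)) == 3 * (-2) * 4
assert round(np.linalg.det(B)) == np.prod(np.diag(B)) == 6 * (-1) * 5
print(np.prod(np.diag(A)), np.prod(np.diag(B)))   # -24.0 -30.0
```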
Exercises
1. Let A = ( a b
             c d ), where a, b, c, d ∈ ℝ. Show that

|λI − A| = λ^2 − tr(A)λ + |A|

for each λ ∈ ℝ.
2. Let A1 = ( 2  4        A2 = (  1 2        A3 = ( −1  1
              3 −2 ),          −1 4 ),            −1 −3 ).

For each i = 1, 2, 3, find all λ ∈ ℝ such that |λI − Ai| = 0.
3. Use (3.1.3) and (3.1.4) to calculate each of the following determinants.
|A| = |  1  2 3        |B| = | 2 −2 4        |C| = | 0 −1 3
        −2  1 2 ,              4  3 1 ,              4  1 2
         3 −1 4 |              0  1 2 |              0  0 1 |
4. Compute the determinant |A| by using its expansion of cofactors corresponding to each of its rows,
where
|A| = |  1  2 3
        −2  1 2 .
         3 −1 4 |

|A| = | 2 3 −4        |B| = | −2 0 0
        0 6  0 ,               1 1 0
        0 0  5 |               0 3 8 |
3.2 Determinants of n × n matrices
(3.2.1)

Definition 3.2.1.

Let A = (aij) be an n × n matrix. Let Mij be the determinant of the (n − 1) × (n − 1) matrix obtained from A by deleting the ith row and the jth column of A. Mij is called the minor of aij or the ijth minor of A.

(3.2.2)

Aij = (−1)^(i+j) Mij
Example 3.2.1.
Let
A = ( 1 −3 5  6
      2  4 0  3
      1  5 9 −2
      4  0 2  7 ).

Find M32, A32, M24, and A24.
Solution
M32 = | 1 5 6; 2 0 3; 4 2 7 | = 8, A32 = (−1)^(3+2)M32 = −M32 = −8,

M24 = | 1 −3 5; 1 5 9; 4 0 2 | = −192, A24 = (−1)^(2+4)M24 = M24 = −192.
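Minors and cofactors in the sense of Definition 3.2.1 are mechanical to compute; a small NumPy helper (my own naming, with the book's 1-based indices):

```python
import numpy as np

def minor_and_cofactor(A, i, j):
    """Return (M_ij, A_ij) for 1-based indices i, j, as in Definition 3.2.1."""
    sub = np.delete(np.delete(A, i - 1, axis=0), j - 1, axis=1)
    M = round(np.linalg.det(sub))
    return M, (-1) ** (i + j) * M

A = np.array([[1, -3, 5, 6],
              [2, 4, 0, 3],
              [1, 5, 9, -2],
              [4, 0, 2, 7]], dtype=float)

print(minor_and_cofactor(A, 3, 2))   # (8, -8)
print(minor_and_cofactor(A, 2, 4))   # (-192, -192)
```

The two calls reproduce the values of Example 3.2.1.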
Definition 3.2.2.

Let A = (aij) be an n × n matrix. The determinant of A is defined by
(3.2.3)
|A| = a11A11 + a12A12 + ⋯ + a1nA1n = ∑_{k=1}^{n} a1kA1k.

The expression on the right side of (3.2.3) is called an expansion of cofactors of the determinant |A| via the first row of A.
In Definition 3.2.2, we use the cofactors of the first row of A to define the determinant |A|. Like Theorems 3.1.1 and 3.1.2, we can use the expansion of cofactors of each row or each column of A to compute the determinant |A|, but the proof of the result is difficult for n ≥ 4. Hence, we state the result without proof.
Theorem 3.2.1.
For each i, j ∈ In,

|A| = ∑_{k=1}^{n} aikAik = ∑_{k=1}^{n} akjAkj = |A^T|.
By Theorem 3.2.1 , we can choose any row of A together with its cofactors to compute the determinant of A.
Hence, the smart choice would be to choose the row or column that contains the most zero entries.
Theorem 3.2.2.

If one row of an n × n matrix A consists entirely of zeros, then |A| = 0.

Proof

Assume that all entries of the ith row are zero, that is, ai1 = ai2 = ⋯ = ain = 0. Using the expansion of cofactors of the ith row and Theorem 3.2.1, we have

|A| = ai1Ai1 + ai2Ai2 + ⋯ + ainAin = 0Ai1 + 0Ai2 + ⋯ + 0Ain = 0.
Example 3.2.2.
|A| = | 2  3 5        |B| = | 0 2 3 4        |C| = | 1  3 5 2
        0  0 0 ,              0 1 0 1                0 −1 3 4
        1 −2 4 |              0 0 1 0                2  1 9 6
                              3 1 2 0 |              3  2 4 8 |
Solution
Because the matrix A contains a zero row, |A| = 0. We use the expansion of cofactors of row 3 of B to compute |B|:

|B| = 0A31 + 0A32 + A33 + 0A34 = A33 = | 0 2 4; 0 1 1; 3 1 0 | = −6.

We use the expansion of cofactors of row 2 of C to compute |C|:

|C| = 0A21 + (−1)A22 + 3A23 + 4A24
    = −| 1 5 2; 2 9 6; 3 4 8 | − 3| 1 3 2; 2 1 6; 3 2 8 | + 4| 1 3 5; 2 1 9; 3 2 4 |
    = −20 − 12 + 192 = 160.
Similar to Theorem 3.1.3, by Theorem 3.2.1 we obtain the following result on the determinant of a lower or upper triangular matrix.

Theorem 3.2.3.

If A is an n × n lower or upper triangular matrix, then |A| = a11a22⋯ann, the product of its diagonal entries.
Proof
Let A be the lower triangular matrix. Using Theorem 3.2.1 repeatedly, we get

|A| = a11 | a22 0   ⋯ 0                 | a33 ⋯ 0
            a32 a33 ⋯ 0     = a11a22      ⋮  ⋱  ⋮     = a11a22⋯ann.
            ⋮   ⋮   ⋱  ⋮                  an3 ⋯ ann |
            an2 an3 ⋯ ann |
The second result follows from Theorem 3.2.1 and the above result.
Example 3.2.3.
Evaluate
|A| = | 5 0 0  0        |B| = | 2 1  7        |C| = | −2 3 0 1
        2 3 0  0                0 2 −5                 0 0 2 4
        1 6 7  0                0 0  1 |               0 0 1 3
        0 0 2 −1 |                                     0 0 0 2 |
Solution
By Theorem 3.2.3,

|A| = 5 × 3 × 7 × (−1) = −105,

|B| = 2 × 2 × 1 = 4,

|C| = (−2) × 0 × 1 × 2 = 0.
Exercises
1. For each of the following matrices, find M32 , M24 , A32 , A24 , M41 , M43 , A41 , A43
A = (  1 −1 2  3        B = (  1 0 2 −1
       2  2 0  3               2 0 0  1
       1  2 3 −1               1 0 1 −1
      −3  0 1  6 ) and        −1 0 1  2 ).
2. Compute each of the following determinants by using its expansion of cofactors of each row.
|A| = |  1 0 2 −1        |B| = |  1 −1 2  3
         2 0 0  1                 2  2 0  3
         1 0 1 −1 ,               1  2 3 −1 .
        −1 0 1  3 |              −3  0 1  6 |

|A| = | 0 0  0 1
        0 1  0 1
        0 0  1 0 .
        1 1 −1 0 |

|A| = | 1 0 0        |B| = | 2  0 0        |C| = | 1 2  3
        0 2 0                2 −2 0                0 1  2
        0 0 6 |              0  1 2 |              0 0 −1 |

|D| = | 1 −1 3  5 0        |E| = | 1 −1  3  5 0
        0  0 0  0 0                0  3  0  0 0
        0  0 1  2 7                0  0 −1  2 7
        5  6 3 −4 6                0  0  0 −4 6
        2 −2 6 10 0 |              0  0  0  0 2 |

|F| = |  1 0  0  0        |G| = | −2 0  0 0
        −2 2  0  0                 0 3  0 0
        −1 6 −6  0                 0 0 −1 0
         4 0  2 −1 |               0 0  0 2 |
3.3 Evaluating determinants by row operations
Suppose that a matrix B is obtained from a matrix A by row operations. Then there is a nonzero number k such that
(3.3.1)
|A| = k|B|.

How can we find the number k? To do that, we need to know the relations between |A| and |B| when B is obtained from A by a single row operation.
Theorem 3.3.1.

Let A be an n × n matrix.

1. If A Ri(c)→ B, then |A| = (1/c)|B|.
2. If A Ri,j→ B, then |A| = −|B|.
3. If A Ri(c)+Rj→ B, then |A| = |B|.
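Before the proof, the three relations can be spot-checked numerically for a particular matrix (a check, not a proof; illustrative NumPy code):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [2.0, 1.0, 1.0]])
dA = np.linalg.det(A)

B1 = A.copy(); B1[0] *= 5.0             # R1(5): |B1| = 5|A|
B2 = A.copy(); B2[[0, 2]] = B2[[2, 0]]  # R1,3: |B2| = -|A|
B3 = A.copy(); B3[2] += 4.0 * B3[0]     # R1(4) + R3: |B3| = |A|

assert np.isclose(np.linalg.det(B1), 5 * dA)
assert np.isclose(np.linalg.det(B2), -dA)
assert np.isclose(np.linalg.det(B3), dA)
print("scaling multiplies, swapping negates, replacement preserves")
```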
Proof
1. Without loss of generality, we assume that A and B are of the following forms:
A = (  a11  a12 ⋯  a1n         B = ( a11 a12 ⋯ a1n
        ⋮    ⋮        ⋮               ⋮   ⋮       ⋮
       cai1 cai2 ⋯ cain  and          ai1 ai2 ⋯ ain  .
        ⋮    ⋮        ⋮               ⋮   ⋮       ⋮
        an1  an2 ⋯  ann )             an1 an2 ⋯ ann )

Then A Ri(1/c)→ B and Aik = Bik for each k ∈ In, because the rows of A and B other than the ith row are the same. Expanding by cofactors via the ith row of each matrix, we obtain

|A| = ∑_{k=1}^{n} caikAik = c ∑_{k=1}^{n} aikBik = c|B|.
2. We first prove that the result holds when we interchange two successive rows. Let

A = ( ⋮       ⋮          ⋮          B = ( ⋮       ⋮          ⋮
      ai1     ai2     ⋯ ain               a(i+1)1 a(i+1)2 ⋯ a(i+1)n
      a(i+1)1 a(i+1)2 ⋯ a(i+1)n ,         ai1     ai2     ⋯ ain      .
      ⋮       ⋮          ⋮                ⋮       ⋮          ⋮
      an1     an2     ⋯ ann )             an1     an2     ⋯ ann )

Then A Ri,(i+1)→ B, that is, we get B by interchanging the ith row and the (i + 1)th row of A. By Theorem 3.2.1, we expand by cofactors via the ith row of A and the (i + 1)th row of B and get
(3.3.3)

|A| = ∑_{k=1}^{n} aikAik and |B| = ∑_{k=1}^{n} aikB(i+1)k,

where Aik and B(i+1)k are the cofactors of A and B, respectively. We note that the (i + 1)th row of B is (ai1 ai2 ⋯ ain). We denote by Mik the minors of A and by U(i+1)k the minors of B. Then it is easy to see that Mik = U(i+1)k for each k ∈ In. Then by (3.2.2), we have

B(i+1)k = (−1)^(i+1+k) U(i+1)k = −(−1)^(i+k) Mik = −Aik.

Hence, by (3.3.3), |B| = −∑_{k=1}^{n} aikAik = −|A|.

Now we assume
… 3/26
3.3 Evaluating determinants by row operations
⎜ ⎟
⎜ ⋮ ⋮ ⋮ ⋮ ⎟
a11 a12 ⋯ a1n ⎜ ⎟
⎛ ⎞ ⎜ a(i − 1)1 a(i − 1)2 ⋯ a(i − 1)n ⎟
⎜ ⎟
⎜ ⎟ ⎜ ⎟
⎜ ⋮ ⋮ ⋮ ⋮ ⎟ ⎜ aj1 aj2 ⋯ ajn ⎟
⎜ ⎟ ⎜ ⎟
⎜ a ai2 ⋯ ain ⎟ ⎜ ⎟
i1 ⎜ a(i + 1)1 a(i + 1)2 … a(i + 1)n ⎟
⎜ ⎟
⎜ ⎟ ⎜ ⎟
A = ⎜ ⋮ ⋮ ⋮ ⋮
⎟ and B = ⎜ ⎟.
⎜ ⎟ ⎜ ⋮ ⋮ ⋮ ⋮ ⎟
⎜ ⎟ ⎜ ⎟
⎜ a aj2 ⋯ ajn ⎟ ⎜ ⎟
⎜ j1 ⎟ ⎜ a(j − 1)1 a(j − 1)2 ⋯ a(j − 1)n ⎟
⎜ ⎟ ⎜ ⎟
⎜ ⎟ ⎜ ai1 a12 ⋯ ain ⎟
⎜ ⋮ ⋮ ⋮ ⋮ ⎟ ⎜ ⎟
⎜ ⎟
⎝ ⎠ ⎜ a(j + 1)1 a(j + 1)2 ⋯ a(j + 1)n ⎟
an1 an2 ⋯ ann ⎜ ⎟
⎜ ⎟
⎜ ⋮ ⋮ ⋮ ⋮ ⎟
⎝ ⎠
an1 an2 ⋯ ann
Then A --R_{i,j}--> B. We can obtain B from A by interchanging two successive rows several times, in the following way:
i. Move the ith row vector (ai1 ai2 ⋯ ain) to the (i + 1)th row by interchanging the ith row and the (i + 1)th row of A; then move the vector (ai1 ai2 ⋯ ain) from the (i + 1)th row to the (i + 2)th row by interchanging (ai1 ai2 ⋯ ain) and (a(i+2)1 a(i+2)2 ⋯ a(i+2)n). After j − i such interchanges, (ai1 ai2 ⋯ ain) sits in the jth row. The resulting matrix is
C := ⎛ a11      a12      ⋯  a1n      ⎞
     ⎜  ⋮        ⋮             ⋮      ⎟
     ⎜ a(i−1)1  a(i−1)2  ⋯  a(i−1)n  ⎟
     ⎜ a(i+1)1  a(i+1)2  ⋯  a(i+1)n  ⎟
     ⎜  ⋮        ⋮             ⋮      ⎟
     ⎜ a(j−1)1  a(j−1)2  ⋯  a(j−1)n  ⎟
     ⎜ aj1      aj2      ⋯  ajn      ⎟
     ⎜ ai1      ai2      ⋯  ain      ⎟
     ⎜ a(j+1)1  a(j+1)2  ⋯  a(j+1)n  ⎟
     ⎜  ⋮        ⋮             ⋮      ⎟
     ⎝ an1      an2      ⋯  ann      ⎠
and
(3.3.4)
|A| = (−1)^(j−i) |C|.
Continuing the process, we can move the row (aj1 aj2 ⋯ ajn), which now occupies the (j − 1)th row of C, back up to the ith row by (j − 1) − i interchanges of successive rows. The resulting matrix is exactly B. Moreover, we have |C| = (−1)^((j−1)−i) |B|, and hence by (3.3.4),

|A| = (−1)^(j−i) (−1)^(j−i−1) |B| = (−1)^(2(j−i)−1) |B| = −|B|.
3. i. Let

(3.3.5)
D = ⎛ a11      a12      ⋯  a1n      ⎞
    ⎜  ⋮        ⋮             ⋮      ⎟
    ⎜ a(i−1)1  a(i−1)2  ⋯  a(i−1)n  ⎟
    ⎜ ai1      ai2      ⋯  ain      ⎟
    ⎜ a(i+1)1  a(i+1)2  ⋯  a(i+1)n  ⎟
    ⎜  ⋮        ⋮             ⋮      ⎟
    ⎜ a(j−1)1  a(j−1)2  ⋯  a(j−1)n  ⎟
    ⎜ ai1      ai2      ⋯  ain      ⎟
    ⎜ a(j+1)1  a(j+1)2  ⋯  a(j+1)n  ⎟
    ⎜  ⋮        ⋮             ⋮      ⎟
    ⎝ an1      an2      ⋯  ann      ⎠,

whose ith and jth rows are both (ai1 ai2 ⋯ ain).
Note that after interchanging the ith and jth rows of D, the resulting matrix is still D. Hence, by the result (2), |D| = −|D|. This implies |D| = 0.
ii. Expanding by cofactors of D via the jth row of D given in (3.3.5) and using the result (i), we
obtain
(3.3.6)
0 = |D| = ∑_{k=1}^n a_ik A_jk,

where A_jk are the jth-row cofactors of the matrix A below, which has the same rows as D except the jth. Here let
(3.3.7)
A = ⎛ a11      a12      ⋯  a1n      ⎞
    ⎜  ⋮        ⋮             ⋮      ⎟
    ⎜ ai1      ai2      ⋯  ain      ⎟
    ⎜  ⋮        ⋮             ⋮      ⎟
    ⎜ a(j−1)1  a(j−1)2  ⋯  a(j−1)n  ⎟
    ⎜ aj1      aj2      ⋯  ajn      ⎟
    ⎜ a(j+1)1  a(j+1)2  ⋯  a(j+1)n  ⎟
    ⎜  ⋮        ⋮             ⋮      ⎟
    ⎝ an1      an2      ⋯  ann      ⎠
and
(3.3.8)
B = ⎛ a11        a12        ⋯  a1n        ⎞
    ⎜  ⋮          ⋮               ⋮        ⎟
    ⎜ ai1        ai2        ⋯  ain        ⎟
    ⎜  ⋮          ⋮               ⋮        ⎟
    ⎜ a(j−1)1    a(j−1)2    ⋯  a(j−1)n    ⎟
    ⎜ cai1+aj1   cai2+aj2   ⋯  cain+ajn   ⎟
    ⎜ a(j+1)1    a(j+1)2    ⋯  a(j+1)n    ⎟
    ⎜  ⋮          ⋮               ⋮        ⎟
    ⎝ an1        an2        ⋯  ann        ⎠.
Then A --R_i(c)+R_j--> B. Let A_jk and B_jk be the jth-row cofactors of A and B, respectively. Then by (3.3.5) , (3.3.7), and (3.3.8), we see that

(3.3.9)
B_jk = A_jk for k ∈ I_n.
Expanding by cofactors of B via the jth row of B, we have by (3.3.9) and (3.3.6) and Theorem 3.2.1 ,

|B| = ∑_{k=1}^n (c·a_ik + a_jk)B_jk = ∑_{k=1}^n (c·a_ik + a_jk)A_jk = c ∑_{k=1}^n a_ik A_jk + ∑_{k=1}^n a_jk A_jk = c·0 + |A| = |A|.
By Theorem 3.3.1 , we see that there exists a number k such that |A| = k|B|, where B is obtained from A
by several row operations, and k can be obtained from the row operations used in Theorem 3.3.1 (i) and (ii).
Before we use Theorem 3.3.1 to compute determinants, we try to understand and digest each result given
in Theorem 3.3.1 and use it to derive some results.
Hence, we can take the common factor of one row out of the determinant. Similarly, because |Aᵀ| = |A|, we can take a common factor of one column out of the determinant.
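The three rules in Theorem 3.3.1 can be checked numerically. The sketch below (not from the text; the matrix is the A of Example 3.3.1) uses a small cofactor-expansion determinant:

```python
def det(M):
    # Determinant by cofactor expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** k * M[0][k] * det([r[:k] + r[k + 1:] for r in M[1:]])
               for k in range(len(M)))

A = [[3, -3, 6], [12, 4, 16], [0, -2, 5]]

# R_1(1/3): dividing row 1 by 3 gives |A| = 3|B|  (Theorem 3.3.1 (1))
B = [[x / 3 for x in A[0]], A[1], A[2]]
assert det(A) == 3 * det(B)

# R_{1,2}: interchanging two rows changes the sign  (Theorem 3.3.1 (2))
C = [A[1], A[0], A[2]]
assert det(C) == -det(A)

# R_1(2) + R_2: adding a multiple of one row to another leaves |A| unchanged (3)
D = [A[0], [a + 2 * r for a, r in zip(A[1], A[0])], A[2]]
assert det(D) == det(A)
```

The same checks pass for any square matrix, since they are exactly the statements of the theorem.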
Example 3.3.1.
Evaluate

|A| = ∣ 3   −3  6  ∣        |B| = ∣ −1  −1  12 ∣
      ∣ 12  4   16 ∣              ∣ 2   4   6  ∣
      ∣ 0   −2  5  ∣              ∣ 0   1   6  ∣.
Solution
Taking the common factor 3 out of the first row and then 4 out of the second row, we obtain

|A| = 3 ∣ 1   −1  2  ∣        ∣ 1  −1  2 ∣
        ∣ 12  4   16 ∣ = 3(4) ∣ 3  1   4 ∣ = (12)(16) = 192.
        ∣ 0   −2  5  ∣        ∣ 0  −2  5 ∣

For |B|, we take the common factor 2 out of the second row and then 3 out of the third column:

|B| = ∣ −1      −1      12     ∣     ∣ −1  −1  12 ∣
      ∣ (2)(1)  (2)(2)  (2)(3) ∣ = 2 ∣ 1   2   3  ∣
      ∣ 0       1       6      ∣     ∣ 0   1   6  ∣

        ∣ −1  −1  (3)(4) ∣          ∣ −1  −1  4 ∣
    = 2 ∣ 1   2   (3)(1) ∣ = (2)(3) ∣ 1   2   1 ∣ = (6)(3) = 18.
        ∣ 0   1   (3)(2) ∣          ∣ 0   1   2 ∣
The following result gives the relation between |cA| and |A|.
Theorem 3.3.2.
Let A be an n × n matrix and let c be a number. Then

|cA| = cⁿ|A|.
Proof
The result holds if c = 0. We assume that c ≠ 0. Repeating (3.3.2) implies
Example 3.3.2.
Let A be a 3 × 3 matrix. Assume that |A| = 6. Compute each of the following determinants.
|2A| and ∣(3A)ᵀ∣.

Solution
By Theorem 3.3.2 ,

|2A| = 2³|A| = 8 × 6 = 48.

By Theorem 3.2.1 and Theorem 3.3.2 ,

∣(3A)ᵀ∣ = |3A| = 3³|A| = 27 × 6 = 162.
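As a quick numerical check of Theorem 3.3.2, the sketch below uses a 3 × 3 matrix of my own choosing with |A| = 6 and verifies |2A| = 2³|A| and ∣(3A)ᵀ∣ = 3³|A|:

```python
def det(M):
    # Determinant by cofactor expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** k * M[0][k] * det([r[:k] + r[k + 1:] for r in M[1:]])
               for k in range(len(M)))

# Any 3x3 matrix with |A| = 6 illustrates Example 3.3.2; this one is diagonal.
A = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]            # |A| = 6
twoA = [[2 * x for x in row] for row in A]        # 2A
assert det(twoA) == 2 ** 3 * det(A) == 48         # |2A| = 2^3 |A|

threeAT = [[3 * A[j][i] for j in range(3)] for i in range(3)]  # (3A)^T
assert det(threeAT) == 3 ** 3 * det(A) == 162     # |(3A)^T| = |3A| = 27 * 6
```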
Theorem 3.3.1 (2) means that when interchanging any two different rows of a determinant, the determinant
changes its sign.
Example 3.3.3.
Evaluate
|A| = ∣ 0  0   1 ∣
      ∣ 0  2   3 ∣
      ∣ 1  −2  5 ∣.

Solution
Interchanging the first and third rows (R_{1,3}), we obtain

|A| = ∣ 0  0   1 ∣      ∣ 1  −2  5 ∣
      ∣ 0  2   3 ∣ =  − ∣ 0  2   3 ∣ = −(1)(2)(1) = −2.
      ∣ 1  −2  5 ∣      ∣ 0  0   1 ∣
Corollary 3.3.1.
Let A be an n × n matrix.
i. If two rows of A are equal, then |A| = 0.
ii. If two columns of A are equal, then |A| = 0.
Proof
The result (i) follows from the result (i) in Theorem 3.3.1 (3) and the result (ii) follows from Theorem
3.2.1 and the result (i).
Example 3.3.4.
Evaluate
A = ∣ 1  3  −2  4 ∣        B = ∣ −1  3   2  0  3  ∣
    ∣ 2  7  −4  8 ∣            ∣ 3   1   0  2  1  ∣
    ∣ 3  9  1   5 ∣            ∣ 3   9   1  5  9  ∣
    ∣ 1  3  −2  4 ∣            ∣ 2   −3  1  2  −3 ∣
                               ∣ 1   3   0  4  3  ∣.
Solution
Because A has two equal rows, rows 1 and 4, |A| = 0. Similarly, because B has two equal columns,
columns 2 and 5, |B| = 0.
Theorem 3.3.3.
1. If two rows of an n × n matrix A are proportional, then |A| = 0.
2. If two columns of an n × n matrix A are proportional, then |A| = 0.
Proof
(1) Assume that A is an n × n matrix and the ith and jth rows are proportional. Without loss of generality, we may assume that 1 ≤ i < j ≤ n and Ri(c) = Rj for some c ≠ 0. By Theorem
3.3.1 (1) and Corollary 3.3.1 , we have
      ∣ a11   a12   ⋯  a1n  ∣       ∣ a11  a12  ⋯  a1n ∣
      ∣  ⋮     ⋮         ⋮   ∣       ∣  ⋮    ⋮        ⋮  ∣
      ∣ ai1   ai2   ⋯  ain  ∣       ∣ ai1  ai2  ⋯  ain ∣
|A| = ∣  ⋮     ⋮         ⋮   ∣ =  c ∣  ⋮    ⋮        ⋮  ∣ = 0
      ∣ cai1  cai2  ⋯  cain ∣       ∣ ai1  ai2  ⋯  ain ∣
      ∣  ⋮     ⋮         ⋮   ∣       ∣  ⋮    ⋮        ⋮  ∣
      ∣ an1   an2   ⋯  ann  ∣       ∣ an1  an2  ⋯  ann ∣
because the ith and jth rows of the last determinant are the same.
(2) The result follows from Theorem 3.2.1 and the result (1).
Example 3.3.5.
Evaluate
|A| = ∣ 3  9  1  ∣        |B| = ∣ 1   3  −2  −2 ∣
      ∣ 1  3  −2 ∣              ∣ 3   9  1   −6 ∣
      ∣ 2  6  −4 ∣              ∣ −2  6  −1  4  ∣
                                ∣ 1   1  4   −2 ∣.
Solution
Because rows 2 and 3 of A are proportional, by Theorem 3.3.3 (1), |A| = 0. Because columns 1 and 4
of B are proportional, by Theorem 3.3.3 (2), |B| = 0.
Theorem 3.3.1 (3) means that when adding the ith row multiplied by a nonzero constant c to the jth row, the determinant remains unchanged. Using Theorem 3.3.1 (3) and Theorem 3.2.2 , one can give another proof of Theorem 3.3.3 (1). Indeed, we use the row operation R_i(−c) + R_j on A. Then the jth row of the resulting matrix, denoted by B, is a zero row. By Theorem 3.2.2 , |B| = 0, and by Theorem 3.3.1 (3), |A| = |B| = 0.
Example 3.3.6.
Compute
|A| = ∣ 2   −3  5   ∣
      ∣ 1   7   2   ∣
      ∣ −4  6   −10 ∣.
Solution
|A| = ∣ 2   −3  5   ∣                 ∣ 2  −3  5 ∣
      ∣ 1   7   2   ∣ --R1(2)+R3-->  ∣ 1  7   2 ∣ = 0.
      ∣ −4  6   −10 ∣                 ∣ 0  0   0 ∣
Now, we use examples to show how to use row operations to calculate determinants. Before calculating the determinant of a matrix, it is best to check whether the matrix has a zero row or column, two equal rows or columns, or two proportional rows or columns; in each of these cases the determinant is 0 by the results above.
Example 3.3.7.
Evaluate

|A| = ∣ 0  1   5 ∣   |B| = ∣ 2  6   10  4 ∣   |C| = ∣ 1   0  0  1  ∣
      ∣ 3  −6  9 ∣         ∣ 0  −1  3   4 ∣         ∣ −2  1  0  −2 ∣
      ∣ 2  6   1 ∣         ∣ 2  1   9   6 ∣         ∣ 0   2  1  0  ∣
                           ∣ 3  2   4   8 ∣         ∣ 4   0  1  −1 ∣.
Solution
Interchanging the first and second rows and taking the common factor 3 out of the new first row, we obtain

|A| = ∣ 0  1   5 ∣     ∣ 3  −6  9 ∣        ∣ 1  −2  3 ∣
      ∣ 3  −6  9 ∣ = − ∣ 0  1   5 ∣ =  −3  ∣ 0  1   5 ∣.
      ∣ 2  6   1 ∣     ∣ 2  6   1 ∣        ∣ 2  6   1 ∣

Applying R1(−2) + R3 and then R2(−10) + R3, we obtain

|A| = −3 ∣ 1  −2  3  ∣       ∣ 1  −2  3   ∣
         ∣ 0  1   5  ∣ =  −3 ∣ 0  1   5   ∣ = (−3)(1)(1)(−55) = 165.
         ∣ 0  10  −5 ∣       ∣ 0  0   −55 ∣

Taking the common factor 2 out of the first row and then applying R1(−2) + R3 and R1(−3) + R4, we obtain

|B| = 2 ∣ 1  3   5  2 ∣      ∣ 1  3   5    2 ∣
        ∣ 0  −1  3  4 ∣ =  2 ∣ 0  −1  3    4 ∣.
        ∣ 2  1   9  6 ∣      ∣ 0  −5  −1   2 ∣
        ∣ 3  2   4  8 ∣      ∣ 0  −7  −11  2 ∣

Applying R2(−5) + R3 and R2(−7) + R4, and then R3(−2) + R4, we obtain

|B| = 2 ∣ 1  3   5    2   ∣      ∣ 1  3   5    2   ∣
        ∣ 0  −1  3    4   ∣ =  2 ∣ 0  −1  3    4   ∣ = (2)(1)(−1)(−16)(10) = 320.
        ∣ 0  0   −16  −18 ∣      ∣ 0  0   −16  −18 ∣
        ∣ 0  0   −32  −26 ∣      ∣ 0  0   0    10  ∣

Because |C| = ∣Cᵀ∣, we have

|C| = ∣ 1  −2  0  4  ∣                  ∣ 1  −2  0  4  ∣
      ∣ 0  1   2  0  ∣ --R1(−1)+R4-->  ∣ 0  1   2  0  ∣ = −5.
      ∣ 0  0   1  1  ∣                  ∣ 0  0   1  1  ∣
      ∣ 1  −2  0  −1 ∣                  ∣ 0  0   0  −5 ∣
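The strategy used in Example 3.3.7 can be coded directly: reduce to an upper triangular matrix with the operations R_{i,j} and R_i(c) + R_j, tracking sign changes from interchanges, and then multiply the diagonal entries. A sketch (the function name is my own; exact arithmetic via Fraction):

```python
from fractions import Fraction

def det_by_row_reduction(M):
    """Row-reduce to upper triangular form, tracking the sign; the
    determinant is then the signed product of the diagonal entries."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign = len(M), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)           # no pivot: a zero column, so |M| = 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]   # R_{col,pivot} flips the sign
            sign = -sign
        for r in range(col + 1, n):       # R_col(-m) + R_r leaves |M| unchanged
            m = M[r][col] / M[col][col]
            M[r] = [a - m * b for a, b in zip(M[r], M[col])]
    prod = Fraction(sign)
    for i in range(n):
        prod *= M[i][i]
    return prod

A = [[0, 1, 5], [3, -6, 9], [2, 6, 1]]
B = [[2, 6, 10, 4], [0, -1, 3, 4], [2, 1, 9, 6], [3, 2, 4, 8]]
C = [[1, 0, 0, 1], [-2, 1, 0, -2], [0, 2, 1, 0], [4, 0, 1, -1]]
print(det_by_row_reduction(A), det_by_row_reduction(B), det_by_row_reduction(C))
# 165 320 -5
```

The three results match the hand computations of Example 3.3.7.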
Corollary 3.3.2.
Let A be an n × n matrix, c ≠ 0, and i, j ∈ I_n with i ≠ j. Then
i. ∣E_i(c)A∣ = |E_i(c)||A|.
ii. ∣E_{i(c)+j}A∣ = ∣E_{i(c)+j}∣|A|.
iii. ∣E_{i,j}A∣ = |E_{i,j}||A|.
Proof
i. Let B = E_i(c)A. By Theorem 2.6.1 (i), A --R_i(c)--> B and I_n --R_i(c)--> E_i(c). By Theorem 3.3.1 ,

|A| = (1/c)|B| and |I_n| = (1/c)|E_i(c)|, that is, |E_i(c)| = c.

This implies ∣E_i(c)A∣ = |B| = c|A| = |E_i(c)||A|.
ii. Let B = E_{i(c)+j}A. By Theorem 2.6.1 (iii), A --R_i(c)+R_j--> B and I_n --R_i(c)+R_j--> E_{i(c)+j}. By Theorem 3.3.1 ,

|B| = |A| and ∣E_{i(c)+j}∣ = |I_n| = 1.

This implies
∣E_{i(c)+j}A∣ = |B| = |A| = ∣E_{i(c)+j}∣|A|.

iii. Let B = E_{i,j}A. By Theorem 2.6.1 (ii), A --R_{i,j}--> B and I_n --R_{i,j}--> E_{i,j}. By Theorem 3.3.1 ,

|B| = −|A| and |E_{i,j}| = −|I_n| = −1.

This implies ∣E_{i,j}A∣ = |B| = −|A| = |E_{i,j}||A|.
By using Corollary 3.3.2 repeatedly, it is easy to prove the following result, which gives the
relation among the determinants of an n × n matrix, its row echelon matrix, and elementary
matrices.
Theorem 3.3.4.
Let A and B be n × n matrices. Assume that there exist elementary matrices E₁, E₂, ⋯, E_k such that B = E_k E_{k−1} ⋯ E₂E₁A. Then

|B| = |E_k||E_{k−1}| ⋯ |E₂||E₁||A|.
By Theorem 3.3.1 and |I| = 1, it is easy to see that the following result on the determinants of
elementary matrices holds.
Theorem 3.3.5.
Let c ≠ 0 and i, j ∈ I_n with i ≠ j. Then |E_i(c)| = c, ∣E_{i(c)+j}∣ = 1, and |E_{i,j}| = −1. In particular, the determinant of every elementary matrix is nonzero.
At the end of the section, we generalize the formula (2.7.1) of an inverse matrix from a 2 × 2 matrix to
an n × n matrix for n ≥ 3.
Definition 3.3.1.
Let A be an n × n matrix and let A_ij denote the cofactor of a_ij. The matrix

(3.3.10)
⎛ A11  A12  ⋯  A1n ⎞
⎜ A21  A22  ⋯  A2n ⎟
⎜  ⋮    ⋮        ⋮  ⎟
⎝ An1  An2  ⋯  Ann ⎠

is called the matrix of cofactors of A. The transpose of the matrix of cofactors of A is called the adjoint of A, denoted by adj(A):

(3.3.11)
           ⎛ A11  A12  ⋯  A1n ⎞ᵀ   ⎛ A11  A21  ⋯  An1 ⎞
adj(A) =   ⎜  ⋮    ⋮        ⋮  ⎟  = ⎜  ⋮    ⋮        ⋮  ⎟.
           ⎝ An1  An2  ⋯  Ann ⎠    ⎝ A1n  A2n  ⋯  Ann ⎠
Example 3.3.8.
Find the matrix of cofactors and the adjoint of

A = ⎛ a  b ⎞.
    ⎝ c  d ⎠
Solution
Because A11 = M11 = d, A12 = −M12 = −c, A21 = −M21 = −b, and A22 = M22 = a, the
matrix of cofactors of A is
⎛ A11  A12 ⎞   ⎛ d   −c ⎞
⎝ A21  A22 ⎠ = ⎝ −b  a  ⎠
and
           ⎛ A11  A12 ⎞ᵀ   ⎛ d   −b ⎞
adj(A) =   ⎝ A21  A22 ⎠  = ⎝ −c  a  ⎠.
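Definition 3.3.1 translates directly into code. The sketch below (helper names are my own) builds the matrix of cofactors and its transpose, and checks the 2 × 2 formulas of Example 3.3.8 at sample values a = 5, b = 7, c = 2, d = 3 of my choosing:

```python
def det(M):
    # Determinant by cofactor expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** k * M[0][k] * det([r[:k] + r[k + 1:] for r in M[1:]])
               for k in range(len(M)))

def cofactor_matrix(M):
    # Entry (i, j) is the cofactor A_ij = (-1)^(i+j) M_ij, as in (3.3.10).
    n = len(M)
    return [[(-1) ** (i + j) *
             det([r[:j] + r[j + 1:] for k, r in enumerate(M) if k != i])
             for j in range(n)] for i in range(n)]

def adj(M):
    # adj(A) is the transpose of the matrix of cofactors, as in (3.3.11).
    return [list(col) for col in zip(*cofactor_matrix(M))]

a, b, c, d = 5, 7, 2, 3
M = [[a, b], [c, d]]
assert cofactor_matrix(M) == [[d, -c], [-b, a]]
assert adj(M) == [[d, -b], [-c, a]]
```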
Note that (2.7.1) involves the adjoint matrix of A given in Example 3.3.8 . By Theorem 2.7.5 and
Example 3.3.8 , we see that
(3.3.12)
A adj(A) = ⎛ a  b ⎞ ⎛ d   −b ⎞ = |A| I.
           ⎝ c  d ⎠ ⎝ −c  a  ⎠
Theorem 3.3.6.
Let A and adj(A) be the same as in (3.2.1) and (3.3.11) , respectively. Then
(3.3.13)
A adj(A) = |A| I.
Proof
Let i, j ∈ I_n. By Theorem 3.2.1 , we expand |A| via the jth row and obtain

(3.3.14)
|A| = ∑_{k=1}^n a_jk A_jk.

Replacing the jth row by the ith row of A (i ≠ j) produces a matrix with two equal rows, so by Corollary 3.3.1 (1) we obtain

(3.3.15)
∑_{k=1}^n a_ik A_jk = 0 for i ≠ j.
Let

C = (c_ij) = A adj(A).

Then by (3.3.14) and (3.3.15),

c_ij = ∑_{k=1}^n a_ik A_jk = |A| if i = j, and c_ij = 0 if i ≠ j.

Hence A adj(A) = C = |A| I.
Theorem 3.3.7.
Let A and adj(A) be the same as in (3.2.1) and (3.3.11) , respectively. If |A| ≠ 0, then
A⁻¹ = (1/|A|) adj(A).
Proof
Because |A| ≠ 0, by (3.3.13) , we have
A((1/|A|) adj(A)) = (1/|A|)[A adj(A)] = (1/|A|)(|A| I) = I.
Example 3.3.9.
Find A⁻¹ if

A = ⎛ 1  1  1 ⎞.
    ⎜ 0  1  1 ⎟
    ⎝ 1  0  1 ⎠
Solution
By (3.1.9) , we have
∣1 1∣ ∣0 1∣ ∣0 1∣
A11 = ∣ ∣ = 1, A12 = − ∣ ∣ = 1, A13 = ∣ ∣ = −1,
∣0 1∣ ∣1 1∣ ∣1 0∣
∣1 1∣ ∣1 1∣ ∣1 1∣
A21 = −∣ ∣ = −1, A22 = ∣ ∣ = 0, A23 = − ∣ ∣ = 1,
∣0 1∣ ∣1 1∣ ∣1 0∣
∣1 1∣ ∣1 1∣ ∣1 1∣
A31 = ∣ ∣ = 0, A32 = − ∣ ∣ = −1, A33 = ∣ ∣ = 1.
∣1 1∣ ∣0 1∣ ∣0 1∣
Because |A| = 1, by Theorem 3.3.7 we have

A⁻¹ = (1/|A|) adj(A) = adj(A) = ⎛ 1   −1  0  ⎞.
                                ⎜ 1   0   −1 ⎟
                                ⎝ −1  1   1  ⎠
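The computation of Example 3.3.9 can be replayed in code: form adj(A), divide by |A|, and confirm A·A⁻¹ = I. A sketch with exact arithmetic (helper names are my own):

```python
from fractions import Fraction

def det(M):
    # Determinant by cofactor expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** k * M[0][k] * det([r[:k] + r[k + 1:] for r in M[1:]])
               for k in range(len(M)))

def adj(M):
    # Transpose of the matrix of cofactors, as in (3.3.11).
    n = len(M)
    cof = [[(-1) ** (i + j) *
            det([r[:j] + r[j + 1:] for k, r in enumerate(M) if k != i])
            for j in range(n)] for i in range(n)]
    return [list(col) for col in zip(*cof)]

A = [[1, 1, 1], [0, 1, 1], [1, 0, 1]]
d = det(A)                                    # |A| = 1
Ainv = [[Fraction(x, d) for x in row] for row in adj(A)]
assert Ainv == [[1, -1, 0], [1, 0, -1], [-1, 1, 1]]

# Confirm A * A^{-1} = I.
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
assert prod == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```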
The inverse of A is the same as that of B given in Example 2.7.6 , where a different method is used.
Exercises
1. Evaluate each of the following determinants by inspection.
|A| = ∣ 1   2   −3 ∣   |B| = ∣ 2  −2  4 ∣   |C| = ∣ 1   2     3  ∣
      ∣ 0   1   2  ∣         ∣ 2  −2  4 ∣         ∣ −2  1     2  ∣
      ∣ −2  −4  6  ∣         ∣ 0  1   2 ∣         ∣ 1   −1/2  −1 ∣
|A| = ∣ 2  1   4 ∣   |B| = ∣ 0   2   3  ∣   |C| = ∣ 1   2   −3  −2 ∣
      ∣ 1  −2  4 ∣         ∣ −2  1   2  ∣         ∣ 2   1   2   0  ∣
      ∣ 0  2   4 ∣         ∣ 3   −1  −1 ∣         ∣ −2  −3  1   −2 ∣
                                                  ∣ 1   −1  2   0  ∣

|D| = ∣ 0   0  0   0  ∣   |E| = ∣ −2  0  0   0 ∣
      ∣ −2  2  0   0  ∣         ∣ 3   3  0   1 ∣
      ∣ 0   6  −6  0  ∣         ∣ 0   1  −1  0 ∣
      ∣ 4   0  2   −1 ∣         ∣ −1  0  0   2 ∣

|F| = ∣ 0  −1  3   5   0  ∣   |G| = ∣ 1   −1  3   5   0 ∣
      ∣ 1  −1  2   4   −1 ∣         ∣ 1   2   0   0   0 ∣
      ∣ 0  0   1   2   7  ∣         ∣ −2  0   −1  2   1 ∣
      ∣ 1  1   −2  3   1  ∣         ∣ 3   0   0   −6  6 ∣
      ∣ 2  −2  6   10  0  ∣         ∣ 2   0   0   0   1 ∣
|A| = ∣ 1  0  0  1 ∣   |B| = ∣ 1  0  0  0  0  ∣
      ∣ 2  7  0  2 ∣         ∣ 2  7  0  2  1  ∣
      ∣ 0  6  3  0 ∣         ∣ 0  6  3  0  0  ∣
      ∣ 5  3  1  2 ∣         ∣ 1  3  1  2  −1 ∣
                             ∣ 0  6  1  2  1  ∣.
A = ⎛ 0  2  −5 ⎞
    ⎜ 1  5  −5 ⎟
    ⎝ 1  0  8  ⎠
3.4 Properties of determinants
Theorem 3.4.1.
An n × n matrix A is invertible if and only if |A| ≠ 0.
Proof
Let C be the reduced row echelon matrix of A. By Theorem 2.7.3 , there exist elementary matrices E₁, E₂, ⋯, E_{k−1}, E_k such that

(3.4.1)
C = E_k E_{k−1} ⋯ E₂E₁A.

By Theorem 3.3.4 ,

(3.4.2)
|C| = |E_k||E_{k−1}| ⋯ |E₂||E₁||A|.
Assume that |A | ≠ 0. By Theorem 3.3.5 and (3.4.2) , |C | ≠ 0. Because C is the reduced row
echelon matrix of A, C is an upper triangular matrix and all the entries on the main diagonal that are
leading entries are not zero. This implies r(C) = n. By (3.4.1) and Theorem 2.5.4 , r(A) = r(C) = n.
By Theorem 2.7.6 , A is invertible. Conversely, assume that A is invertible. By Corollary 2.7.1 (i),
C = I n and |C | = 1. It follows from (3.4.2) and Theorem 3.3.5 that |A | ≠ 0.
In Example 2.7.5 , we use ranks of matrices, that is, Theorem 2.7.6 , to verify whether square matrices
are invertible. Now we revisit the two matrices and use determinants, that is, Theorem 3.4.1 , to determine
whether they are invertible.
Example 3.4.1.
For each of the following matrices, use its determinant to determine whether it is invertible.
A = ⎛ 1  1  1 ⎞   B = ⎛ 1  1   1 ⎞
    ⎜ 0  1  1 ⎟       ⎜ 0  −1  0 ⎟
    ⎝ 1  0  1 ⎠       ⎝ 1  0   1 ⎠
Solution
|A| = ∣ 1  1  1 ∣                  ∣ 1  1   1 ∣                 ∣ 1  1  1 ∣
      ∣ 0  1  1 ∣ --R1(−1)+R3-->  ∣ 0  1   1 ∣ --R2(1)+R3-->  ∣ 0  1  1 ∣ = 1 ≠ 0.
      ∣ 1  0  1 ∣                  ∣ 0  −1  0 ∣                 ∣ 0  0  1 ∣

Hence, by Theorem 3.4.1 , A is invertible.
|B| = ∣ 1  1   1 ∣                  ∣ 1  1   1 ∣                   ∣ 1  1   1 ∣
      ∣ 0  −1  0 ∣ --R1(−1)+R3-->  ∣ 0  −1  0 ∣ --R2(−1)+R3-->  ∣ 0  −1  0 ∣ = 0.
      ∣ 1  0   1 ∣                  ∣ 0  −1  0 ∣                   ∣ 0  0   0 ∣

Hence, by Theorem 3.4.1 , B is not invertible.
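Theorem 3.4.1 gives a one-line invertibility test in code. A sketch (not from the text) applying it to the two matrices of Example 3.4.1:

```python
def det(M):
    # Determinant by cofactor expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** k * M[0][k] * det([r[:k] + r[k + 1:] for r in M[1:]])
               for k in range(len(M)))

A = [[1, 1, 1], [0, 1, 1], [1, 0, 1]]
B = [[1, 1, 1], [0, -1, 0], [1, 0, 1]]

results = {name: det(M) for name, M in [("A", A), ("B", B)]}
assert results["A"] == 1     # nonzero, so A is invertible (Theorem 3.4.1)
assert results["B"] == 0     # zero, so B is not invertible
```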
Theorem 3.4.2.
Let A and B be n × n matrices. Then |AB| = |A||B|.
Proof
Assume that C is the reduced row echelon matrix of A. By Theorem 2.7.3 , there exist elementary matrices E₁, E₂, ⋯, E_{k−1}, E_k such that

C = E_k E_{k−1} ⋯ E₂E₁A.

It follows that

(3.4.3)
A = F₁F₂ ⋯ F_{k−1}F_k C,
where F_i = E_i⁻¹ is an elementary matrix for i = 1, 2, ⋯, k. By Theorem 3.3.4 ,

(3.4.4)
|A| = |F₁||F₂| ⋯ |F_{k−1}||F_k||C|.

If A is invertible, then by Corollary 2.7.1 (i), C = I_n. By (3.4.4) ,

|A| = |F₁||F₂| ⋯ |F_{k−1}||F_k|.

By (3.4.3) , we obtain

AB = F₁F₂ ⋯ F_{k−1}F_k B,

and by Theorem 3.3.4 again, |AB| = |F₁||F₂| ⋯ |F_{k−1}||F_k||B| = |A||B|.
Example 3.4.2.
Verify that |AB| = |A||B| if
A = ⎛ 3  1 ⎞ and B = ⎛ −1  3 ⎞.
    ⎝ 2  1 ⎠         ⎝ 5   8 ⎠
Solution
Because |A| = 3 − 2 = 1 and |B| = −8 − 15 = −23, we have

AB = ⎛ 3  1 ⎞ ⎛ −1  3 ⎞ = ⎛ 2  17 ⎞,
     ⎝ 2  1 ⎠ ⎝ 5   8 ⎠   ⎝ 3  14 ⎠

|AB| = ∣ 2  17 ∣ = −23. Hence, |AB| = |A||B|.
       ∣ 3  14 ∣
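The arithmetic of Example 3.4.2 is easy to confirm programmatically; the sketch below recomputes AB and the three determinants:

```python
def det2(M):
    # Determinant of a 2x2 matrix.
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[3, 1], [2, 1]]
B = [[-1, 3], [5, 8]]
AB = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]

assert AB == [[2, 17], [3, 14]]
assert det2(A) == 1 and det2(B) == -23
assert det2(AB) == det2(A) * det2(B) == -23   # Theorem 3.4.2
```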
Theorem 3.4.3.
If A is invertible, then |A| ≠ 0 and ∣A⁻¹∣ = 1/|A|.
Proof
Because AA⁻¹ = I and |I| = 1, it follows from Theorem 3.4.2 that

∣AA⁻¹∣ = |A|∣A⁻¹∣ = |I| = 1.
This implies that |A| ≠ 0 and ∣A⁻¹∣ = 1/|A|.
Example 3.4.3.
Let A be a 3 × 3 invertible matrix with |A| = 6. Compute

∣(2A⁻¹)ᵀ∣.

Solution
By Theorems 3.2.1 , 3.3.2 , and 3.4.3 , we obtain

∣(2A⁻¹)ᵀ∣ = ∣2A⁻¹∣ = 2³∣A⁻¹∣ = 8 · (1/|A|) = 8/6 = 4/3.
Theorem 3.4.2 shows that the determinant of the product of two n × n matrices equals the product of the
determinants of the two matrices. But in general, it is not true for the addition of two n × n matrices, that is, in
general,
|A + B| ≠ |A| + |B|.
Example 3.4.4.
Let A = ⎛ 1  2 ⎞ and B = ⎛ 3  1 ⎞. Show
        ⎝ 2  5 ⎠         ⎝ 1  3 ⎠

|A + B| ≠ |A| + |B|.
Solution
Because A + B = ⎛ 4  3 ⎞, we have |A + B| = 23. On the other hand, |A| = 1, |B| = 8,
                ⎝ 3  8 ⎠

and |A| + |B| = 9. Hence, |A + B| ≠ |A| + |B|.
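The failure of additivity in Example 3.4.4 can be reproduced directly:

```python
def det2(M):
    # Determinant of a 2x2 matrix.
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[1, 2], [2, 5]]
B = [[3, 1], [1, 3]]
S = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]   # A + B

assert S == [[4, 3], [3, 8]]
assert det2(S) == 23
assert det2(A) + det2(B) == 9     # so |A + B| != |A| + |B| here
```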
The following result provides a method to compute |A| + |B|, where A and B are two n × n matrices that agree in n − 1 corresponding rows.
Theorem 3.4.4.
Let

A = ⎛ a11  a12  ⋯  a1n ⎞          B = ⎛ a11  a12  ⋯  a1n ⎞
    ⎜ a21  a22  ⋯  a2n ⎟              ⎜ a21  a22  ⋯  a2n ⎟
    ⎜  ⋮    ⋮   ⋱    ⋮  ⎟   and       ⎜  ⋮    ⋮   ⋱    ⋮  ⎟.
    ⎜ ai1  ai2  ⋯  ain ⎟              ⎜ bi1  bi2  ⋯  bin ⎟
    ⎜  ⋮    ⋮   ⋱    ⋮  ⎟              ⎜  ⋮    ⋮   ⋱    ⋮  ⎟
    ⎝ an1  an2  ⋯  ann ⎠              ⎝ an1  an2  ⋯  ann ⎠
Then
(3.4.5)
            ∣ a11      a12      ⋯  a1n      ∣
            ∣ a21      a22      ⋯  a2n      ∣
            ∣  ⋮        ⋮        ⋱     ⋮      ∣
|A| + |B| = ∣ ai1+bi1  ai2+bi2  ⋯  ain+bin ∣.
            ∣  ⋮        ⋮        ⋱     ⋮      ∣
            ∣ an1      an2      ⋯  ann      ∣
Proof
We denote by |C| the right-side determinant of (3.4.5) . By Theorem 3.2.1 ,

|A| = ∑_{k=1}^n a_ik A_ik,  |B| = ∑_{k=1}^n b_ik A_ik,

and

|C| = ∑_{k=1}^n (a_ik + b_ik)A_ik = ∑_{k=1}^n a_ik A_ik + ∑_{k=1}^n b_ik A_ik = |A| + |B|.
Example 3.4.5.
Compute |A| + |B| if

A = ⎛ 1  7  5 ⎞ and B = ⎛ 1  7  5  ⎞.
    ⎜ 2  0  3 ⎟         ⎜ 2  0  3  ⎟
    ⎝ 1  4  7 ⎠         ⎝ 0  1  −1 ⎠
Solution
Because the first two rows of A and B are the same, by Theorem 3.4.4 , we have
            ∣ 1  7  5 ∣   ∣ 1  7  5  ∣   ∣ 1    7    5   ∣
|A| + |B| = ∣ 2  0  3 ∣ + ∣ 2  0  3  ∣ = ∣ 2    0    3   ∣
            ∣ 1  4  7 ∣   ∣ 0  1  −1 ∣   ∣ 1+0  4+1  7−1 ∣

  ∣ 1  7  5 ∣
= ∣ 2  0  3 ∣ = −28.
  ∣ 1  5  6 ∣
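Theorem 3.4.4 can be checked on the matrices of Example 3.4.5: below, C is A with its third row replaced by the sum of the third rows of A and B (a sketch, not from the text):

```python
def det(M):
    # Determinant by cofactor expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** k * M[0][k] * det([r[:k] + r[k + 1:] for r in M[1:]])
               for k in range(len(M)))

A = [[1, 7, 5], [2, 0, 3], [1, 4, 7]]
B = [[1, 7, 5], [2, 0, 3], [0, 1, -1]]   # agrees with A except in row 3
C = [A[0], A[1], [x + y for x, y in zip(A[2], B[2])]]

assert C[2] == [1, 5, 6]
assert det(C) == det(A) + det(B) == -28   # (3.4.5)
```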
Exercises
1. For each of the following matrices, use its determinant to determine whether it is invertible.
A = ⎛ 1   0   0  0  ⎞   B = ⎛ 0  −1  3   5   2  ⎞
    ⎜ 2   1   2  0  ⎟       ⎜ 1  −1  2   4   −1 ⎟
    ⎜ −2  −3  1  −2 ⎟       ⎜ 0  0   1   2   7  ⎟
    ⎝ 1   −1  0  0  ⎠       ⎜ 1  1   −2  3   1  ⎟
                            ⎝ 0  −2  6   10  4  ⎠

C = ⎛ 2   1   4  ⎞   D = ⎛ 0   1   3  ⎞
    ⎜ 1   −2  4  ⎟       ⎜ −2  1   2  ⎟
    ⎝ −2  −1  −4 ⎠       ⎝ 1   −1  −1 ⎠

E = ⎛ 1   −1  3   5   0 ⎞   F = ⎛ 1   0  0   0  ⎞
    ⎜ 0   0   0   0   0 ⎟       ⎜ −2  2  0   0  ⎟
    ⎜ −2  0   −1  2   1 ⎟       ⎜ 0   6  −6  0  ⎟
    ⎜ 3   0   0   −6  6 ⎟       ⎝ 4   0  2   −1 ⎠
    ⎝ 2   0   0   0   1 ⎠
2. For each of the following matrices, find all real numbers x such that it is invertible.
A = ⎛ x   0   0  0  ⎞   B = ⎛ 2   x   4 ⎞
    ⎜ 2   x   2  0  ⎟       ⎜ 1   −2  4 ⎟
    ⎜ −2  −3  1  −2 ⎟       ⎝ −2  −1  x ⎠
    ⎝ 1   0   0  1  ⎠
1. |−3A|
2. ∣A⁻¹∣
3. ∣Aᵀ∣
4. ∣A³∣
5. ∣(2A⁻¹)ᵀ∣
6. ∣(2(−A)ᵀ)⁻¹∣
4. Compute |A| + |B| if A = ⎛ 2  1  3 ⎞ and B = ⎛ 2  1  3  ⎞.
                            ⎜ 1  0  3 ⎟         ⎜ 1  0  3  ⎟
                            ⎝ 1  4  7 ⎠         ⎝ 0  1  −1 ⎠
5. Compute |A| + |B| if

A = ⎛ 1   0  0  2 ⎞ and B = ⎛ 1   0  0  2 ⎞.
    ⎜ −1  0  1  2 ⎟         ⎜ 3   0  0  1 ⎟
    ⎜ 0   1  4  3 ⎟         ⎜ 0   1  4  3 ⎟
    ⎝ −2  2  0  1 ⎠         ⎝ −2  2  0  1 ⎠
Chapter 4 Systems of linear equations
A linear equation with n variables x1, x2, ⋯, xn is an equation that can be written in the form

(4.1.1)
a1x1 + a2x2 + ⋯ + anxn = b,

where a1, a2, ⋯, an, b are constants.
In particular, a linear equation with two variables can be written in the following form
(4.1.2)
ax + by = c,
where a, b, c are constants and x,y are variables. If either a or b is not zero, then (4.1.2) is an equation of
a straight line in the xy-plane.
A linear equation with three variables can be written in the following form
(4.1.3)
ax + by + cz = d,
where a, b, c, d are constants and x, y, z are variables. If one of a, b, c is not zero, then (4.1.3) is an equation of a plane in ℝ³. We shall study more about planes in ℝ³ in Section 6.3 .
Note that in a linear equation, the power of each variable must be 1 and there are no terms containing the
product of two or more variables. Hence, in an equation, if there is a variable whose power is not 1 or if there
is a term that contains a product of two or more variables, then the equation is not a linear equation.
Example 4.1.1.
Determine which of the following equations are linear.
1. 0x + 0y = 1;
2. x² + y² = 1;
3. xy = 2;
4. √2·x1 + x2 − x3 = 1.
Solution
(1) and (4) are linear while (2) and (3) are not linear.
Let m, n ∈ ℕ. A set of m linear equations in the same n variables is called a system of linear equations or a
linear system, which is of the form:
(4.1.4)
{ a11x1 + a12x2 + ⋯ + a1nxn = b1
  a21x1 + a22x2 + ⋯ + a2nxn = b2
  ⋮
  am1x1 + am2x2 + ⋯ + amnxn = bm.
The matrix

(4.1.5)
A = ⎛ a11  a12  ⋯  a1n ⎞
    ⎜ a21  a22  ⋯  a2n ⎟
    ⎜  ⋮    ⋮   ⋱    ⋮  ⎟
    ⎝ am1  am2  ⋯  amn ⎠

is called the coefficient matrix of (4.1.4) , and the matrix
(4.1.6)
(A | b⃗) = ⎛ a11  a12  ⋯  a1n │ b1 ⎞
           ⎜ a21  a22  ⋯  a2n │ b2 ⎟
           ⎜  ⋮    ⋮   ⋱    ⋮  │  ⋮ ⎟
           ⎝ am1  am2  ⋯  amn │ bm ⎠
is called the augmented matrix of (4.1.4) , where b⃗ = (b1, b2, ⋯, bm)ᵀ.
Example 4.1.2.
For each of the following systems of linear equations, find its coefficient and augmented matrices.
1. { x1 − x2 = 7
     x1 + x2 = 5

2. { x1 + x2 + 3x3 = 0
     2x1 + 4x2 − 3x3 = 0
     3x1 + 6x2 − 5x3 = 0

3. { 2x1 + 3x3 = 2
     4x2 − 3x3 = 1
     x1 + 2x2 − x3 = 0
Solution
1. A = ⎛ 1  −1 ⎞ and (A | b⃗) = ⎛ 1  −1 │ 7 ⎞.
       ⎝ 1  1  ⎠               ⎝ 1  1  │ 5 ⎠

2. A = ⎛ 1  1  3  ⎞ and (A | b⃗) = ⎛ 1  1  3  │ 0 ⎞.
       ⎜ 2  4  −3 ⎟               ⎜ 2  4  −3 │ 0 ⎟
       ⎝ 3  6  −5 ⎠               ⎝ 3  6  −5 │ 0 ⎠
3. A = ⎛ 2  0  3  ⎞ and (A | b⃗) = ⎛ 2  0  3  │ 2 ⎞.
       ⎜ 0  4  −3 ⎟               ⎜ 0  4  −3 │ 1 ⎟
       ⎝ 1  2  −1 ⎠               ⎝ 1  2  −1 │ 0 ⎠
Let

X⃗ = (x1, x2, ⋯, xn)ᵀ ∈ ℝⁿ and b⃗ = (b1, b2, ⋯, bm)ᵀ.

By (2.2.2) , the system (4.1.4) can be written into the matrix form

(4.1.7)
A X⃗ = b⃗.
The system (4.1.7) with b⃗ = 0⃗, that is,

(4.1.8)
A X⃗ = 0⃗,

is called a homogeneous system. If b⃗ ≠ 0⃗, it is called a nonhomogeneous system.

Let a⃗1, a⃗2, ⋯, a⃗n be the column vectors of A given in (4.1.5) . By (2.2.3) , (4.1.4) can be written into the linear combination form:

(4.1.9)
x1a⃗1 + x2a⃗2 + ⋯ + xna⃗n = b⃗.
Example 4.1.3.
(4.1.10)
{ x1 + x2 + 3x3 = 2
  2x1 + 4x2 − 3x3 = 1
  3x1 + 6x2 + 5x3 = 0
Write the above system into the forms (4.1.7) and (4.1.9) .
Solution
Let

A = ⎛ 1  1  3  ⎞,  X⃗ = ⎛ x1 ⎞,  and  b⃗ = ⎛ 2 ⎞.
    ⎜ 2  4  −3 ⎟        ⎜ x2 ⎟             ⎜ 1 ⎟
    ⎝ 3  6  5  ⎠        ⎝ x3 ⎠             ⎝ 0 ⎠

Then the system (4.1.10) can be rewritten as A X⃗ = b⃗. Let
a⃗1 = ⎛ 1 ⎞,  a⃗2 = ⎛ 1 ⎞,  a⃗3 = ⎛ 3  ⎞,  and  b⃗ = ⎛ 2 ⎞.
     ⎜ 2 ⎟        ⎜ 4 ⎟        ⎜ −3 ⎟             ⎜ 1 ⎟
     ⎝ 3 ⎠        ⎝ 6 ⎠        ⎝ 5  ⎠             ⎝ 0 ⎠

Then the system (4.1.10) can be rewritten as x1a⃗1 + x2a⃗2 + x3a⃗3 = b⃗.
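The equivalence of the matrix form (4.1.7) and the linear-combination form (4.1.9), i.e. (2.2.3), can be illustrated numerically for the coefficient matrix of (4.1.10). In the sketch below the test vector x is an arbitrary choice of mine, not a solution of the system:

```python
A = [[1, 1, 3], [2, 4, -3], [3, 6, 5]]
x = [1, -2, 1]   # arbitrary test vector

# Matrix form A X: dot product of each row of A with X.
AX = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

# Linear-combination form x1*a1 + x2*a2 + x3*a3 with the columns of A.
cols = [[A[i][j] for i in range(3)] for j in range(3)]
combo = [sum(x[j] * cols[j][i] for j in range(3)) for i in range(3)]

assert AX == combo   # the two forms always agree, by (2.2.3)
```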
Definition 4.1.1.
Let s1, s2, ⋯, sn be numbers. If

(4.1.11)
{ a11s1 + a12s2 + ⋯ + a1nsn = b1
  a21s1 + a22s2 + ⋯ + a2nsn = b2
  ⋮
  am1s1 + am2s2 + ⋯ + amnsn = bm,
then (s 1, s 2, ⋯, s n) is called a solution of (4.1.4) . Solving a system of linear equations means finding
all solutions of the system.
Note that any solution (s 1, s 2, ⋯, s n) of (4.1.4) must satisfy each equation of (4.1.4) . If
(s 1, s 2, ⋯, s n) does not satisfy one of the equations of (4.1.4) , it is not a solution of (4.1.4) .
Example 4.1.4.
Determine which of the points P1(6, −1), P2(8, 1), P3(2, 3), and P4(0, 0) are solutions of the system

{ x − y = 7,
  x + y = 5.
Solution
(6, −1) satisfies both equations, so P1(6, −1) is a solution. (8, 1) satisfies the first equation but does not satisfy the second one, so P2(8, 1) is not a solution. (2, 3) satisfies the second equation but does not satisfy the first one, so P3(2, 3) is not a solution. (0, 0) satisfies neither of the two equations, so P4(0, 0) is not a solution.
Example 4.1.5.
(4.1.12)
{ x1 + x2 + 3x3 = 2
  x1 − x2 + x3 = −3
  2x1 + 4x3 = −1
Verify that (−5/2, 3/2, 1) is a solution of (4.1.12) while (1, 3, −1) is not a solution.

Solution
Let x1 = −5/2, x2 = 3/2, and x3 = 1. Then

{ x1 + x2 + 3x3 = −5/2 + 3/2 + 3(1) = 2,
  x1 − x2 + x3 = −5/2 − 3/2 + 1 = −3,
  2x1 + 4x3 = 2(−5/2) + 4(1) = −1.

Hence, (−5/2, 3/2, 1) satisfies each of the three equations in (4.1.12) and is a solution of (4.1.12) . Next, let x1 = 1, x2 = 3, and x3 = −1. Then
x 1 + x 2 + 3x 3 = 1 + 3 + 3( − 1) = 1 ≠ 2
Hence, (1, 3, −1) does not satisfy the first equation of (4.1.12) , and thus it is not a solution of (4.1.12) .
Note that (x 1, x 2, x 3) = (1, 3, − 1) satisfies the second equation but not the third one of (4.1.12) .
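Checking a candidate solution, as in Example 4.1.5, amounts to evaluating every equation of the system. A sketch (is_solution is a hypothetical helper name; Fraction keeps the arithmetic exact):

```python
from fractions import Fraction

def is_solution(A, b, x):
    """Check that x satisfies every equation of the system A x = b."""
    return all(sum(ai * xi for ai, xi in zip(row, x)) == bi
               for row, bi in zip(A, b))

# The system (4.1.12).
A = [[1, 1, 3], [1, -1, 1], [2, 0, 4]]
b = [2, -3, -1]

assert is_solution(A, b, [Fraction(-5, 2), Fraction(3, 2), 1])
assert not is_solution(A, b, [1, 3, -1])
```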
By Definition 4.1.1 , we see that the zero vector 0⃗ in ℝⁿ is always a solution of the homogeneous system
(4.1.8) . Sometimes, the zero solution of (4.1.8) is called a trivial solution. A nonzero solution of
(4.1.8) is called a nontrivial solution.
(4.1.13)
N_A = {X⃗ ∈ ℝⁿ : A X⃗ = 0⃗}.
Definition 4.1.2.
The set N_A is called the solution space of the homogeneous system (4.1.8) or the nullspace of the matrix A. The dimension of the solution space N_A, denoted by dim(N_A), is the nullity of A, that is,

(4.1.14)
dim(N_A) = null(A).

A solution space that only contains the zero vector is said to be a zero space.
Definition 4.1.3.
The linear system (4.1.4) is said to be consistent if it has at least one solution, and to be inconsistent if
it has no solutions.
By Definition 4.1.3 , a homogeneous system is consistent because it always has the trivial solution.
By (4.1.9) and Theorem 1.4.3 , we obtain the following relation among the consistency of a linear
system, the linear combination, and the spanning space.
Theorem 4.1.1.
Let S = {a⃗1, ⋯, a⃗n} be the same as in (1.4.2) , A = (a⃗1 ⋯ a⃗n) be the same as in (4.1.5) , and b⃗ = (b1, b2, ⋯, bm)ᵀ. Then the following assertions are equivalent.
1. A X⃗ = b⃗ is consistent.
2. b⃗ is a linear combination of S.
3. b⃗ ∈ span S.
Proof
If (1) holds, then by Definition 4.1.3 , there exists X⃗ = (x1, x2, ⋯, xn)ᵀ such that A X⃗ = b⃗. By (2.2.3) , (4.1.9) holds and thus, the result (2) holds. Conversely, if (2) holds, then there exists X⃗ = (x1, x2, ⋯, xn)ᵀ such that (4.1.9) holds. By (2.2.3) , A X⃗ = b⃗ and the result (1) holds. This shows that (1) and (2) are equivalent. By Theorem 1.4.3 , the results (2) and (3) are equivalent.
By Theorem 4.1.1 , we see that the system (4.1.4) is inconsistent if and only if b⃗ is not a linear combination of S, if and only if b⃗ ∉ span S.
A system of two linear equations is not hard to solve. Here we give examples to show how to solve such systems and provide geometric interpretations that exhibit the solution structures.
Example 4.1.6.
Solve the system

(4.1.15)
{ x1 − x2 = 7,
  x1 + x2 = 5.
Solution
Adding the two equations implies 2x 1 = 12 and x 1 = 6. Substituting x 1 = 6 into the first equation implies
x 2 = x 1 − 7 = 6 − 7 = − 1, so (x 1, x 2) = (6, − 1) is a solution of the system.
In geometric terms, each of the equations in (4.1.15) represents a line and the solution (6, − 1) is the
intersection point of the two lines. In fact, the solution (6, − 1) is the unique solution of the system.
Example 4.1.7.
Solve the system

{ x1 + x2 = 1,
  2x1 + 2x2 = 2.
Solution
The two equations are the same and the solutions of the first equation are the solutions of the system.
Let x2 = t ∈ ℝ. Then x1 = 1 − t and

⎛ x1 ⎞ = ⎛ 1 ⎞ + t ⎛ −1 ⎞.
⎝ x2 ⎠   ⎝ 0 ⎠     ⎝ 1  ⎠
In geometry, the two equations represent the same line and every point on the line is a solution of the
system. Hence, the system has infinitely many solutions.
Example 4.1.8.
Solve the system

{ x1 + x2 = 1,
  x1 + x2 = 2.
Solution
It is obvious that no (x1, x2) satisfies both equations. Hence, the above system has no solutions.
In geometric terms, the two equations represent two parallel lines. Hence, the above system has no
solutions.
The above three examples show that a system of two linear equations with two variables has exactly one solution, infinitely many solutions, or no solutions. The same fact holds for a general system of linear equations with two or more variables. We shall study the general case in Section 4.5 .
The methods used to solve a system of two linear equations are not suitable for a general system (4.1.4) . In Sections 4.2–4.4, we shall study approaches to solving (4.1.4) or (4.1.7) , where we do not often mention linear combinations and spanning spaces. In Sections 4.5 and 4.6, following Theorem 4.1.1 , we shall study how to determine whether A X⃗ = b⃗ is consistent, whether b⃗ is a linear combination of S, and whether b⃗ ∈ span S.
Exercises
1. Determine which of the following equations are linear.
a. x + 2y + z = 6
b. 2x − 6y + z = 1
c. −√2·x + 6 − (2/3)y = 4 − 3z
d. 3x 1 + 2x 2 + 4x 3 + 5x 4 = 1
e. 2xy + 3yz + 5z = 8
2. For each of the following systems of linear equations, find its coefficient and augmented matrices.
1. { 2x1 − x2 = 6
     4x1 + x2 = 3

2. { −x1 + 2x2 + 3x3 = 4
     3x1 + 2x2 − 3x3 = 5
     2x1 + 3x2 − x3 = 1

3. { x1 − 3x3 − 4 = 0
     2x2 − 5x3 − 8 = 0
     3x1 + 2x2 − x3 = 4
3. For each of the following augmented matrices, find the corresponding linear system.
⎛ 1   2   3 │ 1  ⎞   ⎛ 2  1   │ 1  ⎞
⎜ −1  1   0 │ −2 ⎟   ⎜ 0  −1  │ −2 ⎟
⎝ 0   −1  1 │ 0  ⎠   ⎝ 0  3   │ −1 ⎠
4. For each of the following linear systems, write it into the form A X⃗ = b⃗ and express b⃗ as a linear combination of the column vectors of A.

a. { −x1 + x2 + 2x3 = 3
     2x1 + 6x2 − 5x3 = 2
     −3x1 + 7x2 − 5x3 = −1

b. { x1 − x2 + 4x3 = 0
     −2x1 + 4x2 − 3x3 = 0
     3x1 + 6x2 − 8x3 = 0

c. { x1 − 2x2 + x3 = 0
     −x1 + 3x2 − 4x3 = 0
{ x − 2y = 7,
  x + 2y = 5.
{ −x1 + x2 + 2x3 = 3
  2x1 + 6x2 − 5x3 = 2
  −3x1 + 7x2 − 5x3 = −1

2. Verify that (5/6, 7/6, 4/3) is a solution of the system.
3. Determine if the vector b⃗ = (3, 2, −1)ᵀ is a linear combination of the column vectors of the

1. { x1 − x2 = 6
     x1 + x2 = 5

2. { x1 + 2x2 = 1
     2x1 + x2 = 2

3. { x1 + 2x2 = 1
     x1 + 2x2 = 2
4.2 Gaussian Elimination
(4.2.1)
A X⃗ = b⃗,
Recall that for each i ∈ I_n, the ith column of the coefficient matrix A comes from the coefficients of the variable x_i in the system. Because A is a row echelon matrix, some columns of A contain leading entries.
A variable x i is said to be a basic variable if the ith column of A contains a leading entry, and to be a free
variable if the ith column of A does not contain a leading entry.
The method to solve such a system (4.2.1) is to treat the free variables, if any, as parameters, and then
solve each linear equation for the basic variable starting from the last equation to the first one, where the
parameters and the obtained values of the basic variables are substituted. The method used to solve such a
system is called back-substitution.
Example 4.2.1.
Consider the system

x1 + 2x2 − 3x3 = 4   (1)
2x2 − 6x3 = 5        (2)
−x3 = 2              (3)
a. Find the coefficient matrix and the augmented matrix of the system.
b. Find the basic variables and free variables of the system.
c. Solve the system using back-substitution.
d. Determine if the vector b⃗ = (4, 5, 2)ᵀ is a linear combination of the column vectors of the coefficient matrix of the system.
Solution
a. The coefficient and the augmented matrices are
A = ⎛ 1  2  −3 ⎞,  (A | b⃗) = ⎛ 1  2  −3 │ 4 ⎞.
    ⎜ 0  2  −6 ⎟             ⎜ 0  2  −6 │ 5 ⎟
    ⎝ 0  0  −1 ⎠             ⎝ 0  0  −1 │ 2 ⎠
b. A is a 3 × 3 row echelon matrix. Columns 1, 2, 3 of A contain leading entries and thus, x1, x2, and x3 are basic variables. There are no free variables.
c. To solve the system, we first consider equation (3). Solving equation (3) for the basic variable x 3,
we get x 3 = − 2. Substituting x 3 = − 2 into equation (2) and solving equation (2) for the basic
variable x 2, we obtain
x2 = (1/2)(5 + 6x3) = (1/2)[5 + 6(−2)] = −7/2.

Substituting x2 = −7/2 and x3 = −2 into equation (1) and solving equation (1) for the basic variable x1, we obtain

x1 = 4 − 2x2 + 3x3 = 4 − 2(−7/2) + 3(−2) = 5.

Hence, (x1, x2, x3) = (5, −7/2, −2) is a solution of the system.
d. Because the system has a solution, the vector b⃗ = (4, 5, 2)ᵀ is a linear combination of the column vectors of A, namely,

b⃗ = 5a⃗1 − (7/2)a⃗2 − 2a⃗3,

where a⃗1 = ⎛ 1 ⎞,  a⃗2 = ⎛ 2 ⎞,  and  a⃗3 = ⎛ −3 ⎞.
           ⎜ 0 ⎟        ⎜ 2 ⎟              ⎜ −6 ⎟
           ⎝ 0 ⎠        ⎝ 0 ⎠              ⎝ −1 ⎠
Example 4.2.1 shows that if r(A) = r(A | b⃗) = n, where n = 3 is the number of variables, then the system has a unique solution.
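Back-substitution for a triangular system with no free variables, as in Example 4.2.1, can be sketched as follows (the function name is my own; Fraction keeps the arithmetic exact):

```python
from fractions import Fraction

def back_substitute(A, b):
    """Solve an upper triangular system with a nonzero diagonal (no free
    variables), working from the last equation up."""
    n = len(A)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(b[i] - s, A[i][i])
    return x

# The system of Example 4.2.1.
A = [[1, 2, -3], [0, 2, -6], [0, 0, -1]]
b = [4, 5, 2]
x = back_substitute(A, b)
assert x == [5, Fraction(-7, 2), -2]   # (x1, x2, x3) = (5, -7/2, -2)
```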
Example 4.2.2.
x1 − x2 − 3x3 + x4 = 1   (1)
−x2 − x3 + x4 = 2        (2)
−x4 = 3                  (3)
a. Find the coefficient matrix and the augmented matrix of the system.
b. Find the basic variables and free variables of the system.
c. Solve the system using back-substitution and express its solution as a linear combination.
d. Determine if the vector b⃗ = (1, 2, 3)ᵀ is a linear combination of the column vectors of the coefficient matrix of the system.
Solution
a. The coefficient and the augmented matrices are
A = ⎛ 1  −1  −3  1  ⎞,  (A | b⃗) = ⎛ 1  −1  −3  1  │ 1 ⎞.
    ⎜ 0  −1  −1  1  ⎟             ⎜ 0  −1  −1  1  │ 2 ⎟
    ⎝ 0  0   0   −1 ⎠             ⎝ 0  0   0   −1 │ 3 ⎠
b. A is a row echelon matrix. The columns 1,2,4 of A contain leading entries and thus, x 1, x 2, and x 4
are basic variables. The third column of A does not contain any leading entries, so x 3 is a free
variable.
c. Because x 3 is a free variable, let x 3 = t, where t ∈ ℝ is an arbitrary real number. Solving equation
(3) for the basic variable x 4 implies x 4 = − 3. Substituting x 4 = − 3 into equation (2) and solving
equation (2) for the basic variable x 2, we obtain
x 2 = − 2 − x 3 + x 4 = − 2 − t + ( − 3) = − 5 − t.
Substituting x 2 = − 5 − t, x 3 = t and x 4 = − 3 into equation (1) and solving equation (1) for the
basic variable x 1, we obtain
x 1 = 1 + x 2 + 3x 3 − x 4 = 1 + ( − 5 − t) + 3t − ( − 3) = − 1 + 2t.
⎛ x1 ⎞   ⎛ −1 + 2t ⎞   ⎛ −1 ⎞     ⎛ 2  ⎞
⎜ x2 ⎟ = ⎜ −5 − t  ⎟ = ⎜ −5 ⎟ + t ⎜ −1 ⎟.
⎜ x3 ⎟   ⎜ t       ⎟   ⎜ 0  ⎟     ⎜ 1  ⎟
⎝ x4 ⎠   ⎝ −3      ⎠   ⎝ −3 ⎠     ⎝ 0  ⎠
d. Because the system has infinitely many solutions, b⃗ = (1, 2, 3)ᵀ is a linear combination of the column vectors of A.
The above example shows that if r(A) = r(A | b⃗) = 3 < n, where n = 4 is the number of variables, then the system has infinitely many solutions.
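When a free variable is present, back-substitution can be run with the parameter plugged in; evaluating at t = 0 and t = 1 then recovers the particular solution and the direction vector of Example 4.2.2. The helper below is a sketch written only for that system:

```python
from fractions import Fraction

def solve_with_free_x3(t):
    """Back-substitution for Example 4.2.2 with the free variable x3 = t."""
    x4 = Fraction(-3)            # equation (3): -x4 = 3
    x3 = Fraction(t)
    x2 = -2 - x3 + x4            # equation (2): -x2 - x3 + x4 = 2
    x1 = 1 + x2 + 3 * x3 - x4    # equation (1)
    return [x1, x2, x3, x4]

p = solve_with_free_x3(0)                                 # particular solution
v = [a - b for a, b in zip(solve_with_free_x3(1), p)]     # direction vector
assert p == [-1, -5, 0, -3]
assert v == [2, -1, 1, 0]    # matches (-1, -5, 0, -3) + t(2, -1, 1, 0)
```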
Example 4.2.3.
x 1 − x 2 − 3x 3 + x 4 − x 5 = 2 (1)
− x 3 + x 4 = 3 (2)
− x 4 + x 5 = 1 (3)
a. Find the coefficient matrix and the augmented matrix of the system.
b. Find the basic variables and free variables of the system.
c. Solve the system using back-substitution and express its solutions as a linear combination.
d. Determine if the vector b→ = (2, 3, 1)^T is a linear combination of the column vectors of the coefficient matrix of the system.
Solution
a. The coefficient and the augmented matrices are
A =
[ 1  −1  −3   1  −1 ]
[ 0   0  −1   1   0 ]
[ 0   0   0  −1   1 ]
and (A | b→) =
[ 1  −1  −3   1  −1 |  2 ]
[ 0   0  −1   1   0 |  3 ]
[ 0   0   0  −1   1 |  1 ]
.
b. A is a 3 × 5 row echelon matrix. The columns 1,3,4 of A contain leading entries and thus, x 1, x 3,
and x 4 are basic variables. The columns 2,5 of A do not contain any leading entries, so x 2 and x 5
are free variables.
c. Because x 2 and x 5 are free variables, let x 2 = s and x 5 = t, where s, t ∈ ℝ. Substituting x 5 = t into
equation (3) and solving equation (3) for the basic variable x 4, we obtain x 4 = − 1 + x 5 = − 1 + t.
Substituting x 4 = − 1 + t into equation (2) and solving equation (2) for the basic variable x 3, we
obtain
x 3 = − 3 + x 4 = − 3 + ( − 1 + t) = − 4 + t.
Substituting x 2 = s, x 3 = − 4 + t, x 4 = − 1 + t, and x 5 = t into equation (1) and solving equation (1) for the basic variable x 1, we obtain
x 1 = 2 + x 2 + 3x 3 − x 4 + x 5 = 2 + s + 3( − 4 + t) − ( − 1 + t) + t = − 9 + s + 3t.
Hence,
(x 1, x 2, x 3, x 4, x 5) = ( − 9 + s + 3t, s, − 4 + t, − 1 + t, t)
[ x 1 ]   [ −9 + s + 3t ]   [ −9 ]     [ 1 ]     [ 3 ]
[ x 2 ]   [      s      ]   [  0 ]     [ 1 ]     [ 0 ]
[ x 3 ] = [   −4 + t    ] = [ −4 ] + s [ 0 ] + t [ 1 ]
[ x 4 ]   [   −1 + t    ]   [ −1 ]     [ 0 ]     [ 1 ]
[ x 5 ]   [      t      ]   [  0 ]     [ 0 ]     [ 1 ]
.
d. Because the system has infinitely many solutions, b→ = (2, 3, 1)^T is a linear combination of the column vectors of A.
Example 4.2.3 also shows that if r(A) = r(A | b→) = 3 < n, where n = 5 is the number of variables, then the system has infinitely many solutions.
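A parametric solution like the one above can be verified mechanically: for every choice of the parameters s and t, the resulting vector must satisfy all three equations of Example 4.2.3. The following is an illustrative sketch (the helper names are ours, not the textbook's):

```python
from random import randint

# Parametric solution of Example 4.2.3:
# (x1,...,x5) = (-9, 0, -4, -1, 0) + s*(1, 1, 0, 0, 0) + t*(3, 0, 1, 1, 1)
def solution(s, t):
    p  = [-9, 0, -4, -1, 0]
    v1 = [1, 1, 0, 0, 0]
    v2 = [3, 0, 1, 1, 1]
    return [p[i] + s * v1[i] + t * v2[i] for i in range(5)]

def residual(x):
    """Left-hand side minus right-hand side of each of the three equations."""
    x1, x2, x3, x4, x5 = x
    return [x1 - x2 - 3*x3 + x4 - x5 - 2,
            -x3 + x4 - 3,
            -x4 + x5 - 1]

# Every parameter choice should give a residual of (0, 0, 0).
for _ in range(100):
    s, t = randint(-50, 50), randint(-50, 50)
    assert residual(solution(s, t)) == [0, 0, 0]
print("parametric solution verified")
```

Since the parameters enter linearly, checking many random values of s and t is a strong sanity check on the back-substitution work above.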
Example 4.2.4.
i.
x 1 − x 2 − 3x 3 = 1
− x 2 + x 3 = 3
0x 3 = 1
ii.
x 1 − x 2 − 3x 3 = 1
− x 2 + x 3 = 3
2x 3 = − 1
0x 3 = 5
a. Find the coefficient matrix and the augmented matrix of the system.
b. Find the basic variables and free variables of the system.
c. Solve the system.
Solution
i.
a. The coefficient and the augmented matrices are
A =
[ 1  −1  −3 ]
[ 0  −1   1 ]
[ 0   0   0 ]
and (A | b→) =
[ 1  −1  −3 |  1 ]
[ 0  −1   1 |  3 ]
[ 0   0   0 |  1 ]
.
b. A is a 3 × 3 row echelon matrix. The columns 1,2 of A contain leading entries and thus, x 1
and x 2 are basic variables. The column 3 of A does not contain any leading entries, so x 3 is
a free variable.
c. The last equation reads 0 = 1, which no (x 1, x 2, x 3) can satisfy. Hence, the system has no solutions.
ii.
a. The coefficient and the augmented matrices are
A =
[ 1  −1  −3 ]
[ 0  −1   1 ]
[ 0   0   2 ]
[ 0   0   0 ]
and (A | b→) =
[ 1  −1  −3 |  1 ]
[ 0  −1   1 |  3 ]
[ 0   0   2 | −1 ]
[ 0   0   0 |  5 ]
.
b. A is a 4 × 3 row echelon matrix. The columns 1,2,3 of A contain leading entries and thus,
x 1, x 2, and x 3 are basic variables. There are no free variables.
c. The last equation reads 0 = 5, which no (x 1, x 2, x 3) can satisfy. Hence, the system has no solutions.
Note that in Example 4.2.4 (i), r(A) = 2 < 3 = n, where n is the number of variables, and r(A | b→) = 3. Example 4.2.4 (i) shows that if r(A) < r(A | b→), then the system has no solutions; even when a system contains free variables, the system may have no solutions. In Example 4.2.4 (ii), r(A) = 3 = n, where n is the number of variables, and r(A | b→) = 4. Example 4.2.4 (ii) also shows that if r(A) < r(A | b→), then the system has no solutions; even when r(A) = n, the system may have no solutions.
Now, we study how to solve the general system AX→ = b→ given in (4.1.7), where A is not a row echelon matrix.
Let B be a row echelon matrix of A. By Theorems 2.6.2 and 2.7.3, there exists an invertible matrix E such that B = EA. By (4.1.7), we obtain
(4.2.2)
BX→ = (EA)X→ = E b→ := c→.
Theorem 4.2.1.
The systems (4.1.7) and (4.2.2) are equivalent, that is, they have the same solutions.
Proof
We have shown above that if X→ is a solution of (4.1.7), then it is a solution of (4.2.2). Conversely, noting that E is invertible, we obtain that if X→ is a solution of (4.2.2), then it is a solution of (4.1.7). Hence, (4.1.7) and (4.2.2) are equivalent.
Let (A | b→) be the augmented matrix of (4.1.4) given in (4.1.6). Note that (4.2.2) can be expressed as
E(A | b→) = (EA | E b→) = (B | c→).
Hence, to solve (4.1.7), we first use row operations to change (A | b→) to (B | c→) and then solve BX→ = c→, where B is a row echelon matrix, by using back-substitution. By Theorem 4.2.1, the solutions of (4.2.2) are the solutions of (4.1.7). This method is called Gaussian elimination.
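The two stages just described — row reduction of (A | b→) to an echelon matrix, then back-substitution — can be sketched in code. This is a minimal sketch using exact rational arithmetic; the function names are ours, and the back-substitution step assumes the echelon system is square with a nonzero pivot in every column (the unique-solution case):

```python
from fractions import Fraction

def gaussian_elimination(aug):
    """Reduce an augmented matrix [A | b] to row echelon form
    using elementary row operations over the rationals."""
    m = len(aug)
    n = len(aug[0]) - 1                       # number of variables
    rows = [[Fraction(x) for x in row] for row in aug]
    r = 0                                     # next pivot row
    for c in range(n):
        piv = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if piv is None:
            continue                          # no pivot here -> free variable
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, m):             # eliminate below the pivot
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return rows

def back_substitute(echelon):
    """Solve B x = c for a square, full-rank echelon matrix [B | c]."""
    n = len(echelon)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = echelon[i][-1] - sum(echelon[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / echelon[i][i]
    return x

# Example 4.2.5 (a): 2x1 + 4x2 + 6x3 = 18, 4x1 + 5x2 + 6x3 = 24, 3x1 + x2 - 2x3 = 4
aug = [[2, 4, 6, 18], [4, 5, 6, 24], [3, 1, -2, 4]]
print(back_substitute(gaussian_elimination(aug)))  # the unique solution x = (4, -2, 3)
```

Using Fraction avoids the rounding error a floating-point reduction would introduce, so the result matches hand computation exactly.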
Example 4.2.5.
Solve each of the following systems by using Gaussian elimination.
a.
2x 1 + 4x 2 + 6x 3 = 18
4x 1 + 5x 2 + 6x 3 = 24
3x 1 + x 2 − 2x 3 = 4
b.
2x 1 + 2x 2 − 2x 3 = 4
3x 1 + 5x 2 + x 3 = − 8
− 4x 1 − 7x 2 − 2x 3 = 13
c.
x 1 + 2x 2 − x 3 + 3x 4 = 5
− 2x 1 + 3x 2 − 5x 3 + x 4 = 4
d.
x 1 + 2x 2 + 3x 3 = 4
4x 1 + 7x 2 + 6x 3 = 17
2x 1 + 5x 2 + 12x 3 = 10
Solution
a.
(A | b→) =
[ 2  4   6 | 18 ]
[ 4  5   6 | 24 ]
[ 3  1  −2 |  4 ]
R1(1/2) →
[ 1  2   3 |  9 ]
[ 4  5   6 | 24 ]
[ 3  1  −2 |  4 ]
R1(−4)+R2, R1(−3)+R3 →
[ 1   2    3 |   9 ]
[ 0  −3   −6 | −12 ]
[ 0  −5  −11 | −23 ]
R2(−1/3) →
[ 1   2    3 |   9 ]
[ 0   1    2 |   4 ]
[ 0  −5  −11 | −23 ]
R2(5)+R3 →
[ 1  2   3 |  9 ]
[ 0  1   2 |  4 ]
[ 0  0  −1 | −3 ]
= (B | c→).
The system of linear equations corresponding to (B | c→) is
x 1 + 2x 2 + 3x 3 = 9   (1)
x 2 + 2x 3 = 4   (2)
− x 3 = − 3   (3)
Now, we use back-substitution to solve the above system. The variables x 1, x 2, and x 3 are basic
variables. Solving equation (3) for basic variable x 3 implies x 3 = 3. Substituting x 3 = 3 into equation
(2) and solving equation (2) for the basic variable x 2, we obtain x 2 = 4 − 2x 3 = 4 − 2(3) = − 2.
Substituting x 2 = − 2 and x 3 = 3 into equation (1) and solving equation (1) for the basic variable
x 1, we obtain
x 1 = 9 − 2x 2 − 3x 3 = 9 − 2( − 2) − 3(3) = 4.
Hence, the solution of the system is (x 1, x 2, x 3) = (4, − 2, 3).
b.
(A | b→) =
[  2   2  −2 |  4 ]
[  3   5   1 | −8 ]
[ −4  −7  −2 | 13 ]
R1(1/2) →
[  1   1  −1 |  2 ]
[  3   5   1 | −8 ]
[ −4  −7  −2 | 13 ]
R1(−3)+R2, R1(4)+R3 →
[ 1   1  −1 |   2 ]
[ 0   2   4 | −14 ]
[ 0  −3  −6 |  21 ]
R2(1/2) →
[ 1   1  −1 |  2 ]
[ 0   1   2 | −7 ]
[ 0  −3  −6 | 21 ]
R2(3)+R3 →
[ 1  1  −1 |  2 ]
[ 0  1   2 | −7 ]
[ 0  0   0 |  0 ]
= (B | c→).
The system of linear equations corresponding to (B | c→) is
x 1 + x 2 − x 3 = 2   (1)
x 2 + 2x 3 = − 7.   (2)
The variables x 1 and x 2 are the basic variables and x 3 is a free variable. Let x 3 = t, where t ∈ ℝ. Substituting x 3 = t into equation (2) and solving equation (2) for the basic variable x 2, we have x 2 = − 7 − 2x 3 = − 7 − 2t. Substituting x 2 = − 7 − 2t and x 3 = t into equation (1) and solving equation (1) for the basic variable x 1, we obtain
x 1 = 2 − x 2 + x 3 = 2 − ( − 7 − 2t) + t = 9 + 3t.
Hence, (x 1, x 2, x 3) = (9 + 3t, − 7 − 2t, t) = (9, − 7, 0) + t(3, − 2, 1).
c.
(A | b→) =
[  1  2  −1  3 | 5 ]
[ −2  3  −5  1 | 4 ]
R1(2)+R2 →
[ 1  2  −1  3 |  5 ]
[ 0  7  −7  7 | 14 ]
R2(1/7) →
[ 1  2  −1  3 | 5 ]
[ 0  1  −1  1 | 2 ]
= (B | c→).
The system of linear equations corresponding to (B | c→) is
x 1 + 2x 2 − x 3 + 3x 4 = 5   (1)
x 2 − x 3 + x 4 = 2.   (2)
The variables x 1 and x 2 are the basic variables and x 3 and x 4 are free variables. Let x 3 = s and x 4 = t, where s, t ∈ ℝ. Substituting x 3 = s and x 4 = t into equation (2) and solving equation (2) for the basic variable x 2, we obtain x 2 = 2 + x 3 − x 4 = 2 + s − t. Substituting x 2 = 2 + s − t, x 3 = s, and x 4 = t into equation (1) and solving equation (1) for the basic variable x 1, we obtain
x 1 = 5 − 2x 2 + x 3 − 3x 4 = 5 − 2(2 + s − t) + s − 3t = 1 − s − t.
d.
(A | b→) =
[ 1  2   3 |  4 ]
[ 4  7   6 | 17 ]
[ 2  5  12 | 10 ]
R1(−4)+R2, R1(−2)+R3 →
[ 1   2   3 | 4 ]
[ 0  −1  −6 | 1 ]
[ 0   1   6 | 2 ]
R2(1)+R3 →
[ 1   2   3 | 4 ]
[ 0  −1  −6 | 1 ]
[ 0   0   0 | 3 ]
= (B | c→).
The system of linear equations corresponding to (B | c→) is
x 1 + 2x 2 + 3x 3 = 4   (1)
− x 2 − 6x 3 = 1   (2)
0x 1 + 0x 2 + 0x 3 = 3.   (3)
The last equation (3) reads 0 = 3, which is impossible. Thus the original system has no solutions.
Exercises
1. Consider each of the following systems.
i.
x 1 + x 2 − 3x 3 = 2
− x 2 − 6x 3 = 4
− x 3 = 1
ii.
x 1 − 2x 2 + 3x 3 + x 4 = − 1
x 2 − x 3 + x 4 = 4
− x 4 = 2
iii.
− x 1 − 2x 2 − 2x 3 + x 4 − 3x 5 = 4
− 2x 3 + 3x 4 = 1
− 4x 4 + 2x 5 = 5
iv.
− 3x 1 − 2x 2 − 2x 3 = 2
x 2 − x 3 = 2
0x 3 = 4
a. Find the coefficient matrix and the augmented matrix of the system.
b. Find the basic variables and free variables of the system.
c. Solve the system by using back-substitution if it is consistent, and express its solution as a
linear combination.
d. Determine if the vector b→ in the right side of the system is a linear combination of the column
vectors of the coefficient matrix of the system.
2. Solve each of the following systems by using Gaussian elimination and express its solutions as linear
combinations if they have infinitely many solutions.
a.
x 1 − 2x 2 + 3x 3 = 6
2x 1 + x 2 + 4x 3 = 5
− 3x 1 + x 2 − 2x 3 = 3
b.
x 1 + x 2 − 2x 3 = 3
2x 1 + 3x 2 − x 3 = − 4
− 2x 1 − 3x 2 − x 3 = 4
c.
x 1 + 2x 2 + 3x 3 = 4
4x 1 + 7x 2 + 6x 3 = 17
2x 1 + 5x 2 + 12x 3 = 7
d.
x 1 + 2x 2 − x 3 + 3x 4 = 5
3x 1 − x 2 + 4x 3 + 2x 4 = 1
− x 1 + 5x 2 − 6x 3 + 4x 4 = 9
e.
x 1 − 2x 2 = 3
2x 1 − x 2 = 0
− x 1 + 4x 2 = − 1
f.
− x 1 + 2x 2 − 3x 3 = − 4
2x 1 − x 2 + 2x 3 = 2
x 1 + x 2 − x 3 = 2
4.3 Gauss-Jordan Elimination
Example 4.3.1.
Solve each of the following systems by using Gauss-Jordan elimination.
a.
2x 1 + 4x 2 + 6x 3 = 18
4x 1 + 5x 2 + 6x 3 = 24
3x 1 + x 2 − 2x 3 = 4
b.
x 1 + x 2 − x 3 = 1
− 3x 1 + 5x 2 − x 3 = − 1
3x 1 + 7x 2 − 5x 3 = 4
c.
2x 2 + 3x 3 + x 4 = 4
x 1 + x 3 − 2x 4 = 2
− 3x 1 + 5x 2 − 5x 4 = − 5
d.
x 1 + x 2 − 2x 3 = 1
− 2x 1 − x 2 + x 3 = − 2
Solution
a.
(A | b→) =
[ 2  4   6 | 18 ]
[ 4  5   6 | 24 ]
[ 3  1  −2 |  4 ]
R1(1/2) →
[ 1  2   3 |  9 ]
[ 4  5   6 | 24 ]
[ 3  1  −2 |  4 ]
R1(−4)+R2, R1(−3)+R3 →
[ 1   2    3 |   9 ]
[ 0  −3   −6 | −12 ]
[ 0  −5  −11 | −23 ]
R2(−1/3) →
[ 1   2    3 |   9 ]
[ 0   1    2 |   4 ]
[ 0  −5  −11 | −23 ]
R2(5)+R3 →
[ 1  2   3 |  9 ]
[ 0  1   2 |  4 ]
[ 0  0  −1 | −3 ]
R3(−1) →
[ 1  2  3 | 9 ]
[ 0  1  2 | 4 ]
[ 0  0  1 | 3 ]
R3(−2)+R2, R3(−3)+R1 →
[ 1  2  0 |  0 ]
[ 0  1  0 | −2 ]
[ 0  0  1 |  3 ]
R2(−2)+R1 →
[ 1  0  0 |  4 ]
[ 0  1  0 | −2 ]
[ 0  0  1 |  3 ]
.
Hence, the solution is (x 1, x 2, x 3) = (4, − 2, 3).
b.
(A | b→) =
[  1  1  −1 |  1 ]
[ −3  5  −1 | −1 ]
[  3  7  −5 |  4 ]
R1(3)+R2, R1(−3)+R3 →
[ 1  1  −1 | 1 ]
[ 0  8  −4 | 2 ]
[ 0  4  −2 | 1 ]
R2(1/2) →
[ 1  1  −1 | 1 ]
[ 0  4  −2 | 1 ]
[ 0  4  −2 | 1 ]
R2(−1)+R3 →
[ 1  1  −1 | 1 ]
[ 0  4  −2 | 1 ]
[ 0  0   0 | 0 ]
R2(1/4) →
[ 1  1  −1   |  1  ]
[ 0  1  −1/2 | 1/4 ]
[ 0  0   0   |  0  ]
R2(−1)+R1 →
[ 1  0  −1/2 | 3/4 ]
[ 0  1  −1/2 | 1/4 ]
[ 0  0   0   |  0  ]
.
The system of linear equations corresponding to the last matrix is
x 1 − (1/2)x 3 = 3/4
x 2 − (1/2)x 3 = 1/4,
where x 1 and x 2 are basic variables and x 3 is a free variable. Let x 3 = t ∈ ℝ. Then
(x 1, x 2, x 3) = (3/4 + (1/2)t, 1/4 + (1/2)t, t).
c.
(A | b→) =
[  0  2  3   1 |  4 ]
[  1  0  1  −2 |  2 ]
[ −3  5  0  −5 | −5 ]
R1,2 →
[  1  0  1  −2 |  2 ]
[  0  2  3   1 |  4 ]
[ −3  5  0  −5 | −5 ]
R1(3)+R3 →
[ 1  0  1   −2 | 2 ]
[ 0  2  3    1 | 4 ]
[ 0  5  3  −11 | 1 ]
R2(−2)+R3 →
[ 1  0   1   −2 |  2 ]
[ 0  2   3    1 |  4 ]
[ 0  1  −3  −13 | −7 ]
R2,3 →
[ 1  0   1   −2 |  2 ]
[ 0  1  −3  −13 | −7 ]
[ 0  2   3    1 |  4 ]
R2(−2)+R3 →
[ 1  0   1   −2 |  2 ]
[ 0  1  −3  −13 | −7 ]
[ 0  0   9   27 | 18 ]
R3(1/9) →
[ 1  0   1   −2 |  2 ]
[ 0  1  −3  −13 | −7 ]
[ 0  0   1    3 |  2 ]
R3(3)+R2, R3(−1)+R1 →
[ 1  0  0  −5 |  0 ]
[ 0  1  0  −4 | −1 ]
[ 0  0  1   3 |  2 ]
.
The system of linear equations corresponding to the last matrix is
x 1 − 5x 4 = 0
x 2 − 4x 4 = − 1
x 3 + 3x 4 = 2,
where x 1, x 2, and x 3 are basic variables and x 4 is a free variable. Let x 4 = t ∈ ℝ. Then
(x 1, x 2, x 3, x 4) = (5t, − 1 + 4t, 2 − 3t, t) = (0, − 1, 2, 0) + t(5, 4, − 3, 1).
d.
(A | b→) =
[  1   1  −2 |  1 ]
[ −2  −1   1 | −2 ]
R1(2)+R2 →
[ 1  1  −2 | 1 ]
[ 0  1  −3 | 0 ]
R2(−1)+R1 →
[ 1  0   1 | 1 ]
[ 0  1  −3 | 0 ]
= (B | c→).
The system of linear equations corresponding to (B | c→) is
x 1 + x 3 = 1
x 2 − 3x 3 = 0,
where x 1 and x 2 are basic variables and x 3 is a free variable. Let x 3 = t. Then x 1 = 1 − x 3 = 1 − t and x 2 = 3x 3 = 3t. Hence,
(4.3.1)
[ x 1 ]   [ 1 − t ]   [ 1 ]     [ −1 ]
[ x 2 ] = [  3t   ] = [ 0 ] + t [  3 ]
[ x 3 ]   [   t   ]   [ 0 ]     [  1 ]
.
Remark 4.3.1.
The system (a) in Example 4.3.1 is the same as the system (a) in Example 4.2.5, and we have used two methods to solve it. The solution (4.3.1) of the system (d) in Example 4.3.1 is the parametric equation of the line of intersection of the two planes
x 1 + x 2 − 2x 3 = 1 and − 2x 1 − x 2 + x 3 = − 2.
We shall study lines and planes in more detail in Sections 6.2 and 6.3.
For a homogeneous system (4.1.8), we can use Gauss-Jordan elimination to solve it. Assume that we use row operations to change the augmented matrix (A | 0→) to the reduced row echelon matrix, denoted by (B | 0→), from which we see that there are null(A) free variables.
The following result gives the expression of the solution space of (4.1.8).
Theorem 4.3.1.
Let k = null(A). Let x n1, x n2, ⋯, x nk be the free variables of (4.1.8), where 1 ≤ n 1 < n 2 < ⋯ < n k ≤ n. Then there exist k vectors
(4.3.2)
v→1, v→2, ⋯, v→k in ℝ n
such that
(P 1) for each i ∈ {1, 2, ⋯, k}, the n i th component of v→i is 1 and the n j th component of v→i is zero for j ≠ i and j ∈ {1, 2, ⋯, k};
(P 2) N A = span{v→1, v→2, ⋯, v→k}.
Proof
Let x n i = t i for i ∈ {1, 2, ⋯, k}. We solve the system corresponding to the augmented matrix (B | 0→) and rewrite the solution in the following form:
(4.3.3)
X→ = t 1 v→1 + t 2 v→2 + ⋯ + t k v→k,
where v→1, v→2, ⋯, v→k satisfy (P 1). By (4.3.3), we see that (P 2) holds.
Definition 4.3.1.
The vectors v→1, v→2, ⋯, v→k given in (4.3.2) are called the basis vectors of the solution space N A.
Example 4.3.2.
For each of the following homogeneous systems, solve it by using Gauss-Jordan elimination, write its solution
as a linear combination, and find its solution space N A and dim(N A).
{
2x 1 + 4x 2 + 6x 3 = 0
a. 4x 1 + 5x 2 + 6x 3 = 0
3x 1 + x 2 − 2x 3 = 0.
{
x 1 + 2x 2 − x 3 = 0
b. 3x 1 − 3x 2 + 2x 3 = 0
− x 1 − 11x 2 + 6x 3 = 0.
c.
{ x 1 − 2x 2 − 3x 3 + x 4 = 0
x 2 − 2x 3 + 2x 4 = 0.
Solution
a.
(A | 0→) =
[ 2  4   6 | 0 ]
[ 4  5   6 | 0 ]
[ 3  1  −2 | 0 ]
R1(1/2) →
[ 1  2   3 | 0 ]
[ 4  5   6 | 0 ]
[ 3  1  −2 | 0 ]
R1(−4)+R2, R1(−3)+R3 →
[ 1   2    3 | 0 ]
[ 0  −3   −6 | 0 ]
[ 0  −5  −11 | 0 ]
R2(−1/3) →
[ 1   2    3 | 0 ]
[ 0   1    2 | 0 ]
[ 0  −5  −11 | 0 ]
R2(5)+R3 →
[ 1  2   3 | 0 ]
[ 0  1   2 | 0 ]
[ 0  0  −1 | 0 ]
R3(−1) →
[ 1  2  3 | 0 ]
[ 0  1  2 | 0 ]
[ 0  0  1 | 0 ]
R3(−2)+R2, R3(−3)+R1 →
[ 1  2  0 | 0 ]
[ 0  1  0 | 0 ]
[ 0  0  1 | 0 ]
R2(−2)+R1 →
[ 1  0  0 | 0 ]
[ 0  1  0 | 0 ]
[ 0  0  1 | 0 ]
= (D | 0→).
The system corresponding to the augmented matrix (D | 0→) has only the zero solution (0, 0, 0). Hence, the original system has only the zero solution. The solution space is N A = {0→}. Because N A = {0→}, dim(N A) = 0.
b.
(A | 0→) =
[  1    2  −1 | 0 ]
[  3   −3   2 | 0 ]
[ −1  −11   6 | 0 ]
R1(−3)+R2, R1(1)+R3 →
[ 1   2  −1 | 0 ]
[ 0  −9   5 | 0 ]
[ 0  −9   5 | 0 ]
R2(−1)+R3 →
[ 1   2  −1 | 0 ]
[ 0  −9   5 | 0 ]
[ 0   0   0 | 0 ]
R2(−1/9) →
[ 1  2  −1   | 0 ]
[ 0  1  −5/9 | 0 ]
[ 0  0   0   | 0 ]
R2(−2)+R1 →
[ 1  0   1/9 | 0 ]
[ 0  1  −5/9 | 0 ]
[ 0  0   0   | 0 ]
= (D | 0→).
The system corresponding to (D | 0→) is
x 1 + (1/9)x 3 = 0
x 2 − (5/9)x 3 = 0.
This implies x 1 = − (1/9)x 3 and x 2 = (5/9)x 3. Let x 3 = t ∈ ℝ. Then x 1 = − (1/9)t and x 2 = (5/9)t. We write the solution as a linear combination as follows:
[ x 1 ]   [ −(1/9)t ]     [ −1/9 ]
[ x 2 ] = [  (5/9)t ] = t [  5/9 ]
[ x 3 ]   [    t    ]     [   1  ]
.
Let v→1 = (− 1/9, 5/9, 1). Then the solution space is
N A = {t v→1 : t ∈ ℝ} = span{v→1},
and dim(N A) = 1.
c.
(A | 0→) =
[ 1  −2  −3  1 | 0 ]
[ 0   1  −2  2 | 0 ]
R2(2)+R1 →
[ 1  0  −7  5 | 0 ]
[ 0  1  −2  2 | 0 ]
.
The system corresponding to the last augmented matrix is
x 1 − 7x 3 + 5x 4 = 0
x 2 − 2x 3 + 2x 4 = 0.
The variables x 1 and x 2 are basic variables and x 3 and x 4 are free variables. Let x 3 = s and x 4 = t, where s, t ∈ ℝ. Then x 1 = 7s − 5t and x 2 = 2s − 2t. Hence,
[ x 1 ]   [ 7s − 5t ]     [ 7 ]     [ −5 ]
[ x 2 ] = [ 2s − 2t ] = s [ 2 ] + t [ −2 ]
[ x 3 ]   [    s    ]     [ 1 ]     [  0 ]
[ x 4 ]   [    t    ]     [ 0 ]     [  1 ]
.
Let v→1 = (7, 2, 1, 0) and v→2 = (− 5, − 2, 0, 1). Then
N A = {s v→1 + t v→2 : s, t ∈ ℝ} = span{v→1, v→2},
and dim(N A) = 2.
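The procedure behind Theorem 4.3.1 — reduce A to reduced row echelon form and read off one basis vector of N A per free variable — can be sketched in code. This is an illustrative sketch (function names are ours), again using exact rational arithmetic:

```python
from fractions import Fraction

def rref(mat):
    """Reduced row echelon form over the rationals; returns (R, pivot_columns)."""
    rows = [[Fraction(x) for x in r] for r in mat]
    m, n = len(rows), len(rows[0])
    pivots, r = [], 0
    for c in range(n):
        piv = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if piv is None:
            continue                              # column c has no pivot
        rows[r], rows[piv] = rows[piv], rows[r]
        rows[r] = [a / rows[r][c] for a in rows[r]]
        for i in range(m):                        # eliminate above and below
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows, pivots

def null_space_basis(mat):
    """Basis vectors v1, ..., vk of N_A read off the RREF, as in Theorem 4.3.1."""
    R, pivots = rref(mat)
    n = len(mat[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for fc in free:
        v = [Fraction(0)] * n
        v[fc] = Fraction(1)                       # free component set to 1 (P1)
        for r, pc in enumerate(pivots):
            v[pc] = -R[r][fc]
        basis.append(v)
    return basis

# Example 4.3.2 (c): x1 - 2x2 - 3x3 + x4 = 0, x2 - 2x3 + 2x4 = 0
A = [[1, -2, -3, 1], [0, 1, -2, 2]]
print(null_space_basis(A))   # spans N_A; here dim(N_A) = 2
```

Applied to Example 4.3.2 (c), this reproduces the vectors v→1 = (7, 2, 1, 0) and v→2 = (−5, −2, 0, 1) computed above; the number of basis vectors returned equals null(A).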
Example 4.3.3.
There are three types of food provided to a lake to support three species of fish. Each fish of Species 1
weekly consumes an average of one unit of Food 1, two units of Food 2, and one unit of Food 3. Each fish of
Species 2 consumes, each week, an average of two units of Food 1, five units of Food 2, and three units of Food 3. The average weekly consumption for a fish of Species 3 is two units of Food 1, two units of Food 2, and six
units of Food 3. Each week, 10000 units of Food 1, 15000 units of Food 2, and 25000 units of Food 3 are
supplied to the lake. Assuming that all food is eaten, how many fish of each species can coexist in the lake?
Solution
We denote by x 1, x 2, and x 3 the number of fish of the three species in the lake. Using the information in the
problem, we see that x 1 fish of Species 1 consume x 1 units of Food 1, x 2 fish of Species 2 consume 2x 2 units
of Food 1, and x 3 fish of Species 3 consume 2x 3 units of Food 1. Hence, x 1 + 2x 2 + 2x 3 = 10000 = total weekly
supply of Food 1. Similarly, we can establish the equations for the other two foods. This leads to the following
system
{
x 1 + 2x 2 + 2x 3 = 10000
2x 1 + 5x 2 + 2x 3 = 15000
2x 1 + 3x 2 + 6x 3 = 25000.
(A | b→) =
[ 1  2  2 | 10000 ]
[ 2  5  2 | 15000 ]
[ 2  3  6 | 25000 ]
R1(−2)+R2, R1(−2)+R3 →
[ 1   2   2 | 10000 ]
[ 0   1  −2 | −5000 ]
[ 0  −1   2 |  5000 ]
R2(1)+R3 →
[ 1  2   2 | 10000 ]
[ 0  1  −2 | −5000 ]
[ 0  0   0 |     0 ]
R2(−2)+R1 →
[ 1  0   6 | 20000 ]
[ 0  1  −2 | −5000 ]
[ 0  0   0 |     0 ]
.
The system corresponding to the last matrix is
x 1 + 6x 3 = 20000
x 2 − 2x 3 = − 5000,
where x 3 is a free variable. Let x 3 = t. Then
x 1 = 20000 − 6t
x 2 = − 5000 + 2t
x 3 = t.
Noting that the number of fish must be nonnegative, we must have x 1 ≥ 0, x 2 ≥ 0, and x 3 ≥ 0. Hence,
x 1 = 20000 − 6t ≥ 0
x 2 = − 5000 + 2t ≥ 0
x 3 = t ≥ 0.
Solving the above system of inequalities, we get 2500 ≤ t ≤ 20000 / 6 ≈ 3333.3. Because t must be an integer, we take 2500 ≤ t ≤ 3333. This means that the population that can be supported by the lake with all food
consumed is
{
x 1 = 20000 − 6x 3
x 2 = − 5000 + 2x 3
2500 ≤ x 3 ≤ 3333.
For example, if we take x 3 = 2500, then x 1 = 20000 − 6(2500) = 5000 and x 2 = − 5000 + 2(2500) = 0. This
means that if there are 2500 fish of Species 3, then there are no fish of Species 2. Otherwise, the lake cannot
support all three species with the current supply of food. If we take x 3 = 3000, then x 1 = 20000 − 6(3000) = 2000 and x 2 = − 5000 + 2(3000) = 1000. This suggests that, with the current supply of food, 2000 fish of Species 1, 1000 fish of Species 2, and 3000 fish of Species 3 can coexist in the lake.
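Since the feasible populations form a small integer range, they can also be enumerated directly. A quick sketch (assuming, as in the example, that fish counts must be nonnegative integers):

```python
# Feasible populations from Example 4.3.3:
# x1 = 20000 - 6t, x2 = -5000 + 2t, x3 = t, with x1, x2, x3 >= 0.
feasible = [(20000 - 6 * t, -5000 + 2 * t, t)
            for t in range(0, 20001)
            if 20000 - 6 * t >= 0 and -5000 + 2 * t >= 0]

# The range of admissible values of t = x3:
print(min(f[2] for f in feasible), max(f[2] for f in feasible))  # 2500 3333
```

The enumeration confirms the bounds 2500 ≤ x 3 ≤ 3333 obtained from the inequalities, and the endpoint t = 2500 gives the population (5000, 0, 2500) discussed above.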
Example 4.3.4.
In an economic system with steel and automobile industries, let x 1 and x 2 denote the outputs of the steel and
automobile industries, respectively, and b 1 and b 2 represent the external demand from outside the steel and
automobile industries. Let a 11 represent the internal demand placed on the steel industry by itself, that is, the
steel industry needs a 11 units of the output of the steel industry to produce one unit of its output. Thus, a 11x 1
is the total amount the steel industry needs from itself. Let a 12 represent the internal demand placed on the
steel industry by the automobile industry, that is, the automobile industry needs a 12 units of the output of the
steel industry to produce one unit of output of the automobile industry. a 12x 2 is the total amount the
automobile industry needs from the steel industry. Thus, the total demand on the steel industry is
(4.3.4)
a 11x 1 + a 12x 2 + b 1.
Let a 21 represent the internal demand placed on the automobile industry by the steel industry, that is, the
steel industry needs a 21 units of the output of the automobile industry to produce one unit of output of the
steel industry. Hence, a 21x 1 is the total amount the steel industry needs from the automobile industry. Let a 22
represent the internal demand placed on the automobile industry by itself, that is, the automobile industry
needs a 22 units of the output of the automobile industry to produce one unit of its output. Hence, a 22x 2 is the
total amount the automobile industry needs from itself. Thus, the total demand on the automobile industry is
(4.3.5)
a 21x 1 + a 22x 2 + b 2.
Now we assume that the output of each industry equals its demand, that is, there is no overproduction. Then,
by hypotheses, (4.3.4) and (4.3.5) , we obtain the following system
{ a 11x 1 + a 12x 2 + b 1 = x 1
a 21x 1 + a 22x 2 + b 2 = x 2,
or equivalently,
(4.3.6)
{ (1 − a 11)x 1 − a 12x 2 = b 1,
− a 21x 1 + (1 − a 22)x 2 = b 2.
Example 4.3.5.
In the economic system given in Example 4.3.4, suppose that the external demands are b 1 = 10 and b 2 = 20, and that a 11 = 9/10, a 12 = 1/2, a 21 = 1/10, and a 22 = 1/5. Find the output in each industry such that supply exactly equals demand.
Solution
By (4.3.6) , we obtain
(4.3.7)
(1 − 9/10)x 1 − (1/2)x 2 = 10,
− (1/10)x 1 + (1 − 1/5)x 2 = 20.
Then
(A | b→) =
[  1/10  −1/2 | 10 ]
[ −1/10   4/5 | 20 ]
R1(10), R2(10) →
[  1  −5 | 100 ]
[ −1   8 | 200 ]
R1(1)+R2 →
[ 1  −5 | 100 ]
[ 0   3 | 300 ]
R2(1/3) →
[ 1  −5 | 100 ]
[ 0   1 | 100 ]
R2(5)+R1 →
[ 1  0 | 600 ]
[ 0  1 | 100 ]
.
Hence, x 1 = 600 and x 2 = 100. The outputs needed for supply to equal demand are 600 and 100 for the steel
and automobile industries, respectively.
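For a 2 × 2 economy, system (4.3.6) can also be solved in closed form with the explicit inverse of a 2 × 2 matrix. A minimal sketch with exact rationals (the helper name is ours; it assumes the determinant of I − a is nonzero), using the coefficients of Example 4.3.5:

```python
from fractions import Fraction as F

def leontief_outputs(a, b):
    """Solve (I - a) x = b for a 2x2 internal-demand matrix a and external
    demand b, via the explicit 2x2 inverse; assumes det(I - a) != 0."""
    m11, m12 = 1 - a[0][0], -a[0][1]
    m21, m22 = -a[1][0], 1 - a[1][1]
    det = m11 * m22 - m12 * m21
    x1 = (m22 * b[0] - m12 * b[1]) / det
    x2 = (m11 * b[1] - m21 * b[0]) / det
    return x1, x2

a = [[F(9, 10), F(1, 2)], [F(1, 10), F(1, 5)]]
print(leontief_outputs(a, [10, 20]))  # x1 = 600, x2 = 100
```

With the internal demands a 11 = 9/10, a 12 = 1/2, a 21 = 1/10, a 22 = 1/5 and external demands 10 and 20, this reproduces the outputs 600 and 100 obtained by row reduction above.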
Exercises
1. Solve each of the following systems using Gauss-Jordan elimination.
a.
{ x1 − x2 + x3 = 2
2x 1 − 3x 2 + x 3 = − 1
b.
{ 3x 1 − x 2 + 3x 3 = − 1
− 4x 1 + 2x 2 − 4x 3 = 2
{
x 1 + 3x 2 − x 3 = 2
c. 4x 1 − 6x 2 + 6x 3 = 14
− 3x 1 − x 2 − 2x 3 = 3
{
x 1 − 2x 2 + 2x 3 = 2
d. − 2x 1 + 3x 2 − 4x 3 = − 2
− 3x 1 + 4x 2 + 6x 3 = 0
{
x2 − x3 = 1
e. − x 1 + 3x 2 − x 3 = − 2
3x 1 + 3x 2 − 2x 3 = 4
{
x 1 + x 2 + 2x 3 = 1
f. − 2x 1 − 4x 2 − 5x 3 = − 1
3x 1 + 6x 2 + 5x 3 = 0
{
2x 2 + 3x 3 − 2x 4 = 3
g. 2x 1 + 3x 3 + 2x 4 = 1
− 3x 1 + x 2 − 2x 3 = 3
{
x 1 − x 2 + 2x 3 = − 1
− x 1 + 3x 2 − 4x 3 = − 3
h.
x2 − x3 = − 2
x 1 − 2x 2 + 3x 3 = 1
2. For each of the following homogeneous systems, solve it by using Gauss-Jordan elimination, write its
solution as a linear combination, and find its solution space N A and dim(N A).
a.
{ x + 2y − z = 0
2x − y + 3z = 0
{
2x − y + 3z = 0
b. 4x − 2y + 6z = 0
− 6x + 3y − 9z = 0
{
x 1 + 4x 2 + 6x 3 = 0
c. 2x 1 + 6x 2 + 3x 3 = 0
− 3x 1 + x 2 + 8x 3 = 0.
{
− x 1 + 2x 2 + x 3 = 0
d. 3x 1 − 3x 2 − 9x 3 = 0
− 2x 1 − x 2 + 12x 3 = 0.
{
2x 1 + 2x 2 − x 3 + x 5 = 0
− x 1 − x 2 + 2x 3 − 3x 4 + x 5 = 0
e.
x 1 + x 2 − 2x 3 − x 5 = 0
x3 + x4 + x5 = 0
{
x 1 + x 2 − 2x 3 = 0
− 3x 1 + 2x 2 − x 3 = 0
f.
− 2x 1 + 3x 2 − 3x 3 = 0
4x 1 − x 2 − x 3 = 0.
4.4 Inverse matrix method and Cramer’s rule
In this section, we consider the system of linear equations
(4.4.1)
AX→ = b→,
where A is an n × n matrix. If A is an invertible matrix, then there are two other methods that can be used to solve such systems. One method is to use the inverse matrix of A to solve the system, called the inverse matrix method, and the other is to use the determinant of A to solve the system, called Cramer's rule.
Theorem 4.4.1.
If A is an invertible n × n matrix, then for each b→ ∈ ℝ n, the system (4.4.1) has the unique solution
(4.4.2)
X→ = A⁻¹ b→.
Proof
Because A is invertible, A⁻¹ exists and X→ given in (4.4.2) is defined. With X→ given in (4.4.2), we have
AX→ = A(A⁻¹ b→) = (AA⁻¹) b→ = I n b→ = b→,
where I n is the identity matrix, so X→ given in (4.4.2) is a solution of (4.4.1). Now we assume that (4.4.1) has a solution Y→, that is, AY→ = b→. Multiplying both sides of AY→ = b→ by A⁻¹, we obtain
A⁻¹(AY→) = A⁻¹ b→.
This implies Y→ = A⁻¹ b→ and Y→ = X→. Hence, (4.4.1) has only one solution, X→ given in (4.4.2).
Example 4.4.1.
Solve the following system by using the inverse matrix method.
{
x 1 + 2x 2 + x 3 = 2
2x 1 + 5x 2 + 2x 3 = 6
x1 − x3 = 2
Solution
From the system, we obtain the coefficient matrix
A =
[ 1  2   1 ]
[ 2  5   2 ]
[ 1  0  −1 ]
and b→ =
[ 2 ]
[ 6 ]
[ 2 ]
.
(A | I 3) =
[ 1  2   1 | 1  0  0 ]
[ 2  5   2 | 0  1  0 ]
[ 1  0  −1 | 0  0  1 ]
R1(−2)+R2, R1(−1)+R3 →
[ 1   2   1 |  1  0  0 ]
[ 0   1   0 | −2  1  0 ]
[ 0  −2  −2 | −1  0  1 ]
R2(2)+R3 →
[ 1  2   1 |  1  0  0 ]
[ 0  1   0 | −2  1  0 ]
[ 0  0  −2 | −5  2  1 ]
R3(−1/2) →
[ 1  2  1 |  1    0    0  ]
[ 0  1  0 | −2    1    0  ]
[ 0  0  1 | 5/2  −1  −1/2 ]
R3(−1)+R1 →
[ 1  2  0 | −3/2  1   1/2 ]
[ 0  1  0 | −2    1   0   ]
[ 0  0  1 | 5/2  −1  −1/2 ]
R2(−2)+R1 →
[ 1  0  0 | 5/2  −1   1/2 ]
[ 0  1  0 | −2    1   0   ]
[ 0  0  1 | 5/2  −1  −1/2 ]
= (I 3 | A⁻¹).
By (4.4.2) ,
X→ =
[ x 1 ]
[ x 2 ]
[ x 3 ]
= A⁻¹ b→ =
[ 5/2  −1   1/2 ] [ 2 ]   [  0 ]
[ −2    1   0   ] [ 6 ] = [  2 ]
[ 5/2  −1  −1/2 ] [ 2 ]   [ −2 ]
.
Hence, the solution of the system is (x 1, x 2, x 3) = (0, 2, − 2).
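The inverse matrix method can be sketched in code: Gauss-Jordan reduce (A | I n) to (I n | A⁻¹), then multiply A⁻¹ by b→. A minimal sketch with exact rationals (function names are ours):

```python
from fractions import Fraction

def inverse(A):
    """Invert an n x n matrix by Gauss-Jordan elimination on (A | I_n);
    raises ValueError when A is singular."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        piv = next((i for i in range(c, n) if aug[i][c] != 0), None)
        if piv is None:
            raise ValueError("matrix is singular")
        aug[c], aug[piv] = aug[piv], aug[c]
        aug[c] = [a / aug[c][c] for a in aug[c]]   # scale pivot row to 1
        for i in range(n):                          # clear the rest of column c
            if i != c:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]                 # right half is A^{-1}

def matvec(M, v):
    return [sum(a * x for a, x in zip(row, v)) for row in M]

# Example 4.4.1: A x = b with b = (2, 6, 2)
A = [[1, 2, 1], [2, 5, 2], [1, 0, -1]]
print(matvec(inverse(A), [2, 6, 2]))  # solution (0, 2, -2)
```

Note that computing A⁻¹ pays off when the same coefficient matrix is used with many right-hand sides b→, since the reduction is done once and each new solution is a single matrix-vector product.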
Corollary 4.4.1.
Let A be an n × n matrix. If |A| ≠ 0, or equivalently r(A) = n, then for each b→ ∈ ℝ n, the system (4.4.1) has a unique solution.
Example 4.4.2.
Find all the numbers a ∈ ℝ such that the following system has a unique solution for each (b 1, b 2) ∈ ℝ 2.
{ x + y = b1
2x + ay = b 2
Solution
Because
|A| =
| 1  1 |
| 2  a |
= a − 2,
we have |A| ≠ 0 if a ≠ 2. By Corollary 4.4.1, if a ≠ 2, then the system has a unique solution for each (b 1, b 2) ∈ ℝ 2.
Example 4.4.3.
Show that the following system has a unique solution for each (b 1, b 2, b 3) ∈ ℝ 3.
{
x 1 + 2x 2 + x 3 = b 1
2x 1 + 5x 2 + 2x 3 = b 2
x1 − x3 = b3
Solution
We find the rank of A.
A =
[ 1  2   1 ]
[ 2  5   2 ]
[ 1  0  −1 ]
R1(−2)+R2, R1(−1)+R3 →
[ 1   2   1 ]
[ 0   1   0 ]
[ 0  −2  −2 ]
R2(2)+R3 →
[ 1  2   1 ]
[ 0  1   0 ]
[ 0  0  −2 ]
.
Hence, r(A) = 3. By Corollary 4.4.1 , the system has a unique solution for each (b 1, b 2, b 3) ∈ ℝ 3.
By Corollary 4.4.1, we see that if |A| ≠ 0, then the system (4.4.1) has a unique solution. Cramer's rule, which we study below, gives an expression for this unique solution in terms of determinants.
(4.4.3)
x 1 = |A 1| / |A|, x 2 = |A 2| / |A|, ⋯, x n = |A n| / |A|,
where A i is obtained by replacing the ith column of A by b→, namely,
(4.4.4)
A i =
[ a 11  a 12  ⋯  a 1(i−1)  b 1  a 1(i+1)  ⋯  a 1n ]
[ a 21  a 22  ⋯  a 2(i−1)  b 2  a 2(i+1)  ⋯  a 2n ]
[  ⋮     ⋮    ⋱     ⋮       ⋮      ⋮     ⋱    ⋮  ]
[ a n1  a n2  ⋯  a n(i−1)  b n  a n(i+1)  ⋯  a nn ]
.
Proof
By Theorem 3.2.1 and (4.4.4) , we have
(4.4.5)
|A i| = ∑_{k=1}^{n} b k A ki.
By (3.3.11) ,
(4.4.6)
adj(A) b→ =
[ A 11  A 21  ⋯  A n1 ] [ b 1 ]   [ ∑_{k=1}^{n} b k A k1 ]
[  ⋮     ⋮        ⋮   ] [  ⋮  ]   [          ⋮           ]
[ A 1i  A 2i  ⋯  A ni ] [ b i ] = [ ∑_{k=1}^{n} b k A ki ]
[  ⋮     ⋮        ⋮   ] [  ⋮  ]   [          ⋮           ]
[ A 1n  A 2n  ⋯  A nn ] [ b n ]   [ ∑_{k=1}^{n} b k A kn ]
.
(4.4.7)
X→ = A⁻¹ b→ = (1 / |A|) adj(A) b→ for each b→ ∈ ℝ n.
Combining (4.4.5), (4.4.6), and (4.4.7), we obtain x i = |A i| / |A| for i = 1, 2, ⋯, n.
Example 4.4.4.
Solve each of the following systems by using Cramer's rule.
a.
{ 2x − 3y = 7
x + 5y = 1
{
x 1 + 2x 3 = 6
b. − 3x 1 + 4x 2 + 6x 3 = 30
− x 1 − 2x 2 + 3x 3 = 8
Solution
a. Let
A =
[ 2  −3 ]
[ 1   5 ]
and b→ =
[ 7 ]
[ 1 ]
.
By (4.4.4), we have
A 1 =
[ 7  −3 ]
[ 1   5 ]
and A 2 =
[ 2  7 ]
[ 1  1 ]
.
By computation, |A| = 13, |A 1| = 38, and |A 2| = − 5. Hence, by (4.4.3),
(x, y) = (38/13, − 5/13)
is the solution of the system.
b. Let
A =
[  1   0  2 ]
[ −3   4  6 ]
[ −1  −2  3 ]
and b→ =
[  6 ]
[ 30 ]
[  8 ]
.
By (4.4.4), we obtain
A 1 =
[  6   0  2 ]
[ 30   4  6 ]
[  8  −2  3 ]
, A 2 =
[  1   6  2 ]
[ −3  30  6 ]
[ −1   8  3 ]
, and A 3 =
[  1   0   6 ]
[ −3   4  30 ]
[ −1  −2   8 ]
.
By computation,
|A| = 44, |A 1| = − 40, |A 2| = 72, and |A 3| = 152,
and hence, by (4.4.3), (x 1, x 2, x 3) = (− 10/11, 18/11, 38/11) is the solution of the system.
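Cramer's rule (4.4.3) translates directly into code: build each A i by replacing a column of A with b→, then divide determinants. A short sketch with exact rationals (function names are ours; cofactor expansion is fine for the small matrices of this chapter):

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = Fraction(0)
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def cramer(A, b):
    """Cramer's rule (4.4.3): x_i = |A_i| / |A|, where A_i is A with
    column i replaced by b; assumes |A| != 0."""
    A = [[Fraction(x) for x in row] for row in A]
    b = [Fraction(x) for x in b]
    d = det(A)
    return [det([row[:i] + [bi] + row[i + 1:] for row, bi in zip(A, b)]) / d
            for i in range(len(A))]

# Example 4.4.4 (b)
A = [[1, 0, 2], [-3, 4, 6], [-1, -2, 3]]
b = [6, 30, 8]
print(cramer(A, b))  # x = (-10/11, 18/11, 38/11)
```

Cofactor expansion costs O(n!) work, so this sketch is only practical for small n; for larger systems, the Gaussian elimination of Section 4.2 is the better tool.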
Exercises
1. Solve each of the following systems using the inverse matrix method.
a.
{ 3x 1 + 4x 2 = 7
x 1 − x 2 = 14
{
x1 + x2 + x3 = 2
b. 2x 1 + 3x 2 + x 3 = 2
x1 − x3 = 1
{
x 1 − 2x 2 + 2x 3 = 1
c. 2x 1 + 3x 2 + 3x 3 = − 1
x 1 + 6x 3 = − 4
{
x 1 + 2x 2 + 3x 3 = 0
d. 2x 1 + 5x 2 + 3x 3 = 1
x 1 + 8x 3 = − 1
{
x 1 − 2x 3 + x 4 = 0
x 2 + x 3 + 2x 4 = 1
e.
− x1 − x2 + x3 = − 1
2x 1 + 9x 4 = 1
2. Show that if a ≠ 3, then the following system has a unique solution for (b 1, b 2) ∈ ℝ 2.
{ x − y = b1
ax − 3y = b 2
3. For each (b 1, b 2, b 3) ∈ ℝ 3, show that the following system has a unique solution.
{
x 1 + 2x 3 = b 1
− 3x 1 + 4x 2 + 6x 3 = b 2
− x 1 − 2x 2 + 3x 3 = b 3
4. Solve each of the following systems by using Cramer's rule.
a.
{ x − 3y = 6
2x + 5y = − 1
{
x 1 + 2x 3 = 2
b. − 2x 1 + 3x 2 + x 3 = 16
− x 1 − 2x 2 + 4x 3 = 5
{
x1 + x2 − x3 = 1
c. 2x 1 + x 2 − x 3 = 0
x 2 − 2x 3 = − 2
{
x1 + x2 + x3 = 0
d. 2x 1 − x 2 − x 3 = 1
x 2 + 2x 3 = 2
{
x1 − x4 = 0
x2 + x3 = 0
e.
x1 − x2 = 0
x3 − x4 = 1
4.5 Consistency of systems of linear equations
In this section, we use the ranks r(A) and r(A | b→) to determine whether the linear system (4.1.7), that is,
(4.5.1)
AX→ = b→,
where the coefficient matrix A is an m × n matrix, has a unique solution, infinitely many solutions, or no solutions, without solving the system.
Theorem 4.5.1.
1. r(A) = r(A | b→) = n if and only if the system (4.5.1) has a unique solution.
2. r(A) = r(A | b→) < n if and only if the system (4.5.1) has infinitely many solutions.
3. r(A) < r(A | b→) if and only if the system (4.5.1) has no solutions.
Proof
Note that A is an m × n matrix and (A | b→) is an m × (n + 1) matrix. Use row operations to change (A | b→) to a row echelon matrix, denoted by (B | c→). Then B is a row echelon matrix of A and
r(A) = r(B) and r(A | b→) = r(B | c→).
i. If r(A) = r(A | b→) = n, then the system (4.5.1) has a unique solution. Indeed, if r(A) = r(A | b→) = n, then by Theorem 2.5.2, we have n ≤ m and r(B) = r(B | c→) = n. We write B = (b ij) m×n. This implies that b ii ≠ 0 for i = 1, 2, ⋯, n. If m > n, then all rows of the matrix (B | c→) from n + 1 to m are zero rows. Using back-substitution to solve the system BX→ = c→, we obtain only one solution.
ii. If r(A) = r(A | b→) < n, then the system (4.5.1) has infinitely many solutions. Indeed, because r(A) = r(A | b→) < n, we have r(B) = r(B | c→) < n. This implies that if r(B) < m, then all rows of the matrix (B | c→) from r(B) + 1 to m are zero rows, and the system BX→ = c→ has n − r(B) free variables. Hence, BX→ = c→ has infinitely many solutions.
iii. If r(A) < r(A | b→), then the system (4.5.1) has no solutions. In fact, if r(A) < r(A | b→), then r(B) < r(B | c→) and c j ≠ 0, where j = r(B) + 1. This implies that the jth equation of the system BX→ = c→ is
0x 1 + ⋯ + 0x n = c j,
from which we see that the system BX→ = c→ has no solutions.
Note that r(A) ≤ r(A | b→) and r(A) ≤ min{n, m} ≤ n. Hence, there are only three cases: r(A) = r(A | b→) = n, r(A) = r(A | b→) < n, and r(A) < r(A | b→), from which we see that the converses of (i)–(iii) hold.
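The trichotomy of Theorem 4.5.1 can be turned into a small classifier: compute r(A) and r(A | b→) by row reduction and compare them with the number of unknowns n. An illustrative sketch (names ours), using exact rational arithmetic:

```python
from fractions import Fraction

def rank(mat):
    """Rank via reduction to row echelon form over the rationals."""
    rows = [[Fraction(x) for x in r] for r in mat]
    m, n = len(rows), len(rows[0])
    r = 0
    for c in range(n):
        piv = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, m):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def classify(A, b):
    """Apply Theorem 4.5.1: compare r(A), r(A | b), and the number of unknowns n."""
    n = len(A[0])
    rA = rank(A)
    rAb = rank([row + [bi] for row, bi in zip(A, b)])
    if rA < rAb:
        return "no solutions"
    return "unique solution" if rA == n else "infinitely many solutions"

# Example 4.5.1 (a) and (c)
print(classify([[2, 4, 6], [4, 5, 6], [3, 1, -2]], [18, 24, 4]))   # unique solution
print(classify([[1, 2, 3], [4, 7, 6], [2, 5, 12]], [4, 17, 10]))   # no solutions
```

The classifier answers the question of the theorem without producing the solution itself, which is exactly the point of this section.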
By Theorem 4.5.1 , we obtain the following result, which gives the structure of the solutions of (4.1.4) .
Corollary 4.5.1.
The system (4.5.1) has either a unique solution, infinitely many solutions, or no solutions.
Proof
Because r(A) ≤ r(A | b→) and r(A) ≤ n, there are only three cases: r(A) = r(A | b→) = n, r(A) = r(A | b→) < n, and r(A) < r(A | b→). The conclusion then follows from Theorem 4.5.1.
Example 4.5.1.
Use Theorem 4.5.1 to determine whether the following systems have a unique solution, infinitely many solutions, or no
solutions.
{
2x 1 + 4x 2 + 6x 3 = 18
a. 4x 1 + 5x 2 + 6x 3 = 24
3x 1 + x 2 − 2x 3 = 4
{
2x 1 + 2x 2 − 2x 3 = 4
b. 3x 1 + 5x 2 + x 3 = − 8
− 4x 1 − 7x 2 − 2x 3 = 13
{
x 1 + 2x 2 + 3x 3 = 4
c. 4x 1 + 7x 2 + 6x 3 = 17
2x 1 + 5x 2 + 12x 3 = 10
{
x1 − x2 = 1
d. − 2x 1 − 5x 2 = 5
− 3x 1 + 10x 2 = − 10
Solution
a. By the reduction in Example 4.2.5 (a),
(B | c→) =
[ 1  2   3 |  9 ]
[ 0  1   2 |  4 ]
[ 0  0  −1 | −3 ]
.
Hence, r(A) = r(A | b→) = 3 = n, and by Theorem 4.5.1, the system has a unique solution.
b. By the reduction in Example 4.2.5 (b),
(B | c→) =
[ 1  1  −1 |  2 ]
[ 0  1   2 | −7 ]
[ 0  0   0 |  0 ]
.
Hence, r(A) = r(A | b→) = 2 < 3 = n, and by Theorem 4.5.1, the system has infinitely many solutions.
c. By the reduction in Example 4.2.5 (d),
(B | c→) =
[ 1   2   3 | 4 ]
[ 0  −1  −6 | 1 ]
[ 0   0   0 | 3 ]
.
Hence, r(A) = 2 < 3 = r(A | b→), and by Theorem 4.5.1, the system has no solutions.
d.
(A | b→) =
[  1  −1 |   1 ]
[ −2  −5 |   5 ]
[ −3  10 | −10 ]
R1(2)+R2, R1(3)+R3 →
[ 1  −1 |  1 ]
[ 0  −7 |  7 ]
[ 0   7 | −7 ]
R2(1)+R3 →
[ 1  −1 | 1 ]
[ 0  −7 | 7 ]
[ 0   0 | 0 ]
= (B | c→).
Hence, r(A) = r(A | b→) = 2 = n, and by Theorem 4.5.1, the system has a unique solution.
The following result shows that if n > m, then it is impossible for (4.5.1) to have a unique solution. Hence, if (4.5.1) has a unique solution, then n ≤ m; see Example 4.5.1 (a) and (d).
Corollary 4.5.2.
If n > m, then (4.5.1) either has infinitely many solutions or has no solutions.
Proof
Because n > m, by Theorem 4.5.1 (1), (4.5.1) doesn’t have a unique solution. Hence, by Corollary 4.5.1 , we see
that (4.5.1) either has infinitely many solutions or no solutions.
The ranks of A and (A | b→) are not directly involved in the conditions of Corollary 4.5.2. Hence, one can easily determine that (4.5.1) doesn't have a unique solution by comparing m and n.
Example 4.5.2.
For each of the following systems, determine whether it has a unique solution.
a.
{ x 1 − x 2 − 3x 3 − 4x 4 = 1
− x1 + x2 − x3 = − 1
b.
{ x1 + x2 + x3 = 1
x1 + x2 + x3 = 2
Solution
a. Because m = 2 < 4 = n, by Corollary 4.5.2 , the system doesn’t have a unique solution.
b. Because m = 2 < 3 = n, by Corollary 4.5.2 , the system doesn’t have a unique solution.
By Theorem 4.5.1 , we obtain the following result, which can be used to determine whether a linear system is consistent.
Corollary 4.5.3.
1. (4.5.1) is consistent if and only if r(A) = r(A | b→).
2. (4.5.1) is inconsistent if and only if r(A) < r(A | b→).
Note that a homogeneous system (4.1.8) is always consistent because it has a zero solution. Hence, for a homogeneous
system, the third case of Theorem 4.5.1 never occurs. By Theorem 4.5.1 , we obtain the following result on homogeneous
systems.
Theorem 4.5.2.
1. AX→ = 0→ has a unique solution if and only if r(A) = n.
2. AX→ = 0→ has infinitely many solutions if and only if r(A) < n.
Remark 4.5.1.
By Theorem 4.5.2 (2), if n > m, then the homogeneous system AX→ = 0→ has infinitely many solutions because, by Theorem 2.5.2, r(A) ≤ min{m, n} = m < n.
By Theorems 2.7.6 and 3.4.1, we obtain the following result for a homogeneous system AX→ = 0→, where A is an n × n matrix.
Theorem 4.5.3.
Example 4.5.3.
Determine whether each of the following systems has a unique solution or infinitely many solutions.
a.
{ x1 − 2x2 = 0
  −2x1 + 3x2 = 0
  −3x1 + 4x2 = 0.
b.
{ x1 + 3x2 − x3 = 0
  −x1 + x2 + 6x3 = 0
  −3x1 − x2 + 2x3 = 0.
c.
{ x1 − 2x2 − x3 + x4 = 0
  x2 − x3 + 2x4 = 0.
d.
{ x1 − 3x2 + x3 = 0
  −x1 + 2x2 − 2x3 = 0
  2x1 − 5x2 + 3x3 = 0
  3x1 − 7x2 + 5x3 = 0.
Solution
a. Because
A =
[  1 −2 ]
[ −2  3 ]
[ −3  4 ]
→ R1(2)+R2, R1(3)+R3 →
[ 1 −2 ]
[ 0 −1 ]
[ 0 −2 ]
→ R2(−2)+R3 →
[ 1 −2 ]
[ 0 −1 ]
[ 0  0 ],
r(A)=2. Note that m=3 and n=2. Hence, r(A)=2=n and by Theorem 4.5.2 (1), the system has a unique solution.
b. Because
|A| = |  1  3 −1 |
      | −1  1  6 |
      | −3 −1  2 | = (2 − 54 − 1) − (3 − 6 − 6) = −44 ≠ 0,
it follows from Theorem 4.5.3 that the system has a unique solution.
c. Because m=2<4=n, it follows from Theorem 4.5.2 (2) that the system has infinitely many solutions.
d. Because
A =
[  1 −3  1 ]
[ −1  2 −2 ]
[  2 −5  3 ]
[  3 −7  5 ]
→ R1(1)+R2, R1(−2)+R3, R1(−3)+R4 →
[ 1 −3  1 ]
[ 0 −1 −1 ]
[ 0  1  1 ]
[ 0  2  2 ]
→ R2(1)+R3, R2(2)+R4 →
[ 1 −3  1 ]
[ 0 −1 −1 ]
[ 0  0  0 ]
[ 0  0  0 ],
r(A)=2. Hence, n=3<4=m and r(A)=2<3=n. It follows from Theorem 4.5.2 (2) that the system has infinitely many
solutions.
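The rank tests of Theorem 4.5.2 used in Example 4.5.3 are easy to check numerically. A minimal sketch using NumPy (an assumption for illustration; the text works by hand, and `numpy.linalg.matrix_rank` computes the rank in floating point):

```python
import numpy as np

def classify_homogeneous(A):
    """Theorem 4.5.2: A x = 0 has only the zero solution iff r(A) = n."""
    n = A.shape[1]
    r = np.linalg.matrix_rank(A)
    return "unique" if r == n else "infinitely many"

# Part (a): r(A) = 2 = n, so only the zero solution.
A_a = np.array([[1, -2], [-2, 3], [-3, 4]])
# Part (d): r(A) = 2 < 3 = n, so infinitely many solutions.
A_d = np.array([[1, -3, 1], [-1, 2, -2], [2, -5, 3], [3, -7, 5]])
print(classify_homogeneous(A_a))  # unique
print(classify_homogeneous(A_d))  # infinitely many
```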
In Section 4.4, we show that if A is an n × n matrix and r(A) = n, then the system AX→ = b→ has a unique solution for every b→ ∈ ℝn; see Corollary 4.4.1. Now we generalize the result to the general system (4.5.1). Two natural questions arise.
i. How to determine whether A(ℝn) = ℝm, where A(ℝn) is the same as in (2.2.1), that is, how to determine whether AX→ = b→ is consistent for every b→ ∈ ℝm?
ii. If A(ℝn) ≠ ℝm, that is, AX→ = b→ is not consistent for every b→ ∈ ℝm, then how to find all the vectors b→ in ℝm such that AX→ = b→ is consistent, and how to find all the vectors b→ in ℝm such that AX→ = b→ is inconsistent?
By Corollary 4.5.3 , we see that for a given vector b→∈ℝm, AX→=b→ is consistent if r(A)=r(A|b→). Here we want AX→=b→ to
be consistent for every b→∈ℝm, that is, for every b→∈ℝm, r(A)=r(A|b→).
The following result shows that r(A)=r(A|b→) for every b→∈ℝm is equivalent to r(A)=m.
Theorem 4.5.4.
2. There exists a vector b→ ∈ ℝm such that the system (4.5.1) has no solutions in ℝn if and only if r(A) < m.
Proof
It is obvious that the results (i) and (ii) are equivalent, and the results (i) and (2) are equivalent. So we only prove that results (i)
and (iii) are equivalent. Assume that (i) holds. We change (A|b→) to (B|c→) by using row operations, where B is a row echelon
matrix of A. By Theorem 4.2.1 , for each c→∈ℝm, the linear system
(4.5.2)
BX→=c→
has at least one solution in ℝn. If we assume r(A) < m, then r(B) = r(A) < m, so the rows from r(A) + 1 to m of B are zero rows. Let c→0 ∈ ℝm be a vector whose (r(A) + 1)th component is 1 and whose other components are zero. Then r(B | c→0) = r(B) + 1 and r(B) < r(B | c→0). By Theorem 4.5.1 (3), the system
BX→=c0→
has no solutions in ℝn. Because the system (4.5.1) is equivalent to the system (4.5.2) , it follows that the system
(4.5.1) has no solutions in ℝn, which contradicts the fact that system (4.5.1) is consistent for each b→∈ℝm. Hence,
r(A)=m and (iii) holds.
Conversely, we assume r(A)=m. We use row operations to change A to a row echelon matrix B. Then r(B)=r(A)=m. For each
c→∈ℝm,
m=r(B)≤r(B|c→)≤min{m, n+1}=m.
Hence, r(B)=r(B|c→). By Corollary 4.5.3 , (4.5.2) is consistent. By Theorem 4.2.1 , the system (4.5.1) is consistent
for each b→∈ℝm and (i) holds.
Theorem 4.5.4 provides a criterion to justify whether a linear system is consistent, that is, whether a linear system has at least
one solution.
The following result gives criteria to determine whether a linear system has infinitely many solutions or a unique solution.
Theorem 4.5.5.
1. The system (4.5.1) has infinitely many solutions for each b→∈ℝm if and only if r(A)=m<n.
2. The system (4.5.1) has a unique solution in ℝn for each b→ ∈ ℝm if and only if m = n and r(A) = n, if and only if m = n and |A| ≠ 0.
Proof
1. Assume that the system (4.5.1) has infinitely many solutions for each b→∈ℝm. By Theorem 4.5.4 (1), r(A)=m. If
m=n, then A is an n×n matrix and is invertible. By Corollary 4.4.1 , the system (4.5.1) with m=n has a unique
solution for each b→∈ℝm, which contradicts the fact that the system (4.5.1) has infinitely many solutions for each
b→∈ℝm. Hence, m<n.
Conversely, assume that r(A)=m and m<n. Let b→∈ℝm. We change (A|b→) to (B|c→) by using row operations, where B
is a row echelon matrix of A. Because r(A)=m, by Theorem 4.2.1 ,
(4.5.3)
BX→=c→
has at least one solution for every c→∈ℝm. Because r(A)=m, we have r(B)=m. Because m<n, v(B)=n−m≥1, so the
system (4.5.3) has at least one free variable. Hence, the system (4.5.3) has infinitely many solutions. It follows
from Theorem 4.2.1 that the system (4.5.1) has infinitely many solutions.
2. If m = n and r(A) = n, then by Corollary 4.4.1, the system (4.5.1) has a unique solution in ℝn for each b→ ∈ ℝm = ℝn.
Conversely, assume that the system (4.5.1) has a unique solution in ℝn for each b→∈ℝm. By Theorem 4.5.4 (1),
r(A)=m and m≤n. Because the system (4.5.1) has a unique solution in ℝn for each b→∈ℝm, by the above result (1),
m=n; otherwise, if m<n, then the system (4.5.1) would have infinitely many solutions in ℝn for each b→∈ℝm, a
contradiction.
Corollary 4.5.2 shows that if (4.5.1) has a unique solution, then n ≤ m. But Theorem 4.5.5 (2) shows that if the system (4.5.1) has a unique solution in ℝn for each b→ ∈ ℝm, then A must be a square matrix. In other words, if A is an m × n matrix with m ≠ n, then there exists a vector b→ ∈ ℝm such that AX→ = b→ either has infinitely many solutions or has no solutions.
Example 4.5.4.
{ x1 − x2 + x3 = b1
  3x1 − 2x2 − x3 = b2
Solution
Because
A =
[ 1 −1  1 ]
[ 3 −2 −1 ]
→ R1(−3)+R2 →
[ 1 −1  1 ]
[ 0  1 −4 ],
1. Because r(A)=2=m, by Theorem 4.5.4 (1), the system is consistent for all b1, b2∈ℝ.
2. Because r(A)=2=m and m=2<3=n, by Theorem 4.5.5 (1), the system has infinitely many solutions for all b1, b2∈ℝ.
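Both rank conditions used in this solution can be sketched in NumPy (an assumption for illustration, not the book's method); `matrix_rank` confirms r(A) = m = 2 < 3 = n:

```python
import numpy as np

# Coefficient matrix of Example 4.5.4.
A = np.array([[1, -1, 1],
              [3, -2, -1]])
m, n = A.shape
r = int(np.linalg.matrix_rank(A))

# Theorem 4.5.4 (1): consistent for every b iff r(A) = m.
# Theorem 4.5.5 (1): infinitely many solutions for every b iff r(A) = m < n.
print(r == m, r == m and m < n)  # True True
```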
Example 4.5.5.
For each of the following systems, determine whether the system is consistent for all b1, b2, b3∈ℝ. If not, find conditions on
b1, b2, b3 such that the system is consistent.
a.
{ x1 − x2 + 2x3 + x4 = b1
  2x1 + 4x2 − 4x3 − x4 = b2
  3x1 + 3x2 − 2x3 = b3
b.
{ x1 − x2 + 2x3 = b1
  −3x1 + 2x2 + x3 = b2
  4x1 − 3x2 + x3 = b3.
c.
{ x1 + x2 + 2x3 = b1
  x1 + x3 = b2
  3x1 + 4x2 + 7x3 = b3
Solution
a. (4.5.4)
(A | b→) =
[ 1 −1  2  1 | b1 ]
[ 2  4 −4 −1 | b2 ]
[ 3  3 −2  0 | b3 ]
→ R1(−2)+R2, R1(−3)+R3 →
[ 1 −1  2  1 | b1 ]
[ 0  6 −8 −3 | b2 − 2b1 ]
[ 0  6 −8 −3 | b3 − 3b1 ]
→ R2(−1)+R3 →
[ 1 −1  2  1 | b1 ]
[ 0  6 −8 −3 | b2 − 2b1 ]
[ 0  0  0  0 | b3 − b2 − b1 ].
From the left side of the last matrix in (4.5.4) , we have r(A)=2<3=m. By Theorem 4.5.4 (2), the system is not
consistent for all b1, b2, b3∈ℝ.
To choose b1, b2, b3 ∈ ℝ such that the system with these b1, b2, b3 is consistent, by Corollary 4.5.3, we need r(A) = r(A | b→). By (4.5.4), we see that if b3 − b2 − b1 = 0, then r(A) = r(A | b→). Hence, the condition on b1, b2, b3 is b3 − b2 − b1 = 0, that is, when b3 = b1 + b2, the system is consistent.
b. (4.5.5)
(A|b→)=
[  1 −1 2 | b1 ]
[ −3  2 1 | b2 ]
[  4 −3 1 | b3 ]
→ R1(3)+R2, R1(−4)+R3 →
[ 1 −1  2 | b1 ]
[ 0 −1  7 | b2 + 3b1 ]
[ 0  1 −7 | b3 − 4b1 ]
→ R2(1)+R3 →
[ 1 −1 2 | b1 ]
[ 0 −1 7 | b2 + 3b1 ]
[ 0  0 0 | b3 + b2 − b1 ].
From the left side of the last matrix in (4.5.5), we have r(A) = 2 < 3 = m. It follows from Theorem 4.5.4 (2) that the system is not consistent for all b1, b2, b3 ∈ ℝ. By (4.5.5), we see that if b3 + b2 − b1 = 0, then r(A) = 2 = r(A | b→). Hence, when b3 + b2 − b1 = 0, that is, b3 = b1 − b2, the system is consistent.
c.
(A | b→) =
[ 1 1 2 | b1 ]
[ 1 0 1 | b2 ]
[ 3 4 7 | b3 ]
→ R1(−1)+R2, R1(−3)+R3 →
[ 1  1  2 | b1 ]
[ 0 −1 −1 | b2 − b1 ]
[ 0  1  1 | b3 − 3b1 ]
→ R2(1)+R3, R2(−1) →
[ 1 1 2 | b1 ]
[ 0 1 1 | b1 − b2 ]
[ 0 0 0 | −4b1 + b2 + b3 ]
= (B | c→).
Hence r(A) = 2 < 3 = m, so the system is not consistent for all b1, b2, b3 ∈ ℝ; it is consistent exactly when −4b1 + b2 + b3 = 0, that is, when b3 = 4b1 − b2.
Exercises
1. For each of the following systems, determine whether it has a unique solution, infinitely many solutions, or no solutions.
a.
{ x1 − 2x2 + 3x3 = 6
  2x1 + x2 + 4x3 = 5
  −3x1 + x2 − 2x3 = 3
b.
{ x1 + x2 − 2x3 = 3
  2x1 + 3x2 − x3 = −4
  −2x1 − 3x2 + x3 = 4
c.
{ x1 + 2x2 + 3x3 = 4
  4x1 + 7x2 + 6x3 = 17
  2x1 + 5x2 + 12x3 = 7
d.
{ x1 − 2x2 − x3 + 2x4 = 0
  −x2 − 2x3 + x4 = 0
e.
{ x1 − x2 + 3x3 = 2
  −2x1 + 2x2 − 6x3 = 5
  2x1 − 2x2 + 6x3 = 4
f.
{ x1 − x2 − 3x3 − 4x4 = 1
  −x1 + x2 − x3 = −1
a.
{ 2x1 − 3x2 − x3 − 5x4 = −1
  −3x1 + 6x2 − 2x3 + x4 = 3
  x1 + x4 = −2
b.
{ x1 − 3x2 + 2x3 = −1
  x1 + 4x2 − x3 = 2
3. For each of the following homogeneous systems, determine whether it has a unique solution or infinitely many solutions.
a.
{ x1 − 2x2 − x3 = 0
  −x1 + 3x2 + 2x3 = 0
  −x1 + x2 = 0
  x2 + x3 = 0.
b.
{ x1 + x2 − 2x3 = 0
  2x1 − x2 + 4x3 = 0
  −x1 − 2x2 − 3x3 = 0.
c.
{ x1 − x2 + x3 = 0
  2x2 − 4x3 + x3 = 0.
d.
{ x1 − x2 + 2x3 = 0
  −2x1 − 2x2 + x3 = 0
  −x1 − 3x2 + 3x3 = 0
  3x1 + x2 + x3 = 0.
5. Determine if the following system is consistent for all b1, b2, b3∈ℝ.
{ x1 − x2 + 2x3 + x4 = b1
  2x1 + 4x2 − 4x3 = b2
  −3x1 + 4x3 − x4 = b3
6. Find conditions on b1, b2, b3 such that the following system is consistent.
{ x1 − x2 + 2x3 = b1
  2x1 + 4x2 − 4x3 = b2
  −3x1 − 2x3 = b3
7. Determine whether the following system has infinitely many solutions for all b1, b2∈ℝ.
{ x1 − 2x2 + 2x3 = b1
  −3x1 + 3x2 + x3 = b2
4.6 Linear combination, spanning space and consistency
and r(A | b→), without solving the system AX→ = b→. Combining Theorem 4.1.1 and Corollary 4.5.3, we obtain the following criteria.
Theorem 4.6.1.
Let S := {a→1, ⋯, a→n} be the set of the column vectors of A = (a→1, ⋯, a→n) given in (4.1.5), and let b→ = (b1, b2, ⋯, bm)T. Then the following assertions are equivalent.
1. AX→ = b→ is consistent.
2. b→ is a linear combination of S.
3. b→ ∈ span S.
4. r(A) = r(A | b→).
By Theorem 4.6.1 (4), we see that if b→ = 0→, then r(A) = r(A | 0→) = r(A | b→). It follows that 0→ is a linear combination of S.
Example 4.6.1.
Let S = {a→1, a→2, a→3} and A = (a→1, a→2, a→3), where
a→1 = (1, 2, 3)T, a→2 = (−2, 1, −1)T, a→3 = (3, 4, 7)T, b→ = (6, 5, 11)T.
1. Determine whether AX→ = b→ is consistent, where X→ = (x1, x2, x3)T.
2. Determine whether the vector b→ is a linear combination of S.
3. Determine whether the vector b→ ∈ span S.
Solution
(A | b→) =
[ 1 −2 3 | 6 ]
[ 2  1 4 | 5 ]
[ 3 −1 7 | 11 ]
→ R1(−2)+R2, R1(−3)+R3 →
[ 1 −2  3 | 6 ]
[ 0  5 −2 | −7 ]
[ 0  5 −2 | −7 ]
→ R2(−1)+R3 →
[ 1 −2  3 | 6 ]
[ 0  5 −2 | −7 ]
[ 0  0  0 | 0 ].
Hence, r(A) = r(A | b→) = 2. By Theorem 4.6.1, we have the following conclusions:
1. The system AX→ = b→ is consistent.
2. The vector b→ is a linear combination of S.
3. The vector b→ ∈ span S.
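The rank criterion of Theorem 4.6.1 can be packaged as a small NumPy helper (an assumption for illustration; `in_span` compares r(A) with r(A | b→)):

```python
import numpy as np

def in_span(A, b):
    """b is in the span of the columns of A iff r(A) = r(A | b) (Theorem 4.6.1)."""
    return bool(np.linalg.matrix_rank(A) ==
                np.linalg.matrix_rank(np.column_stack([A, b])))

A = np.array([[1, -2, 3],
              [2, 1, 4],
              [3, -1, 7]])
print(in_span(A, np.array([6, 5, 11])))  # True  (this example)
print(in_span(A, np.array([6, 5, 10])))  # False (the vector of Example 4.6.2)
```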
Example 4.6.2.
Let S and A be the same as in Example 4.6.1 and let b→1 = (6, 5, 10)T.
1. Determine whether AX→ = b→1 is consistent, where X→ = (x1, x2, x3)T.
2. Determine whether the vector b→1 is a linear combination of S.
3. Determine whether the vector b→1 ∈ span S.
Solution
We find r(A) and r(A | b→1).
(A | b→1) =
[ 1 −2 3 | 6 ]
[ 2  1 4 | 5 ]
[ 3 −1 7 | 10 ]
→ R1(−2)+R2, R1(−3)+R3 →
[ 1 −2  3 | 6 ]
[ 0  5 −2 | −7 ]
[ 0  5 −2 | −8 ]
→ R2(−1)+R3 →
[ 1 −2  3 | 6 ]
[ 0  5 −2 | −7 ]
[ 0  0  0 | −1 ].
Hence, r(A) = 2 and r(A | b→1) = 3. Because r(A) < r(A | b→1), by Theorem 4.6.1, we have the following conclusions.
1. The system AX→ = b→1 is inconsistent.
2. The vector b→1 is not a linear combination of S.
3. The vector b→1 ∉ span S.
Example 4.6.3.
Let S = {(6, 3)T, (−4, −2)T}.
1. Determine whether b→ = (2, 1)T is in span S.
2. Determine whether b→ = (1, 2)T is in span S.
Solution
Let b→ = (b1, b2)T.
(A | b→) =
[ 6 −4 | b1 ]
[ 3 −2 | b2 ]
→ R1(1/6) →
[ 1 −2/3 | b1/6 ]
[ 3 −2   | b2 ]
→ R1(−3)+R2 →
[ 1 −2/3 | b1/6 ]
[ 0  0   | b2 − b1/2 ].
1. If b2 − b1/2 = 0, then r(A) = r(A | b→) and by Theorem 4.6.1, b→ ∈ span S. Hence, if b→ = (b1, b2)T = (2, 1)T, then b2 − b1/2 = 1 − 2/2 = 0 and b→ = (2, 1)T ∈ span S.
2. If b2 − b1/2 ≠ 0, then r(A) < r(A | b→) and by Theorem 4.6.1, b→ ∉ span S. Hence, if b→ = (b1, b2)T = (1, 2)T, then b2 − b1/2 = 2 − 1/2 = 3/2 ≠ 0 and b→ = (1, 2)T ∉ span S.
Example 4.6.4.
Let a→1 = (1, 1, 3)T, a→2 = (1, 0, 4)T, a→3 = (2, 1, 7)T, and b→ = (b1, b2, b3)T.
1. Find conditions on b1, b2, b3 such that b→ ∈ span {a→1, a→2, a→3}.
2. Find conditions on b1, b2, b3 such that b→ ∉ span {a→1, a→2, a→3}.
3. Find a vector b→ = (b1, b2, b3)T such that b→ ∉ span {a→1, a→2, a→3}.
Solution
Let b→ = (b1, b2, b3)T and A = (a→1 a→2 a→3).
(A | b→) =
[ 1 1 2 | b1 ]
[ 1 0 1 | b2 ]
[ 3 4 7 | b3 ]
→ R1(−1)+R2, R1(−3)+R3 →
[ 1  1  2 | b1 ]
[ 0 −1 −1 | b2 − b1 ]
[ 0  1  1 | b3 − 3b1 ]
→ R2(1)+R3, R2(−1) →
[ 1 1 2 | b1 ]
[ 0 1 1 | b1 − b2 ]
[ 0 0 0 | −4b1 + b2 + b3 ]
= (B | c→).
1. If −4b1 + b2 + b3 = 0, that is, b3 = 4b1 − b2, then r(A) = r(A | b→). By Theorem 4.6.1, b→ ∈ span {a→1, a→2, a→3}.
2. If b3 ≠ 4b1 − b2, then r(A) < r(A | b→). By Theorem 4.6.1, b→ ∉ span {a→1, a→2, a→3}.
3. Let b1 = b2 = 0 and b3 = 1. Then b3 ≠ 4b1 − b2. Hence b→ = (0, 0, 1)T ∉ span {a→1, a→2, a→3}.
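The consistency condition −4b1 + b2 + b3 = 0 can also be produced symbolically. A sketch with SymPy (an assumption for illustration; the row operations mirror the ones in the text):

```python
import sympy as sp

b1, b2, b3 = sp.symbols('b1 b2 b3')
# Augmented matrix of Example 4.6.4.
M = sp.Matrix([[1, 1, 2, b1],
               [1, 0, 1, b2],
               [3, 4, 7, b3]])
M[1, :] = M.row(1) - M.row(0)      # R1(-1)+R2
M[2, :] = M.row(2) - 3 * M.row(0)  # R1(-3)+R3
M[2, :] = M.row(2) + M.row(1)      # R2(1)+R3
# The last augmented entry is the consistency condition.
print(sp.expand(M[2, 3]))  # -4*b1 + b2 + b3
```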
Theorem 4.6.1 provides a criterion to determine whether a given vector b→ belongs to span S. In the following, we study how to determine whether every vector b→ in ℝm belongs to span S, that is, to determine whether span S = ℝm. By (2.2.1), we have span S = A(ℝn). This, together with Theorem 4.5.4 (1), implies that span S = A(ℝn) = ℝm if and only if the system AX→ = b→ is consistent for every b→ ∈ ℝm. This leads us to obtain the following criterion to determine whether span S = ℝm by using ranks.
Theorem 4.6.2.
1. span S = ℝm.
2. The system AX→ = b→ is consistent for every vector b→ in ℝm.
3. r(A) = m.
Proof
By Theorems 4.5.4 and 4.6.1, span S = ℝm if and only if every vector b→ in ℝm belongs to span S, if and only if the system AX→ = b→ is consistent for every vector b→ in ℝm, if and only if r(A) = m.
Example 4.6.5.
Determine whether span S = ℝ3, where
S = {(1, −4, 4)T, (−2, 5, −5)T, (1, 2, 3)T}.
Solution
A =
[  1 −2 1 ]
[ −4  5 2 ]
[  4 −5 3 ]
→ R1(4)+R2, R1(−4)+R3 →
[ 1 −2  1 ]
[ 0 −3  6 ]
[ 0  3 −1 ]
→ R2(1)+R3 →
[ 1 −2 1 ]
[ 0 −3 6 ]
[ 0  0 5 ].
Hence, r(A) = 3 = m. By Theorem 4.6.2, span S = ℝ3.
Example 4.6.6.
Determine whether span S = ℝ3, where
S = {(1, 0, 1)T, (0, 1, −1)T, (−2, 0, −2)T, (0, 3, −3)T}.
Solution
A =
[ 1  0 −2  0 ]
[ 0  1  0  3 ]
[ 1 −1 −2 −3 ]
→ R1(−1)+R3 →
[ 1  0 −2  0 ]
[ 0  1  0  3 ]
[ 0 −1  0 −3 ]
→ R2(1)+R3 →
[ 1 0 −2 0 ]
[ 0 1  0 3 ]
[ 0 0  0 0 ].
Hence, r(A) = 2 < 3 = m. By Theorem 4.6.2, span S ≠ ℝ3.
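Theorem 4.6.2 reduces the question "does S span ℝm?" to one rank computation. A NumPy sketch (an assumption for illustration) applied to Examples 4.6.5 and 4.6.6:

```python
import numpy as np

def spans_Rm(vectors):
    """span S = R^m iff r(A) = m, where the vectors of S are the columns of A
    (Theorem 4.6.2)."""
    A = np.column_stack(vectors)
    return bool(np.linalg.matrix_rank(A) == A.shape[0])

# Example 4.6.5: r(A) = 3 = m, so S spans R^3.
print(spans_Rm([[1, -4, 4], [-2, 5, -5], [1, 2, 3]]))               # True
# Example 4.6.6: r(A) = 2 < 3 = m, so S does not span R^3.
print(spans_Rm([[1, 0, 1], [0, 1, -1], [-2, 0, -2], [0, 3, -3]]))   # False
```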
Exercises
1. Let S = {a→1, a→2, a→3, a→4} and A = (a→1, a→2, a→3, a→4), where
a→1 = (1, 2, 3)T, a→2 = (−2, 1, −1)T, a→3 = (3, 4, 7)T, a→4 = (−1, −4, −5)T,
b→ = (2, 2, 4)T.
1. Determine whether AX→ = b→ is consistent, where X→ = (x1, x2, x3, x4)T.
2. Determine whether the vector b→ is a linear combination of S.
3. Determine whether the vector b→ ∈ span S.
2. Let a→1 = (1, −2, 2)T, a→2 = (−1, 0, −10)T, a→3 = (1, −1, 6)T, and b→ = (b1, b2, b3)T.
1. Find conditions on b1, b2, b3 such that b→ ∈ span {a→1, a→2, a→3}.
2. Find conditions on b1, b2, b3 such that b→ ∉ span {a→1, a→2, a→3}.
3. Find a vector b→ = (b1, b2, b3)T such that b→ ∉ span {a→1, a→2, a→3}.
3. Let S = {(2, 1)T, (−4, −2)T}.
i. S = {(−1, 0)T, (1, 0)T, (−2, 1)T, (3, 5)T}.
ii. S = {(−1, 1)T, (2, −2)T, (−5, 5)T}.
iii. S = {(0, 1)T, (2, 1)T, (−2, 3)T, (4, 1)T}.
S = {(1, 2, 1)T, (2, 5, 0)T, (3, 4, 5)T}.
S = {(1, 0, 4, 1)T, (1, 0, 0, −1)T, (1, 2, 3, 4)T}.
Chapter 5 Linear transformations
Definition 5.1.1.
(5.1.1)
T(X→) = AX→ =
[ a11 a12 ⋯ a1n ] [ x1 ]
[ a21 a22 ⋯ a2n ] [ x2 ]
[  ⋮   ⋮      ⋮  ] [ ⋮ ]
[ am1 am2 ⋯ amn ] [ xn ],
where X→ = (x1, x2, ⋯, xn)T ∈ ℝn. The vector T(X→) is said to be the image of X→ under T and X→ is called the inverse image of T(X→). The space ℝn is the domain of T and ℝm is the codomain of T. The matrix A is said to be the standard matrix for T and T is called the multiplication by A. A linear transformation from ℝn to ℝn is said to be a linear operator.
If T is a linear transformation from Rn to Rm , then we say that T maps Rn into Rm , which is denoted by the
symbol T : Rn → Rm .
When m = n = 1, the linear transformation is a function T : R → R defined by T (x) = a11 x. Hence, linear
transformations are generalizations of such functions to higher dimensional spaces.
By Definition 5.1.1 , we see that every matrix determines a linear transformation. Hence, for every linear
transformation, there is a unique matrix corresponding to it. Sometimes, we write T as TA in order to emphasize
that the linear transformation is determined by the standard matrix A.
Example 5.1.1.
A1 =
[ 1 2 3 ]
[ 1 0 1 ],
A2 =
[  1  0 1 ]
[ −1  1 2 ]
[  3 −2 0 ],
Solution
Because the size of A1 is 2 × 3, the domain and codomain of TA1 are ℝ3 and ℝ2, respectively.
TA1((1, 1, 1)T) = A1(1, 1, 1)T = (1 + 2 + 3, 1 + 0 + 1)T = (6, 2)T.
TA2((−1, 0, 1)T) = A2(−1, 0, 1)T = (−1 + 0 + 1, 1 + 0 + 2, −3 − 0 + 0)T = (0, 3, −3)T.
Example 5.1.2.
Let A =
[ 1 4 7 ]
[ 2 5 8 ]
[ 3 6 9 ].
Find T(e→1), T(e→2), T(e→3), where e→1, e→2, e→3 are the standard vectors in ℝ3.
Solution
T(e→1) = A(1, 0, 0)T = (1, 2, 3)T.
T(e→2) = A(0, 1, 0)T = (4, 5, 6)T.
T(e→3) = A(0, 0, 1)T = (7, 8, 9)T.
By Example 5.1.2 , we see that the column vectors of the standard matrix A are the images of the standard
vectors in R3 under T. In fact, the result holds for any linear transformations.
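This fact — the j-th column of the standard matrix is the image T(e→j) — can be checked directly; a small NumPy sketch (an assumption for illustration) using the matrix of Example 5.1.2:

```python
import numpy as np

A = np.array([[1, 4, 7],
              [2, 5, 8],
              [3, 6, 9]])
T = lambda x: A @ x  # the transformation "multiplication by A"

# T(e_j) equals the j-th column of A (Theorem 5.1.1).
for j in range(3):
    e = np.zeros(3)
    e[j] = 1
    assert np.array_equal(T(e), A[:, j])
print("columns of A are the images of the standard vectors")
```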
Theorem 5.1.1.
Let T : ℝn → ℝm be a linear transformation and e→1, ⋯, e→n be the standard vectors in ℝn given in (1.1.1). Then the standard matrix A of T is given by
A = (T(e→1), T(e→2), ⋯, T(e→n)).
(5.1.2)
T(X→) = (a11x1 + a12x2 + ⋯ + a1nxn, ⋯, am1x1 + am2x2 + ⋯ + amnxn)T,
or
(5.1.3)
Example 5.1.3.
a. T(x, y, z) = ((1/2)x, y, 4z);
b. T(x, y, z) = ((1/2)x + (1/2)z, (1/3)y + (1/3)z).
Solution
a.
T((x, y, z)T) =
[ 1/2 0 0 ]
[ 0   1 0 ]
[ 0   0 4 ]
(x, y, z)T;
b.
T((x, y, z)T) =
[ 1/2 0   1/2 ]
[ 0   1/3 1/3 ]
(x, y, z)T.
Definition 5.1.2.
A linear operator T : ℝn → ℝn satisfying
(5.1.4)
T(X→) = kX→ for each X→ ∈ ℝn
is called a contraction with factor k if 0 ≤ k < 1 and a dilation with factor k if k > 1.
By Definition 5.1.2 , we see that the standard matrix for a contraction or dilation with factor k is a diagonal
matrix kIn .
Proposition 5.1.1.
Let T : ℝn → ℝm be a linear transformation and let X→, Y→ ∈ ℝn and α, β ∈ ℝ. Then
(5.1.5)
T(αX→ + βY→) = αT(X→) + βT(Y→).
Proof
Let A be the m × n standard matrix of T. Then by Definition 5.1.1, we have T(X→) = AX→ and T(Y→) = AY→. By Theorem 2.2.1, we have
T(αX→ + βY→) = A(αX→ + βY→) = α(AX→) + β(AY→) = αT(X→) + βT(Y→).
Example 5.1.4.
Let X→ = (1, 1, −1)T, Y→ = (−1, 2, 1)T, and
A =
[ 0  1  1 ]
[ 1  2 −1 ]
[ 1 −1  3 ].
Compute TA(3X→ − 2Y→).
Solution
By Proposition 5.1.1, we have
TA(3X→ − 2Y→) = 3TA(X→) − 2TA(Y→)
= 3A(1, 1, −1)T − 2A(−1, 2, 1)T
= 3(0, 4, −3)T − 2(3, 2, 0)T = (0, 12, −9)T − (6, 4, 0)T = (−6, 8, −9)T.
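The linearity identity (5.1.5) and the computation above can be replayed in NumPy (an assumption for illustration; `@` is matrix-vector multiplication):

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 2, -1],
              [1, -1, 3]])
X = np.array([1, 1, -1])
Y = np.array([-1, 2, 1])

# Proposition 5.1.1: T(aX + bY) = a T(X) + b T(Y).
lhs = A @ (3 * X - 2 * Y)
rhs = 3 * (A @ X) - 2 * (A @ Y)
print(lhs)  # [-6  8 -9]
assert np.array_equal(lhs, rhs)
```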
Definition 5.1.3.
1. If S, T : ℝn → ℝm are linear transformations, then we define the following operations:
i. (Addition) (T + S)(X→) = T(X→) + S(X→).
ii. (Subtraction) (T − S)(X→) = T(X→) − S(X→).
iii. (Scalar multiplication) (αT)(X→) = α(T(X→)) for α ∈ ℝ.
2. If T : ℝn → ℝm and S : ℝm → ℝl are linear transformations, then we define the following composition of T and S:
(5.1.6)
(ST)(X→) = S(T(X→)).
By Definition 5.1.1 and the above operations, we see that if A and B are the standard matrices for TA, TB : ℝn → ℝm, respectively, then A + B, A − B, αA are the standard matrices for TA + TB, TA − TB, αTA, respectively:
(i)′ (TA + TB)(X→) = (A + B)X→.
(ii)′ (TA − TB)(X→) = (A − B)X→.
(iii)′ (αTA)(X→) = (αA)X→ for α ∈ ℝ.
Moreover, if TA : ℝn → ℝm and TB : ℝm → ℝl, then
(TBTA)(X→) = (BA)X→.
Note that the standard matrix for TB TA is BA instead of AB and the composition linear transformation TB TA is
defined under the condition that the domain of TB must be the same as the codomain of TA .
Example 5.1.5.
Solution
(Method 1). We write T and S as follows.
T((x1, x2)T) =
[  3 −2 ]
[ −1 −1 ]
[  2  3 ]
(x1, x2)T and S((x1, x2)T) =
[  1  1 ]
[  1 −1 ]
[ −2  4 ]
(x1, x2)T.
Then
(2T − 3S)((x1, x2)T) = 2T((x1, x2)T) − 3S((x1, x2)T)
= (2 [ 3 −2; −1 −1; 2 3 ] − 3 [ 1 1; 1 −1; −2 4 ]) (x1, x2)T
=
[  3 −7 ]
[ −5  1 ]
[ 10 −6 ]
(x1, x2)T = (3x1 − 7x2, −5x1 + x2, 10x1 − 6x2)T.
= (6x1 − 4x2, −2x1 − 2x2, 4x1 + 6x2) − (3x1 + 3x2, 3x1 − 3x2, −6x1 + 12x2)
= (3x1 − 7x2, −5x1 + x2, 10x1 − 6x2).
Example 5.1.6.
and
Solution
Because
TA((x1, x2)T) =
[ 1  1 ]
[ 2 −3 ]
[ 3 −2 ]
(x1, x2)T and TB((x1, x2, x3)T) =
[ 1 −1  1 ]
[ 1  0 −2 ]
(x1, x2, x3)T,
we have
(TBTA)((x1, x2)T) = [ 1 −1 1; 1 0 −2 ] [ 1 1; 2 −3; 3 −2 ] (x1, x2)T =
[  2 2 ]
[ −5 5 ]
(x1, x2)T.
Exercises
1. For each of the following matrices,
A1 =
[  2 −2 4 −1 ]
[ −1  2 3  0 ]
[  0  1 2 −1 ],
A2 =
[ −2 2  1 ]
[  1 1 −3 ]
[  1 0 −2 ],
A3 =
[  1 −2  0 ]
[ −2  3  1 ]
[  3 −2  3 ]
[  1  2 −1 ],
A4 =
[ −1  1  1 ]
[  2  0 −2 ]
[ −1 −1  1 ]
[  0  0  1 ].
2. Let A = [ 1 2; −1 3 ]. Find TA((1, 0)T) and TA((0, 1)T).
3. Let A = [ −1 2 1; 3 0 6 ]. Find TA(e→1), TA(e→2), TA(e→3), where e→1, e→2, e→3 are the standard vectors in ℝ3.
4. For each of the following linear transformations, express it in (5.1.1) .
a. T (x1 , x2 ) = (x1 − x2 , 2x1 + x2 , 5x1 − 3x2 ).
b. T (x1 , x2 , x3 , x4 ) = (6x1 − x2 − x3 + x4 , x2 + x3 − 2x4 , 3x3 + x4 , 5x4 ).
c. T (x1 , x2 , x3 ) = (4x1 − x2 + 2x3 , 3x1 − x3 , 2x1 + x2 + 5x3 ).
d. T(x1, x2, x3) = (8x1 − x2 + x3, x2 − x3, 2x1 + x2, x2 + 3x3).
5. Let X→ = (2, −2, −2, 1)T, Y→ = (1, −2, 2, 3)T and
A =
[  1  0 1 0 ]
[  1 −2 1 1 ]
[ −1  1 2 0 ].
Compute TA(X→ − 3Y→).
6. Let S, T : ℝ3 → ℝ2 be linear transformations such that T(1, 2, 3) = (1, −4) and S(1, 2, 3) = (4, 9). Compute (5T + 2S)(1, 2, 3).
7. Let S, T : ℝ3 → ℝ4 be linear transformations such that T(1, 2, 3) = (1, 1, 0, −1) and
5.2 Onto, one to one, and inverse transformations
Definition 5.2.1.
(5.2.1)
T(ℝn) = {T(X→) ∈ ℝm : X→ ∈ ℝn}
is called the range of T.
By Definitions 5.1.1 and 5.2.1 and Theorem 2.2.2 , we obtain the following result, which
establishes the relations among the range of T, A(ℝ n) given in (2.2.1) and the spanning space given in
(1.4.8) .
Theorem 5.2.1.
Let S = {a→1, a→2, ⋯, a→n} be the set of the column vectors of A given in (2.1.2). Then
(5.2.2)
By Theorems 4.6.1 and 5.2.1, we obtain the following result, which uses the ranks r(A) and r(A | b→) to determine whether b→ ∈ T(ℝn).
Theorem 5.2.2.
Let S be the same as in Theorem 5.2.1 . Then the following assertions are equivalent.
i. b→ ∈ T(ℝn);
ii. AX→ = b→ is consistent;
iii. b→ is a linear combination of S;
iv. b→ ∈ span S;
v. r(A) = r(A | b→).
Example 5.2.1.
Let
→ →
T(x 1, x 2, x 3) = (x 1 − x 2, − 2x 1 + 2x 2 + x 3, − x 1 + x 2 + x 3) and b = (, , 0, − 1). Determine whether b ∈ T(ℝ 3).
Solution
(A | b→) =
[  1 −1 0 |  1 ]
[ −2  2 1 |  0 ]
[ −1  1 1 | −1 ]
→ R1(2)+R2, R1(1)+R3 →
[ 1 −1 0 | 1 ]
[ 0  0 1 | 2 ]
[ 0  0 1 | 0 ]
→ R2(−1)+R3 →
[ 1 −1 0 |  1 ]
[ 0  0 1 |  2 ]
[ 0  0 0 | −2 ].
This implies r(A) = 2 < 3 = r(A | b→). By Theorem 5.2.2, b→ ∉ T(ℝ3).
Definition 5.2.2.
By Theorem 5.2.1 , T is onto if and only if span S = ℝ m. By Theorem 4.6.2 , we obtain the following
result, which uses the rank of A to determine whether T is onto.
Theorem 5.2.3.
Let T : ℝ n → ℝ m be a linear transformation with standard matrix A. Then the following assertions are
equivalent.
i. T is onto.
ii. span S = ℝ m.
iii. The system T(X→) = b→ is consistent for each b→ ∈ ℝm.
iv. r(A) = m.
By Theorem 2.5.2 , r(A) ≤ min{m, n}. Hence, if T : ℝ n → ℝ m is onto, then by Theorem 5.2.3 ,
m = r(A) ≤ min{m, n} ≤ n.
Hence, the necessary condition for T to be onto is m ≤ n and if m > n, then T is not onto.
Example 5.2.2.
a. T(x 1, x 2, x 3, x 4) = (x 1 + 2x 2 − 2x 3 + x 4, 3x 2 + 4x 3, x 1 + 2x 3 + x 4).
b. T(x1, x2, x3) = (x1 + x2, −x1 + 2x2 + 4x3, 2x1 − x2 + 3x3, 5x1 + 3x2 + 6x3).
Solution
a.
A =
[ 1 2 −2 1 ]
[ 0 3  4 0 ]
[ 1 0  2 1 ]
→ R1(−1)+R3 →
[ 1  2 −2 1 ]
[ 0  3  4 0 ]
[ 0 −2  4 0 ]
→ R3(1)+R2 →
[ 1  2 −2 1 ]
[ 0  1  8 0 ]
[ 0 −2  4 0 ]
→ R2(2)+R3 →
[ 1 2 −2 1 ]
[ 0 1  8 0 ]
[ 0 0 20 0 ].
Hence, r(A) = 3 = m. By Theorem 5.2.3 , T is onto.
b. Because the standard matrix A of T is a 4 × 3 matrix, by Theorem 5.2.3 , T is not onto because
r(A) ≤ 3 = min{4, 3} < 4 = m.
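Theorem 5.2.3's test "r(A) = m" is one line in NumPy (an assumption for illustration); applied to the two transformations of Example 5.2.2:

```python
import numpy as np

def is_onto(A):
    """T_A is onto iff r(A) = m, the number of rows (Theorem 5.2.3)."""
    return bool(np.linalg.matrix_rank(A) == A.shape[0])

# Part (a): standard matrix with r(A) = 3 = m.
A_a = np.array([[1, 2, -2, 1], [0, 3, 4, 0], [1, 0, 2, 1]])
# Part (b): a 4 x 3 standard matrix, so r(A) <= 3 < 4 = m.
A_b = np.array([[1, 1, 0], [-1, 2, 4], [2, -1, 3], [5, 3, 6]])
print(is_onto(A_a), is_onto(A_b))  # True False
```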
In (4.1.13), we introduce the solution space NA of a homogeneous system AX→ = 0→. For a linear transformation T, the solution space of T(X→) = 0→ is called the kernel of T and is denoted by ker(T), that is,
(5.2.3)
ker(T) = {X→ ∈ ℝn : T(X→) = 0→}.
The kernel ker(T) is the set of all inverse images of the zero vector 0→ in ℝm under T. If A is the standard matrix of T, then by (4.1.13) and Definition 5.1.1, we have
(5.2.4)
ker(TA) = NA.
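A basis of ker(TA) = NA can be computed symbolically; a SymPy sketch (an assumption for illustration) using the coefficient matrix of Example 5.2.1:

```python
import sympy as sp

A = sp.Matrix([[1, -1, 0],
               [-2, 2, 1],
               [-1, 1, 1]])
basis = A.nullspace()      # basis vectors of ker(T_A)
print(basis)
for v in basis:            # every basis vector solves A x = 0
    assert A * v == sp.zeros(3, 1)
```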
Definition 5.2.3.
Because T(X→) = AX→, if T is one to one, then AX→ = 0→ implies X→ = 0→. That is, T is one to one if and only if the system AX→ = 0→ has only the zero solution. The latter can be justified by Theorem 4.5.2 (1). Hence, we obtain the following criterion, which uses the rank of A to determine whether T is one to one.
Theorem 5.2.4.
Let T : ℝn → ℝm be a linear transformation with standard matrix A. Then the following assertions are equivalent.
i. T is one to one.
ii. AX→ = 0→ only has the zero solution.
iii. r(A) = n.
If T is one to one, then by Theorem 2.5.2,
n = r(A) ≤ min{m, n} ≤ m.
Hence, a necessary condition for T to be one to one is n ≤ m, and if n > m, then T is not one to one.
Example 5.2.3.
a. T(x, y) = (x − y, 2x + y, 4x − y).
b. T(x, y) = (x − 2y, 2x − 4y, 4x − 8y).
c. T(x, y, z) = (x − y − z, 2x + y + 3z).
Solution
a. Because
A =
[ 1 −1 ]
[ 2  1 ]
[ 4 −1 ]
→ R1(−2)+R2, R1(−4)+R3 →
[ 1 −1 ]
[ 0  3 ]
[ 0  3 ]
→ R2(−1)+R3 →
[ 1 −1 ]
[ 0  3 ]
[ 0  0 ],
r(A) = 2 = n. By Theorem 5.2.4, T is one to one.
b. Because
A =
[ 1 −2 ]
[ 2 −4 ]
[ 4 −8 ]
→ R1(−2)+R2, R1(−4)+R3 →
[ 1 −2 ]
[ 0  0 ]
[ 0  0 ],
r(A) = 1 < 2 = n. By Theorem 5.2.4, T is not one to one.
Let T : ℝ n → ℝ m be a linear transformation. By Theorem 5.2.3 , T is onto if and only if r(A) = m and by
Theorem 5.2.4 , T is one to one if and only if r(A) = n. Hence, when m = n, we see that onto and one to
one linear transformations are the same. This, together with Theorems 2.7.6 and 2.7.5 , implies the
following result that uses the rank or determinant of A to determine whether a linear operator T : ℝ n → ℝ n is
onto or one to one.
Theorem 5.2.5.
Let T be a linear operator from ℝ n to ℝ n with standard matrix A. Then the following assertions are
equivalent.
i. T is one to one.
ii. T is onto.
iii. |A | ≠ 0.
iv. r(A) = n.
v. A is invertible.
Let T : ℝ n → ℝ n be a linear operator with standard matrix A. If A is invertible, then A − 1 exists. The linear
operator with standard matrix A − 1 is called the inverse of T, denoted by T − 1, that is,
(5.2.5)
T−1(X→) = A−1X→.
Example 5.2.4.
Determine whether the following linear operators are invertible. If so, find their inverses.
Solution
a. Because
|A| = | cosθ −sinθ |
      | sinθ  cosθ | = cos²θ + sin²θ = 1 ≠ 0,
so T is invertible, and
A−1 = (1/|A|) [  cosθ sinθ ]
              [ −sinθ cosθ ] =
[  cosθ sinθ ]
[ −sinθ cosθ ].
By (5.2.5),
T−1((x, y)T) =
[  cosθ sinθ ]
[ −sinθ cosθ ]
(x, y)T.
b. Because |A| = | 2 1 |
                 | 3 4 | = 5 ≠ 0, by Theorem 5.2.5, T is invertible. By Theorem 2.7.5 (2),
A−1 = (1/|A|) [  4 −1 ]
              [ −3  2 ] =
[  4/5 −1/5 ]
[ −3/5  2/5 ].
By (5.2.5), we have
T−1((x1, x2)T) =
[  4/5 −1/5 ]
[ −3/5  2/5 ]
(x1, x2)T.
c. By computation,
|A| = |  1 1 −1 |
      | −1 2  0 |
      |  0 1  1 | = 4 ≠ 0.
Hence, T is invertible.
(A | I3) =
[  1 1 −1 | 1 0 0 ]
[ −1 2  0 | 0 1 0 ]
[  0 1  1 | 0 0 1 ]
→ R1(1)+R2 →
[ 1 1 −1 | 1 0 0 ]
[ 0 3 −1 | 1 1 0 ]
[ 0 1  1 | 0 0 1 ]
→ R2,3 →
[ 1 1 −1 | 1 0 0 ]
[ 0 1  1 | 0 0 1 ]
[ 0 3 −1 | 1 1 0 ]
→ R2(−3)+R3 →
[ 1 1 −1 | 1 0  0 ]
[ 0 1  1 | 0 0  1 ]
[ 0 0 −4 | 1 1 −3 ]
→ R3(−1/4) →
[ 1 1 −1 | 1    0    0   ]
[ 0 1  1 | 0    0    1   ]
[ 0 0  1 | −1/4 −1/4 3/4 ]
→ R3(−1)+R2, R3(1)+R1 →
[ 1 1 0 | 3/4  −1/4 3/4 ]
[ 0 1 0 | 1/4   1/4 1/4 ]
[ 0 0 1 | −1/4 −1/4 3/4 ]
→ R2(−1)+R1 →
[ 1 0 0 | 1/2  −1/2 1/2 ]
[ 0 1 0 | 1/4   1/4 1/4 ]
[ 0 0 1 | −1/4 −1/4 3/4 ]
= (I3 | A−1).
By (5.2.5), we have
T−1((x1, x2, x3)T) =
[ 1/2  −1/2 1/2 ]
[ 1/4   1/4 1/4 ]
[ −1/4 −1/4 3/4 ]
(x1, x2, x3)T.
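The inverse operator of part (c) can be checked numerically; a NumPy sketch (an assumption for illustration) that inverts A and confirms (5.2.5):

```python
import numpy as np

# Standard matrix of the operator in part (c).
A = np.array([[1.0, 1.0, -1.0],
              [-1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
A_inv = np.linalg.inv(A)        # exists because |A| = 4 != 0
T_inv = lambda x: A_inv @ x     # (5.2.5): T^{-1}(X) = A^{-1} X

x = np.array([2.0, -1.0, 3.0])  # an arbitrary test vector
assert np.allclose(A @ T_inv(x), x)   # T undoes T^{-1}
assert np.allclose(T_inv(A @ x), x)   # and vice versa
print(np.round(A_inv, 2))
```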
Exercises
1. For each of the following linear transformations, determine whether it is onto.
a. T(x 1, x 2, x 3) = (2x 1 − 2x 2 + x 3, 3x 1 − 2x 2 + x 3, x 1 + 2x 2 − 4x 3, 2x 1 − 3x 2 + 2x 3).
b. T(x 1, x 2, x 3) = (x 1 + 2x 2 − x 3, − x 1 − 2x 2, − x 1 + x 2 + 2x 3).
c. T(x 1, x 2, x 3, x 4) = (x 1 − 2x 2 + x 4, 3x 1 + 2x 4, x 1 − 2x 3 + 3x 4).
d. T(x 1, x 2, x 3, x 4) = (x 1 − x 2 + x 3, x 1 − x 4, 2x 1 − x 2 + x 3 − x 4).
2. Let T : ℝ2 → ℝ2 be defined by
T(x, y) = (x − y, 2x − 2y).
Show that T is not onto and find a vector b→ ∈ ℝ2 such that b→ ∉ T(ℝ2).
4. For each of the following linear transformations, determine whether it is one to one.
a. T(x 1, x 2, x 3) = (x 1 − x 2 + 2x 3, − x 1 − x 2, x 1 + x 2 − x 3, − 2x 1 − x 2 + x 3).
b. T(x1, x2, x3) = (−x1 + x2 + x3, x1 − x2 − 2x3, x1 + x2).
c. T(x 1, x 2, x 3, x 4) = (x 1 − x 2 + x 3, 2x 1 + x 2 + x 3, x 1 + x 2 + 2x 4).
d. T(x 1, x 2, x 3) = (x 1 − x 2 + 3x 3, x 1 − 2x 3, 2x 1 − x 2 + x 3).
5. For each of the following linear operators, determine whether it is invertible. If so, find its inverse.
a. T(x 1, x 2) = (x 1 + x 2, 6x 1 + 7x 2).
b. T(x 1, x 2) = (3x 1 − 9x 2, 4x 1 − 12x 2).
c. T(x 1, x 2, x 3) = (x 1 + 3x 2 + 2x 3, x 2 − x 3, x 3).
d. T(x 1, x 2, x 3) = (x 1 + 3x 2, 3x 1 + 7x 2 − 4x 3, x 1 + 5x 2 + 5x 3).
5.3 Linear operators in ℝ2 and ℝ3
Projection operators in ℝ 2
Theorem 5.3.1.
Let T : ℝ 2 → ℝ 2 be the orthogonal projection on a line l passing through the origin (0, 0) and let the angle
between l and the x-axis be θ (see Figure 5.3 ). Then
(5.3.1)
T((x, y)T) =
[ cos²θ      sinθ cosθ ]
[ cosθ sinθ  sin²θ     ]
(x, y)T.
Proof
From Figure 5.3 , we see that
(5.3.2)
(w 1, w 2) = (rcosθ, rsinθ).
(5.3.3)
r = xcosθ + ysinθ.
(5.3.4)
The result (5.3.3) can be proved by using the dot product. Indeed, by Figure 5.3 and (5.3.2) , we
see that
BA→ = (x − r cosθ, y − r sinθ).
Because BA→ ⊥ OB→, the dot product BA→ · OB→ = 0, that is,
(x − r cosθ, y − r sinθ) · (r cosθ, r sinθ) = 0.
Hence
0 = (x − r cosθ)(r cosθ) + (y − r sinθ)(r sinθ)
= r[(x − r cosθ)cosθ + (y − r sinθ)sinθ]
= r[x cosθ + y sinθ − r(cos²θ + sin²θ)] = r[x cosθ + y sinθ − r].
By Theorem 5.3.1 with θ = 0, π/2, π/4, we obtain
Example 5.3.1.
i. The projection on the x-axis:
T((x, y)T) =
[ 1 0 ]
[ 0 0 ]
(x, y)T = (x, 0)T.
ii. The projection on the y-axis:
T((x, y)T) =
[ 0 0 ]
[ 0 1 ]
(x, y)T = (0, y)T.
iii. The projection on the line y = x:
T((x, y)T) =
[ 1/2 1/2 ]
[ 1/2 1/2 ]
(x, y)T = ((x + y)/2, (x + y)/2)T.
Solution
i. The projection operator on the x-axis is
T((x, y)T) =
[ cos²0        sin 0 cos 0 ]
[ cos 0 sin 0  sin²0       ]
(x, y)T =
[ 1 0 ]
[ 0 0 ]
(x, y)T = (x, 0)T.
ii. The projection operator on the y-axis is
T((x, y)T) =
[ cos²(π/2)          sin(π/2)cos(π/2) ]
[ cos(π/2)sin(π/2)   sin²(π/2)        ]
(x, y)T =
[ 0 0 ]
[ 0 1 ]
(x, y)T = (0, y)T.
iii. The projection operator on the line y = x is
T((x, y)T) =
[ cos²(π/4)          sin(π/4)cos(π/4) ]
[ cos(π/4)sin(π/4)   sin²(π/4)        ]
(x, y)T =
[ 1/2 1/2 ]
[ 1/2 1/2 ]
(x, y)T = ((x + y)/2, (x + y)/2)T.
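Formula (5.3.1) can be wrapped as a function; a NumPy sketch (an assumption for illustration) that also checks the defining property of a projection, P² = P:

```python
import numpy as np

def projection(theta):
    """Standard matrix (5.3.1) of the orthogonal projection on the line
    through the origin at angle theta with the x-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c * c, s * c],
                     [c * s, s * s]])

P = projection(np.pi / 4)
print(np.round(P, 3))        # every entry is 1/2 when theta = pi/4
# Projecting twice changes nothing.
assert np.allclose(P @ P, P)
```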
Reflection operators in ℝ 2
Theorem 5.3.2.
Let T : ℝ 2 → ℝ 2 be the reflection operator about a line l passing through the origin (0, 0) and let the angle
between l and the x-axis be θ (see Figure 5.3 ). Then
(5.3.5)
T
() (
x
y
=
cos2θ sin2θ
sin2θ − cos2θ )( )x
y
.
Proof
By the midpoint formula (1.2.14) and Theorem 5.3.1, from Figure 5.3, we have

( (w₁ + x)/2 )   ( cos²θ      cosθ sinθ ) ( x )
( (w₂ + y)/2 ) = ( sinθ cosθ  sin²θ     ) ( y ).

This implies

T ( x )   ( w₁ )     ( x )     ( cos²θ      cosθ sinθ ) ( x )   ( 2cos²θ − 1   2sinθ cosθ ) ( x )
  ( y ) = ( w₂ ) = − ( y ) + 2 ( sinθ cosθ  sin²θ     ) ( y ) = ( 2cosθ sinθ   2sin²θ − 1 ) ( y ).

This, together with cos2θ = 2cos²θ − 1 = 1 − 2sin²θ and sin2θ = 2sinθ cosθ, implies (5.3.5).
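A reflection applied twice returns every vector to itself, so the matrix in (5.3.5) must square to the identity. The sketch below (Python, not from the text) checks this for θ = π/4, whose matrix swaps the coordinates as in the reflection about the line y = x:

```python
import math

def refl_matrix(theta):
    """Matrix of the reflection about the line through the origin
    at angle theta, formula (5.3.5) of Theorem 5.3.2."""
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return [[c2, s2],
            [s2, -c2]]

def mat_mul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

R = refl_matrix(math.pi / 4)   # reflection about the line y = x
RR = mat_mul(R, R)             # should be the 2x2 identity matrix
```

For θ = π/4 the matrix is [[0, 1], [1, 0]] up to rounding, in agreement with the reflection operator T(x, y) = (y, x).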
By Theorem 5.3.2 with θ = 0, π/2, π/4, we obtain
Example 5.3.2.
i. The reflection about the x-axis is

T ( x )   ( 1   0 ) ( x )   (  x )
  ( y ) = ( 0  −1 ) ( y ) = ( −y ).

ii. The reflection about the y-axis (see Figure 6.2(II)) is

T ( x )   ( −1  0 ) ( x )   ( −x )
  ( y ) = (  0  1 ) ( y ) = (  y ).

iii. The reflection about the line y = x (see Figure 6.2(II)) is

T ( x )   ( 0  1 ) ( x )   ( y )
  ( y ) = ( 1  0 ) ( y ) = ( x ).
Solution
i. The reflection about the x-axis is

T ( x )   ( cos2(0)  sin2(0)  ) ( x )   ( 1   0 ) ( x )   (  x )
  ( y ) = ( sin2(0)  −cos2(0) ) ( y ) = ( 0  −1 ) ( y ) = ( −y ).

ii. The reflection about the y-axis is

T ( x )   ( cos2(π/2)  sin2(π/2)  ) ( x )   ( cosπ   sinπ ) ( x )   ( −1  0 ) ( x )   ( −x )
  ( y ) = ( sin2(π/2)  −cos2(π/2) ) ( y ) = ( sinπ  −cosπ ) ( y ) = (  0  1 ) ( y ) = (  y ).

iii. The reflection about the line y = x is

T ( x )   ( cos2(π/4)  sin2(π/4)  ) ( x )   ( cos(π/2)  sin(π/2)  ) ( x )   ( 0  1 ) ( x )   ( y )
  ( y ) = ( sin2(π/4)  −cos2(π/4) ) ( y ) = ( sin(π/2)  −cos(π/2) ) ( y ) = ( 1  0 ) ( y ) = ( x ).
Example 5.3.3.
1. Find the linear operator on ℝ 2 that reflects about the line y = x and then projects on the y-axis.
2. Find the linear operator on ℝ 2 that projects on the y-axis and then reflects about the line y = x.
Solution

The reflection about the line y = x is T₁(x, y) = (y, x) and the projection on the y-axis is T₂(x, y) = (0, y).

1. The operator that reflects and then projects is (T₂ ∘ T₁)(x, y) = T₂(y, x) = (0, x).
2. The operator that projects and then reflects is (T₁ ∘ T₂)(x, y) = T₁(0, y) = (y, 0).
A rotation operator on ℝ² is an operator that rotates each vector in ℝ² through a fixed angle θ (see Figure 6.2(IV)).
Theorem 5.3.3.
Let T : ℝ² → ℝ² be the operator that rotates each vector through a fixed counterclockwise angle θ. Then

(5.3.6)

T ( x )   ( cosθ  −sinθ ) ( x )
  ( y ) = ( sinθ   cosθ ) ( y ).
Proof
We rotate (x, y) to (b₁, b₂) through a counterclockwise angle θ. Then

(5.3.7)

T(x, y) = (b₁, b₂).

Let r be the length of the segment from the origin to the point (x, y) and let ϕ denote the angle between the half-line beginning at the origin (0, 0) and passing through (x, y) and the x-axis. Then (x, y) = (r cosϕ, r sinϕ) and

( b₁ )   ( r cos(θ + ϕ) )   ( r cosθ cosϕ − r sinθ sinϕ )   ( x cosθ − y sinθ )   ( cosθ  −sinθ ) ( x )
( b₂ ) = ( r sin(θ + ϕ) ) = ( r sinθ cosϕ + r sinϕ cosθ ) = ( x sinθ + y cosθ ) = ( sinθ   cosθ ) ( y ).
Example 5.3.4.
Find the linear operator on ℝ² that rotates each vector through the angle π/6 and find the image of (1, 1) under the operator.

Solution

By (5.3.6) with θ = π/6,

T ( x )   ( cos(π/6)  −sin(π/6) ) ( x )   ( √3/2  −1/2 ) ( x )
  ( y ) = ( sin(π/6)   cos(π/6) ) ( y ) = ( 1/2   √3/2 ) ( y )

and

T ( 1 )   ( √3/2  −1/2 ) ( 1 )   ( (√3 − 1)/2 )
  ( 1 ) = ( 1/2   √3/2 ) ( 1 ) = ( (1 + √3)/2 ).
Example 5.3.5.
Find the standard matrix for the linear operator on ℝ² that rotates by θ₁ and then rotates by θ₂.

Solution

By (5.3.6) with θ = θ₁ and θ = θ₂, the standard matrix, denoted by A, for the linear operator is
A = ( cosθ₂  −sinθ₂ ) ( cosθ₁  −sinθ₁ )
    ( sinθ₂   cosθ₂ ) ( sinθ₁   cosθ₁ )

  = ( cosθ₂ cosθ₁ − sinθ₂ sinθ₁   −cosθ₂ sinθ₁ − sinθ₂ cosθ₁ )
    ( sinθ₂ cosθ₁ + cosθ₂ sinθ₁   −sinθ₂ sinθ₁ + cosθ₂ cosθ₁ )

  = ( cos(θ₁ + θ₂)  −sin(θ₁ + θ₂) )
    ( sin(θ₁ + θ₂)   cos(θ₁ + θ₂) ).
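The identity in Example 5.3.5 — two successive rotations compose to a single rotation by the sum of the angles — can be verified numerically. A sketch (Python, not from the text):

```python
import math

def rot_matrix(theta):
    """Matrix of the rotation through angle theta, formula (5.3.6)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def mat_mul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

t1, t2 = 0.3, 1.1
# Rotating by t1 and then by t2 equals one rotation by t1 + t2.
composed = mat_mul(rot_matrix(t2), rot_matrix(t1))
direct = rot_matrix(t1 + t2)
```

Note that the second rotation's matrix is multiplied on the left, since it is applied last.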
Projection operators in ℝ³
Theorem 5.3.4.
1. The projection operator on the xy-plane is

T ( x )   ( 1  0  0 ) ( x )
  ( y ) = ( 0  1  0 ) ( y ).
  ( z )   ( 0  0  0 ) ( z )

2. The projection operator on the xz-plane is

T ( x )   ( 1  0  0 ) ( x )
  ( y ) = ( 0  0  0 ) ( y ).
  ( z )   ( 0  0  1 ) ( z )

3. The projection operator on the yz-plane is

T ( x )   ( 0  0  0 ) ( x )
  ( y ) = ( 0  1  0 ) ( y ).
  ( z )   ( 0  0  1 ) ( z )
Reflection operators in ℝ³
Theorem 5.3.5.
1. The reflection operator about the xy-plane is

T ( x )   ( 1  0   0 ) ( x )
  ( y ) = ( 0  1   0 ) ( y ).
  ( z )   ( 0  0  −1 ) ( z )

2. The reflection operator about the xz-plane is

T ( x )   ( 1   0  0 ) ( x )
  ( y ) = ( 0  −1  0 ) ( y ).
  ( z )   ( 0   0  1 ) ( z )

3. The reflection operator about the yz-plane is

T ( x )   ( −1  0  0 ) ( x )
  ( y ) = (  0  1  0 ) ( y ).
  ( z )   (  0  0  1 ) ( z )
The axis of rotation is a ray emanating from the origin. The rotation is counterclockwise looking toward the
origin along the axis of rotation.
An angle of rotation is said to be positive if the rotation is counterclockwise looking toward the origin along
the axis of rotation, and said to be negative if the rotation is clockwise.
Theorem 5.3.6.
1. The rotation operator in ℝ³ about the positive z-axis with a positive angle θ of rotation is

(5.3.8)

T ( x )   ( cosθ  −sinθ  0 ) ( x )
  ( y ) = ( sinθ   cosθ  0 ) ( y ).
  ( z )   ( 0      0     1 ) ( z )

2. The rotation operator in ℝ³ about the positive x-axis with a positive angle θ of rotation is

(5.3.9)

T ( x )   ( 1  0      0    ) ( x )
  ( y ) = ( 0  cosθ  −sinθ ) ( y ).
  ( z )   ( 0  sinθ   cosθ ) ( z )

3. The rotation operator in ℝ³ about the positive y-axis with a positive angle θ of rotation is

(5.3.10)

T ( x )   (  cosθ  0  sinθ ) ( x )
  ( y ) = (  0     1  0    ) ( y ).
  ( z )   ( −sinθ  0  cosθ ) ( z )
Proof
1. The rotation operator in the xy-plane in ℝ² is given by (5.3.6). So, the rotation operator in ℝ³ about the positive z-axis (see Figure 6.4(I)) applies (5.3.6) to (x, y) and leaves z unchanged, or equivalently, it is given by (5.3.8). The proofs of (5.3.9) and (5.3.10) are similar.
Example 5.3.6.
Find the standard matrix for the linear operator on ℝ 3 that rotates by θ about the positive z-axis and then
reflects about the yz-plane and then projects on the xy-plane.
Solution

Let A, B, and C be the standard matrices, respectively, for the linear operators that rotate by θ about the positive z-axis, reflect about the yz-plane, and project on the xy-plane. Then

A = ( cosθ  −sinθ  0 )      B = ( −1  0  0 )      C = ( 1  0  0 )
    ( sinθ   cosθ  0 ),         (  0  1  0 ),         ( 0  1  0 ).
    ( 0      0     1 )          (  0  0  1 )          ( 0  0  0 )

The standard matrix for the composite operator is

CBA = ( 1  0  0 ) ( −1  0  0 ) ( cosθ  −sinθ  0 )   ( −cosθ  sinθ  0 )
      ( 0  1  0 ) (  0  1  0 ) ( sinθ   cosθ  0 ) = (  sinθ  cosθ  0 ).
      ( 0  0  0 ) (  0  0  1 ) ( 0      0     1 )   (  0     0     0 )
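Because the standard matrix of a composition is the product of the standard matrices taken right to left, the matrix of Example 5.3.6 can be checked numerically. A sketch (Python, not from the text):

```python
import math

def mat_mul3(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

theta = 0.7
c, s = math.cos(theta), math.sin(theta)
A = [[c, -s, 0], [s, c, 0], [0, 0, 1]]   # rotation about the z-axis
B = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]   # reflection about the yz-plane
C = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]    # projection on the xy-plane

CBA = mat_mul3(C, mat_mul3(B, A))
expected = [[-c, s, 0], [s, c, 0], [0, 0, 0]]
```

The product agrees with the closed form derived in the solution for any angle θ.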
Exercises
1. In Theorem 5.3.1, if θ = π/6, π/3, find T(4, −4).
2. Find the linear operator on ℝ² that reflects about the y-axis and then reflects about the x-axis, and the linear operator on ℝ² that reflects about the x-axis and then reflects about the y-axis.
3. Find the rotation operator on ℝ² with θ = π/4 and the image of x = (1, −1) under the operator.
Chapter 6 Lines and planes in ℝ³
Let a→ = (x₁, x₂, x₃) and b→ = (y₁, y₂, y₃) be two nonzero vectors in ℝ³. We want to find a vector n→ = (a, b, c) in ℝ³ such that n→ ⊥ a→ and n→ ⊥ b→. By Theorem 1.2.6, n→ ⋅ a→ = 0 and n→ ⋅ b→ = 0. This implies that n→ satisfies the following system of linear equations:

(6.1.1)

{ x₁a + x₂b + x₃c = 0
{ y₁a + y₂b + y₃c = 0.
Lemma 6.1.1.
(6.1.2)

            ( | x₂ x₃ |    | x₁ x₃ |  | x₁ x₂ | )
(a, b, c) = ( | y₂ y₃ |, − | y₁ y₃ |, | y₁ y₂ | )

is a solution of (6.1.1).

Substituting a, b, c given in (6.1.2) into (6.1.1) shows that (a, b, c) defined in (6.1.2) is a solution of (6.1.1). Here, we provide a proof of Lemma 6.1.1 that shows where the solution (a, b, c) given in (6.1.2) comes from.
Proof
If a→ is parallel to b→, then the three determinants in (6.1.2) are zero, and (a, b, c) = (0, 0, 0) is a solution of (6.1.1). If a→ is not parallel to b→, then one of the three determinants in (6.1.2) is not zero. Without loss of generality, assume that

c = | x₁ x₂ |
    | y₁ y₂ | ≠ 0.

Applying Cramer's rule to the system

{ x₁a + x₂b = −x₃c
{ y₁a + y₂b = −y₃c,

we obtain

a = (1/c) | −x₃c  x₂ | = | x₂ x₃ |     and     b = (1/c) | x₁  −x₃c | = − | x₁ x₃ |.
          | −y₃c  y₂ |   | y₂ y₃ |                       | y₁  −y₃c |     | y₁ y₃ |
Definition 6.1.1.
Let a→ = (x₁, x₂, x₃) and b→ = (y₁, y₂, y₃) be two vectors. The cross product a→ × b→ of a→ and b→ is defined by

(6.1.3)

          ( | x₂ x₃ |    | x₁ x₃ |  | x₁ x₂ | )
a→ × b→ = ( | y₂ y₃ |, − | y₁ y₃ |, | y₁ y₂ | ).
Let i→ = (1, 0, 0), j→ = (0, 1, 0), and k→ = (0, 0, 1) be the standard vectors in ℝ³. Following the minor expansion, we rewrite a→ × b→ into the following two forms:

(6.1.4)

a→ × b→ = | x₂ x₃ |      | x₁ x₃ |      | x₁ x₂ |
          | y₂ y₃ | i→ − | y₁ y₃ | j→ + | y₁ y₂ | k→

        = | i→  j→  k→ |
          | x₁  x₂  x₃ |.
          | y₁  y₂  y₃ |
Let c→ = (z₁, z₂, z₃). By Theorem 3.1.1 and (6.1.3), we have

(6.1.5)

                 | x₁ x₂ x₃ |
(a→ × b→) ⋅ c→ = | y₁ y₂ y₃ |.
                 | z₁ z₂ z₃ |

The value (a→ × b→) ⋅ c→ is called the scalar triple product of a→, b→, c→.
By (6.1.5) and Corollary 3.3.1, we see that (a→ × b→) ⋅ a→ = 0 and (a→ × b→) ⋅ b→ = 0. By Theorem 1.2.6, the cross product a→ × b→ is perpendicular to both vectors a→ and b→ (see Figure 6.1(I)), that is,

(6.1.6)

(a→ × b→) ⊥ a→ and (a→ × b→) ⊥ b→.
Example 6.1.1.
Let a→ = (1, 2, −2) and b→ = (3, 0, 1). Find a→ × b→ and verify that (a→ × b→) ⊥ a→ and (a→ × b→) ⊥ b→.

Solution

a→ × b→ = | i→  j→  k→ |   | 2 −2 |      | 1 −2 |      | 1  2 |
          | 1   2  −2 | = | 0  1 | i→ − | 3  1 | j→ + | 3  0 | k→
          | 3   0   1 |

= 2 i→ − 7 j→ − 6 k→ = (2, −7, −6).

Because

(a→ × b→) ⋅ a→ = (2)(1) + (−7)(2) + (−6)(−2) = 0

and

(a→ × b→) ⋅ b→ = (2)(3) + (−7)(0) + (−6)(1) = 0,

we conclude that (a→ × b→) ⊥ a→ and (a→ × b→) ⊥ b→.
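The computation in Example 6.1.1 is easy to automate. The sketch below (Python, not from the text) implements (6.1.3) directly and checks the perpendicularity relations (6.1.6):

```python
def cross(a, b):
    """Cross product of two 3-vectors, following (6.1.3)."""
    return (a[1] * b[2] - a[2] * b[1],
            -(a[0] * b[2] - a[2] * b[0]),
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

a = (1, 2, -2)
b = (3, 0, 1)
n = cross(a, b)   # (2, -7, -6), as in Example 6.1.1
```

Both dot products n ⋅ a and n ⋅ b come out exactly zero, since the arithmetic is over integers.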
By Corollary 1.3.1(2) and Definition 6.1.3, we obtain the following result, which shows that the norm of a→ × b→ is equal to the area of a parallelogram in ℝ³ determined by a→ and b→.
Theorem 6.1.1.
… 5/10
Chapter 6 Lines and planes in Math Equation
→
Let →
a = (x 1, x 2, x 3) and b = (y 1, y 2, y 3) be two vectors. Then the area of the parallelogram determined by
→
a and b is
→
| | a × b | |.
Let a→, b→, and c→ be three vectors in ℝ³ that are not in the same plane but whose initial points are the same. Then these vectors determine a parallelepiped (see Figure 6.2(I)). The purpose of this section is to derive formulas for the volume of such a parallelepiped. It is known that the volume of a parallelepiped is equal to the product of the area of its base and its height. The base of the parallelepiped is determined by any two of the vectors a→, b→, and c→ and is a parallelogram in ℝ³. Here we use the base determined by a→ and b→. By Theorem 6.1.1, the area of the parallelogram determined by a→ and b→ is ǁa→ × b→ǁ. To find the height of the parallelepiped, we use the projection of the vector c→ on the vector a→ × b→ because the norm of the projection is equal to the height of the parallelepiped (see Figure 6.2(I)). Based on the above ideas, we show the following two formulas for the volume of the parallelepiped determined by a→, b→, and c→.
Theorem 6.1.2.
Let a→ = (x₁, x₂, x₃), b→ = (y₁, y₂, y₃), and c→ = (z₁, z₂, z₃) be three vectors. Then the volume of the parallelepiped determined by the three vectors is given by

(6.1.7)

V = | c→ ⋅ (a→ × b→) |, the absolute value of the determinant

| x₁ x₂ x₃ |
| y₁ y₂ y₃ |.
| z₁ z₂ z₃ |
Proof
Let n→ = a→ × b→. By Theorem 6.1.1, the area of the base parallelogram determined by a→ and b→ is ǁn→ǁ. The height h of the parallelepiped is equal to the norm of the projection of c→ on n→ = a→ × b→ (see Figure 6.2(I)). By (1.3.1),

h = ǁproj_n→ c→ǁ = ǁ (c→ ⋅ n→ / ǁn→ǁ²) n→ ǁ = | c→ ⋅ n→ | / ǁn→ǁ.

Hence,

V = ǁn→ǁ h = ǁn→ǁ ⋅ | c→ ⋅ n→ | / ǁn→ǁ = | c→ ⋅ n→ | = | c→ ⋅ (a→ × b→) |.
Example 6.1.2.
Let a→ = (3, −2, −5), b→ = (1, 4, −4), and c→ = (0, −3, −2). Use the two formulas in Theorem 6.1.2 to find the volume of the parallelepiped determined by a→, b→, and c→.

Solution

By (6.1.3), we have

b→ × c→ = | i→  j→  k→ |   |  4 −4 |      | 1 −4 |      | 1  4 |
          | 1   4  −4 | = | −3 −2 | i→ − | 0 −2 | j→ + | 0 −3 | k→
          | 0  −3  −2 |

= (−8 − 12) i→ − (−2 − 0) j→ + (−3 − 0) k→ = (−20, 2, −3).

Hence, we obtain

V = | a→ ⋅ (b→ × c→) | = | (3)(−20) + (−2)(2) + (−5)(−3) | = | −49 | = 49.

By (3.1.4), we have

    | 3 −2 −5 | 3 −2
V = | 1  4 −4 | 1  4  = | −49 | = 49.
    | 0 −3 −2 | 0 −3
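The two formulas of Theorem 6.1.2 can be compared directly in code. This sketch (Python, not from the text) computes the volume of Example 6.1.2 both as |a→ ⋅ (b→ × c→)| and as the absolute value of the 3 × 3 determinant:

```python
def cross(a, b):
    """Cross product of two 3-vectors, following (6.1.3)."""
    return (a[1] * b[2] - a[2] * b[1],
            -(a[0] * b[2] - a[2] * b[0]),
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(r1, r2, r3):
    """3x3 determinant with the given rows, by minor expansion."""
    return (r1[0] * (r2[1] * r3[2] - r2[2] * r3[1])
            - r1[1] * (r2[0] * r3[2] - r2[2] * r3[0])
            + r1[2] * (r2[0] * r3[1] - r2[1] * r3[0]))

a, b, c = (3, -2, -5), (1, 4, -4), (0, -3, -2)
vol_triple = abs(dot(a, cross(b, c)))   # 49, as in Example 6.1.2
vol_det = abs(det3(a, b, c))            # 49 again
```

Agreement of the two values illustrates (6.1.5): the scalar triple product equals the determinant whose rows are the three vectors.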
Exercises
1. Find a→ × b→ and verify (a→ × b→) ⊥ a→ and (a→ × b→) ⊥ b→.
   a. a→ = (1, 2, 3), b→ = (1, 0, 1);
   b. a→ = (1, 0, −3), b→ = (−1, 0, −2);
   c. a→ = (0, 2, 1), b→ = (−5, 0, 1);
   d. a→ = (1, 1, −1), b→ = (−1, 1, 0).

2. Let a→ = (1, −2, 3), b→ = (−1, −4, 0), and c→ = (2, 3, 1). Find the scalar triple product of a→, b→, c→.

3. Find the area of the parallelogram determined by a→ and b→.
   a. a→ = (1, 0, 1), b→ = (−1, 0, 1);
   b. a→ = (1, 0, 0), b→ = (−1, 1, −2).

4. Find the volume of the parallelepiped determined by a→, b→, and c→.
   a. a→ = (1, −1, 0), b→ = (−1, 0, 2), c→ = (0, −1, −1);
   b. a→ = (−1, −1, 0), b→ = (−1, 1, −2), c→ = (−1, 1, 1).
6.2 Equations of lines in ℝ³
Definition 6.2.1.
A vector υ→ in ℝ³ is said to be parallel to a line in ℝ³ if it is parallel to PQ→ for any two different points P and Q on the line.

Note that if P₁ and Q₁ are two other different points on the line, then PQ→ is parallel to P₁Q₁→ because PQ→ and P₁Q₁→ have the same or opposite direction. By Definition 1.1.6, if a vector is parallel to PQ→, it is parallel to P₁Q₁→. Hence, in Definition 6.2.1, υ→ is parallel to PQ→ for any two different points P and Q on the line if and only if there exist two different points P and Q on the line such that υ→ is parallel to PQ→.
Definition 6.2.2.
Let P₀(x₀, y₀, z₀) be a point in ℝ³ and let υ→ = (a, b, c) be a nonzero vector in ℝ³. Then the set of all points P(x, y, z) in ℝ³ satisfying P₀P→ ǁ υ→ constitutes a line in ℝ³.

By Definitions 6.2.1 and 6.2.2, we see that a line passing through the point P₀ that is parallel to υ→ is the same as the line in Definition 6.2.2.

The following result gives a parametric equation of a line passing through a point P₀ that is parallel to a given vector υ→.
Theorem 6.2.1.
Let P₀(x₀, y₀, z₀) be a point in ℝ³ and let υ→ = (a, b, c) be a nonzero vector in ℝ³. Then the parametric equation of the line passing through P₀ that is parallel to υ→ is

(6.2.1)

{ x = x₀ + ta
{ y = y₀ + tb     −∞ < t < ∞.
{ z = z₀ + tc
Proof
Let P(x, y, z) be any point on the line. Then P₀P→ = (x − x₀, y − y₀, z − z₀) and P₀P→ is parallel to υ→. By Definition 1.1.6, there exists a constant t ∈ ℝ that depends on P such that P₀P→ = t υ→. This, together with Definition 1.1.4, implies that (6.2.1) holds.
By (6.2.1), we see that the solution (4.3.1) of the system d) in Example 4.3.1 is a parametric equation of a line.

By the proof of Theorem 6.2.1, we see that for each point (x, y, z) on the line, there exists a unique number t ∈ ℝ such that (6.2.1) holds. Conversely, by (6.2.1), we see that for each t ∈ ℝ, there exists a unique point (x, y, z) on the line that satisfies (6.2.1). Hence, there is a one-to-one correspondence between the points on the line and the numbers t ∈ ℝ.
Example 6.2.1.
a. Find the parametric equation of the line passing through the point P₀(1, 2, 3) that is parallel to the vector υ→ = (2, 3, −1).

b. Where does the line intersect the yz-plane?

Solution

a. By (6.2.1), the parametric equation of the line is

(6.2.2)

x = 1 + 2t, y = 2 + 3t, z = 3 − t.

b. Because the line intersects the yz-plane, x = 0. By the first equation of (6.2.2) with x = 0, we obtain t = −1/2. Substituting t = −1/2 into the second and third equations of (6.2.2) implies that y = 1/2 and z = 7/2. Hence, the intersection point is (0, 1/2, 7/2).
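Part b — substituting the parametric equation into the coordinate-plane condition and solving for t — works the same way for any coordinate plane. A sketch (Python, not from the text):

```python
def line_point(p0, v, t):
    """Point on the line through p0 with direction v at parameter t,
    as in (6.2.1)."""
    return tuple(p + t * d for p, d in zip(p0, v))

p0, v = (1, 2, 3), (2, 3, -1)

# Intersection with the yz-plane: solve x0 + t*a = 0 for t.
t = -p0[0] / v[0]
point = line_point(p0, v, t)   # (0.0, 0.5, 3.5), as in Example 6.2.1
```

The same pattern with `-p0[2] / v[2]` would give the intersection with the xy-plane, provided the corresponding direction component is nonzero.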
Example 6.2.2.
Find an equation of the line that passes through P₁(1, −1, 1) and P₂(2, 1, 0).

Solution

Let P₀(x₀, y₀, z₀) = P₁(1, −1, 1) and υ→ = (a, b, c) = P₁P₂→ = (1, 2, −1). By (6.2.1), we have
x = 1 + t, y = − 1 + 2t, z = 1 − t.
Now, we discuss the distance from a point to a given line in ℝ³, that is, the distance between the point and the intersection point of the given line and the line passing through the point that is orthogonal to the given line.
Theorem 6.2.2.
Let Q be a point in ℝ³ and let l be a line in ℝ³ that is parallel to a nonzero vector υ→. Then the distance D from Q to the line is

D = ǁυ→ × PQ→ǁ / ǁυ→ǁ for any point P on the line.
Proof
Let P be a point on the line. Then D = ǁPQ→ − proj_υ→ PQ→ǁ. It follows from (1.3.4) and Theorem 6.1.1 that

D = ǁPQ→ − proj_υ→ PQ→ǁ = ǁυ→ × PQ→ǁ / ǁυ→ǁ.
Example 6.2.3.
Find the distance from the point Q(1, 0, −1) to the line

(6.2.3)

x = t, y = −1 + t, z = 1 − t.

Solution

Let t = 0. Then P(0, −1, 1) is on the line and PQ→ = (1, 1, −2). By (6.2.3), υ→ = (1, 1, −1), which is parallel to the line. By computation, ǁυ→ǁ = √3 and

            ( | 1 −1 |    | 1 −1 |  | 1 1 | )
υ→ × PQ→ = ( | 1 −2 |, − | 1 −2 |, | 1 1 | ) = (−1, 1, 0),

so ǁυ→ × PQ→ǁ = √2. By Theorem 6.2.2,

D = ǁυ→ × PQ→ǁ / ǁυ→ǁ = √2/√3 = √6/3.
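Theorem 6.2.2 turns directly into code. The sketch below (Python, not from the text) recomputes the distance of Example 6.2.3:

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors, following (6.1.3)."""
    return (a[1] * b[2] - a[2] * b[1],
            -(a[0] * b[2] - a[2] * b[0]),
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def dist_point_line(q, p, v):
    """Distance from point q to the line through p with direction v,
    by Theorem 6.2.2."""
    pq = tuple(qi - pi for qi, pi in zip(q, p))
    return norm(cross(v, pq)) / norm(v)

# Example 6.2.3: Q(1, 0, -1), line x = t, y = -1 + t, z = 1 - t.
d = dist_point_line((1, 0, -1), (0, -1, 1), (1, 1, -1))   # sqrt(6)/3
```

Any point P on the line works: the cross product removes the component of PQ→ along the line, so the result is independent of the choice of P.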
Consider a line in the xy-plane given by the equation

(6.2.4)

Ax + By = C,

where A and B are not both zero.
Theorem 6.2.3.

The distance D from a point Q(x₁, y₁) to the line (6.2.4) is

(6.2.5)

D = | Ax₁ + By₁ − C | / √(A² + B²).
Proof
Without loss of generality, we assume that A ≠ 0. Let y = t. Solving (6.2.4) for x, we obtain x = C/A − (B/A)t. Hence, viewed in ℝ³, the line has the parametric equation

(6.2.6)

x = C/A − (B/A)t, y = t, z = 0 + 0t.

Note that D is equal to the distance from Q*(x₁, y₁, 0) to the line (6.2.6). From (6.2.6), υ→ = (−B/A, 1, 0) and the point P*(C/A, 0, 0) is on the line (6.2.6). By computation,

ǁυ→ǁ = √(A² + B²) / |A| and ǁυ→ × P*Q*→ǁ = | Ax₁ + By₁ − C | / |A|.

By Theorem 6.2.2,

D = ǁυ→ × P*Q*→ǁ / ǁυ→ǁ = | Ax₁ + By₁ − C | / √(A² + B²).
Example 6.2.4.
Solution
Exercises
1. a. Find the parametric equation of the line passing through the point P₀(−1, 2, 0) that is parallel to the vector υ→ = (1, −1, 1).
   b. Where does the line intersect the xy-plane?

2. Find the parametric equation of the line that passes through P₁(2, 1, −1) and P₂(−2, 3, −5).

3. Find the distance from the point Q(−1, 2, 1) to the line
   x = 1 + t, y = 2 − 4t, z = −3 + 2t.
6.3 Equations of planes in ℝ³
Definition 6.3.1.

Let P₀(x₀, y₀, z₀) be a point in ℝ³ and let n→ = (a, b, c) be a nonzero vector in ℝ³. The plane passing through P₀ with the normal vector n→ is the set of all points P(x, y, z) in ℝ³ satisfying

n→ ⋅ P₀P→ = 0.
Theorem 6.3.1.
The equation of the plane passing through P₀(x₀, y₀, z₀) with the normal vector n→ = (a, b, c) is

(6.3.1)

ax + by + cz = d,

where

(6.3.2)

d = (a, b, c) ⋅ (x₀, y₀, z₀) = ax₀ + by₀ + cz₀.
Proof
Let P(x, y, z) be any point in the plane. Then

P₀P→ = (x − x₀, y − y₀, z − z₀).

By Definition 6.3.1, n→ ⋅ P₀P→ = 0, which implies

a(x − x₀) + b(y − y₀) + c(z − z₀) = 0.

This implies ax + by + cz = ax₀ + by₀ + cz₀ = d.

Note that equation (6.3.1) is a linear equation with three variables x, y, z. The coefficients a, b, c of x, y, z in (6.3.1) are the components of a normal vector of the plane.
Example 6.3.1.
Solution

By (6.3.1), n→ = (2, 1, −1).
By Definition 6.3.1, we see that the normal vector n→ is orthogonal to the vector P₀P→ for every point P in the plane. The following result shows that the normal vector n→ is orthogonal to the vector PQ→ for any two different points P and Q in the plane.
Theorem 6.3.2.
Let n→ be the normal vector of a plane that passes through a point P₀(x₀, y₀, z₀). Then n→ is orthogonal to any vector PQ→, where P and Q are any two different points in the plane.
Proof
Let P and Q be any two different points in the plane. By Definition 6.3.1, n→ ⋅ P₀P→ = 0 and n→ ⋅ P₀Q→ = 0. Because PQ→ = P₀Q→ − P₀P→, we have

n→ ⋅ PQ→ = (n→ ⋅ P₀Q→) − (n→ ⋅ P₀P→) = 0.

Hence n→ is orthogonal to PQ→.
Definition 6.3.2.
A vector υ→ in ℝ³ is said to be orthogonal to a plane if it is orthogonal to PQ→ for any two different points P and Q in the plane. A vector υ→ in ℝ³ is said to be orthogonal to a line in ℝ³ if it is orthogonal to PQ→ for any two different points P and Q on the line.
By Definition 6.3.2, we see that a vector in ℝ³ is orthogonal to a plane if and only if it is orthogonal to any line in the plane. Also, by Definitions 6.3.1 and 6.3.2 and Theorem 6.3.2, we see that a normal vector of a plane is orthogonal to the plane as well as to any line in the plane.
Definition 6.3.3.
A vector in ℝ³ is said to be parallel to a plane in ℝ³ if there exists a line in the plane that is parallel to the vector.
By Definitions 6.2.1 and 6.3.3, we obtain the following result. Its proof is straightforward and is left to the reader.
Proposition 6.3.1.
1. If P₁(x₁, y₁, z₁) and P₂(x₂, y₂, z₂) are two different points in a plane, then the vector P₁P₂→ is parallel to the plane.

2. A vector n→ is orthogonal to a plane if and only if it is orthogonal to any vector that is parallel to the plane.

3. If both a→ and b→ are parallel to a plane and a→ is not parallel to b→, then the cross product a→ × b→ is orthogonal to the plane.
(6.3.3)

{ a₁x + b₁y + c₁z = d₁
{ a₂x + b₂y + c₂z = d₂,

where A and (A | b→) are the coefficient matrix and the augmented matrix of (6.3.3), respectively. Geometrically, the two planes are the same, or intersect in a line, or are parallel. Hence, if they are not parallel, then they intersect in a straight line.
Definition 6.3.4.
Two planes are said to be parallel if their normal vectors are parallel, and to be orthogonal if their normal
vectors are orthogonal.
By Definitions 1.1.6 and 1.2.5 and Theorem 1.2.6, we obtain the following results (see Figure 6.4(III)).
Theorem 6.3.3.
Let n₁→ and n₂→ be the normal vectors of two planes p₁ and p₂, respectively. Then the following assertions hold.

i. p₁ ǁ p₂ if and only if there exists k ∈ ℝ such that n₂→ = k n₁→.
ii. p₁ ⊥ p₂ if and only if n₁→ ⋅ n₂→ = 0.
Example 6.3.2.
For each pair of the following planes, determine whether it is parallel or orthogonal.

a. −x − y + z = 2, 3x + 3y − 3z = 5;
b. 2x − y + z = 1, 4x − 2y + 2z = 2;
c. 2x − 2y + 3z = 1, x + y + z = 2;
d. 2x − y + z = 2, x + 3y + z = 2.
Solution

a. Let n₁→ = (−1, −1, 1) and n₂→ = (3, 3, −3). Then n₂→ = −3n₁→ and n₁→ ǁ n₂→. By Theorem 6.3.3, the two planes are parallel.

b. Let n₁→ = (2, −1, 1) and n₂→ = (4, −2, 2). Then n₂→ = 2n₁→ and n₁→ ǁ n₂→. By Theorem 6.3.3, the two planes are parallel.

c. Let n₁→ = (2, −2, 3) and n₂→ = (1, 1, 1). Then n₂→ ≠ k n₁→ for any k ∈ ℝ and by Theorem 6.3.3, the two planes are not parallel. Because n₁→ ⋅ n₂→ = 3 ≠ 0, the two planes are not orthogonal.

d. Let n₁→ = (2, −1, 1) and n₂→ = (1, 3, 1). Because n₁→ ⋅ n₂→ = 0, the two planes are orthogonal.
Example 6.3.3.
Find the parametric equation of the line of intersection of the planes x − y − z = 1 and x + 2y − 4z = 4.
Analysis
We need to find a point on the line and a vector parallel to the line, as mentioned above. The cross product of the normal vectors of the two planes is parallel to the line of intersection of the two planes, so we could choose it. But how do we find a point on the line of intersection of the two planes? To find a point on the line of intersection, we must solve the system of the two equations. When we solve the system, we shall see that we automatically obtain the equation of the line of intersection. Hence, the method for finding the equation of the line of intersection of the two planes is simply to solve the system of the two equations.
Solution

The two planes give the system

(6.3.4)

{ x − y − z = 1
{ x + 2y − 4z = 4.

The augmented matrix reduces as

(A | b→) = ( 1 −1 −1  1 )  R₁(−1) + R₂  ( 1 −1 −1  1 )  R₂(1/3)  ( 1 −1 −1  1 )  R₂(1) + R₁  ( 1  0 −2  2 )
            ( 1  2 −4  4 )      →        ( 0  3 −3  3 )    →      ( 0  1 −1  1 )      →        ( 0  1 −1  1 ).

The system becomes

{ x − 2z = 2
{ y − z = 1,

where x, y are basic variables and z is a free variable. Let z = t. Then y = 1 + t and x = 2 + 2t. The parametric equation of the line of intersection of the planes is x = 2 + 2t, y = 1 + t, z = t.
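The parametric solution just found can be cross-checked: every point of the line must satisfy both plane equations, and the direction (2, 1, 1) must be perpendicular to both normal vectors. A sketch (Python, not from the text):

```python
def on_plane(point, normal, d):
    """True if the point satisfies n . p = d."""
    return sum(n * p for n, p in zip(normal, point)) == d

n1, d1 = (1, -1, -1), 1   # x - y - z = 1
n2, d2 = (1, 2, -4), 4    # x + 2y - 4z = 4

# Every point of x = 2 + 2t, y = 1 + t, z = t lies in both planes.
for t in range(-3, 4):
    p = (2 + 2 * t, 1 + t, t)
    assert on_plane(p, n1, d1) and on_plane(p, n2, d2)

v = (2, 1, 1)   # the direction vector of the line of intersection
```

The dot products v ⋅ n₁ and v ⋅ n₂ are both zero, anticipating Theorem 6.3.4.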
In Example 6.3.3, we solve the system (6.3.4) to obtain the parametric equation of the line of intersection of the two planes. From this line, we can obtain a point, say, P₀(2, 1, 0), on the line, and a vector, say, υ→ = (2, 1, 1), that is parallel to the line of intersection. In some cases, we do not need any point on the line of intersection; we only want a vector that is parallel to the line of intersection.

The following result provides such a vector that is parallel to the line of intersection of two planes (see Figure 6.4(I)), where we do not need to solve the system of the two linear equations.
Theorem 6.3.4.
Assume that the two planes given in (6.3.3) are not parallel. Then n₁→ × n₂→ is parallel to the intersection line of the two planes.
Proof
We denote by p₁ and p₂ the two planes and by l the line of intersection of the two planes. Because n₁→ is perpendicular to the plane p₁, it is perpendicular to the line l. Similarly, n₂→ is perpendicular to the line l. Hence, l is perpendicular to n₁→ and n₂→. This implies that l is parallel to n₁→ × n₂→.
By Theorem 6.3.4 , we obtain the following result (see Figure 6.4 (II)).
Theorem 6.3.5.
A vector is parallel to two nonparallel planes if and only if it is parallel to the intersection line of the two
nonparallel planes.
Proof
If a vector is parallel to two nonparallel planes, then it is orthogonal to the normal vectors of the two planes and thus, it is parallel to n₁→ × n₂→. It follows from Theorem 6.3.4 that the vector is parallel to the intersection line of the two planes. Conversely, if a vector is parallel to the intersection line of two planes, then by Definition 6.3.3, the vector is parallel to each of the two planes.
Example 6.3.4.
Find an equation of the line passing through P(1, 1, 1) that is parallel to the planes 2x + y − z = 1 and
x − y + z = − 2.
Analysis
Before we give the solution, we analyze the problem of finding the equation of the line. Recall the steps for finding an equation of a line given in Section 6.2. To establish an equation of a line, we need to find a point on the line and a vector parallel to the line. Because the line passes through P(1, 1, 1), we already have a point on the line. Next, we need a vector that is parallel to the line. Because the line we seek is parallel to the two planes 2x + y − z = 1 and x − y + z = −2, by Theorem 6.3.5, the line is parallel to the intersection line of the two planes. By Theorem 6.3.4, the cross product of the two normal vectors of the two planes is parallel to the intersection line of the two planes and thus, it is parallel to the line we seek. Hence, we can choose a vector that is parallel to the cross product of the two normal vectors.
Solution
Let n₁→ = (2, 1, −1) and n₂→ = (1, −1, 1) be the normal vectors of the two planes. Because n₁→ is not parallel to n₂→, the two planes intersect in a line. By Theorem 6.3.4, n₁→ × n₂→ is parallel to the line of intersection. By computation, we have n₁→ × n₂→ = (0, −3, −3) = −3(0, 1, 1). Let υ→ = (0, 1, 1). Then υ→ ǁ n₁→ × n₂→. By (6.2.1), the equation of the line is

x = 1, y = 1 + t, z = 1 + t.
Example 6.3.5.
Find the equation of the plane passing through (1, −1, −2) and perpendicular to n→ = (2, 1, −1).

Solution

By (6.3.1) and (6.3.2), the equation of the plane is

2x + y − z = (2, 1, −1) ⋅ (1, −1, −2) = 3.
Example 6.3.6.
Find an equation of the plane passing through the three points P₁(−1, 1, 1), P₂(−2, 0, 1), and P₃(0, 1, −2).

Solution

1. Step 1. We can choose one of the three points as a point in the plane, say, P₀(x₀, y₀, z₀) = P₁(−1, 1, 1).

2. Step 2. To find the normal vector n→, we need to find two nonparallel vectors a→ and b→, both of which are parallel to the plane. By Proposition 6.3.1(1), P₁P₂→ = (−1, −1, 0) and P₁P₃→ = (1, 0, −3) are parallel to the plane. It is obvious that P₁P₂→ is not parallel to P₁P₃→ because (−1, −1, 0) ≠ k(1, 0, −3) for any k ∈ ℝ. By computation, we have

P₁P₂→ × P₁P₃→ = | i→  j→  k→ |   ( | −1  0 |    | −1  0 |  | −1 −1 | )
                | −1  −1   0 | = ( |  0 −3 |, − |  1 −3 |, |  1  0 | ) = (3, −3, 1).
                |  1   0  −3 |

We choose n→ = P₁P₂→ × P₁P₃→ = (3, −3, 1) as the normal vector of the plane.

3. Step 3. By (6.3.1) and (6.3.2), the equation of the plane is

3x − 3y + z = (3, −3, 1) ⋅ (−1, 1, 1) = −5.
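The three steps of Example 6.3.6 generalize to any three non-collinear points: form two edge vectors, take their cross product as the normal, and read off d from (6.3.2). A sketch (Python, not from the text):

```python
def cross(a, b):
    """Cross product of two 3-vectors, following (6.1.3)."""
    return (a[1] * b[2] - a[2] * b[1],
            -(a[0] * b[2] - a[2] * b[0]),
            a[0] * b[1] - a[1] * b[0])

def plane_through(p1, p2, p3):
    """Return (a, b, c, d) with ax + by + cz = d the plane through
    three given points (assumed non-collinear)."""
    u = tuple(x - y for x, y in zip(p2, p1))   # edge vector p1 -> p2
    v = tuple(x - y for x, y in zip(p3, p1))   # edge vector p1 -> p3
    n = cross(u, v)                            # normal vector
    d = sum(ni * pi for ni, pi in zip(n, p1))  # formula (6.3.2)
    return n + (d,)

# The points of Example 6.3.6 recover the plane 3x - 3y + z = -5.
coeffs = plane_through((-1, 1, 1), (-2, 0, 1), (0, 1, -2))
```

Any of the three points may serve as p1; the resulting equation differs at most by an overall nonzero factor.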
Example 6.3.7.
(6.3.5)

x = 1 + t, y = −1 − t, z = −3 + 2t

Analysis

To establish an equation of a plane, we need to find a point in the plane and a normal vector of the plane. From the given line (6.3.5), we can obtain two things: one is the point (1, −1, −3), obtained when t = 0, and the other is the vector υ→ = (1, −1, 2), which is parallel to the line, so υ→ is parallel to the plane we seek. By Theorem 6.3.5, the cross product n₁→ × n₂→ of the normal vectors of the two planes is parallel to the intersection line of the two planes. By the assumption, the plane we seek is parallel to the line of intersection of the planes. Hence, n₁→ × n₂→ is parallel to the plane. The cross product υ→ × (n₁→ × n₂→) is a normal vector for the plane.
Solution
Let n₁→ = (1, −1, −1) and n₂→ = (1, −2, 1). By computation, we have

             ( | −1 −1 |    | 1 −1 |  | 1 −1 | )
n₁→ × n₂→ = ( | −2  1 |, − | 1  1 |, | 1 −2 | ) = (−3, −2, −1).

Let υ→ = (1, −1, 2). Then

                     ( | −1  2 |    | 1  2 |  |  1 −1 | )
υ→ × (n₁→ × n₂→) = ( | −2 −1 |, − | −3 −1 |, | −3 −2 | ) = (5, −5, −5) = 5(1, −1, −1).

Let n→ = (1, −1, −1). Then by (6.3.1) and (6.3.2), the equation of the plane is

x − y − z = (1, −1, −1) ⋅ (1, −1, −3) = 5.
Example 6.3.8.
Find an equation for the plane through P( − 1, − 1, − 1) that contains the line of intersection of the
planes x − y − z = 1 and x − 2y + z = 3.
Analysis
From the line of intersection of the planes, we can obtain a point P₀(x₀, y₀, z₀) and a vector υ→. Because the line of intersection of the planes is contained in the plane we seek, P₀(x₀, y₀, z₀) is in the plane and υ→ is parallel to the plane. To find the normal vector, we need another vector that is parallel to the plane. Because P₀(x₀, y₀, z₀) and P(−1, −1, −1) are in the plane, P₀P→ is parallel to the plane. The normal vector is n→ = υ→ × P₀P→. Because we need a point from the line of intersection of the planes, we must solve the system of the two equations.
Solution
The two planes give the system

{ x − y − z = 1
{ x − 2y + z = 3.

The augmented matrix reduces as

(A | b→) = ( 1 −1 −1  1 )  R₁(−1) + R₂  ( 1 −1 −1  1 )  R₂(−1)  ( 1 −1 −1  1 )  R₂(1) + R₁  ( 1  0 −3 −1 )
            ( 1 −2  1  3 )      →        ( 0 −1  2  2 )    →     ( 0  1 −2 −2 )      →        ( 0  1 −2 −2 ).

The system becomes

{ x − 3z = −1
{ y − 2z = −2,

where x and y are basic variables and z is a free variable. Let z = t. Then

x = −1 + 3z = −1 + 3t and y = −2 + 2z = −2 + 2t.

From the above equation of the line, we obtain P₀(x₀, y₀, z₀) = (−1, −2, 0) and υ→ = (3, 2, 1).
Then

P₀P→ = (−1, −1, −1) − (−1, −2, 0) = (0, 1, −1)

and

             ( | 2  1 |    | 3  1 |  | 3 2 | )
υ→ × P₀P→ = ( | 1 −1 |, − | 0 −1 |, | 0 1 | ) = (−3, 3, 3) = −3(1, −1, −1).

Let n→ = (1, −1, −1). Then by (6.3.1) and (6.3.2), the equation of the plane is

x − y − z = (1, −1, −1) ⋅ (−1, −2, 0) = 1.
Example 6.3.9.
Find the equation of the plane passing through the point P( − 1, 1, 1) and perpendicular to the planes
2x − y + z = 1 and x + y − z = 2.
Solution

Let n₁→ = (2, −1, 1) and n₂→ = (1, 1, −1) be the normal vectors of the two given planes and let n→ be the normal vector of the plane we must find. By Definition 6.3.4, n→ is perpendicular to both n₁→ and n₂→ and thus, it is parallel to n₁→ × n₂→. By computation, we have

             ( | −1  1 |    | 2  1 |  | 2 −1 | )
n₁→ × n₂→ = ( |  1 −1 |, − | 1 −1 |, | 1  1 | ) = (0, 3, 3) = 3(0, 1, 1).

Let n→ = (0, 1, 1). Then by (6.3.1) and (6.3.2), the equation of the plane is

y + z = (0, 1, 1) ⋅ (−1, 1, 1) = 2.
We end this section by deriving the formula for the (perpendicular) distance from a point to a plane (see
Figure 6.5 (I)).
Definition 6.3.5.
The (perpendicular) distance D from a point Q to a plane

ax + by + cz = d

is the distance between the point Q and the intersection point of the straight line passing through Q and perpendicular to the plane.
Theorem 6.3.6.
The distance D from a point Q(x₁, y₁, z₁) to the plane ax + by + cz = d is

(6.3.6)

D = | ax₁ + by₁ + cz₁ − d | / √(a² + b² + c²).
Proof
Let n→ = (a, b, c) be the normal vector of the plane and let P(x₀, y₀, z₀) be a point in the plane. Because n→ is parallel to the line passing through Q and perpendicular to the plane, the projection of PQ→ on n→ equals the projection of PQ→ on the line, and the norms of the two projections are the same. Hence, we have

(6.3.7)

D = ǁproj_n→ PQ→ǁ = | n→ ⋅ PQ→ | / ǁn→ǁ
  = | a(x₁ − x₀) + b(y₁ − y₀) + c(z₁ − z₀) | / √(a² + b² + c²)
  = | ax₁ + by₁ + cz₁ − (ax₀ + by₀ + cz₀) | / √(a² + b² + c²)
  = | ax₁ + by₁ + cz₁ − d | / √(a² + b² + c²).
Example 6.3.10.
Solution
By (6.3.6) , we have
The distance between two parallel planes is the distance from any point Q in one plane to the other plane (see Figure 6.5(II)). Hence, to find the distance between two parallel planes, one needs to find a point in one plane. Usually, one takes y = z = 0 and solves the equation of the plane for x. If the solution is denoted by x₀, then the point Q(x₀, 0, 0) is in the plane.
Example 6.3.11.

Find the distance between the two parallel planes

x + y − z = 1 (1)
2x + 2y − 2z = 3. (2)
Solution
In equation (1), we let y = z = 0. Then x = 1 and Q(1, 0, 0) is a point in plane (1). The distance from Q(1, 0, 0) to plane (2) equals the distance between the two planes. By (6.3.6), we have

D = |2(1) + 2(0) − 2(0) − 3| / √(2² + 2² + (−2)²) = 1/√12 = √3/6.
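Formula (6.3.6) is straightforward to script; the following sketch (assuming NumPy is available; the helper name is ours, not the text's) reproduces the computation of Example 6.3.11:

```python
import numpy as np

def point_plane_distance(q, normal, d):
    """Distance from point q to the plane normal . (x, y, z) = d, by (6.3.6)."""
    return abs(np.dot(normal, q) - d) / np.linalg.norm(normal)

# Example 6.3.11: Q(1, 0, 0) and the plane 2x + 2y - 2z = 3.
D = point_plane_distance(np.array([1, 0, 0]), np.array([2, 2, -2]), 3)
print(D)  # |2 - 3| / sqrt(12) = 1/(2*sqrt(3)) ~ 0.2887
```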
Exercises
1. Find a normal vector for each of the following planes.
a. x + 2y − z = 1,
b. 4x − 2y − 3z = 5,
c. − x − 6y + z − 1 = 0,
d. 2x + 3z − 4 = 0.
3. For each pair of the following planes, find a parametric equation of the line of intersection of the two
planes.
a. x − y + z = 2, 2x − 3y + z = −1;
b. 3x − y + 3z = −1, −4x + 2y − 4z = 2;
c. x − 3y − 2z = 4, 3x + 2y − 4z = 6.
4. Find an equation for the line through P(2, − 3, 0) that is parallel to the planes 2x + 2y + z = 2 and
x − 3y = 5.
5. Find an equation for the line through P(2, − 3, 0) that is perpendicular to the two lines
x = −1 + t, y = 2 − 2t, z = 1 − t and x = 1 − t, y = 1 − t, z = 2 + t.
6. Find an equation of the plane passing through the point P and perpendicular to the vector n⃗.
a. P(1, −1, 0), n⃗ = (1, 3, 5)
b. P(1, −1, 3), n⃗ = (−1, 0, −1)
c. P(0, −1, 2), n⃗ = (−1, 1, −1)
d. P(1, 0, 0), n⃗ = (1, 1, −1)
7. Find an equation for the plane through P(2, 7, − 1) that is parallel to the plane 4x − y + 3z = 3.
8. Find an equation of the plane passing through the following given three points:
a. P1(1, 2, −1), P2(2, 3, 1), P3(3, −1, 2)
b. P1(1, 0, 1), P2(0, 1, −1), P3(1, 1, −2)
9. Find an equation for the plane containing the line x = 3 + 6t, y = 4, z = t and that is parallel to the line
of intersection of the planes 2x + y + z = 1 and x − 2y + 3z = 2.
10. Find an equation for the plane through P(1, 4, 4) that contains the line of intersection of the planes
x − y + 3z = 5 and 2x + 2y + 7z = 0.
11. Find an equation of the plane passing through the point Q(1, 2, − 1) and perpendicular to the planes
x + y + z = 2 and − x + 2y + 3z = 5.
12. Find an equation for the plane through P(1, 1, 3) that is perpendicular to the line
x = 2 − 3t, y = 1 + t, z = 2t.
13. Find an equation for the plane that contains the line x = 3 + t, y = 5, z = 5 + 2t, and is perpendicular to
the plane x + y + z = 4.
14. Find the distance from the point to the plane.
a. P(1, − 2, − 1), x − y + 2z = − 1;
b. P(0, 1, 2), 2x + y + 3z = 4;
c. P(1, 0, 1), 2x − 5y + z = 3;
d. P( − 1, − 1, 1), − x + 2y + 3z = 1.
Chapter 7 Bases and dimensions
Let

S = {a⃗1, ⋯, a⃗n} and A = (a⃗1, ⋯, a⃗n)

be the same as in (1.4.2) and (4.1.5), respectively. Let T : ℝⁿ → ℝᵐ be the linear transformation with standard matrix A given by

(7.1.1)

T(X⃗) = AX⃗,

where X⃗ = (x1, x2, ⋯, xn)ᵀ. By (2.2.3), we have

(7.1.2)

AX⃗ = x1a⃗1 + x2a⃗2 + ⋯ + xna⃗n.
By Theorem 4.5.2, we see that the homogeneous system AX⃗ = 0⃗ has either only the zero solution or infinitely many solutions. This, together with (7.1.2), implies that the homogeneous system

(7.1.3)

x1a⃗1 + x2a⃗2 + ⋯ + xna⃗n = 0⃗

has either only the zero solution or infinitely many solutions. Hence, exactly one of the following assertions holds.

1. (7.1.3) has only the zero solution.
2. (7.1.3) has a solution (x1, x2, ⋯, xn) with not all xi equal to zero.
Definition 7.1.1.

If there exist x1, x2, ⋯, xn, not all zero, such that

(7.1.4)

x1a⃗1 + x2a⃗2 + ⋯ + xna⃗n = 0⃗,

then S is said to be a linearly dependent set, or the vectors a⃗1, a⃗2, ⋯, a⃗n are said to be linearly dependent. If S is not linearly dependent, then S is said to be linearly independent.
By Theorems 4.5.2 and 5.2.4 , we obtain the following criteria on linear independence.
Theorem 7.1.1.

The following assertions are equivalent.
1. S is linearly independent.
2. The homogeneous system AX⃗ = 0⃗ has only the zero solution.
3. The linear transformation T given in (7.1.1) is one-to-one.
4. r(A) = n.
By Theorem 7.1.1, we see that S is linearly dependent if and only if the homogeneous system AX⃗ = 0⃗ has infinitely many solutions, if and only if r(A) < n. Moreover, if n > m, then S is linearly dependent because r(A) ≤ min{m, n} = m < n. If S contains a zero vector, then S is linearly dependent because A contains a zero column and r(A) < n. In particular, {a⃗} is linearly dependent if a⃗ is a zero vector in ℝᵐ, and linearly independent if a⃗ is a nonzero vector in ℝᵐ.
Example 7.1.1.

1. Determine whether S = {a⃗1, a⃗2, a⃗3} is linearly independent, where

a⃗1 = (1, 1, −1), a⃗2 = (2, −1, −11), a⃗3 = (−1, 2, 10).

2. Determine whether S = {a⃗1, a⃗2, a⃗3, a⃗4} is linearly independent, where

a⃗1 = (1, 0, −1), a⃗2 = (2, −1, −10), a⃗3 = (−1, 2, 8), a⃗4 = (0, 2, −5).

Solution

1. Because
applying the row operations R1(−1)+R2 and R1(1)+R3 to

A = ( 1   2  −1 )
    ( 1  −1   2 )
    (−1 −11  10 )

and then R2(−3)+R3, we obtain the row echelon matrix

( 1  2 −1 )
( 0 −3  3 )
( 0  0  0 ),

we have r(A) = 2 < 3. By Theorem 7.1.1, S is linearly dependent.

2. Because S consists of n = 4 vectors in ℝ³ and n = 4 > 3 = m, S is linearly dependent.
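The rank criterion of Theorem 7.1.1 can be checked numerically; a sketch assuming NumPy is available (floating-point rank, so reliable only for small integer matrices like these):

```python
import numpy as np

# Columns of A are the vectors of S in Example 7.1.1(1).
A = np.array([[1, 2, -1],
              [1, -1, 2],
              [-1, -11, 10]])
r = np.linalg.matrix_rank(A)
print(r)  # 2 < n = 3, so S is linearly dependent

# Part (2): four vectors in R^3, so n = 4 > m = 3 forces dependence.
B = np.array([[1, 2, -1, 0],
              [0, -1, 2, 2],
              [-1, -10, 8, -5]])
r2 = np.linalg.matrix_rank(B)
print(r2 < 4)  # True: r(B) <= min(3, 4) = 3 < 4
```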
The following result gives the relation between linear dependence and linear combination.
Theorem 7.1.2.

S is linearly dependent if and only if there exists a vector a⃗i in S such that a⃗i is a linear combination of the rest of the vectors in S.

Proof

By Definition 7.1.1, we see that if S is linearly dependent, then there exist x1, ⋯, xn, not all zero, such that (7.1.4) holds. If we assume that xi is a nonzero number, then (7.1.4) implies that

xi a⃗i = −x1a⃗1 − ⋯ − x_{i−1}a⃗_{i−1} − x_{i+1}a⃗_{i+1} − ⋯ − xna⃗n
and

a⃗i = −(x1/xi)a⃗1 − ⋯ − (x_{i−1}/xi)a⃗_{i−1} − (x_{i+1}/xi)a⃗_{i+1} − ⋯ − (xn/xi)a⃗n.

Hence, a⃗i is a linear combination of a⃗1, ⋯, a⃗_{i−1}, a⃗_{i+1}, ⋯, a⃗n.

Conversely, if a⃗i is a linear combination of a⃗1, ⋯, a⃗_{i−1}, a⃗_{i+1}, ⋯, a⃗n, then there exist y1, ⋯, y_{i−1}, y_{i+1}, ⋯, yn such that

a⃗i = y1a⃗1 + ⋯ + y_{i−1}a⃗_{i−1} + y_{i+1}a⃗_{i+1} + ⋯ + yna⃗n.

If n > m, then by Theorem 7.1.1, S is linearly dependent. If n ≤ m, then for each j ≠ i, we can apply the row operation Rj(−yj) + Ri to the transpose Aᵀ of A. Hence, the ith row of the resulting matrix, denoted by B, is a zero row. Moreover, Aᵀ and B are row equivalent. By Theorem 2.5.5(i) and Theorem 2.5.4, r(A) = r(Aᵀ) = r(B). Because B contains a zero row, r(B) < n. It follows that r(A) < n. By Theorem 7.1.1, S is linearly dependent.
By Theorem 7.1.2 , we obtain the following result, which shows that two nonzero vectors are linearly
dependent if and only if they are parallel.
Corollary 7.1.1.

Let S = {a⃗1, a⃗2} be a set of two nonzero vectors in ℝᵐ. Then S is linearly dependent if and only if a⃗1 and a⃗2 are parallel.
Proof

By Theorem 7.1.2, S is linearly dependent if and only if there exists k ∈ ℝ such that a⃗1 = ka⃗2 or a⃗2 = ka⃗1. By Definition 1.1.6, the latter holds if and only if a⃗1 and a⃗2 are parallel.
The following result shows that if S is linearly dependent, then the set obtained by adding any finitely many vectors to S is linearly dependent.

Corollary 7.1.2.

Let T = {b⃗1, ⋯, b⃗k} ⊂ ℝᵐ. If S is linearly dependent, then

S ∪ T := {a⃗1, ⋯, a⃗n, b⃗1, ⋯, b⃗k}

is linearly dependent.
Proof

Because S is linearly dependent, one of the vectors in S, say a⃗i, is a linear combination of the rest of the vectors in S. By Theorem 1.4.2, a⃗i is a linear combination of the rest of the vectors in S and all vectors in T. It follows that one of the vectors in S ∪ T is a linear combination of the rest of the vectors in S ∪ T. The result follows from Theorem 7.1.2.
Example 7.1.2.
Determine whether {a⃗1, a⃗2, a⃗3} is linearly dependent, where

a⃗1 = (2, 4, 6), a⃗2 = (1, 0, 1), a⃗3 = (1, 2, 3).

Solution

Because a⃗1 = 2a⃗3, {a⃗1, a⃗3} is linearly dependent. By Corollary 7.1.2, {a⃗1, a⃗2, a⃗3} is linearly dependent.
Theorem 7.1.3.

Let A = (a⃗1, ⋯, a⃗n) be the same as in (4.1.5). Let b⃗ ∈ ℝᵐ. If b⃗ is a linear combination of a⃗1, a⃗2, ⋯, a⃗n, then

{a⃗1, a⃗2, ⋯, a⃗n, b⃗}

is linearly dependent.
The following result shows that any nonempty subset of a set of linearly independent vectors is linearly independent.

Theorem 7.1.4.

Let S be linearly independent and let S1 be a nonempty subset of S. Then S1 is linearly independent.
Proof
The proof is by contradiction. Assume that S1 is not linearly independent, that is, S1 is linearly dependent. By Theorem 7.1.2, one of the vectors in S1 is a linear combination of the rest of the vectors in S1. By Corollary 7.1.2, S is linearly dependent, which contradicts the assumption that S is linearly independent. The contradiction shows that the assumption that S1 is linearly dependent is false. Hence, S1 is linearly independent.
Example 7.1.3.

Let e⃗1 = (1, 0, 0) and e⃗3 = (0, 0, 1). Then {e⃗1, e⃗3} is linearly independent.

Solution

Let I3 = (e⃗1 e⃗2 e⃗3), where e⃗2 = (0, 1, 0). Then r(I3) = 3. It follows from Theorem 7.1.1 that S = {e⃗1, e⃗2, e⃗3} is linearly independent. By Theorem 7.1.4, S1 = {e⃗1, e⃗3} is linearly independent.
If n = m, then by Theorems 4.5.3 and 7.1.1, we obtain the following result, which uses determinants to determine linear independence.

Theorem 7.1.5.

Let A be an n × n matrix and let S be the set of column vectors of A. Then the following assertions hold.

1. S is linearly independent if and only if |A| ≠ 0.
2. S is linearly dependent if and only if |A| = 0.
Example 7.1.4.

Let e⃗1, e⃗2, ⋯, e⃗n be the standard vectors in ℝⁿ given in (1.1.1). Show that S = {e⃗1, e⃗2, ⋯, e⃗n} is linearly independent.

Solution

Because (e⃗1 e⃗2 ⋯ e⃗n) = In and |In| = 1 ≠ 0, it follows from Theorem 7.1.5 that S is linearly independent.
Example 7.1.5.
Determine whether S given in Example 7.1.1 is linearly independent by using Theorem 7.1.5 .
Solution
Because, by the row operations R1(−1)+R2 and R1(1)+R3,

|A| = |1 2 −1; 1 −1 2; −1 −11 10| = |1 2 −1; 0 −3 3; 0 −9 9| = 0,

it follows from Theorem 7.1.5 that S is linearly dependent.
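The determinant test of Theorem 7.1.5 can also be verified numerically; a sketch assuming NumPy is available (floating-point determinant, hence the tolerance):

```python
import numpy as np

# Columns of A are the vectors of S in Example 7.1.1(1) / Example 7.1.5.
A = np.array([[1, 2, -1],
              [1, -1, 2],
              [-1, -11, 10]], dtype=float)
det = np.linalg.det(A)
print(abs(det) < 1e-9)  # True: |A| = 0, so S is linearly dependent
```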
Exercises
1. Let a⃗1 = (2, 4, 6)ᵀ, a⃗2 = (0, 0, 0)ᵀ, and a⃗3 = (1, 2, 3)ᵀ. Determine whether {a⃗1, a⃗2, a⃗3} is linearly dependent.
2. Determine whether {a⃗1, a⃗2, a⃗3, a⃗4} is linearly dependent, where

a⃗1 = (1, 2, 1, 0)ᵀ, a⃗2 = (1, 0, 1, 0)ᵀ, a⃗3 = (0, 2, 0, 0)ᵀ, a⃗4 = (0, 2, 0, 1)ᵀ.

3. Determine whether S = {a⃗1, a⃗2, a⃗3, a⃗4} is linearly independent, where

a⃗1 = (1, 3, −4)ᵀ, a⃗2 = (7, 9, 8)ᵀ, a⃗3 = (−1, −2, 1)ᵀ, a⃗4 = (6, −2, 5)ᵀ.

4. Determine whether S = {a⃗1, a⃗2, a⃗3} is linearly independent, where

a⃗1 = (−1, 0, −1)ᵀ, a⃗2 = (3, −1, −6)ᵀ, a⃗3 = (1, 2, 3)ᵀ.
7.2 Bases of spanning and solution spaces
Let

(7.2.1)

S = {a⃗1, a⃗2, ⋯, a⃗n}

be a set of vectors in ℝᵐ. In Section 1.4, we introduced the following spanning space (see Definition 1.4.2):

(7.2.2)

V := span S = {x1a⃗1 + x2a⃗2 + ⋯ + xna⃗n : xi ∈ ℝ, i ∈ In}.

In this spanning space, the vectors a⃗1, a⃗2, ⋯, a⃗n are not required to be linearly independent. In some cases, we are interested in the case when these vectors are linearly independent. Hence, we introduce the notion of a basis.
Definition 7.2.1.

Assume that S is linearly independent. Then S is said to be a basis of span S, and the number n of vectors in S is called the dimension of span S, denoted by dim(span S) = n.
Example 7.2.1.

Let

(7.2.3)

E := {e⃗1, e⃗2, ⋯, e⃗n}

be the set of standard vectors in ℝⁿ given in (1.1.1) or Example 7.1.4. Then the following assertions hold.

1. ℝⁿ = span E.
2. E is linearly independent.
3. dim(ℝⁿ) = n.

Solution

(1) follows from (1.4.9) and (2) follows from Example 7.1.4. By Definition 7.2.1, the set E of the standard vectors constitutes a basis of ℝⁿ, dim(ℝⁿ) = n, and (3) holds.
The basis E in (7.2.3) is called the standard basis of ℝⁿ. Because the dimension of ℝⁿ is n, we call ℝⁿ an n-dimensional (Euclidean) space.

Let

(7.2.4)

A = (a⃗1 a⃗2 ⋯ a⃗n).
By Definition 7.2.1 and Theorems 7.1.1 and 7.1.5 , we obtain the following result.
Theorem 7.2.1.

Let S and A be the same as in (7.2.1) and (7.2.4). Then the following assertions hold.

1. S is a basis of span S if and only if r(A) = n.
2. If n > m, then S is not a basis of span S.
Example 7.2.2.
Example 7.2.2.

1. Let a⃗1 = (1, 0, 1)ᵀ and a⃗2 = (1, 1, 1)ᵀ. Show that S = {a⃗1, a⃗2} is a basis of span S and dim(span S) = 2.

2. Let a⃗1 = (1, 0)ᵀ, a⃗2 = (1, 1)ᵀ, and a⃗3 = (2, −3)ᵀ. Determine whether S = {a⃗1, a⃗2, a⃗3} is a basis of span S.

Solution

1. Let A = (a⃗1 a⃗2). Applying R1(−1)+R2 to

Aᵀ = (1 0 1; 1 1 1) gives (1 0 1; 0 1 0),

and r(Aᵀ) = 2. By Theorem 2.5.5, r(A) = r(Aᵀ) = 2 and by Theorem 7.2.1, S is a basis of span S. By Definition 7.2.1, dim(span S) = 2.
2. Note that n = 3 and m = 2. Because n > m, it follows from Theorem 7.2.1 that S is not a basis
of span S.
Theorem 7.2.2.

Assume that S is a basis of span S. Then for each b⃗ ∈ span S, there exists a unique vector (x1, x2, ⋯, xn) such that

(7.2.5)

b⃗ = x1a⃗1 + x2a⃗2 + ⋯ + xna⃗n.

Proof

Let A = (a⃗1 a⃗2 ⋯ a⃗n). By Theorem 7.2.1, r(A) = n ≤ m. Because b⃗ ∈ span S, by (7.2.2), the system AX⃗ = b⃗ has at least one solution. Because r(A) = n, the solution is unique. Hence, there exists a unique (x1, x2, ⋯, xn) such that (7.2.5) holds.
Note that the bases of span S are not unique. For example, if S = {a⃗1, a⃗2, ⋯, a⃗n} is a basis of span S and βi ≠ 0 for i = 1, 2, ⋯, n, then

T := {β1a⃗1, β2a⃗2, ⋯, βna⃗n}

is also a basis of span S.
Now, we assume that S is a basis and that {b⃗1, b⃗2, ⋯, b⃗k} ⊂ span S is another basis such that

(7.2.6)

span{a⃗1, a⃗2, ⋯, a⃗n} = span{b⃗1, b⃗2, ⋯, b⃗k}.

The question is: if (7.2.6) holds, is k equal to n? Equivalently, is the dimension of a spanning space unique?

In the following, we shall prove k = n, giving a positive answer to the above question.
Lemma 7.2.1.

Let S be the same as in (7.2.1) and let T = {b⃗1, b⃗2, ⋯, b⃗k} be a set of vectors in span S. Let A_{m×n} = (a⃗1 a⃗2 ⋯ a⃗n) and B_{m×k} = (b⃗1 b⃗2 ⋯ b⃗k). Then the following assertions hold.

i. There exists an n × k matrix C_{n×k} such that

(7.2.7)

B_{m×k} = A_{m×n}C_{n×k}.

ii. If k = n and T is linearly independent, then the matrix C_{n×n} in (7.2.7) is invertible.
Proof
i. Because b⃗i ∈ span S for i ∈ Ik, by (2.2.3), there exists a vector C⃗i := (c1i, c2i, ⋯, cni)ᵀ such that

(7.2.8)

b⃗i = c1ia⃗1 + c2ia⃗2 + ⋯ + cnia⃗n = AC⃗i.

Let C_{n×k} = (C⃗1 C⃗2 ⋯ C⃗k). By (2.2.6) and (7.2.8), we have

B_{m×k} = (b⃗1 b⃗2 ⋯ b⃗k) = (A_{m×n}C⃗1 A_{m×n}C⃗2 ⋯ A_{m×n}C⃗k) = A_{m×n}C_{n×k}.

ii. Because S is linearly independent, by Theorem 7.2.1, there exists a unique vector C⃗i := (c1i, c2i, ⋯, cni)ᵀ such that (7.2.8) holds for i ∈ In, and (7.2.7) with k = n holds. We prove that C_{n×n} is invertible by contradiction. If not, then there exists a nonzero vector X⃗0 ∈ ℝⁿ such that C_{n×n}X⃗0 = 0⃗. Because B_{m×n} = A_{m×n}C_{n×n}, it follows that

B_{m×n}X⃗0 = A_{m×n}C_{n×n}X⃗0 = A0⃗ = 0⃗.

By Theorem 7.1.1, T is linearly dependent, which contradicts the hypothesis that T is linearly independent. Hence, C_{n×n} is invertible.
Lemma 7.2.2.

Let S and T be the same as in Lemma 7.2.1. Assume that S is linearly independent. Then the following assertions hold.

i. If k > n, then T is linearly dependent.
ii. If T is linearly independent, then k ≤ n.
Proof

Note that (i) and (ii) are equivalent, so we only prove (i). Let

A_{m×n} = (a⃗1 a⃗2 ⋯ a⃗n) and B_{m×k} = (b⃗1 b⃗2 ⋯ b⃗k).

By Lemma 7.2.1(i), there exists an n × k matrix C_{n×k} such that B_{m×k} = A_{m×n}C_{n×k}. Because k > n, it follows from Theorem 4.5.2(2) that the homogeneous system C_{n×k}X⃗ = 0⃗ has infinitely many solutions. For every such solution X⃗,

B_{m×k}X⃗ = A_{m×n}C_{n×k}X⃗ = A_{m×n}0⃗ = 0⃗,

so the homogeneous system B_{m×k}X⃗ = 0⃗ has infinitely many solutions. By Theorem 7.1.1, T is linearly dependent.
By Lemma 7.2.2 (ii), we see that the number of linearly independent vectors in span S is less than or
equal to the number of vectors in S.
The following result shows that if T is linearly independent and span T = span S, then k = n.
Theorem 7.2.3.

Let S and T be the same as in Lemma 7.2.1. Assume that S and T are linearly independent. Then k = n if and only if

(7.2.9)

span{a⃗1, a⃗2, ⋯, a⃗n} = span{b⃗1, b⃗2, ⋯, b⃗k}.
Proof

Assume that (7.2.9) holds. We prove k = n. By (7.2.9), T is a set of vectors in span S. By Lemma 7.2.2(ii), k ≤ n. Similarly, by (7.2.9), S is a set of vectors in span T, and by Lemma 7.2.2(ii), n ≤ k. Hence, k = n.

Now, we assume that k = n. In this case, T := {b⃗1, b⃗2, ⋯, b⃗n} is a set of vectors in the spanning space span S. We prove that (7.2.9) holds. Because b⃗i ∈ span S for each i ∈ {1, 2, ⋯, n}, every vector b⃗ ∈ span T satisfies b⃗ ∈ span S. We now prove that for each b⃗ ∈ span S, b⃗ ∈ span T. Let

A_{m×n} = (a⃗1 a⃗2 ⋯ a⃗n) and B_{m×n} = (b⃗1 b⃗2 ⋯ b⃗n).

For each b⃗ ∈ span S, by (7.2.2), there exists a vector Y⃗ := (y1, y2, ⋯, yn) ∈ ℝⁿ such that A_{m×n}Y⃗ = b⃗. By Lemma 7.2.1(ii), there exists a unique invertible n × n matrix C_{n×n} such that A_{m×n} = B_{m×n}C⁻¹_{n×n}. Let X⃗ = C⁻¹_{n×n}Y⃗. Then

B_{m×n}X⃗ = B_{m×n}C⁻¹_{n×n}Y⃗ = A_{m×n}Y⃗ = b⃗.

This implies b⃗ ∈ span T. Hence, span T = span S.
By Theorem 7.2.3 , we see that the dimension of span S is independent of the choices of linearly
independent vectors whose spanning space is span S and thus, the dimension of a spanning space is
unique. Moreover, the spanning space of any n linearly independent vectors in span S is equal to span S.
Theorem 7.2.3 requires all vectors of T to be in span S. If this condition is not satisfied, then span T might
not be equal to span S.
Example 7.2.3.
Let S = {(1, 0)ᵀ} and T = {(0, 1)ᵀ}. Show that span T ≠ span S.

Solution

Note that

span S = {(t, 0)ᵀ : t ∈ ℝ} and span T = {(0, s)ᵀ : s ∈ ℝ}.

It is obvious that (0, 1)ᵀ ∉ span S and thus, span T ≠ span S.
However, in Theorem 7.2.3, if m = n, then we do not need to verify the condition that all vectors of T are in span S, because this condition is satisfied automatically.

Corollary 7.2.1.

Let S = {a⃗1, a⃗2, ⋯, a⃗n} and T = {b⃗1, b⃗2, ⋯, b⃗n} be sets of linearly independent vectors in ℝⁿ. Then span T = span S = ℝⁿ.
Proof

By (1.4.9), ℝⁿ = span{e⃗1, e⃗2, ⋯, e⃗n} = span E. Because all vectors of S are in ℝⁿ, they are in span E. By Theorem 7.2.3, span S = span E = ℝⁿ. Similarly, we can prove that span T = ℝⁿ. Hence, span T = span S.

By Corollary 7.2.1, we see that any set of n linearly independent vectors in ℝⁿ constitutes a basis of ℝⁿ.
Theorem 7.2.4.

Let S = {a⃗1, a⃗2, ⋯, a⃗n} be a basis of span S and assume that span S ≠ ℝᵐ. Then S1 = {a⃗1, a⃗2, ⋯, a⃗n, b⃗} is linearly independent for each b⃗ ∉ span S.
Proof

Let A = (a⃗1 a⃗2 ⋯ a⃗n). By Theorem 7.2.1, r(A) = n and

(7.2.10)

n = r(A) ≤ r(A | b⃗) ≤ n + 1.

By (1.4.9), ℝᵐ = span{e⃗1, e⃗2, ⋯, e⃗m}. Because span S ≠ ℝᵐ, by Theorem 7.2.3, n < m and thus n + 1 ≤ m. Let b⃗ ∉ span S. Then the system

(7.2.11)

b⃗ = x1a⃗1 + x2a⃗2 + ⋯ + xna⃗n
has no solutions. By Theorem 4.5.1(3), r(A) < r(A | b⃗). By (7.2.10), r(A | b⃗) = n + 1. By Theorem 7.1.1(i), S1 is linearly independent.
By Theorem 7.2.4, if n < m, then we can extend a set of linearly independent vectors S = {a⃗1, a⃗2, ⋯, a⃗n} to a basis of ℝᵐ. Let A = (a⃗1 a⃗2 ⋯ a⃗n) be the same as in (7.2.4). We use the methods given in Sections 4.7 or 4.8 to find a vector b⃗1 ∈ ℝᵐ such that the system AX⃗ = b⃗1 has no solutions (Section 4.7) or such that b⃗1 ∉ span S (Section 4.8). By Theorem 7.2.4, S1 := {a⃗1, a⃗2, ⋯, a⃗n, b⃗1} is linearly independent. If n + 1 = m, then S1 is a basis of ℝᵐ. If n + 1 < m, then repeating the process with S1, we obtain a vector b⃗2 such that b⃗2 ∉ span S1 and S2 = S1 ∪ {b⃗2} is linearly independent. We repeat the process until we find S_{m−n}.
Example 7.2.4.

Let a⃗1 = (1, 0, 0, 1)ᵀ and a⃗2 = (1, −1, 1, −1)ᵀ. Find a vector b⃗ ∈ ℝ⁴ such that S = {a⃗1, a⃗2, b⃗} is a basis of span S.

Solution

Let A = (a⃗1 a⃗2) and b⃗ = (b1, b2, b3, b4)ᵀ. Then
(A | b⃗) = (1 1 b1; 0 −1 b2; 0 1 b3; 1 −1 b4). Applying R1(−1)+R4 gives

(1 1 b1; 0 −1 b2; 0 1 b3; 0 −2 b4 − b1),

and then R2(−2)+R4 and R2(1)+R3 give

(B | c⃗) = (1 1 b1; 0 −1 b2; 0 0 b3 + b2; 0 0 b4 − b1 − 2b2).

If either b3 + b2 ≠ 0 or b4 − b1 − 2b2 ≠ 0, then r(A) = 2 and r(A | b⃗) ≥ 3. Hence, r(A) < r(A | b⃗). It follows from Theorem 4.5.1(3) that if either b3 + b2 ≠ 0 or b4 − b1 − 2b2 ≠ 0, then the system has no solutions. Hence we can choose b1 = b4 = 0, b2 = b3 = 1. Then b3 + b2 ≠ 0. Let b⃗ = (0, 1, 1, 0)ᵀ. By Theorem 7.2.4, S = {a⃗1, a⃗2, b⃗} is a basis of span S.
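The choice b⃗ = (0, 1, 1, 0)ᵀ can be verified with ranks: b⃗ lies outside span{a⃗1, a⃗2} exactly when appending it raises the rank. A sketch assuming NumPy is available:

```python
import numpy as np

# Example 7.2.4: does b = (0, 1, 1, 0)^T lie outside span{a1, a2}?
a1 = np.array([1, 0, 0, 1])
a2 = np.array([1, -1, 1, -1])
b = np.array([0, 1, 1, 0])

rA = np.linalg.matrix_rank(np.column_stack([a1, a2]))
rAb = np.linalg.matrix_rank(np.column_stack([a1, a2, b]))
print(rA, rAb)  # 2 3: appending b raises the rank, so {a1, a2, b} is independent
```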
Theorem 7.2.5.

Let S and A be the same as in (7.2.1) and (7.2.4). Let T = {b⃗n₁, b⃗n₂, ⋯, b⃗nᵣ} be a set of r vectors in span S. Assume that T is linearly independent and r(A) = r. Then the following assertions hold.
i. span T = span S.
ii. Each of the vectors in S but not in T is a linear combination of vectors in T.
Proof

i. It is obvious that every vector in span T is in span S. We prove that every vector in span S belongs to span T. If not, then there exists a vector b⃗ ∈ span S with b⃗ ∉ span T. Because T is linearly independent, by Theorem 7.2.4, the set

{b⃗n₁, b⃗n₂, ⋯, b⃗nᵣ, b⃗}

is linearly independent. Hence, we have r(A) ≥ r + 1, which contradicts the condition r(A) = r.

ii. For every vector b⃗ in S, because span T = span S, b⃗ ∈ span T. Hence, b⃗ is a linear combination of vectors in T. In particular, result (ii) holds.
The following result provides a method to choose basis vectors from span{a⃗1, a⃗2, ⋯, a⃗n}.

Theorem 7.2.6.

Let {a⃗1, a⃗2, ⋯, a⃗n} be a set of vectors in ℝᵐ and let A be the same as in (7.2.4). Assume that r(A) = r ≥ 1. Then there exists a subset {a⃗n₁, a⃗n₂, ⋯, a⃗nᵣ} of {a⃗1, a⃗2, ⋯, a⃗n} such that

span{a⃗1, a⃗2, ⋯, a⃗n} = span{a⃗n₁, a⃗n₂, ⋯, a⃗nᵣ}.
Proof

Let S be the same as in (7.2.1). Because r(A) = r ≥ 1, there is at least one nonzero vector in S. We choose one nonzero vector in S, denoted by a⃗n₁. Let In := {1, 2, ⋯, n}, and we use the notation i ∈ In\{n1} to mean that i ∈ In with i ≠ n1.

1. If a⃗i ∈ span{a⃗n₁} for i ∈ In\{n1}, then there exists yi ∈ ℝ such that

(7.2.12)

yia⃗n₁ = a⃗i.

Let T = {a⃗n₁}. Then span S = span T. Indeed, it is obvious that span T is a subset of span S. We prove that span S is a subset of span T. Let b⃗ ∈ span S. Then there exist x1, x2, ⋯, xn ∈ ℝ such that

b⃗ = x1a⃗1 + x2a⃗2 + ⋯ + xn₁a⃗n₁ + ⋯ + xna⃗n
  = x1y1a⃗n₁ + x2y2a⃗n₁ + ⋯ + xn₁a⃗n₁ + ⋯ + xnyna⃗n₁
  = (x1y1 + x2y2 + ⋯ + xn₁ + ⋯ + xnyn)a⃗n₁.

This shows b⃗ ∈ span T and span S = span T. Now, we prove that r(A) = 1. By (7.2.12), a⃗1 is a linear combination of a⃗n₁. By Theorem 7.1.3, {a⃗n₁, a⃗1} is linearly dependent and

r((a⃗n₁ a⃗1)) = r((a⃗n₁)) = 1.

Because y2a⃗n₁ = a⃗2, by Theorem 1.4.2, a⃗2 is a linear combination of a⃗n₁ and a⃗1. By Theorem 7.1.3, {a⃗n₁, a⃗1, a⃗2} is linearly dependent and

r((a⃗n₁ a⃗1 a⃗2)) = r((a⃗n₁ a⃗1)) = r((a⃗n₁)) = 1.

Continuing in this way, we obtain

r((a⃗n₁ a⃗1 a⃗2 ⋯ a⃗_{n1−1} a⃗_{n1+1} ⋯ a⃗n)) = 1.

Let B = (a⃗n₁ a⃗1 a⃗2 ⋯ a⃗_{n1−1} a⃗_{n1+1} ⋯ a⃗n). By the third row operations, it is easy to prove r(Bᵀ) = r(Aᵀ). By Theorem 2.5.5, r(B) = r(A) = 1.
2. In the above, we have proved that if a⃗i ∈ span{a⃗n₁} for each i ∈ In := {1, 2, ⋯, n} with i ≠ n1, then r(A) = 1. Hence, if r(A) > 1, there exists a⃗n₂ ∈ S\{a⃗n₁} such that a⃗n₂ ∉ span{a⃗n₁}. By Theorem 7.2.4, {a⃗n₁, a⃗n₂} is linearly independent and by Theorem 7.1.1, r((a⃗n₁ a⃗n₂)) = 2. Without loss of generality, we assume that n1 < n2. If a⃗i ∈ span{a⃗n₁, a⃗n₂} for each i ∈ In\{n1, n2}, then for each i ∈ In\{n1, n2}, there exist yi1, yi2 ∈ ℝ such that

(7.2.13)

yi1a⃗n₁ + yi2a⃗n₂ = a⃗i.

Let T = {a⃗n₁, a⃗n₂}. Then span S = span T. We prove that span S is a subset of span T. Let b⃗ ∈ span S. Then there exist x1, x2, ⋯, xn ∈ ℝ such that

b⃗ = x1a⃗1 + x2a⃗2 + ⋯ + xn₁a⃗n₁ + ⋯ + xn₂a⃗n₂ + ⋯ + xna⃗n
  = Σ_{i=1, i≠n1,n2}^{n} xia⃗i + xn₁a⃗n₁ + xn₂a⃗n₂
  = Σ_{i=1, i≠n1,n2}^{n} xi(yi1a⃗n₁ + yi2a⃗n₂) + xn₁a⃗n₁ + xn₂a⃗n₂
  = [ (Σ_{i=1, i≠n1,n2}^{n} xiyi1) + xn₁ ] a⃗n₁ + [ (Σ_{i=1, i≠n1,n2}^{n} xiyi2) + xn₂ ] a⃗n₂.

This shows b⃗ ∈ span T and span S = span T. Now, we prove that r(A) = 2. By (7.2.13), a⃗1 is a linear combination of a⃗n₁ and a⃗n₂. By Theorem 7.1.3, {a⃗n₁, a⃗n₂, a⃗1} is linearly dependent and

r((a⃗n₁ a⃗n₂ a⃗1)) = r((a⃗n₁ a⃗n₂)) = 2.

Because y21a⃗n₁ + y22a⃗n₂ = a⃗2, by Theorem 1.4.2, a⃗2 is a linear combination of a⃗n₁, a⃗n₂, a⃗1. By Theorem 7.1.3, {a⃗n₁, a⃗n₂, a⃗1, a⃗2} is linearly dependent and

r((a⃗n₁ a⃗n₂ a⃗1 a⃗2)) = r((a⃗n₁ a⃗n₂ a⃗1)) = r((a⃗n₁ a⃗n₂)) = 2.

Continuing in this way, we obtain

r((a⃗n₁ a⃗n₂ a⃗1 a⃗2 ⋯ a⃗_{n1−1} a⃗_{n1+1} ⋯ a⃗_{n2−1} a⃗_{n2+1} ⋯ a⃗n)) = 2.

Let B = (a⃗n₁ a⃗n₂ a⃗1 a⃗2 ⋯ a⃗_{n1−1} a⃗_{n1+1} ⋯ a⃗_{n2−1} a⃗_{n2+1} ⋯ a⃗n). Then, by the third row operations, it is easy to prove r(Bᵀ) = r(Aᵀ). By Theorem 2.5.5, r(B) = r(A) = 2. Repeating this argument, we obtain r linearly independent vectors a⃗n₁, a⃗n₂, ⋯, a⃗nᵣ in S with span{a⃗n₁, a⃗n₂, ⋯, a⃗nᵣ} = span S.
Finally, we apply linear independence to study bases of the solution spaces of the homogeneous system AX⃗ = 0⃗.
Theorem 7.2.7.

Let A be the same as in (7.2.4). Assume that null(A) = k. Then the basis vectors υ⃗1, υ⃗2, ⋯, υ⃗k given in Definition 4.3.1 are linearly independent.
Proof

Suppose that

(7.2.14)

x1υ⃗1 + x2υ⃗2 + ⋯ + xkυ⃗k = 0⃗.

Let υ⃗ = x1υ⃗1 + x2υ⃗2 + ⋯ + xkυ⃗k. By Theorem 4.3.1(P1), the components of υ⃗ in positions n1, n2, ⋯, nk (1 ≤ n1 < n2 < ⋯ < nk ≤ n) are x1, x2, ⋯, xk, respectively. By (7.2.14), υ⃗ = 0⃗ implies that x1 = x2 = ⋯ = xk = 0 and thus, υ⃗1, υ⃗2, ⋯, υ⃗k are linearly independent.
By Theorem 7.2.7 and Definition 7.2.1, we see that the set of vectors υ⃗1, υ⃗2, ⋯, υ⃗k given in Definition 4.3.1 is a basis of the solution space N_A. As mentioned above, the bases of a spanning space are not unique; hence, the solution space N_A has many other bases.
By Theorems 7.2.3 and 7.2.7 , we will prove the following result, which shows that the number of
vectors in all bases of N A is equal to the nullity null(A) of the matrix A.
Theorem 7.2.8.
Let υ⃗1, υ⃗2, ⋯, υ⃗k be the basis vectors of N_A given in Definition 4.3.1 and let u⃗1, u⃗2, ⋯, u⃗l be vectors in N_A. Assume that the following conditions hold.

1. u⃗1, u⃗2, ⋯, u⃗l are linearly independent.
2. N_A = span{u⃗1, u⃗2, ⋯, u⃗l}.

Then l = k.
Proof

By Theorem 7.2.7, the basis vectors υ⃗1, υ⃗2, ⋯, υ⃗k are linearly independent. By conditions (1) and (2) and Theorem 7.2.3, l = k.
Theorem 7.2.8 shows that any set of k linearly independent vectors in N_A = span{υ⃗1, υ⃗2, ⋯, υ⃗k} constitutes a basis of N_A. We shall use Theorem 7.2.8 in the proof of Theorem 8.2.4.
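Numerically, null(A) = n − r(A), and a basis of N_A can be read off a singular value decomposition: the right singular vectors belonging to (near-)zero singular values span the null space. A sketch assuming NumPy is available (this is a numerical method, not the row-reduction construction of Definition 4.3.1):

```python
import numpy as np

# Matrix from Example 7.1.1(1); r(A) = 2, n = 3, so null(A) = 1.
A = np.array([[1, 2, -1],
              [1, -1, 2],
              [-1, -11, 10]], dtype=float)

_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
nullity = A.shape[1] - rank
basis = Vt[rank:]          # each row is a basis vector of N_A

print(nullity)                      # 1 = 3 - r(A)
print(np.allclose(A @ basis.T, 0))  # True: every basis vector solves AX = 0
```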
Exercises
1. Let a⃗1 = (1, −1, 1, 0) and a⃗2 = (1, 0, 1, 1). Determine whether S = {a⃗1, a⃗2} is a basis of span S. If it is, find dim(span S).

2. Determine whether S = {a⃗1, a⃗2, a⃗3} is a basis of span S, where

i. a⃗1 = (1, 0, −1, 1), a⃗2 = (0, 0, 1, 1), and a⃗3 = (−2, 1, 0, 1).
ii. a⃗1 = (1, 0, 1, 1), a⃗2 = (0, 0, 1, −1), and a⃗3 = (1, 0, 2, 0).

3. Let a⃗1 = (1, −1), a⃗2 = (0, 1), a⃗3 = (−3, 2), and a⃗4 = (1, 0). Determine whether S = {a⃗1, a⃗2, a⃗3, a⃗4} is a basis of span S.

4. Let a⃗1 = (1, 0, 1) and a⃗2 = (1, 1, 1). Find a vector b⃗ such that S = {a⃗1, a⃗2, b⃗} is a basis of span S.
7.3 Methods of finding bases of spanning spaces
Definition 7.3.1.

The spanning space span S is said to be the column space of A, and to be the row space of B, where B = Aᵀ is the matrix whose ith row is a⃗iᵀ. We denote by C_A and R_B the column space of A and the row space of B, respectively. Then

(7.3.1)

C_A = R_B = span S = A(ℝⁿ) = T(ℝⁿ).

If the set S given in (7.2.1) is not linearly independent, then it is not a basis of span S. Our purpose is to find linearly independent vectors T = {b⃗1, b⃗2, ⋯, b⃗r} in ℝᵐ such that span T = span S, that is,

(7.3.2)

span{b⃗1, b⃗2, ⋯, b⃗r} = span{a⃗1, a⃗2, ⋯, a⃗n}.
Usually, we seek bases with all the vectors satisfying one of the following requirements:

1. all the basis vectors belong to S;
2. all the basis vectors are row vectors of a row echelon matrix.

We first prove two results, which provide the methods to find basis vectors that satisfy requirement (1) or (2). We start with bases that satisfy requirement (1). Let

(7.3.3)

C = (C⃗1 C⃗2 ⋯ C⃗n)

be a row echelon matrix and C⃗i the ith column vector of C.
By Theorems 7.1.1 and 7.2.5, we prove the following result, which shows that the column vectors containing the leading entries of a row echelon matrix C form a basis for the column space of C.

Lemma 7.3.1.

Let C be a row echelon matrix and let C⃗n₁, C⃗n₂, ⋯, C⃗nᵣ be the column vectors of C containing leading entries. Then the following assertions hold.

i. {C⃗n₁, C⃗n₂, ⋯, C⃗nᵣ} is linearly independent.
ii. span{C⃗n₁, C⃗n₂, ⋯, C⃗nᵣ} = span{C⃗1, C⃗2, ⋯, C⃗n}.

Proof

i. Let M = (C⃗n₁ C⃗n₂ ⋯ C⃗nᵣ). By Theorem 2.5.1(4), r(M) = r. This, together with Theorem 7.1.1(1), implies that {C⃗n₁, C⃗n₂, ⋯, C⃗nᵣ} is linearly independent.

ii. By result (i) and Theorem 7.2.5, result (ii) holds.
By Lemma 7.3.1, we prove the following result, which shows that if C is a row echelon matrix of A and a set of column vectors of C forms a basis for the column space of C, then the corresponding column vectors of A form a basis for the column space of A, which is equal to span S.
Theorem 7.3.1.

Let S be the same as in (7.2.1) and A = (a⃗1 a⃗2 ⋯ a⃗n) be the same as in (7.2.4). Let C = (C⃗1 C⃗2 ⋯ C⃗n) be a row echelon matrix of A and let C⃗n₁, C⃗n₂, ⋯, C⃗nᵣ be the column vectors of C containing leading entries. Then the set of the pivot columns of A, that is, {a⃗n₁, a⃗n₂, ⋯, a⃗nᵣ}, is a basis of span S.

Proof

Because C is a row echelon matrix of A and there are r column vectors of C that contain the leading entries of C, by Theorem 2.5.1, r(A) = r(C) = r. By Lemma 7.3.1, we have

(7.3.4)

span{C⃗n₁, C⃗n₂, ⋯, C⃗nᵣ} = span{C⃗1, C⃗2, ⋯, C⃗n}.
It follows from Theorem 2.7.3 that there exists an invertible matrix F such that A = FC. Hence, by (2.2.6), we have

A = (a⃗1 a⃗2 ⋯ a⃗n) = F(C⃗1 C⃗2 ⋯ C⃗n) = (FC⃗1 FC⃗2 ⋯ FC⃗n)

and thus,

(7.3.5)

a⃗i = FC⃗i for i ∈ In.

Let A* = (a⃗n₁ a⃗n₂ ⋯ a⃗nᵣ) and C* = (C⃗n₁ C⃗n₂ ⋯ C⃗nᵣ). By (7.3.5) and (2.2.6),

A* = (a⃗n₁ a⃗n₂ ⋯ a⃗nᵣ) = (FC⃗n₁ FC⃗n₂ ⋯ FC⃗nᵣ) = FC*.

Because F is invertible, by Corollary 2.7.3,

r(A*) = r(C*) = r(C) = r(A) = r.

By Theorem 7.1.1(1), S* = {a⃗n₁, a⃗n₂, ⋯, a⃗nᵣ} is linearly independent. Moreover, by (7.3.4),

span S = span{a⃗1, a⃗2, ⋯, a⃗n} = span{FC⃗1, FC⃗2, ⋯, FC⃗n}
       = F span{C⃗1, C⃗2, ⋯, C⃗n} = F span{C⃗n₁, C⃗n₂, ⋯, C⃗nᵣ}
       = span{FC⃗n₁, FC⃗n₂, ⋯, FC⃗nᵣ} = span S*.

Hence, S* is a basis of span S.
Next, we work with bases that satisfy requirement (2). We first give the following result, which shows that if B and D are row equivalent (see Definition 2.5.2), then all the row vectors of D are contained in the spanning space of the row vectors of B.

Lemma 7.3.2.

Assume that B and D are row equivalent. Then each row of D is a linear combination of the row vectors of B.

Proof

Because B and D are row equivalent, by Theorem 2.7.3, there exists an invertible matrix F such that D = FB. By Theorem 2.2.3, we have

Dᵀ = BᵀFᵀ = AFᵀ,

where A = Bᵀ. Let d⃗i be the ith column vector of Dᵀ (so d⃗iᵀ is the ith row of D) and X⃗i = (x1i, ⋯, xni)ᵀ the ith column vector of Fᵀ for i ∈ In. Then

d⃗i = AX⃗i = x1ia⃗1 + x2ia⃗2 + ⋯ + xnia⃗n,

that is, each row of D is a linear combination of the rows of B.

Theorem 7.3.2.

Assume that D is a row echelon matrix of B. Then the row vectors containing the leading entries of D form a basis for R_B.
Proof
Let $\vec{d}_1, \vec{d}_2, \cdots, \vec{d}_n$ be the row vectors of $D$. Because $D$ is a row-echelon matrix, we may assume that $\vec{d}_1, \vec{d}_2, \cdots, \vec{d}_r$ are the row vectors of $D$ with the leading entries. Then the other rows from $\vec{d}_{r+1}$ to $\vec{d}_n$ are zero rows if $r < n$. Let $D^* = (\vec{d}_1\ \vec{d}_2\ \cdots\ \vec{d}_r)$. Because $r(D^*) = r$, by Theorem 7.1.1, the set $\{\vec{d}_1, \vec{d}_2, \cdots, \vec{d}_r\}$ is linearly independent. Then by Theorem 2.5.4 and Theorem 2.5.1(3),

$r(B) = r(D) = r(D^*) = r.$

By Lemma 7.3.2,

$\vec{d}_i \in \mathrm{span}\{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\} = R_B \quad \text{for } i = 1, 2, \cdots, r,$

where $\vec{a}_1, \cdots, \vec{a}_n$ are the row vectors of $B$. Because $\{\vec{d}_1, \cdots, \vec{d}_r\}$ is a linearly independent subset of $R_B$ containing $r = r(B)$ vectors,

$\mathrm{span}\{\vec{d}_1, \vec{d}_2, \cdots, \vec{d}_r\} = \mathrm{span}\{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$

and $\{\vec{d}_1, \vec{d}_2, \cdots, \vec{d}_r\}$ is a basis of $R_B$.
By Theorems 7.3.1 and 7.3.2 , we see that there are two methods to find the bases.
Method 1. Finding bases of span $S$ satisfying the above requirement (1) is equivalent to finding the pivot columns of $A$ (see Section 2.9). Hence, we use row operations to change $A$ to a row echelon matrix $C = (\vec{C}_1\ \vec{C}_2\ \cdots\ \vec{C}_n)$.
Let $\vec{C}_{n_1}, \vec{C}_{n_2}, \cdots, \vec{C}_{n_r}$ be the column vectors of $C$ containing leading entries. Then, by Theorem 7.3.1, the set of the corresponding pivot columns of $A$,

$\{\vec{a}_{n_1}, \vec{a}_{n_2}, \cdots, \vec{a}_{n_r}\},$

is a basis of span $S$.

Method 2. To find bases of span $S$ that satisfy the above requirement (2), we use row operations to change $B$ to a row echelon matrix $D$. Let $\vec{d}_{m_1}, \vec{d}_{m_2}, \cdots, \vec{d}_{m_r}$ be the row vectors of $D$ containing leading entries. Then, by Theorem 7.3.2,

$\{\vec{d}_{m_1}, \vec{d}_{m_2}, \cdots, \vec{d}_{m_r}\}$

is a basis of span $S$.
Example 7.3.1.

Find a basis for the column space of $C$, where

$C = \begin{pmatrix} 1 & 1 & 2 & 1 \\ 0 & 0 & -1 & -3 \\ 0 & 0 & 0 & 8 \end{pmatrix}.$
Solution
We denote the column vectors of $C$ by $\vec{C}_i$, $i = 1, 2, 3, 4$, that is, $C = (\vec{C}_1\ \vec{C}_2\ \vec{C}_3\ \vec{C}_4)$.
The column vectors of $C$ containing the leading entries are $\vec{C}_1, \vec{C}_3, \vec{C}_4$, and by Theorem 7.3.1, $\{\vec{C}_1, \vec{C}_3, \vec{C}_4\}$ is a basis for the column space of $C$.
Example 7.3.2.
Find a set of column vectors of $A$ that forms a basis for $C_A$, and find $\dim(C_A)$, where

$A = \begin{pmatrix} 1 & 1 & 2 & 1 \\ 1 & 1 & 1 & -2 \\ 2 & 2 & 1 & 1 \end{pmatrix}.$
Solution
We denote the column vectors of $A$ by $\vec{a}_i$, $i = 1, 2, 3, 4$. Then $A = (\vec{a}_1\ \vec{a}_2\ \vec{a}_3\ \vec{a}_4)$. Applying the row operations $R_1(-1)+R_2$, $R_1(-2)+R_3$, and then $R_2(-3)+R_3$, we obtain

$A = \begin{pmatrix} 1 & 1 & 2 & 1 \\ 1 & 1 & 1 & -2 \\ 2 & 2 & 1 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 1 & 2 & 1 \\ 0 & 0 & -1 & -3 \\ 0 & 0 & -3 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 1 & 2 & 1 \\ 0 & 0 & -1 & -3 \\ 0 & 0 & 0 & 8 \end{pmatrix} = C,$

where $C$ is the same as in Example 7.3.1. From the matrix $C$, we see that the column vectors $\vec{C}_1, \vec{C}_3, \vec{C}_4$ contain leading entries of $C$. The pivot column vectors in $A$ are $\vec{a}_1, \vec{a}_3, \vec{a}_4$, that is,
$\vec{a}_1 = \begin{pmatrix}1\\1\\2\end{pmatrix}, \quad \vec{a}_3 = \begin{pmatrix}2\\1\\1\end{pmatrix}, \quad \vec{a}_4 = \begin{pmatrix}1\\-2\\1\end{pmatrix}.$

By Theorem 7.3.1, the column vectors $\vec{a}_1, \vec{a}_3, \vec{a}_4$ of $A$ form a basis of $C_A$, so $\dim(C_A) = 3$, and

$\mathrm{span}\{\vec{a}_1, \vec{a}_3, \vec{a}_4\} = \mathrm{span}\{\vec{a}_1, \vec{a}_2, \vec{a}_3, \vec{a}_4\}.$
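The pivot-column method of Example 7.3.2 is easy to check by machine. The sketch below uses the third-party SymPy library (an assumption; the text itself prescribes hand row reduction) to row reduce $A$ and report the pivot columns:

```python
from sympy import Matrix

# Matrix A from Example 7.3.2; its columns span C_A.
A = Matrix([
    [1, 1, 2, 1],
    [1, 1, 1, -2],
    [2, 2, 1, 1],
])

# rref() returns the reduced row echelon form together with
# the 0-based indices of the pivot columns.
_, pivots = A.rref()
print(pivots)  # (0, 2, 3)
```

SymPy reports 0-based indices, so `(0, 2, 3)` corresponds to the columns $\vec{a}_1, \vec{a}_3, \vec{a}_4$ found above.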
Example 7.3.3.

Find a basis for the row space $R_D$ of $D$, where

$D = \begin{pmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 7 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$
Solution
Note that $D$ is a row echelon matrix. By Theorem 7.3.2, the row vectors with the leading entries in $D$ form a basis for $R_D$. Let $\vec{d}_1 = (1, 0, 0, 4)^T$, $\vec{d}_2 = (0, 1, 0, 7)^T$, and $\vec{d}_3 = (0, 0, 0, -1)^T$. Then $\vec{d}_1, \vec{d}_2, \vec{d}_3$ form a basis of $R_D$.
Example 7.3.4.

Find a basis for the row space $R_B$ of $B$, where

$B = \begin{pmatrix} 1 & -1 & 3 \\ 2 & 0 & 4 \\ -1 & -3 & 1 \end{pmatrix}.$
Solution
Applying the row operations $R_1(-2)+R_2$, $R_1(1)+R_3$, and then $R_2(2)+R_3$, we obtain

$B = \begin{pmatrix} 1 & -1 & 3 \\ 2 & 0 & 4 \\ -1 & -3 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & -1 & 3 \\ 0 & 2 & -2 \\ 0 & -4 & 4 \end{pmatrix} \to \begin{pmatrix} 1 & -1 & 3 \\ 0 & 2 & -2 \\ 0 & 0 & 0 \end{pmatrix} = D.$

Let $\vec{d}_1 = (1, -1, 3)$ and $\vec{d}_2 = (0, 2, -2)$. Then by Theorem 7.3.2, $\vec{d}_1, \vec{d}_2$ are basis vectors for $R_B$. Note that $\vec{d}_1$ is a row vector of $B$, but $\vec{d}_2$ is not a row vector of $B$, so the basis $\{\vec{d}_1, \vec{d}_2\}$ does not satisfy the requirement (1).
Example 7.3.5.
Let $S = \{\vec{a}_1, \vec{a}_2, \vec{a}_3, \vec{a}_4\}$, where

$\vec{a}_1 = \begin{pmatrix}1\\2\\-3\end{pmatrix}, \quad \vec{a}_2 = \begin{pmatrix}-2\\0\\4\end{pmatrix}, \quad \vec{a}_3 = \begin{pmatrix}0\\4\\-2\end{pmatrix}, \quad \vec{a}_4 = \begin{pmatrix}-2\\-4\\6\end{pmatrix}.$
a. Use Theorem 7.3.1 to find a subset of S that forms a basis for span S.
b. Use Theorem 7.3.2 to find another basis for span S.
Solution
a. Because we seek a subset of $S$ that forms a basis for span $S$, we put the vectors in $S$ as the column vectors of a matrix $A$, that is,

$A = \begin{pmatrix} 1 & -2 & 0 & -2 \\ 2 & 0 & 4 & -4 \\ -3 & 4 & -2 & 6 \end{pmatrix}.$

Applying the row operations $R_1(-2)+R_2$, $R_1(3)+R_3$, and then $R_2(\frac{1}{2})+R_3$, we obtain

$A \to \begin{pmatrix} 1 & -2 & 0 & -2 \\ 0 & 4 & 4 & 0 \\ 0 & -2 & -2 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 0 & -2 \\ 0 & 4 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = C.$

The column vectors $\vec{C}_1, \vec{C}_2$ of $C$ contain the leading entries of $C$. The corresponding column vectors of $A$ are $\vec{a}_1, \vec{a}_2$. By Theorem 7.3.1, $\{\vec{a}_1, \vec{a}_2\}$ forms a basis of span $S$.
b. We define a matrix $B$ by putting the vectors in $S$ as its row vectors:

$B = \begin{pmatrix} 1 & 2 & -3 \\ -2 & 0 & 4 \\ 0 & 4 & -2 \\ -2 & -4 & 6 \end{pmatrix}.$

Applying the row operations $R_1(2)+R_2$, $R_1(2)+R_4$, and then $R_2(-1)+R_3$, we obtain

$B \to \begin{pmatrix} 1 & 2 & -3 \\ 0 & 4 & -2 \\ 0 & 4 & -2 \\ 0 & 0 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & -3 \\ 0 & 4 & -2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = D.$

Let $\vec{d}_1 = (1, 2, -3)$ and $\vec{d}_2 = (0, 4, -2)$. It follows from Theorem 7.3.2 that $\{\vec{d}_1, \vec{d}_2\}$ is a basis for span $S$.
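Both methods of Example 7.3.5 can be sketched in a few lines, again assuming the third-party SymPy library is available. Note that `rref()` scales rows, so Method 2 returns a row-space basis that may differ from the matrix $D$ obtained by hand while still spanning the same space:

```python
from sympy import Matrix

# The vectors of S from Example 7.3.5 (rows of this list).
vectors = [[1, 2, -3], [-2, 0, 4], [0, 4, -2], [-2, -4, 6]]

# Method 1: pivot columns of A give a basis of span S taken from S itself.
A = Matrix(vectors).T        # vectors as columns
_, pivots = A.rref()         # (0, 1): a1 and a2 form a basis

# Method 2: the nonzero rows of a row echelon form of B give another basis.
B = Matrix(vectors)          # vectors as rows
R, _ = B.rref()
row_basis = [list(R.row(i)) for i in range(R.rows) if any(R.row(i))]
print(pivots, row_basis)
```

Here the second basis comes out as $(1, 0, -2)$ and $(0, 1, -\frac{1}{2})$, scalar multiples of linear combinations of the $\vec{d}_i$ above.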
Exercises
1. Find a basis for the column space of $C$, where

$C = \begin{pmatrix} 1 & -3 & 4 & -2 & 5 & 4 \\ 0 & 0 & 1 & 3 & -2 & -6 \\ 0 & 0 & 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$
2. For each of the following matrices, find a set of its column vectors that forms a basis for its column space.

$A_1 = \begin{pmatrix} 0 & 2 & -1 \\ 1 & -1 & 3 \\ 2 & 0 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 1 & 3 & 0 & 2 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 2 & 4 & 1 & -1 & 0 \\ 0 & 0 & 0 & 3 & 1 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 1 & -3 & 4 & -2 & 5 & 4 \\ 2 & -6 & 9 & -1 & 8 & 2 \\ 2 & -6 & 9 & -1 & 9 & 7 \\ -1 & 3 & -4 & 2 & -5 & -4 \end{pmatrix}.$
3. For each of the following row echelon matrices, find a basis and the dimension of its row space, and use the basis vectors to denote the row space.

$D_1 = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad D_2 = \begin{pmatrix} 0 & 1 & -2 & 0 & 1 \\ 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad D_3 = \begin{pmatrix} 1 & 2 & 5 & 0 & 3 \\ 0 & 1 & 3 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$
4. Find a basis for the row space of $B$, where

$B = \begin{pmatrix} 1 & -1 & 3 \\ 2 & 0 & 4 \\ -1 & -3 & 1 \end{pmatrix}.$
5. Let $S = \{\vec{a}_1, \vec{a}_2, \vec{a}_3, \vec{a}_4, \vec{a}_5\}$, where

$\vec{a}_1 = (1, -2, 0, 3)^T, \quad \vec{a}_2 = (2, -5, -3, 6)^T, \quad \vec{a}_3 = (0, 1, 3, 0)^T,$
$\vec{a}_4 = (2, -1, 4, -7)^T, \quad \vec{a}_5 = (5, -8, 1, 2)^T.$
a. Find a subset of S that forms a basis for the spanning space span S.
b. Find another basis for span S.
7.4 Coordinates and orthonormal bases
Let $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ be a basis of $\mathbb{R}^n$. Then

$\mathbb{R}^n = \mathrm{span}\{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}.$
Theorem 7.4.1.
If $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ is a basis of $\mathbb{R}^n$, then for each $\vec{b} \in \mathbb{R}^n$, there exists a unique vector $(x_1, x_2, \cdots, x_n)$ such that

(7.4.1)
$\vec{b} = x_1\vec{a}_1 + x_2\vec{a}_2 + \cdots + x_n\vec{a}_n.$
Definition 7.4.1.
The numbers $x_1, x_2, \cdots, x_n$ in (7.4.1) are said to be the coordinates of the vector $\vec{b}$ relative to the basis $S$. We write

(7.4.2)
$(\vec{b})_S = (x_1, x_2, \cdots, x_n).$

If $\vec{b} = (b_1, b_2, \cdots, b_n) \in \mathbb{R}^n$, then the coordinates $b_1, b_2, \cdots, b_n$ of $\vec{b}$ are the coordinates of $\vec{b}$ relative to the standard basis $E$. Hence, according to (7.4.2), we have

(7.4.3)
$\vec{b} = (\vec{b})_E.$

Let $A$ be the matrix whose column vectors are $\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n$. Then we can rewrite (7.4.1) as

(7.4.4)
$(\vec{b})_E = A(\vec{b})_S.$

Because $A$ is invertible, it follows that

(7.4.5)
$(\vec{b})_S = A^{-1}(\vec{b})_E = A^{-1}\vec{b} \quad \text{for } \vec{b} \in \mathbb{R}^n.$
Example 7.4.1.

Let $S = \left\{\begin{pmatrix}1\\1\end{pmatrix}, \begin{pmatrix}-1\\1\end{pmatrix}\right\}$ be a set of vectors in $\mathbb{R}^2$.

i. Show that $S$ is a basis of $\mathbb{R}^2$.
ii. Let $\vec{b} = (2, -4)^T$. Find $(\vec{b})_S$.

Solution

i. Let $A = \begin{pmatrix}1 & -1\\1 & 1\end{pmatrix}$. Then $|A| = 1 + 1 = 2 \neq 0$. By Theorem 7.2.1(ii), $S$ is a basis of $\mathbb{R}^2$.

ii. By (7.4.5),

$(\vec{b})_S = A^{-1}\vec{b} = \frac{1}{2}\begin{pmatrix}1 & 1\\-1 & 1\end{pmatrix}\begin{pmatrix}2\\-4\end{pmatrix} = \begin{pmatrix}-1\\-3\end{pmatrix},$

that is, $(\vec{b})_S = (-1, -3)$ are the coordinates of the vector $\vec{b} = (2, -4)$ relative to the basis $S$.
In Example 7.4.1, we used the inverse matrix method to find the coordinates $(\vec{b})_S$. In fact, all the other methods given in Chapter 4, including Gaussian elimination, Gauss-Jordan elimination, and Cramer's rule, can be employed to solve the system $A\vec{X} = \vec{b}$.
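As the paragraph above notes, $(\vec{b})_S$ is just the solution of $A\vec{X} = \vec{b}$. A minimal NumPy sketch (an assumption; any linear solver works) for Example 7.4.1:

```python
import numpy as np

# Basis S = {(1,1), (-1,1)} from Example 7.4.1, as the columns of A.
A = np.array([[1.0, -1.0],
              [1.0,  1.0]])
b = np.array([2.0, -4.0])

# (b)_S solves A x = b; solving directly is preferred over forming A^{-1}.
coords = np.linalg.solve(A, b)
print(coords)  # [-1. -3.]
```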
Example 7.4.2.

Let $\vec{a}_1 = (1, 0, 1)^T$, $\vec{a}_2 = (0, 1, 1)^T$, and $\vec{a}_3 = (1, 1, 0)^T$.

i. Show that $S = \{\vec{a}_1, \vec{a}_2, \vec{a}_3\}$ is a basis of $\mathbb{R}^3$.
ii. Let $\vec{b} = (1, 1, 1)^T$. Find $(\vec{b})_S$.

Solution

i. Because $|A| = \begin{vmatrix}1&0&1\\0&1&1\\1&1&0\end{vmatrix} = -2 \neq 0$, by Theorem 7.2.1(ii), $\{\vec{a}_1, \vec{a}_2, \vec{a}_3\}$ is a basis of $\mathbb{R}^3$.

ii. Applying Gauss-Jordan elimination to the augmented matrix:

$(A\,|\,\vec{b}) = \left(\begin{array}{ccc|c}1&0&1&1\\0&1&1&1\\1&1&0&1\end{array}\right) \xrightarrow{R_1(-1)+R_3} \left(\begin{array}{ccc|c}1&0&1&1\\0&1&1&1\\0&1&-1&0\end{array}\right) \xrightarrow{R_2(-1)+R_3} \left(\begin{array}{ccc|c}1&0&1&1\\0&1&1&1\\0&0&-2&-1\end{array}\right) \xrightarrow{R_3(-\frac{1}{2})} \left(\begin{array}{ccc|c}1&0&1&1\\0&1&1&1\\0&0&1&\frac{1}{2}\end{array}\right) \xrightarrow[R_3(-1)+R_1]{R_3(-1)+R_2} \left(\begin{array}{ccc|c}1&0&0&\frac{1}{2}\\0&1&0&\frac{1}{2}\\0&0&1&\frac{1}{2}\end{array}\right).$

Hence, $(\vec{b})_S = \left(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}\right)$.
In Example 7.4.2, the coordinates of $\vec{b} = (1, 1, 1)$ relative to the basis $S$ are $\left(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}\right)$, while its coordinates relative to the standard basis $E$ are $(1, 1, 1)$.
Equation (7.4.5) gives the relation between the coordinates of a vector $\vec{b}$ relative to a basis $S$ and the standard basis $E$.

Now, we study the relations between the coordinates of a vector $\vec{b}$ relative to two bases of $\mathbb{R}^n$, that is, we shall generalize the relation given in (7.4.5) by replacing the standard basis $E$ by any basis of $\mathbb{R}^n$.
By Theorem 7.2.1 (ii) and Lemma 7.2.1 (ii) with m = n = k, we obtain the following result, which gives
the change of bases.
Theorem 7.4.2.
Let $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ and $T = \{\vec{b}_1, \vec{b}_2, \cdots, \vec{b}_n\}$ be two bases of $\mathbb{R}^n$. Let $A = (\vec{a}_1\ \vec{a}_2 \cdots \vec{a}_n)$ and $B = (\vec{b}_1\ \vec{b}_2 \cdots \vec{b}_n)$. Then $A$ and $B$ are invertible and there exists a unique invertible $n \times n$ matrix

(7.4.6)
$C = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{pmatrix}$

such that $B = AC$, that is,

(7.4.7)
$\begin{cases} \vec{b}_1 = c_{11}\vec{a}_1 + c_{21}\vec{a}_2 + \cdots + c_{n1}\vec{a}_n \\ \vec{b}_2 = c_{12}\vec{a}_1 + c_{22}\vec{a}_2 + \cdots + c_{n2}\vec{a}_n \\ \quad\vdots \\ \vec{b}_n = c_{1n}\vec{a}_1 + c_{2n}\vec{a}_2 + \cdots + c_{nn}\vec{a}_n. \end{cases}$
Definition 7.4.2.
Let $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ and $T = \{\vec{b}_1, \vec{b}_2, \cdots, \vec{b}_n\}$ be two bases of $\mathbb{R}^n$. The matrix $C$ given in (7.4.6) is called the transition matrix from $T$ to $S$.
The following result gives the change of coordinates under two bases.
Theorem 7.4.3.
Let $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ and $T = \{\vec{b}_1, \vec{b}_2, \cdots, \vec{b}_n\}$ be two bases of $\mathbb{R}^n$. Then for each vector $\vec{b} \in \mathbb{R}^n$,

(7.4.8)
$(\vec{b})_S = C(\vec{b})_T.$

Proof

Let $A$ and $B$ be the same as in Theorem 7.4.2. Because $S$ and $T$ are bases of $\mathbb{R}^n$, by (7.4.4), we have for each vector $\vec{b} \in \mathbb{R}^n$,

(7.4.9)
$(\vec{b})_E = A(\vec{b})_S \quad \text{and} \quad (\vec{b})_E = B(\vec{b})_T.$

Hence,

$(\vec{b})_S = A^{-1}(\vec{b})_E = A^{-1}(B(\vec{b})_T) = (A^{-1}B)(\vec{b})_T = C(\vec{b})_T.$
Example 7.4.3.

Let $S = \left\{\begin{pmatrix}2\\1\end{pmatrix}, \begin{pmatrix}1\\-2\end{pmatrix}\right\}$ and $T = \left\{\begin{pmatrix}1\\2\end{pmatrix}, \begin{pmatrix}-1\\1\end{pmatrix}\right\}$. Assume that $(\vec{b})_T = \begin{pmatrix}5\\-10\end{pmatrix}$. Find $(\vec{b})_S$.

Solution

Let $A = \begin{pmatrix}2&1\\1&-2\end{pmatrix}$ and $B = \begin{pmatrix}1&-1\\2&1\end{pmatrix}$. By Theorem 2.7.5,

$A^{-1} = \frac{1}{-5}\begin{pmatrix}-2&-1\\-1&2\end{pmatrix} = \begin{pmatrix}\frac{2}{5}&\frac{1}{5}\\[2pt]\frac{1}{5}&-\frac{2}{5}\end{pmatrix}.$

Hence,

$C = A^{-1}B = \begin{pmatrix}\frac{2}{5}&\frac{1}{5}\\[2pt]\frac{1}{5}&-\frac{2}{5}\end{pmatrix}\begin{pmatrix}1&-1\\2&1\end{pmatrix} = \begin{pmatrix}\frac{4}{5}&-\frac{1}{5}\\[2pt]-\frac{3}{5}&-\frac{3}{5}\end{pmatrix}.$

By (7.4.8),

$(\vec{b})_S = C(\vec{b})_T = \begin{pmatrix}\frac{4}{5}&-\frac{1}{5}\\[2pt]-\frac{3}{5}&-\frac{3}{5}\end{pmatrix}\begin{pmatrix}5\\-10\end{pmatrix} = \begin{pmatrix}6\\3\end{pmatrix}.$
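The computation in Example 7.4.3 can be reproduced numerically; the sketch below (assuming NumPy) forms the transition matrix $C = A^{-1}B$ by solving $AC = B$ rather than inverting $A$ explicitly:

```python
import numpy as np

# Bases S and T from Example 7.4.3, as the columns of A and B.
A = np.array([[2.0, 1.0], [1.0, -2.0]])
B = np.array([[1.0, -1.0], [2.0, 1.0]])

# Transition matrix C = A^{-1} B, computed column-wise via a solver.
C = np.linalg.solve(A, B)

b_T = np.array([5.0, -10.0])
b_S = C @ b_T
print(b_S)  # [6. 3.]
```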
Let $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ be a basis of $\mathbb{R}^n$. In general, these vectors may not be orthogonal. In some cases, it is more convenient to work with bases whose vectors are mutually orthogonal, or that in addition are unit vectors.

Definition 7.4.3.

Let $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ be a basis of $\mathbb{R}^n$. $S$ is called an orthogonal basis if

$\vec{a}_i \perp \vec{a}_j \quad \text{for } i \neq j \text{ and } i, j \in \{1, 2, \cdots, n\}.$

If, in addition,

$\|\vec{a}_i\| = 1 \quad \text{for } i = 1, 2, \cdots, n,$

then $S$ is called an orthonormal basis.
Example 7.4.4.
Determine which of the following bases are orthogonal bases and orthonormal bases.

$S_1 = \left\{\begin{pmatrix}-1\\1\end{pmatrix}, \begin{pmatrix}-1\\1\end{pmatrix}\right\}, \quad S_2 = \left\{\begin{pmatrix}1\\1\end{pmatrix}, \begin{pmatrix}-1\\1\end{pmatrix}\right\},$
$S_3 = \left\{\begin{pmatrix}\frac{1}{\sqrt{2}}\\[2pt]\frac{1}{\sqrt{2}}\end{pmatrix}, \begin{pmatrix}\frac{1}{\sqrt{2}}\\[2pt]-\frac{1}{\sqrt{2}}\end{pmatrix}\right\}, \quad S_4 = \left\{\begin{pmatrix}1\\0\\0\end{pmatrix}, \begin{pmatrix}0\\1\\0\end{pmatrix}, \begin{pmatrix}0\\0\\1\end{pmatrix}\right\}.$

Solution

$S_1$ is not an orthogonal basis because $(-1, 1)\cdot(-1, 1) = 2 \neq 0$.

$S_2$ is an orthogonal basis because $(1, 1)\cdot(-1, 1) = 0$, but it is not an orthonormal basis because $\|(1, 1)\| = \sqrt{2} \neq 1$.

$S_3$ is an orthonormal basis because $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)\cdot\left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right) = 0$, $\left\|\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)\right\| = 1$, and $\left\|\left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)\right\| = 1$.
$S_4$ is an orthonormal basis. In fact, it is easy to verify that $\vec{e}_1 \cdot \vec{e}_2 = 0$, $\vec{e}_1 \cdot \vec{e}_3 = 0$, and $\vec{e}_2 \cdot \vec{e}_3 = 0$, and each $\vec{e}_i$ is a unit vector, so $S_4$ is an orthonormal basis of $\mathbb{R}^3$.

In Example 7.4.4, the basis $S_4$ is the standard basis of $\mathbb{R}^3$. In fact, the standard basis $E$ given in (7.2.1) is an orthonormal basis of $\mathbb{R}^n$.
The coordinates of a vector $\vec{b}$ in $\mathbb{R}^n$ relative to an orthogonal basis or an orthonormal basis can be obtained by using the methods given in Chapter 4. However, because the basis vectors in an orthogonal basis or an orthonormal basis are mutually orthogonal, we can provide explicit formulas for these coordinates.
Theorem 7.4.4.
i. If $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ is an orthogonal basis of $\mathbb{R}^n$, then

(7.4.10)
$(\vec{b})_S = \left(\frac{\vec{b}\cdot\vec{a}_1}{\|\vec{a}_1\|^2}, \frac{\vec{b}\cdot\vec{a}_2}{\|\vec{a}_2\|^2}, \cdots, \frac{\vec{b}\cdot\vec{a}_n}{\|\vec{a}_n\|^2}\right).$

ii. If $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ is an orthonormal basis of $\mathbb{R}^n$, then

(7.4.11)
$(\vec{b})_S = (\vec{b}\cdot\vec{a}_1, \vec{b}\cdot\vec{a}_2, \cdots, \vec{b}\cdot\vec{a}_n).$
Proof

i. Because $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ is a basis of $\mathbb{R}^n$, by Theorem 7.4.1, for each $\vec{b} \in \mathbb{R}^n$, there exists a unique vector $(x_1, \cdots, x_n)$ such that

$\vec{b} = x_1\vec{a}_1 + x_2\vec{a}_2 + \cdots + x_i\vec{a}_i + \cdots + x_n\vec{a}_n.$

Taking the dot product of both sides with $\vec{a}_i$ gives

(7.4.12)
$\vec{b}\cdot\vec{a}_i = x_1(\vec{a}_1\cdot\vec{a}_i) + x_2(\vec{a}_2\cdot\vec{a}_i) + \cdots + x_i(\vec{a}_i\cdot\vec{a}_i) + \cdots + x_n(\vec{a}_n\cdot\vec{a}_i).$

Because $S$ is an orthogonal basis,

$\vec{a}_j\cdot\vec{a}_i = 0 \quad \text{for } j \neq i \text{ and } i, j \in I_n.$

Hence,

$\vec{b}\cdot\vec{a}_i = x_i(\vec{a}_i\cdot\vec{a}_i) = x_i\|\vec{a}_i\|^2$

and

$x_i = \frac{\vec{b}\cdot\vec{a}_i}{\|\vec{a}_i\|^2} \quad \text{for } i \in I_n.$

ii. If $S$ is an orthonormal basis, then $\|\vec{a}_i\| = 1$ for $i \in I_n$. This, together with (7.4.10), implies (7.4.11).
Example 7.4.5.

Let $\vec{b} = (0, 1)^T$. Find $(\vec{b})_S$ and $(\vec{b})_T$, where

$S = \left\{\begin{pmatrix}1\\1\end{pmatrix}, \begin{pmatrix}-1\\1\end{pmatrix}\right\} \quad \text{and} \quad T = \left\{\begin{pmatrix}\frac{1}{\sqrt{2}}\\[2pt]\frac{1}{\sqrt{2}}\end{pmatrix}, \begin{pmatrix}\frac{1}{\sqrt{2}}\\[2pt]-\frac{1}{\sqrt{2}}\end{pmatrix}\right\}.$

Solution

By Example 7.4.4, $S$ is an orthogonal basis but not an orthonormal basis. Let $\vec{a}_1 = \begin{pmatrix}1\\1\end{pmatrix}$ and $\vec{a}_2 = \begin{pmatrix}-1\\1\end{pmatrix}$. Then

$\vec{b}\cdot\vec{a}_1 = (0, 1)\begin{pmatrix}1\\1\end{pmatrix} = 1, \quad \|\vec{a}_1\|^2 = 2$

and

$\vec{b}\cdot\vec{a}_2 = (0, 1)\begin{pmatrix}-1\\1\end{pmatrix} = 1, \quad \|\vec{a}_2\|^2 = 2.$

By (7.4.10), we obtain $(\vec{b})_S = \left(\frac{1}{2}, \frac{1}{2}\right)$.

By Example 7.4.4, $T$ is an orthonormal basis. Let

$\vec{a}_1 = \begin{pmatrix}\frac{1}{\sqrt{2}}\\[2pt]\frac{1}{\sqrt{2}}\end{pmatrix} \quad \text{and} \quad \vec{a}_2 = \begin{pmatrix}\frac{1}{\sqrt{2}}\\[2pt]-\frac{1}{\sqrt{2}}\end{pmatrix}.$

Then

$\vec{b}\cdot\vec{a}_1 = (0, 1)\begin{pmatrix}\frac{1}{\sqrt{2}}\\[2pt]\frac{1}{\sqrt{2}}\end{pmatrix} = \frac{1}{\sqrt{2}} \quad \text{and} \quad \vec{b}\cdot\vec{a}_2 = (0, 1)\begin{pmatrix}\frac{1}{\sqrt{2}}\\[2pt]-\frac{1}{\sqrt{2}}\end{pmatrix} = -\frac{1}{\sqrt{2}}.$

By (7.4.11), $(\vec{b})_T = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$.
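Formula (7.4.10) can be checked directly for Example 7.4.5; a short NumPy sketch (an assumption, not part of the text):

```python
import numpy as np

# Orthogonal (not orthonormal) basis S from Example 7.4.5.
a1 = np.array([1.0, 1.0])
a2 = np.array([-1.0, 1.0])
b = np.array([0.0, 1.0])

# Formula (7.4.10): each coordinate is (b . a_i) / ||a_i||^2.
coords = np.array([b @ a1 / (a1 @ a1), b @ a2 / (a2 @ a2)])
print(coords)  # [0.5 0.5]
```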
The following result is the well-known Gram-Schmidt process, which shows that every basis of $\mathbb{R}^n$ can be changed to an orthogonal basis or an orthonormal basis of $\mathbb{R}^n$. We omit its proof.

Let $S = \{\vec{a}_1, \vec{a}_2, \cdots, \vec{a}_n\}$ be a basis of $\mathbb{R}^n$. Let

$\vec{b}_1 = \vec{a}_1,$
$\vec{b}_2 = \vec{a}_2 - \frac{\vec{a}_2\cdot\vec{b}_1}{\|\vec{b}_1\|^2}\vec{b}_1,$
$\vec{b}_3 = \vec{a}_3 - \frac{\vec{a}_3\cdot\vec{b}_1}{\|\vec{b}_1\|^2}\vec{b}_1 - \frac{\vec{a}_3\cdot\vec{b}_2}{\|\vec{b}_2\|^2}\vec{b}_2,$
$\quad\vdots$
$\vec{b}_n = \vec{a}_n - \frac{\vec{a}_n\cdot\vec{b}_1}{\|\vec{b}_1\|^2}\vec{b}_1 - \frac{\vec{a}_n\cdot\vec{b}_2}{\|\vec{b}_2\|^2}\vec{b}_2 - \cdots - \frac{\vec{a}_n\cdot\vec{b}_{n-1}}{\|\vec{b}_{n-1}\|^2}\vec{b}_{n-1}.$

Then

i. $\{\vec{b}_1, \vec{b}_2, \cdots, \vec{b}_n\}$ is an orthogonal basis of $\mathbb{R}^n$.

ii. $\left\{\frac{\vec{b}_1}{\|\vec{b}_1\|}, \frac{\vec{b}_2}{\|\vec{b}_2\|}, \cdots, \frac{\vec{b}_n}{\|\vec{b}_n\|}\right\}$ is an orthonormal basis of $\mathbb{R}^n$.
Example 7.4.6.
Let $\vec{a}_1 = (1, 0, 1)$, $\vec{a}_2 = (0, 1, 1)$, and $\vec{a}_3 = (1, 1, 0)$ be the same as in Example 7.4.2.

1. Find an orthogonal basis $S_1 = \{\vec{b}_1, \vec{b}_2, \vec{b}_3\}$ by using the Gram-Schmidt process. Moreover, if $\vec{b} = (1, 1, 1)^T$, find $(\vec{b})_{S_1}$.
2. Find an orthonormal basis $S_2 := \{\vec{q}_1, \vec{q}_2, \vec{q}_3\}$ by normalizing $S_1$. Moreover, if $\vec{b} = (1, 1, 1)^T$, find $(\vec{b})_{S_2}$.
Solution
1. Let $\vec{b}_1 = \vec{a}_1 = (1, 0, 1)$ and let

$\vec{b}_2 = \vec{a}_2 - \frac{\vec{a}_2\cdot\vec{b}_1}{\|\vec{b}_1\|^2}\vec{b}_1.$

To find $\vec{b}_2$, we need to compute $\vec{a}_2\cdot\vec{b}_1$ and $\|\vec{b}_1\|^2$. Because $\vec{a}_2\cdot\vec{b}_1 = (0, 1, 1)\cdot(1, 0, 1) = 1$ and $\|\vec{b}_1\|^2 = 2$, we have

$\vec{b}_2 = (0, 1, 1) - \frac{1}{2}(1, 0, 1) = (0, 1, 1) - \left(\frac{1}{2}, 0, \frac{1}{2}\right) = \left(-\frac{1}{2}, 1, \frac{1}{2}\right).$

Let

$\vec{b}_3 = \vec{a}_3 - \frac{\vec{a}_3\cdot\vec{b}_1}{\|\vec{b}_1\|^2}\vec{b}_1 - \frac{\vec{a}_3\cdot\vec{b}_2}{\|\vec{b}_2\|^2}\vec{b}_2.$

To find $\vec{b}_3$, we need to compute $\vec{a}_3\cdot\vec{b}_1$, $\vec{a}_3\cdot\vec{b}_2$, and $\|\vec{b}_2\|^2$. Because $\vec{a}_3\cdot\vec{b}_1 = 1$, $\vec{a}_3\cdot\vec{b}_2 = \frac{1}{2}$, and $\|\vec{b}_2\|^2 = \frac{3}{2}$, we obtain

$\vec{b}_3 = (1, 1, 0) - \frac{1}{2}(1, 0, 1) - \frac{1}{3}\left(-\frac{1}{2}, 1, \frac{1}{2}\right) = \left(\frac{2}{3}, \frac{2}{3}, -\frac{2}{3}\right).$

Hence, $\vec{b}_1 = (1, 0, 1)$, $\vec{b}_2 = \left(-\frac{1}{2}, 1, \frac{1}{2}\right)$, $\vec{b}_3 = \left(\frac{2}{3}, \frac{2}{3}, -\frac{2}{3}\right)$ form an orthogonal basis of $\mathbb{R}^3$.

By computation, $\|\vec{b}_1\| = \sqrt{2}$, $\|\vec{b}_2\| = \frac{\sqrt{6}}{2}$, $\|\vec{b}_3\| = \frac{2\sqrt{3}}{3}$, $\vec{b}\cdot\vec{b}_1 = 2$, $\vec{b}\cdot\vec{b}_2 = 1$, and $\vec{b}\cdot\vec{b}_3 = \frac{2}{3}$. By Theorem 7.4.4,

$(\vec{b})_{S_1} = \left(\frac{\vec{b}\cdot\vec{b}_1}{\|\vec{b}_1\|^2}, \frac{\vec{b}\cdot\vec{b}_2}{\|\vec{b}_2\|^2}, \frac{\vec{b}\cdot\vec{b}_3}{\|\vec{b}_3\|^2}\right) = \left(\frac{2}{2}, \frac{1}{\frac{3}{2}}, \frac{\frac{2}{3}}{\frac{4}{3}}\right) = \left(1, \frac{2}{3}, \frac{1}{2}\right).$
2. By computation, we obtain

$\vec{q}_1 := \frac{\vec{b}_1}{\|\vec{b}_1\|} = \frac{\sqrt{2}}{2}(1, 0, 1) = \left(\frac{\sqrt{2}}{2}, 0, \frac{\sqrt{2}}{2}\right),$
$\vec{q}_2 := \frac{\vec{b}_2}{\|\vec{b}_2\|} = \frac{\sqrt{6}}{3}\left(-\frac{1}{2}, 1, \frac{1}{2}\right) = \left(-\frac{\sqrt{6}}{6}, \frac{\sqrt{6}}{3}, \frac{\sqrt{6}}{6}\right),$
$\vec{q}_3 := \frac{\vec{b}_3}{\|\vec{b}_3\|} = \frac{\sqrt{3}}{2}\left(\frac{2}{3}, \frac{2}{3}, -\frac{2}{3}\right) = \left(\frac{\sqrt{3}}{3}, \frac{\sqrt{3}}{3}, -\frac{\sqrt{3}}{3}\right).$

Hence, $S_2 = \{\vec{q}_1, \vec{q}_2, \vec{q}_3\}$ forms an orthonormal basis of $\mathbb{R}^3$ and

$(\vec{b})_{S_2} = (\vec{b}\cdot\vec{q}_1, \vec{b}\cdot\vec{q}_2, \vec{b}\cdot\vec{q}_3) = \left(\sqrt{2}, \frac{\sqrt{6}}{3}, \frac{\sqrt{3}}{3}\right).$
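The Gram-Schmidt recursion above translates directly into code. A minimal NumPy sketch (an assumption; the text works by hand) reproducing the orthogonal basis of Example 7.4.6:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize a list of basis vectors using the recursion in the
    text (projections are subtracted; no normalization is performed)."""
    basis = []
    for a in vectors:
        b = a.astype(float)
        for q in basis:
            b = b - (a @ q) / (q @ q) * q  # subtract the projection onto q
        basis.append(b)
    return basis

# The basis of R^3 from Example 7.4.6.
a1, a2, a3 = np.array([1, 0, 1]), np.array([0, 1, 1]), np.array([1, 1, 0])
b1, b2, b3 = gram_schmidt([a1, a2, a3])
# b1 = (1, 0, 1), b2 = (-1/2, 1, 1/2), b3 = (2/3, 2/3, -2/3)
```

Dividing each output vector by its norm then yields the orthonormal basis $S_2$.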
Exercises
1. For each of the following pairs of vectors, verify that it is a basis of $\mathbb{R}^2$ and find the coordinates of the vector $\vec{b} = (x, y)^T$ relative to it.

$S_1 = \left\{\begin{pmatrix}-1\\1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}\right\}, \quad S_2 = \left\{\begin{pmatrix}-3\\2\end{pmatrix}, \begin{pmatrix}2\\3\end{pmatrix}\right\}, \quad S_3 = \left\{\begin{pmatrix}5\\-6\end{pmatrix}, \begin{pmatrix}4\\-1\end{pmatrix}\right\}, \quad S_4 = \left\{\begin{pmatrix}-2\\3\end{pmatrix}, \begin{pmatrix}-2\\5\end{pmatrix}\right\}.$
2. Let $\vec{a}_1 = (1, 0, 1)^T$, $\vec{a}_2 = (0, 1, 1)^T$, and $\vec{a}_3 = (1, 1, 0)^T$.
   i. Show that $S = \{\vec{a}_1, \vec{a}_2, \vec{a}_3\}$ is a basis of $\mathbb{R}^3$.
   ii. Let $\vec{b} = (1, 1, 1)^T$. Find $(\vec{b})_S$.

3. Let $\vec{a}_1 = (1, 0, 0)^T$, $\vec{a}_2 = (2, 4, 0)^T$, and $\vec{a}_3 = (3, -1, 5)^T$.
   i. Show that $S = \{\vec{a}_1, \vec{a}_2, \vec{a}_3\}$ is a basis of $\mathbb{R}^3$.
   ii. Let $\vec{b} = (2, -1, -2)$. Find $(\vec{b})_S$.
4. Let $S_1 = \left\{\begin{pmatrix}-1\\1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}\right\}$ and $S_2 = \left\{\begin{pmatrix}-3\\2\end{pmatrix}, \begin{pmatrix}2\\3\end{pmatrix}\right\}$. Find the transition matrix from $S_2$ to $S_1$.
5. If $(\vec{b})_T = (1, 2, 3)$, find $(\vec{b})_S$, where

$S = \left\{\begin{pmatrix}-1\\1\\0\end{pmatrix}, \begin{pmatrix}1\\0\\1\end{pmatrix}, \begin{pmatrix}0\\0\\1\end{pmatrix}\right\}, \quad T = \left\{\begin{pmatrix}1\\0\\0\end{pmatrix}, \begin{pmatrix}0\\1\\-1\end{pmatrix}, \begin{pmatrix}1\\1\\1\end{pmatrix}\right\}.$
6. Let $\vec{b}_1 = (0, 1, 0)$, $\vec{b}_2 = \left(-\frac{4}{5}, 0, \frac{3}{5}\right)$, and $\vec{b}_3 = \left(\frac{3}{5}, 0, \frac{4}{5}\right)$.
   i. Show that $S := \{\vec{b}_1, \vec{b}_2, \vec{b}_3\}$ is an orthonormal basis of $\mathbb{R}^3$.
   ii. Let $\vec{b} = (1, -1, 1)$. Find $(\vec{b})_S$.
7. Let $\vec{a}_1 = (1, 1, 1)$, $\vec{a}_2 = (0, 1, 1)$, $\vec{a}_3 = (0, 0, 1)$, and $\vec{b} = (-1, 1, -1)^T$.
   1. Find an orthogonal basis $S_1 := \{\vec{b}_1, \vec{b}_2, \vec{b}_3\}$ of $\mathbb{R}^3$ by using the Gram-Schmidt process, and find $(\vec{b})_{S_1}$.
   2. Find an orthonormal basis $S_2 := \{\vec{q}_1, \vec{q}_2, \vec{q}_3\}$ of $\mathbb{R}^3$ by normalizing $S_1$, and find $(\vec{b})_{S_2}$.
Chapter 8 Eigenvalues and Diagonalizability
Throughout this chapter, let $A$ be the $n \times n$ matrix

(8.1.1)
$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}.$
Definition 8.1.1.
A real number $\lambda$ is called an eigenvalue of $A$ if there exists a nonzero vector $\vec{X}$ in $\mathbb{R}^n$ such that

(8.1.2)
$A\vec{X} = \lambda\vec{X}.$

The vector $\vec{X}$ is said to be an eigenvector of the matrix $A$ corresponding to the eigenvalue $\lambda$.
Note that we restrict our study of eigenvalues to $\mathbb{R}$, the set of real numbers. In fact, eigenvalues can be complex numbers, which we shall study in Chapter 10; eigenvalues in the field of complex numbers can be further studied after Chapter 10.

An eigenvalue can be positive, zero, or negative. But, according to Definition 8.1.1, an eigenvector must be a nonzero vector. In other words, the zero vector is not an eigenvector although it is a solution of (8.1.2).

From (8.1.2) and Theorems 4.5.2 and 4.5.3, we see that $A$ has a zero eigenvalue if and only if the homogeneous system $A\vec{X} = \vec{0}$ has a nonzero solution, if and only if $r(A) < n$, if and only if $|A| = 0$, if and only if $A$ is not invertible. Hence, $A$ is invertible if and only if all the eigenvalues of $A$ are nonzero; see Corollary 8.1.3 for another proof.
Note that (8.1.2) can be rewritten as

(8.1.3)
$(\lambda I - A)\vec{X} = \vec{0}.$

By Theorem 4.5.3, we obtain the following result, which shows that we can use determinants to find the eigenvalues.

Theorem 8.1.1.

A real number $\lambda$ is an eigenvalue of $A$ if and only if

(8.1.4)
$p(\lambda) := |\lambda I - A| = 0.$
Note that $p(\lambda) = |\lambda I - A|$ is a polynomial of degree $n$, which is called the characteristic polynomial of $A$. The equation $p(\lambda) = 0$ is called the characteristic equation of $A$.
The following result can be proved by using the fundamental theorem of algebra from complex analysis.
Lemma 8.1.1.
Any polynomial

$p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n \quad (a_n \neq 0)$

can be factored as

$p(z) = c(z - z_1)(z - z_2)\cdots(z - z_n),$

where $c$ and $z_j$ ($j \in I_n$) are complex numbers (see Chapter 10 for the study of complex numbers).
Note that some of these complex numbers $z_1, z_2, \cdots, z_n$ may be the same. By Lemma 8.1.1, $p(\lambda)$ can be rewritten as follows:

(8.1.5)
$p(\lambda) = (\lambda - \lambda_1)^{r_1}(\lambda - \lambda_2)^{r_2}\cdots(\lambda - \lambda_k)^{r_k},$

where $\lambda_i$ is a real or complex number, $\lambda_i \neq \lambda_j$ for $i \neq j$, and $r_i$ is a positive integer for $i, j \in \{1, \cdots, k\}$ satisfying

(8.1.6)
$r_1 + r_2 + \cdots + r_k = n.$

Hence, the characteristic equation $p(\lambda) = 0$ has $n$ (real or complex) roots including repeated roots, which are those roots $\lambda_i$ with $r_i > 1$.
Definition 8.1.2.
The number ri is called the algebraic multiplicity of the eigenvalue λi for each i = 1, 2, ⋯, k .
In Definition 8.1.1, if we allow the eigenvalues to be complex numbers, then the eigenvectors would take complex numbers as their components, so the complex vector space $\mathbb{C}^n$, complex matrices, complex determinants, etc., are involved. From now on, we restrict our discussion of eigenvalues and eigenvectors to real numbers. Hence, we shall only consider matrices $A$ such that all of $\lambda_1, \cdots, \lambda_k$ in (8.1.5) are real. But we give one question involving complex eigenvalues in question 8 of Exercise 9.2.
For each eigenvalue $\lambda_i$ of $A$, let

(8.1.7)
$E_{\lambda_i} := N(\lambda_i I - A) = \{\vec{X} \in \mathbb{R}^n : (\lambda_i I - A)\vec{X} = \vec{0}\}.$

The set $E_{\lambda_i}$ is called the eigenspace of $A$. It is obvious that the eigenspace $E_{\lambda_i}$ of $A$ is exactly the solution space of $(\lambda_i I - A)\vec{X} = \vec{0}$, or the nullspace of $\lambda_i I - A$; see (4.1.13). Hence, the eigenspace $E_{\lambda_i}$ contains all eigenvectors corresponding to the eigenvalue $\lambda_i$ as well as the zero vector in $\mathbb{R}^n$, which is not an eigenvector (as mentioned above).
Definition 8.1.3.
The dimension $\dim(E_{\lambda_i})$ of the eigenspace $E_{\lambda_i}$ is called the geometric multiplicity of the eigenvalue $\lambda_i$ for each $i = 1, 2, \cdots, k$.
By Definition 8.1.1, the eigenspace $E_{\lambda_i}$ contains a nonzero vector, so the geometric multiplicity is greater than or equal to 1.

We emphasize that two numbers correspond to each eigenvalue $\lambda_i$: the algebraic multiplicity $r_i$ given in (8.1.5) and the geometric multiplicity $\dim(E_{\lambda_i})$.

The following result shows that the geometric multiplicity is less than or equal to the algebraic multiplicity. The proof is omitted because it involves additional properties of determinants.

Theorem 8.1.2.

$1 \leq \dim(E_{\lambda_i}) \leq r_i.$
As a special case of Theorem 8.1.2 , we obtain the following result, which shows that if the algebraic
multiplicity of an eigenvalue is 1, then its geometric multiplicity is 1.
Corollary 8.1.1.
Let $\lambda_i$ and $r_i$ be the same as in (8.1.5) for $i \in \{1, 2, \cdots, k\}$. If $r_j = 1$ for some $j \in \{1, 2, \cdots, k\}$, then

$\dim(E_{\lambda_j}) = r_j = 1.$
By Theorem 8.1.2 and (8.1.6) , we obtain the following result, which shows that the sum of the geometric
multiplicities is less than or equal to n.
Corollary 8.1.2.
$k \leq \sum_{i=1}^{k} \dim(E_{\lambda_i}) \leq n.$
The following result gives the eigenvalues of triangular (upper, lower, or diagonal) matrices.
Theorem 8.1.3.
Let A be an n × n triangular (upper, lower, or diagonal) matrix. Then the eigenvalues of A are all the
entries on the main diagonal of A.
Proof
Without loss of generality, assume that $A$ is upper triangular, that is,

$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}.$

Then $\lambda I - A$ is upper triangular and

$p(\lambda) = |\lambda I - A| = (\lambda - a_{11})(\lambda - a_{22})\cdots(\lambda - a_{nn}),$

so the eigenvalues of $A$ are $a_{11}, a_{22}, \cdots, a_{nn}$, the entries on the main diagonal of $A$.
Example 8.1.1.
For each of the following matrices, find its eigenvalues and determine their algebraic multiplicities.

$A = \begin{pmatrix} -1 & -1 & 0 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{pmatrix}, \quad B = \begin{pmatrix} 4 & -1 & 0 \\ 0 & 4 & 3 \\ 0 & 0 & -3 \end{pmatrix}.$

Solution

By Theorem 8.1.3, the eigenvalues of $A$ are $\lambda_1 = -1$, $\lambda_2 = 2$, and $\lambda_3 = 3$. The algebraic multiplicity of each of $\lambda_1, \lambda_2, \lambda_3$ is 1 because $|\lambda I - A| = (\lambda + 1)(\lambda - 2)(\lambda - 3)$.

All the eigenvalues of $B$ are $\lambda_1 = \lambda_2 = 4$ and $\lambda_3 = -3$. The algebraic multiplicity of the eigenvalue 4 is 2 and the algebraic multiplicity of the eigenvalue $-3$ is 1 because $|\lambda I - B| = (\lambda - 4)^2(\lambda + 3)$.
3. Step 3. For each eigenvalue $\lambda_i$, solve the homogeneous system

$(\lambda_i I - A)\vec{X} = \vec{0}$

and find the basis vectors by using the methods in Section 4.4.

4. Step 4. Give the eigenspace $E_{\lambda_i}$ using the basis vectors.
Example 8.1.2.

Find the eigenvalues and eigenspaces of $A$, and determine the algebraic and geometric multiplicities of its eigenvalues, where

$A = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}.$
Solution

1. Because $|\lambda I - A| = \begin{vmatrix} \lambda - 2 & 0 \\ 0 & \lambda - 2 \end{vmatrix} = (\lambda - 2)^2 = 0$, we obtain a repeated eigenvalue $\lambda_1 = \lambda_2 = 2$.

For $\lambda_1 = \lambda_2 = 2$, $(\lambda_1 I - A)\vec{X} = \vec{0}$ becomes

$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$

The corresponding system is $0x_1 + 0x_2 = 0$, where $x_1$ and $x_2$ are free variables. Let $x_1 = t$ and $x_2 = s$. Then

$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} t \\ s \end{pmatrix} = t\begin{pmatrix} 1 \\ 0 \end{pmatrix} + s\begin{pmatrix} 0 \\ 1 \end{pmatrix} = t\vec{\upsilon}_1 + s\vec{\upsilon}_2,$

where $\vec{\upsilon}_1 = (1, 0)^T$ and $\vec{\upsilon}_2 = (0, 1)^T$ are solutions of $0x_1 + 0x_2 = 0$. Moreover,

$E_{\lambda_1} = E_{\lambda_2} = \mathrm{span}\{\vec{\upsilon}_1, \vec{\upsilon}_2\}.$
2. The algebraic multiplicity of the eigenvalue $\lambda_1 = \lambda_2 = 2$ is 2 because 2 is the power of the term $\lambda - 2$ in the characteristic polynomial $|\lambda I - A| = (\lambda - 2)^2$. Because the eigenspace $E_{\lambda_1}$ has two basis vectors $\vec{\upsilon}_1$ and $\vec{\upsilon}_2$, $\dim(E_{\lambda_1}) = 2$. Hence, the geometric multiplicity of the eigenvalue 2 is 2.
Example 8.1.3.

Find the eigenvalues and eigenspaces of $A$, where

$A = \begin{pmatrix} 0 & 0 & -1 \\ 1 & -2 & 1 \\ 0 & 0 & 3 \end{pmatrix}.$
Solution

$p(\lambda) = |\lambda I - A| = \begin{vmatrix} \lambda & 0 & 1 \\ -1 & \lambda + 2 & -1 \\ 0 & 0 & \lambda - 3 \end{vmatrix} = \lambda(\lambda + 2)(\lambda - 3).$

Hence, the eigenvalues are $\lambda_1 = 0$, $\lambda_2 = -2$, and $\lambda_3 = 3$.

For $\lambda_1 = 0$,

$\lambda_1 I - A = \begin{pmatrix} 0 & 0 & 1 \\ -1 & 2 & -1 \\ 0 & 0 & -3 \end{pmatrix}$

and $(\lambda_1 I - A)\vec{X} = \vec{0}$ becomes

$\begin{pmatrix} 0 & 0 & 1 \\ -1 & 2 & -1 \\ 0 & 0 & -3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$
Row reducing the augmented matrix:

$(\lambda_1 I - A\,|\,\vec{0}) = \left(\begin{array}{ccc|c} 0 & 0 & 1 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & 0 & -3 & 0 \end{array}\right) \xrightarrow{R_{1,2}} \left(\begin{array}{ccc|c} -1 & 2 & -1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -3 & 0 \end{array}\right) \xrightarrow[R_2(1)+R_1]{R_2(3)+R_3} \left(\begin{array}{ccc|c} -1 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$

The corresponding system is

$\begin{cases} -x_1 + 2x_2 = 0 \\ x_3 = 0, \end{cases}$

where $x_1$ and $x_3$ are basic variables and $x_2$ is a free variable. Let $x_2 = t$. Then $x_1 = 2t$. Let $\vec{\upsilon}_1 = (2, 1, 0)^T$. Then $E_{\lambda_1} = \mathrm{span}\{\vec{\upsilon}_1\}$.
For $\lambda_2 = -2$,

$\lambda_2 I - A = \begin{pmatrix} -2 & 0 & 1 \\ -1 & 0 & -1 \\ 0 & 0 & -5 \end{pmatrix}$

and $(\lambda_2 I - A)\vec{X} = \vec{0}$ becomes

$\begin{pmatrix} -2 & 0 & 1 \\ -1 & 0 & -1 \\ 0 & 0 & -5 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$
$(\lambda_2 I - A\,|\,\vec{0}) = \left(\begin{array}{ccc|c} -2 & 0 & 1 & 0 \\ -1 & 0 & -1 & 0 \\ 0 & 0 & -5 & 0 \end{array}\right) \xrightarrow{R_{1,2}} \left(\begin{array}{ccc|c} -1 & 0 & -1 & 0 \\ -2 & 0 & 1 & 0 \\ 0 & 0 & -5 & 0 \end{array}\right) \xrightarrow{R_1(-2)+R_2} \left(\begin{array}{ccc|c} -1 & 0 & -1 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & -5 & 0 \end{array}\right) \xrightarrow{R_2(\frac{1}{3})} \left(\begin{array}{ccc|c} -1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -5 & 0 \end{array}\right) \xrightarrow{R_2(5)+R_3} \left(\begin{array}{ccc|c} -1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \xrightarrow{R_2(1)+R_1} \left(\begin{array}{ccc|c} -1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$

The corresponding system is

$\begin{cases} x_1 = 0 \\ x_3 = 0, \end{cases}$

where $x_1$ and $x_3$ are basic variables and $x_2$ is a free variable. Let $x_2 = t$ and $\vec{\upsilon}_2 = (0, 1, 0)^T$. Then $E_{\lambda_2} = \mathrm{span}\{\vec{\upsilon}_2\}$.
For $\lambda_3 = 3$,

$\lambda_3 I - A = \begin{pmatrix} 3 & 0 & 1 \\ -1 & 5 & -1 \\ 0 & 0 & 0 \end{pmatrix}$

and $(\lambda_3 I - A)\vec{X} = \vec{0}$ becomes

$\begin{pmatrix} 3 & 0 & 1 \\ -1 & 5 & -1 \\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$

Row reducing the augmented matrix:

$(\lambda_3 I - A\,|\,\vec{0}) = \left(\begin{array}{ccc|c} 3 & 0 & 1 & 0 \\ -1 & 5 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \xrightarrow{R_{1,2}} \left(\begin{array}{ccc|c} -1 & 5 & -1 & 0 \\ 3 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \xrightarrow{R_1(3)+R_2} \left(\begin{array}{ccc|c} -1 & 5 & -1 & 0 \\ 0 & 15 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \xrightarrow{R_2(\frac{1}{15})} \left(\begin{array}{ccc|c} -1 & 5 & -1 & 0 \\ 0 & 1 & -\frac{2}{15} & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \xrightarrow{R_2(-5)+R_1} \left(\begin{array}{ccc|c} -1 & 0 & -\frac{1}{3} & 0 \\ 0 & 1 & -\frac{2}{15} & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$

The corresponding system is

$\begin{cases} -x_1 - \frac{1}{3}x_3 = 0 \\ x_2 - \frac{2}{15}x_3 = 0, \end{cases}$

where $x_1$ and $x_2$ are basic variables and $x_3$ is a free variable. Let $x_3 = 15t$. Then $x_1 = -5t$ and $x_2 = 2t$. Let $\vec{\upsilon}_3 = (-5, 2, 15)^T$. Then $E_{\lambda_3} = \mathrm{span}\{\vec{\upsilon}_3\}$.
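The eigenvalues and eigenvectors found by hand in Example 8.1.3 can be cross-checked numerically. A NumPy sketch (an assumption; the matrix entries are reconstructed from the text's computations of $\lambda_i I - A$):

```python
import numpy as np

# Matrix A from Example 8.1.3.
A = np.array([[0.0,  0.0, -1.0],
              [1.0, -2.0,  1.0],
              [0.0,  0.0,  3.0]])

eigvals, eigvecs = np.linalg.eig(A)
# The eigenvalues agree with p(lambda) = lambda(lambda + 2)(lambda - 3).

# numpy returns unit eigenvectors; rescaling the one for lambda = 3 so
# that its last component is 15 recovers v3 = (-5, 2, 15)^T.
idx = int(np.argmin(np.abs(eigvals - 3)))
v = eigvecs[:, idx].real
v = v / v[2] * 15
```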
We shall give more examples of finding eigenvalues and eigenspaces in the next section.
Now, we turn our attention to the relations between $A$ and its eigenvalues. Recall that the trace of $A$ is given by

$\mathrm{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}.$
The matrix A and its eigenvalues have the following relationships, which show that the sum of all the
eigenvalues equals the trace of A and the product of all the eigenvalues equals the determinant of A.
Theorem 8.1.4.

Let $\lambda_1, \cdots, \lambda_n$ be all the eigenvalues of the matrix $A$ given in (8.1.1). Then the following assertions hold.

1. $\lambda_1 + \cdots + \lambda_n = \mathrm{tr}(A)$.
2. $\lambda_1 \times \cdots \times \lambda_n = |A|$.
Proof

We only prove the case when $n = 2$. Let $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$. Then

$p(\lambda) = \begin{vmatrix} \lambda - a_{11} & -a_{12} \\ -a_{21} & \lambda - a_{22} \end{vmatrix} = (\lambda - a_{11})(\lambda - a_{22}) - a_{12}a_{21} = \lambda^2 - (a_{11} + a_{22})\lambda + (a_{11}a_{22} - a_{12}a_{21}).$

It follows that

(8.1.8)
$p(\lambda) = \lambda^2 - \mathrm{tr}(A)\lambda + |A|.$

On the other hand, assume that $\lambda_1$ and $\lambda_2$ are solutions of $p(\lambda) = 0$. Then

(8.1.9)
$p(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2) = \lambda^2 - (\lambda_1 + \lambda_2)\lambda + \lambda_1\lambda_2.$

Comparing the coefficients in (8.1.8) and (8.1.9) gives $\lambda_1 + \lambda_2 = \mathrm{tr}(A)$ and $\lambda_1\lambda_2 = |A|$.
Example 8.1.4.

Let $A$ be the same as in Example 8.1.3. Find $\mathrm{tr}(A)$ and $|A|$ using Theorem 8.1.4.

Solution

By Example 8.1.3, the eigenvalues of $A$ are $\lambda_1 = 0$, $\lambda_2 = -2$, and $\lambda_3 = 3$. By Theorem 8.1.4, $\mathrm{tr}(A) = 0 + (-2) + 3 = 1$ and $|A| = 0 \times (-2) \times 3 = 0$.
By Theorem 8.1.4 (2), we obtain the following result, which provides a criterion for a matrix to be invertible
(see Corollary 2.7.1 and Theorem 3.4.1 for other criteria).
Corollary 8.1.3.
A square matrix A is invertible if and only if all its eigenvalues are nonzero.
By Corollary 8.1.3 or equation (8.1.4) , we see that if zero is an eigenvalue of A, then A is not invertible.
Example 8.1.5.
Assume that A is a 2 × 2 matrix and 2I − A and 3I − A are not invertible. Find |A| and tr(A) .
Solution
Because 2I − A and 3I − A are not invertible, |2I − A| = 0 and |3I − A| = 0. Hence, 2 and 3 are the
eigenvalues of A. By Theorem 8.1.4 , |A| = 2 × 3 = 6 and tr(A) = 2 + 3 = 5 .
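Theorem 8.1.4 is easy to verify numerically for the matrix of Example 8.1.3; a NumPy sketch (an assumption, with the matrix entries reconstructed from the text):

```python
import numpy as np

# A from Example 8.1.3, whose eigenvalues are 0, -2, 3.
A = np.array([[0.0,  0.0, -1.0],
              [1.0, -2.0,  1.0],
              [0.0,  0.0,  3.0]])
eigvals = np.linalg.eigvals(A)

# Theorem 8.1.4: the sum of the eigenvalues equals tr(A) and the
# product equals |A|.
print(np.isclose(eigvals.sum().real, np.trace(A)))           # True
print(np.isclose(np.prod(eigvals).real, np.linalg.det(A)))   # True
```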
Theorem 8.1.5.

1. Let $\lambda$ be an eigenvalue of $A$ and let $\vec{X}$ be a corresponding eigenvector of $A$. Then for each positive integer $k$, $\lambda^k$ is an eigenvalue of $A^k$ and $\vec{X}$ is an eigenvector of $A^k$.
2. Let $\phi(x) = a_0 + a_1 x + \cdots + a_m x^m$ be a polynomial of degree $m$ and $\lambda$ be an eigenvalue of $A$. Then $\phi(\lambda)$ is an eigenvalue of $\phi(A)$.
3. Assume that $A$ is invertible and $\lambda$ is an eigenvalue of $A$. Then $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$.
Proof

1. Because $A\vec{X} = \lambda\vec{X}$,

$A^2\vec{X} = A(A\vec{X}) = A(\lambda\vec{X}) = \lambda A\vec{X} = \lambda(\lambda\vec{X}) = \lambda^2\vec{X}.$

Repeating the process implies that $A^k\vec{X} = \lambda^k\vec{X}$. The result follows.

2. Because $\lambda$ is an eigenvalue of $A$, by (1), for each positive integer $k$,

$\lambda^k\vec{X} = A^k\vec{X} \quad \text{for some vector } \vec{X} \neq \vec{0}.$

Hence,

$\phi(\lambda)I - \phi(A) = (a_0 + a_1\lambda + \cdots + a_m\lambda^m)I - (a_0 I + a_1 A + \cdots + a_m A^m) = a_1(\lambda I - A) + a_2(\lambda^2 I - A^2) + \cdots + a_m(\lambda^m I - A^m).$

It follows that

$(\phi(\lambda)I - \phi(A))\vec{X} = a_1(\lambda I - A)\vec{X} + a_2(\lambda^2 I - A^2)\vec{X} + \cdots + a_m(\lambda^m I - A^m)\vec{X} = \vec{0},$

that is, $\phi(A)\vec{X} = \phi(\lambda)\vec{X}$, so $\phi(\lambda)$ is an eigenvalue of $\phi(A)$.

3. If $\lambda$ is an eigenvalue of $A$, then $A\vec{X} = \lambda\vec{X}$ for some vector $\vec{X} \neq \vec{0}$, and $\lambda \neq 0$ because $A$ is invertible. Because $A$ is invertible, $A^{-1}$ exists and

$A^{-1}(A\vec{X}) = A^{-1}(\lambda\vec{X}) = \lambda A^{-1}\vec{X}.$

This implies that $\vec{X} = \lambda A^{-1}\vec{X}$ and $\frac{1}{\lambda}\vec{X} = A^{-1}\vec{X}$. Hence, $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$.
Example 8.1.6.

Assume that 1, 2, and 3 are the eigenvalues of $A$. Find the eigenvalues of $2A + 3A^2$ and of $A^{-1}$.

Solution

Let $\phi(x) = 2x + 3x^2$. Then $\phi(A) = 2A + 3A^2$. Because 1, 2, 3 are the eigenvalues of $A$, it follows from Theorem 8.1.5 that $\phi(1)$, $\phi(2)$, and $\phi(3)$ are the eigenvalues of $\phi(A)$. By computation, we have

$\phi(1) = 2 + 3 = 5, \quad \phi(2) = 2(2) + 3(2)^2 = 16, \quad \phi(3) = 2(3) + 3(3)^2 = 33.$

By Theorem 8.1.5(3), the eigenvalues of $A^{-1}$ are $1$, $\frac{1}{2}$, and $\frac{1}{3}$.
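Theorem 8.1.5(2) can be illustrated numerically. The matrix below is a hypothetical upper triangular matrix chosen only because its eigenvalues are visibly 1, 2, 3 (the example above leaves $A$ unspecified):

```python
import numpy as np

# A hypothetical matrix with eigenvalues 1, 2, 3 (any such matrix works;
# a triangular one keeps the example small).
A = np.array([[1.0, 4.0, 0.0],
              [0.0, 2.0, 5.0],
              [0.0, 0.0, 3.0]])

phiA = 2 * A + 3 * A @ A   # phi(A) for phi(x) = 2x + 3x^2
eigs = sorted(np.linalg.eigvals(phiA).real)
# eigenvalues of phi(A): phi(1) = 5, phi(2) = 16, phi(3) = 33
```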
Exercises
1. For each of the following matrices, find its eigenvalues and determine which eigenvalues are repeated eigenvalues.

$A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & -3 & -4 \end{pmatrix}, \quad C = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$

2. For each of the following matrices, find its eigenvalues and eigenspaces.

$A = \begin{pmatrix} 5 & 7 \\ 3 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 3 & -1 \\ 1 & 5 \end{pmatrix}, \quad C = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}, \quad D = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \\ 1 & 1 & 2 \end{pmatrix}.$
8.2 Diagonalizable matrices
Definition 8.2.1.
An $n \times n$ matrix $A$ is said to be diagonalizable if there exists an $n \times n$ invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix. $P$ is said to diagonalize $A$.
By Definition 8.2.1 and Theorem 3.4.1 (i), we obtain the following result.
Theorem 8.2.1.
An $n \times n$ matrix $A$ is diagonalizable if and only if there exist an $n \times n$ matrix $P$ with $|P| \neq 0$ and a diagonal matrix $D$ such that

(8.2.1)
$AP = PD.$

In this case, we can write

(8.2.2)
$D = \mathrm{diag}(\lambda_1, \lambda_2, \cdots, \lambda_n) \quad \text{and} \quad P = (\vec{p}_1 \cdots \vec{p}_n),$

where $\lambda_i$ is an eigenvalue of $A$ and the column vector $\vec{p}_i$ of $P$ is an eigenvector of $A$ corresponding to $\lambda_i$ for $i \in I_n$.
Theorem 8.2.2.

An $n \times n$ matrix $A$ is diagonalizable if and only if it has $n$ linearly independent eigenvectors. In this case, the matrix $P$ in (8.2.2) diagonalizes $A$ and the resulting diagonal matrix is the matrix $D$ in (8.2.2).

Proof

Assume that $A$ has $n$ linearly independent eigenvectors $\vec{p}_1, \cdots, \vec{p}_n$ such that

$A\vec{p}_i = \lambda_i\vec{p}_i \quad \text{for } i = 1, \cdots, n.$

Then

$AP = A(\vec{p}_1 \cdots \vec{p}_n) = (A\vec{p}_1 \cdots A\vec{p}_n) = (\lambda_1\vec{p}_1 \cdots \lambda_n\vec{p}_n) = (\vec{p}_1 \cdots \vec{p}_n)\,\mathrm{diag}(\lambda_1, \lambda_2, \cdots, \lambda_n) = PD.$

Because the columns of $P$ are linearly independent, by Theorems 7.1.5 and 3.4.1, $P$ is invertible, so $P^{-1}AP = D$ and $A$ is diagonalizable.

Conversely, assume that $A$ is diagonalizable, so that $AP = PD$ for some invertible $P$ and diagonal $D$ as in (8.2.2). Because

$AP = A(\vec{p}_1 \cdots \vec{p}_n) = (A\vec{p}_1 \cdots A\vec{p}_n) \quad \text{and} \quad PD = (\lambda_1\vec{p}_1 \cdots \lambda_n\vec{p}_n),$

it follows that

$A\vec{p}_i = \lambda_i\vec{p}_i \quad \text{for } i = 1, \cdots, n.$

Because $P$ is invertible, by Theorems 7.1.5 and 3.4.1, $\{\vec{p}_1, \cdots, \vec{p}_n\}$ is linearly independent.
Remark 8.2.1.
1. By (8.2.2) and the proof of Theorem 8.2.2, we see that the matrices D and P depend on the order of the eigenvalues, and both must follow the same order. Hence, different orders of the eigenvalues give different matrices D and P, so the choices of D and P are not unique (see Remark 8.2.3).
2. If there is more than one linearly independent eigenvector corresponding to one eigenvalue,
then different orders of these eigenvectors give different matrices P, but the diagonal matrix D
remains unchanged.
By Theorem 8.2.2, we see that, to determine whether A is diagonalizable, we need to find linearly independent eigenvectors of A and to count how many of them there are.
The following result provides the linear independence of the eigenvectors of a matrix.
Theorem 8.2.3.
Let λᵢ and rᵢ be the same as in (8.1.5) for i ∈ {1, 2, ⋯, k}. Assume that pᵢ₁, pᵢ₂, ⋯, pᵢᵣᵢ are linearly independent eigenvectors corresponding to λᵢ for i ∈ Iₖ. Then the set of all these eigenvectors
p₁₁, p₁₂, ⋯, p₁ᵣ₁; p₂₁, p₂₂, ⋯, p₂ᵣ₂; ⋯; pₖ₁, pₖ₂, ⋯, pₖᵣₖ
is linearly independent.
Proof
Assume that there exist xᵢ₁, xᵢ₂, ⋯, xᵢᵣᵢ ∈ ℝ for i ∈ Iₖ such that

(8.2.3)
∑_{i=1}^{k} ∑_{j=1}^{rᵢ} xᵢⱼpᵢⱼ = 0.

We prove below that xᵢⱼ = 0 for j ∈ {1, 2, ⋯, rᵢ} and i ∈ Iₖ. Indeed, for i ∈ Iₖ, let

(8.2.4)
qᵢ = ∑_{j=1}^{rᵢ} xᵢⱼpᵢⱼ.

Then (8.2.3) becomes ∑_{i=1}^{k} qᵢ = 0. Let i ∈ Iₖ be fixed. Then

(8.2.5)
q₁ + ⋯ + qᵢ + ⋯ + qₖ = 0.
Because Apᵢⱼ = λᵢpᵢⱼ for j ∈ {1, 2, ⋯, rᵢ}, we have for i ∈ Iₖ,

(8.2.6)
Aqᵢ = A[∑_{j=1}^{rᵢ} xᵢⱼpᵢⱼ] = ∑_{j=1}^{rᵢ} xᵢⱼApᵢⱼ = λᵢ ∑_{j=1}^{rᵢ} xᵢⱼpᵢⱼ = λᵢqᵢ

and

A(q₁ + ⋯ + qᵢ + ⋯ + qₖ) = λ₁q₁ + ⋯ + λᵢqᵢ + ⋯ + λₖqₖ.

Applying A to (8.2.5) therefore gives

(8.2.7)
λ₁q₁ + ⋯ + λᵢqᵢ + ⋯ + λₖqₖ = 0.

Multiplying (8.2.5) by λₖ gives

λₖq₁ + ⋯ + λₖqᵢ + ⋯ + λₖqₖ = 0.

Subtracting (8.2.7) from the last equality yields

(8.2.8)
(λₖ − λ₁)q₁ + ⋯ + (λₖ − λᵢ)qᵢ + ⋯ + (λₖ − λₖ₋₁)qₖ₋₁ = 0.

Applying A to (8.2.8) gives

(8.2.9)
(λₖ − λ₁)λ₁q₁ + ⋯ + (λₖ − λᵢ)λᵢqᵢ + ⋯ + (λₖ − λₖ₋₁)λₖ₋₁qₖ₋₁ = 0.

Multiplying (8.2.8) by λₖ₋₁ gives

(λₖ − λ₁)λₖ₋₁q₁ + ⋯ + (λₖ − λᵢ)λₖ₋₁qᵢ + ⋯ + (λₖ − λₖ₋₁)λₖ₋₁qₖ₋₁ = 0.

Subtracting (8.2.9) from the last equality yields

(λₖ − λ₁)(λₖ₋₁ − λ₁)q₁ + ⋯ + (λₖ − λᵢ)(λₖ₋₁ − λᵢ)qᵢ + ⋯ + (λₖ − λₖ₋₂)(λₖ₋₁ − λₖ₋₂)qₖ₋₂ = 0.

Repeating this process to eliminate, in turn,

qₖ₋₂, qₖ₋₃, ⋯, qᵢ₊₁, qᵢ₋₁, ⋯, q₂, q₁,

we obtain

(λₖ − λᵢ)(λₖ₋₁ − λᵢ)⋯(λᵢ₊₁ − λᵢ)(λᵢ₋₁ − λᵢ)⋯(λ₂ − λᵢ)(λ₁ − λᵢ)qᵢ = 0.
Because λⱼ ≠ λᵢ for j ≠ i, we have qᵢ = 0. By (8.2.4), ∑_{j=1}^{rᵢ} xᵢⱼpᵢⱼ = 0. Because pᵢ₁, pᵢ₂, ⋯, pᵢᵣᵢ are linearly independent, xᵢⱼ = 0 for j ∈ {1, 2, ⋯, rᵢ}. Since i ∈ Iₖ was arbitrary, the result follows.

Theorem 8.2.3 shows that for each eigenvalue λᵢ, we can find linearly independent eigenvectors. For example, for each eigenvalue λᵢ, we can take all the basis vectors of E_{λᵢ}; putting them together, the set of all these vectors is linearly independent.
The following result gives the largest number of linearly independent eigenvectors of A.
Theorem 8.2.4.
Let λᵢ and rᵢ be the same as in (8.1.5) for i ∈ {1, 2, ⋯, k}. Then ∑_{i=1}^{k} dim(E_{λᵢ}) is the largest number of linearly independent eigenvectors of A.
Proof
Let p₁, p₂, ⋯, pᵣ be linearly independent eigenvectors of A. We regroup the eigenvectors corresponding to λ₁, λ₂, ⋯, λₖ as

{p₁₁, p₁₂, ⋯, p₁ₛ₁}; {p₂₁, p₂₂, ⋯, p₂ₛ₂}; ⋯; {pₖ₁, pₖ₂, ⋯, pₖₛₖ}.
Theorem 8.2.5.
Let λᵢ and rᵢ be the same as in (8.1.5) for i ∈ {1, 2, ⋯, k}. Then the following assertions hold.
i. If ∑_{i=1}^{k} dim(E_{λᵢ}) = n, then A is diagonalizable.
ii. If ∑_{i=1}^{k} dim(E_{λᵢ}) < n, then A is not diagonalizable.

Proof
By Corollary 8.1.2, k ≤ ∑_{i=1}^{k} dim(E_{λᵢ}) ≤ n, and for each eigenvalue λᵢ, there are dim(E_{λᵢ}) linearly independent eigenvectors. By Theorem 8.2.3, the set of all these eigenvectors is linearly independent.
i. If ∑_{i=1}^{k} dim(E_{λᵢ}) = n, then A has n linearly independent eigenvectors. It follows from Theorem 8.2.2 that A is diagonalizable.
ii. If ∑_{i=1}^{k} dim(E_{λᵢ}) < n, then A does not have n linearly independent eigenvectors and, by Theorem 8.2.2, A is not diagonalizable.
The following result provides the methods used to determine whether an n × n matrix A is diagonalizable.
Corollary 8.2.1.
Let A be an n × n matrix and let λᵢ and rᵢ be the same as in (8.1.5) for i ∈ {1, 2, ⋯, k}.
1. If A has n distinct eigenvalues, then A is diagonalizable.
i. If dim(E_{λᵢ}) = rᵢ for each i ∈ {1, 2, ⋯, k}, then A is diagonalizable.
ii. If there exists i₀ ∈ {1, 2, ⋯, k} such that dim(E_{λ_{i₀}}) < r_{i₀}, then A is not diagonalizable.

Proof
If dim(E_{λᵢ}) = rᵢ for each i ∈ {1, 2, ⋯, k}, then

∑_{i=1}^{k} dim(E_{λᵢ}) = r₁ + r₂ + ⋯ + rₖ = n.

By Theorem 8.2.5, A is diagonalizable, so the results (1) and (i) hold. (If A has n distinct eigenvalues, then k = n and dim(E_{λᵢ}) = rᵢ = 1 for each i.) If there exists i₀ ∈ {1, 2, ⋯, k} such that dim(E_{λ_{i₀}}) < r_{i₀}, then

∑_{i=1}^{k} dim(E_{λᵢ}) < r₁ + r₂ + ⋯ + rₖ = n.

By Theorem 8.2.5, A is not diagonalizable.
By Corollary 8.2.1, we see that if each algebraic multiplicity equals the corresponding geometric multiplicity, then A is diagonalizable (see (1) and (i) in Corollary 8.2.1), and that if some geometric multiplicity is less than the corresponding algebraic multiplicity, then A is not diagonalizable (see (ii) in Corollary 8.2.1).
Example 8.2.1.
Determine whether each of the following matrices is diagonalizable:

A = ⎛ 0 1 0 ⎞   B = ⎛ 3 2 4 ⎞   C = ⎛ 1 0 0 ⎞
    ⎜ 0 0 1 ⎟       ⎜ 2 0 2 ⎟       ⎜ 1 2 0 ⎟
    ⎝ 4 −17 8 ⎠     ⎝ 4 2 3 ⎠       ⎝ −3 5 2 ⎠

Solution
For A, we have

p(λ) = |λI − A| = | λ  −1   0  |
                  | 0   λ  −1  | = λ²(λ − 8) − 4 + 17λ
                  | −4  17  λ−8 |
     = λ³ − 8λ² + 17λ − 4 = (λ − 4)(λ² − 4λ + 1)
     = (λ − 4)[(λ − 2)² − 3] = (λ − 4)[(λ − 2)² − (√3)²].

Solving p(λ) = 0 implies that λ₁ = 4, λ₂ = 2 + √3, and λ₃ = 2 − √3. Hence, A has three distinct eigenvalues. By Corollary 8.2.1 (1), A is diagonalizable.
For B, we have

p(λ) = |λI − B| = | λ−3  −2  −4  |
                  | −2    λ  −2  | = (λ + 1)²(λ − 8).
                  | −4   −2  λ−3 |

Hence, λ₁ = −1 has algebraic multiplicity 2 and λ₂ = 8 has algebraic multiplicity 1. For λ₁ = −1, the system (λ₁I − B)X = 0 becomes

⎛ −4 −2 −4 ⎞ ⎛ x₁ ⎞   ⎛ 0 ⎞
⎜ −2 −1 −2 ⎟ ⎜ x₂ ⎟ = ⎜ 0 ⎟.
⎝ −4 −2 −4 ⎠ ⎝ x₃ ⎠   ⎝ 0 ⎠

Applying R₁(−1/2), then R₁(1) + R₂ and R₁(2) + R₃, we obtain

(λ₁I − B | 0) → ⎛ 2 1 2 0 ⎞
                ⎜ 0 0 0 0 ⎟.
                ⎝ 0 0 0 0 ⎠

The corresponding system is equivalent to the single equation

2x + y + 2z = 0

and has two free variables. Thus, dim(E_{λ₁}) = 2, which equals the algebraic multiplicity of λ₁; since λ₂ = 8 is a simple eigenvalue, dim(E_{λ₂}) = 1. By Corollary 8.2.1 (i), B is diagonalizable.
For C, because

p(λ) = |λI − C| = | λ−1   0    0  |
                  | −1   λ−2   0  | = (λ − 1)(λ − 2)²,
                  | 3    −5   λ−2 |

λ₁ = 1 has algebraic multiplicity 1 and λ₂ = 2 has algebraic multiplicity 2. We find dim(E_{λ₂}). For λ₂ = 2, the system (λ₂I − C)X = 0 becomes

⎛ 1   0  0 ⎞ ⎛ x₁ ⎞   ⎛ 0 ⎞
⎜ −1  0  0 ⎟ ⎜ x₂ ⎟ = ⎜ 0 ⎟.
⎝ 3  −5  0 ⎠ ⎝ x₃ ⎠   ⎝ 0 ⎠
Applying R₁(1) + R₂ and R₁(−3) + R₃, and then interchanging R₂ and R₃, we obtain

(λ₂I − C | 0) → ⎛ 1  0  0 0 ⎞
                ⎜ 0 −5  0 0 ⎟.
                ⎝ 0  0  0 0 ⎠

The system corresponding to the last augmented matrix has one free variable and thus dim(E_{λ₂}) = 1 < 2 (the algebraic multiplicity). By Corollary 8.2.1 (ii), C is not diagonalizable.
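The multiplicity comparison in Example 8.2.1 can also be automated, using the fact that dim(E_λ) = n − rank(λI − A). The sketch below is a didactic aid, not the text's method; the small Gaussian-elimination rank routine is written here for illustration (exact rationals avoid round-off):

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix via Gaussian elimination over exact rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:   # clear the column above and below
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def geometric_multiplicity(A, lam):
    """dim(E_lam) = n - rank(lam*I - A)."""
    n = len(A)
    B = [[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return n - rank(B)

C = [[1, 0, 0], [1, 2, 0], [-3, 5, 2]]
print(geometric_multiplicity(C, 2))  # 1, while the algebraic multiplicity of 2 is 2
```

For the matrix B of the example, `geometric_multiplicity(B, -1)` returns 2, matching the algebraic multiplicity found by hand.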
Example 8.2.2.
Let

A = ⎛ 2 0 1 ⎞
    ⎜ 2 1 x ⎟.
    ⎝ 0 0 1 ⎠

Find the values of x for which A is diagonalizable.

Solution
Because

p(λ) = |λI − A| = | λ−2   0   −1 |
                  | −2   λ−1  −x | = (λ − 1)²(λ − 2) = 0,
                  | 0     0   λ−1 |

λ₁ = 1 has algebraic multiplicity 2 and λ₂ = 2 has algebraic multiplicity 1. For λ₁ = 1, the system (λ₁I − A)X = 0 becomes

⎛ −1 0 −1 ⎞ ⎛ x₁ ⎞   ⎛ 0 ⎞
⎜ −2 0 −x ⎟ ⎜ x₂ ⎟ = ⎜ 0 ⎟.
⎝ 0  0  0 ⎠ ⎝ x₃ ⎠   ⎝ 0 ⎠

Applying R₁(−2) + R₂ gives

(λ₁I − A | 0) → ⎛ −1 0 −1    0 ⎞
                ⎜ 0  0 −x+2  0 ⎟.
                ⎝ 0  0  0    0 ⎠

If x = 2, then the system corresponding to the last augmented matrix has two free variables and thus dim(E_{λ₁}) = 2 (the algebraic multiplicity); by Corollary 8.2.1 (i), A is diagonalizable. If x ≠ 2, the system has only one free variable, so dim(E_{λ₁}) = 1 < 2 and, by Corollary 8.2.1 (ii), A is not diagonalizable. Hence, A is diagonalizable if and only if x = 2.
In the above examples, we used eigenvalues or eigenvectors of a matrix to determine whether the matrix is diagonalizable. Now, we use (8.2.2) and Theorem 8.2.2 to find the diagonal matrix D and an invertible matrix P that diagonalizes A.
Example 8.2.3.
Let

A = ⎛ 4 2 ⎞.
    ⎝ 3 3 ⎠

1. Find the eigenvalues and the corresponding eigenspaces of A.
2. Find an invertible matrix P and a diagonal matrix D such that A = PDP⁻¹.
Solution
1. We have

p(λ) = |λI − A| = | λ−4  −2  |
                  | −3   λ−3 | = (λ − 4)(λ − 3) − 6
     = λ² − 7λ + 12 − 6 = λ² − 7λ + 6 = (λ − 1)(λ − 6).

Solving p(λ) = 0 gives λ₁ = 1 and λ₂ = 6. For λ₁ = 1, the system (λ₁I − A)X = 0 becomes

⎛ 1−4  −2  ⎞ ⎛ x ⎞   ⎛ −3 −2 ⎞ ⎛ x ⎞   ⎛ 0 ⎞
⎝ −3   1−3 ⎠ ⎝ y ⎠ = ⎝ −3 −2 ⎠ ⎝ y ⎠ = ⎝ 0 ⎠,

which is equivalent to the equation

3x + 2y = 0,

where x is a basic variable and y is a free variable. Let y = 3t. Then x = −2t. Hence,

⎛ x ⎞   ⎛ −2t ⎞     ⎛ −2 ⎞
⎝ y ⎠ = ⎝ 3t  ⎠ = t ⎝ 3  ⎠ = tυ₁,

where υ₁ = (−2, 3)ᵀ. Then E_{λ₁} = {tυ₁ : t ∈ ℝ}.

For λ₂ = 6, (λ₂I − A)X = 0 becomes

⎛ 6−4  −2  ⎞ ⎛ x ⎞   ⎛ 2  −2 ⎞ ⎛ x ⎞   ⎛ 0 ⎞
⎝ −3   6−3 ⎠ ⎝ y ⎠ = ⎝ −3  3 ⎠ ⎝ y ⎠ = ⎝ 0 ⎠,
which is equivalent to the equation

x − y = 0.

Let y = t. Then x = t and

⎛ x ⎞   ⎛ t ⎞     ⎛ 1 ⎞
⎝ y ⎠ = ⎝ t ⎠ = t ⎝ 1 ⎠ = tυ₂,

where υ₂ = (1, 1)ᵀ. Then E_{λ₂} = {tυ₂ : t ∈ ℝ}.

2. Let D = diag(λ₁, λ₂) = diag(1, 6) and

P = (υ₁ υ₂) = ⎛ −2 1 ⎞.
              ⎝ 3  1 ⎠

Then by Theorem 8.2.2, A = PDP⁻¹.
Remark 8.2.2.
In the above solution, we use Theorem 8.2.2 to obtain A = PDP⁻¹. In fact, we can check that AP = PD. Indeed,

AP = ⎛ 4 2 ⎞ ⎛ −2 1 ⎞   ⎛ −2 6 ⎞
     ⎝ 3 3 ⎠ ⎝ 3  1 ⎠ = ⎝ 3  6 ⎠

and

PD = ⎛ −2 1 ⎞ ⎛ 1 0 ⎞   ⎛ −2 6 ⎞
     ⎝ 3  1 ⎠ ⎝ 0 6 ⎠ = ⎝ 3  6 ⎠.
Remark 8.2.3.
As we mentioned in Remark 8.2.1, the choice of D and P is not unique. In Example 8.2.3, we can make the following choice:

D₁ = diag(λ₂, λ₁) = diag(6, 1)

and

P₁ = (υ₂ υ₁) = ⎛ 1 −2 ⎞.
               ⎝ 1  3 ⎠

By direct verification, we have AP₁ = P₁D₁ and |P₁| ≠ 0. Note that both D and P must follow the same order of the chosen eigenvalues. For example, we cannot pair D with P₁, or D₁ with P, because one can check that AP₁ ≠ P₁D and AP ≠ PD₁. Moreover, because any nonzero multiple of an eigenvector is again an eigenvector, we can choose

P = (αυ₁ βυ₂),

where α ≠ 0 and β ≠ 0.
The following example deals with the case when geometric multiplicities of eigenvalues are greater than 1. The
matrix below is taken from Example 8.2.2 with x = 2.
Example 8.2.4.
Let

A = ⎛ 2 0 1 ⎞
    ⎜ 2 1 2 ⎟.
    ⎝ 0 0 1 ⎠

Find an invertible matrix P and a diagonal matrix D such that A = PDP⁻¹.
Solution
1. We have

p(λ) = |λI − A| = | λ−2   0   −1 |
                  | −2   λ−1  −2 | = (λ − 1)²(λ − 2).
                  | 0     0   λ−1 |

Solving p(λ) = (λ − 1)²(λ − 2) = 0 implies λ₁ = 1 and λ₂ = 2. The algebraic multiplicities of λ₁ and λ₂ are 2 and 1, respectively.

For λ₁ = 1, the system (λ₁I − A)X = 0 becomes

⎛ −1 0 −1 ⎞ ⎛ x₁ ⎞   ⎛ 0 ⎞
⎜ −2 0 −2 ⎟ ⎜ x₂ ⎟ = ⎜ 0 ⎟.
⎝ 0  0  0 ⎠ ⎝ x₃ ⎠   ⎝ 0 ⎠

Applying R₁(−2) + R₂ gives

(λ₁I − A | 0) → ⎛ −1 0 −1 0 ⎞
                ⎜ 0  0  0 0 ⎟.
                ⎝ 0  0  0 0 ⎠

The system with the last augmented matrix is equivalent to the equation

x + 0y + z = 0,

where x is a basic variable and y and z are free variables. Let y = s and z = t. Then x = −t. Hence,

⎛ x ⎞   ⎛ −t ⎞     ⎛ 0 ⎞     ⎛ −1 ⎞
⎜ y ⎟ = ⎜ s  ⎟ = s ⎜ 1 ⎟ + t ⎜ 0  ⎟ = sυ₁ + tυ₂,
⎝ z ⎠   ⎝ t  ⎠     ⎝ 0 ⎠     ⎝ 1  ⎠

where υ₁ = (0, 1, 0)ᵀ and υ₂ = (−1, 0, 1)ᵀ. Then E_{λ₁} = {sυ₁ + tυ₂ : s, t ∈ ℝ}.
For λ₂ = 2, the system (λ₂I − A)X = 0 becomes

⎛ 0  0 −1 ⎞ ⎛ x₁ ⎞   ⎛ 0 ⎞
⎜ −2 1 −2 ⎟ ⎜ x₂ ⎟ = ⎜ 0 ⎟.
⎝ 0  0  1 ⎠ ⎝ x₃ ⎠   ⎝ 0 ⎠

Interchanging R₁ and R₂, then applying R₂(1) + R₃ and R₂(−2) + R₁, we obtain

(λ₂I − A | 0) → ⎛ −2 1  0 0 ⎞
                ⎜ 0  0 −1 0 ⎟.
                ⎝ 0  0  0 0 ⎠

The system with the last augmented matrix is equivalent to

−2x + y = 0,
−z = 0,

where x and z are basic variables and y is a free variable. Let y = 2t. Then x = t and z = 0. Let υ₃ = (1, 2, 0)ᵀ. Then E_{λ₂} = {tυ₃ : t ∈ ℝ}.
2. Let D = diag(λ₁, λ₁, λ₂) = diag(1, 1, 2) and

P = (υ₁ υ₂ υ₃) = ⎛ 0 −1 1 ⎞
                 ⎜ 1  0 2 ⎟.
                 ⎝ 0  1 0 ⎠

Then A = PDP⁻¹.
Remark 8.2.4.
In Example 8.2.4, there are two basis eigenvectors υ₁ and υ₂ corresponding to the eigenvalue λ₁ = 1. We chose the column vectors of P in the order υ₁, υ₂, υ₃. By Theorem 8.2.2, we can also choose the order υ₂, υ₁, υ₃. If we let

P₁ = (υ₂ υ₁ υ₃) = ⎛ −1 0 1 ⎞
                  ⎜ 0  1 2 ⎟,
                  ⎝ 1  0 0 ⎠

then it is easily verified that A = P₁DP₁⁻¹.
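A standard payoff of the factorization A = PDP⁻¹ is computing powers: Aᵏ = PDᵏP⁻¹, and Dᵏ just raises the diagonal entries to the k-th power. The sketch below (plain Python over exact fractions; it reuses the 2 × 2 matrix of Example 8.2.3 as a test case) compares this with direct repeated multiplication:

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# A = P D P^(-1) from Example 8.2.3: eigenvalues 1 and 6.
P     = [[-2, 1], [3, 1]]
P_inv = [[Fraction(-1, 5), Fraction(1, 5)],
         [Fraction(3, 5),  Fraction(2, 5)]]
k = 3
D_k = [[1 ** k, 0], [0, 6 ** k]]        # D^k = diag(1^k, 6^k)

A_k = matmul(matmul(P, D_k), P_inv)     # A^k = P D^k P^(-1)

# Compare with repeated multiplication of A itself.
A = [[4, 2], [3, 3]]
direct = matmul(matmul(A, A), A)
print(A_k == direct)  # True
```

For large k this is far cheaper than k − 1 matrix multiplications, which is why diagonalization appears in applications such as linear recurrences and Markov chains.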
Theorem 8.2.6.
Proof
Example 8.2.5.
Let

A = ⎛ 1 −1  4 ⎞
    ⎜ 3  2 −1 ⎟.
    ⎝ 2  1 −1 ⎠

Find an invertible matrix P and a diagonal matrix D such that A = PDP⁻¹.
Solution
1.
We have

p(λ) = |λI − A| = | λ−1   1   −4 |
                  | −3   λ−2   1 | = (λ + 2)(λ − 1)(λ − 3).
                  | −2   −1   λ+1 |

Solving p(λ) = 0 gives λ₁ = −2, λ₂ = 1, and λ₃ = 3.

For λ₁ = −2, (λ₁I − A)X = 0 becomes

⎛ −3  1 −4 ⎞ ⎛ x₁ ⎞   ⎛ 0 ⎞
⎜ −3 −4  1 ⎟ ⎜ x₂ ⎟ = ⎜ 0 ⎟.
⎝ −2 −1 −1 ⎠ ⎝ x₃ ⎠   ⎝ 0 ⎠
Applying R₃(−1) + R₁, then R₁(−3) + R₂ and R₁(−2) + R₃, gives

(λ₁I − A | 0) → ⎛ −1   2  −3 0 ⎞
                ⎜ 0  −10  10 0 ⎟.
                ⎝ 0   −5   5 0 ⎠

Applying R₂(−1/10) and R₃(−1/5), then R₂(−1) + R₃ and R₂(−2) + R₁, we obtain

⎛ −1 0 −1 0 ⎞
⎜ 0  1 −1 0 ⎟.
⎝ 0  0  0 0 ⎠

The system with the last augmented matrix is equivalent to

x₁ + x₃ = 0,
x₂ − x₃ = 0,

where x₁ and x₂ are basic variables and x₃ is a free variable. Let x₃ = t. Then x₂ = t and x₁ = −t. Let υ₁ = (−1, 1, 1)ᵀ. Then

E_{λ₁} = {tυ₁ : t ∈ ℝ}.
For λ₂ = 1, row reducing (λ₂I − A | 0) in the same way, we obtain the system

x₁ + x₃ = 0,
x₂ − 4x₃ = 0,

where x₁ and x₂ are basic variables and x₃ is a free variable. Let x₃ = t. Then x₂ = 4t and x₁ = −t. Let υ₂ = (−1, 4, 1)ᵀ. Then

E_{λ₂} = {tυ₂ : t ∈ ℝ}.

For λ₃ = 3, row reducing (λ₃I − A | 0) gives the system

x₁ − x₃ = 0,
x₂ − 2x₃ = 0,

where x₁ and x₂ are basic variables and x₃ is a free variable. Let x₃ = t. Then x₂ = 2t and x₁ = t. Let υ₃ = (1, 2, 1)ᵀ. Then

E_{λ₃} = {tυ₃ : t ∈ ℝ}.

2. Let D = diag(−2, 1, 3) and P = (υ₁ υ₂ υ₃). Then A = PDP⁻¹.
Exercises
1. For each of the following matrices, determine whether it is diagonalizable.
A = ⎛ −1 2 4 0 ⎞   B = ⎛ −1 4 3 ⎞
    ⎜ 0 3 1 7 ⎟       ⎜ 0 −1 1 ⎟
    ⎜ 0 0 5 8 ⎟       ⎝ 0 −4 3 ⎠
    ⎝ 0 0 0 −2 ⎠
4. Let
A = ⎛ 1 −1 ⎞.
    ⎝ −4 1 ⎠
5. Let
A = ⎛ 3 2 4 ⎞
    ⎜ 2 0 2 ⎟.
    ⎝ 4 2 3 ⎠
6. Let
A = ⎛ 0 0 −2 ⎞
    ⎜ 1 2 1 ⎟.
    ⎝ 1 0 3 ⎠
7. Let
A = ⎛ 1 −1 1 ⎞
    ⎜ 3 −3 2 ⎟.
    ⎝ 0 0 1 ⎠
8. Let
A = ⎛ 0 1 ⎞.
    ⎝ −1 0 ⎠
9. Two n × n matrices A and B are said to be similar (or A is similar to B) if there exists an invertible n × n matrix P such that B = P⁻¹AP. We write A ~ B. Let

A = ⎛ 1 −1 ⎞, B = ⎛ 1 −2 ⎞, and P = ⎛ 0 1 ⎞.
    ⎝ 2 1 ⎠      ⎝ 1 1 ⎠          ⎝ −1 0 ⎠
11. Assume that A, B, C are n × n matrices. Show that the following assertions hold.
i. A ~ A.
ii. If A ~ B, then B ~ A.
iii. If A ~ B and B ~ C, then A ~ C.
12. Assume that A and B are n × n matrices and A ~ B. Prove that the following assertions hold.
a. Aᵏ ~ Bᵏ for every positive integer k.
b. Let ϕ(λ) = aₘλᵐ + aₘ₋₁λᵐ⁻¹ + ⋯ + a₁λ + a₀. Then ϕ(A) ~ ϕ(B).
Chapter 9 Vector spaces
9.1 Subspaces of ℝⁿ
Definition 9.1.1.
Let W be a nonempty subset of ℝⁿ. Then W is said to be a subspace of ℝⁿ if the following two conditions hold:
i. If a = (x₁, x₂, ⋯, xₙ) ∈ W and b = (y₁, y₂, ⋯, yₙ) ∈ W, then

a + b = (x₁ + y₁, x₂ + y₂, ⋯, xₙ + yₙ) ∈ W.

ii. If a = (x₁, x₂, ⋯, xₙ) ∈ W and k ∈ ℝ is a real number, then

ka = (kx₁, kx₂, ⋯, kxₙ) ∈ W.

Note that the above conditions (i) and (ii) are equivalent to the following single condition: for a = (x₁, x₂, ⋯, xₙ) ∈ W, b = (y₁, y₂, ⋯, yₙ) ∈ W, α ∈ ℝ, and β ∈ ℝ,

(9.1.1)
αa + βb ∈ W.
By Definition 9.1.1, we see that ℝⁿ is a subspace of itself, and the set containing only the origin,

W = {(0, 0, ⋯, 0)},

is a subspace of ℝⁿ.

In Definition 9.1.1 (ii), if k = 0 and a ∈ W, then ka = 0 ∈ W. This shows that every subspace of ℝⁿ contains the zero vector 0. Hence, we have the following remark.

Remark 9.1.1.
Any nonempty subset of ℝⁿ that does not contain the origin of ℝⁿ is not a subspace of ℝⁿ.
The following result shows that the solution space of a homogeneous system (see (4.1.13) ) is a subspace
of ℝ n.
Theorem 9.1.1.
Let A be an m × n matrix. Then

N_A = {X ∈ ℝⁿ : AX = 0}

is a subspace of ℝⁿ.
Proof
Note that X ∈ N_A if and only if AX = 0. Let a ∈ N_A, b ∈ N_A, and k ∈ ℝ. Then Aa = 0 and Ab = 0. Hence, by Theorem 2.2.1,

A(a + b) = Aa + Ab = 0 + 0 = 0 and A(ka) = kAa = k0 = 0.

This implies a + b ∈ N_A and ka ∈ N_A. By Definition 9.1.1, N_A is a subspace of ℝⁿ.
The following result shows that the set of solutions of a nonhomogeneous system AX = b is not a subspace of ℝⁿ because it does not contain the zero vector of ℝⁿ.

Theorem 9.1.2.
Let A be an m × n matrix and b ∈ ℝᵐ, and let

S_A = {X ∈ ℝⁿ : AX = b}.

If AX = b is consistent and b ≠ 0, then S_A is not a subspace of ℝⁿ.

Proof
Because AX = b is consistent, AX = b has at least one solution and the set S_A is nonempty. Because b ≠ 0, 0 is not a solution of AX = b. Hence, 0 ∉ S_A. By Remark 9.1.1, S_A is not a subspace of ℝⁿ.
What types of subsets of ℝⁿ are subspaces of ℝⁿ? In the following, we find all the subspaces of ℝ² and ℝ³. The same approach can be used to deal with ℝⁿ for n ≥ 4. First, we consider subspaces of ℝ².

Example 9.1.1.
Show that

W = {(x, y) ∈ ℝ² : x + 2y = 0}

is a subspace of ℝ².
Solution
Let A = (1 2) and X = (x, y)ᵀ. Then

W = {(x, y) ∈ ℝ² : x + 2y = 0} = {(x, y) ∈ ℝ² : AX = 0}.

By Theorem 9.1.1, W is a subspace of ℝ².

Note that x + 2y = 0 is the equation of a line passing through the origin (0, 0). By Theorem 9.1.1, one can show that any line passing through the origin (0, 0) is a subspace of ℝ². The equation of a line passing through the origin (0, 0) is of the form

ax + by = 0,

where a and b are not both zero.

Conclusion: The subspaces of ℝ² are exactly the origin {(0, 0)}, the entire space ℝ², and all the lines passing through the origin (0, 0). Any other subset of ℝ² is not a subspace of ℝ². We do not prove this result here, but we give two examples of sets that are not subspaces of ℝ².
Example 9.1.2.
Show that

W = {(x, y) ∈ ℝ² : x + 2y = 1}

is not a subspace of ℝ².

Solution
Because (x, y) = (0, 0) is not a solution of x + 2y = 1, (0, 0) ∉ W. By Remark 9.1.1, W is not a subspace of ℝ².
Note that x + 2y = 1 is the equation of a line that does not pass through the origin (0, 0). By Theorem 9.1.2, one can see that any line that does not pass through the origin (0, 0) is not a subspace of ℝ². The equation of a line that does not pass through the origin (0, 0) is of the form

ax + by = c,

where c ≠ 0.
Example 9.1.3.
Show that

W = {(x, y) ∈ ℝ² : y = x²}

is not a subspace of ℝ².

Solution
Let (x₁, y₁) = (1, 1) and k = 2. Then (x₁, y₁) ∈ W and

k(x₁, y₁) = (2, 2) ∉ W

because 2 = ky₁ ≠ (kx₁)² = 2² = 4. Hence, W does not satisfy condition (ii) of Definition 9.1.1 and W is not a subspace of ℝ².
Example 9.1.4.
Show that

W = {(x, y, z) ∈ ℝ³ : x + 2y + 3z = 0}

is a subspace of ℝ³.

Solution
Let A = (1 2 3) and X = (x, y, z)ᵀ. Then

W = {(x, y, z) ∈ ℝ³ : x + 2y + 3z = 0} = {(x, y, z) ∈ ℝ³ : AX = 0}.

By Theorem 9.1.1, W is a subspace of ℝ³.

Note that x + 2y + 3z = 0 is the equation of a plane passing through the origin (0, 0, 0). By Definition 9.1.1 or Theorem 9.1.1, one can show that any plane passing through the origin (0, 0, 0) is a subspace of ℝ³. The equation of a plane passing through the origin (0, 0, 0) is of the form

ax + by + cz = 0,

where a, b, and c are not all zero.
Example 9.1.5.
Show that

W = {(x, y, z) ∈ ℝ³ : 3x − 2y + 3z = 0 and x − 3y + 5z = 0}

is a subspace of ℝ³.

Solution
Let

A = ⎛ 3 −2 3 ⎞, X = (x, y, z)ᵀ, and 0 = (0, 0, 0)ᵀ.
    ⎝ 1 −3 5 ⎠

Then

W = {(x, y, z) ∈ ℝ³ : AX = 0}.

By Theorem 9.1.1, W is a subspace of ℝ³.
Example 9.1.6.
Show that

W = {(x, y, z) ∈ ℝ³ : x + 2y + 3z = 1}

is not a subspace of ℝ³.

Solution
Note that (x, y, z) ∈ W if and only if x + 2y + 3z = 1. Because 0 + 2(0) + 3(0) ≠ 1, (0, 0, 0) ∉ W and, by Remark 9.1.1, W is not a subspace of ℝ³.

Note that x + 2y + 3z = 1 is the equation of a plane that does not pass through the origin (0, 0, 0). In general, one can show that any plane that does not pass through the origin (0, 0, 0) is not a subspace of ℝ³. The equation of a plane that does not pass through the origin (0, 0, 0) is of the form

ax + by + cz = d,

where d ≠ 0.
Recall that the parametric equations of a line in ℝ³ are

(9.1.2)
x = x₀ + ta, y = y₀ + tb, z = z₀ + tc, −∞ < t < ∞;

see Theorem 6.2.1. Let X = (x, y, z), P₀ = (x₀, y₀, z₀), and υ = (a, b, c). Then (9.1.2) can be written as

X = P₀ + tυ, −∞ < t < ∞.

If the line does not pass through the origin, then P₀ ≠ (0, 0, 0). If the line passes through the origin, then P₀ = (0, 0, 0) and the equation of the line becomes

X = tυ, −∞ < t < ∞.
The following example shows that any line passing through the origin (0, 0, 0) is a subspace of ℝ³.

Example 9.1.7.
Let υ ∈ ℝ³ be a given nonzero vector. Show that

W = span{υ} = {tυ : t ∈ ℝ}

is a subspace of ℝ³.

Solution
Note that a ∈ W if and only if there exists t ∈ ℝ such that a = tυ. Let a ∈ W, b ∈ W, and k ∈ ℝ. Then there exist t₁, t₂ ∈ ℝ such that a = t₁υ and b = t₂υ. Hence,

a + b = t₁υ + t₂υ = (t₁ + t₂)υ ∈ W

and

ka = k(t₁υ) = (kt₁)υ ∈ W.

By Definition 9.1.1, W is a subspace of ℝ³.
Similar to ℝ², the subspaces of ℝ³ are exactly the origin {(0, 0, 0)}, the entire space ℝ³, all the planes passing through the origin (0, 0, 0), and all the lines passing through the origin (0, 0, 0). Any other subset of ℝ³ is not a subspace of ℝ³.
Theorem 9.1.3.
Let a₁, a₂, ⋯, aₙ ∈ ℝᵐ. Then span{a₁, a₂, ⋯, aₙ} is a subspace of ℝᵐ.

Proof
Let W = span{a₁, a₂, ⋯, aₙ}. Note that u ∈ W if and only if there exist x₁, x₂, ⋯, xₙ ∈ ℝ such that

u = x₁a₁ + x₂a₂ + ⋯ + xₙaₙ.

Let u ∈ W, υ ∈ W, and k ∈ ℝ. Then there exist x₁, x₂, ⋯, xₙ ∈ ℝ and y₁, y₂, ⋯, yₙ ∈ ℝ such that

u = x₁a₁ + x₂a₂ + ⋯ + xₙaₙ and υ = y₁a₁ + y₂a₂ + ⋯ + yₙaₙ.

Hence,

u + υ = x₁a₁ + x₂a₂ + ⋯ + xₙaₙ + y₁a₁ + y₂a₂ + ⋯ + yₙaₙ
      = (x₁ + y₁)a₁ + (x₂ + y₂)a₂ + ⋯ + (xₙ + yₙ)aₙ ∈ W

and

ku = (kx₁)a₁ + (kx₂)a₂ + ⋯ + (kxₙ)aₙ ∈ W.

By Definition 9.1.1, W is a subspace of ℝᵐ.
Exercises
ℝ²₊ = {(x, y) : x ≥ 0 and y ≥ 0}.

i. Show that if a ∈ ℝ²₊ and b ∈ ℝ²₊, then a + b ∈ ℝ²₊.
ii. Show that ℝ²₊ is not a subspace of ℝ² by using Definition 9.1.1.
4. Let

W = ℝ²₊ ∪ (−ℝ²₊).

i. Show that if a ∈ W and k is a real number, then ka ∈ W.
ii. Show that W is not a subspace of ℝ² by using Definition 9.1.1.

W = {(x, y, z, w) ∈ ℝ⁴ : x − 2y + 2z − w = 0 and −x + 3y + 4z + 2w = 0}.

W = {(x, y, z) ∈ ℝ³ : x + 2y + 3z = 0, −x + y − 4z = 0, and x − 2y + 5z = 0}.
11. Let W = {(0, 0, ⋯, 0)} ⊂ ℝⁿ. Show that W is a subspace of ℝⁿ.

W = {tυ : t ∈ ℝ}
9.2 Vector Spaces
Definition 9.2.1.
We always use capital letters such as A, B, C, V, W, X, Y, Z to denote sets. Each object in a set is called an element. We denote an element by small letters such as a, b, c, v, w, x, y, z. We use the symbol x ∈ A to denote “x belongs to A” or “x is in A,” and the symbol x ∉ A to denote “x does not belong to A” or “x is not in A.” When a set A is a collection of objects x that satisfy some property P, we write A = {x : x satisfies P}.

Example 9.2.1.
6. A = {(x, y) : x² + y² = 1}.
We use B ⊂ A to denote that A contains B or that B is a subset of A; see Figure 9.1. It is easy to see that

ℕ ⊂ ℤ ⊂ ℚ ⊂ ℝ.

A set is a collection of objects in which there may be no operations. If we introduce suitable operations on a set, then the set becomes more useful. For example, we introduce addition + and scalar multiplication ⋅, together with the dot product, on ℝⁿ. We use the addition + and scalar multiplication ⋅ in ℝⁿ to study linear combinations of vectors, spanning spaces, linear independence, bases, linear transformations, and eigenvalues.

Recall some sets on which we introduced suitable addition and scalar multiplication.

1. In ℝⁿ: for a = (a₁, ⋯, aₙ), b = (b₁, ⋯, bₙ), and k ∈ ℝ, the addition + and scalar multiplication ⋅ are defined by
1. a + b = (a₁ + b₁, ⋯, aₙ + bₙ),
2. k ⋅ a = (ka₁, ⋯, kaₙ).
2. For m × n matrices, the addition + and scalar multiplication ⋅ are defined as follows. Let A = (aᵢⱼ), B = (bᵢⱼ), and k ∈ ℝ. Then
1. A + B = (aᵢⱼ + bᵢⱼ),
2. k ⋅ A = (kaᵢⱼ).
3. V = {f : f : ℝ → ℝ is a function}, with the usual addition and scalar multiplication of functions.

Let V be a nonempty set with addition + and scalar multiplication ⋅. Each element of V is called a vector. We denote a vector by a small letter in bold such as u. When V = ℝⁿ, a vector u is the same as a vector u⃗ of ℝⁿ.
Definition 9.2.2.
Let V be a nonempty set with addition + and scalar multiplication ⋅. The set V is called a vector space if the addition and scalar multiplication satisfy the following ten axioms:
1. If u, v ∈ V, then u + v ∈ V.
2. u + v = v + u.
3. If w ∈ V, then u + (v + w) = (u + v) + w.
4. There exists a zero vector 0 ∈ V such that 0 + u = u + 0 = u for each u ∈ V.
5. For each u ∈ V, there exists −u ∈ V such that u + (−u) = (−u) + u = 0. [−u is called the negative, or additive inverse, of u]
6. If k ∈ ℝ and u ∈ V, then ku ∈ V.
7. k(u + v) = ku + kv.
8. (k + l)u = ku + lu.
9. k(lu) = (kl)u.
10. 1u = u.
Remark 9.2.1.
1. V is a set.
2. In V, there are two operations: addition and scalar multiplication.
3. The two operations satisfy the ten axioms.
Remark 9.2.2.
Let V be a vector space, u ∈ V, and k ∈ ℝ. Then:
a. 0u = 0;
b. k0 = 0;
c. if ku = 0, then k = 0 or u = 0.
Now, we provide several examples of vector spaces. The first one is the n-dimensional Euclidean space.
Example 9.2.2.
The n-dimensional Euclidean space ℝⁿ is a vector space with the standard addition + and scalar multiplication ⋅ given in Definition 1.1.5.

Solution
Let u = (u₁, u₂, ⋯, uₙ) ∈ ℝⁿ, υ = (υ₁, υ₂, ⋯, υₙ) ∈ ℝⁿ, and k ∈ ℝ. Then

u + υ = (u₁ + υ₁, u₂ + υ₂, ⋯, uₙ + υₙ) ∈ ℝⁿ

and

ku = (ku₁, ku₂, ⋯, kuₙ) ∈ ℝⁿ.

Hence, axioms 1 and 6 hold. It is easy to verify that the other eight axioms given in Definition 9.2.2 hold. Hence, ℝⁿ is a vector space with the standard operations + and ⋅.
Example 9.2.3.
Let V = {A : A is a 2 × 2 matrix}. Show that V is a vector space with the standard matrix addition and matrix scalar multiplication given in Definition 2.1.4.
Solution
Let u = (uᵢⱼ), v = (vᵢⱼ) ∈ V and k, l ∈ ℝ. The axioms of Definition 9.2.2 can be verified directly; in particular:

3. If w ∈ V, then u + (v + w) = (u + v) + w.

4. Let 0 = ⎛ 0 0 ⎞. Then 0 + u = u + 0 = u.
           ⎝ 0 0 ⎠

5. −u = ⎛ −u₁₁ −u₁₂ ⎞ ∈ V.
        ⎝ −u₂₁ −u₂₂ ⎠

6. ku = ⎛ ku₁₁ ku₁₂ ⎞ ∈ V.
        ⎝ ku₂₁ ku₂₂ ⎠

8. (k + l)u = ku + lu.
9. k(lu) = (kl)u.
10. 1u = u.
By a method similar to Example 9.2.3, one can prove the following result.

Example 9.2.4.
Let V = {A : A is an m × n matrix} with the standard matrix addition and matrix scalar multiplication given in Definition 2.1.4. Then V is a vector space.
Example 9.2.5.
Let V = {f : f : ℝ → ℝ is a function} with the addition (f + g)(x) = f(x) + g(x) and scalar multiplication (k ⋅ f)(x) = kf(x). Show that V is a vector space.

Solution
It is obvious that if f, g ∈ V, then f + g ∈ V, and that if k ∈ ℝ and f ∈ V, then kf ∈ V. Hence, axioms (1) and (6) hold. Let 0̂ denote the zero function defined on ℝ, that is, 0̂(x) = 0 for each x ∈ ℝ. Then 0̂ + f = f + 0̂ = f for each f ∈ V, and axiom (4) holds. It is easy to see that the other axioms hold.
Example 9.2.6.
Let V = ℝ² with the following operations:

1. For u = (u₁, u₂) and υ = (υ₁, υ₂), u + υ = (u₁ + υ₁, u₂ + υ₂).
2. For k ∈ ℝ and u = (u₁, u₂), k ⋅ u = (ku₁, 0).

Determine whether V is a vector space under these operations.

Solution
The addition given in (1) is the standard addition, but the scalar multiplication given in (2) is not the standard scalar multiplication. The axiom (10) does not hold for the scalar multiplication given in (2). For example, let u = (2, 3); then by the scalar multiplication given in (2), we have

1 ⋅ u = (1(2), 0) = (2, 0) ≠ (2, 3) = u.

Hence, V is not a vector space under the two operations (1) and (2).
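The failing axiom in Example 9.2.6 can be exhibited in a couple of lines. The sketch below encodes the nonstandard scalar multiplication (2) and shows that 1 ⋅ u ≠ u:

```python
def scalar_mul(k, u):
    """Nonstandard scalar multiplication of Example 9.2.6: k.(u1, u2) = (k*u1, 0)."""
    return (k * u[0], 0)

u = (2, 3)
print(scalar_mul(1, u))        # (2, 0)
print(scalar_mul(1, u) == u)   # False: axiom (10), 1.u = u, fails
```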
In a vector space V, there are two operations, addition and scalar multiplication, which satisfy the ten axioms. Hence, similar to Section 1.4, we can introduce linear combinations and spanning spaces, and, similar to Sections 7.1, 7.2, and 7.4, we can introduce linear independence, bases, and coordinates. Also, we can introduce linear transformations from one vector space to another, and eigenvalues and eigenvectors of linear transformations. We can generalize the dot product of two vectors and the norm of a vector from ℝⁿ to vector spaces and study related problems such as those in Sections 1.7, 1.8, and 8.5. We do not discuss these further in this book. We only generalize the notion of subspaces of ℝⁿ given in Section 9.1 to vector spaces.
Definition 9.2.3.
Let V be a vector space and let W be a nonempty subset of V. W is called a subspace of V if W is itself a vector space under the addition and scalar multiplication defined on V.

To show that W is a subspace of V according to Definition 9.2.3, we would have to verify that the addition and scalar multiplication defined on V satisfy the ten axioms given in Definition 9.2.2, with V replaced by W. However, in the following, we show that it suffices to verify that axioms (1) and (6) hold.
Checking the axioms (1) to (10), we see that each of the six axioms (2), (3), (7)–(10) depends only on the addition and scalar multiplication, while the other four axioms (1), (4), (5), and (6) depend on both the operations and the set involved. Therefore, if V is a vector space and we consider a nonempty subset W of V under the same addition and scalar multiplication as V, then the six axioms (2), (3), (7)–(10) hold automatically for W because they depend only on the operations. Hence, to verify that W is a subspace of V, we only need to verify that the addition and scalar multiplication satisfy the four axioms (1), (4), (5), and (6). The following result shows that if axioms (1) and (6) hold, then axioms (4) and (5) hold. Hence, we only need to verify that axioms (1) and (6) hold to show that W is a subspace of V.
Theorem 9.2.1.
Let V be a vector space and W be a nonempty subset of V. Then W is a subspace of V if and only if the
following conditions hold.
a. if u, υ ∈ W, then u + υ ∈ W .
b. if u ∈ W, k ∈ R, then ku ∈ W .
Proof
It is sufficient to show that, under (a) and (b), the axioms (4) and (5) hold for W. Indeed, let k = 0 and u ∈ W. Because V is a vector space, ku = 0u = 0. By (b), ku = 0 ∈ W. Therefore, 0 ∈ W. Because V is a vector space, 0 + u = u + 0 = u for each u ∈ W. Hence, axiom (4) holds. By (b), −u = (−1)u ∈ W for each u ∈ W. Because V is a vector space, u + (−u) = (−u) + u = 0. Hence, axiom (5) holds.
In Theorem 9.2.1, if V = ℝⁿ, then Theorem 9.2.1 is the same as Definition 9.1.1. We refer to Section 9.1 for the discussion of subspaces of ℝⁿ. In the following, we give examples of subspaces that do not live in ℝⁿ.
Example 9.2.7.
Let
W = {A : A is an n × n symmetric matrix}.
Show that W is a vector subspace of Mn×n , where Mn×n is the vector space of all n × n matrices.
Solution
By (2.3.3), A is symmetric if and only if Aᵀ = A. Let A, B ∈ W and k ∈ ℝ. Then Aᵀ = A and Bᵀ = B. By Theorem 2.1.1,

a. (A + B)ᵀ = Aᵀ + Bᵀ = A + B, thus A + B ∈ W;
b. (kA)ᵀ = kAᵀ = kA, thus kA ∈ W.

By Theorem 9.2.1, W is a subspace of Mₙ×ₙ.
Exercises
1. Let
2. Let
Show that C[a, b] is a vector space under the standard addition and scalar multiplication:
1. For f, g ∈ C[a, b], (f + g)(x) = f(x) + g(x) for x ∈ [a, b].
2. For k ∈ ℝ and f ∈ C[a, b], (k ⋅ f)(x) = kf(x) for x ∈ [a, b].

V = {A : A is a 3 × 3 matrix}.

Then V is not a vector space under the above addition and scalar multiplication.

5. Let

C¹[a, b] = {f : f′ ∈ C[a, b]},

where f′(x) denotes the derivative of f at x. Show that C¹[a, b] is a vector subspace of C[a, b].
Chapter 10 Complex numbers
Consider the quadratic equation

(10.1.1)
aλ² + bλ + c = 0,

where a ≠ 0. Multiplying (10.1.1) by 4a gives

4a²λ² + 4abλ + 4ac = 0,

that is,

(10.1.2)
(2aλ + b)² = b² − 4ac.

If Δ := b² − 4ac ≥ 0, then

2aλ + b = √(b² − 4ac) or 2aλ + b = −√(b² − 4ac).

It follows that

(10.1.3)
λ₁ := (−b + √(b² − 4ac))/(2a) and λ₂ := (−b − √(b² − 4ac))/(2a).

If Δ < 0, then (10.1.1) has no real roots. Hence, when Δ < 0, we need to extend the set of real numbers ℝ to a larger set, denoted by ℂ, such that (10.1.1) still has roots in the larger set. What is the set ℂ?

If Δ < 0, then −(b² − 4ac) > 0 and √(−(b² − 4ac)) is a real number. We introduce a symbol

(10.1.4)
i = √(−1).

Then

√Δ = √((−1)[−(b² − 4ac)]) = √(−1) √(−(b² − 4ac)) = i√(−(b² − 4ac)).

Hence,

(10.1.5)
λ₁ = (−b + i√(−(b² − 4ac)))/(2a) and λ₂ = (−b − i√(−(b² − 4ac)))/(2a).

Hence, the quadratic equation (10.1.1) always has two roots: λ₁ and λ₂, given in (10.1.3) or (10.1.5).
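Formulas (10.1.3) and (10.1.5) can be folded into a single routine by working in the complex numbers throughout. A sketch using Python's cmath, whose square root returns a complex result for negative arguments:

```python
import cmath

def quadratic_roots(a, b, c):
    """Roots of a*lam**2 + b*lam + c = 0 by the quadratic formula; assumes a != 0."""
    sqrt_delta = cmath.sqrt(b * b - 4 * a * c)   # complex sqrt handles Delta < 0
    return (-b + sqrt_delta) / (2 * a), (-b - sqrt_delta) / (2 * a)

print(quadratic_roots(1, -3, 2))   # Delta > 0: the real roots 2 and 1 (as complex numbers)
print(quadratic_roots(1, 4, 13))   # Delta < 0: (-2+3j) and (-2-3j)
```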
Example 10.1.1.
Solve the equation

λ² + 4λ + 13 = 0.

Solution
Because a = 1, b = 4, and c = 13,

√Δ = √(b² − 4ac) = √(4² − 4 × 13) = √(−36) = √(−1)√36 = 6i.

By (10.1.5),

λ₁ = (−4 + 6i)/2 = −2 + 3i and λ₂ = (−4 − 6i)/2 = −2 − 3i.
By (10.1.5), we see that λ₁ and λ₂ can be written in the form

x + yi,

where x and y are real numbers. If y ≠ 0, then x + yi is not a real number. Hence, we introduce the following definition.

Definition 10.1.1.
z = x + yi

is called a complex number, where x and y are real numbers and i is given in (10.1.4). x is called the real part of z and is denoted by Re z = x, and y is called the imaginary part of z and is denoted by Im z = y. i = √(−1) is called the imaginary unit and i² = −1. We denote by ℂ the set of all complex numbers.
Example 10.1.2.
Solution
The following definition shows that two complex numbers are equal if and only if their corresponding real parts and imaginary parts are the same.

Definition 10.1.2.
Let z₁ = x₁ + y₁i and z₂ = x₂ + y₂i. Then z₁ = z₂ if and only if x₁ = x₂ and y₁ = y₂.
Example 10.1.3.
Find the real numbers x and y satisfying each of the following equations.
a. 3x + 4i = 6 + 2yi
b. (x + y) + (x − y)i = 5 − i

Solution
a. By Definition 10.1.2, 3x = 6 and 4 = 2y. This implies that x = y = 2.
b. By Definition 10.1.2, we have x + y = 5 and x − y = −1. Solving the system, we obtain x = 2 and y = 3.
Definition 10.1.3.
Let z₁ = x₁ + y₁i and z₂ = x₂ + y₂i. Then:
i. z₁ + z₂ = (x₁ + x₂) + (y₁ + y₂)i.
ii. z₁ − z₂ = (x₁ − x₂) + (y₁ − y₂)i.
iii. z₁z₂ = (x₁x₂ − y₁y₂) + (x₁y₂ + x₂y₁)i.
iv. If z₂ ≠ 0,

z₁/z₂ = (x₁x₂ + y₁y₂)/(x₂² + y₂²) + ((x₂y₁ − x₁y₂)/(x₂² + y₂²)) i.
Remark 10.1.1.
Note that

z₁z₂ = (x₁ + y₁i)(x₂ + y₂i) = x₁x₂ + x₁y₂i + x₂y₁i + y₁y₂i² = (x₁x₂ − y₁y₂) + (x₁y₂ + x₂y₁)i.

In particular:
1. (x + yi)(x − yi) = x² − (yi)² = x² + y².
2. (x + yi)² = x² + 2x(yi) + (yi)² = x² + 2xyi − y² = (x² − y²) + 2xyi.
3. (x − yi)² = x² − 2x(yi) + (yi)² = x² − 2xyi − y² = (x² − y²) − 2xyi.
Remark 10.1.2.
By using the above formula (1), the division (iv) can be obtained in the following way:

z₁/z₂ = (x₁ + y₁i)/(x₂ + y₂i) = ((x₁ + y₁i)(x₂ − y₂i))/((x₂ + y₂i)(x₂ − y₂i))
      = (x₁x₂ − x₁y₂i + x₂y₁i − y₁y₂i²)/(x₂² − (y₂i)²)
      = ((x₁x₂ + y₁y₂) + (x₂y₁ − x₁y₂)i)/(x₂² + y₂²).
Example 10.1.4.
Let z1 = 4 − 5i and z2 = −1 + 6i. Express the following complex numbers in the form x + yi .
… 5/11
Chapter 10 Complex numbers
1. z1 + z2
2. z1 − z2
3. 3z1
4. −z1
5. z1 z2
6. z
z 1
Solution
1. z1 + z2 = (4 − 5i) + (−1 + 6i) = (4 − 1) + (−5 + 6)i = 3 + i.
2. z1 − z2 = (4 − 5i) − (−1 + 6i) = (4 − (−1)) + (−5 − 6)i = 5 − 11i.
3. 3z1 = 3(4 − 5i) = 3 × 4 − 3 × 5i = 12 − 15i.
4. −z1 = −(4 − 5i) = −4 + 5i.
5. z1z2 = (4 − 5i)(−1 + 6i) = 4(−1 + 6i) − (5i)(−1 + 6i) = −4 + 24i + 5i − 30i² = −4 + 29i + 30 = 26 + 29i.
6. z1/z2 = (4 − 5i)/(−1 + 6i) = ((4 − 5i)(−1 − 6i))/((−1 + 6i)(−1 − 6i)) = (−4 − 24i + 5i − 30)/((−1)² − (6i)²) = (−34 − 19i)/(1 + 36) = −34/37 − (19/37)i.
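The arithmetic in Example 10.1.4 can be verified with Python's built-in complex type, where the literal j plays the role of i (a quick numerical check, not part of the text):

```python
# z1 and z2 from Example 10.1.4; Python writes a + bi as (a+bj).
z1 = 4 - 5j
z2 = -1 + 6j

assert z1 + z2 == 3 + 1j       # part 1
assert z1 - z2 == 5 - 11j      # part 2
assert 3 * z1 == 12 - 15j      # part 3
assert -z1 == -4 + 5j          # part 4
assert z1 * z2 == 26 + 29j     # part 5
# part 6: z1/z2 = -34/37 - (19/37)i, up to floating-point rounding
assert abs(z1 / z2 - (-34/37 - (19/37) * 1j)) < 1e-12
```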
Definition 10.1.4.
Let z = x + yi be a complex number. The modulus of z is defined by
|z| = √(x² + y²).
Example 10.1.5.
Find |z| for z = 3 − 4i.
Solution
|z| = √(3² + (−4)²) = √(9 + 16) = √25 = 5.
The following result gives some properties of the modulus of complex numbers.
Proposition 10.1.1.
Let z1, z2 ∈ C. Then |z1z2| = |z1||z2| and, if z2 ≠ 0, |z1/z2| = |z1|/|z2|.
Example 10.1.6.
Find the modulus of
(1 + i)(1 − 2i)/(i(1 − i)(2 + 3i)).
Solution
|(1 + i)(1 − 2i)/(i(1 − i)(2 + 3i))| = (|1 + i||1 − 2i|)/(|i||1 − i||2 + 3i|) = (√2·√5)/((1)√2·√13) = √5/√13 = √65/13.
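Both results can be confirmed with abs(), which computes the modulus of a Python complex number (a verification sketch):

```python
import math

# Example 10.1.5: |3 - 4i| = 5
assert abs(3 - 4j) == 5.0

# Example 10.1.6: the modulus of a product or quotient is the
# product or quotient of the moduli (Proposition 10.1.1)
w = (1 + 1j) * (1 - 2j) / (1j * (1 - 1j) * (2 + 3j))
assert math.isclose(abs(w), math.sqrt(65) / 13)
```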
Definition 10.1.5.
Let z = x + yi. The conjugate of z is defined by z̄ = x − yi.
Example 10.1.7.
Find the conjugate of each of the following complex numbers:
z1 = 3 + 2i; z2 = −4 − 2i; z3 = i; z4 = 4.
Solution
z̄1 = 3 − 2i, z̄2 = −4 + 2i, z̄3 = −i, and z̄4 = 4.
Proposition 10.1.2.
Let z, z1, z2 ∈ C. Then the following assertions hold.
i. The conjugate of z̄ is z.
ii. z̄ = z if and only if z is a real number.
iii. z̄ = −z if and only if z is a pure imaginary number.
iv. The conjugate of z1 + z2 is z̄1 + z̄2.
v. The conjugate of z1 − z2 is z̄1 − z̄2.
vi. The conjugate of z1 · z2 is z̄1 · z̄2, and the conjugate of αz is αz̄ for α ∈ R.
vii. The conjugate of z1/z2 is z̄1/z̄2.
viii. |z|² = zz̄.
Proof
We only prove (ii) and (viii).
(ii) Let z = x + yi. Then z̄ = z if and only if
x + yi = x − yi,
that is, y = 0, which holds if and only if z = x is a real number.
(viii) zz̄ = (x + yi)(x − yi) = x² − xyi + xyi − y²i² = x² + y² = |z|².
Example 10.1.8.
Let z ∈ C. Show each of the following.
1. The conjugate of z̄ + 7i is z − 7i.
2. The conjugate of (i + 3z̄)/(i − 3z̄) is (i − 3z)/(i + 3z).
Solution
1. By Proposition 10.1.2 (iv) and (i), the conjugate of z̄ + 7i equals the conjugate of z̄ plus the conjugate of 7i, that is, z − 7i.
2. Since the conjugate of i is −i and the conjugate of z̄ is z, Proposition 10.1.2 (vii), (iv), and (vi) give that the conjugate of (i + 3z̄)/(i − 3z̄) equals
(−i + 3z)/(−i − 3z) = (i − 3z)/(i + 3z).
Example 10.1.9.
Let p(λ) = λ⁴ + 2λ³ + λ² + 5λ + 6. Show that if z is a root of p(λ), then z̄ is a root of p(λ).
Solution
We first prove that
(10.1.6)
the conjugate of p(z) equals p(z̄) for z ∈ C.
By Proposition 10.1.2, conjugation preserves sums and products, and the coefficients of p are real. Hence, taking the conjugate of p(z) = z⁴ + 2z³ + z² + 5z + 6 term by term gives
(z̄)⁴ + 2(z̄)³ + (z̄)² + 5z̄ + 6 = p(z̄),
which proves (10.1.6). Now, if p(z) = 0, then by (10.1.6),
p(z̄) = the conjugate of p(z) = 0̄ = 0,
so z̄ is a root of p(λ).
Similar to Example 10.1.9, one can show that if z ∈ C is a root of a polynomial of degree n with real coefficients, then its conjugate z̄ is also a root of this polynomial. This implies that the quadratic equation (10.1.1) has either two real roots, which may coincide, or two conjugate complex roots.
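The identity behind Example 10.1.9 — conjugation commutes with a polynomial that has real coefficients — is easy to test numerically (a sketch; the sample point z is arbitrary):

```python
def p(z):
    # the polynomial of Example 10.1.9; all coefficients are real
    return z**4 + 2*z**3 + z**2 + 5*z + 6

z = 0.3 + 1.7j
# conjugate of p(z) equals p(conjugate of z), i.e. identity (10.1.6)
assert abs(p(z).conjugate() - p(z.conjugate())) < 1e-9
assert p(0) == 6  # sanity check of the constant term
```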
Exercises
1. Solve each of the following equations.
a. z 2 + z + 1 = 0
b. z 2 − 2z + 5 = 0
c. z 2 − 8z + 25 = 0
3. Let z1 = 2 + i and z2 = 1 − i. Express each of the following complex numbers in the form x + yi,
calculate its modulus, and find its conjugate.
1. z1 + 2z2
2. 2z1 − 3z2
3. z1 z2
4. z1/z2
4. Express each of the following complex numbers in the form x + yi.
b. (1 − i)/(2 − i)
c. ((1 − i)(2 + i))/(i(1 + i)(2 − i))
d. (1 + i)/(1 − i) − (2 − i)/(1 + i)
5. Find i², i³, i⁴, i⁵, i⁶ and compute (−i − i⁴ + i⁵ − i⁶)¹⁰⁰⁰.
6. Let z = ((1 + i)/(1 − i))⁸. Calculate z⁶⁶ + 2z³³ − 2.
7. Find z ∈ C such that
a. iz = 4 + 3i
b. (1 + i)z̄ = (1 − i)z
8. Find x, y ∈ R such that
(x + 1 + i(y − 3))/(5 + 3i) = 1 + i.
9. Let z = ((3 + 4i)(1 + i)¹²)/(i⁵(2 + 4i)²). Find |z|.
10.2 Polar and exponential forms
A complex number z = x + yi can be identified with the point (x, y) in the plane (see Figure 10.1 (I)). When complex numbers are viewed in an xy-coordinate system, the x-axis is called the real axis, the y-axis is called the imaginary axis with unit i, and the plane is called the complex plane.
Definition 10.2.1.
Let z = x + yi and z ≠ 0. The angle θ from the positive x-axis to the vector (x, y)ᵀ is called an argument of z and is denoted by θ = arg z. The argument θ of z satisfying
−π < θ ≤ π
is called the principal argument of z and is denoted by θ = Arg z. Hence,
(10.2.1)
arg z = Arg z + 2kπ for k ∈ ℤ,
where ℤ = {0, ±1, ±2, ⋯} is the set of integers. Definition 10.2.1 does not give the definitions of an argument and a principal argument for z = 0. Hence, if z = 0, then z has neither an argument nor a principal argument.
Theorem 10.2.1.
Let r > 0. Then Arg r = 0, Arg(ri) = π/2, Arg(−r) = π, and Arg(−ri) = −π/2.
Example 10.2.1.
For each of the following numbers, find its principal argument and argument.
a. 1
b. 2i
c. − 1
d. − 2i
Solution
By Theorem 10.2.1, we have
a. Arg 1 = 0 and arg 1 = 2kπ for k ∈ ℤ;
b. Arg 2i = π/2 and arg 2i = π/2 + 2kπ for k ∈ ℤ;
c. Arg(−1) = π and arg(−1) = π + 2kπ for k ∈ ℤ;
d. Arg(−2i) = −π/2 and arg(−2i) = −π/2 + 2kπ for k ∈ ℤ.
The following result gives the formulas of the principal arguments of complex numbers (see Figure
10.2 (I), (II), (III), (IV)).
Theorem 10.2.2.
Let z = x + yi with x ≠ 0 and y ≠ 0, and let ψ ∈ (0, π/2) be such that
(10.2.2)
tan ψ = |y/x|.
Then the following assertions hold.
1. If z is in the first quadrant, then Arg z = ψ.
2. If z is in the second quadrant, then Arg z = π − ψ.
3. If z is in the third quadrant, then Arg z = −π + ψ.
4. If z is in the fourth quadrant, then Arg z = −ψ.
In each case, arg z = Arg z + 2kπ, where k ∈ ℤ.
Note that if ψ ∈ (0, π/2), then
(10.2.3)
tan ψ = |y/x| if and only if ψ = arctan |y/x|.
Example 10.2.2.
For each of the following complex numbers, find its principal argument and argument.
a. z = 3 + √3 i
b. z = −2 − 2i
Solution
a. Let ψ ∈ (0, π/2) be such that
tan ψ = |y/x| = √3/3.
Then ψ = π/6. Because x > 0 and y > 0, z is in the first quadrant, and by Theorem 10.2.2 (1),
Arg z = ψ = π/6 and arg z = π/6 + 2kπ for k ∈ ℤ.
b. Let ψ ∈ (0, π/2) be such that
tan ψ = |−2/−2| = 1.
Then ψ = π/4. Because x < 0 and y < 0, z is in the third quadrant, and by Theorem 10.2.2 (3),
Arg z = −π + ψ = −π + π/4 = −3π/4 and arg z = −3π/4 + 2kπ for k ∈ ℤ.
(10.2.4)
e^(iθ) = cos θ + i sin θ for θ ∈ ℝ.
Formula (10.2.4) is called Euler's formula.
Proposition 10.2.1.
1. e^(iθ) = 1 if and only if θ = 2kπ for some k ∈ ℤ.
Proof
1. Assume that e^(iθ) = 1. Then cos θ + i sin θ = 1. This implies that cos θ = 1 and sin θ = 0. Hence, θ = 2kπ for some k ∈ ℤ. Conversely, if θ = 2kπ for k ∈ ℤ, then e^(iθ) = cos 2kπ + i sin 2kπ = 1.
Theorem 10.2.3.
Each nonzero complex number z = x + yi can be written as z = |z|(cos θ + i sin θ) = |z|e^(iθ), where θ is an argument of z.
Proof
Because x = |z| cos θ and y = |z| sin θ,
z = x + yi = |z| cos θ + i|z| sin θ = |z|(cos θ + i sin θ).
For each nonzero complex number z = x + yi, there exist a unique number r = | z | > 0 and arguments
θ = Argz + 2kπ for k ∈ ℤ. We call (r, θ) the polar coordinates of the complex number z = x + yi or the point (x,
y). Conversely, for a polar coordinate (r, θ), we define a complex number z by
(10.2.5)
z = (r cosθ) + (r sinθ)i.
Conversely, for z defined by (10.2.5),
|z| = √(r²cos²θ + r²sin²θ) = √(r²(cos²θ + sin²θ)) = √(r²) = r,
where we use the identity cos²θ + sin²θ = 1.
Definition 10.2.2.
Let z be a nonzero complex number, r = |z|, and let θ be an argument of z. Then
(10.2.6)
z = r(cos θ + i sin θ)
is called a polar form of z, and
(10.2.7)
z = re^(iθ)
is called an exponential form of z.
By Theorem 10.2.3 , we see that the polar form and the exponential form of a complex number z can be
converted into one another easily. Moreover, by (10.2.5) , we see that a polar form of a complex number
can be easily transferred into the standard form. However, it is not trivial to transfer the standard form of a
complex number z = x + yi into a polar form of z.
The following are the steps for finding a polar form of a complex number z = x + yi.
Step 1. Compute r = |z| = √(x² + y²).
Step 2. Find the principal argument Arg z of z by Theorem 10.2.2.
Step 3. Write z = r(cos(Arg z) + i sin(Arg z)).
In Steps 2 and 3, the principal argument Arg z can be replaced by an argument arg z of z. We always use Arg z instead of arg z if there is no indication otherwise.
Example 10.2.3.
Find a polar form and an exponential form of z = −1 + √3 i.
Solution
Step 1. r = |z| = √((−1)² + (√3)²) = √4 = 2.
Step 2. Let ψ ∈ (0, π/2) be such that tan ψ = |√3/(−1)| = √3. Then ψ = π/3. Because x = −1 < 0 and y = √3 > 0, z is in the second quadrant, and by Theorem 10.2.2 (2),
Arg z = π − ψ = π − π/3 = 2π/3.
Step 3. z = 2(cos(2π/3) + i sin(2π/3)). Hence, the polar form of −1 + √3 i is
−1 + √3 i = 2(cos(2π/3) + i sin(2π/3))
and its exponential form is −1 + √3 i = 2e^(i2π/3).
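The three steps correspond to cmath.polar, which returns the pair (r, Arg z), and cmath.rect, which converts a polar pair back to the standard form (10.2.5) (a sketch using z = −1 + √3 i):

```python
import cmath
import math

z = -1 + math.sqrt(3) * 1j
r, theta = cmath.polar(z)          # Steps 1 and 2 at once
assert math.isclose(r, 2.0)
assert math.isclose(theta, 2 * math.pi / 3)

# Step 3 reversed: (10.2.5) recovers the standard form
assert abs(cmath.rect(r, theta) - z) < 1e-12
```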
Exercises
1. For each of the following numbers, find its principal argument and argument.
a. − 4
b. − i
c. 1/2 − (√3/2)i
d. 1 + √3i
e. − 1 + √3i
f. − √3 − i
2. Express each of the following complex numbers in polar and exponential forms.
a. 1
b. i
c. − 2
d. − 3i
e. 1 + √3i
f. − 1 − i
10.3 Products in polar and exponential forms
Theorem 10.3.1.
Let z1 = r1(cos θ1 + i sin θ1) = r1e^(iθ1) and z2 = r2(cos θ2 + i sin θ2) = r2e^(iθ2). Then
z1z2 = r1r2[cos(θ1 + θ2) + i sin(θ1 + θ2)] = r1r2 e^(i(θ1+θ2))
and if z2 ≠ 0, then
z1/z2 = (r1/r2)[cos(θ1 − θ2) + i sin(θ1 − θ2)] = (r1/r2) e^(i(θ1−θ2)).
Proof
z1z2 = r1r2(cos θ1 + i sin θ1)(cos θ2 + i sin θ2)
= r1r2[(cos θ1 cos θ2 − sin θ1 sin θ2) + i(sin θ1 cos θ2 + cos θ1 sin θ2)]
= r1r2[cos(θ1 + θ2) + i sin(θ1 + θ2)]
= r1r2 e^(i(θ1+θ2)).
If z2 ≠ 0, then
z1/z2 = (r1e^(iθ1))/(r2e^(iθ2)) = (r1/r2) e^(iθ1) e^(−iθ2) = (r1/r2) e^(i(θ1−θ2)).
Example 10.3.1.
Let z1 = 4(cos(π/4) + i sin(π/4)) and z2 = 2(cos(π/6) + i sin(π/6)). Find z1z2 and z1/z2 in polar, exponential, and standard forms.
Solution
By Theorem 10.3.1,
z1z2 = (4)(2)[cos(π/4 + π/6) + i sin(π/4 + π/6)] = 8(cos(5π/12) + i sin(5π/12)),
so in exponential form
z1z2 = 8e^(i5π/12).
By the addition formulas,
z1z2 = 8[(cos(π/4)cos(π/6) − sin(π/4)sin(π/6)) + i(sin(π/4)cos(π/6) + cos(π/4)sin(π/6))]
= 8[((√2/2)(√3/2) − (√2/2)(1/2)) + i((√2/2)(√3/2) + (√2/2)(1/2))]
= 2√2[(√3 − 1) + i(√3 + 1)].
Similarly,
z1/z2 = 2[cos(π/4 − π/6) + i sin(π/4 − π/6)] = 2(cos(π/12) + i sin(π/12)),
so
z1/z2 = 2e^(iπ/12),
and
z1/z2 = 2[(cos(π/4)cos(π/6) + sin(π/4)sin(π/6)) + i(sin(π/4)cos(π/6) − cos(π/4)sin(π/6))]
= 2[((√2/2)(√3/2) + (√2/2)(1/2)) + i((√2/2)(√3/2) − (√2/2)(1/2))]
= (√2/2)[(√3 + 1) + i(√3 − 1)].
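Theorem 10.3.1 says moduli multiply and arguments add; this can be checked on the numbers of Example 10.3.1 (a sketch; cmath.rect builds a complex number from polar coordinates):

```python
import cmath
import math

z1 = cmath.rect(4, math.pi / 4)   # 4(cos(pi/4) + i sin(pi/4))
z2 = cmath.rect(2, math.pi / 6)   # 2(cos(pi/6) + i sin(pi/6))

assert math.isclose(abs(z1 * z2), 8.0)                       # r1*r2
assert math.isclose(cmath.phase(z1 * z2), 5 * math.pi / 12)  # theta1+theta2
assert math.isclose(abs(z1 / z2), 2.0)                       # r1/r2
assert math.isclose(cmath.phase(z1 / z2), math.pi / 12)      # theta1-theta2
```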
By Theorem 10.3.1 , we obtain the following result on the arguments of products and quotients.
Theorem 10.3.2.
Let z1, z2 be two nonzero complex numbers. Then the following assertions hold.
1. arg(z1z2) = Arg z1 + Arg z2 + 2kπ for k ∈ ℤ.
2. arg(z1/z2) = Arg z1 − Arg z2 + 2kπ for k ∈ ℤ.
3. arg(1/z1) = −Arg z1 + 2kπ for k ∈ ℤ.
Proof
We only prove (1). Let θ1 = arg z1 = Arg z1 + 2mπ for m ∈ ℤ and θ2 = arg z2 = Arg z2 + 2nπ for n ∈ ℤ.
Example 10.3.2.
Let z1 = 1 + i and z2 = −1 − i. Find arg(z1z2), arg(z1/z2), arg(z2/z1), and arg(1/z1).
Solution
Because
Arg z1 = Arg(1 + i) = π/4 and Arg z2 = Arg(−1 − i) = −π + π/4 = −3π/4,
we have
arg(z1z2) = (π/4 − 3π/4) + 2kπ = −π/2 + 2kπ for k ∈ ℤ,
arg(z1/z2) = (π/4 − (−3π/4)) + 2kπ = π + 2kπ for k ∈ ℤ,
arg(z2/z1) = (−3π/4 − π/4) + 2kπ = −π + 2kπ for k ∈ ℤ,
and
arg(1/z1) = −π/4 + 2kπ for k ∈ ℤ.
In Theorem 10.3.2, the arguments corresponding to k = 0 may not be the principal arguments if they are not in the interval (−π, π]. Therefore, to find Arg(z1z2) or Arg(z1/z2), we need to find a suitable integer k0 ∈ ℤ such that the corresponding argument belongs to (−π, π], which is the principal argument.
Corollary 10.3.1.
Let z1, z2 be two nonzero complex numbers. Then the following assertions hold.
1. Arg(z1z2) = Arg z1 + Arg z2 + 2k0π, where k0 ∈ ℤ is such that Arg z1 + Arg z2 + 2k0π ∈ (−π, π].
2. Arg(z1/z2) = Arg z1 − Arg z2 + 2k0π, where k0 ∈ ℤ is such that Arg z1 − Arg z2 + 2k0π ∈ (−π, π].
Example 10.3.3.
Let z1 = 1 + i and z2 = −1 − i. Find Arg(z1/z2) and Arg(z2/z1).
Solution
By Example 10.3.2, we obtain
arg(z1/z2) = π + 2kπ for k ∈ ℤ.
Because π ∈ (−π, π], we have
Arg(z1/z2) = π.
Similarly, we have
arg(z2/z1) = −π + 2kπ for k ∈ ℤ.
Because −π ∉ (−π, π], taking k = 1 gives −π + 2π = π ∈ (−π, π]. Hence, Arg(z2/z1) = π.
Example 10.3.4.
Let z1, z2 be nonzero complex numbers with Arg z1 = π and Arg z2 = π/2. Find Arg(z1z2).
Solution
By Theorem 10.3.2, we have
arg(z1z2) = π + π/2 + 2kπ = 3π/2 + 2kπ for k ∈ ℤ.
Because 3π/2 ∉ (−π, π], 3π/2 is not the principal argument of z1z2. When k = −1, we have
3π/2 + 2kπ = 3π/2 − 2π = −π/2 ∈ (−π, π].
Hence, Arg(z1z2) = −π/2.
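The search for k0 can be automated: shifting an angle by a multiple of 2π into (−π, π] is a short normalization (a sketch; `principal` is an illustrative helper name, not from the text):

```python
import math

def principal(theta):
    """Shift theta by a multiple of 2*pi into the interval (-pi, pi]."""
    theta = theta % (2 * math.pi)   # now in [0, 2*pi)
    if theta > math.pi:
        theta -= 2 * math.pi        # this subtraction is the choice of k0
    return theta

# Example 10.3.4: 3*pi/2 is shifted (k0 = -1) to the principal argument -pi/2
assert math.isclose(principal(3 * math.pi / 2), -math.pi / 2)
```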
Theorem 10.3.3.
Let z = r(cos θ + i sin θ) = re^(iθ) and let n be a positive integer. Then
i. zⁿ = rⁿ(cos nθ + i sin nθ), that is, zⁿ = rⁿe^(i(nθ)),
and if z ≠ 0, then
ii. z⁻ⁿ = r⁻ⁿe^(i(−nθ)).
Proof
We only prove (i) because (ii) follows from (i). By Theorem 10.3.1, we have
z² = z · z = (r · r)[cos(θ + θ) + i sin(θ + θ)] = r²(cos 2θ + i sin 2θ)
and
z³ = z² · z = (r² · r)[cos(2θ + θ) + i sin(2θ + θ)] = r³(cos 3θ + i sin 3θ).
Repeating this argument gives
zⁿ = [r(cos θ + i sin θ)]ⁿ = rⁿ(cos nθ + i sin nθ).
Because z⁻¹ = 1/z,
z⁻ⁿ = 1/zⁿ = 1/(rⁿe^(i(nθ))) = r⁻ⁿe^(i(−nθ)).
Corollary 10.3.2.
Example 10.3.5.
Find (−2 − 2i)⁴.
Solution
Step 1. Find the polar form of z = −2 − 2i.
r = |z| = √((−2)² + (−2)²) = √8 = 2√2.
Let ψ ∈ (0, π/2) be such that
tan ψ = |−2/−2| = 1.
Then ψ = π/4. Because x = −2 < 0 and y = −2 < 0, z is in the third quadrant, and by Theorem 10.2.2 (3),
Arg z = −π + ψ = −π + π/4 = −3π/4.
Hence,
−2 − 2i = 2√2[cos(−3π/4) + i sin(−3π/4)].
Step 2. Use Theorem 10.3.3 to compute the power. By Theorem 10.3.3, we have
(−2 − 2i)⁴ = {2√2[cos(−3π/4) + i sin(−3π/4)]}⁴
= (2√2)⁴[cos(4(−3π/4)) + i sin(4(−3π/4))]
= 64[cos(−3π) + i sin(−3π)]
= 64 cos π = −64.
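The result agrees with direct exponentiation of the complex number (a quick numerical check of Theorem 10.3.3):

```python
z = -2 - 2j
# (2*sqrt(2))^4 [cos(-3*pi) + i sin(-3*pi)] = -64
assert abs(z**4 - (-64)) < 1e-9
```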
Exercises
1. Express the following numbers in x + yi form.
a. (1 + √3 i)³
b. (−1 − i)⁴
c. (1 + i)⁻⁸
2. Let z1 = (1 + i)/√2 and z2 = √3 − i. Find the exponential form of z1z2.
3. Let z = (√2/2)(1 − i). Find z¹⁰⁰ + z⁵⁰ + 1.
4. Find
z = (−1/2 + (√3/2)i)⁶⁰⁰ + (−1/2 − (√3/2)i)⁶⁰.
10.4 Roots of complex numbers
Let z be a complex number and let n be a positive integer. Consider the equation
(10.4.1)
wⁿ = z.
A solution w of (10.4.1) is denoted by
(10.4.2)
w = z^(1/n) = ⁿ√z.
z^(1/n) is called an nth root of z. It is obvious that an nth root of z is a root of equation (10.4.1).
Theorem 10.4.1.
Let z = r(cos θ + i sin θ) = re^(iθ) be a nonzero complex number. Then the nth roots of z are
(10.4.3)
ⁿ√z = ⁿ√r e^(i(θ+2kπ)/n) for k = 0, 1, 2, ⋯, n − 1,
or in polar form,
(10.4.4)
ⁿ√z = ⁿ√r (cos((θ + 2kπ)/n) + i sin((θ + 2kπ)/n)) for k = 0, 1, 2, ⋯, n − 1.
Proof
Let w = ρe^(iψ). Because z = r(cos θ + i sin θ) = re^(iθ) and wⁿ = z, it follows from Theorem 10.3.3 that
ρⁿe^(i(nψ)) = (ρe^(iψ))ⁿ = re^(iθ).
Hence,
ρⁿ = r and nψ = θ + 2kπ for k ∈ ℤ,
so
ψ = (θ + 2kπ)/n for k ∈ ℤ.
Hence,
(10.4.5)
w = ⁿ√r e^(i(θ+2kπ)/n) for k ∈ ℤ.
Note that for each k ∈ ℤ, there exist j ∈ ℤ and l ∈ {0, 1, 2, ⋯, n − 1} such that
k = jn + l.
This, together with (10.4.5) and Proposition 10.2.1 (1), implies that
w = ⁿ√r e^(i(θ+2kπ)/n) = ⁿ√r e^(i(θ+2(jn+l)π)/n) = ⁿ√r e^(i(θ+2lπ)/n) e^(i(2jπ)) = ⁿ√r e^(i(θ+2lπ)/n).
Hence, the distinct roots are obtained for k = 0, 1, 2, ⋯, n − 1, and each of them can be written in polar form as
(ⁿ√z)ₖ = ⁿ√r (cos((θ + 2kπ)/n) + i sin((θ + 2kπ)/n)) for k = 0, 1, 2, ⋯, n − 1.
By Theorem 10.4.1, we see that equation (10.4.1) has n distinct roots given by (10.4.4). We write
(10.4.6)
ⁿ√z = {(ⁿ√z)₀, (ⁿ√z)₁, ⋯, (ⁿ√z)ₙ₋₁}.
Corollary 10.4.1.
Let z = r(cos θ + i sin θ) be a nonzero complex number. Then
(√z)₀ = √r (cos(θ/2) + i sin(θ/2)) and (√z)₁ = −(√z)₀,
that is,
√z = √r (cos(θ/2) + i sin(θ/2)) or √z = −√r (cos(θ/2) + i sin(θ/2)).
This means that the square root of a complex number z has two values, (√z)₀ and (√z)₁ = −(√z)₀. Hence, when we consider complex numbers, the square root √z represents two roots, which are
(10.4.7)
√z = {(√z)₀, (√z)₁}.
Example 10.4.1.
Find √−1.
Solution
Because −1 = cos π + i sin π, it follows from Corollary 10.4.1 that
(√−1)₀ = cos(π/2) + i sin(π/2) = i and (√−1)₁ = −i.
Therefore, √−1 = {i, −i}.
From Example 10.4.1, we see that in (10.1.4), i = √−1 can be interpreted as the root (√−1)₀; that is, i is a root of the equation
w² = −1.
Example 10.4.2.
Solve z² − 1 − √3 i = 0 for z.
Solution
z² − 1 − √3 i = 0 if and only if z = √(1 + √3 i). Because
1 + √3 i = 2(cos(π/3) + i sin(π/3)),
we obtain
z0 := (√(1 + √3 i))₀ = √2 (cos(π/6) + i sin(π/6)) = √2 (√3/2 + (1/2)i) = √6/2 + (√2/2)i
and
z1 := (√(1 + √3 i))₁ = √2 [cos(π/6 + π) + i sin(π/6 + π)] = √2 (−√3/2 − (1/2)i) = −√6/2 − (√2/2)i.
Hence, z0 and z1 are the solutions of z² − 1 − √3 i = 0.
Example 10.4.3.
Find all values of ³√−1.
Solution
Because −1 = cos π + i sin π, it follows from Theorem 10.4.1 that
(³√−1)₀ = cos(π/3) + i sin(π/3) = 1/2 + (√3/2)i,
(³√−1)₁ = cos((π + 2π)/3) + i sin((π + 2π)/3) = cos π + i sin π = −1,
(³√−1)₂ = cos((π + 4π)/3) + i sin((π + 4π)/3) = cos(2π − π/3) + i sin(2π − π/3) = cos(π/3) − i sin(π/3) = 1/2 − (√3/2)i.
Therefore, ³√−1 = {1/2 + (√3/2)i, −1, 1/2 − (√3/2)i}.
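Formula (10.4.4) translates directly into a short routine that lists all n roots (a sketch; `nth_roots` is an illustrative name, not from the text):

```python
import cmath
import math

def nth_roots(z, n):
    """All n distinct nth roots of z, via formula (10.4.4)."""
    r, theta = cmath.polar(z)
    return [r ** (1 / n) * cmath.exp(1j * (theta + 2 * k * math.pi) / n)
            for k in range(n)]

# Example 10.4.3: the cube roots of -1
roots = nth_roots(-1, 3)
expected = [0.5 + (math.sqrt(3) / 2) * 1j, -1, 0.5 - (math.sqrt(3) / 2) * 1j]
assert all(abs(a - b) < 1e-9 for a, b in zip(roots, expected))
```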
Theorem 10.4.2.
Let a, b, c ∈ C be complex numbers with a ≠ 0. Then the solutions of the quadratic equation
(10.4.8)
az² + bz + c = 0
are given by
(10.4.9)
z = (−b + √(b² − 4ac))/(2a).
Proof
The proof is the same as that of (10.1.2) and is omitted.
Note that if b² − 4ac ≠ 0, then by (10.4.7), √(b² − 4ac) contains two values. Hence, (10.4.9) is consistent with (10.1.3) or (10.1.5).
Example 10.4.4.
Solve the equation
(1 − i)z² − 2z + (1 + i) = 0.
Solution
By (10.4.9), the solutions of the equation are
(10.4.10)
z = (2 + √(4 − 4(1 − i)(1 + i)))/(2(1 − i)) = (2 + √−4)/(2(1 − i)) = (1 + √−1)/(1 − i).
By Example 10.4.1, √−1 = i or √−1 = −i. If √−1 = i, by (10.4.10), we have
z0 = (1 + i)/(1 − i) = (1 + i)²/((1 − i)(1 + i)) = 2i/2 = i.
If √−1 = −i, by (10.4.10), we have
z1 = (1 − i)/(1 − i) = 1.
Exercises
1. Find all roots for each of the following complex numbers.
a. ⁴√1
b. ⁶√−1
c. ³√(1 − i)
d. ³√−i
e. (−1 + √3 i)^(1/4)
Appendix A Solutions to the Problems
A.1 Euclidean spaces
Section 1.1
1. Write a three-column zero vector, and a three-column nonzero vector, and a five-column zero vector
and a five-column nonzero vector.
2. Suppose that the buyer for a manufacturing plant must order different quantities of oil, paper, steel,
and plastics. He will order 40 units of oil, 50 units of paper, 80 units of steel, and 20 units of plastics.
Write the quantities in a single vector.
3. Suppose that a student’s course marks for quiz 1, quiz 2, test 1, test 2, and the final exam are 70, 85,
80, 75, and 90, respectively. Write his marks as a column vector.
4. Let a⃗ = (x − 2y, 2x − y, 2z)ᵀ and b⃗ = (2, −2, 1)ᵀ. Find all x, y, z ∈ ℝ such that a⃗ = b⃗.
5. Let a⃗ = (|x|, y²)ᵀ and b⃗ = (1, 4)ᵀ. Find all x, y ∈ ℝ such that a⃗ = b⃗.
6. Let a⃗ = (x − y, 4)ᵀ and b⃗ = (x + y, 2)ᵀ. Find all x, y ∈ ℝ such that a⃗ − b⃗ is a nonzero vector.
7. Let P(1, x²), Q(3, 4), P1(4, 5), and Q1(6, 1). Find all possible values of x ∈ ℝ such that the vector PQ equals the vector P1Q1.
8. A company with 553 employees lists each employee’s salary as a component of a vector → a in ℝ 553. If
a 6% salary increase has been approved, find the vector involving →
a that gives all the new salaries.
9. Let a⃗ = (110, 88, 40)ᵀ denote the current prices of three items at a store. Suppose that the store announces a sale so that the price of each item is reduced by 20%.
a. Find a three-vector that gives the price changes for the three items.
b. Find a three-vector that gives the new prices of the three items.
10. Let a⃗ = (−1, 2, 3)ᵀ, b⃗ = (2, −2, 0)ᵀ, c⃗ = (3, 0, −1)ᵀ, and α ∈ ℝ. Compute
i. 2a⃗ − b⃗ + 5c⃗;
ii. 4a⃗ + αb⃗ − 2c⃗.
11. Find x, y, and z such that (9, 4y, 2z)ᵀ + (3x, 8, −6)ᵀ = (0, 0, 0)ᵀ.
12. Let u⃗ = (1, −1, 0, 2), v⃗ = (−2, −1, 1, −4), and w⃗ = (3, 2, −1, 0).
a. Find a vector a⃗ ∈ ℝ⁴ such that 2u⃗ − 3v⃗ − a⃗ = w⃗.
b. Find a vector a⃗ such that
(1/2)[2u⃗ − 3v⃗ + a⃗] = 2w⃗ + u⃗ − 2a⃗.
13. Determine whether a⃗ ∥ b⃗.
1. a⃗ = (1, 2, 3) and b⃗ = (−2, −4, −6).
2. a⃗ = (−1, 0, 1, 2) and b⃗ = (−3, 0, 3, 6).
3. a⃗ = (1, −1, 1, 2) and b⃗ = (−2, 2, 2, 3).
14. Let x, y, a, b ∈ ℝ with x ≠ y. Let P(x, 2x), Q(y, 2y), P1(a, 2a), and Q1(b, 2b) be points in ℝ². Show that the vector PQ is parallel to the vector P1Q1.
15. Let P(x, 0), Q(3, x²), P1(2x, 1), and Q1(6, x²). Find all possible values of x ∈ ℝ such that the vector PQ is parallel to the vector P1Q1.
16. Let a⃗ = (−1, 2, 3)ᵀ, b⃗ = (2, −2, 0)ᵀ, c⃗ = (3, 0, −1)ᵀ, and α ∈ ℝ. Compute
a. a⃗ · b⃗;
b. a⃗ · c⃗;
c. b⃗ · c⃗;
d. a⃗ · (b⃗ + c⃗).
18. Assume that the percentages for homework, test 1, test 2, and the final exam for a course are 10%,
25%, 25%, and 40%, respectively. The total marks for homework, test 1, test 2, and the final exam are
10, 50, 50, and 90, respectively. A student’s corresponding marks are 8, 46, 48, and 81, respectively.
What is the student’s final mark out of 100?
19. Assume that the percentages for homework, test 1, test 2, and the final exam for a course are 14%,
20%, 20%, and 46%, respectively. The total marks for homework, test 1, test 2, and the final exam are
14, 60, 60, and 80, respectively. A student’s corresponding marks are 12, 45, 51, and 60, respectively.
What is the student’s final mark out of 100?
20. A manufacturer produces three different types of products. The demand for the products is denoted by
the vector →
a = (10, 20, 30). The price per unit for the products is given by the vector
→
b = ($200, $150, $100). If the demand is met, how much money will the manufacturer receive?
21. A company pays three groups of employees a salary. The numbers of the employees for the three groups are expressed by a vector a⃗ = (5, 20, 40). The payments for the groups are expressed by a vector b⃗ = ($100,000, $80,000, $60,000). Use the dot product to calculate the total amount of money the company paid its employees.
22. There are three students who may buy calculus or algebra books. Use the dot product to find the total number of students who buy both calculus and algebra books.
23. There are n students who may buy calculus or algebra books (n ≥ 2). Use the dot product to find the total number of students who buy both calculus and algebra books.
24. Assume that a person A has contracted a contagious disease and has direct contacts with four
people: P1, P2, P3, and P4. We denote the contacts by a vector → a : = (a 11, a 12, a 13, a 14), where if
the person A has made contact with the person Pj, then a 1j = 1, and if the person A has made no
contact with the person Pj, then a 1j = 0. Now we suppose that the four people then have had a variety
→
of direct contacts with another individual B, which we denote by a vector b : = (b 11, b 21, b 31, b 41),
where if the person Pj has made contact with the person B, then b j1 = 1, and if the person Pj has
made no contact with the person B, then b j1 = 0.
i. Find the total number of indirect contacts between A and B.
ii. If a⃗ = (1, 0, 1, 1) and b⃗ = (1, 1, 1, 1), find the total number of indirect contacts between A and B.
A.2 Matrices
Section 2.1
1. Find the sizes of the following matrices (rows separated by semicolons):
A = (2), B = (5, 8, 10; 1, 4, 10), C = (3, 8; 3, 4), D = (1, 0, 0; 0, 10, 2; 0, 0, 1; 6, 9, 10).
2. Use column vectors and row vectors to rewrite each of the following matrices.
A = (3, 3, 2; 1, 8, 1; 5, 4, 10), B = (1, 9, 7, 2; 3, 10, 8, 7; 4, 3, 10, 4), C = (9, 8, 7; 6, 5, 4; 3, 2, 1).
3. Use column vectors and row vectors to rewrite each of the following matrices.
A = (4), B = (3, 11, 2), C = (8; 12), D = (0, −1, −2; −2, 8, −7; 5, 3, −9).
4. Let A = (9, 3; 6, x²) and B = (9, 3; 6, 4). Find all x ∈ R such that A = B.
5. Let C = (12, 5, 25; 19, 4, 6) and D = (12, 5, x²; 19, 4, 6). Find all x ∈ R such that C = D.
6. Let E = (120, 25, 122) and F = (120, x², 122). Find all x ∈ R such that E = F.
7. Let A = (a, b; 3x + 2y, −x + y) and B = (1, 1; 3, 6). Find all a, b, x, y ∈ R such that A = B.
8. Let
C = (a + b, 2b − a; x − 2y, 5x + 3y) and D = (6, 0; 8, 14).
Find all a, b, x, y ∈ R such that C = D.
9. Let A = (−2, 3, 4; 6, −1, −8) and B = (7, −8, 9; 0, −1, 0). Compute
i. A + B;
ii. −A;
iii. 4A − 2B;
iv. 100A + B.
10. Let A = (9, 5, 1), B = (2, 2, 2), and C = (4, 7, 0). Compute
1. 3A − 2B + C,
2. [3(A + B)]ᵀ,
3. (4A + (1/2)B − C)ᵀ.
11. Find the matrix A if [(3Aᵀ) − (−7, −2; −6, 9)ᵀ]ᵀ = (−5, −10; 33, 12).
12. Find the matrix B if
[(1/2)(B + (6, 3; 8, 3; 1, 4))]ᵀ − 3(−5, 6; 8, −9; −4, 2)ᵀ = (23, −16, 17; −16, 26.5, 2).
A.3 Determinants
Section 3.1
1. Let A = (a, b; c, d), where a, b, c, d ∈ R. Show that
|λI − A| = λ² − tr(A)λ + |A|
for each λ ∈ R.
2. Let A1 = (2, 4; 3, −2), A2 = (1, 2; −1, 4), and A3 = (−1, 1; −1, −3). For each i = 1, 2, 3, find all λ ∈ R such that |λI − Ai| = 0.
3. Use (3.1.3) and (3.1.4) to calculate each of the following determinants.
|A| = |1, 2, 3; −2, 1, 2; 3, −1, 4|, |B| = |2, −2, 4; 4, 3, 1; 0, 1, 2|, |C| = |0, −1, 3; 4, 1, 2; 0, 0, 1|.
4. Compute the determinant |A| by using its expansion of cofactors corresponding to each of its rows, where
|A| = |1, 2, 3; −2, 1, 2; 3, −1, 4|.
5. Calculate each of the following determinants.
|A| = |2, 3, −4; 0, 6, 0; 0, 0, 5|, |B| = |−2, 0, 0; 1, 1, 0; 0, 3, 8|.
A.4 Systems of linear equations
Section 4.1
1. Determine which of the following equations are linear.
a. x + 2y + z = 6
b. 2x − 6y + z = 1
c. −√2 x + 6^(2/3) y = 4 − 3z
e. 2xy + 3yz + 5z = 8
2. For each of the following systems of linear equations, find its coefficient and augmented matrices.
1. {2x1 − x2 = 6; 4x1 + x2 = 3}
3. {x1 − 3x3 − 4 = 0; 2x2 − 5x3 − 8 = 0; 3x1 + 2x2 − x3 = 4}
3. For each of the following augmented matrices, find the corresponding linear system.
(1, 2, 3 | 1; −1, 1, 0 | −2; 0, −1, 1 | 0) and (2, 1 | 1; 0, −1 | −2; 0, 3 | −1)
4. For each of the following linear systems, write it in the form AX⃗ = b⃗ and express b⃗ as a linear combination of the column vectors of A.
a. {−x1 + x2 + 2x3 = 3; x1 − x2 + 4x3 = 0}
c. {x1 − 2x2 + x3 = 0; −x1 + 3x2 − 4x3 = 0}
5. For each of the points P1, P2(8, 1), P3(2, 3), and P4(0, 0), verify whether it is a solution of the system
{x − 2y = 7; x + 2y = 5}.
2. Verify whether (⋯, 4/3) is a solution of the system
−x1 + x2 + 2x3 = 3, ⋯
3. Determine if the vector b⃗ = (3, 2, −1)ᵀ is a linear combination of the column vectors of the coefficient matrix.
4. Determine if the vector b⃗ = (3, 2, −1)ᵀ belongs to the spanning space of the column vectors of the coefficient matrix.
2. {x1 + 2x2 = 1; 2x1 + x2 = 2}
3. {x1 + 2x2 = 1; x1 + 2x2 = 2}
A.5 Linear transformations
Section 5.1
1. For each of the following matrices, write down the corresponding linear transformation TA.
A1 = (2, −2, 4, −1; −1, 2, 3, 0; 0, 1, 2, −1), A2 = (−2, 2, 1; 1, 1, −3; 1, 0, −2),
A3 = (1, −2, 0; −2, 3, 1; 3, −2, 3; 1, 2, −1), A4 = (−1, 1, 1; 2, 0, −2; −1, −1, 1; 0, 0, 1).
2. Let A = (1, 2; −1, 3). Find TA((1, 0)ᵀ) and TA((0, 1)ᵀ).
3. Let A = (−1, 2, 1; 3, 0, 6). Find TA(e⃗1), TA(e⃗2), TA(e⃗3), where e⃗1, e⃗2, e⃗3 are the standard vectors in ℝ³.
4. For each of the following linear transformations, express it in (5.1.1) .
a. T (x1 , x2 ) = (x1 − x2 , 2x1 + x2 , 5x1 − 3x2 ).
b. T (x1 , x2 , x3 , x4 ) = (6x1 − x2 − x3 + x4 , x2 + x3 − 2x4 , 3x3 + x4 , 5x4 ).
c. T (x1 , x2 , x3 ) = (4x1 − x2 + 2x3 , 3x1 − x3 , 2x1 + x2 + 5x3 ).
d. T (x1, x2, x3) = (8x1 − x2 + x3, x2 − x3, 2x1 + x2, x2 + 3x3).
5. Let X⃗ = (2, −2, −2, 1), Y⃗ = (1, −2, 2, 3), and
A = (1, 0, 1, 0; 1, −2, 1, 1; −1, 1, 2, 0).
Compute TA(X⃗ − 3Y⃗).
6. Let S, T : ℝ³ → ℝ² be linear transformations such that T(1, 2, 3) = (1, −4) and S(1, 2, 3) = (4, 9). Compute (5T + 2S)(1, 2, 3).
7. Let S, T : ℝ³ → ℝ⁴ be linear transformations such that T(1, 2, 3) = (1, 1, 0, −1) and ⋯
A.6 Lines and planes in ℝ³
Section 5.4
1. Find a⃗ × b⃗ and verify that (a⃗ × b⃗) ⊥ a⃗ and (a⃗ × b⃗) ⊥ b⃗.
1. a⃗ = (1, 2, 3); b⃗ = (1, 0, 1).
2. a⃗ = (1, 0, −3); b⃗ = (−1, 0, −2).
3. a⃗ = (0, 2, 1); b⃗ = (−5, 0, 1).
4. a⃗ = (1, 1, −1); b⃗ = (−1, 1, 0).
2. Let a⃗ = (1, −2, 3), b⃗ = (−1, −4, 0), and c⃗ = (2, 3, 1). Find the scalar triple product of a⃗, b⃗, c⃗.
3. Find the area of the parallelogram determined by a⃗ and b⃗.
1. a⃗ = (1, 0, 1), b⃗ = (−1, 0, 1);
2. a⃗ = (1, 0, 0), b⃗ = (−1, 1, −2).
4. Find the volume of the parallelepiped determined by a⃗, b⃗, and c⃗.
1. a⃗ = (1, −1, 0), b⃗ = (−1, 0, 2), c⃗ = (0, −1, −1).
2. a⃗ = (−1, −1, 0), b⃗ = (−1, 1, −2), c⃗ = (−1, 1, 1).
A.7 Bases and dimensions
Section 7.1
1. Let a⃗1 = (2, 4, 6)ᵀ, a⃗2 = (0, 0, 0)ᵀ, and a⃗3 = (1, 2, 3)ᵀ. Determine whether {a⃗1, a⃗2, a⃗3} is linearly dependent.
2. Determine whether {a⃗1, a⃗2, a⃗3, a⃗4} is linearly dependent, where
a⃗1 = (1, 1, 0)ᵀ, a⃗2 = (1, 1, 0)ᵀ, a⃗3 = (0, 0, 0)ᵀ, a⃗4 = (0, 0, 1)ᵀ.
3. Determine whether S = {a⃗1, a⃗2, a⃗3, a⃗4} is linearly independent, where
a⃗1 = (1, 3, −4)ᵀ, a⃗2 = (7, 9, 8)ᵀ, a⃗3 = (−1, −2, 1)ᵀ, a⃗4 = (6, −2, 5)ᵀ.
4. Determine whether S = {a⃗1, a⃗2, a⃗3} is linearly independent, where
a⃗1 = (−1, 0, −1)ᵀ, a⃗2 = (3, −1, −6)ᵀ, a⃗3 = (1, 2, 3)ᵀ.
A.8 Eigenvalues and Diagonalizability
Section 8.1
1. For each of the following matrices, find its eigenvalues and determine which eigenvalues are repeated
eigenvalues.
A = (1, 2; 0, 1), B = (2, 0, 0; 1, 1, 0; 2, −3, −4), C = (−1, 0, 0, 0; 0, 2, 0, 0; 0, 0, −1, 0; 0, 0, 0, 0).
2. For each of the following matrices, find its eigenvalues and eigenspaces.
A = (5, 7; 3, 1), B = (3, −1; 1, 5), C = (3, 0; 0, 3), D = (1, 2, 3; 2, 1, 3; 1, 1, 2).
A.9 Vector spaces
3. Let
ℝ²₊ = {(x, y) : x ≥ 0 and y ≥ 0}.
i. Show that if a⃗ ∈ ℝ²₊ and b⃗ ∈ ℝ²₊, then a⃗ + b⃗ ∈ ℝ²₊.
ii. Show that ℝ²₊ is not a subspace of ℝ² by using Definition 9.1.1.
4. Let
W = ℝ²₊ ∪ (−ℝ²₊).
i. Show that if a⃗ ∈ W and k is a real number, then ka⃗ ∈ W.
ii. Show that W is not a subspace of ℝ² by using Definition 9.1.1.
5. Let W = {(x, y, z) ∈ ℝ³ : x − 2y − z = 0}. Show that W is a subspace of ℝ³ by Definition 9.1.1 and Theorem 9.1.1, respectively.
6. Show that W is a subspace of ℝ⁴, where
W = {(x, y, z, w) ∈ ℝ⁴ : x − 2y + 2z − w = 0 and −x + 3y + 4z + 2w = 0}.
7. Show that W is a subspace of ℝ³, where
W = {(x, y, z) ∈ ℝ³ : x + 2y + 3z = 0, −x + y − 4z = 0, x − 2y + 5z = 0}.
8. Let W = {(x, y, z) ∈ ℝ³ : x − 2y − z = 3}. Show that W is not a subspace of ℝ³ by Definition 9.1.1.
9. Let W = [0, 1]. Show that W is not a subspace of R.
10. Show that the only subspaces of ℝ are {0} and ℝ.
11. Let W = {(0, 0, ⋯, 0)} ⊆ ℝⁿ. Show that W is a subspace of ℝⁿ.
12. Let υ⃗ ∈ ℝⁿ be a given nonzero vector. Show that
W = {tυ⃗ : t ∈ ℝ}
is a subspace of ℝⁿ.
A.10 Complex numbers
Section 10.1
1. Solve each of the following equations.
a. z 2 + z + 1 = 0
b. z 2 − 2z + 5 = 0
c. z 2 − 8z + 25 = 0
3. Let z1 = 2 + i and z2 = 1 − i. Express each of the following complex numbers in the form x + yi,
calculate its modulus, and find its conjugate.
1. z1 + 2z2
2. 2z1 − 3z2
3. z1 z2
4. z1/z2
4. Express each of the following complex numbers in the form x + yi.
b. (1 − i)/(2 − i)
c. ((1 − i)(2 + i))/(i(1 + i)(2 − i))
d. (1 + i)/(1 − i) − (2 − i)/(1 + i)
5. Find i², i³, i⁴, i⁵, i⁶ and compute (−i − i⁴ + i⁵ − i⁶)¹⁰⁰⁰.
6. Let z = ((1 + i)/(1 − i))⁸. Calculate z⁶⁶ + 2z³³ − 2.
7. Find z ∈ C such that
a. iz = 4 + 3i
b. (1 + i)z̄ = (1 − i)z
8. Find x, y ∈ R such that
(x + 1 + i(y − 3))/(5 + 3i) = 1 + i.
9. Let z = ((3 + 4i)(1 + i)¹²)/(i⁵(2 + 4i)²). Find |z|.
Index
2 × 2 matrix, 14
3 × 3 determinant, 95
ℝⁿ-space, 1
n-column vector, 2
n-dimensional space, 221
n-dimensional subspace, 221
n × n determinant, 102
determinant of a 2 × 2 matrix, 14
adjoint, 115
algebraic multiplicity, 248
angle, 19
area of the parallelogram, 26
argument, 293
augmented matrix, 126
axioms, 281
back-substitution, 132
basic variable, 132
basis of a solution space, 221
basis of a spanning space, 221
basis vectors of the solution space, 144
Cauchy-Schwarz inequality, 15
change of bases, 239
change of coordinates, 239
characteristic equation, 247
characteristic polynomial, 247
coefficient matrix, 126
cofactor, 97, 101
column matrix, 39
column spaces, 230
complex number, 287
conjugate, 291
consistency, 156, 166
coordinates, 237
Cramer’s rule, 150
eigenvalue, 246
eigenvector, 246
elementary matrices, 78
equal matrices, 40
equal vector, 4
equations of lines, 200
equations of planes, 200
Euclidean space, 1
Euler’s formula, 296
expansion of cofactors, 98
expansion of minors, 97
exponential form, 293
identity matrix, 57
image, 172
imaginary axis, 293
imaginary number, 288
imaginary part, 288
inverse image, 172
inverse matrix method, 150
inverse of a square matrix, 85
kernel, 180
Lagrange’s identity, 27
leading 1, 61
leading entry, 61
linear combination, 31, 166
linear operator, 172
linear system, 126
linear transformation, 172
linearly dependent, 217
linearly independent, 217
lower triangular matrix, 57
matrix, 38
matrix of cofactors, 115
midpoint formula, 18
minor, 97, 101
modulus, 290
multiplication, 172
range, 178
rank, 74, 156
real axis, 293
real part, 288
reflection operator, 184
root of a complex number, 302
rotation operator, 184
row echelon matrix, 62
row equivalent, 76
row matrix, 39
row operations of matrices, 63
row spaces, 230
sets, 279
similar matrices, 272, 448
size of a matrix, 38
sizes of two matrices, 51
solution space, 129, 274
spanning space, 35
spanning spaces, 31, 277
square matrix, 56
standard basis, 221
standard matrix, 172
standard vectors, 2
subspace, 273
subspaces of Rn, 273
subspaces of vector spaces, 285
symmetric matrix, 56
system of n variables, 125
zero matrix, 39
zero row, 61
zero vector, 2