HONOURS MODERATION LINEAR ALGEBRA I, 2008/09
ANNE HENKE
Contents
Topics Covered
Some remarks
Solving mathematical problems
1. Fields
2. The Algebra of Matrices
3. Vector Spaces
4. First Properties of Vector Spaces
5. Subspaces of Vector Spaces
6. Linear Dependence, Linear Independence and Spanning
7. Bases of Vector Spaces
8. Steinitz Exchange Procedure
9. Dimension of Vector Spaces and an Application to Sums
10. Linear Transformations
11. The Rank-Nullity Theorem
12. The Matrix Representation of a Linear Transformation
13. Row Reduced Echelon Matrices
14. Systems of Linear Equations
15. Invertible Matrices and Systems of Linear Equations
16. Elementary Matrices
17. Row Rank and Column Rank
Topics Covered
Algebra of matrices.
Vector spaces over the real numbers; subspaces. Linear dependence and linear
independence. The span of a (finite) set of vectors; spanning sets. Examples.
Finite dimensionality.
Definition of bases; reduction of a spanning set and extension of a linearly inde-
pendent set to a basis; proof that all bases have the same size. Dimension of a
vector space. Co-ordinates with respect to a basis.
Sums and intersections of subspaces; formula for the dimension of the sum.
Linear transformations from one (real) vector space to another. The image and
kernel of a linear transformation. The rank-nullity theorem. Applications.
The matrix representation of a linear transformation with respect to fixed bases;
change of basis and co-ordinate systems. Composition of transformations and
product of matrices.
Elementary row operations on matrices; echelon form and row-reduction. Matrix
representation of a system of linear equations. Invariance of the row space under
row operations; row rank.
Significance of image, kernel, rank and nullity for systems of linear equations.
Solution by Gaussian elimination. Bases of solution space of homogeneous equa-
tions. Applications to finding bases of vector spaces.
Invertible matrices; use of row operations to decide invertibility and to calculate
inverse.
Column space and column rank. Equality of row rank and column rank.
Some remarks
This set of notes is a collection of material which will be covered in this course, in
about the given order. It contains a few things which are not part of the syllabus,
like the section on fields, examples of vector spaces over a field different from R,
or some comments on infinite dimensional vector spaces. I will use this collection
of material to prepare the lectures. This means that I may spontaneously decide
to give different examples, or to elaborate on something that I did not write down
in these notes. I would strongly advise you to take your own notes in the lectures,
not least because taking notes helps you to concentrate on the lecture. Equally
important is that you read linear algebra books; they are not only written with
much more care than these notes, they also contain more examples, more details
and more background material.
Please let me know of any necessary corrections of the mathematics, preferably
by email to: henke@maths.ox.ac.uk.
solution. When you have found a solution, it can be helpful to see how other
people solve the problem, or to let other people criticise your own solution.
Writing. This is a critical moment. Here it becomes clear whether the solution
you have thought out can really be written down. Every correct solution can be
written down in a sensible way. If you have trouble writing down your solution,
then you have not yet ordered your thoughts enough; you have not yet fully
understood the solution, the mathematical mechanism. Think again: you have
not yet reached the final goal! There are two bad extremes of writing style: the
first is to just write a calculation without an argument; the second is to write a
whole novel without talking precisely about the problem. The correct way is
somewhere in the middle. Give precise arguments. Moreover, a solution to a
problem consists of properly readable English text. We are not talking
maths-language; we speak and write English when we explain a solution. We write
full sentences (not just a formula without giving any context). Can you read your
text aloud and does it still make sense? Expect that you are not handing in the
first written version of your solution. And of course, be kind and respectful to
your tutor: write clearly and do not hand in pages where half of the text is crossed
out or your coffee pot spilled over. Your tutors also care about you making
progress. Do you still understand your own solution a couple of days later? If not,
start again. Getting a correct, elegant and well written solution is often hard
work! But you will feel good when you have achieved it.
Presenting. The communication of a solution is an important part of math-
ematical work. It is part of your education at university to give a clear and
understandable presentation of your work. You will likely need this skill whatever
you do after university. Practise now; learning it later will be harder.
The above is a free translation of parts of the student advice given by Prof M
Lehn (University of Mainz); for the original see
http://www.mathematik.uni-mainz.de/Members/lehn/le/uebungsblatt.
1. Fields
This chapter is not part of the syllabus of this course. It is included in these notes
to indicate the more general setting in which linear algebra is defined. Fields will
properly be introduced and studied in some depth in the second year. The objects
that we study in this course – vector spaces, subspaces, linear maps (but similarly
groups, fields and many other mathematical objects which you meet later) – are
typically defined by a list of axioms that need to be satisfied.
Notation 1.1.
C = {a + bi | a, b ∈ R} = set of all complex numbers,
R = set of all real numbers,
Q = {m/n | m, n ∈ Z, n ≠ 0} = set of all rational numbers,
Z = set of all integers,
N = set of all natural numbers.
Note: Z ⊆ Q ⊆ R ⊆ C.
Recall that the addition and multiplication of complex numbers is given by:
(a + bi) + (c + di) = (a + c) + (b + d)i,
(a + bi) · (c + di) = (ac − bd) + (ad + bc)i.
Note that this generalises the addition and multiplication of the subsets N, Z, Q
and R.
Definition 1.2. Let K be a subset of the complex numbers. Then K is called a
field if it satisfies the following conditions:
(b) Since a ∈ Z =⇒ −a ∈ Z, we have −x = (−a)/b ∈ Q. Moreover
x⁻¹ = b/a ∈ Q. Hence (K2) holds.
(c) 0, 1 are elements of Q as 0 = 0/1 and 1 = 1/1.
2. The Algebra of Matrices

We will meet matrices at various places in this course. They provide an example
of the most important objects of linear algebra, the so-called vector spaces.
They will also be of fundamental importance when we study maps between vector
spaces. Finally, they will be important when we study systems of linear equations.
In this section, we introduce matrices and operations defined for matrices. We are
interested in which algebraic relations matrices satisfy. Matrices can be defined
over any field K. In this course we always take K = R. We assume in this section
the usual rules of how to add and multiply elements in R. In the more general
setting of a field K these rules are precisely part of the definition of what a field
is.
Definition 2.1. Let m, n be natural numbers. An m × n matrix over R is an
array
\[
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix},
\]
where aij ∈ R. We write A = (aij)_{1≤i≤m, 1≤j≤n} for short, or A = (aij) if the shape
of the matrix is understood. A matrix of shape m × 1 is called a vector. We
define Mm×n(R) as the set of all m × n matrices with real entries. In particular
Mn(R) = Mn×n(R) is the set of real square matrices of size n.
Example 2.2. Matrix A given below is a 3 × 3 matrix. We have three rows and
three columns. Matrix B below is a general 3 × 3 matrix with entries bij ∈ R
where 1 ≤ i, j ≤ 3. We speak of bij as the (i, j)th entry of the matrix B. This
entry lies in row i and column j. Matrices need not be square; note that the
definition takes account of a general m × n matrix, a matrix that has m rows and
n columns. Matrix C below is an example of a 2 × 4 matrix.
\[
A = \begin{pmatrix} 2 & 1 & 1 \\ 4 & 1 & 0 \\ -2 & 2 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix}, \qquad
C = \begin{pmatrix} -8 & 1 & 1 & -1 \\ 4 & 1 & 0 & 5 \end{pmatrix}.
\]
Example 2.3. Define 0_{m×n} = (aij) where aij = 0 for i = 1, . . . , m and j = 1, . . . , n.
This is called the zero matrix of Mm×n(R). Moreover, define the matrix In = (aij)
where
\[
a_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j, \end{cases}
\]
for 1 ≤ i, j ≤ n. This is called the n × n identity matrix. Often the zero matrix
and the identity matrix are just denoted by 0 and I respectively:
\[
0 = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}, \qquad
I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.
\]
Remark. Note that addition of matrices is only defined for matrices of the same
shape. If A, B are two m × n matrices then also A + B is an m × n matrix. We
say: We add two matrices by adding the entries coordinate-wise. Note that this
also defines addition of vectors (which are special matrices by definition).
Example 2.5. For example, for 3 × 3 matrices we have:
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
+ \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix}
= \begin{pmatrix} a_{11}+b_{11} & a_{12}+b_{12} & a_{13}+b_{13} \\ a_{21}+b_{21} & a_{22}+b_{22} & a_{23}+b_{23} \\ a_{31}+b_{31} & a_{32}+b_{32} & a_{33}+b_{33} \end{pmatrix}.
\]
We next define scalar multiplication of matrices. Note that this definition includes
also the definition of the scalar multiplication of vectors.
Definition 2.6. The product of a matrix A ∈ Mm×n (R) by a scalar λ ∈ R,
written as λA is the matrix C = (cij ) obtained by multiplying each entry of A by
λ: cij = λaij for 1 ≤ i ≤ m and 1 ≤ j ≤ n:
\[
\lambda A = \begin{pmatrix} \lambda a_{11} & \cdots & \lambda a_{1n} \\ \vdots & \ddots & \vdots \\ \lambda a_{m1} & \cdots & \lambda a_{mn} \end{pmatrix}.
\]
Example 2.7. If A = (aij) is a square matrix with aij = 0 for all i ≠ j, then A
is called a diagonal matrix, and we write A = diag(a11, . . . , ann). A special type of
a diagonal matrix is the scalar matrix: matrix B is a scalar matrix if B = kIn
for some k ∈ R.
\[
A = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}, \qquad
B = kI_n = \begin{pmatrix} k & 0 & \cdots & 0 \\ 0 & k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & k \end{pmatrix}.
\]
Proposition 2.8. For all A, B, C ∈ Mm×n (R) and for all r, s ∈ R we have:
(1) A + B = B + A,
(2) A + (B + C) = (A + B) + C,
(3) A + 0 = A = 0 + A,
(4) s(rA) = (sr)A,
(5) (r + s)A = rA + sA,
(6) r(A + B) = rA + rB.
Note that 0 in statement (3) denotes the zero matrix of shape m × n. Note that
statement (2) says that, when forming a sum of matrices, brackets can safely be
omitted.
The proof of the above statements is straightforward. To demonstrate how such
proofs should be written, we give an example by proving the first statement.
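The proof of statement (1) itself does not survive in this copy of the notes; the following short argument is a sketch of how it might run, reducing everything to the commutativity of addition in R.

Let A = (aij) and B = (bij) be m × n matrices over R. By the definition of matrix addition, the (i, j)th entry of A + B is aij + bij, and the (i, j)th entry of B + A is bij + aij. Since addition in R is commutative,
\[
a_{ij} + b_{ij} = b_{ij} + a_{ij} \qquad \text{for all } 1 \le i \le m,\ 1 \le j \le n,
\]
so A + B and B + A have the same entries, that is, A + B = B + A.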
There is one more operation for matrices which we need, namely the product of
matrices. We first recall the summation notation: we write $\sum_{i=1}^{n} a_i$ for
the sum a1 + a2 + . . . + an of real numbers ai.
Definition 2.9. If A = (aij) is an m × n matrix over the real numbers and
B = (bij) is an n × p matrix over the real numbers, then the product AB is an
m × p matrix, defined as follows: AB = C = (cij) where $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$ for
1 ≤ i ≤ m and 1 ≤ j ≤ p.
Remark. Note that multiplication of two matrices A and B is only defined if the
number of elements in a row of A equals the number of elements in a column of
B. If it is defined then the (i, j)th entry of the matrix C = AB is obtained by
multiplying the ith row of matrix A with the jth column of the matrix B:
\[
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_{i1} b_{1j} + a_{i2} b_{2j} + \ldots + a_{in} b_{nj}.
\]
Example 2.10. The above definition includes the multiplication of a matrix with
a vector. For example, let A be a 3 × 3 matrix and x be a 3 × 1 vector, then Ax
is defined and is another 3 × 1 vector. For example,
\[
\begin{pmatrix} 2 & 1 & 1 \\ 4 & 1 & 0 \\ -2 & 2 & 1 \end{pmatrix} \cdot \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 7 \\ -6 \end{pmatrix}.
\]
Proposition 2.11. Suppose A, B, C are matrices.
Remark. Note that statement (4) says that, when forming a product of matrices,
brackets can safely be omitted. The proofs of all statements except (4) are
straightforward; they are left as an exercise to the reader.
We conclude this section by defining some more language about matrices and
by giving some more examples.
Example 2.12. (1) For every matrix A = (aij ), there exists a matrix B such
that A + B = 0 = B + A, namely B = (−aij ). We call B the additive
inverse of A and write −A for it. Note that −A = (−1)A; this last
equation says that the additive inverse of A is given by multiplying the
matrix A with the scalar (−1).
(2) If A ∈ Mn (R) (a square matrix) and there exists B ∈ Mn (R) such that
AB = BA = In , then we call B the (multiplicative) inverse of A. We write
A−1 for the inverse matrix of A. For example, the matrices A, B, D below
are invertible with A−1 = A, B −1 = D and D−1 = B. Matrix C is not
invertible. So there are non-zero square matrices which are not invertible.
\[
A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
B = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}, \quad
C = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
D = \begin{pmatrix} 1 & -a \\ 0 & 1 \end{pmatrix}.
\]
Example 2.13. If A ∈ Mm×n (R) and A = (aij ) then AT = (bij ) where bij = aji .
We call AT the transposed matrix of A. In particular AT ∈ Mn×m (R). Note rows
of AT are columns of A, and columns of AT are rows of A. For example,
\[
A = \begin{pmatrix} 2 & 1 & 4 \\ 3 & 1 & 2 \end{pmatrix} \quad \text{then} \quad A^T = \begin{pmatrix} 2 & 3 \\ 1 & 1 \\ 4 & 2 \end{pmatrix}.
\]
We say A is a symmetric matrix if AT = A. We say A is skew symmetric if
AT = −A. We say A is orthogonal if AAT = AT A = I. Equivalently, A is
orthogonal if A is invertible and A−1 = AT .
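As an illustration of an orthogonal matrix (this particular example is ours and is not taken from the notes), one may check that a rotation matrix satisfies the defining condition:
\[
A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \qquad
A A^T = \begin{pmatrix} \cos^2\theta + \sin^2\theta & 0 \\ 0 & \sin^2\theta + \cos^2\theta \end{pmatrix} = I_2,
\]
and similarly $A^T A = I_2$, so A is orthogonal with $A^{-1} = A^T$.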
Proposition 2.14. Let A, B ∈ Mm×n (R), C ∈ Mn×p (R). Then
(1) (AT )T = A,
(2) (A + B)T = AT + B T ,
(3) (λA)T = λAT ,
(4) (BC)T = C T B T .
Proof. We prove the first property and leave the others as straightforward
exercises to the reader. Let A = (aij) be a matrix of shape m × n. Note that
A and (AT )T have the same shape. By the definition of a transposed matrix,
the (i, j)th entry of (AT )T equals the (j, i)th entry of AT , which in turn equals
the (i, j)th entry of A. So the entries of A and (AT )T coincide. Hence indeed
A = (AT )T .
Example 2.15. Let us look at an example of how to check whether a matrix is
symmetric. Let A, B be symmetric matrices of the same size. Is the matrix AB again
symmetric? We claim that AB is symmetric if and only if AB = BA. To prove
this, we need to show two directions:
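The two directions of the argument are not reproduced in this copy of the notes; the following is a sketch of how they can be carried out using Proposition 2.14(4) and the symmetry of A and B.

Assume first that AB = BA. Then
\[
(AB)^T = B^T A^T = BA = AB,
\]
so AB is symmetric. Conversely, assume that AB is symmetric. Then
\[
AB = (AB)^T = B^T A^T = BA,
\]
so AB = BA. This proves the claim.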
Proof. We prove (4); the rest is left as an exercise to the reader. Let (xij) = DC.
Then $x_{ij} = \sum_{k=1}^{m} d_{ik} c_{kj}$. Similarly, let (yij) = CD. Then $y_{ij} = \sum_{t=1}^{n} c_{it} d_{tj}$. Since
addition and multiplication in R are commutative, we have:
\[
\mathrm{tr}(DC) = \sum_{t=1}^{n} x_{tt}
= \sum_{t=1}^{n} \sum_{k=1}^{m} d_{tk} c_{kt}
= \sum_{k=1}^{m} \sum_{t=1}^{n} c_{kt} d_{tk}
= \sum_{k=1}^{m} y_{kk} = \mathrm{tr}(CD).
\]
(For those of you who found this problem too easy: Find all n × n matrices which
commute with any matrix A ∈ Mn (R) for fixed n ∈ N. Justify your answer.)
Exercise 3. For each a ∈ R, define the matrix A(a) by
\[
A(a) = \begin{pmatrix} 1 & a & \tfrac{1}{2}a^2 \\ 0 & 1 & a \\ 0 & 0 & 1 \end{pmatrix}.
\]
Show that for all a, b ∈ R we have A(a + b) = A(a)A(b). Deduce that each matrix
A(a) is invertible.
Exercise 4. Let A and B be two square matrices of the same size with A sym-
metric and B skew symmetric. Determine which of the following matrices are
symmetric and which are skew symmetric, and justify your answer:
(a) AB + BA,
(b) AB − BA,
(c) A2 ,
(d) B²,
(e) B T (AT + A)B,
(f) B T (A − AT )B.
3. Vector Spaces
We next define the main objects of linear algebra, the vector spaces over a field
K. Although we will define vector spaces over any field K, you may always take
K = R. The concrete example of vectors will underpin the abstract definition.
Definition 3.1. A vector space V over K is a triple (V, +, ·) where
such that:
The elements of V are called vectors. If it is understood what + and · are, we will
shortly write V is a vector space instead of writing (V, +, ·) is a vector space. We
also will typically write λv instead of λ · v, and 0 instead of 0V . Note also that the
addition and scalar multiplication are examples of so-called binary operations.
Remark. Note that the statements in (b) and (c) mean that when checking that
some set V is a vector space, you have to check that for all u, v ∈ V and λ ∈ K,
the resulting elements u + v and λv are indeed elements belonging to V .
Example 3.2. The canonical example of a vector space is V = Rⁿ where
\[
\mathbb{R}^n = \left\{ \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \;\middle|\; x_i \in \mathbb{R} \right\}.
\]
Given two elements u, v ∈ V, there are xi, yi ∈ R for 1 ≤ i ≤ n such that
\[
u = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad
v = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}.
\]
(V1) Since for real numbers we have xi + yi = yi + xi (for i = 1, . . . , n), it follows
that
\[
u + v = \begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_n + y_n \end{pmatrix} = \begin{pmatrix} y_1 + x_1 \\ \vdots \\ y_n + x_n \end{pmatrix} = v + u.
\]
(V2) Since for real numbers we have (xi + yi) + zi = xi + (yi + zi) (for i = 1, . . . , n),
it follows that
\[
(u + v) + w = \begin{pmatrix} (x_1 + y_1) + z_1 \\ \vdots \\ (x_n + y_n) + z_n \end{pmatrix} = \begin{pmatrix} x_1 + (y_1 + z_1) \\ \vdots \\ x_n + (y_n + z_n) \end{pmatrix} = u + (v + w).
\]
(V3) We take
\[
0_V = \begin{pmatrix} 0_{\mathbb{R}} \\ \vdots \\ 0_{\mathbb{R}} \end{pmatrix}.
\]
Note that 0R denotes here the zero of the real numbers. Clearly 0V ∈ Rⁿ.
Over the real numbers we have xi + 0 = 0 + xi = xi (for i = 1, . . . , n),
which implies that
\[
u + 0_V = \begin{pmatrix} x_1 + 0_{\mathbb{R}} \\ \vdots \\ x_n + 0_{\mathbb{R}} \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = u.
\]
Similarly, 0V + u = u.
Hence we have verified that Rn with addition and scalar multiplication as above
is indeed a vector space.
Example 3.3. Fix a natural number n. Define Rn [x] to be the set of all polyno-
mials f (x) of degree less than or equal to n with coefficients in R. So
Rn [x] = {f | f (x) = a0 + a1 x + · · · + an xn with ai ∈ R for i = 0, . . . , n},
which clearly is a non-empty set. Addition is defined on Rn [x] as follows. Let
f (x) = a0 + a1 x + · · · + an xn ,
g(x) = b0 + b1 x + · · · + bn xn .
Then (f + g)(x) := (a0 + b0) + · · · + (an + bn)xⁿ. Clearly f + g ∈ Rn[x] since
ai + bi ∈ R for i = 0, . . . , n. Hence Rn[x] is closed with respect to addition.
Scalar multiplication is defined by (λf)(x) = (λa0) + · · · + (λan)xⁿ. As λai ∈ R it
follows that Rn[x] is closed with respect to scalar multiplication. To check axioms
(V1)–(V8) is left as an exercise to the reader. Note that the set of polynomials
of (some fixed) degree n does not form a vector space.
Example 3.4. Consider the m × n matrices with entries in R, that is, consider
V = Mm×n (R) = {A | A = (aij ), aij ∈ R, i = 1, . . . , m, j = 1, . . . , n}.
Then Mm×n (R) forms a vector space with component-wise addition (see Defini-
tion 2.4) and scalar multiplication (see Definition 2.6). The proof is left as an
exercise. Hint: the element 0V is given in Example 2.3. Note that Proposition 2.8
shows some of the vector space axioms for Mm×n (R).
Example 3.5. There are many more examples of vector spaces, turning up in
different areas of mathematics. Vector spaces are defined over any field K. Similar
to above we have that
The precise definitions of the sets and of the binary operations are left to the reader.
When doing this exercise it becomes apparent which properties a field needs to
have. We also have the following variations of the above examples:
And so on. In this course we only consider vector spaces over the real numbers.
Vector spaces over other fields than the real numbers are not part of the syllabus.
From now on we will work only with vector spaces over R. It should however be
noted that we could develop our theory equally for vector spaces over fields. The
interested reader may try this as a (not so difficult, and eventually boring) exer-
cise. Examples (3) and (4) are examples of so-called field extensions, something
studied in the second year in the course “Fields”.
Example 3.6. Let X be a non-empty set. Let V = {f : X → R}. We write
x 7→ f (x) for x ∈ X. Then V is a vector space over R with the following addition
and scalar multiplication:
(a) Addition is defined by: For f1 , f2 ∈ V define (f1 + f2 )(x) := f1 (x) + f2 (x).
Note that f1 (x) ∈ R and f2 (x) ∈ R, and hence f1 (x) + f2 (x) ∈ R. So
f1 + f2 ∈ V and hence V is closed with respect to addition.
(b) Scalar multiplication is defined by: For f ∈ V and λ ∈ R define (λf )(x) =
λ · f (x). Since λ ∈ R and f (x) ∈ R, this implies λ · f (x) ∈ R. Hence
λf ∈ V and V is closed with respect to scalar multiplication.
4. First Properties of Vector Spaces

Using the axioms (V1)–(V8) of vector spaces, we can now derive some first
properties of vector spaces. Throughout this section, let V be a vector space over
R.
Proof. (i) Let 0V , 0′V ∈ V both have the property described in Axiom (V3),
that is:
v + 0V = 0V + v = v ∀v ∈ V, (a)
v + 0′V = 0′V + v = v ∀v ∈ V. (b)
Then put v = 0′V in (a) and we have 0′V + 0V = 0V + 0′V = 0′V . Put v = 0V in (b),
then 0V + 0′V = 0′V + 0V = 0V . Hence 0V = 0′V .
(ii) Suppose we have two elements −v and v ′ both satisfying Axiom (V4), that is:
Then
\[
\begin{aligned}
v + (-v) + v' &= (v + (-v)) + v' && \text{by Axiom (V2)},\\
&= 0_V + v' && \text{by (a)},\\
&= v' && \text{by Axiom (V3)}.
\end{aligned}
\]
Hence v ′ = −v.
Lemma 4.2. Let V be a vector space over R. Then for all u, v ∈ V and λ ∈ R
we have:
(i) 0R · v = 0V ;
(ii) λ · 0V = 0V ;
(iii) if λ · v = 0V then λ = 0R or v = 0V ;
(iv) (−1) · v = −v;
(v) (−1) · (−v) = v.
(vi) −0V = 0V .
Proof. (i) We need to show that 0R · v satisfies Axiom (V3). If so, then
Lemma 4.1(i) implies 0R · v = 0V . Now
\[
\begin{aligned}
v + 0_{\mathbb{R}} \cdot v &= 1 \cdot v + 0_{\mathbb{R}} \cdot v && \text{by (V8)},\\
&= (1 + 0_{\mathbb{R}}) \cdot v && \text{by (V6)},\\
&= 1 \cdot v && \text{since } 1 + 0_{\mathbb{R}} = 1 \text{ in } \mathbb{R},\\
&= v && \text{by (V8)}.
\end{aligned}
\]
Therefore 0R · v = 0V .
(ii) Exercise.
(iii) Let λ · v = 0V and λ ≠ 0. We prove that v = 0V. Now
\[
\begin{aligned}
v &= 1 \cdot v && \text{by (V8)},\\
&= (\lambda^{-1} \cdot \lambda) \cdot v && \text{as } \lambda^{-1} \lambda = 1 \text{ in } \mathbb{R} \text{ and } \lambda \neq 0,\\
&= \lambda^{-1} \cdot (\lambda \cdot v) && \text{by (V7)},\\
&= \lambda^{-1} \cdot 0_V && \text{by assumption},\\
&= 0_V && \text{by (ii)}.
\end{aligned}
\]
(iv)-(vi) Exercise.
Exercise 8. In each of the following cases, either give a careful proof that V is a
vector space over R, or give a reason why it is not:
(a) V is the set of all polynomials over R (in one variable, say x) which have
a non-zero constant term, with the usual addition of polynomials and the
usual scalar multiplication.
(b) V is the set of all functions f : X → R (for some fixed non-empty set X),
and if f, g ∈ V , α ∈ R, then the functions f + g, αf are defined by setting
(f + g)(x) = f (x) + g(x), (αf )(x) = αf (x).
(c) V is the set of all symmetric n × n matrices over R.
(d) V is the set of all skew-symmetric n × n-matrices over R.
(e) V is the set of all invertible n × n-matrices over R.
(f) V = R2 with the usual scalar multiplication and the new addition ⊕ :
V × V → V given by
\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \oplus \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_1 + y_2 \\ x_2 + y_1 \end{pmatrix}.
\]
Exercise 9. Let V be a vector space over R. Use the vector space axioms to
show that for all v ∈ V and all λ ∈ R the following holds:
(a) λ · 0V = 0V , (b) (−1)v = −v.
Remarks. (a) Every vector space V has at least two subspaces: {0V } and V itself.
Any subspace W of V with W 6= {0V } and W 6= V is called a proper subspace.
(b) The zero element of a subspace W of V always coincides with the zero element
of V . To see this, use Lemma 4.1 and Definition 3.1 for V .
(c) The two binary operations (addition and scalar multiplication) needed to
define a vector space structure on W are precisely the two binary operations
given with the vector space V, restricted to the subset W. The definition now
says that, in order to check that a subset of a vector space is again a vector
space, we need to check that W is closed with respect to addition and scalar
multiplication, and that the eight axioms (V1)–(V8) of a vector space hold for W.
Since W inherits the binary operations from V, several of the axioms hold
automatically for the elements of a subset of a vector space.
Lemma 5.2 (First subspace test). A non-empty subset W of a vector space V is
a subspace of V if and only if it is closed under addition and scalar multiplication:
(i) If w1 , w2 ∈ W then w1 + w2 ∈ W .
(ii) If w ∈ W and λ ∈ R then λw ∈ W .
We call Lemma 5.2 the first subspace test. It obviously speeds up the checking
that we have to do in order to prove that a given non-empty subset of a vector
space is a subspace. Often you will see conditions (i), (ii) in Lemma 5.2 simplified
into one condition. This is called the second subspace test.
Lemma 5.3 (Second subspace test). A non-empty subset W of a vector space V is
a subspace if and only if for any λ1 , λ2 ∈ R and w1 , w2 ∈ W then λ1 w1 +λ2 w2 ∈ W .
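As a small illustration of the second subspace test (the particular set here is our own example, not taken from the notes), consider the line through the origin U = {(x, 2x)ᵀ | x ∈ R} ⊆ R².

U is non-empty since (0, 0)ᵀ ∈ U. For λ1, λ2 ∈ R and (x1, 2x1)ᵀ, (x2, 2x2)ᵀ ∈ U we have
\[
\lambda_1 \begin{pmatrix} x_1 \\ 2x_1 \end{pmatrix} + \lambda_2 \begin{pmatrix} x_2 \\ 2x_2 \end{pmatrix}
= \begin{pmatrix} \lambda_1 x_1 + \lambda_2 x_2 \\ 2(\lambda_1 x_1 + \lambda_2 x_2) \end{pmatrix} \in U,
\]
so by Lemma 5.3 the set U is a subspace of R².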
Proof. The proof is by induction on k, using Lemma 5.3. Let k = 1 then the
claim follows from the definition of a vector space, see Definition 3.1. If k = 2
then the claim follows from Lemma 5.3. Assume the claim is true for some k ≥ 2.
Consider
x = α1 u1 + . . . + αk uk + αk+1 uk+1 .
Put ũ = α1 u1 + . . . + αk uk . By induction assumption, ũ ∈ U . By (V8) we have
ũ = 1 · ũ. Hence x = 1 · ũ + αk+1 uk+1 . By Lemma 5.3 we have x ∈ U .
Clearly U + W is a subset of V .
Example 5.8. Let V = R2 and let U be the x-axis and W be the y-axis:
\[
U = \left\{ \begin{pmatrix} x \\ 0 \end{pmatrix} \;\middle|\; x \in \mathbb{R} \right\}, \qquad
W = \left\{ \begin{pmatrix} 0 \\ y \end{pmatrix} \;\middle|\; y \in \mathbb{R} \right\}.
\]
Then U and W are subspaces of V and
\[
\begin{aligned}
U \cap W &= \left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix} \right\} \neq \emptyset,\\
U \cup W &= \left\{ \begin{pmatrix} x \\ 0 \end{pmatrix} \;\middle|\; x \in \mathbb{R} \right\} \cup \left\{ \begin{pmatrix} 0 \\ y \end{pmatrix} \;\middle|\; y \in \mathbb{R} \right\},\\
U + W &= \left\{ \begin{pmatrix} x \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ y \end{pmatrix} \;\middle|\; x, y \in \mathbb{R} \right\}
= \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \;\middle|\; x, y \in \mathbb{R} \right\} = \mathbb{R}^2.
\end{aligned}
\]
The sets U ∩ W and U + W are vector spaces, and hence subspaces of V . The
set U ∪ W is not a vector space. It is not closed under addition: (1, 0)T ∈ U and
(0, 1)T ∈ W , hence (1, 0)T , (0, 1)T ∈ U ∪ W . However
\[
\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \notin U \cup W.
\]
Example 5.9. Let V = M2×2 (R), the set of 2 × 2 matrices with entries in R. Let
\[
U = \left\{ \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix} \;\middle|\; a, b \in \mathbb{R} \right\}, \qquad
W = \left\{ \begin{pmatrix} x & 0 \\ y & 0 \end{pmatrix} \;\middle|\; x, y \in \mathbb{R} \right\}.
\]
Then U and W are subspaces of V, and
\[
U + W = \left\{ \begin{pmatrix} z & b \\ y & 0 \end{pmatrix} \;\middle|\; z, b, y \in \mathbb{R} \right\}, \qquad
U \cap W = \left\{ \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} \;\middle|\; a \in \mathbb{R} \right\},
\]
since, as a and x range over R, so does z = a + x. The sets U ∩ W and
U + W are vector spaces, and hence subspaces of V. The set U ∪ W is not a
vector space: it is not closed under addition, just as in the previous example.
The proofs of the following (important) propositions are left as an exercise to the
reader.
Proposition 5.10. Let U, W be subspaces of a vector space V .
(a) U ∩ W is a subspace of V ;
(b) U + W = {u + w | u ∈ U, w ∈ W } is a subspace of V ;
(c) U ∪ W is a subspace of V if and only if U ⊂ W or W ⊂ U .
Exercise 11. Let V = R[x], the vector space of all real polynomials in one
variable x. Determine whether or not U is a subspace of V when:
(a) U consists of all polynomials with degree ≥ k for fixed k, together with
the zero polynomial;
(b) U consists of all polynomials with only even powers of x;
(c) U consists of all polynomials with integral coefficients;
(d) U consists of all polynomials p(x) ∈ R[x] with p(1) = p(5).
Exercise 12. (a) Let α ∈ R. Prove that Uα = {(x1 , x2 , x3 ) ∈ R3 | x1 + x2 +
x3 = α} is a subspace of R3 if and only if α = 0.
(b) Is the set U = {(x1, x2, x3, x4) ∈ R⁴ | x1² = 2x2 and x1 + x2 = x3 + x4} a
subspace of R4 ? Justify your answer.
Exercise 13. (a) Let S be the subset {(x, 0) | x ∈ R and x > 0} of R2 .
Is S a subspace of the vector space R2 with respect to the usual scalar
multiplication and the usual addition of R2 ?
(b) Let S be the subset {(x, 0) | x ∈ R and x > 0} of R2 . Define the scalar
multiplication ∗ and addition ⊕ on S by:
α ∗ (u, 0) = (u^α, 0), (u, 0) ⊕ (v, 0) = (uv, 0)
for all α, u, v ∈ R with u, v > 0. Show that S is a vector space with respect
to ∗ and ⊕. Is (S, ∗, ⊕) a subspace of R2 ?
Exercise 14. If A is a real n × n-matrix, prove that {x ∈ Mn×1 (R) | Ax = 0} is
a subspace of Rn .
Exercise 15. For each of the following statements about subspaces X, Y, Z of a
vector space V either give a proof of the statement, or find a counterexample. R2
and R3 will provide all the counterexamples required.
(b) (X ∩ Y ) + (X ∩ Z) = X ∩ (Y + Z);
(c) (X + Y ) ∩ (X + Z) = X + (Y ∩ Z);
(d) if Y ⊆ X, then Y + (X ∩ Z) = X ∩ (Y + Z).
Remark. At this point we can revisit the last remark given in Section 5. It is
now an easy exercise to show that the space generated by U ∪ W is precisely the
sum of U and W . It then follows from the last proposition, that it is the smallest
subspace containing both U and W .
Linear (in)dependence.
Definition 6.4. Let V be a vector space over R and let {v1 , . . . , vn } ⊂ V.
Hence
\[
0 = \sum_{\substack{i=1 \\ i \neq k}}^{n} \alpha_i v_i + (-1) v_k.
\]
Let αk = −1. Then $0 = \sum_{i=1}^{n} \alpha_i v_i$ and v1, . . . , vn are linearly dependent.
0 0 1
are linearly independent. The proof is left as an exercise to the reader.
(3) Let 1 ≤ i, j ≤ n and let V = Mn (R). Define Eij = (akl )1≤k≤n,1≤l≤n by
\[
a_{kl} = \begin{cases} 1 & \text{if } k = i,\ l = j, \\ 0 & \text{otherwise.} \end{cases}
\]
Then {Eij | 1 ≤ i ≤ n, 1 ≤ j ≤ n} is linearly independent.
Remark. We have worked with finite subsets S of a vector space, both for spanning
and linear independence. In fact, if we wanted to include infinite sets S in
our study, we would need to be more careful with our definitions. If S is any
(possibly infinite) subset of a vector space V, we define the span of S to be the
set of all linear combinations of finite subsets of S. We say an infinite family of
vectors S is linearly independent if each finite subset of S is linearly independent.
In this course we work only with so-called finite dimensional vector spaces. It will
therefore be enough to always assume that S is finite.
Exercise 16. (a) Which of the following sets of vectors in R3 are linearly
independent?
(i) {(1, 3, 0), (2, −3, 4), (3, 0, 4)},
(ii) {(1, 2, 3), (2, 3, 1), (3, 1, 2)}.
(b) Which of the following sets of vectors in V = {f : R → R} are linearly
independent?
(i) {f, g, h} with f (x) = 5x2 + x + 1, g(x) = 2x + 3 and h(x) = x2 − 1.
(ii) {p, q, r} with p(x) = cos2 (x), q(x) = cos(2x) and r(x) = 1.
(c) Determine all α ∈ R for which the set {(1, α, α), (α, 1, α), (α, α, 1)} is
linearly independent.
Exercise 17. Let V be an R-vector space, n ∈ N and v1 , . . . , vn ∈ V . Define
vectors wi for 1 ≤ i ≤ n by
\[
w_i = \sum_{j=1}^{i} v_j.
\]
7. Bases of Vector Spaces

Throughout this section, let V be a vector space over R. In the previous two
sections, we introduced the span of a finite set of vectors, and we studied what it
means for a finite set of vectors to be linearly independent. These two concepts
come together in the notion of a basis of a vector space.
Definition 7.1. Let V be a vector space over R and let {v1, . . . , vn} be elements
in V such that:
(1) V = Span{v1, . . . , vn},
(2) {v1, . . . , vn} are linearly independent.
Then {v1, . . . , vn} is called a basis of V.
Example 7.2.
(1) Rn has basis {ei | 1 ≤ i ≤ n}. See Example 6.2(1) and Example 6.9(1).
(2) Rn has basis {vi | 1 ≤ i ≤ n}. See Example 6.2(2) and Example 6.9(2).
(3) Mm×n (R) has basis {Eij | 1 ≤ i ≤ m, 1 ≤ j ≤ n}. See Example 6.2(4) and
Example 6.9(3).
(4) C as R-vector space has basis {1, i}.
(5) Claim: Rn [x] has basis {1, x, . . . , xn }.
Proof.
(a) Let f (x) ∈ Rn [x]. Then f (x) = a0 + a1 x + · · · + an xn for some ai ∈ R
and clearly f (x) ∈ Span{1, x, . . . , xn }.
(b) The vectors {1, x, . . . , xⁿ} are linearly independent: assume λi ∈ R
with λ0 + λ1x + · · · + λnxⁿ = 0. Then the polynomial
f(x) := λ0 + λ1x + · · · + λnxⁿ
is zero for every x ∈ R. By the fundamental theorem of algebra, a
non-zero polynomial of degree at most n has at most n roots over C
– roots of f(x) are by definition those values x with f(x) = 0 – and
hence at most n roots over R. Since R has more than n elements,
this implies that f(x) is the zero polynomial, that is
λ0 = λ1 = . . . = λn = 0.
Example 7.7. We demonstrate the idea of the proof of the last corollary with an
example. Let V = R³. Define
\[
v_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad
v_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad
v_3 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad
v_4 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \quad
v_5 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}.
\]
Find a linear dependence involving some (or all) of the five vectors, for example
v3 = v1 + v2 . Hence Span{v1 , v2 , v3 , v4 , v5 } = Span{v1 , v2 , v4 , v5 }. We next look
for a linear dependence of the remaining four vectors, for example v1 − v2 + v4 =
v5 . Hence Span{v1 , v2 , v3 , v4 , v5 } = Span{v1 , v2 , v4 }, and the latter is a minimal
spanning set.
There are various other ways to obtain a minimal spanning set. Here is one such al-
ternative. Note that v1 = v3 −v2 . Hence Span{v1 , v2 , v3 , v4 , v5 } = Span{v2 , v3 , v4 , v5 }.
Next we use that v2 = ½(v3 + v4 − v5). Hence Span{v1, v2, v3, v4, v5} = Span{v3, v4, v5},
and the latter is a minimal spanning set.
Exercise 19. Which of the following systems of vectors of R³ are linearly indepen-
dent, which form a generating system, and which are a basis of R³?
(b) Extend the set {(8, 2, 5), (−3, −5, 9)} to a basis of R3 .
Exercise 22. Let V, W be vector spaces over R. Consider the cartesian product
V × W with componentwise addition and scalar multiplication:
(v1 , w1 ) + (v2 , w2 ) = (v1 + v2 , w1 + w2 ), λ · (v1 , w1 ) = (λv1 , λw1 )
for all v1 , v2 ∈ V, w1 , w2 ∈ W and λ ∈ R (which defines on V × W a vector space
structure). Let S be a basis of V and T be a basis of W . Give a basis for the
vector space V × W and justify your answer.
Exercise 23. (i) Let S and T be subsets of a vector space V . Which of the
following statements are true? Give reasons.
(a) X1 ∪ X2 generates U1 + U2.
(b) X1 ∪ X2 is linearly independent in U1 + U2.
(c) X1 ∩ X2 generates U1 ∩ U2.
(d) X1 ∩ X2 is linearly independent in U1 ∩ U2.
(e) X1 ⊆ X2 if and only if U1 ⊆ U2.
(f) If U1 ∩ U2 = {0} then X1 and X2 are disjoint.
8. Steinitz Exchange Procedure

Let V be a vector space over R. We would like to define the dimension of a vector
space V to be the cardinality of any basis of V. To do this, we need to know that
two bases of the same vector space have the same cardinality. To prove such a
result, we use the exchange procedure going back to E. Steinitz. In this course
we only deal with vector spaces that have a finite basis.
Lemma 8.1 (Steinitz Exchange Lemma). Let {v1, . . . , vn} be a basis of a vector
space V. Let
\[
w = \lambda_1 v_1 + \cdots + \lambda_n v_n \tag{2}
\]
with λi ∈ R. If there exists an index k with 1 ≤ k ≤ n and λk ≠ 0, then the set
{v1, . . . , vk−1, w, vk+1, . . . , vn} is a basis of V.
and hence
\[
\begin{aligned}
v &= \mu_1\Bigl(\tfrac{1}{\lambda_1} w - \tfrac{\lambda_2}{\lambda_1} v_2 - \cdots - \tfrac{\lambda_n}{\lambda_1} v_n\Bigr) + \mu_2 v_2 + \cdots + \mu_n v_n\\
&= \tfrac{\mu_1}{\lambda_1} w + \Bigl(\mu_2 - \tfrac{\mu_1 \lambda_2}{\lambda_1}\Bigr) v_2 + \cdots + \Bigl(\mu_n - \tfrac{\mu_1 \lambda_n}{\lambda_1}\Bigr) v_n.
\end{aligned}
\]
Remark. It should be noted that this last proof also indicates that it is not so
difficult to prove that if a vector space V has a finite basis, then any basis of V is
finite. For this purpose we would need to generalise some of the earlier definitions
(see the Remark at the end of Section 6) and statements to infinite sets. We call
a vector space V with a finite basis finitely generated. In this course we only deal
with finitely generated vector spaces.
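As a small illustration of the Exchange Lemma (the particular vectors are our own choice, not taken from the notes): let {e1, e2, e3} be the standard basis of R³ and let
\[
w = 2e_1 + e_2 = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix},
\]
so λ1 = 2 ≠ 0. By Lemma 8.1 we may exchange w for e1, and {w, e2, e3} is again a basis of R³: indeed e1 = ½(w − e2), so every vector of R³ is a linear combination of w, e2, e3, and these three vectors are linearly independent.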
Theorem 8.4. Let V be a finitely generated vector space and let S = {w1 , . . . , wr } ⊆
V be linearly independent. Then V has a basis B with S ⊆ B.
Proof. By assumption, the vector space V has a finite basis, say {v1 , . . . , vn }.
Apply Proposition 8.2. Then after possibly a suitable rearrangement of the vec-
tors, the set B = {w1 , . . . , wr , vr+1 , . . . , vn } forms a basis of V , and S ⊆ B.
Remark. To prove Theorem 8.4 for infinite families of vectors (for a vector space
which is not finitely generated) is more complicated. The result of Theorem 8.4
would still be true in the infinite setting; however, it would need a tool from set
theory that is not completely unproblematic, called Zorn's Lemma. Examples of vector
spaces with no finite basis are:
9. Dimension of Vector Spaces and an Application to Sums

Dimension of vector spaces. Given any vector space V, we have not yet
shown that there always exists a basis for V. Indeed, the existence of a
basis is another consequence of the more general version of Theorem 8.4 (see the
remark following it): any family S of linearly independent vectors of a vector
space V (not necessarily finitely generated) can be extended to a basis B of V . In
particular, if we take S to be the empty set, then S is linearly independent and
by the generalised version of Theorem 8.4, S can be extended to a basis B of V .
Theorem 9.1. Every vector space has a basis.
In the case of a finitely generated vector space V , we have seen in Theorem 8.3 and
the remark following it, that any basis of V is finite and of the same cardinality.
Definition 9.2. If a vector space V has a finite basis, then we define the dimen-
sion of V as the number of elements in a basis of V. We denote the dimension of
V shortly by dim(V ) or dim V and say that V is finite dimensional. If V has no
finite basis, we call V infinite dimensional and write dim V = ∞.
Example 9.3. Compare the following with the Examples in 7.2.
(1) dim(Rn ) = n.
(2) dim(Mm×n (R)) = m · n.
(3) Let V = C be a vector space over C. Then dim(V ) = 1.
(4) Let V = C be a vector space over R. Then dim(V ) = 2.
(5) dim(Rn [X]) = n + 1.
(6) dim(R[X]) = ∞.
(7) Let V = R be a vector space over Q. Then dim(V ) = ∞.
(8) Define R∞ = {(ai ) | (ai ) = (a1 , a2 , . . . )}, the vector space of sequences of
real numbers. This is a vector space over R, and dim(R∞ ) = ∞.
Remark 9.4. Let V be a finite dimensional vector space and let W be a subspace
of V. As a consequence of Theorem 8.4 we have:
Remark 9.4(2) is not true for infinite dimensional vector spaces: the vector space
W of polynomials is a subspace of the vector space V of continuous functions and
dim(W ) = dim(V ) = ∞.
Sums and intersections of subspaces. The second part of this section deals
with an important dimension formula. Recall from Section 5:
Proposition 9.5. Let V be a vector space and let U, W be subspaces of V . Then
U ∩ W is a subspace of V .
Proof. We show this by using Lemma 5.3, the second subspace test. Since U, W
are subspaces, then 0 ∈ W and 0 ∈ U, so 0 ∈ U ∩ W, hence U ∩ W ≠ ∅. If
v1 , v2 ∈ U ∩ W , then v1 , v2 ∈ U and v1 , v2 ∈ W . Let a1 , a2 ∈ R. Since U is a
Proof. We will use Lemma 5.3, the second subspace test. Since U, W are both
subspaces, then 0 ∈ U and 0 ∈ W, so 0 + 0 = 0 ∈ U + W. Hence U + W ≠ ∅.
Let v1 , v2 ∈ U + W , then v1 = u1 + w1 for some u1 ∈ U and w1 ∈ W . Similarly,
v2 = u2 + w2 for some u2 ∈ U and w2 ∈ W . So
a1 v1 + a2 v2 = a1 (u1 + w1 ) + a2 (u2 + w2 )
= (a1 u1 + a2 u2 ) + (a1 w1 + a2 w2 ) =: v say,
Since u1 , u2 ∈ U and U is a subspace, we have u := a1 u1 + a2 u2 ∈ U . Similarly
w := a1 w1 + a2 w2 ∈ W . Hence v = u + w ∈ U + W , and so by Lemma 5.3 we see
that U + W is a subspace of V .
Example 9.8.
Let V = M3 (R), and define
U = {A | A = (aij ) with aij = 0 for i > j} ⊆ V,
W = {B | B = (bij ) with bij = 0 for i < j} ⊆ V.
So
\[
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix} \in U, \qquad
B = \begin{pmatrix} b_{11} & 0 & 0 \\ b_{21} & b_{22} & 0 \\ b_{31} & b_{32} & b_{33} \end{pmatrix} \in W.
\]
Then
(3) U + W = {A + B | A ∈ U, B ∈ W } = V = M3 (R).
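The statement of Theorem 9.7 is not reproduced in this copy of the notes; for reference, the dimension formula it provides (and which is used repeatedly below) reads, for subspaces U, W of a finite dimensional vector space V:
\[
\dim(U + W) = \dim U + \dim W - \dim(U \cap W).
\]
For instance, in Example 5.8 one has dim U = dim W = 1, dim(U ∩ W) = 0 and dim(U + W) = dim R² = 2.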
Direct sums. Let V be a finite dimensional vector space with subspaces U and
W. Sometimes the formula in Theorem 9.7 holds without the correction term
dim(U ∩ W), see Example 5.8. This situation gets its own name.
Definition 9.9. A vector space V is called a direct sum of the subspaces U and
W, written as V = U ⊕ W, if the following holds:
(D1) V = U + W,
(D2) U ∩ W = {0}.
(1) V = U ⊕ W,
(2) For every v ∈ V there is a unique u ∈ U and w ∈ W with v = u + w.
Proof. (1) ⇒ (2): We only need to show uniqueness. Let v = u + w and also
v = u′ +w′ for u, u′ ∈ U , w, w′ ∈ W. Then u+w = u′ +w′ and so w−w′ = u′ −u. But
w −w′ ∈ W and u′ −u ∈ U as U, W are subspaces. Hence w −w′ = u′ −u ∈ U ∩W.
As U ∩ W = {0} this implies w − w′ = 0 = u′ − u. So w = w′ and u = u′ .
(2) ⇒ (1): By assumption (D1) holds. We show (D2). Assume v ∈ U ∩ W. Then
v ∈ U and v ∈ W . Since U, W are subspaces, 0 ∈ U and 0 ∈ W . Note that
v = 0 + v with 0 ∈ U, v ∈ W ; and v = v + 0 with v ∈ U, 0 ∈ W . Both these
expressions for v ∈ U + W need to be the same by assumption (2). Hence v = 0
and so U ∩ W = {0}.
(1) V = U ⊕ W ;
(2) V = U + W and dim V = dim U + dim W ;
(3) U ∩ W = {0} and dim V = dim U + dim W.
Proof. (1) ⇒ (2): Follows from Theorem 9.7 and Definition 9.9.
(2) ⇒ (3): By Theorem 9.7 it follows that dim(U ∩ W ) = 0, and so U ∩ W = {0}.
(3) ⇒ (1): Use Theorem 9.7 and the assumption in (3) to get:
dim(U + W ) = dim U + dim W − dim(U ∩ W )
= dim U + dim(W ) = dim(V ).
Note U + W ≤ V. By Remark 9.4(2) it follows that V = U + W.
Exercise 26. A magical square is a 3 × 3 table of nine numbers with the following
properties: the sum of all numbers in each row, in each column, and in each
diagonal is equal. This number is called the magical number. For example,
\[
\begin{pmatrix} 4 & 3 & 8 \\ 9 & 5 & 1 \\ 2 & 7 & 6 \end{pmatrix}
\]
is a magical square; the magical number is 15, and the number in the centre of the square is 5. Consider
the set of all magical squares with entries from the set of real numbers R.
(a) Compute the dimension of Mn (R). Show that Mn (R) has a basis with
the property that each matrix in the basis is either symmetric or skew-
symmetric.
(b) Compute the dimension of the subspace of Mn (R) consisting of all diagonal
matrices.
(c) Compute the dimension of the subspace of Mn (R) consisting of all matrices
of zero trace (that is, where the sum of the diagonal entries is zero).
Exercise 28. (a) Let U and V be two subspaces of R2n−1 and let dim(U ) =
dim(V ) = n. Prove that U ∩ V ≠ {0}.
(b) Let X, Y, Z be subspaces of a vector space V . Is the following formula
correct:
\[
\dim(X + Y + Z) = \dim X + \dim Y + \dim Z - \bigl(\dim(X \cap Y) + \dim(Y \cap Z) + \dim(Z \cap X)\bigr) + \dim(X \cap Y \cap Z)?
\]
(c) Given are three two-dimensional subspaces U1, U2, U3 of a vector space V
such that the intersection of any two of the subspaces is one dimensional. Which
dimensions can occur as dim(U1 + U2 + U3)?
Exercise 29. Let V be a vector space of dimension n over R.
10. Linear Transformations

This course deals with finite dimensional vector spaces only. From now on we
always assume the vector spaces under consideration to be finite dimensional,
without explicitly saying so. Throughout this section, let V, W be (finite dimen-
sional) vector spaces over R. Assume that T is a map from V to W, where we
consider V and W as sets. If T respects the structure of the underlying vector
spaces, then T is called a linear transformation. More precisely:
Definition 10.1. Let V, W be vector spaces over R. Then a map T : V → W is
said to be a linear transformation (or a linear map) if and only if:
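The two defining conditions are missing from this copy of the notes; they are stated here in the standard form, with the labels (L1) and (L2) inferred from how they are used later (for instance in the proof that T(0V) = 0W):
\[
\begin{aligned}
&\text{(L1)} \quad T(u + v) = T(u) + T(v) \quad \text{for all } u, v \in V,\\
&\text{(L2)} \quad T(\lambda v) = \lambda\, T(v) \quad \text{for all } v \in V \text{ and } \lambda \in \mathbb{R}.
\end{aligned}
\]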
Proof. We have
\[
\begin{aligned}
T(0_V) &= T(0_{\mathbb{R}} \cdot 0_V) && \text{by Lemma 4.2},\\
&= 0_{\mathbb{R}} \cdot T(0_V) && \text{by (L2)},\\
&= 0_W && \text{by Lemma 4.2},
\end{aligned}
\]
where we use that T(0V) is an element of W.
Remark. The set of all R-linear maps from a vector space U to a vector space
V is denoted by HomR (U, V ). The set HomR (U, V ) is in fact an R-vector space
where the first property of the last proposition gives the scalar multiplication
and addition of this vector space. If one considers the set HomR (U, U ) – also
denoted by EndR (U ) – then this is a so-called ring (a slightly more general object
than what a field is). The second property of the last proposition defines the
multiplication of this ring. Rings are studied in the second year algebra course.
Theorem 10.8. Let V, W be vector spaces over R. Let {v1 , . . . , vn } be a basis
of V and {w1 , . . . , wn } a set of vectors in W. Then there is precisely one linear
transformation T : V → W with T (vi ) = wi for 1 ≤ i ≤ n. Moreover
It should be noted that in Theorem 10.8 the assumption that {v1 , . . . , vn } is a basis
is very important. To see this the reader is advised to try out examples where
{v1 , . . . , vn } is either a linearly dependent set of vectors or it is not spanning V .
Example 10.9. (a) Choose vectors
\[
v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
v_2 = \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \quad
w_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
w_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]
Note that {v1, v2} is linearly dependent and does not span R². It is easily seen
that for the vectors chosen above, there is no linear map T : R² → R² with
T(vi) = wi for i = 1, 2.
(b) Choose vectors
\[
v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
v_2 = \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \quad
w_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
w_2 = \begin{pmatrix} 2 \\ 0 \end{pmatrix}.
\]
Define
\[
T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ 0 \end{pmatrix}, \qquad
S\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ x - y \end{pmatrix}.
\]
Then both S and T are linear with T (vi ) = wi and S(vi ) = wi .
Exercise 31. Which of the following mappings T : R3 → R3 are linear transfor-
mations:
Proof.
Proof. “⇒”: Assume T is injective. Let v ∈ ker(T ). Then T (v) = 0, and hence
by Lemma 10.5: T (v) = T (0). Since T is injective, this implies v = 0. Hence
ker(T ) = {0}.
“⇐”: Let v1 , v2 ∈ V with T (v1 ) = T (v2 ). By Lemma 10.5:
T (v1 − v2 ) = T (v1 ) − T (v2 ) = 0.
So v1 − v2 ∈ ker(T ) = {0}. Hence v1 = v2 , which proves that T is injective.
Proof.
Corollary 11.7. Between finite dimensional vector spaces V and W , there exists
a bijective linear map (called isomorphism) T : V → W if and only if dim V =
dim W.
Exercise 34. Describe the kernel and image of each of the following linear trans-
formations, and in each case give the rank and nullity of the transformation:
(i) Show that there is precisely one linear map f : R3 → R4 with f (ai ) = bi
for i = 1, 2, 3, 4.
(ii) Describe the kernel and the image of f and give the rank and the nullity
of f .
Exercise 38. Let V be an n-dimensional vector space and let S and T be linear
transformations on V .
Note that by Proposition 7.3, the coefficients aij are uniquely determined. Hence
the matrix A is well-defined.
Definition 12.1. Matrix A is called the matrix of T (or the matrix representing
T, or corresponding to T) with respect to the bases B1 and B2. Write $A = M^{B_1}_{B_2}(T)$.
Example 12.2. Let T : Rn [x] → Rn [x] be differentiation. Let B1 = B2 =
{1, x, . . . , xn }. Then
\[
\begin{aligned}
T(1) &= 0 = 0 \cdot 1 + 0 \cdot x + \cdots + 0 \cdot x^n,\\
T(x) &= 1 = 1 \cdot 1 + 0 \cdot x + \cdots + 0 \cdot x^n,\\
T(x^2) &= 2x = 0 \cdot 1 + 2 \cdot x + \cdots + 0 \cdot x^n,\\
&\;\;\vdots\\
T(x^n) &= n x^{n-1} = 0 \cdot 1 + 0 \cdot x + \cdots + n \cdot x^{n-1} + 0 \cdot x^n.
\end{aligned}
\]
Recall the definition of a coordinate vector, given in the remark after Proposi-
tion 7.3.
Proof. Follows from Proposition 10.7 and Definition 12.1. The details are left as
an exercise to the reader.
B1 = {u1 , . . . , un } be a basis of U,
B2 = {v1 , . . . , vm } be a basis of V,
B3 = {w1 , . . . , wk } be a basis of W.
Then
\[
\begin{aligned}
(ST)(u_i) &= S(T u_i)\\
&= S(a_{1i} v_1 + \ldots + a_{mi} v_m) && \text{by Equation (8)},\\
&= a_{1i} S(v_1) + \ldots + a_{mi} S(v_m) && \text{since } S \text{ is linear},\\
&= a_{1i}(b_{11} w_1 + \ldots + b_{k1} w_k)\\
&\quad + a_{2i}(b_{12} w_1 + \ldots + b_{k2} w_k)\\
&\quad\;\;\vdots\\
&\quad + a_{mi}(b_{1m} w_1 + \ldots + b_{km} w_k) && \text{by Equation (9)},\\
&= (a_{1i} b_{11} + a_{2i} b_{12} + \ldots + a_{mi} b_{1m}) w_1 && \text{(by reordering)}\\
&\quad + (a_{1i} b_{21} + a_{2i} b_{22} + \ldots + a_{mi} b_{2m}) w_2\\
&\quad\;\;\vdots\\
&\quad + (a_{1i} b_{j1} + a_{2i} b_{j2} + \ldots + a_{mi} b_{jm}) w_j\\
&\quad\;\;\vdots\\
&\quad + (a_{1i} b_{k1} + a_{2i} b_{k2} + \ldots + a_{mi} b_{km}) w_k\\
&= \sum_{j=1}^{k} \Bigl( \sum_{l=1}^{m} b_{jl} a_{li} \Bigr) w_j.
\end{aligned}
\]
Hence $c_{ji} = \sum_{l=1}^{m} b_{jl} a_{li}$. Hence C = B · A.
In the rest of this section, we consider some special cases and applications of the
results obtained so far in this section.
Definition 12.7. Let id : V → V be the identity map, and let B1 and B2 be
bases of V. We call $M^{B_1}_{B_2}(\mathrm{id})$ the base change matrix associated with the change
of basis from basis B1 to basis B2.
Proposition 12.8. Let V be a vector space. Let x be the coordinate vector of v
with respect to basis B1. Let y be the coordinate vector of v with respect to basis
B2. Then $y = M^{B_1}_{B_2}(\mathrm{id})\, x$.
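A small numerical illustration of Proposition 12.8 (this example is ours, not from the notes). Take V = R², let B1 = {e1, e2} be the standard basis and B2 = {b1, b2} with b1 = (1, 1)ᵀ and b2 = (1, −1)ᵀ. Since e1 = ½(b1 + b2) and e2 = ½(b1 − b2), the base change matrix is
\[
M^{B_1}_{B_2}(\mathrm{id}) = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.
\]
For v = (3, 1)ᵀ the coordinate vector with respect to B1 is x = (3, 1)ᵀ, and
\[
y = M^{B_1}_{B_2}(\mathrm{id})\, x = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix},
\]
and indeed 2b1 + 1b2 = (3, 1)ᵀ = v.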
Theorem 12.9. Let T : V → W be linear. Let $B_{V_1}$ and $B_{V_2}$ be bases of V, and let
$B_{W_1}$ and $B_{W_2}$ be bases of W. Then
\[
M^{B_{V_2}}_{B_{W_2}}(T) = M^{B_{W_1}}_{B_{W_2}}(\mathrm{id}) \circ M^{B_{V_1}}_{B_{W_1}}(T) \circ M^{B_{V_1}}_{B_{V_2}}(\mathrm{id})^{-1}.
\]
Proof. Consider the composition of maps T = id_W ◦ T ◦ id_V with respect to the
following bases:
\[
V \;(\text{basis } B_{V_2}) \xrightarrow{\;\mathrm{id}\;} V \;(\text{basis } B_{V_1}) \xrightarrow{\;T\;} W \;(\text{basis } B_{W_1}) \xrightarrow{\;\mathrm{id}\;} W \;(\text{basis } B_{W_2}).
\]
Then by Theorem 12.5 and Corollary 12.6 the claim follows:
\[
\begin{aligned}
M^{B_{V_2}}_{B_{W_2}}(T) &= M^{B_{V_2}}_{B_{W_2}}(\mathrm{id}_W \circ T \circ \mathrm{id}_V)\\
&= M^{B_{W_1}}_{B_{W_2}}(\mathrm{id}_W) \circ M^{B_{V_1}}_{B_{W_1}}(T) \circ M^{B_{V_2}}_{B_{V_1}}(\mathrm{id}_V) && \text{by 12.5},\\
&= M^{B_{W_1}}_{B_{W_2}}(\mathrm{id}_W) \circ M^{B_{V_1}}_{B_{W_1}}(T) \circ M^{B_{V_1}}_{B_{V_2}}(\mathrm{id}_V)^{-1} && \text{by 12.6}.
\end{aligned}
\]
(ii) We take new bases for R² and R³, say B3 = {w1, w2} and B4 = {z1, z2, z3}
with
\[
w_1 = u_1 - 2u_2, \qquad w_2 = u_1 + u_2,
\]
and
\[
z_1 = v_1 + v_2, \qquad z_2 = v_2 + v_3, \qquad z_3 = v_1 + v_3.
\]
What is $M^{B_3}_{B_4}(T)$?
(iii) Note
\[
v_1 = \tfrac{1}{2}(z_1 - z_2 + z_3), \qquad
v_2 = \tfrac{1}{2}(z_2 - z_3 + z_1), \qquad
v_3 = \tfrac{1}{2}(z_3 + z_2 - z_1). \tag{12}
\]
Then
\[
\begin{aligned}
T w_1 &= T u_1 - 2 T u_2\\
&= (v_1 - 3v_3) - 2(2v_1 + v_2 - v_3) && \text{by Eq. (11)},\\
&= -3v_1 - 2v_2 - v_3\\
&= -\tfrac{3}{2}(z_1 - z_2 + z_3) - (z_2 - z_3 + z_1) - \tfrac{1}{2}(z_3 + z_2 - z_1) && \text{by Eq. (12)},\\
&= -2z_1 + 0z_2 - 1z_3.
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
T w_2 &= T u_1 + T u_2\\
&= (v_1 - 3v_3) + (2v_1 + v_2 - v_3)\\
&= 3v_1 + v_2 - 4v_3\\
&= \tfrac{3}{2}(z_1 - z_2 + z_3) + \tfrac{1}{2}(z_2 - z_3 + z_1) - 2(z_3 + z_2 - z_1)\\
&= 4z_1 - 3z_2 - z_3.
\end{aligned}
\]
Hence
\[
B = M^{B_3}_{B_4}(T) = \begin{pmatrix} -2 & 4 \\ 0 & -3 \\ -1 & -1 \end{pmatrix}.
\]
(iv) What is the relationship between A and B? Let $Q = M^{B_2}_{B_4}(\mathrm{id})$ and let
$P = M^{B_1}_{B_3}(\mathrm{id})$. Then by Equation (12) we have:
\[
Q = \frac{1}{2}\begin{pmatrix} 1 & 1 & -1 \\ -1 & 1 & 1 \\ 1 & -1 & 1 \end{pmatrix}.
\]
Moreover, from (ii) we see that
\[
P^{-1} = M^{B_3}_{B_1}(\mathrm{id}) = \begin{pmatrix} 1 & 1 \\ -2 & 1 \end{pmatrix}.
\]
It is now easily seen that indeed $B = QAP^{-1}$.
Exercise 39. Let E2 and E3 denote the canonical bases for R2 and R3 respec-
tively, that is E2 = {(1, 0), (0, 1)} and E3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. Let
f : R2 → R3 and g : R3 → R2 be given by
f (x, y) = (x + 2y, x − y, 2x + y),
g(x, y, z) = (x − 2y + 3z, 2y − 3z).
(a) Determine the matrices $M^{E_2}_{E_3}(f)$, $M^{E_3}_{E_2}(g)$, $M^{E_2}_{E_2}(g \circ f)$ and $M^{E_3}_{E_3}(f \circ g)$
representing the linear maps f, g, g ◦ f and f ◦ g with respect to the bases E2
and E3.
(b) Show that g ◦ f is bijective and determine $M^{E_2}_{E_2}((g \circ f)^{-1})$.
Exercise 40. Consider the vector spaces R3 and R2 with the bases B and C
respectively, where
B = {(1, 1, 0), (1, −1, 1), (1, 1, 1)} and C = {(1, 1), (1, −1)}.
(a) Determine the matrices $M^{E_3}_{B}(\mathrm{id})$, $M^{B}_{E_3}(\mathrm{id})$ representing the identity map
on R³, and determine the matrices $M^{E_2}_{C}(\mathrm{id})$, $M^{C}_{E_2}(\mathrm{id})$ representing the
identity map on R².
(b) For the linear maps f and g in the previous question, determine the ma-
trices $M^{C}_{B}(f)$, $M^{B}_{C}(g)$, $M^{C}_{C}(g \circ f)$ and $M^{B}_{B}(f \circ g)$ representing the linear
maps f, g, g ◦ f and f ◦ g with respect to the bases B and C.
maps f, g, g ◦ f and f ◦ g with respect to bases B and C.
Exercise 41. Let n ∈ N. Consider the vector space Rn [x] of polynomials of
degree at most n. Let Bn = {1, x, . . . , xn }. Define Dn : Rn [x] → Rn−1 [x] by
f 7→ f ′ , where f ′ denotes the first derivative of f .
13. Row Reduced Echelon Matrices

The last part of this lecture course deals with how to solve systems of linear
equations. We will study in particular when a system of linear equations has
precisely one solution. In this section we introduce the row reduced echelon form
of a matrix.
Definition 13.1. An m × n matrix M is in row reduced echelon form if
(1) The zero rows of M (if any) all come below the non-zero rows.
(2) In each non-zero row the leading entry (that is, the leftmost non-zero entry) is
1.
(3) If row i and row i + 1 are non-zero, then the leading entry of row i + 1 is
strictly to the right of the leading entry of row i.
(4) If a column contains a leading entry of a non-zero row, then all its other
entries are zero.
Example 13.2. For instance, every identity matrix and every zero matrix is in
row reduced echelon form. More generally, any matrix of the following shape,
where ∗ denotes an arbitrary entry, is in row reduced echelon form:
\[
\begin{pmatrix}
0 & \cdots & 0 & 1 & * & \cdots & * & 0 & * & \cdots & * & 0 & * & \cdots & * \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 1 & * & \cdots & * & 0 & * & \cdots & * \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 1 & * & \cdots & * \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0
\end{pmatrix}.
\]
Remark 13.3. If A ∈ Mm×n (R) is in row reduced echelon form then for each k
with 1 ≤ k ≤ n, the matrix made from the first k columns is also in row reduced
echelon form.
Theorem 13.5. Every m×n-matrix may be brought to a row reduced echelon form
by applying elementary row operations. The row reduced echelon form obtained is
unique.
Proof. For existence, see Algorithm 13.9 below. We omit the proof for the
uniqueness of the row reduced echelon form obtained.
Example 13.7.
(1)
\[
\begin{aligned}
M_1 &= \begin{pmatrix} 0 & 2 & -1 \\ 2 & 4 & 8 \end{pmatrix}
& e_1 &= \bigl(R_2 \longrightarrow \tfrac{1}{2} R_2\bigr)\\
M_2 &= \begin{pmatrix} 0 & 2 & -1 \\ 1 & 2 & 4 \end{pmatrix}
& e_2 &= \bigl(R_1 \longleftrightarrow R_2\bigr)\\
M_3 &= \begin{pmatrix} 1 & 2 & 4 \\ 0 & 2 & -1 \end{pmatrix}
& e_3 &= \bigl(R_2 \longrightarrow \tfrac{1}{2} R_2\bigr)\\
M_4 &= \begin{pmatrix} 1 & 2 & 4 \\ 0 & 1 & -\tfrac{1}{2} \end{pmatrix}
& e_4 &= \bigl(R_1 \longrightarrow R_1 - 2R_2\bigr)\\
M_5 &= \begin{pmatrix} 1 & 0 & 5 \\ 0 & 1 & -\tfrac{1}{2} \end{pmatrix}
\end{aligned}
\]
(2)
\[
\begin{aligned}
M_1 &= \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 2 & 2 & 0 \end{pmatrix}
& e_1 &= \bigl(R_3 \longrightarrow \tfrac{1}{2} R_3\bigr)\\
M_2 &= \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}
& e_2 &= \bigl(R_1 \longleftrightarrow R_3\bigr)\\
M_3 &= \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
& e_3 &= \bigl(R_2 \longleftrightarrow R_3\bigr)\\
M_4 &= \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
& e_4 &= \bigl(R_1 \longrightarrow R_1 - R_2\bigr)\\
M_5 &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\end{aligned}
\]
Remark 13.8. Given matrix M = (mij ). We will apply eros to M using the
following language:
(1) If mij 6= 0 then we can normalise by Ri −→ (mij )−1 Ri , so that the (i, j)th
entry becomes 1.
(2) We can move an entry mij up and down in its column by Ri ←→ Rv .
(3) If mij = 1 then we can purge all other entries in column j (i.e. make them
zero) by applying Type III operations: Rs −→ Rs − msj Ri, for s ≠ i. The
element mij = 1 used to “clean out” the rest of the column is called the
pivot of the purging operation.
Algorithm 13.9 (for reducing a matrix to row reduced echelon form by eros).
or
\[
M_2 = \begin{pmatrix} 1 & * & \cdots & * \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix}.
\]
Stage k: We start with matrix Mk of the form
Mk = (Ak | Bk )
where Ak is an m × (k − 1)-matrix in row reduced echelon form and Bk is
an m × (n − k + 1)-matrix.
Case 1: If Ak = 0 (has no non-zero rows), then apply the Stage 1 process to
the first column of Bk . We obtain Matrix Mk+1 .
Case 2: If Ak has m non-zero rows then stop altogether. In this case Mk is
already in row reduced echelon form.
Case 3: Write
\[
M_k = \begin{pmatrix} E_k & F_k \\ 0 & G_k \end{pmatrix},
\]
where Ek consists of the non-zero rows of Ak , Fk is the continuation
of these non-zero rows of Ek . As Case 1 did not occur, this implies
Ek has at least one row. As Case 2 did not occur, this implies Gk has
at least one row. Note Ek is in row reduced echelon form. Inspect
the first column of Gk in Mk . If it is zero, then stop and Stage k is
complete. Otherwise select the first non-zero element, normalise and
move it to the top left hand corner of Gk . Use it as a pivot to purge
the kth column of Mk . Stop. Stage k is complete. We obtain matrix
Mk+1 .
Remarks. (a) This algorithm proves the first part of Theorem 13.5. We have not
shown uniqueness of the row reduced echelon form in Theorem 13.5.
(b) Note that the algorithm has been applied in Example 13.7. However, many
more intermediate steps have been given in those examples, and hence the labelling
of the matrices Mi does not agree with the labelling of the matrices used in Algo-
rithm 13.9.
Example 13.10. We perform Algorithm 13.9 on matrix M1 given below. There
are three steps in the algorithm and the matrices Mi from the algorithm are given
by:
\[
M_1 = \begin{pmatrix} 1 & 0 & 1 & -2 & 2 \\ 2 & 1 & 1 & -2 & 3 \\ 3 & 1 & 3 & -6 & 4 \\ 1 & 0 & 2 & -4 & 1 \end{pmatrix}, \quad
M_2 = \begin{pmatrix} 1 & 0 & 1 & -2 & 2 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & -2 & -1 \end{pmatrix},
\]
\[
M_3 = \begin{pmatrix} 1 & 0 & 1 & -2 & 2 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & 0 & 1 & -2 & -1 \\ 0 & 0 & 1 & -2 & -1 \end{pmatrix}, \quad
M_4 = \begin{pmatrix} 1 & 0 & 0 & 0 & 3 \\ 0 & 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & -2 & -1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.
\]
It follows that the row rank of the matrices Mi for 1 ≤ i ≤ 4 is three.
Exercise 45. Find the row reduced echelon form of the matrices A and B where
\[
A = \begin{pmatrix} 2 & -2 & 2 & 1 \\ -3 & 6 & 0 & -1 \\ 1 & -7 & 10 & 2 \end{pmatrix}, \qquad
B = \begin{pmatrix} 2 & 2 & -1 & 6 & 4 \\ 4 & 4 & 1 & 10 & 13 \\ 8 & 8 & -1 & 26 & 23 \end{pmatrix}.
\]
(b)
x + 2y − 3z = −1
−3x + y − 2z = −7 ,
5x + 3y − 4z = 2
(c)
x + 2y − 3z = 1
2x + 5y − 8z = 4 .
3x + 8y − 13z = 7
Exercise 47. Use the Gaussian elimination algorithm to do the following:
(i) Span{(3, −2, −5, 4), (−5, 2, 8, −5)} + Span{(−2, 4, 7, −3), (2, −3, −5, 8)},
(ii) Span{(1, −2, 5, −3), (4, −4, 6, −3)} + Span{(3, 4, 0, 1), (−3, 8, −2, 1)}.
15. Invertible Matrices and Systems of Linear Equations

Notation 15.1. Recall Definitions 11.1 and 11.4, where we defined the image,
kernel, rank and nullity of a linear transformation. Let A be an m × n matrix.
Consider the map fA : Rn → Rm given by fA (x) = Ax. Note that fA is a linear
map. We define:
\[
\begin{aligned}
\mathrm{im}\,A &:= \mathrm{im}\, f_A && \text{(image of } A\text{)}, & \ker A &:= \ker f_A && \text{(null space of } A\text{)},\\
\mathrm{rk}\,A &:= \dim(\mathrm{im}\,A) && \text{(rank of } A\text{)}, & n(A) &:= \dim(\ker A) && \text{(nullity of } A\text{)}.
\end{aligned}
\]
Proposition 15.2. Let A ∈ Mn (R), x ∈ Mn×1 (R). The following are equivalent:
Proof. (1) ⇒ (2): Bring the augmented matrix (A | 0) into row reduced echelon
form, say (E | 0). So Ax = 0 if and only if Ex = 0. As Ax = 0 has a unique
solution, this implies that Ex = 0 has a unique solution. Hence there are no
parameters in the general solution of Ex = 0. Hence each of the columns of E
contains a leading entry. This implies E = In .
(2) ⇒ (3): Bring (A | b) into row reduced echelon form. Since A has row reduced
echelon form In , it follows that (A | b) has row reduced echelon form (In | c). So
Ax = b if and only if In x = c. But In x = c has unique solution x = c. Hence
Ax = b has a unique solution.
(3) ⇒ (1): By assumption, Ax = b has a unique solution for every b ∈ Mn×1 (R).
Take b = 0, then Ax = 0 has a unique solution.
Remark: Note that the conditions in the last proposition are also equivalent to
each of the following: the nullity of A is zero; the row rank of A is n; the rank of
A is n (use the rank-nullity formula).
Next, recall that a square matrix A ∈ Mn (R) is invertible if and only if there
exists B ∈ Mn (R) such that AB = BA = In . Also recall that if A is invertible
then B is uniquely determined. Write B = A−1 .
Proposition 15.3. Suppose A ∈ Mn (R) is satisfying one of the conditions in
Proposition 15.2. Then there exists B ∈ Mn (R) such that AB = In . (We say that
B is a right inverse of A.)
(1) We show that B has a right inverse C: Assume we have a vector x with
Bx = 0. Then: 0 = A · 0 = ABx = In x = x. Hence Bx = 0 has
unique solution x = 0. By Proposition 15.3, there exists C ∈ Mn (R) with
BC = In .
(2) Note that C = In C = (AB)C = A(BC) = AIn = A. Hence AB = In and
BA = In and so, by definition, A is invertible with A−1 = B.
Proof. This follows from Proposition 15.3 and Proposition 15.4 and from show-
ing that (4) implies (1). Let B be the left inverse of A, so BA = In . Assume that
x is a solution of Ax = 0. Then 0 = B · 0 = BAx = In x = x. Hence Ax = 0 has
x = 0 as a unique solution.
Algorithm 15.6. (for calculating the inverse of an n×n matrix A, or for declaring
A to have no inverse)
Input: Given matrix A ∈ Mn (R).
Step1: Form the augmented n × 2n matrix M = (A | In ).
Step2: Bring M into row reduced echelon form (E | F ) with E, F ∈ Mn (R).
Output:
[Denote by B the row reduced echelon form of a given matrix A. Then the rank
of A is defined to be the number of non-zero rows of B.]
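As an illustration of Algorithm 15.6 (the matrix below is our own example, not one from the notes), we compute the inverse of a 2 × 2 matrix by row reducing the augmented matrix (A | I2):
\[
\begin{aligned}
\left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 3 & 4 & 0 & 1 \end{array}\right)
&\xrightarrow{R_2 \to R_2 - 3R_1}
\left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & -2 & -3 & 1 \end{array}\right)
\xrightarrow{R_2 \to -\tfrac{1}{2} R_2}
\left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 1 & \tfrac{3}{2} & -\tfrac{1}{2} \end{array}\right)\\
&\xrightarrow{R_1 \to R_1 - 2R_2}
\left(\begin{array}{cc|cc} 1 & 0 & -2 & 1 \\ 0 & 1 & \tfrac{3}{2} & -\tfrac{1}{2} \end{array}\right).
\end{aligned}
\]
The left block is now I2, so the matrix is invertible and
\[
\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}^{-1} = \begin{pmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{pmatrix};
\]
had a zero row appeared in the left block instead, the matrix would not be invertible.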
Exercise 51. The n × n Van der Monde matrix is the matrix A defined by
\[
A = \begin{pmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\
1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^{n-1}
\end{pmatrix}.
\]
[Let x = (a0, a1, . . . , an−1)ᵀ be a solution to the simultaneous equations Ax = 0.
Show that x1, . . . , xn are all roots of the polynomial a0 + a1x + a2x² + . . . + an−1xⁿ⁻¹.]
Proof. We sketch a proof for the type II elementary row operation. We leave
it as an exercise to the reader to formally write down the matrix multiplications.
Similarly, to prove the claim for the other types of elementary row operations is
left as an exercise to the reader. Let A = (aij ) and let e be scalar multiplication
of row i by λ ∈ R (Type II ero). Then e(Im ) = diag(1, . . . , 1, λ, 1, . . . , 1) with λ
in row and column i. Hence
\[
e(I_m) \cdot A =
\begin{pmatrix}
1 & & & & \\
& \ddots & & & \\
& & \lambda & & \\
& & & \ddots & \\
& & & & 1
\end{pmatrix}
\cdot
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
\vdots & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{in} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
=
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
\vdots & \vdots & & \vdots \\
\lambda a_{i1} & \lambda a_{i2} & \cdots & \lambda a_{in} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}.
\]
Proof.
Proposition 16.5. Matrices A and B are row equivalent if and only if there
exists an invertible matrix P with B = P A.
Proof. “⇒”: Since A and B are row equivalent, there are elementary row
operations ei (for 1 ≤ i ≤ t) such that
(13) B = e1 e2 · · · et (A).
Let Ei = ei (I) for 1 ≤ i ≤ t. Let P = E1 E2 · · · Et . By Lemma 16.4 we have P is
invertible, and by Equation (13) and Lemma 16.3 we have B = P A.
“⇐”: Suppose A and B are matrices with B = P A where P is invertible. Since
P is invertible, it follows by Corollary 15.5 and Proposition 15.2, that the row
reduced echelon form of P is the identity matrix I. This means, P can be brought
to row reduced echelon form I by applying elementary row operations. So P =
Es Es−1 · · · E1 for some elementary matrices Ei with 1 ≤ i ≤ s. Then B = P A =
Es Es−1 · · · E1 A, and so B can be obtained from A by applying elementary row
operations corresponding to E1 , E2 , . . . , Es .
Remark 16.6. Why does Algorithm 15.6 (for inverting a matrix or for declaring
it to be non-invertible) work? If A ∈ Mn(R) is invertible, then by Corollary
15.5 and Proposition 15.2 matrix A is row equivalent to In. Hence there exist
elementary matrices Ei for 1 ≤ i ≤ k such that
In = Ek Ek−1 · · · E1 A.
As A is invertible, we can multiply this equation by A−1 from the right to get:
A−1 = Ek Ek−1 · · · E1 In .
The meaning of this last equation is the following: the elementary row operations
used to get the row reduced echelon form of A, if applied to In , give precisely the
inverse of A.
Exercise 52. (a) Write the following matrix C as a product of elementary ma-
trices:
\[
C = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{pmatrix}.
\]
(b) Given are matrices A and B with
\[
A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 2 \\ 2 & 1 & 2 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]
Find matrices P and Q such that P AQ = B.
(2) Two matrices A and B are column equivalent, if we can pass from A to
B by a sequence of elementary column operations.
Definition 17.2. Let A = (aij ) be an m × n matrix with entries in R.
(1) We define the row space of A as the vector space spanned by the rows
ai = (ai1 , ai2 , . . . , ain ) of A, for 1 ≤ i ≤ m. We consider the row space as
a subspace of Rn .
(2) We define the column space of A as the vector space spanned by the
columns ai = (a1i, a2i, . . . , ami) of A, for 1 ≤ i ≤ n. We consider the
column space as a subspace of Rᵐ.
Remark. Note that the column space of a matrix B is in general not equal to the
column space of a matrix A, if matrix B is obtained from matrix A by elementary
row operations. And the row space of a matrix B is in general not equal to the
row space of a matrix A, if matrix B is obtained from matrix A by elementary
column operations.
Recall that by Definition 13.6, the row rank of a matrix A equals the number of
non-zero rows in the row reduced echelon form of A.
Definition 17.7. We define the column rank of a matrix A to be equal to the
number of non-zero columns in the column reduced echelon form of A.
Corollary 17.8. Let A be an m×n matrix over R. Then the row rank of A equals
the dimension of the row space of A. Moreover, the column rank of A equals the
dimension of the column space of A, which in turn equals the rank of A.
Proof. (1) Let B be the corresponding row reduced echelon form of the given
matrix A. Note that the non-zero row vectors in B are linearly independent.
Hence the dimension of the row space of B equals the number of non-zero rows
of B. By Proposition 17.4, it follows that the row rank of A equals the dimension
of the row space of A.
(2) The proof that the column rank is equal to the dimension of the column space
is similar to the argument given in (1).
(3) By definition, the rank of a matrix A equals the dimension of the image of
A, see 15.1. Let {e1 , . . . , en } be the canonical basis of Rn . Then the image of A
is equal to Span{Ae1 , . . . , Aen }. Note that Aei equals precisely the ith column of
A. Hence the image of A is equal to the column space of A. This implies that
the rank of A is equal to the column rank of A.
Row and column rank are equal. We define elementary matrices for elemen-
tary column operations, similar to the elementary matrices defined for elementary
row operations (see Section 16). Note that when applying an elementary column
operation to a matrix A, it means multiplying A with the corresponding ele-
mentary matrix from the right (not from the left.) We hence have similar to
Proposition 16.5:
Proposition 17.9. Matrices A and B are column equivalent if and only if there
exists an invertible matrix Q with B = A · Q.
Theorem 17.10. Let A ∈ Mm×n (R) with row rank r. Then there exists an in-
vertible matrix P ∈ Mm (R) and an invertible matrix Q ∈ Mn (R) such that
\[
PAQ = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.
\]
Step 2: Take the r columns containing the leading entries of the non-zero rows
and use them to make up the first r columns. This can be done by elementary
column operations. We obtain the matrix
\[
\begin{pmatrix} I_r & * \\ 0 & 0 \end{pmatrix}.
\]
Step 3: Use the leading entry of each non-zero row to purge the first r rows. This
means applying elementary column operations. We obtain the matrix
\[
\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.
\]
To calculate P and Q, keep track of the elementary row operations and elementary
column operations applied (see the proof of Proposition 16.5).
Theorem 17.11. Let A ∈ Mm×n (R). Then the row rank of A equals the column
rank of A (equals the rank of A).
Proof. Let r be the row rank of A. Then by Theorem 17.10, there exist invertible
matrices P, Q with
\[
PAQ = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.
\]
Step 1: Bring A to row reduced echelon form E and apply the same elementary
row operations to I2 :
\[
\begin{aligned}
\left(\begin{array}{cc|ccc} 1 & 0 & 1 & 2 & 1 \\ 0 & 1 & -2 & -4 & 1 \end{array}\right)
&\xrightarrow{R_2 \to R_2 + 2R_1}
\left(\begin{array}{cc|ccc} 1 & 0 & 1 & 2 & 1 \\ 2 & 1 & 0 & 0 & 3 \end{array}\right)\\
&\xrightarrow{R_2 \to \tfrac{1}{3} R_2}
\left(\begin{array}{cc|ccc} 1 & 0 & 1 & 2 & 1 \\ \tfrac{2}{3} & \tfrac{1}{3} & 0 & 0 & 1 \end{array}\right)\\
&\xrightarrow{R_1 \to R_1 - R_2}
\left(\begin{array}{cc|ccc} \tfrac{1}{3} & -\tfrac{1}{3} & 1 & 2 & 0 \\ \tfrac{2}{3} & \tfrac{1}{3} & 0 & 0 & 1 \end{array}\right).
\end{aligned}
\]
Hence for
\[
P = \begin{pmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{pmatrix}
\]
we have PA = E.
Step 2/3: Bring E to column reduced echelon form and apply the same elementary
column operations to I3:
\[
\begin{pmatrix} 1 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix},\;
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\xrightarrow{C_2 \leftrightarrow C_3}
\begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 0 \end{pmatrix},\;
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\xrightarrow{C_3 \to C_3 - 2C_1}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},\;
\begin{pmatrix} 1 & 0 & -2 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.
\]
Then taking
\[
Q = \begin{pmatrix} 1 & 0 & -2 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},
\]
we have
\[
PAQ = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.
\]
Exercise 53. Determine the row rank and the column rank of the following
matrix:
\[
X = \begin{pmatrix}
1 & 2 & 3 & \cdots & n \\
2 & 3 & 4 & \cdots & n+1 \\
\vdots & \vdots & \vdots & & \vdots \\
n & n+1 & n+2 & \cdots & 2n-1
\end{pmatrix}.
\]
Exercise 54. Matrix U comes from matrix A by subtracting row one from row
three:
\[
A = \begin{pmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 1 & 3 & 2 \end{pmatrix}, \qquad
U = \begin{pmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}.
\]