Preface
These lecture notes contain more or less verbatim what appears on the board during the lectures.
The colour coding signifies the following: what is written in red is being defined. Named theorems
and statements are denoted by blue (these may be referred to by those names). Finally, green is used
for references to lecture notes of other modules (mostly MATH1048 Linear Algebra I).
Acknowledgement
For the most part, these notes were written by Dr Bernhard Koeck, and typed into LaTeX by the
undergraduate student Thomas Blundell-Hunter in summer 2011.
Contents
1 Groups
2 Fields and Vector Spaces
3 Bases
4 Linear Transformations
5 Determinants
6 Diagonalisability
1 Groups
Recall that the cartesian product G × H of two sets G and H is defined as the set
G × H := {(g, h) : g ∈ G, h ∈ H}
=⇒ a = c.
=⇒ b ∗ a = b ∗ c = e (because c is a right inverse of b);
i.e. b is also a left inverse of a.
Suppose b′ is another right inverse of a.
=⇒ b = b ∗ e = b ∗ (a ∗ b′) = (b ∗ a) ∗ b′ = e ∗ b′ = b′. (These equalities hold by definition of e, by
definition of b′, by associativity, by the fact just proved and by definition of e, respectively.)
(b) For every a ∈ G both the map G → G, x 7→ a ∗ x, (left translation with a) and G → G, y 7→ y ∗ a
(right translation with a) are bijective. In other words: For all a, b ∈ G both the equation a ∗ x = b
and y ∗ a = b have a unique solution in G.
(c) Both (a−1 )−1 and a are solutions of the equation a−1 ∗ x = e (see 1.3(b)). Now apply (b).
(d) We have (a ∗ b) ∗ (b−1 ∗ a−1 ) = a ∗ (b ∗ (b−1 ∗ a−1 )) = a ∗ ((b ∗ b−1 ) ∗ a−1 ) = a ∗ (e ∗ a−1 ) = a ∗ a−1 =
e. =⇒ b−1 ∗ a−1 is a right inverse of a ∗ b.
Example 1.5. (a) The group table of a finite group G = {a1 , a2 , . . . , an } is the table whose entry in the
row of ai and the column of aj is ai ∗ aj :

∗     · · ·   aj        · · ·
 ·               ·
 ·               ·
ai    · · ·   ai ∗ aj   · · ·
 ·               ·
 ·               ·
The trivial group is the group with exactly one element, say e. This must have the group table:
∗ e
e e
Any group with 2 elements, say e and a, is given by the group table:
∗ e a
e e a
a a e
Any group with 3 elements, say e, a and b, must have the group table:
∗ e a b
e e a b
a a b e
b b e a
(Notice: a ∗ b = a would imply b = e by 1.4(a).) There are two “essentially different” groups of
order 4.
Note that Proposition 1.4(b) implies that the group tables must satisfy “sudoku rules”, i.e. every
group element must appear in each row and each column exactly once. However, not every table
obeying this sudoku rule defines a group; for instance the table
∗ e a b c d
e e a b c d
a a e c d b
b b c d e a
c c d a b e
d d b e a c
does not define a group (for example c is a right inverse but not a left inverse of b, which by 1.4 cannot
happen in a group, so associativity must fail).
(c) Let S be a set (such as S = {1, . . . , n} for some n) and let Sym(S) denote the set of all bijective
maps π : S → S, also called permutations of S. For any σ, π ∈ Sym(S) we define σ◦π by (σ◦π)(s) =
σ(π(s)) for s ∈ S (composition of permutations). Then Sym(S) together with composition as the
binary operation is a group, the permutation group of S. The identity element of Sym(S) is the
identity function and the inverse of π ∈ Sym(S) is the inverse function π −1 (seen in Calculus I). If
If S = {1, . . . , n} for some n ∈ N we write Sn for Sym(S) and denote π ∈ Sn in two-line notation by
( 1     2     . . .  n    )
( π(1)  π(2)  . . .  π(n) ).
For example, in S4 the inverse of
( 1 2 3 4 )        ( 1 2 3 4 )
( 3 1 2 4 )   is   ( 2 3 1 4 ),
and in S5
( 1 2 3 4 5 )   ( 1 2 3 4 5 )   ( 1 2 3 4 5 )
( 3 4 1 2 5 ) ◦ ( 2 3 5 1 4 ) = ( 4 1 5 3 2 ).
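Such computations can also be checked mechanically. The following small Python sketch is an added illustration (not part of the printed notes; the function names are ours): it stores a permutation π ∈ Sn as the list [π(1), . . . , π(n)] and composes and inverts such lists.

def compose(sigma, pi):
    # (sigma ∘ pi)(s) = sigma(pi(s)); permutations stored as [π(1), ..., π(n)] with 1-based values
    return [sigma[pi[s] - 1] for s in range(len(pi))]

def inverse(pi):
    inv = [0] * len(pi)
    for s, image in enumerate(pi, start=1):
        inv[image - 1] = s
    return inv

print(inverse([3, 1, 2, 4]))                      # [2, 3, 1, 4], as in the S4 example
print(compose([3, 4, 1, 2, 5], [2, 3, 5, 1, 4]))  # [4, 1, 5, 3, 2], as in the S5 example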
Definition 1.6. Let n ≥ 1 and s ≤ n. Let a1 , . . . , as ∈ {1, . . . , n} be pairwise distinct. The permutation
π ∈ Sn with
π(a1 ) = a2 , π(a2 ) = a3 , . . . , π(as−1 ) = as , π(as ) = a1
and
π(a) = a for a ∈ {1, . . . , n} not belonging to {a1 , . . . , as }
is denoted by ⟨a1 , . . . , as ⟩. Any permutation of this form is called a cycle. If s = 2 it is called a
transposition. The number s is called the length (or order) of the cycle.
E.g.
              ( 1 2 3 4 5 6 )                   ( 1 2 3 4 )
⟨3, 1, 5, 2⟩ = ( 5 3 1 4 2 6 )  in S6;   ⟨3, 2⟩ = ( 1 3 2 4 )  in S4.
Proposition 1.7. Every permutation σ ∈ Sn is a composition of cycles.
Example 1.8. (a) Let
    ( 1 2 3 4 5 6 7  8  9 10 11 )
σ = ( 2 5 6 9 1 4 10 11 3 7  8 )   ∈ S11.
=⇒ σ = ⟨1, 2, 5⟩ ◦ ⟨3, 6, 4, 9⟩ ◦ ⟨7, 10⟩ ◦ ⟨8, 11⟩.
(b) Let
    ( 1 2 3 4 )
σ = ( 4 2 3 1 )   ∈ S4;   =⇒ σ = ⟨1, 4⟩.
General Recipe (which with a bit of effort can be turned into a proof of 1.7).
Let a := 1 and choose m ∈ N minimal such that σ^m(a) ∈ {a, σ(a), σ^2(a), . . . , σ^(m−1)(a)}. (We then in
fact have σ^m(a) = a.)
Let σ1 be the cycle ⟨a, σ(a), . . . , σ^(m−1)(a)⟩.
Now let b be the smallest number in {1, . . . , n}\{a, σ(a), . . . , σ^(m−1)(a)} and, as above, choose l ∈ N
minimal such that σ^l(b) ∈ {b, σ(b), . . . , σ^(l−1)(b)} and put σ2 := ⟨b, σ(b), . . . , σ^(l−1)(b)⟩.
Continuing this way we find that σ = σ1 ◦ σ2 ◦ . . . is a composition of cycles.
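The recipe is easy to phrase algorithmically. The following Python sketch is an added illustration (not from the notes), again storing σ ∈ Sn as the list [σ(1), . . . , σ(n)]:

def cycle_decomposition(sigma):
    # returns the cycles of sigma (as tuples), following the General Recipe:
    # start from the smallest number not yet visited and follow sigma until it returns
    n = len(sigma)
    seen = set()
    cycles = []
    for a in range(1, n + 1):
        if a in seen:
            continue
        cycle = [a]
        seen.add(a)
        b = sigma[a - 1]
        while b != a:
            cycle.append(b)
            seen.add(b)
            b = sigma[b - 1]
        if len(cycle) > 1:          # fixed points give cycles of length 1 and are omitted here
            cycles.append(tuple(cycle))
    return cycles

sigma = [2, 5, 6, 9, 1, 4, 10, 11, 3, 7, 8]   # the permutation from Example 1.8(a)
print(cycle_decomposition(sigma))             # [(1, 2, 5), (3, 6, 4, 9), (7, 10), (8, 11)]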
Definition 1.9. Let n ≥ 1 and σ ∈ Sn . We write σ = σ1 ◦ σ2 ◦ . . . as a composition of cycles of length
s1 , s2 , . . . . Then
sgn(σ) := (−1)^((s1 −1)+(s2 −1)+···) ∈ {±1}
is called the sign (or signum) of σ.
E.g. let σ be as in 1.8(a); then sgn(σ) = (−1)^(2+3+1+1) = (−1)^7 = −1.
If σ is a transposition then sgn(σ) = −1.
Theorem 1.10. (a) The definition of sgn(σ) does not depend on the chosen cycle decomposition of σ.
(b) For all σ, π ∈ Sn we have sgn(σ ◦ π) = sgn(σ)sgn(π).
Proof. See Group Theory in Year 2.
2 Fields and Vector Spaces
Definition 2.1. A field is a set F together with binary operations on F , called addition and multipli-
cation, such that F together with addition and F × := F \{0} together with multiplication are abelian
groups (we use the notations a + b, 0, −a and a · b, 1, a−1 respectively) and such that the following axiom
holds:
Distributivity: For all a, b, c ∈ F we have a(b + c) = ab + ac in F .
Example 2.2. (a) The sets Q, R and C with the usual addition and multiplication are fields (see also
(1.2(b))).
(b) The set Z with the usual addition and multiplication is not a field because for instance there is no
multiplicative inverse of 2 in Z.
(c) The set F2 := {0, 1} together with the following operations is a field.
+ 0 1 · 0 1
0 0 1 0 0 0
1 1 0 1 0 1
Proof. (F2 , +) and (F2 \{0}, ·) are abelian groups (see (1.5(a))).
Distributivity: We need to check a(b + c) = ab + ac for all a, b, c ∈ F2 .
1st Case: a = 0 =⇒ LHS = 0, RHS = 0 + 0 = 0
2nd Case: a = 1 =⇒ LHS = b + c = RHS
(d) (without proof) Let p be a prime. The set Fp := {0, 1, . . . , p − 1} together with the addition defined
in Example (1.5(b)) and the following multiplication is a field: for a, b ∈ Fp the product a · b is defined
as the remainder of the usual product ab on division by p.
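To make the arithmetic of 2.2(c) and (d) concrete, here is a small Python sketch; it is an added illustration (not part of the notes), and it computes the multiplicative inverse via Fermat's little theorem a^(p−2) ≡ a^(−1) (mod p).

p = 7  # any prime

def add(a, b):
    return (a + b) % p          # addition in Fp

def mul(a, b):
    return (a * b) % p          # multiplication in Fp

def inv(a):
    # multiplicative inverse of a ≠ 0 in Fp, using Fermat's little theorem a^(p-1) = 1
    return pow(a, p - 2, p)

print(add(5, 4))        # 2, since 9 = 7 + 2
print(mul(3, 5))        # 1, so 5 is the inverse of 3 in F7
print(inv(3))           # 5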
In any field F we have:
(a) 0a = 0 in F .
Definition 2.4. Let F be a field. A vector space over F is an abelian group V (whose binary operation
is written additively, i.e. (x, y) ↦ x + y) together with a map F × V → V (called scalar multiplication and
written as (a, x) ↦ ax) such that the following axioms are satisfied:
• a(x + y) = ax + ay for all a ∈ F and x, y ∈ V ;
• (a + b)x = ax + bx for all a, b ∈ F and x ∈ V ;
• (ab)x = a(bx) for all a, b ∈ F and x ∈ V ;
• 1x = x for all x ∈ V .
The elements of V are often called vectors. We write 0F and 0V for the identity element of F and V ,
respectively, and often just 0 for both 0F and 0V . Furthermore we use the notation u − v for u + (−v)
when u, v are both in V .
Example 2.5. (a) For every n ∈ N the set Rn together with the usual addition and scalar multi-
plication (as seen in Linear Algebra I) is a vector space over R. Similarly, for any field F , the
set
F n := {(a1 , . . . , an ) : a1 , . . . , an ∈ F }
together with componentwise addition and the obvious scalar multiplication is a vector space over
F ; e.g. F22 = {(0, 0), (0, 1), (1, 0), (1, 1)} is a vector space over F2 , F = F 1 is a vector space over
F, {0} = F 0 is a vector space over F .
(b) Let V be the additive group of R or C. We view the usual multiplication R × V → V, (a, x) 7→ ax,
as scalar multiplication of R on V . Then V is a vector space over Q (easy to check!). Similarly C
is a vector space over R.
(c) Let V denote the abelian group R (with the usual addition). For a ∈ R and x ∈ V we put
a ⊗ x := a2 x ∈ V ; this defines a scalar multiplication
R × V → V, (a, x) 7→ a ⊗ x,
of the field R on V . Which of the vector-space axioms (see 2.4) hold for V with this scalar
multiplication?
Solution:
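One way to see what happens is to test the axioms of 2.4 numerically. The following Python sketch is only an added illustration (it is not the written solution of the notes); it evaluates both sides of each axiom for a few sample values.

def smul(a, x):
    return (a ** 2) * x        # the scalar multiplication a ⊗ x := a²x on V = R

samples = [(a, b, x, y) for a in (-2.0, 0.5, 3.0) for b in (-1.0, 2.0)
           for x in (-3.0, 1.5) for y in (0.0, 2.5)]

close = lambda s, t: abs(s - t) < 1e-9
print(all(close(smul(a, x + y), smul(a, x) + smul(a, y)) for a, b, x, y in samples))   # a ⊗ (x+y) = a ⊗ x + a ⊗ y : True
print(all(close(smul(a + b, x), smul(a, x) + smul(b, x)) for a, b, x, y in samples))   # (a+b) ⊗ x = a ⊗ x + b ⊗ x : False
print(all(close(smul(a * b, x), smul(a, smul(b, x))) for a, b, x, y in samples))       # (ab) ⊗ x = a ⊗ (b ⊗ x)     : True
print(all(close(smul(1, x), x) for a, b, x, y in samples))                              # 1 ⊗ x = x                  : True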
Proposition 2.6. Let V be a vector space over a field F and let a, b ∈ F and x, y ∈ V . Then we have:
(a) (a − b)x = ax − bx
(b) a(x − y) = ax − ay
(c) ax = 0V ⇐⇒ a = 0F or x = 0V
(d) (−1)x = −x
The next example is the “mother” of almost all vector spaces. It vastly generalises the fourth of the
following five ways of representing vectors and vector addition in R3 .
[Figure: five ways (I)–(V) of representing vectors and vector addition in R3. In representation (III) a
vector is displayed as the table of its values on {1, 2, 3}; e.g. a = (2.5, 0, −1), b = (1, 1, 0.5) and
a + b = (3.5, 1, −0.5).]
Example 2.7. Let F be a field and let S be a non-empty set. Let
F S := {f : S → F }
denote the set of all maps from S to F . We define an addition on F S and a scalar multiplication of F
on F S as follows. Let f, g ∈ F S and a ∈ F ; then f + g and af are defined pointwise by
(f + g)(s) := f (s) + g(s)   and   (af )(s) := a · f (s)   for all s ∈ S.
Special Cases:
(a) Let S = {1, . . . , n}. Identifying any map f : {1, . . . , n} → F with the corresponding tuple
(f (1), . . . , f (n)) we see that F S is nothing but the set F n of all n-tuples (a1 , . . . , an ) considered in
Example 2.5(a).
(b) Let S = {1, . . . , n} × {1, . . . , m}. Identifying any map f : {1, . . . , n} × {1, . . . , m} → F with the
corresponding matrix
( f ((1, 1))  . . .  f ((1, m)) )
(     ·                  ·     )
( f ((n, 1))  . . .  f ((n, m)) )
we see that F S is nothing but the set Mn×m (F ) of (n × m)-matrices
( a11  . . .  a1m )
(  ·           ·  )
( an1  . . .  anm ).
(c) Let S = N. Identifying any map f : N → F with the sequence (f (1), f (2), f (3), . . .) we see that F N
is nothing but the set of all infinite sequences (a1 , a2 , a3 , . . .) in F .
(d) Let F = R and let S be an interval I in R. Then F S = RI is the set of all functions f : I → R.
Such functions are usually visualised by their graph in I × R ⊆ R × R.
Identity element: Let 0 denote the constant function S → F, s 7→ 0F . For any f ∈ F S we have
(f + 0)(s) = f (s) + 0(s) = f (s) + 0F = f (s) for all s ∈ S, hence f + 0 = f . Similarly we have 0 + f = f .
=⇒ 0 is an identity element.
Similarly one easily checks the three remaining axioms of Definition 2.4.
Definition 2.8. Let V be a vector space over a field F . A subset W of V is called a subspace of V if
the following conditions hold:
(a) 0V ∈ W .
(b) W is closed under addition, i.e. for all x, y ∈ W we have x + y ∈ W .
(c) W is closed under scalar multiplication, i.e. for all a ∈ F and x ∈ W we have ax ∈ W .
Note that condition (b) states that the restriction of the addition in V to W gives a binary operation
W × W → W on W (addition in W ). Similarly, condition (c) states that the scalar multiplication of F
on V yields a map F × W → W which we view as a scalar multiplication of F on W .
Proposition 2.9. Let V be a vector space over a field F and let W be a subspace of V . Then W together
with the above mentioned addition and scalar multiplication is a vector space over F .
Proof. The following axioms hold for W because they already hold for V :
• associativity of addition;
• commutativity of addition;
There exists an additive identity element in W by condition 2.8(a). It remains to show that additive
inverses exist: Let x ∈ W ; then −x = (−1)x (by 2.6(d)) is in W by condition 2.8(c) and −x satisfies the
condition of an additive inverse of x because it does so in V .
Example 2.10. (a) Examples of subspaces of Rn as seen in Linear Algebra I, such as the nullspace
of any real (m × n)-matrix.
(b) The set of convergent sequences is a subspace of the vector space RN of all sequences (a1 , a2 , a3 , . . .)
in R. A subspace of this subspace (and hence of RN ) is the set of all sequences in R that converge
to 0. (See Calculus I for proofs).
(c) Let A ∈ Ml×m (R). Then W := {B ∈ Mm×n (R) : AB = 0} is a subspace of Mm×n (R).
Proof. i We have A · 0 = 0 =⇒ 0 ∈ W .
ii Let B1 , B2 ∈ W
=⇒ A(B1 + B2 ) = AB1 + AB2 = 0 + 0 = 0
=⇒ B1 + B2 ∈ W .
iii Let a ∈ R and B ∈ W
=⇒ A(aB) = a(AB) = a0 = 0
=⇒ aB ∈ W .
(d) Let I be a non-empty interval in R. The following subsets of the vector space RI consisting of all
functions from I to R are subspaces:
(e) The subset Zn of the vector space Rn over R is obviously closed under addition but not closed under
scalar multiplication because, for instance, (1, 0, . . . , 0) ∈ Zn and 1/2 ∈ R, but (1/2)(1, 0, . . . , 0) ∉ Zn .
(f) The subsets W1 := {(a, 0) : a ∈ R} and W2 := {(0, b) : b ∈ R} are obviously subspaces of R2 . The
subset W := W1 ∪ W2 of the vector space R2 is obviously closed under scalar multiplication but
not under addition because, for instance, (1, 0) and (0, 1) are in W but
(1, 0) + (0, 1) = (1, 1) ∉ W .
Proposition 2.11. Let W1 , W2 be subspaces of a vector space V over a field F . Then the intersection
W1 ∩ W2 and the sum
W1 + W2 := {x1 + x2 ∈ V : x1 ∈ W1 , x2 ∈ W2 }
are subspaces of V as well.
Proof. For W1 ∩ W2 :
• We have 0V ∈ W1 and 0V ∈ W2 (because W1 and W2 are subspaces)
=⇒ 0V ∈ W1 ∩ W2 (by the definition of intersection).
• Let x, y ∈ W1 ∩ W2
=⇒ x, y ∈ W1 and x, y ∈ W2 (by the definition of intersection)
=⇒ x + y ∈ W1 and x + y ∈ W2 (because W1 and W2 are subspaces)
=⇒ x + y ∈ W1 ∩ W2 (by the definition of intersection).
• Let a ∈ F and x ∈ W1 ∩ W2
=⇒ x ∈ W1 and x ∈ W2 (by the definition of intersection)
=⇒ ax ∈ W1 and ax ∈ W2 (because W1 and W2 are subspaces)
=⇒ ax ∈ W1 ∩ W2 (by the definition of intersection).
For W1 + W2 :
• We have 0V = 0V + 0V ∈ W1 + W2 .
• Let x, y ∈ W1 + W2
=⇒ ∃x1 , y1 ∈ W1 and ∃x2 , y2 ∈ W2 such that x = x1 + x2 , y = y1 + y2 .
=⇒ x + y = (x1 + x2 ) + (y1 + y2 ) = (x1 + y1 ) + (x2 + y2 ) ∈ W1 + W2
(because W1 and W2 are subspaces).
• Let a ∈ F and x ∈ W1 + W2
=⇒ ∃x1 ∈ W1 , x2 ∈ W2 s.t. x = x1 + x2
=⇒ ax = a(x1 + x2 ) = ax1 + ax2 ∈ W1 + W2
(because W1 and W2 are subspaces).
3 Bases
Definition 3.1. Let V be a vector space over a field F . Let x1 , . . . , xn ∈ V .
(a) An element x ∈ V is called a linear combination of x1 , . . . , xn if there are a1 , . . . , an ∈ F such that
x = a1 x1 + . . . + an xn .
(b) The subset of V consisting of all linear combinations of x1 , . . . , xn is called the span of x1 , . . . , xn
and denoted by Span(x1 , . . . , xn ); i.e.
Span(x1 , . . . , xn ) := {a1 x1 + . . . + an xn : a1 , . . . , an ∈ F }.
(c) We say that x1 , . . . , xn span (or generate) V or that x1 , . . . , xn form a spanning set of V if V =
Span(x1 , . . . , xn ), i.e. every x ∈ V is a linear combination of x1 , . . . , xn .
(See also Sections 6.4 and 6.5 of L.A.I.)
Example 3.2. (a) Let V := Mn×m (F ) be the vector space of (n × m)-matrices with entries in F . For
i ∈ {1, . . . , n} and j ∈ {1, . . . , m} let Eij denote the (n × m)-matrix with zeroes everywhere except
at (ij) where it has the entry 1. Then the matrices Eij ; i = 1, . . . , n; j = 1, . . . , m form a spanning
set of V .
(b) Do the vectors (1, i), (i, 2) ∈ C2 span the vector space C2 over C?
Solution: Let (w, z) ∈ C2 be an arbitrary vector.
We want to find a1 , a2 ∈ C such that a1 (1, i) + a2 (i, 2) = (w, z).
Hence we row-reduce the augmented matrix: applying R2 ↦ R2 − iR1,
( 1  i | w )          ( 1  i | w      )
( i  2 | z )  becomes ( 0  3 | z − iw ).
As in Linear Algebra I we conclude that this system is solvable. (Thm 3.15 of L.A.I.)
=⇒ (1, i), (i, 2) span C2 .
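Numerically, the coefficients a1 , a2 for any given (w, z) can be found with NumPy; the following sketch is an added illustration (any right-hand side works, because the matrix below is invertible):

import numpy as np

# columns are the two given vectors (1, i) and (i, 2)
M = np.array([[1, 1j],
              [1j, 2]], dtype=complex)

w, z = 3 - 1j, 2 + 5j                  # an arbitrary target vector (w, z)
a = np.linalg.solve(M, np.array([w, z]))
print(a)                               # the coefficients a1, a2
print(np.allclose(M @ a, [w, z]))      # True: a1*(1, i) + a2*(i, 2) = (w, z)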
• Let x, y ∈ Span(x1 , . . . , xn );
=⇒ ∃a1 , . . . , an ∈ F, ∃b1 , . . . bn ∈ F such that
x = a1 x1 + . . . + an xn and y = b1 x1 + . . . + bn xn ;
=⇒ x + y = (a1 x1 + . . . + an xn ) + (b1 x1 + . . . + bn xn )
= (a1 x1 + b1 x1 ) + . . . + (an xn + bn xn )
(using commutativity and associativity)
= (a1 + b1 )x1 + . . . + (an + bn )xn (using 1st distributivity law)
∈ Span(x1 , . . . , xn ) (by definition)
Span(x1 , . . . , xn ) contains x1 , . . . , xn :
because xi = 0F x1 + . . . + 0F xi−1 + 1 · xi + 0F xi+1 + . . . + 0F xn
Span(x1 , . . . , xn ) is smallest:
Let W be a subspace of V s.t. x1 , . . . , xn ∈ W .
Let x ∈ Span(x1 , . . . , xn ); write x = a1 x1 + . . . + an xn with a1 , . . . an ∈ F .
We have a1 x1 , . . . , an xn ∈ W (by condition 2.8(c))
=⇒ x = a1 x1 + . . . + an xn ∈ W (by condition 2.8(b)).
Definition 3.4. Let V be a vector space over a field F . Let x1 , . . . , xn ∈ V . We say that x1 , . . . , xn
are linearly independent (over F ) if the following condition holds: whenever a1 , . . . , an ∈ F and a1 x1 +
. . . + an xn = 0 then a1 = . . . = an = 0. Otherwise we say that x1 , . . . , xn are linearly dependent.
A linear combination a1 x1 + · · · + an xn is called trivial if a1 = · · · = an = 0, otherwise it is called
non-trivial. (See also Section 6.5 of L.A.I.)
Note: x1 , . . . , xn are linearly dependent ⇐⇒ ∃a1 , . . . , an ∈ F , not all zero, such that a1 x1 + · · · +
an xn = 0. In other words, ⇐⇒ there exists a non-trivial linear combination of x1 , . . . , xn which equals
0.
Example 3.5. (a) Examples as seen in Linear Algebra I. (Section 6.5 of L.A.I.)
(b) The three vectors x1 = (1, 0, 1), x2 = (1, 1, 0), x3 = (0, −1, 1) ∈ F 3 are not linearly independent
because x1 − x2 − x3 = 0.
(c) Determine all vectors (c1 , c2 , c3 ) ∈ C3 such that x1 := (1, i, 1), x2 := (0, 1, 0), x3 := (c1 , c2 , c3 ) ∈ C3
are linearly dependent.
Solution: We apply Gaussian elimination to
( 1  0  c1 )
( i  1  c2 )
( 1  0  c3 ).
Applying R2 ↦ R2 − iR1 and R3 ↦ R3 − R1 gives
( 1  0  c1       )
( 0  1  c2 − ic1 )
( 0  0  c3 − c1  ).
=⇒ The equation a1 x1 + a2 x2 + a3 x3 = 0 has a non-trivial solution (a1 , a2 , a3 ) if and only if
c3 − c1 = 0
=⇒ x1 , x2 , x3 are linearly dependent if and only if c1 = c3 .
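The same condition can be read off from a determinant (determinants are recalled in Section 5 below and in L.A.I.): expanding along the second column of the 3 × 3 matrix above gives det = c3 − c1. A short SymPy sketch, added here for illustration, confirms this:

import sympy as sp

c1, c2, c3 = sp.symbols('c1 c2 c3')
M = sp.Matrix([[1, 0, c1],
               [sp.I, 1, c2],
               [1, 0, c3]])        # columns are x1, x2, x3

print(sp.simplify(M.det()))        # c3 - c1: the columns are linearly dependent iff c1 = c3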
(e) Let I ⊆ R be a non-empty open interval. For any i ∈ N0 , let ti denote the polynomial function
I → R, s 7→ si . Then 1, t, t2 , . . . , tn are linearly independent in RI .
Proposition 3.6. Any non-zero polynomial function f : R → R of degree n has at most n zeroes.
Proof. Induction on n:
n ≤ 1:
The polynomial a0 + a1 t has no zeroes if a1 = 0 and a0 ≠ 0, and precisely one zero (namely −a0 /a1 ) if
a1 ≠ 0.
Induction step (n − 1 → n), n ≥ 2: Let f ∈ Pn .
Suppose s0 ∈ R is a zero of f . (If f doesn’t have a zero, we are done.)
=⇒ ∃g ∈ Pn−1 such that f (s) = g(s)(s − s0 ) for all s ∈ R
(for instance by long division)
=⇒ The zeroes of f are s0 and those of g.
(Let s1 ∈ R, s1 6= s0 s.t. f (s1 ) = 0 =⇒ g(s1 )(s1 − s0 ) = 0 =⇒ g(s1 ) = 0)
=⇒ f has as most n zeroes
(because g has at most n − 1 zeroes by the inductive hypothesis).
(b) Every subset of any set of linearly independent vectors is linearly independent again. (This is
equivalent to: If a subset of a set of vectors is linearly dependent, then the set itself is linearly
dependent.)
(c) Let x1 , . . . , xn ∈ V and suppose that xi = 0V for some i ∈ {1, . . . , n} or that xi = xj for some
i ≠ j. Then x1 , . . . , xn are linearly dependent.
(d) If x1 , . . . , xn ∈ V are linearly dependent then at least one vector xi among x1 , . . . , xn is a linear
combination of the other ones.
Definition 3.8. Let V be a vector space over a field F . Let x1 , . . . , xn ∈ V . We say that x1 , . . . , xn
form a basis of V if x1 , . . . , xn both span V and are linearly independent.
Example 3.9. (a) The vectors e1 := (1, 0, . . . , 0); . . . ; en := (0, . . . , 0, 1) form a basis of F n , called the
canonical or standard basis of F n (as in Linear Algebra I (Ex 6.41(a))).
(b) The polynomials 1, t, . . . , tn form a basis of Pn . (see 2.10(d)iv, 3.2(c) and 3.5(e)).
(d) Find a basis of the nullspace N (A) of the matrix
     (  1  −1   3    2 )
A := (  2  −1   6    7 )   ∈ M4×4 (R).
     (  3  −2   9    9 )
     ( −2   0  −6  −10 )
Solution: We perform row operations. Applying R2 ↦ R2 − 2R1, R3 ↦ R3 − 3R1 and R4 ↦ R4 + 2R1 to A gives
( 1  −1  3   2 )
( 0   1  0   3 )
( 0   1  0   3 )
( 0  −2  0  −6 )
and then applying R1 ↦ R1 + R2, R3 ↦ R3 − R2 and R4 ↦ R4 + 2R2 gives
     ( 1  0  3  5 )
Ã := ( 0  1  0  3 )
     ( 0  0  0  0 )
     ( 0  0  0  0 ).
=⇒ N (A) = {a ∈ R4 : Aa = 0} = {a ∈ R4 : Ãa = 0}
= {(a1 , a2 , a3 , a4 ) ∈ R4 : a1 = −3a3 − 5a4 , a2 = −3a4 }
= {(−3a3 − 5a4 , −3a4 , a3 , a4 ) : a3 , a4 ∈ R}
= {a3 (−3, 0, 1, 0) + a4 (−5, −3, 0, 1) : a3 , a4 ∈ R}
= Span((−3, 0, 1, 0), (−5, −3, 0, 1));
=⇒ (−3, 0, 1, 0), (−5, −3, 0, 1) form a basis of N (A) (they are linearly independent because they are not
multiples of each other).
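The same basis can be obtained with SymPy; the following sketch is an added illustration, not part of the notes:

import sympy as sp

A = sp.Matrix([[1, -1, 3, 2],
               [2, -1, 6, 7],
               [3, -2, 9, 9],
               [-2, 0, -6, -10]])

for v in A.nullspace():            # a list of basis column vectors of N(A)
    print(v.T)                     # prints (-3, 0, 1, 0) and (-5, -3, 0, 1)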
(e) Let A ∈ Mn×n (R). Then:
A invertible ⇐⇒ The columns of A form a basis of Rn
(for a proof see Theorem 5.6).
Proposition 3.10. Let V be a vector space over a field F . Let x1 , . . . , xn ∈ V . The following statements
are equivalent:
(a) x1 , . . . , xn form a basis of V .
(b) x1 , . . . , xn form a minimal spanning set of V (i.e. x1 , . . . , xn span V and after removing any vector
from x1 , . . . , xn the remaining ones don’t span V any more). (Compare Def 6.23 of L.A.I.)
(c) x1 , . . . , xn form a maximal linearly independent subset of V (i.e. x1 , . . . , xn are linearly independent
and for x ∈ V the n + 1 vectors x1 , . . . , xn , x are linearly dependent).
(d) Every vector x ∈ V can be written in the form
x = a1 x1 + . . . + an xn
Definition 3.13. Let V be a vector space over a field F . If x1 , . . . , xn ∈ V form a basis of V we say
that V is of finite dimension and call n the dimension of V ; we write dimF (V ) or just dim(V ) for n.
Note that n does not depend on the chosen basis x1 , . . . , xn by 3.12.
Corollary 3.16. Let V be a vector space over a field F . Let x1 , . . . , xn ∈ V . Suppose two of the following
three statements hold. Then x1 , . . . , xn form a basis:
(a) x1 , . . . , xn are linearly independent.
(b) x1 , . . . , xn span V .
(c) n = dimF (V ).
Example 3.17. (a) The vectors x1 := (1, 2, 3), x2 := (−2, 1, 0), x3 := (1, 0, 1) form a basis of the vector
space Q3 over Q, of the vector space R3 over R and of the vector space C3 over C.
Indeed, applying Gaussian elimination (R2 ↦ R2 − 2R1, R3 ↦ R3 − 3R1, then R3 ↦ R3 − (6/5)R2) to
the matrix A with columns x1 , x2 , x3 gives
( 1  −2    1  )
( 0   5   −2  )
( 0   0   2/5 )
=⇒ c = (c1 , c2 , c3 ) = (0, 0, 0) is the only solution of Ac = 0.
=⇒ x1 , x2 , x3 are linearly independent over C and then also over R and Q.
=⇒ x1 , x2 , x3 form a basis of C3 , R3 and Q3
(by 3.16 because 3 = dimC (C3 ) = dimR (R3 ) = dimQ (Q3 )).
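A quick numerical cross-check (our added sketch, assuming NumPy): three vectors in R3 form a basis exactly when the matrix having them as columns has non-zero determinant (compare Theorem 5.6 below).

import numpy as np

A = np.array([[1, -2, 1],
              [2,  1, 0],
              [3,  0, 1]], dtype=float)   # columns are x1, x2, x3

print(np.linalg.det(A))                    # approximately 2 (non-zero), so x1, x2, x3 form a basis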
4 Linear Transformations
Let F be a field (e.g. F = R, Q, C or F2 ).
An (m × n)-matrix over F is an array
( a11  . . .  a1n )
(  ·           ·  )
( am1  . . .  amn )
with entries aij in F . We use the notation Mm×n (F ) for the set of all (m × n) - matrices over F (see
also 2.7(b)). We define addition and multiplication of matrices (and other notions) in the same way as
in the case F = R (as seen in Linear Algebra I). E.g.
( 1  1+i ) ( 1−i )   ( 1−i + 3+3i  )   ( 4+2i )
( 2  1−i ) (  3  ) = ( 2−2i + 3−3i ) = ( 5−5i )      (matrices over C);
( 0 1 ) ( 1 1 )   ( 0 1 )
( 1 1 ) ( 0 1 ) = ( 1 0 )      (as matrices over F2 ),
but
( 0 1 ) ( 1 1 )   ( 0 1 )
( 1 1 ) ( 0 1 ) = ( 1 2 )      (as matrices over R).
Definition 4.2. Let V, W be vector spaces over a field F . A map L : V → W is called a linear
transformation if the following two conditions hold:
(i) L(x + y) = L(x) + L(y) for all x, y ∈ V ;
(ii) L(ax) = aL(x) for all a ∈ F and x ∈ V .
Example 4.3. (a) Let A = (aij ) ∈ Mm×n (F ). Then the map
LA : F n → F m ,  x = (x1 , . . . , xn ) ↦ Ax = (a11 x1 + . . . + a1n xn , . . . , am1 x1 + . . . + amn xn )
is a linear transformation.
E.g. if
A = ( a  0 )
    ( 0  a )   ∈ M2×2 (R)
for some a ∈ R, then LA : R2 → R2 is given by x ↦ ax (stretch by a factor of a); if
A = (  cos(φ)  sin(φ) )
    ( −sin(φ)  cos(φ) )
for some 0 ≤ φ < 2π, then LA : R2 → R2 is the clockwise rotation by the angle φ.
Proof. i Let x, y ∈ F n ;
=⇒ LA (x + y) = A(x + y) = Ax + Ay = LA (x) + LA (y)
ii Let a ∈ F and x ∈ F n
=⇒ LA (ax) = A(ax) = a(Ax) = a(LA (x))
(The middle equality of both chains of equalities has been proved in Linear Algebra I if F = R,
See Thm 2.13(i) and (ii), the same proof works for any field F . The whole statement is actually
Lemma 5.3 in L.A.I.)
(b) Let V be a vector space over a field F . Then the following maps are linear transformations:
id : V → V, x 7→ x (identity);
0 : V → V, x 7→ 0V (zero map);
the map V → V, given by x 7→ ax, for any given a ∈ F fixed (stretch).
(c) Let L : V → W and M : W → Z be linear transformation between vector spaces over a field F .
Then their composition M ◦ L : V → Z is again a linear transformation.
i Let x, y ∈ V ;
=⇒ (M ◦ L)(x + y) = M (L(x + y)) = M (L(x) + L(y))
= M (L(x)) + M (L(y)) = (M ◦ L)(x) + (M ◦ L)(y).
ii Let a ∈ F and x ∈ V ;
=⇒ (M ◦ L)(ax) = M (L(ax)) = M (a(L(x)))
= a(M (L(x))) = a(M ◦ L)(x).
(d) Let V be the subspace of RR consisting of all differentiable functions. Then differentiation D :
V → RR , f 7→ f 0 , is a linear transformation.
Proof. i Let f, g ∈ V ;
=⇒ D(f + g) = (f + g)0 = f 0 + g 0 = D(f ) + D(g)
ii Let a ∈ R and f ∈ V ;
=⇒ D(af ) = (af )0 = af 0 = a(D(f ))
(The middle equality in both chains of equalities has been proved in Calculus.)
(e) The map L : R2 → R, (x1 , x2 ) ↦ x1 x2 , is not a linear transformation.
Proof. Let a = 2 and x = (1, 1) ∈ R2 . Then:
L(ax) = L((2, 2)) = 4 but aL(x) = 2 · 1 = 2.
Proposition 4.4 (Matrix representation I). Let F be a field. Let L : F n → F m be a linear transforma-
tion. Then there exists a unique matrix A ∈ Mm×n (F ) such that L = LA (as defined in 4.3(a)). In this
case we say that A represents L (with respect to the standard bases of F n and F m ).
Proof. Let e1 := (1, 0, . . . , 0), . . . , en := (0, . . . , 0, 1) denote the standard basis of F n .
Uniqueness: Suppose A ∈ Mm×n (F ) satisfies L = LA .
=⇒ The j th column of A is Aej = LA (ej ) = L(ej ) (for j = 1, . . . , n)
=⇒ A is the (m × n)-matrix with the column vectors L(e1 ), . . . , L(en ).
Existence: Let A be defined this way. We want to show L = LA .
Let c = (c1 , . . . , cn ) ∈ F n =⇒ c = c1 e1 + . . . + cn en ;
=⇒ L(c) = L(c1 e1 ) + . . . + L(cn en ) = c1 L(e1 ) + . . . + cn L(en )
and LA (c) = ... = c1 LA (e1 ) + . . . + cn LA (en )
(because L and LA are linear transformations)
=⇒ L(c) = LA (c) (because L(ej ) = LA (ej ) for all j = 1, . . . , n).
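Proposition 4.4 is constructive: the j-th column of A is L(ej ). The following Python sketch is an added illustration (the function names are ours); it works for any function f implementing a linear map Rn → Rm.

import numpy as np

def representing_matrix(f, n):
    # columns of A are the images of the standard basis vectors e1, ..., en under f
    columns = [f(np.eye(n)[:, j]) for j in range(n)]
    return np.column_stack(columns)

# example: the linear map R^2 -> R^2 given by (x1, x2) |-> (x1 + 2*x2, 3*x1)
f = lambda x: np.array([x[0] + 2 * x[1], 3 * x[0]])
A = representing_matrix(f, 2)
print(A)                                                                  # [[1. 2.], [3. 0.]]
print(np.allclose(A @ np.array([5.0, -1.0]), f(np.array([5.0, -1.0]))))   # True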
Definition 4.5. Let L : V → W be a linear transformation between vector spaces V, W over a field F .
Then
ker(L) := {x ∈ V : L(x) = 0W }
is called the kernel of L and
im(L) := {y ∈ W : ∃x ∈ V : y = L(x)}
is called the image of L.
For A ∈ Mm×n (F ) we have
ker(LA ) = N (A),
where N (A) = {c ∈ F n : Ac = 0} denotes the nullspace of A (see also Section 6.2 of L.A.I.), and
im(LA ) = Col(A),
where Col(A) denotes the column space of A, i.e. the span of the columns of A.
Lemma 4.7. Let V and W be vector spaces over a field F and let L : V → W be a linear transformation.
Then ker(L) is a subspace of V and im(L) is a subspace of W .
Example 4.8. Let A ∈ M4×4 (R) be as in 3.9(d). Find a basis of image, im(LA ), of LA : R4 → R4 , c 7→
Ac.
Applying the column operations C2 ↦ C2 + C1, C3 ↦ C3 − 3C1 and C4 ↦ C4 − 2C1 to A gives
(  1   0  0   0 )
(  2   1  0   3 )
(  3   1  0   3 )
( −2  −2  0  −6 )
and then C4 ↦ C4 − 3C2 gives
     (  1   0  0  0 )
Ã := (  2   1  0  0 )
     (  3   1  0  0 )
     ( −2  −2  0  0 )
=⇒ The two vectors (1, 2, 3, −2) and (0, 1, 1, −2) span im(LA )
(because im(LA ) = Col(A) = Col(Ã) by 4.6 and 3.3(b)).
=⇒ They form a basis of im(LA ) (because they are also linearly independent, as they are not multiples
of each other).
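Again this can be cross-checked with SymPy (our added illustrative sketch):

import sympy as sp

A = sp.Matrix([[1, -1, 3, 2],
               [2, -1, 6, 7],
               [3, -2, 9, 9],
               [-2, 0, -6, -10]])

for v in A.columnspace():          # a basis of Col(A) = im(L_A); SymPy returns the pivot columns of A
    print(v.T)                     # prints (1, 2, 3, -2) and (-1, -1, -2, 0)
print(A.rank())                    # 2 = dim im(L_A)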
Proposition 4.9. Let V and W be vector spaces over a field F and let L : V → W be a linear
transformation. Let x1 , . . . , xn ∈ V . Then:
(a) If x1 , . . . , xn ∈ V span V then L(x1 ), . . . , L(xn ) span im(L).
(b) If L(x1 ), . . . , L(xn ) are linearly independent then x1 , . . . , xn are linearly independent.
Proof. (a) Let y ∈ im(L);
=⇒ ∃x ∈ V such that y = L(x)
and ∃a1 , . . . , an ∈ F such that x = a1 x1 + . . . + an xn (since V = Span(x1 , . . . , xn ))
=⇒ y = L(x) = L(a1 x1 + . . . + an xn ) =
a1 L(x1 ) + . . . + an L(xn ) ∈ Span(L(x1 ), . . . , L(xn ));
=⇒ im(L) ⊆ Span(L(x1 ), . . . , L(xn ));
=⇒ im(L) = Span(L(x1 ), . . . , L(xn )) (i.e. L(x1 ), . . . , L(xn ) span im(L))
(because Span(L(x1 ), . . . , L(xn )) is the smallest subspace containing L(x1 ), . . . , L(xn ) (by 3.3)).
Proposition 4.10 (Kernel Criterion). Let V and W be vector spaces over a field F , and let L : V → W
be a linear transformation. Then:
L injective ⇐⇒ ker(L) = {0V }.
Proof. “=⇒”: We have L(0V ) = 0W , so 0V ∈ ker(L). Conversely, let x ∈ ker(L); then L(x) = 0W = L(0V ),
hence x = 0V because L is injective. So ker(L) = {0V }.
“⇐=”: Let x, y ∈ V such that L(x) = L(y);
=⇒ L(x − y) = L(x) − L(y) = 0W ;
=⇒ x − y = 0V (since ker(L) = {0V });
=⇒ x = y.
Definition 4.11. Let V, W be vector spaces over a field F . A bijective linear transformation L : V → W
is called an isomorphism. The vector spaces V and W are called isomorphic if there exists an isomorphism
L : V → W ; we then write V ∼= W.
Example 4.12. (a) For any vector space V over a field F , the identity id : V → V is an isomorphism.
an
a4
is clearly an isomorphism.
Proposition 4.13. Let V be a vector space over a field F with basis x1 , . . . xn .
Then the map
a1
L : F n → V, ... 7→ a1 x1 + . . . + an xn
an
is an isomorphism. (We will later use the notation Ix1 ,...,xn for the map L.)
Proof. (a) Let a = (a1 , . . . , an ) and b = (b1 , . . . , bn ) ∈ F n ;
=⇒ L(a + b) = L((a1 + b1 , . . . , an + bn ))
= (a1 + b1 )x1 + . . . + (an + bn )xn (by definition of L)
= (a1 x1 + . . . + an xn ) + (b1 x1 + . . . + bn xn )
(by distributivity, commutativity and associativity)
= L(a) + L(b) (by definition of L).
(b) Let a ∈ F and b = (b1 , . . . , bn ) ∈ F n ;
=⇒ L(ab) = L((ab1 , . . . , abn ))
= (ab1 )x1 + . . . + (abn )xn (by definition of L)
= a(b1 x1 + . . . + bn xn ) (using the axioms of a vector space)
= a(L(b)) (by definition of L).
(c) ker(L) = {(a1 , . . . , an ) ∈ F n : a1 x1 + . . . + an xn = 0V } = {(0, . . . , 0)}
(because x1 , . . . , xn are linearly independent)
=⇒ L is injective (by 4.10).
Finally, L is surjective because x1 , . . . , xn span V : every x ∈ V can be written as
x = a1 x1 + . . . + an xn = L((a1 , . . . , an )). Hence L is bijective, i.e. an isomorphism.
Theorem 4.14. Let V and W be vector spaces over a field F of dimension n and m, respectively. Then
V and W are isomorphic if and only if n = m.
Theorem 4.15 (Dimension Theorem). Let V be a vector space over a field F of finite dimension and
let L : V → W be a linear transformation from V to another vector space W over F . Then:
dimF (V ) = dimF (ker(L)) + dimF (im(L)).
Example 4.16. Let A ∈ M3×5 (R) be the matrix displayed below and let LA : R5 → R3 , x ↦ Ax.
We want to find dimR (ker(LA )) and dimR (im(LA )) – we do this by finding bases for both ker(LA ) and
im(LA ).
We find a basis of the nullspace N (A):
     ( 1  −2   2  3  −1 )                            ( 1  −2  0  −1   3 )
A =  ( −3   6  −1  1  −7 )   Gaussian elimination    ( 0   0  0   0   0 )   =: Ã.
     ( 2  −4   5  8  −4 )   ∼∼∼∼∼∼∼∼∼∼∼∼∼∼∼∼>        ( 0   0  1   2  −2 )
Then
N (A) = {x ∈ R5 : Ax = 0} = {x ∈ R5 : Ãx = 0}
= {(x1 , x2 , x3 , x4 , x5 ) ∈ R5 : x1 = 2x2 + x4 − 3x5 , x3 = −2x4 + 2x5 }
= {(2x2 + x4 − 3x5 , x2 , −2x4 + 2x5 , x4 , x5 ) : x2 , x4 , x5 ∈ R}
= {x2 (2, 1, 0, 0, 0) + x4 (1, 0, −2, 1, 0) + x5 (−3, 0, 2, 0, 1) : x2 , x4 , x5 ∈ R}
= Span((2, 1, 0, 0, 0), (1, 0, −2, 1, 0), (−3, 0, 2, 0, 1)).
These three vectors are also linearly independent: if
a1 (2, 1, 0, 0, 0) + a2 (1, 0, −2, 1, 0) + a3 (−3, 0, 2, 0, 1) = 0,
then comparing the 2nd, 4th and 5th entries gives a1 = a2 = a3 = 0. Hence they form a basis of
N (A) = ker(LA ) and dimR (ker(LA )) = 3.
To find a basis of im(LA ) = Col(A) we apply column operations:
     ( 1  −2   2  3  −1 )   column operations   ( 1   0  0  0  0 )
A =  ( −3   6  −1  1  −7 )   ∼∼∼∼∼∼∼∼∼∼∼∼∼>     ( −3  0  5  0  0 )
     ( 2  −4   5  8  −4 )                       ( 2   0  1  0  0 )
So the vectors (1, −3, 2) and (0, 5, 1) span im(LA ). Since they are obviously not multiples of each other,
they are linearly independent, hence form a basis of im(LA ). Consequently, dimR (im(LA )) = 2, and
dimR (ker(LA )) + dimR (im(LA )) = 3 + 2 = 5 = dimR (R5 ),
as we wanted to prove.
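The whole computation, including the dimension count of Theorem 4.15, can be reproduced with SymPy (an added sketch, not part of the notes):

import sympy as sp

A = sp.Matrix([[1, -2, 2, 3, -1],
               [-3, 6, -1, 1, -7],
               [2, -4, 5, 8, -4]])

kernel_basis = A.nullspace()         # basis of ker(L_A) = N(A)
image_basis = A.columnspace()        # basis of im(L_A) = Col(A)
print(len(kernel_basis), len(image_basis))               # 3 2
print(len(kernel_basis) + len(image_basis) == A.cols)    # True: dim ker + dim im = dim R^5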
L(xr+1 ), . . . , L(xn ) form a basis of im(L):
• L(xr+1 ), . . . , L(xn ) span im(L): Let y ∈ imL.
=⇒ ∃x ∈ V such that y = L(x) (by definition of im(L))
and ∃a1 , . . . , an ∈ F s.t. x = a1 x1 + · · · + an xn (since x1 , . . . , xn span V ).
=⇒ y = L(x) = L(a1 x1 + · · · + an xn )
and (Iy1 ,...,ym ◦ LA )((c1 , . . . , cn )) = Iy1 ,...,ym ((d1 , . . . , dm )) = d1 y1 + · · · + dm ym .
Hence: L(c1 x1 + . . . + cn xn ) = d1 y1 + . . . + dm ym
⇐⇒ (L ◦ Ix1 ,...,xn )((c1 , . . . , cn )) = (Iy1 ,...,ym ◦ LA )((c1 , . . . , cn ))
Therefore: A represents L ⇐⇒ L ◦ Ix1 ,...,xn = Iy1 ,...,ym ◦ LA
D(1) = 0 = 0 + 0t + 0t2
D(t) = 1 = 1 + 0t + 0t2
D(t2 ) = 2t = 0 + 2t + 0t2
D(t3 ) = 3t2 = 0 + 0t + 3t2
        ( 0  1  0  0 )
=⇒ A =  ( 0  0  2  0 )
        ( 0  0  0  3 )
Example 4.19. Let
B := ( 1  −1 )
     ( 2   4 )   ∈ M2×2 (R).
Find the matrix A ∈ M2×2 (R) representing the linear transformation LB : R2 → R2 , x ↦ Bx, with
respect to the basis (1, −2), (1, −1) of R2 (used for both source and target space).
Solution:
LB ((1, −2)) = (1 · 1 + (−1) · (−2), 2 · 1 + 4 · (−2)) = (3, −6) = 3 · (1, −2) + 0 · (1, −1)
LB ((1, −1)) = (1 · 1 + (−1) · (−1), 2 · 1 + 4 · (−1)) = (2, −2) = 0 · (1, −2) + 2 · (1, −1)
=⇒ A = ( 3  0 )
       ( 0  2 ).
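Equivalently, A = P −1 BP where P is the matrix whose columns are the chosen basis vectors; a NumPy sketch we add for illustration:

import numpy as np

B = np.array([[1, -1],
              [2, 4]], dtype=float)
P = np.array([[1, 1],
              [-2, -1]], dtype=float)   # columns are the basis vectors (1, -2) and (1, -1)

A = np.linalg.inv(P) @ B @ P
print(A)                                 # [[3. 0.], [0. 2.]]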
5 Determinants
In Linear Algebra I the determinant of a square matrix has been (inductively) defined via expanding
along a row (or a column). Here we begin with the following closed formula; it is obtained by iterated
expansion (without proof).
Definition 5.1 (Leibniz). Let F be a field. Let n ≥ 1 and A = (aij )i,j=1,...,n ∈ Mn×n (F ). Then
det(A) := Σ_{σ∈Sn} sgn(σ) · a1,σ(1) · a2,σ(2) · · · an,σ(n) ∈ F
det(A) = sgn(id)a11 a22 + sgn(⟨1, 2⟩)a12 a21 = a11 a22 − a12 a21 .
E.g. let A = ( 1 + 2i   3 + 4i )
             ( 1 − 2i   2 − i  )   ∈ M2×2 (C);
=⇒ det(A) = (1 + 2i)(2 − i) − (3 + 4i)(1 − 2i)
= (2 + 2 + 4i − i) − (3 + 8 + 4i − 6i) = −7 + 5i.
(c) Let n = 3 and
A = ( a11  a12  a13 )
    ( a21  a22  a23 )   ∈ M3×3 (F );
    ( a31  a32  a33 )
=⇒ S3 = {id, ⟨1, 2, 3⟩, ⟨1, 3, 2⟩, ⟨1, 3⟩, ⟨2, 3⟩, ⟨1, 2⟩} and
det(A) = sgn(id) a11 a22 a33 + sgn(⟨1, 2, 3⟩) a12 a23 a31 + sgn(⟨1, 3, 2⟩) a13 a21 a32
       + sgn(⟨1, 3⟩) a13 a22 a31 + sgn(⟨2, 3⟩) a11 a23 a32 + sgn(⟨1, 2⟩) a12 a21 a33
= a11 a22 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 − a11 a23 a32 − a12 a21 a33
(the first three signs are +1 and the last three are −1).
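Definition 5.1 translates directly into code. The following Python sketch is an added illustration (and impractically slow for large n, since |Sn| = n!); here the sign is computed by counting inversions, which agrees with the cycle-length definition in 1.9.

from itertools import permutations

def sign(perm):
    # sign of a permutation given as a tuple of 0-based images: (-1)^(number of inversions)
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(A):
    n = len(A)
    total = 0
    for perm in permutations(range(n)):      # all sigma in S_n
        prod = sign(perm)
        for i in range(n):
            prod *= A[i][perm[i]]            # the factor a_{i, sigma(i)}
        total += prod
    return total

A = [[1 + 2j, 3 + 4j],
     [1 - 2j, 2 - 1j]]
print(det(A))                                 # (-7+5j), as computed above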
Proof. Let σ ∈ Sn , σ ≠ id =⇒ ∃i0 ∈ {1, . . . , n} such that i0 > σ(i0 );
=⇒ ai0 ,σ(i0 ) = 0; =⇒ the product a1,σ(1) · · · an,σ(n) = 0.
=⇒ det(A) = sgn(id) a1,1 · · · an,n = a11 a22 · · · ann .
Theorem 5.3 (Weierstrass’ axiomatic description of the determinant map). (See Defn 4.1 of L.A.I.)
Let F be a field. Let n ≥ 1. The map
(b) Multiplying any column of a matrix A ∈ Mn×n (F ) by a scalar λ ∈ F changes det(A) by the factor
λ: if the s-th column (a1s , . . . , ans ) of A is replaced by (λa1s , . . . , λans ), then the determinant of the
resulting matrix equals λ · det(A).
Theorem 5.6. Let F be a field. Let A ∈ Mn×n (F ). Then the following are equivalent:
(a) A is invertible.
(b) The columns of A span F n .
(c) N (A) = {0}, i.e. the columns of A are linearly independent.
(d) det(A) ≠ 0.
Proof. “(a) =⇒ (b)”. We’ll prove the following more precise statement:
(?) ∃B ∈ Mn×n (F ) s.t. AB = In ⇐⇒ The columns of A span F n .
Proof of (?): Let a1 , . . . , an denote the columns of A and let e1 = (1, 0, . . . , 0), . . . , en = (0, . . . , 0, 1),
i.e. In = (e1 , . . . , en ). Then:
LHS ⇐⇒ ∃b1 , . . . , bn ∈ F n s.t. Abi = ei for all i = 1, . . . , n
(use B = (b1 , . . . , bn ));
⇐⇒ ∃b1 , . . . , bn ∈ F n s.t. a1 bi1 + . . . + an bin = ei for all i = 1, . . . , n (where bi = (bi1 , . . . , bin ));
⇐⇒ e1 , . . . , en ∈ Span(a1 , . . . , an ) ⇐⇒ RHS.
“(b) ⇐⇒ (d)”. We apply column operations to the matrix A until we arrive at a lower triangular
matrix C. Then:
The columns of A span F n ;
⇐⇒ the columns of C span F n (by 3.3(b));
⇐⇒ all the diagonal elements of C are non-zero (because C is triangular)
⇐⇒ det(C) ≠ 0 (by 5.2(d))
⇐⇒ det(A) ≠ 0 (because det(C) = λ det(A) for some non-zero λ ∈ F by 5.5).
Theorem 5.7. (See also Thm 4.21 of L.A.I.) Let F be a field. Let A, B ∈ Mn×n (F ). Then:
det(AB) = det(A) · det(B).
Example 5.8. For each m ∈ N compute det(Am ) ∈ C where
A = ( 1 + 4i   1     )
    ( 5 + i    1 − i )   ∈ M2×2 (C).
6 Diagonalisability
Definition 6.1. Let V be a vector space over a field F and let L : V → V be a linear transformation
from V to itself.
(a) For λ ∈ F the subset Eλ (L) := {x ∈ V : L(x) = λx} of V is called the eigenspace of L with respect
to λ.
(b) An element λ ∈ F is called an eigenvalue of L if Eλ (L) is not the zero space. In this case any
vector x in Eλ (L) different from the zero vector is called an eigenvector of L with eigenvalue λ.
(c) Let A ∈ Mn×n (F ). The eigenspaces, eigenvalues and eigenvectors of A are, by definition, those of
LA : F n → F n , x 7→ Ax.
Lemma 6.2. Let F , V and L be as in Def 6.1. Then Eλ (L) is a subspace of V for every λ ∈ F .
For A ∈ Mn×n (F ) and λ ∈ F we have:
λ is an eigenvalue of A ⇐⇒ det(λIn − A) = 0.
(pA (λ) := det(λIn − A) is called the characteristic polynomial of A.) (See also Proposition 7.5 in L.A.I.)
Proof. λ is an eigenvalue of A
⇐⇒ ∃x ∈ F n , x ≠ 0, such that Ax = λx
⇐⇒ ∃x ∈ F n , x ≠ 0, such that (λIn − A)x = 0
⇐⇒ N (λIn − A) ≠ {0}
⇐⇒ det(λIn − A) = 0 (by 5.6(c) ⇐⇒ (d)).
Example. Find the eigenvalues of
A := ( 5i   3  )
     ( 2   −2i )   ∈ M2×2 (C)
and a basis of each eigenspace.
Solution.
pA (λ) = det(λI2 − A) = det ( λ − 5i   −3     )
                            ( −2       λ + 2i )
= (λ − 5i)(λ + 2i) − 6 = λ2 − (3i)λ + 4.
The two roots of this polynomial are λ1,2 = (3i ± √(−9 − 16))/2 = (3i ± 5i)/2 = 4i or −i;
=⇒ the eigenvalues of A are 4i and −i.
Basis of E4i (A): We have
4iI2 − A = ( −i  −3 )
           ( −2  6i ),
and the equation (4iI2 − A)x = 0 amounts to x1 = (3i)x2 . Hence
E4i (A) = {((3i)x2 , x2 ) : x2 ∈ C} = Span((3i, 1))
=⇒ basis of E4i (A) is (3i, 1).
Basis of E−i (A): Apply Gaussian elimination to
−iI2 − A = ( −6i  −3 )   R1 ↔ R2   ( −2   i  )   R2 ↦ R2 − (3i)R1   ( −2  i )
           ( −2    i )   ∼∼∼∼∼>    ( −6i  −3 )   ∼∼∼∼∼∼∼∼∼∼∼>       (  0  0 )
=⇒ E−i (A) = {(x1 , x2 ) ∈ C2 : x1 = (i/2) x2 } = {((i/2) x2 , x2 ) : x2 ∈ C} = Span((i/2, 1))
=⇒ basis of E−i (A) is (i/2, 1).
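NumPy reproduces these eigenvalues and (scalar multiples of) these eigenvectors; the following sketch is added for illustration only:

import numpy as np

A = np.array([[5j, 3],
              [2, -2j]], dtype=complex)

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)                  # approximately 4i and -i
for k in range(2):
    v = eigenvectors[:, k]          # k-th eigenvector (a scalar multiple of (3i, 1) or (i/2, 1))
    print(np.allclose(A @ v, eigenvalues[k] * v))   # True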
Example 6.5. Let V be the real vector space of infinitely often differentiable functions from R to R
and let D : V → V, f 7→ f 0 , denote differentiation (cf. 4.3(d)). Then for every λ ∈ R the eigenspace of
D with eigenvalue λ is of dimension 1 with basis given by the function expλ : R → R, t 7→ eλt .
Definition 6.6. (a) Let F , V and L : V → V be as in Def 6.1. We say that L is diagonalisable if there
exists a basis x1 , . . . , xn of V such that the matrix D representing L with respect to this basis is a
diagonal matrix.
(b) Let F be a field. We say that a square matrix A ∈ Mn×n (F ) is diagonalisable if the linear
transformation LA : F n → F n , x 7→ Ax, is diagonalisable.
Proposition 6.7. Let F , V and L : V → V be as in Def 6.1. Then L is diagonalisable if and only if V
has a basis x1 , . . . , xn consisting of eigenvectors of L.
Proof. "only if" part: Suppose L is diagonalisable; let D be a diagonal matrix with diagonal entries
λ1 , . . . , λn representing L with respect to a basis x1 , . . . , xn of V , let I := Ix1 ,...,xn : F n → V (see 4.13)
and let e1 , . . . , en denote the standard basis of F n .
=⇒ The diagram
            L
    V  −−−−−−→  V
   I ↑           ↑ I
            LD
   F n −−−−−−→  F n
commutes (see proof of 4.17)
=⇒ L(xi ) = L(I(ei )) = I(LD (ei )) = I(λi ei ) = λi xi
=⇒ x1 , . . . , xn are eigenvectors of L with eigenvalues λ1 , . . . , λn , respectively.
if part: Let x1 , . . . , xn be a basis of V consisting of eigenvectors of L and let λi ∈ F denote the eigenvalue
corresponding to xi .
Define the diagonal matrix D by
     ( λ1         0  )
D =  (     ·         )
     ( 0          λn ).
=⇒ D represents L with respect to x1 , . . . , xn because
L(c1 x1 + . . . + cn xn ) = c1 L(x1 ) + . . . + cn L(xn ) = λ1 c1 x1 + . . . + λn cn xn
and (λ1 c1 , . . . , λn cn ) = D(c1 , . . . , cn ) for all c1 , . . . , cn ∈ F .
Proposition 6.8. Let F be a field. Let A ∈ Mn×n (F ). Then A is diagonalisable if and only if there
exists an invertible matrix M ∈ GLn (F ) such that M −1 AM is a diagonal matrix. (In this case we say
that M diagonalises A.)
Example. Show that the matrix
A := (  0  −1  1 )
     ( −3  −2  3 )   ∈ M3×3 (R)
     ( −2  −2  3 )
is diagonalisable and find an invertible matrix M ∈ GL3 (R) that diagonalises it.
Solution.
pA (λ) = det(λI3 − A) = det ( λ   1      −1    )
                            ( 3   λ + 2  −3    )
                            ( 2   2      λ − 3 )
= λ(λ + 2)(λ − 3) + 1 · (−3) · 2 + (−1) · 3 · 2 − (−1)(λ + 2) · 2 − λ · (−3) · 2 − 1 · 3 · (λ − 3)
= λ(λ2 − λ − 6) − 6 − 6 + 2λ + 4 + 6λ − 3λ + 9
= λ3 − λ2 − λ + 1 = λ2 (λ − 1) − (λ − 1) = (λ2 − 1)(λ − 1) = (λ − 1)2 (λ + 1).
=⇒ Eigenvalues of A are 1 and −1.
Basis of E1 (A): We apply Gaussian elimination to
I3 − A = ( 1  1  −1 )   R2 ↦ R2 − 3R1   ( 1  1  −1 )
         ( 3  3  −3 )   R3 ↦ R3 − 2R1   ( 0  0   0 )
         ( 2  2  −2 )   ∼∼∼∼∼∼∼∼∼∼>     ( 0  0   0 )
=⇒ E1 (A) = {(b1 , b2 , b3 ) ∈ R3 : b1 + b2 − b3 = 0} = {(−b2 + b3 , b2 , b3 ) : b2 , b3 ∈ R}
= {b2 (−1, 1, 0) + b3 (1, 0, 1) : b2 , b3 ∈ R} = Span((−1, 1, 0), (1, 0, 1));
=⇒ Basis of E1 (A) is x1 := (−1, 1, 0), x2 := (1, 0, 1).
Basis of E−1 (A): We apply Gaussian elimination to
−I3 − A = ( −1  1  −1 )   R2 ↦ R2 + 3R1   ( −1  1  −1 )   R1 ↦ R1 − (1/4)R2   ( −1  0  1/2 )
          (  3  1  −3 )   R3 ↦ R3 + 2R1   (  0  4  −6 )   R3 ↦ R3 − R2        (  0  4  −6  )
          (  2  2  −4 )   ∼∼∼∼∼∼∼∼∼∼>     (  0  4  −6 )   ∼∼∼∼∼∼∼∼∼∼>         (  0  0   0  )
=⇒ E−1 (A) = {(b1 , b2 , b3 ) ∈ R3 : −b1 + (1/2)b3 = 0 and 4b2 − 6b3 = 0}
= {((1/2)b3 , (3/2)b3 , b3 ) : b3 ∈ R} = Span((1/2, 3/2, 1));
=⇒ Basis of E−1 (A) is x3 := (1/2, 3/2, 1).
One checks that x1 , x2 , x3 are linearly independent, hence form a basis of R3 consisting of eigenvectors
of A. By 6.7 and 6.8, A is therefore diagonalisable and the matrix
M := ( −1  1  1/2 )
     (  1  0  3/2 )
     (  0  1   1  )
with columns x1 , x2 , x3 diagonalises A: M −1 AM = diag(1, 1, −1).
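A quick numerical check of this conclusion (our added sketch, assuming NumPy):

import numpy as np

A = np.array([[0, -1, 1],
              [-3, -2, 3],
              [-2, -2, 3]], dtype=float)
M = np.array([[-1, 1, 0.5],
              [1, 0, 1.5],
              [0, 1, 1]], dtype=float)   # columns x1, x2, x3

print(np.linalg.inv(M) @ A @ M)           # approximately diag(1, 1, -1)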
(a) The algebraic multiplicity aλ (A) of λ is its multiplicity as a root of the characteristic polynomial
of A.
(b) The geometric multiplicity gλ (A) of λ is the dimension of the eigenspace Eλ (A).
(b) Let A = ( 1  1 )
            ( 0  1 )   ∈ M2×2 (F )
=⇒ pA (λ) = (λ − 1)2 and a basis of E1 (A) is (1, 0)
=⇒ a1 (A) = 2 but g1 (A) = 1.
Theorem 6.12. Let F be a field. Let A ∈ Mn×n (F ). Then A is diagonalisable if and only if the
characteristic polynomial of A splits into linear factors and the algebraic multiplicity equals the geometric
multiplicity for each eigenvalue of A.
Proof. Omitted
Example 6.13. Determine whether the matrix
A = ( 0  1 )
    ( −1 0 )
is diagonalisable when viewed as an element of M2×2 (R), of M2×2 (C) and of M2×2 (F2 ). If A is
diagonalisable then determine an invertible matrix M that diagonalises A.
Solution. pA (λ) = det ( λ  −1 )
                       ( 1   λ ) = λ2 + 1.
for R: λ2 + 1 does not split into linear factors
=⇒ as an element of M2×2 (R) the matrix A is not diagonalisable (by 6.12)
(A is a rotation by 90◦ ).
Theorem 6.14 (Cayley-Hamilton). Let F be a field, let A ∈ Mn×n (F ) and let pA be the characteristic
polynomial of A. Then pA (A) is the zero matrix.
Example 6.15. Let A = ( 0  1 )
                      ( −1 0 )   ∈ M2×2 (F )
=⇒ pA (λ) = λ2 + 1 (see 6.13)
=⇒ pA (A) = A2 + I2 = ( −1   0 )   ( 1  0 )   ( 0  0 )
                      (  0  −1 ) + ( 0  1 ) = ( 0  0 ).
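For a general matrix the Cayley–Hamilton theorem can be checked numerically before turning to the proof; the following sketch is an added illustration, using NumPy's characteristic-polynomial coefficients via np.poly:

import numpy as np

A = np.array([[0.0, -1.0, 1.0],
              [-3.0, -2.0, 3.0],
              [-2.0, -2.0, 3.0]])             # the matrix diagonalised above

coeffs = np.poly(A)                            # coefficients of p_A(lambda), leading coefficient first
n = A.shape[0]
P = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
print(np.allclose(P, np.zeros((n, n))))        # True: p_A(A) is the zero matrix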
Proof of 6.14. First Case: Let A be a diagonal matrix with diagonal entries a1 , . . . , an .
=⇒ pA (λ) = (λ − a1 ) · · · (λ − an )
=⇒ pA (A) = (A − a1 In ) · · · (A − an In ),
where A − ai In is the diagonal matrix with diagonal entries a1 − ai , . . . , ai−1 − ai , 0, ai+1 − ai , . . . , an − ai
= 0 (because the product of any two diagonal matrices with diagonal entries b1 , . . . , bn and c1 , . . . , cn ,
respectively, is the diagonal matrix with diagonal entries b1 c1 , . . . , bn cn ; here for every i the i-th factor
A − ai In has i-th diagonal entry 0, so every diagonal entry of the product vanishes).
Second Case: Let A be a diagonalisable matrix.
=⇒ ∃M ∈ GLn (F ) such that M −1 AM = D where D is a diagonal matrix (by 6.8).
Let pA (λ) = λn + an−1 λn−1 + . . . + a1 λ + a0 be the characteristic polynomial of A
=⇒ pA (A) = An + an−1 An−1 + . . . + a1 A + a0 In
= (M DM −1 )n + an−1 (M DM −1 )n−1 + . . . + a1 (M DM −1 ) + a0 In
(because A = M DM −1 )
= M Dn M −1 + an−1 M Dn−1 M −1 + . . . + a1 M DM −1 + a0 M M −1
(because (M DM −1 )k = (M DM −1 )(M DM −1 ) . . . (M DM −1 )
= M D(M −1 M )D(M −1 M ) . . . (M −1 M )DM −1 = M Dk M −1 )
= M (Dn + an−1 Dn−1 + . . . + a1 D + a0 In )M −1
(using distributivity of matrix multiplication)
= M pA (D)M −1
= M pD (D)M −1 (see below)
= M 0M −1 = 0 (by the first case).