Elements of
Abstract and Linear Algebra
E. H. Connell
E.H. Connell
Department of Mathematics
University of Miami
P.O. Box 249085
Coral Gables, Florida 33124 USA
ec@math.miami.edu
Mathematical Subject Classifications (1991): 12-01, 13-01, 15-01, 16-01, 20-01
© 1999 E.H. Connell
March 20, 2004
Introduction
In 1965 I ﬁrst taught an undergraduate course in abstract algebra. It was fun to
teach because the material was interesting and the class was outstanding. Five of
those students later earned a Ph.D. in mathematics. Since then I have taught the
course about a dozen times from various texts. Over the years I developed a set of
lecture notes and in 1985 I had them typed so they could be used as a text. They
now appear (in modiﬁed form) as the ﬁrst ﬁve chapters of this book. Here were some
of my motives at the time.
1) To have something as short and inexpensive as possible. In my experience,
students like short books.
2) To avoid all innovation. To organize the material in the most simpleminded
straightforward manner.
3) To order the material linearly. To the extent possible, each section should use
the previous sections and be used in the following sections.
4) To omit as many topics as possible. This is a foundational course, not a topics
course. If a topic is not used later, it should not be included. There are three
good reasons for this. First, linear algebra has top priority. It is better to go
forward and do more linear algebra than to stop and do more group and ring
theory. Second, it is more important that students learn to organize and write
proofs themselves than to cover more subject matter. Algebra is a perfect place
to get started because there are many “easy” theorems to prove. There are
many routine theorems stated here without proofs, and they may be considered
as exercises for the students. Third, the material should be so fundamental
that it be appropriate for students in the physical sciences and in computer
science. Zillions of students take calculus and cookbook linear algebra, but few
take abstract algebra courses. Something is wrong here, and one thing wrong
is that the courses try to do too much group and ring theory and not enough
matrix theory and linear algebra.
5) To oﬀer an alternative for computer science majors to the standard discrete
mathematics courses. Most of the material in the ﬁrst four chapters of this text
is covered in various discrete mathematics courses. Computer science majors
might beneﬁt by seeing this material organized from a purely mathematical
viewpoint.
Over the years I used the ﬁve chapters that were typed as a base for my algebra
courses, supplementing them as I saw ﬁt. In 1996 I wrote a sixth chapter, giving
enough material for a full ﬁrst year graduate course. This chapter was written in the
same “style” as the previous chapters, i.e., everything was right down to the nub. It
hung together pretty well except for the last two sections on determinants and dual
spaces. These were independent topics stuck on at the end. In the academic year
1997-98 I revised all six chapters and had them typed in LaTeX. This is the personal
background of how this book came about.
It is diﬃcult to do anything in life without help from friends, and many of my
friends have contributed to this text. My sincere gratitude goes especially to Marilyn
Gonzalez, Lourdes Robles, Marta Alpar, John Zweibel, Dmitry Gokhman, Brian
Coomes, Huseyin Kocak, and Shulim Kaliman. To these and all who contributed,
this book is fondly dedicated.
This book is a survey of abstract algebra with emphasis on linear algebra. It is
intended for students in mathematics, computer science, and the physical sciences.
The ﬁrst three or four chapters can stand alone as a one semester course in abstract
algebra. However they are structured to provide the background for the chapter on
linear algebra. Chapter 2 is the most diﬃcult part of the book because groups are
written in additive and multiplicative notation, and the concept of coset is confusing
at ﬁrst. After Chapter 2 the book gets easier as you go along. Indeed, after the
ﬁrst four chapters, the linear algebra follows easily. Finishing the chapter on linear
algebra gives a basic one year undergraduate course in abstract algebra. Chapter 6
continues the material to complete a ﬁrst year graduate course. Classes with little
background can do the ﬁrst three chapters in the ﬁrst semester, and chapters 4 and 5
in the second semester. More advanced classes can do four chapters the ﬁrst semester
and chapters 5 and 6 the second semester. As bare as the ﬁrst four chapters are, you
still have to truck right along to ﬁnish them in one semester.
The presentation is compact and tightly organized, but still somewhat informal.
The proofs of many of the elementary theorems are omitted. These proofs are to
be provided by the professor in class or assigned as homework exercises. There is a
nontrivial theorem stated without proof in Chapter 4, namely the determinant of the
product is the product of the determinants. For the proper ﬂow of the course, this
theorem should be assumed there without proof. The proof is contained in Chapter 6.
The Jordan form should not be considered part of Chapter 5. It is stated there only
as a reference for undergraduate courses. Finally, Chapter 6 is not written primarily
for reference, but as an additional chapter for more advanced courses.
This text is written with the conviction that it is more eﬀective to teach abstract
and linear algebra as one coherent discipline rather than as two separate ones. Teaching
abstract algebra and linear algebra as distinct courses results in a loss of synergy
and a loss of momentum. Also with this text the professor does not extract the course
from the text, but rather builds the course upon it. I am convinced it is easier to
build a course from a base than to extract it from a big book. Because after you
extract it, you still have to build it. The bare bones nature of this book adds to its
ﬂexibility, because you can build whatever course you want around it. Basic algebra
is a subject of incredible elegance and utility, but it requires a lot of organization.
This book is my attempt at that organization. Every eﬀort has been extended to
make the subject move rapidly and to make the ﬂow from one topic to the next as
seamless as possible. The student has limited time during the semester for serious
study, and this time should be allocated with care. The professor picks which topics
to assign for serious study and which ones to “wave arms at”. The goal is to stay
focused and go forward, because mathematics is learned in hindsight. I would have
made the book shorter, but I did not have any more time.
When using this text, the student already has the outline of the next lecture, and
each assignment should include the study of the next few pages. Study forward, not
just back. A few minutes of preparation does wonders to leverage classroom learning,
and this book is intended to be used in that manner. The purpose of class is to
learn, not to do transcription work. When students come to class cold and spend
the period taking notes, they participate little and learn little. This leads to a dead
class and also to the bad psychology of “OK, I am here, so teach me the subject.”
Mathematics is not taught, it is learned, and many students never learn how to learn.
Professors should give more direction in that regard.
Unfortunately mathematics is a diﬃcult and heavy subject. The style and
approach of this book is to make it a little lighter. This book works best when
viewed lightly and read as a story. I hope the students and professors who try it,
enjoy it.
E. H. Connell
Department of Mathematics
University of Miami
Coral Gables, FL 33124
ec@math.miami.edu
Outline
Chapter 1  Background and Fundamentals of Mathematics
    Sets, Cartesian products
    Relations, partial orderings, Hausdorff maximality principle, equivalence relations
    Functions, bijections, strips, solutions of equations, right and left inverses, projections
    Notation for the logic of mathematics
    Integers, subgroups, unique factorization

Chapter 2  Groups
    Groups, scalar multiplication for additive groups
    Subgroups, order, cosets
    Normal subgroups, quotient groups, the integers mod n
    Homomorphisms
    Permutations, the symmetric groups
    Product of groups

Chapter 3  Rings
    Rings
    Units, domains, fields
    The integers mod n
    Ideals and quotient rings
    Homomorphisms
    Polynomial rings
    Product of rings
    The Chinese remainder theorem
    Characteristic
    Boolean rings

Chapter 4  Matrices and Matrix Rings
    Addition and multiplication of matrices, invertible matrices
    Transpose
    Triangular, diagonal, and scalar matrices
    Elementary operations and elementary matrices
    Systems of equations
    Determinants, the classical adjoint
    Similarity, trace, and characteristic polynomial

Chapter 5  Linear Algebra
    Modules, submodules
    Homomorphisms
    Homomorphisms on Rⁿ
    Cosets and quotient modules
    Products and coproducts
    Summands
    Independence, generating sets, and free basis
    Characterization of free modules
    Uniqueness of dimension
    Change of basis
    Vector spaces, square matrices over fields, rank of a matrix
    Geometric interpretation of determinant
    Linear functions approximate differentiable functions locally
    The transpose principle
    Nilpotent homomorphisms
    Eigenvalues, characteristic roots
    Jordan canonical form
    Inner product spaces, Gram-Schmidt orthonormalization
    Orthogonal matrices, the orthogonal group
    Diagonalization of symmetric matrices

Chapter 6  Appendix
    The Chinese remainder theorem
    Prime and maximal ideals and UFDs
    Splitting short exact sequences
    Euclidean domains
    Jordan blocks
    Jordan canonical form
    Determinants
    Dual spaces
    1   2   3   4
    5   6   7   8
    9  11  10

[The tile puzzle frame: 12 spaces, the twelfth vacant.]
Abstract algebra is not only a major subject of science, but it is also
magic and fun. Abstract algebra is not all work and no play, and it is
certainly not a dull boy. See, for example, the neat card trick on page
18. This trick is based, not on sleight of hand, but rather on a theorem
in abstract algebra. Anyone can do it, but to understand it you need
some group theory. And before beginning the course, you might ﬁrst try
your skills on the famous (some would say infamous) tile puzzle. In this
puzzle, a frame has 12 spaces, the ﬁrst 11 with numbered tiles and the
last vacant. The last two tiles are out of order. Is it possible to slide the
tiles around to get them all in order, and end again with the last space
vacant? After giving up on this, you can study permutation groups and
learn the answer!
Chapter 1
Background and Fundamentals of
Mathematics
This chapter is fundamental, not just for algebra, but for all fields related to
mathematics. The basic concepts are products of sets, partial orderings, equivalence
relations, functions, and the integers. An equivalence relation on a set A is shown to be
simply a partition of A into disjoint subsets. There is an emphasis on the concept
of function, and the properties of surjective, injective, and bijective. The notion of a
solution of an equation is central in mathematics, and most properties of functions
can be stated in terms of solutions of equations. In elementary courses the section
on the Hausdorﬀ Maximality Principle should be ignored. The ﬁnal section gives a
proof of the unique factorization theorem for the integers.
Notation Mathematics has its own universally accepted shorthand. The symbol
∃ means “there exists” and ∃! means “there exists a unique”. The symbol ∀ means
“for each” and ⇒ means “implies”. Some sets (or collections) are so basic they have
their own proprietary symbols. Five of these are listed below.
N = Z⁺ = the set of positive integers = {1, 2, 3, ...}
Z = the ring of integers = {..., −2, −1, 0, 1, 2, ...}
Q = the field of rational numbers = {a/b : a, b ∈ Z, b ≠ 0}
R = the field of real numbers
C = the field of complex numbers = {a + bi : a, b ∈ R}   (i² = −1)
Sets Suppose A, B, C,... are sets. We use the standard notation for intersection
and union.
A ∩ B = {x : x ∈ A and x ∈ B} = the set of all x which are elements
of A and B.

A ∪ B = {x : x ∈ A or x ∈ B} = the set of all x which are elements of
A or B.
Any set called an index set is assumed to be nonvoid. Suppose T is an index set and
for each t ∈ T, Aₜ is a set.

⋃_{t∈T} Aₜ = {x : ∃ t ∈ T with x ∈ Aₜ}

⋂_{t∈T} Aₜ = {x : if t ∈ T, x ∈ Aₜ} = {x : ∀ t ∈ T, x ∈ Aₜ}
Let ∅ be the null set. If A ∩ B = ∅, then A and B are said to be disjoint.
Definition Suppose each of A and B is a set. The statement that A is a subset
of B (A ⊂ B) means that if a is an element of A, then a is an element of B. That
is, a ∈ A ⇒ a ∈ B. If A ⊂ B we may say A is contained in B, or B contains A.
Exercise Suppose each of A and B is a set. The statement that A is not a subset
of B means ________.
Theorem (De Morgan’s laws) Suppose S is a set. If C ⊂ S (i.e., if C is a subset
of S), let C′, the complement of C in S, be defined by C′ = S − C = {x ∈ S : x ∉ C}.
Then for any A, B ⊂ S,

(A ∩ B)′ = A′ ∪ B′    and    (A ∪ B)′ = A′ ∩ B′
Cartesian Products If X and Y are sets, X × Y = {(x, y) : x ∈ X and y ∈ Y}.
In other words, the Cartesian product of X and Y is defined to be the set of all
ordered pairs whose first term is in X and whose second term is in Y.

Example R × R = R² = the plane.
Definition If each of X₁, ..., Xₙ is a set, X₁ × ··· × Xₙ =
{(x₁, ..., xₙ) : xᵢ ∈ Xᵢ for 1 ≤ i ≤ n} = the set of all ordered n-tuples whose
i-th term is in Xᵢ.

Example R × ··· × R = Rⁿ = real n-space.

Question Is (R × R²) = (R² × R) = R³?
Relations
If A is a nonvoid set, a nonvoid subset R ⊂ A × A is called a relation on A. If
(a, b) ∈ R we say that a is related to b, and we write this fact by the expression a ∼ b.
Here are several properties which a relation may possess.
1) If a ∈ A, then a ∼ a. (reflexive)

2) If a ∼ b, then b ∼ a. (symmetric)

2′) If a ∼ b and b ∼ a, then a = b. (antisymmetric)

3) If a ∼ b and b ∼ c, then a ∼ c. (transitive)

Definition A relation which satisfies 1), 2′), and 3) is called a partial ordering.
In this case we write a ∼ b as a ≤ b. Then

1) If a ∈ A, then a ≤ a.

2′) If a ≤ b and b ≤ a, then a = b.

3) If a ≤ b and b ≤ c, then a ≤ c.
Deﬁnition A linear ordering is a partial ordering with the additional property
that, if a, b ∈ A, then a ≤ b or b ≤ a.
Example A = R with the ordinary ordering, is a linear ordering.
Example A = all subsets of R², with a ≤ b defined by a ⊂ b, is a partial
ordering.
Hausdorff Maximality Principle (HMP) Suppose S is a nonvoid subset of A
and ∼ is a relation on A. This defines a relation on S. If the relation satisfies any
of the properties 1), 2), 2′), or 3) on A, the relation also satisfies these properties
when restricted to S. In particular, a partial ordering on A defines a partial ordering
on S. However the ordering may be linear on S but not linear on A. The HMP is
that any linearly ordered subset of a partially ordered set is contained in a maximal
linearly ordered subset.
Exercise Define a relation on A = R² by (a, b) ∼ (c, d) provided a ≤ c and
b ≤ d. Show this is a partial ordering which is linear on S = {(a, a) : a < 0}. Find
at least two maximal linearly ordered subsets of R² which contain S.
One of the most useful applications of the HMP is to obtain maximal monotonic
collections of subsets.
Deﬁnition A collection of sets is said to be monotonic if, given any two sets of
the collection, one is contained in the other.
Corollary to HMP Suppose X is a nonvoid set and A is some nonvoid
collection of subsets of X, and S is a subcollection of A which is monotonic. Then ∃
a maximal monotonic subcollection of A which contains S.
Proof Deﬁne a partial ordering on A by V ≤ W iﬀ V ⊂ W, and apply HMP.
The HMP is used twice in this book. First, to show that inﬁnitely generated
vector spaces have free bases, and second, in the Appendix, to show that rings have
maximal ideals (see pages 87 and 109). In each of these applications, the maximal
monotonic subcollection will have a maximal element. In elementary courses, these
results may be assumed, and thus the HMP may be ignored.
Equivalence Relations A relation satisfying properties 1), 2), and 3) is called
an equivalence relation.
Exercise Deﬁne a relation on A = Z by n ∼ m iﬀ n − m is a multiple of 3.
Show this is an equivalence relation.
Definition If ∼ is an equivalence relation on A and a ∈ A, we define the
equivalence class containing a by cl(a) = {x ∈ A : a ∼ x}.
Theorem
1) If b ∈ cl(a) then cl(b) = cl(a). Thus we may speak of a subset of A
being an equivalence class with no mention of any element contained
in it.
2) If each of U, V ⊂ A is an equivalence class and U ∩ V ≠ ∅, then
U = V.
3) Each element of A is an element of one and only one equivalence class.
Deﬁnition A partition of A is a collection of disjoint nonvoid subsets whose union
is A. In other words, a collection of nonvoid subsets of A is a partition of A provided
any a ∈ A is an element of one and only one subset of the collection. Note that if A
has an equivalence relation, the equivalence classes form a partition of A.
Theorem Suppose A is a nonvoid set with a partition. Deﬁne a relation on A by
a ∼ b iﬀ a and b belong to the same subset of the partition. Then ∼ is an equivalence
relation, and the equivalence classes are just the subsets of the partition.
Summary There are two ways of viewing an equivalence relation — one is as a
relation on A satisfying 1), 2), and 3), and the other is as a partition of A into
disjoint subsets.
Exercise Define an equivalence relation on Z by n ∼ m iff n − m is a multiple
of 3. What are the equivalence classes?
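The partition in this exercise can be checked computationally. A small Python sketch (not part of the text) recovers the three classes on a finite window of Z:

```python
# Equivalence classes of n ~ m iff n - m is a multiple of 3,
# computed on a finite window of Z.
window = range(-9, 10)

def cl(a):
    """The equivalence class of a, restricted to the window."""
    return {x for x in window if (a - x) % 3 == 0}

classes = {frozenset(cl(a)) for a in window}
print(len(classes))        # 3 classes: 3Z, 3Z + 1, and 3Z + 2
print(sorted(cl(0))[:4])   # [-9, -6, -3, 0]
```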
Exercise Is there a relation on R satisfying 1), 2), 2′), and 3)? That is, is there
an equivalence relation on R which is also a partial ordering?
Exercise Let H ⊂ R² be the line H = {(a, 2a) : a ∈ R}. Consider the collection
of all translates of H, i.e., all lines in the plane with slope 2. Find the equivalence
relation on R² defined by this partition of R².
Functions
Just as there are two ways of viewing an equivalence relation, there are two ways
of deﬁning a function. One is the “intuitive” deﬁnition, and the other is the “graph”
or “ordered pairs” deﬁnition. In either case, domain and range are inherent parts of
the deﬁnition. We use the “intuitive” deﬁnition because everyone thinks that way.
Definition If X and Y are (nonvoid) sets, a function or mapping or map with
domain X and range Y, is an ordered triple (X, Y, f) where f assigns to each x ∈ X
a well defined element f(x) ∈ Y. The statement that (X, Y, f) is a function is written
as f : X → Y or X --f--> Y.
Definition The graph of a function (X, Y, f) is the subset Γ ⊂ X × Y defined
by Γ = {(x, f(x)) : x ∈ X}. The connection between the “intuitive” and “graph”
viewpoints is given in the next theorem.

Theorem If f : X → Y, then the graph Γ ⊂ X × Y has the property that each
x ∈ X is the first term of one and only one ordered pair in Γ. Conversely, if Γ is a
subset of X × Y with the property that each x ∈ X is the first term of one and only
one ordered pair in Γ, then ∃! f : X → Y whose graph is Γ. The function is defined by
“f(x) is the second term of the ordered pair in Γ whose first term is x.”
Example Identity functions Here X = Y and f : X → X is defined by
f(x) = x for all x ∈ X. The identity on X is denoted by I_X or just I : X → X.

Example Constant functions Suppose y₀ ∈ Y. Define f : X → Y by
f(x) = y₀ for all x ∈ X.
Restriction Given f : X → Y and a nonvoid subset S of X, define f | S : S → Y
by (f | S)(s) = f(s) for all s ∈ S.

Inclusion If S is a nonvoid subset of X, define the inclusion i : S → X by
i(s) = s for all s ∈ S. Note that inclusion is a restriction of the identity.
Composition Given W --f--> X --g--> Y define g ◦ f : W → Y by
(g ◦ f)(x) = g(f(x)).

Theorem (The associative law of composition) If V --f--> W --g--> X --h--> Y,
then h ◦ (g ◦ f) = (h ◦ g) ◦ f. This may be written as h ◦ g ◦ f.
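The associative law can be checked pointwise. A minimal Python sketch with three illustrative maps on the integers (the particular functions are not from the text):

```python
# Three composable maps V -> W -> X -> Y, all on the integers here.
f = lambda v: v + 1        # V -> W
g = lambda w: 2 * w        # W -> X
h = lambda x: x * x        # X -> Y

def compose(p, q):
    """Return the composition p(q(x))."""
    return lambda x: p(q(x))

left = compose(h, compose(g, f))    # h o (g o f)
right = compose(compose(h, g), f)   # (h o g) o f
print(all(left(v) == right(v) for v in range(-5, 6)))   # True
```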
Definitions Suppose f : X → Y.

1) If T ⊂ Y, the inverse image of T is a subset of X, f⁻¹(T) = {x ∈ X :
f(x) ∈ T}.

2) If S ⊂ X, the image of S is a subset of Y, f(S) = {f(s) : s ∈ S} =
{y ∈ Y : ∃ s ∈ S with f(s) = y}.

3) The image of f is the image of X, i.e., image(f) = f(X) =
{f(x) : x ∈ X} = {y ∈ Y : ∃ x ∈ X with f(x) = y}.

4) f : X → Y is surjective or onto provided image(f) = Y, i.e., the image
is the range, i.e., if y ∈ Y, f⁻¹(y) is a nonvoid subset of X.

5) f : X → Y is injective or 1-1 provided (x₁ ≠ x₂) ⇒ f(x₁) ≠ f(x₂), i.e.,
if x₁ and x₂ are distinct elements of X, then f(x₁) and f(x₂) are
distinct elements of Y.

6) f : X → Y is bijective or is a 1-1 correspondence provided f is surjective
and injective. In this case, there is a function f⁻¹ : Y → X with f⁻¹ ◦ f =
I_X : X → X and f ◦ f⁻¹ = I_Y : Y → Y. Note that f⁻¹ : Y → X is
also bijective and (f⁻¹)⁻¹ = f.
Examples

1) f : R → R defined by f(x) = sin(x) is neither surjective nor injective.

2) f : R → [−1, 1] defined by f(x) = sin(x) is surjective but not injective.

3) f : [0, π/2] → R defined by f(x) = sin(x) is injective but not surjective.

4) f : [0, π/2] → [0, 1] defined by f(x) = sin(x) is bijective. (f⁻¹(x) is
written as arcsin(x) or sin⁻¹(x).)

5) f : R → (0, ∞) defined by f(x) = eˣ is bijective. (f⁻¹(x) is written as
ln(x).)
Note There is no such thing as “the function sin(x).” A function is not deﬁned
unless the domain and range are speciﬁed.
Exercise Show there are natural bijections from (R × R²) to (R² × R) and
from (R² × R) to R × R × R. These three sets are disjoint, but the bijections
between them are so natural that we sometimes identify them.
Exercise Suppose X is a set with 6 elements and Y is a ﬁnite set with n elements.
1) There exists an injective f : X → Y iff n ________.

2) There exists a surjective f : X → Y iff n ________.

3) There exists a bijective f : X → Y iff n ________.
Pigeonhole Principle Suppose X is a ﬁnite set with m elements, Y is a ﬁnite
set with n elements, and f : X →Y is a function.
1) If m = n, then f is injective iﬀ f is surjective iﬀ f is bijective.
2) If m > n, then f is not injective.
3) If m < n, then f is not surjective.
If you are placing 6 pigeons in 6 holes, and you run out of pigeons before you ﬁll
the holes, then you have placed 2 pigeons in one hole. In other words, in part 1) for
m = n = 6, if f is not surjective then f is not injective. Of course, the pigeonhole
principle does not hold for inﬁnite sets, as can be seen by the following exercise.
Exercise Show there is a function f : Z⁺ → Z⁺ which is injective but not
surjective. Also show there is one which is surjective but not injective.
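One standard pair of witnesses for this exercise, checked on an initial segment of Z⁺ in a short Python sketch (the particular f and g are one possible choice, not the only one):

```python
# f(n) = n + 1 is injective but not surjective (nothing maps to 1).
# g(1) = 1 and g(n) = n - 1 otherwise is surjective but not injective.
f = lambda n: n + 1
g = lambda n: 1 if n == 1 else n - 1

N = range(1, 101)                                # initial segment of Z+
assert len({f(n) for n in N}) == 100             # f is 1-1 on the segment
assert 1 not in {f(n) for n in N}                # 1 is never a value of f
assert {g(n) for n in N} == set(range(1, 100))   # g covers 1, ..., 99
assert g(1) == g(2)                              # g is not 1-1
print("checked on 1, ..., 100")
```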
Exercise Suppose f : [−2, 2] → R is defined by f(x) = x². Find f⁻¹(f([1, 2])).
Also find f(f⁻¹([3, 5])).
Exercise Suppose f : X → Y is a function, S ⊂ X and T ⊂ Y. Find the
relationship between S and f⁻¹(f(S)). Show that if f is injective, S = f⁻¹(f(S)).
Also find the relationship between T and f(f⁻¹(T)). Show that if f is surjective,
T = f(f⁻¹(T)).
Strips We now define the vertical and horizontal strips of X × Y.

If x₀ ∈ X, {(x₀, y) : y ∈ Y} = (x₀ × Y) is called a vertical strip.

If y₀ ∈ Y, {(x, y₀) : x ∈ X} = (X × y₀) is called a horizontal strip.
Theorem Suppose S ⊂ X × Y. The subset S is the graph of a function with
domain X and range Y iff each vertical strip intersects S in exactly one point.
This is just a restatement of the property of a graph of a function. The purpose
of the next theorem is to restate properties of functions in terms of horizontal strips.
Theorem Suppose f : X →Y has graph Γ. Then
1) Each horizontal strip intersects Γ in at least one point iff f is ________.

2) Each horizontal strip intersects Γ in at most one point iff f is ________.

3) Each horizontal strip intersects Γ in exactly one point iff f is ________.
Solutions of Equations Now we restate these properties in terms of solutions of
equations. Suppose f : X → Y and y₀ ∈ Y. Consider the equation f(x) = y₀. Here
y₀ is given and x is considered to be a “variable”. A solution to this equation is any
x₀ ∈ X with f(x₀) = y₀. Note that the set of all solutions to f(x) = y₀ is f⁻¹(y₀).
Also f(x) = y₀ has a solution iff y₀ ∈ image(f) iff f⁻¹(y₀) is nonvoid.
Theorem Suppose f : X → Y.

1) The equation f(x) = y₀ has at least one solution for each y₀ ∈ Y iff
f is ________.

2) The equation f(x) = y₀ has at most one solution for each y₀ ∈ Y iff
f is ________.

3) The equation f(x) = y₀ has a unique solution for each y₀ ∈ Y iff
f is ________.
Right and Left Inverses One way to understand functions is to study right and
left inverses, which are deﬁned after the next theorem.
Theorem Suppose X --f--> Y --g--> W are functions.
1) If g ◦ f is injective, then f is injective.
2) If g ◦ f is surjective, then g is surjective.
3) If g ◦ f is bijective, then f is injective and g is surjective.
Example X = W = {p}, Y = {p, q}, f(p) = p, and g(p) = g(q) = p. Here
g ◦ f is the identity, but f is not surjective and g is not injective.
Definition Suppose f : X → Y is a function. A left inverse of f is a function
g : Y → X such that g ◦ f = I_X : X → X. A right inverse of f is a function
h : Y → X such that f ◦ h = I_Y : Y → Y.
Theorem Suppose f : X →Y is a function.
1) f has a right inverse iﬀ f is surjective. Any such right inverse must be
injective.
2) f has a left inverse iﬀ f is injective. Any such left inverse must be
surjective.
Corollary Suppose each of X and Y is a nonvoid set. Then ∃ an injective
f : X → Y iff ∃ a surjective g : Y → X. Also a function from X to Y is bijective
iff it has a left inverse and a right inverse, and this holds iff some single function is
both a left and a right inverse.
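For finite sets the inverses in the theorem above can be constructed explicitly. A Python sketch with illustrative sets and maps:

```python
# A surjection f : X -> Y with an explicit right inverse h : Y -> X.
X = {1, 2, 3, 4}
Y = {'a', 'b'}
f = {1: 'a', 2: 'a', 3: 'b', 4: 'b'}

# For each y, choose one element of f^-1(y); for infinite sets this
# choice is exactly what the Axiom of Choice provides.
h = {y: min(x for x in X if f[x] == y) for y in Y}
assert all(f[h[y]] == y for y in Y)   # f o h is the identity on Y

# An injection g : Y -> X with a left inverse k : X -> Y.
g = {'a': 1, 'b': 3}
k = {x: next((y for y in Y if g[y] == x), 'a') for x in X}   # 'a' off image(g)
assert all(k[g[y]] == y for y in Y)   # k o g is the identity on Y
print(sorted(h.items()))   # [('a', 1), ('b', 3)]
```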
Note The Axiom of Choice is not discussed in this book. However, if you worked
1) of the theorem above, you unknowingly used one version of it. For completeness,
we state this part of 1) again.
The Axiom of Choice If f : X → Y is surjective, then f has a right inverse
h. That is, for each y ∈ Y, it is possible to choose an x ∈ f⁻¹(y) and thus to define
h(y) = x.
Note It is a classical theorem in set theory that the Axiom of Choice and the
Hausdorﬀ Maximality Principle are equivalent. However in this text we do not go
that deeply into set theory. For our purposes it is assumed that the Axiom of Choice
and the HMP are true.
Exercise Suppose f : X → Y is a function. Define a relation on X by a ∼ b if
f(a) = f(b). Show this is an equivalence relation. If y belongs to the image of f,
then f⁻¹(y) is an equivalence class and every equivalence class is of this form. In the
next chapter where f is a group homomorphism, these equivalence classes will be
called cosets.
Projections If X₁ and X₂ are nonvoid sets, we define the projection maps
π₁ : X₁ × X₂ → X₁ and π₂ : X₁ × X₂ → X₂ by πᵢ(x₁, x₂) = xᵢ.
Theorem If Y, X₁, and X₂ are nonvoid sets, there is a 1-1 correspondence
between {functions f : Y → X₁ × X₂} and {ordered pairs of functions (f₁, f₂)
where f₁ : Y → X₁ and f₂ : Y → X₂}.
Proof Given f, define f₁ = π₁ ◦ f and f₂ = π₂ ◦ f. Given f₁ and f₂ define
f : Y → X₁ × X₂ by f(y) = (f₁(y), f₂(y)). Thus a function from Y to X₁ × X₂ is
merely a pair of functions from Y to X₁ and Y to X₂. This concept is displayed in
the diagram below. It is summarized by the equation f = (f₁, f₂).
[Diagram: f : Y → X₁ × X₂ together with f₁ : Y → X₁, f₂ : Y → X₂ and the
projections π₁, π₂; the triangle commutes, with f₁ = π₁ ◦ f and f₂ = π₂ ◦ f.]
One nice thing about this concept is that it works ﬁne for inﬁnite Cartesian
products.
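The correspondence f = (f₁, f₂) is easy to express in code. A Python sketch with illustrative component functions:

```python
# Passing between f : Y -> X1 x X2 and the pair (f1, f2).
f1 = lambda y: y + 1      # f1 : Y -> X1
f2 = lambda y: y * y      # f2 : Y -> X2

pi1 = lambda p: p[0]      # projections of X1 x X2
pi2 = lambda p: p[1]

f = lambda y: (f1(y), f2(y))   # f = (f1, f2)
# Composing with the projections recovers the components.
assert all(pi1(f(y)) == f1(y) and pi2(f(y)) == f2(y) for y in range(10))
print(f(3))   # (4, 9)
```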
Definition Suppose T is an index set and for each t ∈ T, Xₜ is a nonvoid set.
Then the product ∏_{t∈T} Xₜ = ∏ Xₜ is the collection of all sequences
{xₜ}_{t∈T} = {xₜ} where xₜ ∈ Xₜ. Formally these sequences are functions α from T
to ⋃ Xₜ with each α(t) in Xₜ and written as α(t) = xₜ. If T = {1, 2, ..., n} then
{xₜ} is the ordered n-tuple (x₁, x₂, ..., xₙ). If T = Z⁺ then {xₜ} is the sequence
(x₁, x₂, ...). For any T and any s in T, the projection map πₛ : ∏ Xₜ → Xₛ is
defined by πₛ({xₜ}) = xₛ.
Theorem If Y is any nonvoid set, there is a 1-1 correspondence between
{functions f : Y → ∏ Xₜ} and {sequences of functions {fₜ}_{t∈T} where fₜ : Y → Xₜ}.
Given f, the sequence {fₜ} is defined by fₜ = πₜ ◦ f. Given {fₜ}, f is defined by
f(y) = {fₜ(y)}.
A Calculus Exercise Let A be the collection of all functions f : [0, 1] → R
which have an infinite number of derivatives. Let A₀ ⊂ A be the subcollection of
those functions f with f(0) = 0. Define D : A₀ → A by D(f) = df/dx. Use the mean
value theorem to show that D is injective. Use the fundamental theorem of calculus
to show that D is surjective.
Exercise This exercise is not used elsewhere in this text and may be omitted. It
is included here for students who wish to do a little more set theory. Suppose T is a
nonvoid set.
1) If Y is a nonvoid set, define Y^T to be the collection of all functions with domain
T and range Y. Show that if T and Y are finite sets with m and n elements, then
Y^T has nᵐ elements. In particular, when T = {1, 2, 3}, Y^T = Y × Y × Y has
n³ elements. Show that if n ≥ 3, the subset of Y^{1,2,3} of all injective functions has
n(n − 1)(n − 2) elements. These injective functions are called permutations on Y
taken 3 at a time. If T = N, then Y^T is the infinite product Y × Y × ···. That is,
Y^N is the set of all infinite sequences (y₁, y₂, ...) where each yᵢ ∈ Y. For any Y and
T, let Yₜ be a copy of Y for each t ∈ T. Then Y^T = ∏_{t∈T} Yₜ.
2) Suppose each of Y₁ and Y₂ is a nonvoid set. Show there is a natural bijection
from (Y₁ × Y₂)^T to Y₁^T × Y₂^T. (This is the fundamental property of Cartesian
products presented in the two previous theorems.)
3) Define P(T), the power set of T, to be the collection of all subsets of T (including
the null set). Show that if T is a finite set with m elements, P(T) has 2ᵐ elements.
4) If S is any subset of T, define its characteristic function χ_S : T → {0, 1} by
letting χ_S(t) be 1 when t ∈ S, and be 0 when t ∉ S. Define α : P(T) → {0, 1}^T by
α(S) = χ_S. Define β : {0, 1}^T → P(T) by β(f) = f⁻¹(1). Show that if S ⊂ T then
β ◦ α(S) = S, and if f : T → {0, 1} then α ◦ β(f) = f. Thus α is a bijection and
β = α⁻¹.

P(T) ←→ {0, 1}^T
5) Suppose γ : T → {0, 1}^T is a function and show that it cannot be surjective. If
t ∈ T, denote γ(t) by γ(t) = fₜ : T → {0, 1}. Define f : T → {0, 1} by f(t) = 0 if
fₜ(t) = 1, and f(t) = 1 if fₜ(t) = 0. Show that f is not in the image of γ and thus
γ cannot be surjective. This shows that if T is an infinite set, then the set {0, 1}^T
represents a “higher order of infinity than T”.
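The diagonal construction of part 5) also runs on a finite T, where it exhibits a function missed by any particular γ. A Python sketch with an illustrative γ:

```python
T = (0, 1, 2, 3)

# An arbitrary gamma : T -> {0, 1}^T; each gamma[t] is a function on T,
# written as a tuple of bits indexed by T.
gamma = {0: (0, 0, 0, 0),
         1: (1, 1, 1, 1),
         2: (0, 1, 0, 1),
         3: (1, 0, 1, 0)}

# The diagonal function disagrees with gamma[t] at t itself ...
f = tuple(1 - gamma[t][t] for t in T)
# ... so it cannot equal any gamma[t].
assert all(f != gamma[t] for t in T)
print(f)   # (1, 0, 1, 1)
```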
6) An infinite set Y is said to be countable if there is a bijection from the positive
integers N to Y. Show Q is countable but the following three collections are not.

i) P(N), the collection of all subsets of N.
ii) {0, 1}^N, the collection of all functions f : N → {0, 1}.
iii) The collection of all sequences (y₁, y₂, ...) where each yᵢ is 0 or 1.

We know that ii) and iii) are equal and there is a natural bijection between i)
and ii). We also know there is no surjective map from N to {0, 1}^N, i.e., {0, 1}^N is
uncountable. Finally, show there is a bijection from {0, 1}^N to the real numbers R.
(This is not so easy. To start with, you have to decide what the real numbers are.)
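One concrete way to enumerate the positive rationals, and hence see that they are countable, is the Calkin–Wilf recurrence x ↦ 1/(2⌊x⌋ + 1 − x), a known result not developed in this text: starting from 1, every positive rational appears exactly once. A Python sketch:

```python
from fractions import Fraction
from math import floor

def calkin_wilf(count):
    """First `count` terms; each positive rational appears exactly once."""
    x = Fraction(1)
    terms = []
    for _ in range(count):
        terms.append(x)
        x = 1 / (2 * floor(x) + 1 - x)   # Calkin-Wilf step
    return terms

print([str(t) for t in calkin_wilf(7)])
# ['1', '1/2', '2', '1/3', '3/2', '2/3', '3']
assert len(set(calkin_wilf(100))) == 100   # no repeats among the first 100
```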
Notation for the Logic of Mathematics
Each of the words “Lemma”, “Theorem”, and “Corollary” means “true state
ment”. Suppose A and B are statements. A theorem may be stated in any of the
following ways:
Theorem Hypothesis Statement A.
Conclusion Statement B.
Theorem Suppose A is true. Then B is true.
Theorem If A is true, then B is true.
Theorem A ⇒B (A implies B ).
There are two ways to prove the theorem — to suppose A is true and show B is
true, or to suppose B is false and show A is false. The expressions “A ⇔ B”, “A is
equivalent to B”, and “A is true iﬀ B is true ” have the same meaning (namely, that
A ⇒ B and B ⇒ A).
The important thing to remember is that thoughts and expressions ﬂow through
the language. Mathematical symbols are shorthand for phrases and sentences in the
English language. For example, "x ∈ B" means "x is an element of the set B." If A is the statement "x ∈ Z^+" and B is the statement "x^2 ∈ Z^+", then "A ⇒ B" means "If x is a positive integer, then x^2 is a positive integer".
Mathematical Induction is based upon the fact that if S ⊂ Z^+ is a nonvoid
subset, then S contains a smallest element.
Theorem Suppose P(n) is a statement for each n = 1, 2, ... . Suppose P(1) is
true and for each n ≥ 1, P(n) ⇒ P(n + 1). Then for each n ≥ 1, P(n) is true.
Proof   If the theorem is false, then ∃ a smallest positive integer m such that P(m) is false. Since P(1) is true, m > 1. But then P(m − 1) is true, so P(m) is true, and this is impossible.
Exercise   Use induction to show that, for each n ≥ 1, 1 + 2 + ··· + n = n(n + 1)/2.
The Integers
In this section, lower case letters a, b, c, ... will represent integers, i.e., elements
of Z. Here we will establish the following three basic properties of the integers.
1) If G is a subgroup of Z, then ∃ n ≥ 0 such that G = nZ.
2) If a and b are integers, not both zero, and G is the collection of all linear
combinations of a and b, then G is a subgroup of Z, and its
positive generator is the greatest common divisor of a and b.
3) If n ≥ 2, then n factors uniquely as the product of primes.
All of this will follow from long division, which we now state formally.
Euclidean Algorithm   Given a, b with b ≠ 0, ∃! m and r with 0 ≤ r < |b| and a = bm + r. In other words, b divides a "m times with a remainder of r". For example, if a = −17 and b = 5, then m = −4 and r = 3, i.e., −17 = 5(−4) + 3.
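This statement can be checked directly in, say, Python. Note that Python's `divmod` gives the remainder the sign of the divisor, so a small adjustment is needed when b < 0 to guarantee 0 ≤ r < |b| (the function name is ours, for illustration only):

```python
def euclid(a, b):
    """Return (m, r) with a == b*m + r and 0 <= r < abs(b)."""
    if b == 0:
        raise ValueError("b must be nonzero")
    m, r = divmod(a, b)
    if r < 0:            # Python gives r the sign of b; fix when b < 0
        m, r = m + 1, r - b
    return m, r

# The example from the text: a = -17, b = 5 gives m = -4, r = 3.
```

Uniqueness of (m, r) is forced by the requirement 0 ≤ r < |b|: two valid remainders would differ by a nonzero multiple of b yet both lie in [0, |b|).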
Definition   If r = 0, we say that b divides a or a is a multiple of b. This fact is written as b | a. Note that b | a ⇔ the rational number a/b is an integer ⇔ ∃! m such that a = bm ⇔ a ∈ bZ.
Note   Anything (except 0) divides 0. 0 does not divide anything. ±1 divides anything. If n ≠ 0, the set of integers which n divides is nZ = {nm : m ∈ Z} = {..., −2n, −n, 0, n, 2n, ...}. Also n divides a and b with the same remainder iff n divides (a − b).
Definition   A nonvoid subset G ⊂ Z is a subgroup provided (g ∈ G ⇒ −g ∈ G) and (g_1, g_2 ∈ G ⇒ (g_1 + g_2) ∈ G). We say that G is closed under negation and closed under addition.
Theorem   If n ∈ Z then nZ is a subgroup. Thus if n ≠ 0, the set of integers which n divides is a subgroup of Z.
The next theorem states that every subgroup of Z is of this form.
Theorem   Suppose G ⊂ Z is a subgroup. Then

1) 0 ∈ G.

2) If g_1 and g_2 ∈ G, then (m_1 g_1 + m_2 g_2) ∈ G for all integers m_1, m_2.

3) ∃! nonnegative integer n such that G = nZ. In fact, if G ≠ {0} and n is the smallest positive integer in G, then G = nZ.
Proof   Since G is nonvoid, ∃ g ∈ G. Now (−g) ∈ G and thus 0 = g + (−g) belongs to G, and so 1) is true. Part 2) is straightforward, so consider 3). If G ≠ {0}, it must contain a positive element. Let n be the smallest positive integer in G. If g ∈ G, write g = nm + r where 0 ≤ r < n. Since r = g − nm belongs to G by 2), it must be 0, and so g ∈ nZ.
Now suppose a, b ∈ Z and at least one of a and b is nonzero.
Theorem   Let G be the set of all linear combinations of a and b, i.e., G = {ma + nb : m, n ∈ Z}. Then

1) G contains a and b.

2) G is a subgroup. In fact, it is the smallest subgroup containing a and b. It is called the subgroup generated by a and b.

3) Denote by (a, b) the smallest positive integer in G. By the previous theorem, G = (a, b)Z, and thus (a, b) | a and (a, b) | b. Also note that ∃ m, n such that ma + nb = (a, b). The integer (a, b) is called the greatest common divisor of a and b.

4) If n is an integer which divides a and b, then n also divides (a, b).
Proof of 4)   Suppose n | a and n | b, i.e., suppose a, b ∈ nZ. Then nZ is a subgroup containing a and b. Since G = (a, b)Z is the smallest such subgroup, nZ ⊃ (a, b)Z, and thus n | (a, b).
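The m and n promised in 3) can actually be computed: running the Euclidean algorithm and tracking the quotients gives the extended Euclidean algorithm. The sketch below is a standard implementation, not the book's notation:

```python
def extended_gcd(a, b):
    """Return (g, m, n) with g = (a, b) = m*a + n*b and g > 0,
    assuming a and b are not both zero."""
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r     # remainders shrink toward the gcd
        old_m, m = m, old_m - q * m     # keep old_r == old_m*a + old_n*b
        old_n, n = n, old_n - q * n
    if old_r < 0:                       # normalize so the gcd is positive
        old_r, old_m, old_n = -old_r, -old_m, -old_n
    return old_r, old_m, old_n
```

The loop invariant old_r = old_m·a + old_n·b is preserved at every step, so the final remainder, the gcd, is exhibited as a linear combination of a and b.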
Corollary   The following are equivalent.

1) a and b have no common divisors, i.e., (n | a and n | b) ⇒ n = ±1.

2) (a, b) = 1, i.e., the subgroup generated by a and b is all of Z.

3) ∃ m, n ∈ Z with ma + nb = 1.
Deﬁnition If any one of these three conditions is satisﬁed, we say that a and b
are relatively prime.
This next theorem is the basis for unique factorization.
Theorem   If a and b are relatively prime with a not zero, then a | bc ⇒ a | c.

Proof   Suppose a and b are relatively prime, c ∈ Z and a | bc. Then there exist m, n with ma + nb = 1, and thus mac + nbc = c. Now a | mac and a | nbc. Thus a | (mac + nbc) and so a | c.
Deﬁnition A prime is an integer p > 1 which does not factor, i.e., if p = ab then
a = ±1 or a = ±p. The ﬁrst few primes are 2, 3, 5, 7, 11, 13, 17,... .
Theorem   Suppose p is a prime.

1) If a is an integer which is not a multiple of p, then (p, a) = 1. In other words, if a is any integer, (p, a) = p or (p, a) = 1.

2) If p | ab then p | a or p | b.

3) If p | a_1 a_2 ··· a_n then p divides some a_i. Thus if each a_i is a prime, then p is equal to some a_i.
Proof   Part 1) follows immediately from the definition of prime. Now suppose p | ab. If p does not divide a, then by 1), (p, a) = 1 and by the previous theorem, p must divide b. Thus 2) is true. Part 3) follows from 2) and induction on n.
The Unique Factorization Theorem   Suppose a is an integer which is not 0, 1, or −1. Then a may be factored into the product of primes and, except for order, this factorization is unique. That is, ∃ a unique collection of distinct primes p_1, p_2, ..., p_k and positive integers s_1, s_2, ..., s_k such that a = ±p_1^{s_1} p_2^{s_2} ··· p_k^{s_k}.
Proof Factorization into primes is obvious, and uniqueness follows from 3) in the
theorem above. The power of this theorem is uniqueness, not existence.
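The existence half can be made concrete by trial division. The sketch below (illustrative only; the function name is ours) returns the sign together with the pairs (p_i, s_i), listed by increasing prime, which by uniqueness is the only possibility:

```python
def factor(a):
    """Prime factorization of an integer with |a| > 1,
    as (sign, [(p_1, s_1), ..., (p_k, s_k)]) with p_1 < ... < p_k."""
    sign, n = (1 if a > 0 else -1), abs(a)
    out = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            s = 0
            while n % p == 0:       # strip out every factor of p
                n //= p
                s += 1
            out.append((p, s))
        p += 1
    if n > 1:                       # whatever is left is a single prime
        out.append((n, 1))
    return sign, out

# factor(360) recovers 360 = 2^3 * 3^2 * 5
```

Trial division only needs to run p up to √n: if n had two factors larger than √n, their product would exceed n.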
Now that we have unique factorization and part 3) above, the picture becomes
transparent. Here are some of the basic properties of the integers in this light.
Theorem (Summary)

1) Suppose |a| > 1 has prime factorization a = ±p_1^{s_1} ··· p_k^{s_k}. Then the only divisors of a are of the form ±p_1^{t_1} ··· p_k^{t_k} where 0 ≤ t_i ≤ s_i for i = 1, ..., k.

2) If |a| > 1 and |b| > 1, then (a, b) = 1 iff there is no common prime in their factorizations. Thus if there is no common prime in their factorizations, ∃ m, n with ma + nb = 1, and also (a^2, b^2) = 1.
3) Suppose |a| > 1 and |b| > 1. Let {p_1, ..., p_k} be the union of the distinct primes of their factorizations. Thus a = ±p_1^{s_1} ··· p_k^{s_k} where 0 ≤ s_i and b = ±p_1^{t_1} ··· p_k^{t_k} where 0 ≤ t_i. Let u_i be the minimum of s_i and t_i. Then (a, b) = p_1^{u_1} ··· p_k^{u_k}. For example (2^3 · 5 · 11, 2^2 · 5^4 · 7) = 2^2 · 5.
3′) Let v_i be the maximum of s_i and t_i. Then c = p_1^{v_1} ··· p_k^{v_k} is the least (positive) common multiple of a and b. Note that c is a multiple of a and b, and if n is a multiple of a and b, then n is a multiple of c. Finally, if a and b are positive, their least common multiple is c = ab/(a, b), and if in addition a and b are relatively prime, then their least common multiple is just their product.
4) There is an infinite number of primes. (Proof: Suppose there were only a finite number of primes p_1, p_2, ..., p_k. Then no prime would divide (p_1 p_2 ··· p_k + 1), which is impossible because every integer greater than 1 has a prime divisor.)
5) Suppose c is an integer greater than 1. Then √c is rational iff √c is an integer. In particular, √2 and √3 are irrational. (Proof: If √c is rational, ∃ positive integers a and b with √c = a/b and (a, b) = 1. If b > 1, then it is divisible by some prime, and since cb^2 = a^2, this prime will also appear in the prime factorization of a. This is a contradiction, and thus b = 1 and √c is an integer.) (See the fifth exercise below.)
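Parts 3) and 3′) say that the gcd takes the minimum exponent of each prime and the lcm the maximum, and that lcm(a, b) = ab/(a, b) for positive a and b. The worked example in 3) can be spot-checked with Python's `math.gcd` (the `lcm` helper below is ours):

```python
from math import gcd

def lcm(a, b):
    """Least common multiple of positive a, b via c = ab/(a, b)."""
    return a * b // gcd(a, b)

# The example from part 3): a = 2^3 * 5 * 11, b = 2^2 * 5^4 * 7.
a = 2**3 * 5 * 11        # 440
b = 2**2 * 5**4 * 7      # 17500
# gcd keeps the minimum exponent of each prime: 2^2 * 5 = 20.
# lcm keeps the maximum exponent: 2^3 * 5^4 * 7 * 11 = 385000.
```

That lcm(a, b) · gcd(a, b) = ab for positive a and b follows from min(s_i, t_i) + max(s_i, t_i) = s_i + t_i in each exponent.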
Exercise   Find (180, 28), i.e., find the greatest common divisor of 180 and 28, i.e., find the positive generator of the subgroup generated by {180, 28}. Find integers m and n such that 180m + 28n = (180, 28). Find the least common multiple of 180 and 28, and show that it is equal to (180 · 28)/(180, 28).
Exercise   We have defined the greatest common divisor (gcd) and the least common multiple (lcm) of a pair of integers. Now suppose n ≥ 2 and S = {a_1, a_2, ..., a_n} is a finite collection of integers with |a_i| > 1 for 1 ≤ i ≤ n. Define the gcd and the lcm of the elements of S and develop their properties. Express the gcd and the lcm in terms of the prime factorizations of the a_i. When is the lcm of S equal to the product a_1 a_2 ··· a_n? Show that the set of all linear combinations of the elements of S is a subgroup of Z, and its positive generator is the gcd of the elements of S.
Exercise   Show that the gcd of S = {90, 70, 42} is 2, and find integers n_1, n_2, n_3 such that 90n_1 + 70n_2 + 42n_3 = 2. Also find the lcm of the elements of S.
Exercise   Show that if each of G_1, G_2, ..., G_m is a subgroup of Z, then G_1 ∩ G_2 ∩ ··· ∩ G_m is also a subgroup of Z. Now let G = (90Z) ∩ (70Z) ∩ (42Z) and find the positive integer n with G = nZ.
Exercise   Show that if the nth root of an integer is a rational number, then it is an integer. That is, suppose c and n are integers greater than 1. There is a unique positive real number x with x^n = c. Show that if x is rational, then it is an integer. Thus if p is a prime, its nth root is an irrational number.
Exercise   Show that a positive integer is divisible by 3 iff the sum of its digits is divisible by 3. More generally, let a = a_n a_{n−1} ... a_0 = a_n 10^n + a_{n−1} 10^{n−1} + ··· + a_0 where 0 ≤ a_i ≤ 9. Now let b = a_n + a_{n−1} + ··· + a_0, and show that 3 divides a and b with the same remainder. Although this is a straightforward exercise in long division, it will be more transparent later on. In the language of the next chapter, it says that [a] = [b] in Z_3.
Card Trick Ask friends to pick out seven cards from a deck and then to select one
to look at without showing it to you. Take the six cards face down in your left hand
and the selected card in your right hand, and announce you will place the selected
card in with the other six, but they are not to know where. Put your hands behind
your back and place the selected card on top, and bring the seven cards in front in
your left hand. Ask your friends to give you a number between one and seven (not
allowing one). Suppose they say three. You move the top card to the bottom, then
the second card to the bottom, and then you turn over the third card, leaving it face
up on top. Then repeat the process, moving the top two cards to the bottom and
turning the third card face up on top. Continue until there is only one card face
down, and this will be the selected card. Magic? Stay tuned for Chapter 2, where it is shown that any nonzero element of Z_7 has order 7.
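The trick can be checked by brute force. In the simulation below (our own modeling, not from the text), the flipped cards sit at successive multiples of k − 1 around the 7-cycle; since k − 1 is nonzero mod 7, those multiples run through all of Z_7, and the selected card, at position 7(k − 1) ≡ 0, is reached last:

```python
def card_trick(k, n=7):
    """Simulate the trick for a count k with 2 <= k <= n.  The selected
    card is 0 and starts on top.  Each round the top k-1 cards (face up
    or face down) go to the bottom and the next card is turned face up,
    staying on top.  Returns the last card left face down."""
    if not 2 <= k <= n:
        raise ValueError("count must be between 2 and n")
    packet = list(range(n))            # packet[0] is the top card
    face_up = set()
    while len(face_up) < n - 1:
        packet = packet[k - 1:] + packet[:k - 1]   # k-1 cards to the bottom
        face_up.add(packet[0])                     # turn the k-th card face up
    return next(c for c in packet if c not in face_up)
```

Disallowing the count "one" is exactly what keeps k − 1 nonzero mod 7.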
Chapter 2
Groups
Groups are the central objects of algebra. In later chapters we will deﬁne rings and
modules and see that they are special cases of groups. Also ring homomorphisms and
module homomorphisms are special cases of group homomorphisms. Even though
the deﬁnition of group is simple, it leads to a rich and amazing theory. Everything
presented here is standard, except that the product of groups is given in the additive
notation. This is the notation used in later chapters for the products of rings and
modules. This chapter and the next two chapters are restricted to the most basic
topics. The approach is to do quickly the fundamentals of groups, rings, and matrices,
and to push forward to the chapter on linear algebra. This chapter is by far the most difficult in the book, because group operations may be written as addition or multiplication, and also because the concept of coset is confusing at first.
Definition   Suppose G is a nonvoid set and φ : G × G → G is a function. φ is called a binary operation, and we will write φ(a, b) = a · b or φ(a, b) = a + b. Consider the following properties, stated in multiplicative and in additive notation.

1) If a, b, c ∈ G then a · (b · c) = (a · b) · c.
   If a, b, c ∈ G then a + (b + c) = (a + b) + c.

2) ∃ e = e_G ∈ G such that if a ∈ G, then e · a = a · e = a.
   ∃ 0̄ = 0̄_G ∈ G such that if a ∈ G, then 0̄ + a = a + 0̄ = a.

3) If a ∈ G, ∃ b ∈ G with a · b = b · a = e (b is written as b = a^{-1}).
   If a ∈ G, ∃ b ∈ G with a + b = b + a = 0̄ (b is written as b = −a).

4) If a, b ∈ G, then a · b = b · a.
   If a, b ∈ G, then a + b = b + a.
Definition   If properties 1), 2), and 3) hold, (G, φ) is said to be a group. If we write φ(a, b) = a · b, we say it is a multiplicative group. If we write φ(a, b) = a + b, we say it is an additive group. If in addition, property 4) holds, we say the group is abelian or commutative.
Theorem   Let (G, φ) be a multiplicative group.

(i) Suppose a, c, c̄ ∈ G. Then a · c = a · c̄ ⇒ c = c̄. Also c · a = c̄ · a ⇒ c = c̄. In other words, if f : G → G is defined by f(c) = a · c, then f is injective. Also f is bijective with f^{-1} given by f^{-1}(c) = a^{-1} · c.

(ii) e is unique, i.e., if ē ∈ G satisfies 2), then e = ē. In fact, if a, b ∈ G then (a · b = a) ⇒ (b = e) and (a · b = b) ⇒ (a = e). Recall that b is an identity in G provided it is a right and left identity for any a in G. However, group structure is so rigid that if ∃ a ∈ G such that b is a right identity for a, then b = e. Of course, this is just a special case of the cancellation law in (i).

(iii) Every right inverse is an inverse, i.e., if a · b = e then b = a^{-1}. Also if b · a = e then b = a^{-1}. Thus inverses are unique.

(iv) If a ∈ G, then (a^{-1})^{-1} = a.

(v) The multiplication a_1 · a_2 · a_3 = a_1 · (a_2 · a_3) = (a_1 · a_2) · a_3 is well-defined. In general, a_1 · a_2 ··· a_n is well-defined.

(vi) If a, b ∈ G, (a · b)^{-1} = b^{-1} · a^{-1}. Also (a_1 a_2 ··· a_n)^{-1} = a_n^{-1} a_{n-1}^{-1} ··· a_1^{-1}.

(vii) Suppose a ∈ G. Let a^0 = e and if n > 0, a^n = a · a ··· a (n times) and a^{-n} = a^{-1} · a^{-1} ··· a^{-1} (n times). If n_1, n_2, ..., n_t ∈ Z then a^{n_1} a^{n_2} ··· a^{n_t} = a^{n_1 + ··· + n_t}. Also (a^n)^m = a^{nm}. Finally, if G is abelian and a, b ∈ G, then (a · b)^n = a^n · b^n.
Exercise   Write out the above theorem where G is an additive group. Note that part (vii) states that G has a scalar multiplication over Z. This means that if a is in G and n is an integer, there is defined an element an in G. This is so basic that we state it explicitly.
Theorem   Suppose G is an additive group. If a ∈ G, let a0 = 0̄ and if n > 0, let an = (a + ··· + a) where the sum is n times, and a(−n) = (−a) + (−a) + ··· + (−a), which we write as (−a − a − ··· − a). Then the following properties hold in general, except the first requires that G be abelian.

(a + b)n = an + bn
a(n + m) = an + am
a(nm) = (an)m
a1 = a
Note that the plus sign is used ambiguously — sometimes for addition in G and sometimes for addition in Z. In the language used in Chapter 5, this theorem states that any additive abelian group is a Z-module. (See page 71.)
Exercise   Suppose G is a nonvoid set with a binary operation φ(a, b) = a · b which satisfies 1), 2), and [3′) If a ∈ G, ∃ b ∈ G with a · b = e]. Show (G, φ) is a group, i.e., show b · a = e. In other words, the group axioms are stronger than necessary. If every element has a right inverse, then every element has a two-sided inverse.
Exercise   Suppose G is the set of all functions from Z to Z with multiplication defined by composition, i.e., f · g = f ◦ g. Note that G satisfies 1) and 2) but not 3), and thus G is not a group. Show that f has a right inverse in G iff f is surjective, and f has a left inverse in G iff f is injective (see page 10). Also show that the set of all bijections from Z to Z is a group under composition.
Examples   G = R, G = Q, or G = Z with φ(a, b) = a + b is an additive abelian group.

Examples   G = R − 0 or G = Q − 0 with φ(a, b) = ab is a multiplicative abelian group.

G = Z − 0 with φ(a, b) = ab is not a group.

G = R^+ = {r ∈ R : r > 0} with φ(a, b) = ab is a multiplicative abelian group.
Subgroups
Theorem   Suppose G is a multiplicative group and H ⊂ G is a nonvoid subset satisfying

1) if a, b ∈ H then a · b ∈ H
and 2) if a ∈ H then a^{-1} ∈ H.
Then e ∈ H and H is a group under multiplication. H is called a subgroup of G.

Proof   Since H is nonvoid, ∃ a ∈ H. By 2), a^{-1} ∈ H and so by 1), e = a · a^{-1} ∈ H. The associative law is immediate and so H is a group.
Example G is a subgroup of G and e is a subgroup of G. These are called the
improper subgroups of G.
Example If G = Z under addition, and n ∈ Z, then H = nZ is a subgroup of
Z. By a theorem in the section on the integers in Chapter 1, every subgroup of Z
is of this form (see page 15). This is a key property of the integers.
Exercises Suppose G is a multiplicative group.
1) Let H be the center of G, i.e., H = {h ∈ G : g · h = h · g for all g ∈ G}. Show H is a subgroup of G.
2) Suppose H_1 and H_2 are subgroups of G. Show H_1 ∩ H_2 is a subgroup of G.

3) Suppose H_1 and H_2 are subgroups of G, with neither H_1 nor H_2 contained in the other. Show H_1 ∪ H_2 is not a subgroup of G.
4) Suppose T is an index set and for each t ∈ T, H_t is a subgroup of G. Show the intersection ∩_{t∈T} H_t is a subgroup of G.

5) Furthermore, if {H_t} is a monotonic collection, then the union ∪_{t∈T} H_t is a subgroup of G.
6) Suppose G = {all functions f : [0, 1] → R}. Define an addition on G by (f + g)(t) = f(t) + g(t) for all t ∈ [0, 1]. This makes G into an abelian group. Let K be the subset of G composed of all differentiable functions. Let H be the subset of G composed of all continuous functions. What theorems in calculus show that H and K are subgroups of G? What theorem shows that K is a subset (and thus subgroup) of H?
Order   Suppose G is a multiplicative group. If G has an infinite number of elements, we say that o(G), the order of G, is infinite. If G has n elements, then o(G) = n. Suppose a ∈ G and H = {a^i : i ∈ Z}. H is an abelian subgroup of G called the subgroup generated by a. We define the order of the element a to be the order of H, i.e., the order of the subgroup generated by a. Let f : Z → H be the surjective function defined by f(m) = a^m. Note that f(k + l) = f(k) · f(l) where the addition is in Z and the multiplication is in the group H. We come now to the first real theorem in group theory. It says that the element a has finite order iff f is not injective, and in this case, the order of a is the smallest positive integer n with a^n = e.
Theorem   Suppose a is an element of a multiplicative group G, and H = {a^i : i ∈ Z}. If ∃ distinct integers i and j with a^i = a^j, then a has some finite order n. In this case H has n distinct elements, H = {a^0, a^1, ..., a^{n-1}}, and a^m = e iff n | m. In particular, the order of a is the smallest positive integer n with a^n = e, and f^{-1}(e) = nZ.
Proof   Suppose j < i and a^i = a^j. Then a^{i−j} = e and thus ∃ a smallest positive integer n with a^n = e. This implies that the elements of {a^0, a^1, ..., a^{n-1}} are distinct, and we must show they are all of H. If m ∈ Z, the Euclidean algorithm states that ∃ integers q and r with 0 ≤ r < n and m = nq + r. Thus a^m = a^{nq} · a^r = a^r, and so H = {a^0, a^1, ..., a^{n-1}}, and a^m = e iff n | m. Later in this chapter we will see that f is a homomorphism from an additive group to a multiplicative group and that, in additive notation, H is isomorphic to Z or Z_n.
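In additive notation the theorem reads: the order of a is the smallest n ≥ 1 with na = 0̄. For the groups Z_m introduced later in the chapter, this can be found by direct search; the helper below is our own sketch, not the text's notation:

```python
def additive_order(a, m):
    """Order of [a] in the additive group Z_m: the smallest k >= 1
    with k*a congruent to 0 mod m, found by direct search."""
    k, total = 1, a % m
    while total != 0:
        k += 1
        total = (total + a) % m
    return k

# [4] has order 3 in Z_12, since 4 + 4 + 4 = 12 is congruent to 0.
# In Z_7 every nonzero class has order 7, the fact behind the card trick.
```

The search always terminates, and its answer agrees with the closed form m/(a, m) that follows from the theorem.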
Exercise   Write out this theorem for G an additive group. To begin, suppose a is an element of an additive group G, and H = {ai : i ∈ Z}.
Exercise Show that if G is a ﬁnite group of even order, then G has an odd number
of elements of order 2. Note that e is the only element of order 1.
Deﬁnition A group G is cyclic if ∃ an element of G which generates G.
Theorem   If G is cyclic and H is a subgroup of G, then H is cyclic.

Proof   Suppose G = {a^i : i ∈ Z} is a cyclic group and H is a subgroup of G. If H = e, then H is cyclic, so suppose H ≠ e. Now there is a smallest positive integer m with a^m ∈ H. If t is an integer with a^t ∈ H, then by the Euclidean algorithm, m divides t, and thus a^m generates H. Note that in the case G has finite order n, i.e., G = {a^0, a^1, ..., a^{n-1}}, then a^n = e ∈ H, and thus the positive integer m divides n. In either case, we have a clear picture of the subgroups of G. Also note that this theorem was proved on page 15 for the additive group Z.
Cosets Suppose H is a subgroup of a group G. It will be shown below that H
partitions G into right cosets. It also partitions G into left cosets, and in general
these partitions are distinct.
Theorem   If H is a subgroup of a multiplicative group G, then a ∼ b defined by a ∼ b iff a · b^{-1} ∈ H is an equivalence relation. If a ∈ G, cl(a) = {b ∈ G : a ∼ b} = {h · a : h ∈ H} = Ha. Note that a · b^{-1} ∈ H iff b · a^{-1} ∈ H.

If H is a subgroup of an additive group G, then a ∼ b defined by a ∼ b iff (a − b) ∈ H is an equivalence relation. If a ∈ G, cl(a) = {b ∈ G : a ∼ b} = {h + a : h ∈ H} = H + a. Note that (a − b) ∈ H iff (b − a) ∈ H.
Definition   These equivalence classes are called right cosets. If the relation is defined by a ∼ b iff b^{-1} · a ∈ H, then the equivalence classes are cl(a) = aH and they are called left cosets. H is a left and right coset. If G is abelian, there is no distinction between right and left cosets. Note that b^{-1} · a ∈ H iff a^{-1} · b ∈ H.
In the theorem above, H is used to deﬁne an equivalence relation on G, and thus
a partition of G. We now do the same thing a diﬀerent way. We deﬁne the right
cosets directly and show they form a partition of G. You might ﬁnd this easier.
Theorem   Suppose H is a subgroup of a multiplicative group G. If a ∈ G, define the right coset containing a to be Ha = {h · a : h ∈ H}. Then the following hold.

1) Ha = H iff a ∈ H.

2) If b ∈ Ha, then Hb = Ha, i.e., if h ∈ H, then H(h · a) = (Hh)a = Ha.

3) If Hc ∩ Ha ≠ ∅, then Hc = Ha.

4) The right cosets form a partition of G, i.e., each a in G belongs to one and only one right coset.

5) Elements a and b belong to the same right coset iff a · b^{-1} ∈ H iff b · a^{-1} ∈ H.
Proof There is no better way to develop facility with cosets than to prove this
theorem. Also write this theorem for G an additive group.
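For a concrete instance of the partition, one can list all the cosets of a subgroup of the additive group Z_12. The sketch below (our own helper, not the text's) forms H + a for every a and keeps each coset once:

```python
def cosets(H, m):
    """The cosets H + a of a subgroup H in the additive group Z_m,
    returned as a set of frozensets (each coset listed once)."""
    return {frozenset((h + a) % m for h in H) for a in range(m)}

# H = {0, 4, 8} is the subgroup of Z_12 generated by [4]; its cosets
# partition Z_12 into 4 classes of 3 elements each.
C = cosets({0, 4, 8}, 12)
```

Every element of Z_12 lands in exactly one coset, and all cosets have the same size, illustrating properties 3) and 4) and foreshadowing the index computation in the next theorem.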
Theorem Suppose H is a subgroup of a multiplicative group G.
1) Any two right cosets have the same number of elements. That is, if a, b ∈ G, f : Ha → Hb defined by f(h · a) = h · b is a bijection. Also any two left cosets have the same number of elements. Since H is a right and left coset, any two cosets have the same number of elements.

2) G has the same number of right cosets as left cosets. The function F defined by F(Ha) = a^{-1}H is a bijection from the collection of right cosets to the left cosets. The number of right (or left) cosets is called the index of H in G.

3) If G is finite, o(H) · (index of H) = o(G) and so o(H) | o(G). In other words, o(G)/o(H) = the number of right cosets = the number of left cosets.

4) If G is finite, and a ∈ G, then o(a) | o(G). (Proof: The order of a is the order of the subgroup generated by a, and by 3) this divides the order of G.)

5) If G has prime order, then G is cyclic, and any element (except e) is a generator. (Proof: Suppose o(G) = p and a ∈ G, a ≠ e. Then o(a) | p and thus o(a) = p.)

6) If o(G) = n and a ∈ G, then a^n = e. (Proof: a^{o(a)} = e and n = o(a) · (o(G)/o(a)).)
Exercises

i) Suppose G is a cyclic group of order 4, G = {e, a, a^2, a^3} with a^4 = e. Find the order of each element of G. Find all the subgroups of G.

ii) Suppose G is the additive group Z and H = 3Z. Find the cosets of H.

iii) Think of a circle as the interval [0, 1] with end points identified. Suppose G = R under addition and H = Z. Show that the collection of all the cosets of H can be thought of as a circle.

iv) Let G = R^2 under addition, and H be the subgroup defined by H = {(a, 2a) : a ∈ R}. Find the cosets of H. (See the last exercise on p 5.)
Normal Subgroups

We would like to make a group out of the collection of cosets of a subgroup H. In general, there is no natural way to do that. However, it is easy to do in case H is a normal subgroup, which is described below.
Theorem   If H is a subgroup of a group G, then the following are equivalent.

1) If a ∈ G, then aHa^{-1} = H.

2) If a ∈ G, then aHa^{-1} ⊂ H.

3) If a ∈ G, then aH = Ha.

4) Every right coset is a left coset, i.e., if a ∈ G, ∃ b ∈ G with Ha = bH.
Proof   1) ⇒ 2) is obvious. Suppose 2) is true and show 3). We have (aHa^{-1})a ⊂ Ha so aH ⊂ Ha. Also a(a^{-1}Ha) ⊂ aH so Ha ⊂ aH. Thus aH = Ha.

3) ⇒ 4) is obvious. Suppose 4) is true and show 3). Ha = bH contains a, so bH = aH because a coset is an equivalence class. Thus aH = Ha.

Finally, suppose 3) is true and show 1). Multiply aH = Ha on the right by a^{-1}.
Deﬁnition If H satisﬁes any of the four conditions above, then H is said to be a
normal subgroup of G. (This concept goes back to Evariste Galois in 1831.)
Note For any group G, G and e are normal subgroups. If G is an abelian group,
then every subgroup of G is normal.
Exercise Show that if H is a subgroup of G with index 2, then H is normal.
Exercise Show the intersection of a collection of normal subgroups of G is a
normal subgroup of G. Show the union of a monotonic collection of normal subgroups
of G is a normal subgroup of G.
Exercise   Let A ⊂ R^2 be the square with vertices (−1, 1), (1, 1), (1, −1), and (−1, −1), and G be the collection of all "isometries" of A onto itself. These are bijections of A onto itself which preserve distance and angles, i.e., which preserve dot product. Show that with multiplication defined as composition, G is a multiplicative group. Show that G has four rotations, two reflections about the axes, and two reflections about the diagonals, for a total of eight elements. Show the collection of rotations is a cyclic subgroup of order four which is a normal subgroup of G. Show that the reflection about the x-axis together with the identity form a cyclic subgroup of order two which is not a normal subgroup of G. Find the four right cosets of this subgroup. Finally, find the four left cosets of this subgroup.
Quotient Groups   Suppose N is a normal subgroup of G, and C and D are cosets. We wish to define a coset E which is the product of C and D. If c ∈ C and d ∈ D, define E to be the coset containing c · d, i.e., E = N(c · d). The coset E does not depend upon the choice of c and d. This is made precise in the next theorem, which is quite easy.
Theorem   Suppose G is a multiplicative group, N is a normal subgroup, and G/N is the collection of all cosets. Then (Na) · (Nb) = N(a · b) is a well-defined multiplication (binary operation) on G/N, and with this multiplication, G/N is a group. Its identity is N and (Na)^{-1} = (Na^{-1}). Furthermore, if G is finite, o(G/N) = o(G)/o(N).
Proof   Multiplication of elements in G/N is multiplication of subsets in G. (Na) · (Nb) = N(aN)b = N(Na)b = N(a · b). Once multiplication is well defined, the group axioms are immediate.
Exercise Write out the above theorem for G an additive group. In the additive
abelian group R/Z, determine those elements of ﬁnite order.
Example   Suppose G = Z under +, n > 1, and N = nZ. Then Z_n, the group of integers mod n, is defined by Z_n = Z/nZ. If a is an integer, the coset a + nZ is denoted by [a]. Note that [a] + [b] = [a + b], −[a] = [−a], and [a] = [a + nl] for any integer l. Any additive abelian group has a scalar multiplication over Z, and in this case it is just [a]m = [am]. Note that [a] = [r] where r is the remainder of a divided by n, and thus the distinct elements of Z_n are [0], [1], ..., [n − 1]. Also Z_n is cyclic because each of [1] and [−1] = [n − 1] is a generator. We already know that if p is a prime, any nonzero element of Z_p is a generator, because Z_p has p elements.
Theorem   If n > 1 and a is any integer, then [a] is a generator of Z_n iff (a, n) = 1.

Proof   The element [a] is a generator iff the subgroup generated by [a] contains [1] iff ∃ an integer k such that [a]k = [1] iff ∃ integers k and l such that ak + nl = 1.
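The theorem is easy to check by machine for small n: simply list the classes [a] with (a, n) = 1. The function name below is ours, and the test against a brute-force subgroup computation is only an illustration:

```python
from math import gcd

def generators(n):
    """The a in {0, ..., n-1} with [a] a generator of Z_n, i.e. (a, n) = 1."""
    return [a for a in range(n) if gcd(a, n) == 1]

def generates(a, n):
    """Brute force: does the subgroup of Z_n generated by [a] equal Z_n?"""
    return len({(a * k) % n for k in range(n)}) == n

# For a prime p every nonzero class generates; for n = 12 only 4 classes do.
```

For n = 12 the generators are [1], [5], [7], [11], matching (a, 12) = 1.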
Exercise   Show that a positive integer is divisible by 3 iff the sum of its digits is divisible by 3. Note that [10] = [1] in Z_3. (See the fifth exercise on page 18.)
Homomorphisms

Homomorphisms are functions between groups that commute with the group operations. It follows that they honor identities and inverses. In this section we list the basic properties. Properties 11), 12), and 13) show the connections between coset groups and homomorphisms, and should be considered as the cornerstones of abstract algebra. As always, the student should rewrite the material in additive notation.
Definition   If G and Ḡ are multiplicative groups, a function f : G → Ḡ is a homomorphism if, for all a, b ∈ G, f(a · b) = f(a) · f(b). On the left side, the group operation is in G, while on the right side it is in Ḡ. The kernel of f is defined by ker(f) = f^{-1}(ē) = {a ∈ G : f(a) = ē}. In other words, the kernel is the set of solutions to the equation f(x) = ē. (If Ḡ is an additive group, ker(f) = f^{-1}(0̄).)
Examples   The constant map f : G → Ḡ defined by f(a) = ē is a homomorphism. If H is a subgroup of G, the inclusion i : H → G is a homomorphism. The function f : Z → Z defined by f(t) = 2t is a homomorphism of additive groups, while the function defined by f(t) = t + 2 is not a homomorphism. The function h : Z → R − 0 defined by h(t) = 2^t is a homomorphism from an additive group to a multiplicative group.
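The last example can be spot-checked numerically: h carries addition in Z to multiplication in R − 0, i.e., h(s + t) = h(s) · h(t). The check below (purely illustrative) is exact because integer powers of 2 are represented exactly in floating point:

```python
# h : Z -> R - 0, h(t) = 2^t, sends addition to multiplication.
def h(t):
    return 2.0 ** t

pairs = [(-3, 5), (0, 7), (4, 4)]
checks = [h(s + t) == h(s) * h(t) for s, t in pairs]
# In particular h(0) = 1 (the identity goes to the identity) and
# h(-t) = 1 / h(t) (inverses go to inverses).
```

Note that f(t) = t + 2 fails the analogous check: f(s + t) = s + t + 2 while f(s) + f(t) = s + t + 4.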
We now catalog the basic properties of homomorphisms. These will be helpful
later on in the study of ring homomorphisms and module homomorphisms.
Theorem   Suppose G and Ḡ are groups and f : G → Ḡ is a homomorphism.

1) f(e) = ē.

2) f(a^{-1}) = f(a)^{-1}. The first inverse is in G, and the second is in Ḡ.

3) f is injective ⇔ ker(f) = e.

4) If H is a subgroup of G, f(H) is a subgroup of Ḡ. In particular, image(f) is a subgroup of Ḡ.

5) If H̄ is a subgroup of Ḡ, f^{-1}(H̄) is a subgroup of G. Furthermore, if H̄ is normal in Ḡ, then f^{-1}(H̄) is normal in G.

6) The kernel of f is a normal subgroup of G.

7) If ḡ ∈ Ḡ, f^{-1}(ḡ) is void or is a coset of ker(f), i.e., if f(g) = ḡ then f^{-1}(ḡ) = Ng where N = ker(f). In other words, if the equation f(x) = ḡ has a solution, then the set of all solutions is a coset of N = ker(f). This is a key fact which is used routinely in topics such as systems of equations and linear differential equations.

8) The composition of homomorphisms is a homomorphism, i.e., if h : Ḡ → G̿ is a homomorphism, then h ◦ f : G → G̿ is a homomorphism.
9) If f : G → Ḡ is a bijection, then the function f^{-1} : Ḡ → G is a homomorphism. In this case, f is called an isomorphism, and we write G ≈ Ḡ. In the case G = Ḡ, f is also called an automorphism.

10) Isomorphisms preserve all algebraic properties. For example, if f is an isomorphism and H ⊂ G is a subset, then H is a subgroup of G iff f(H) is a subgroup of Ḡ, H is normal in G iff f(H) is normal in Ḡ, G is cyclic iff Ḡ is cyclic, etc. Of course, this is somewhat of a copout, because an algebraic property is one that, by definition, is preserved under isomorphisms.

11) Suppose H is a normal subgroup of G. Then π : G → G/H defined by π(a) = Ha is a surjective homomorphism with kernel H. Furthermore, if f : G → Ḡ is a surjective homomorphism with kernel H, then G/H ≈ Ḡ (see below).
12) Suppose H is a normal subgroup of G. If H ⊂ ker(f), then f̄ : G/H → Ḡ defined by f̄(Ha) = f(a) is a well-defined homomorphism making the following diagram commute.

          f
     G ------> Ḡ
      \       ↗
    π  \     / f̄
        v   /
        G/H

Thus defining a homomorphism on a quotient group is the same as defining a homomorphism on the numerator which sends the denominator to ē. The image of f̄ is the image of f and the kernel of f̄ is ker(f)/H. Thus if H = ker(f), f̄ is injective, and thus G/H ≈ image(f).

13) Given any group homomorphism f, domain(f)/ker(f) ≈ image(f). This is the fundamental connection between quotient groups and homomorphisms.
14) Suppose K is a group. Then K is an infinite cyclic group iff K is isomorphic to the integers under addition, i.e., K ≈ Z. K is a cyclic group of order n iff K ≈ Z_n.

Proof of 14)   Suppose Ḡ = K is generated by some element a. Then f : Z → K defined by f(m) = a^m is a homomorphism from an additive group to a multiplicative group. If o(a) is infinite, f is an isomorphism. If o(a) = n, ker(f) = nZ and f̄ : Z_n → K is an isomorphism.
Exercise   If a is an element of a group G, there is always a homomorphism from Z to G which sends 1 to a. When is there a homomorphism from Z_n to G which sends [1] to a? What are the homomorphisms from Z_2 to Z_6? What are the homomorphisms from Z_4 to Z_8?
Exercise Suppose G is a group and g is an element of G, g ≠ e.

1) Under what conditions on g is there a homomorphism f : Z₇ → G with f([1]) = g?

2) Under what conditions on g is there a homomorphism f : Z₁₅ → G with f([1]) = g?

3) Under what conditions on G is there an injective homomorphism f : Z₁₅ → G?

4) Under what conditions on G is there a surjective homomorphism f : Z₁₅ → G?
Exercise We know every ﬁnite group of prime order is cyclic and thus abelian.
Show that every group of order four is abelian.
Exercise Let G = {h : [0, 1] → R : h has an infinite number of derivatives}. Then G is a group under addition. Define f : G → G by f(h) = dh/dt = h′. Show f is a homomorphism and find its kernel and image. Let g : [0, 1] → R be defined by g(t) = t³ − 3t + 4. Find f⁻¹(g) and show it is a coset of ker(f).
Exercise Let G be as above and g ∈ G. Define f : G → G by f(h) = h″ + 5h′ + 6t²h. Then f is a group homomorphism and the differential equation h″ + 5h′ + 6t²h = g has a solution iff g lies in the image of f. Now suppose this equation has a solution and S ⊂ G is the set of all solutions. For which subgroup H of G is S an H-coset?
Exercise Suppose G is a multiplicative group and a ∈ G. Define f : G → G to be conjugation by a, i.e., f(g) = a⁻¹ g a. Show that f is a homomorphism. Also show f is an automorphism and find its inverse.
Permutations

Suppose X is a (nonvoid) set. A bijection f : X → X is called a permutation on X, and the collection of all these permutations is denoted by S = S(X). In this setting, variables are written on the left, i.e., f(x) is written as (x)f. Therefore the composition f ◦ g means "f followed by g". S(X) forms a multiplicative group under composition.
Exercise Show that if there is a bijection between X and Y, there is an isomorphism between S(X) and S(Y). Thus if each of X and Y has n elements, S(X) ≈ S(Y), and these groups are called the symmetric groups on n elements. They are all denoted by the one symbol Sₙ.

Exercise Show that o(Sₙ) = n!. Let X = {1, 2, ..., n}, Sₙ = S(X), and H = {f ∈ Sₙ : (n)f = n}. Show H is a subgroup of Sₙ which is isomorphic to Sₙ₋₁. Let g be any permutation on X with (n)g = 1. Find g⁻¹Hg.
The next theorem shows that the symmetric groups are incredibly rich and complex.
Theorem (Cayley's Theorem) Suppose G is a multiplicative group with n elements and Sₙ is the group of all permutations on the set G. Then G is isomorphic to a subgroup of Sₙ.

Proof Let h : G → Sₙ be the function which sends a to the bijection h_a : G → G defined by (g)h_a = g·a. The proof follows from the following observations.

1) For each given a, h_a is a bijection from G to G.

2) h is a homomorphism, i.e., h_{a·b} = h_a ◦ h_b.

3) h is injective and thus G is isomorphic to image(h) ⊂ Sₙ.
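Cayley's construction can be sketched concretely. The following is an illustration, not part of the text: it builds the map h for the additive group Z₄ (written additively, so h_a is g → g + a) and checks observations 2) and 3) numerically.

```python
# Cayley's embedding for Z_4 under addition (a hypothetical illustration):
# each element a yields the bijection h_a defined by (g)h_a = g + a.
def cayley_embedding(elements, op):
    """Map each a to the permutation h_a, recorded as a tuple indexed by g."""
    return {a: tuple(op(g, a) for g in elements) for a in elements}

z4 = [0, 1, 2, 3]
h = cayley_embedding(z4, lambda g, a: (g + a) % 4)

# h is injective: distinct elements give distinct permutations.
assert len(set(h.values())) == 4

# h is a homomorphism: h_{a+b} equals h_a followed by h_b.
for a in z4:
    for b in z4:
        composed = tuple(h[b][h[a][g]] for g in z4)  # apply h_a, then h_b
        assert composed == h[(a + b) % 4]
```

The same code works for any finite group once its elements and operation are supplied; only the example group is assumed here.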
The Symmetric Groups Now let n ≥ 2 and let Sₙ be the group of all permutations on {1, 2, ..., n}. The following definition shows that each element of Sₙ may be represented by a matrix.
Definition Suppose 1 < k ≤ n, {a₁, a₂, ..., aₖ} is a collection of distinct integers with 1 ≤ aᵢ ≤ n, and {b₁, b₂, ..., bₖ} is the same collection in some different order. Then the matrix

    ( a₁ a₂ ... aₖ )
    ( b₁ b₂ ... bₖ )

represents f ∈ Sₙ defined by (aᵢ)f = bᵢ for 1 ≤ i ≤ k, and (a)f = a for all other a. The composition of two permutations is computed by applying the matrix on the left first and the matrix on the right second.
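The left-factor-first convention can be sketched in code. This is an illustration of the convention, not from the text; a permutation is represented hypothetically as a Python dict sending each a to (a)f.

```python
# Compose permutations with the variable on the left:
# in f·g, the left factor f is applied first.
def compose(f, g):
    """Return the permutation 'f followed by g' on the same set."""
    return {a: g[f[a]] for a in f}

# f is the transposition (1,2); g is the transposition (2,3), on {1, 2, 3}.
f = {1: 2, 2: 1, 3: 3}
g = {1: 1, 2: 3, 3: 2}
fg = compose(f, g)
assert fg == {1: 3, 2: 1, 3: 2}  # the 3-cycle (1, 3, 2)
```

Note this matches the identity (a, b)(b, c) = (a, c, b) stated later in this section, with a = 1, b = 2, c = 3.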
There is a special type of permutation called a cycle. For these we have a special
notation.
Definition The matrix

    ( a₁ a₂ ... aₖ₋₁ aₖ )
    ( a₂ a₃ ... aₖ   a₁ )

is called a k-cycle, and is denoted by (a₁, a₂, ..., aₖ). A 2-cycle is called a transposition. The cycles (a₁, ..., aₖ) and (c₁, ..., cₗ) are disjoint provided aᵢ ≠ cⱼ for all 1 ≤ i ≤ k and 1 ≤ j ≤ ℓ.
Listed here are eight basic properties of permutations. They are all easy except
4), which takes a little work. Properties 9) and 10) are listed solely for reference.
Theorem

1) Disjoint cycles commute. (This is obvious.)

2) Every nonidentity permutation can be written uniquely (except for order) as the product of disjoint cycles. (This is easy.)

3) Every permutation can be written (non-uniquely) as the product of transpositions. (Proof: I = (1, 2)(1, 2) and (a₁, ..., aₖ) = (a₁, a₂)(a₁, a₃) ⋯ (a₁, aₖ).)

4) The parity of the number of these transpositions is unique. This means that if f is the product of p transpositions and also of q transpositions, then p is even iff q is even. In this case, f is said to be an even permutation. In the other case, f is an odd permutation.

5) A k-cycle is even (odd) iff k is odd (even). For example (1, 2, 3) = (1, 2)(1, 3) is an even permutation.
6) Suppose f, g ∈ Sₙ. If one of f and g is even and the other is odd, then g ◦ f is odd. If f and g are both even or both odd, then g ◦ f is even. (Obvious.)

7) The map h : Sₙ → Z₂ defined by h(even) = [0] and h(odd) = [1] is a homomorphism from a multiplicative group to an additive group. Its kernel (the subgroup of even permutations) is denoted by Aₙ and is called the alternating group. Thus Aₙ is a normal subgroup of index 2, and Sₙ/Aₙ ≈ Z₂.

8) If a, b, c and d are distinct integers in {1, 2, . . . , n}, then (a, b)(b, c) = (a, c, b) and (a, b)(c, d) = (a, c, d)(a, c, b). Since I = (1, 2, 3)³, it follows that for n ≥ 3, every even permutation is the product of 3-cycles.
The following parts are not included in this course. They are presented here merely
for reference.
9) For any n ≠ 4, Aₙ is simple, i.e., has no proper normal subgroups.

10) Sₙ can be generated by two elements. In fact, {(1, 2), (1, 2, ..., n)} generates Sₙ. (Of course there are subgroups of Sₙ which cannot be generated by two elements.)
Proof of 4) It suffices to prove that if the product of t transpositions is the identity I on {1, 2, . . . , n}, then t is even. Suppose this is false and I is written as t transpositions, where t is the smallest odd integer for which this is possible. Since t is odd, it is at least 3. Suppose for convenience the first transposition is (a, n). We will rewrite I as a product of transpositions σ₁σ₂ ⋯ σₜ where (n)σᵢ = n for 1 ≤ i < t and (n)σₜ ≠ n, which will be a contradiction. This can be done by inductively "pushing n to the right" using the equations below. If a, b, and c are distinct integers in {1, 2, . . . , n − 1}, then (a, n)(a, n) = I, (a, n)(b, n) = (a, b)(a, n), (a, n)(a, c) = (a, c)(c, n), and (a, n)(b, c) = (b, c)(a, n). Note that (a, n)(a, n) cannot occur here because it would result in a shorter odd product. (Now you may solve the tile puzzle on page viii.)
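Parity is easy to compute from property 2): decompose into disjoint cycles and use the fact that a k-cycle is a product of k − 1 transpositions. The following sketch is an illustration, not part of the text, again with permutations as dicts and the variable on the left.

```python
# Parity of a permutation via its disjoint cycle decomposition.
def cycle_decomposition(perm):
    """perm is a dict on {1, ..., n}; (a)f = perm[a]. Returns nontrivial cycles."""
    seen, cycles = set(), []
    for start in perm:
        if start in seen:
            continue
        cycle, a = [], start
        while a not in seen:
            seen.add(a)
            cycle.append(a)
            a = perm[a]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles

def is_even(perm):
    # A k-cycle contributes k - 1 transpositions; sum their parities.
    return sum(len(c) - 1 for c in cycle_decomposition(perm)) % 2 == 0

three_cycle = {1: 2, 2: 3, 3: 1}
assert cycle_decomposition(three_cycle) == [(1, 2, 3)]
assert is_even(three_cycle)             # a 3-cycle is even (property 5)
assert not is_even({1: 2, 2: 1, 3: 3})  # a transposition is odd
```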
Exercise

1) Write

    ( 1 2 3 4 5 6 7 )
    ( 6 5 4 3 1 7 2 )

as the product of disjoint cycles. Write (1,5,6,7)(2,3,4)(3,7,1) as the product of disjoint cycles. Write (3,7,1)(1,5,6,7)(2,3,4) as the product of disjoint cycles. Which of these permutations are odd and which are even?
2) Suppose (a₁, . . . , aₖ) and (c₁, . . . , cₗ) are disjoint cycles. What is the order of their product?

3) Suppose σ ∈ Sₙ. Show that σ⁻¹(1, 2, 3)σ = ((1)σ, (2)σ, (3)σ). This shows that conjugation by σ is just a type of relabeling. Also let τ = (4, 5, 6) and find τ⁻¹(1, 2, 3, 4, 5)τ.

4) Show that H = {σ ∈ S₆ : (6)σ = 6} is a subgroup of S₆ and find its right cosets and its left cosets.

5) Let A ⊂ R² be the square with vertices (−1, 1), (1, 1), (1, −1), and (−1, −1), and G be the collection of all isometries of A onto itself. We know from a previous exercise that G is a group with eight elements. It follows from Cayley's theorem that G is isomorphic to a subgroup of S₈. Show that G is isomorphic to a subgroup of S₄.

6) If G is a multiplicative group, define a new multiplication on the set G by a ◦ b = b·a. In other words, the new multiplication is the old multiplication in the opposite order. This defines a new group denoted by Gᵒᵖ, the opposite group. Show that it has the same identity and the same inverses as G, and that f : G → Gᵒᵖ defined by f(a) = a⁻¹ is a group isomorphism. Now consider the special case G = Sₙ. The convention used in this section is that an element of Sₙ is a permutation on {1, 2, . . . , n} with the variable written on the left. Show that an element of Sₙᵒᵖ is a permutation on {1, 2, . . . , n} with the variable written on the right. (Of course, either Sₙ or Sₙᵒᵖ may be called the symmetric group, depending on personal preference or context.)
Product of Groups

The product of groups is usually presented for multiplicative groups. It is presented here for additive groups because this is the form that occurs in later chapters. As an exercise, this section should be rewritten using multiplicative notation. The two theorems below are transparent and easy, but quite useful. For simplicity we first consider the product of two groups, although the case of infinite products is only slightly more difficult. For background, read first the two theorems on page 11.
Theorem Suppose G₁ and G₂ are additive groups. Define an addition on G₁ × G₂ by (a₁, a₂) + (b₁, b₂) = (a₁ + b₁, a₂ + b₂). This operation makes G₁ × G₂ into a group. Its "zero" is (0̄₁, 0̄₂) and −(a₁, a₂) = (−a₁, −a₂). The projections π₁ : G₁ × G₂ → G₁ and π₂ : G₁ × G₂ → G₂ are group homomorphisms. Suppose G is an additive group. We know there is a bijection from {functions f : G → G₁ × G₂} to {ordered pairs of functions (f₁, f₂) where f₁ : G → G₁ and f₂ : G → G₂}. Under this bijection, f is a group homomorphism iff each of f₁ and f₂ is a group homomorphism.
Proof It is transparent that the product of groups is a group, so let's prove the last part. Suppose G, G₁, and G₂ are groups and f = (f₁, f₂) is a function from G to G₁ × G₂. Now f(a + b) = (f₁(a + b), f₂(a + b)) and f(a) + f(b) = (f₁(a), f₂(a)) + (f₁(b), f₂(b)) = (f₁(a) + f₁(b), f₂(a) + f₂(b)). An examination of these two equations shows that f is a group homomorphism iff each of f₁ and f₂ is a group homomorphism.
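The componentwise operation is easy to sketch. The following illustration is not from the text; it takes Z₄ and Z₆ as hypothetical factors and checks the identity, inverses, and that the first projection respects addition.

```python
# Componentwise addition on G1 x G2, with G1 = Z_4 and G2 = Z_6.
def add(pair_a, pair_b, mods=(4, 6)):
    """(a1, a2) + (b1, b2) = (a1 + b1, a2 + b2), coordinate by coordinate."""
    return tuple((a + b) % m for a, b, m in zip(pair_a, pair_b, mods))

zero = (0, 0)
assert add((2, 3), zero) == (2, 3)       # (0, 0) is the identity
assert add((3, 5), (1, 1)) == (0, 0)     # (1, 1) is the inverse of (3, 5)
# The projection pi_1 is a homomorphism: first coordinate of a sum
# equals the sum of first coordinates (mod 4).
assert add((2, 3), (3, 4))[0] == (2 + 3) % 4
```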
Exercise Suppose G₁ and G₂ are groups. Show that G₁ × G₂ and G₂ × G₁ are isomorphic.

Exercise If o(a₁) = m and o(a₂) = n, find the order of (a₁, a₂) in G₁ × G₂.
Exercise Show that if G is any group of order 4, G is isomorphic to Z₄ or Z₂ × Z₂. Show Z₄ is not isomorphic to Z₂ × Z₂. Show Z₁₂ is isomorphic to Z₄ × Z₃. Finally, show that Zₘₙ is isomorphic to Zₘ × Zₙ iff (m, n) = 1.
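The last claim of the exercise can be checked numerically. This sketch is an illustration, not a proof: Zₘ × Zₙ is cyclic iff the element ([1], [1]) has order mn, and that order is the least k with k ≡ 0 mod m and mod n.

```python
from math import gcd

def order_of_one_one(m, n):
    """Order of ([1], [1]) in Z_m x Z_n, found by direct search."""
    k = 1
    while (k % m, k % n) != (0, 0):
        k += 1
    return k

assert order_of_one_one(4, 3) == 12   # (4, 3) = 1, so Z_12 ~ Z_4 x Z_3
assert order_of_one_one(2, 2) == 2    # (2, 2) = 2, so Z_4 is not Z_2 x Z_2
# In general the order is the least common multiple mn/(m, n).
assert order_of_one_one(4, 6) == (4 * 6) // gcd(4, 6)
```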
Exercise Suppose G₁ and G₂ are groups and i₁ : G₁ → G₁ × G₂ is defined by i₁(g₁) = (g₁, 0̄₂). Show i₁ is an injective group homomorphism and its image is a normal subgroup of G₁ × G₂. Usually G₁ is identified with its image under i₁, so G₁ may be considered to be a normal subgroup of G₁ × G₂. Let π₂ : G₁ × G₂ → G₂ be the projection map defined in the Background chapter. Show π₂ is a surjective homomorphism with kernel G₁. Therefore (G₁ × G₂)/G₁ ≈ G₂, as you would expect.

Exercise Let R be the reals under addition. Show that the addition in the product R × R is just the usual addition in analytic geometry.
Exercise Suppose n > 2. Is Sₙ isomorphic to Aₙ × G, where G is a multiplicative group of order 2?
One nice thing about the product of groups is that it works ﬁne for any ﬁnite
number, or even any inﬁnite number. The next theorem is stated in full generality.
Theorem Suppose T is an index set, and for any t ∈ T, Gₜ is an additive group. Define an addition on ∏_{t∈T} Gₜ = ∏ Gₜ by {aₜ} + {bₜ} = {aₜ + bₜ}. This operation makes the product into a group. Its "zero" is {0̄ₜ} and −{aₜ} = {−aₜ}. Each projection πₛ : ∏ Gₜ → Gₛ is a group homomorphism. Suppose G is an additive group. Under the natural bijection from {functions f : G → ∏ Gₜ} to {sequences of functions {fₜ}_{t∈T} where fₜ : G → Gₜ}, f is a group homomorphism iff each fₜ is a group homomorphism. Finally, the scalar multiplication on ∏ Gₜ by integers is given coordinatewise, i.e., {aₜ}n = {aₜn}.

Proof The addition on ∏ Gₜ is coordinatewise.
Exercise Suppose s is an element of T and πₛ : ∏ Gₜ → Gₛ is the projection map defined in the Background chapter. Show πₛ is a surjective homomorphism and find its kernel.

Exercise Suppose s is an element of T and iₛ : Gₛ → ∏ Gₜ is defined by iₛ(a) = {aₜ} where aₜ = 0̄ if t ≠ s and aₛ = a. Show iₛ is an injective homomorphism and its image is a normal subgroup of ∏ Gₜ. Thus each Gₛ may be considered to be a normal subgroup of ∏ Gₜ.
Exercise Let f : Z → Z₃₀ × Z₁₀₀ be the homomorphism defined by f(m) = ([4m], [3m]). Find the kernel of f. Find the order of ([4], [3]) in Z₃₀ × Z₁₀₀.

Exercise Let f : Z → Z₉₀ × Z₇₀ × Z₄₂ be the group homomorphism defined by f(m) = ([m], [m], [m]). Find the kernel of f and show that f is not surjective. Let g : Z → Z₄₅ × Z₃₅ × Z₂₁ be defined by g(m) = ([m], [m], [m]). Find the kernel of g and determine if g is surjective. Note that the gcd of {45, 35, 21} is 1. Now let h : Z → Z₈ × Z₉ × Z₃₅ be defined by h(m) = ([m], [m], [m]). Find the kernel of h and show that h is surjective. Finally suppose each of b, c, and d is greater than 1 and f : Z → Z_b × Z_c × Z_d is defined by f(m) = ([m], [m], [m]). Find necessary and sufficient conditions for f to be surjective (see the first exercise on page 18).
Exercise Suppose T is a nonvoid set, G is an additive group, and G^T is the collection of all functions f : T → G with addition defined by (f + g)(t) = f(t) + g(t). Show G^T is a group. For each t ∈ T, let Gₜ = G. Note that G^T is just another way of writing ∏_{t∈T} Gₜ. Also note that if T = [0, 1] and G = R, the addition defined on G^T is just the usual addition of functions used in calculus. (For the ring and module versions, see exercises on pages 44 and 69.)
Chapter 3
Rings
Rings are additive abelian groups with a second operation called multiplication. The
connection between the two operations is provided by the distributive law. Assuming
the results of Chapter 2, this chapter ﬂows smoothly. This is because ideals are also
normal subgroups and ring homomorphisms are also group homomorphisms. We do
not show that the polynomial ring F[x] is a unique factorization domain, although
with the material at hand, it would be easy to do. Also there is no mention of prime
or maximal ideals, because these concepts are unnecessary for our development of
linear algebra. These concepts are developed in the Appendix. A section on Boolean
rings is included because of their importance in logic and computer science.
Suppose R is an additive abelian group, R ≠ 0̄, and R has a second binary operation (i.e., a map from R × R to R) which is denoted by multiplication. Consider the following properties.

1) If a, b, c ∈ R, (a·b)·c = a·(b·c). (The associative property of multiplication.)

2) If a, b, c ∈ R, a·(b + c) = (a·b) + (a·c) and (b + c)·a = (b·a) + (c·a). (The distributive law, which connects addition and multiplication.)

3) R has a multiplicative identity, i.e., there is an element 1̄ = 1̄_R ∈ R such that if a ∈ R, a·1̄ = 1̄·a = a.

4) If a, b ∈ R, a·b = b·a. (The commutative property for multiplication.)
Deﬁnition If 1), 2), and 3) are satisﬁed, R is said to be a ring. If in addition 4)
is satisﬁed, R is said to be a commutative ring.
Examples The basic commutative rings in mathematics are the integers Z, the rational numbers Q, the real numbers R, and the complex numbers C. It will be shown later that Zₙ, the integers mod n, has a natural multiplication under which it is a commutative ring. Also if R is any commutative ring, we will define R[x₁, x₂, . . . , xₙ], a polynomial ring in n variables. Now suppose R is any ring, n ≥ 1, and Rₙ is the collection of all n × n matrices over R. In the next chapter, operations of addition and multiplication of matrices will be defined. Under these operations, Rₙ is a ring. This is a basic example of a noncommutative ring. If n > 1, Rₙ is never commutative, even if R is commutative.
The next two theorems show that ring multiplication behaves as you would wish
it to. They should be worked as exercises.
Theorem Suppose R is a ring and a, b ∈ R.

1) a·0̄ = 0̄·a = 0̄. Since R ≠ 0̄, it follows that 1̄ ≠ 0̄.

2) (−a)·b = a·(−b) = −(a·b).
Recall that, since R is an additive abelian group, it has a scalar multiplication
over Z (page 20). This scalar multiplication can be written on the right or left, i.e.,
na = an, and the next theorem shows it relates nicely to the ring multiplication.
Theorem Suppose a, b ∈ R and n, m ∈ Z.

1) (na)·(mb) = (nm)(a·b). (This follows from the distributive law and the previous theorem.)

2) Let n̄ = n1̄. For example, 2̄ = 1̄ + 1̄. Then na = n̄·a, that is, scalar multiplication by n is the same as ring multiplication by n̄. Of course, n̄ may be 0̄ even though n ≠ 0.
Units

Definition An element a of a ring R is a unit provided ∃ an element a⁻¹ ∈ R with a·a⁻¹ = a⁻¹·a = 1̄.

Theorem 0̄ can never be a unit. 1̄ is always a unit. If a is a unit, a⁻¹ is also a unit with (a⁻¹)⁻¹ = a. The product of units is a unit with (a·b)⁻¹ = b⁻¹·a⁻¹. More generally, if a₁, a₂, ..., aₙ are units, then their product is a unit with (a₁·a₂ ⋯ aₙ)⁻¹ = aₙ⁻¹·aₙ₋₁⁻¹ ⋯ a₁⁻¹. The set of all units of R forms a multiplicative group denoted by R*. Finally if a is a unit, (−a) is a unit and (−a)⁻¹ = −(a⁻¹).
In order for a to be a unit, it must have a two-sided inverse. It suffices to require a left inverse and a right inverse, as shown in the next theorem.

Theorem Suppose a ∈ R and ∃ elements b and c with b·a = a·c = 1̄. Then b = c and so a is a unit with a⁻¹ = b = c.

Proof b = b·1̄ = b·(a·c) = (b·a)·c = 1̄·c = c.
Corollary Inverses are unique.
Domains and Fields In order to define these two types of rings, we first consider the concept of zero divisor.

Definition Suppose R is a commutative ring. An element a ∈ R is called a zero divisor provided it is nonzero and ∃ a nonzero element b with a·b = 0̄. Note that if a is a unit, it cannot be a zero divisor.

Theorem Suppose R is a commutative ring and a ∈ (R − 0̄) is not a zero divisor. Then (a·b = a·c) ⇒ b = c. In other words, multiplication by a is an injective map from R to R. It is surjective iff a is a unit.

Definition A domain (or integral domain) is a commutative ring such that, if a ≠ 0̄, a is not a zero divisor. A field is a commutative ring such that, if a ≠ 0̄, a is a unit. In other words, R is a field if it is commutative and its nonzero elements form a group under multiplication.

Theorem A field is a domain. A finite domain is a field.

Proof A field is a domain because a unit cannot be a zero divisor. Suppose R is a finite domain and a ≠ 0̄. Then f : R → R defined by f(b) = a·b is injective and, by the pigeonhole principle, f is surjective. Thus a is a unit and so R is a field.
Exercise Let C be the additive abelian group R². Define multiplication by (a, b)·(c, d) = (ac − bd, ad + bc). Show C is a commutative ring which is a field. Note that 1̄ = (1, 0) and if i = (0, 1), then i² = −1̄.
Examples Z is a domain. Q, R, and C are ﬁelds.
The Integers Mod n

The concept of integers mod n is fundamental in mathematics. It leads to a neat little theory, as seen by the theorems below. However, the basic theory cannot be completed until the product of rings is defined. (See the Chinese Remainder Theorem on page 50.) We know from page 27 that Zₙ is an additive abelian group.

Theorem Suppose n > 1. Define a multiplication on Zₙ by [a]·[b] = [ab]. This is a well-defined binary operation which makes Zₙ into a commutative ring.

Proof Since [a + kn]·[b + ln] = [ab + n(al + bk + kln)] = [ab], the multiplication is well-defined. The ring axioms are easily verified.
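Well-definedness here means the product class is independent of the representatives chosen. A quick numerical illustration (not from the text), taking n = 12 as a hypothetical modulus:

```python
# Multiplication on Z_n via representatives: [a][b] = [ab].
n = 12
def mul(a, b):
    return (a * b) % n

# [a] = [a + kn] and [b] = [b + ln] give the same product class,
# matching the identity [a + kn][b + ln] = [ab] in the proof above.
a, b = 7, 5
for k in range(-3, 4):
    for l in range(-3, 4):
        assert mul(a + k * n, b + l * n) == mul(a, b)
assert mul(7, 5) == 11   # [7][5] = [35] = [11] in Z_12
```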
Theorem Suppose n > 1 and a ∈ Z. Then the following are equivalent.

1) [a] is a generator of the additive group Zₙ.

2) (a, n) = 1.

3) [a] is a unit of the ring Zₙ.

Proof We already know from page 27 that 1) and 2) are equivalent. Recall that if b is an integer, [a]b = [a]·[b] = [ab]. Thus 1) and 3) are equivalent, because each says ∃ an integer b with [a]b = [1].
Corollary If n > 1, the following are equivalent.

1) Zₙ is a domain.

2) Zₙ is a field.

3) n is a prime.

Proof We already know 1) and 2) are equivalent, because Zₙ is finite. Suppose 3) is true. Then by the previous theorem, each of [1], [2], ..., [n − 1] is a unit, and thus 2) is true. Now suppose 3) is false. Then n = ab where 1 < a < n, 1 < b < n, [a][b] = [0], and thus [a] is a zero divisor and 1) is false.
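The theorem and corollary can be illustrated by listing units directly. This is a sketch, not from the text: it uses the criterion (a, n) = 1 for [a] to be a unit.

```python
from math import gcd

# [a] is a unit of Z_n iff (a, n) = 1; Z_n is a field iff n is prime.
def units(n):
    return [a for a in range(1, n) if gcd(a, n) == 1]

assert units(7) == [1, 2, 3, 4, 5, 6]   # every nonzero class: Z_7 is a field
assert units(12) == [1, 5, 7, 11]       # Z_12 is not a field
# As in the corollary's proof, 12 = 2 * 6 makes [2] a zero divisor.
assert (2 * 6) % 12 == 0
```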
Exercise List the units and their inverses for Z₇ and Z₁₂. Show that (Z₇)* is a cyclic group but (Z₁₂)* is not. Show that in Z₁₂ the equation x² = 1̄ has four solutions. Finally show that if R is a domain, x² = 1̄ can have at most two solutions in R (see the first theorem on page 46).
Subrings Suppose S is a subset of a ring R. The statement that S is a subring of R means that S is a subgroup of the group R, 1̄ ∈ S, and (a, b ∈ S ⇒ a·b ∈ S). Then clearly S is a ring and has the same multiplicative identity as R. Note that Z is a subring of Q, Q is a subring of R, and R is a subring of C. Subrings do not play a role analogous to subgroups. That role is played by ideals, and an ideal is never a subring (unless it is the entire ring). Note that if S is a subring of R and s ∈ S, then s may be a unit in R but not in S. Note also that Z and Zₙ have no proper subrings, and thus occupy a special place in ring theory, as well as in group theory.
Ideals and Quotient Rings

Ideals in ring theory play a role analogous to normal subgroups in group theory.

Definition A subset I of a ring R is a left (respectively right, 2-sided) ideal provided it is a subgroup of the additive group R and, if a ∈ R and b ∈ I, then a·b ∈ I (respectively b·a ∈ I, both a·b and b·a ∈ I). The word "ideal" means "2-sided ideal". Of course, if R is commutative, every right or left ideal is an ideal.
Theorem Suppose R is a ring.

1) R and 0̄ are ideals of R. These are called the improper ideals.

2) If {Iₜ}_{t∈T} is a collection of right (left, 2-sided) ideals of R, then ∩_{t∈T} Iₜ is a right (left, 2-sided) ideal of R. (See page 22.)

3) Furthermore, if the collection is monotonic, then ∪_{t∈T} Iₜ is a right (left, 2-sided) ideal of R.

4) If a ∈ R, I = aR is a right ideal. Thus if R is commutative, aR is an ideal, called a principal ideal. Thus every subgroup of Z is a principal ideal, because it is of the form nZ.

5) If R is a commutative ring and I ⊂ R is an ideal, then the following are equivalent.

i) I = R.

ii) I contains some unit u.

iii) I contains 1̄.
Exercise Suppose R is a commutative ring. Show that R is a ﬁeld iﬀ R contains
no proper ideals.
The following theorem is just an observation, but it is in some sense the beginning
of ring theory.
Theorem Suppose R is a ring and I ⊂ R is an ideal, I ≠ R. Since I is a normal subgroup of the additive group R, R/I is an additive abelian group. Multiplication of cosets defined by (a + I)·(b + I) = (ab + I) is well-defined and makes R/I a ring.

Proof (a + I)·(b + I) = a·b + aI + Ib + II ⊂ a·b + I. Thus multiplication is well-defined, and the ring axioms are easily verified. The multiplicative identity is (1̄ + I).

Observation If R = Z, n > 1, and I = nZ, the ring structure on Zₙ = Z/nZ is the same as the one previously defined.
Homomorphisms

Definition Suppose R and R̄ are rings. A function f : R → R̄ is a ring homomorphism provided

1) f is a group homomorphism,

2) f(1̄_R) = 1̄_R̄, and

3) if a, b ∈ R then f(a·b) = f(a)·f(b). (On the left, multiplication is in R, while on the right multiplication is in R̄.)

The kernel of f is the kernel of f considered as a group homomorphism, namely ker(f) = f⁻¹(0̄).
Here is a list of the basic properties of ring homomorphisms. Much of this
work has already been done by the theorem in group theory on page 28.
Theorem Suppose each of R and R̄ is a ring.

1) The identity map I_R : R → R is a ring homomorphism.

2) The zero map from R to R̄ is not a ring homomorphism (because it does not send 1̄_R to 1̄_R̄).

3) The composition of ring homomorphisms is a ring homomorphism.

4) If f : R → R̄ is a bijection which is a ring homomorphism, then f⁻¹ : R̄ → R is a ring homomorphism. Such an f is called a ring isomorphism. In the case R = R̄, f is also called a ring automorphism.

5) The image of a ring homomorphism is a subring of the range.

6) The kernel of a ring homomorphism is an ideal of the domain. In fact, if f : R → R̄ is a homomorphism and I ⊂ R̄ is an ideal, then f⁻¹(I) is an ideal of R.

7) Suppose I is an ideal of R, I ≠ R, and π : R → R/I is the natural projection, π(a) = (a + I). Then π is a surjective ring homomorphism with kernel I. Furthermore, if f : R → R̄ is a surjective ring homomorphism with kernel I, then R/I ≈ R̄ (see below).
8) From now on the word "homomorphism" means "ring homomorphism". Suppose f : R → R̄ is a homomorphism and I is an ideal of R, I ≠ R. If I ⊂ ker(f), then f̄ : R/I → R̄ defined by f̄(a + I) = f(a) is a well-defined homomorphism making the following diagram commute.

[commutative diagram: f = f̄ ◦ π, with f : R → R̄, π : R → R/I, and f̄ : R/I → R̄]

Thus defining a homomorphism on a quotient ring is the same as defining a homomorphism on the numerator which sends the denominator to zero. The image of f̄ is the image of f, and the kernel of f̄ is ker(f)/I. Thus if I = ker(f), f̄ is injective, and so R/I ≈ image(f).

Proof We know all this on the group level, and it is only necessary to check that f̄ is a ring homomorphism, which is obvious.

9) Given any ring homomorphism f, domain(f)/ker(f) ≈ image(f).
Exercise Find a ring R with a proper ideal I and an element b such that b is not a unit in R but (b + I) is a unit in R/I.

Exercise Show that if u is a unit in a ring R, then conjugation by u is an automorphism on R. That is, show that f : R → R defined by f(a) = u⁻¹·a·u is a ring homomorphism which is an isomorphism.
Exercise Suppose T is a nonvoid set, R is a ring, and R^T is the collection of all functions f : T → R. Define addition and multiplication on R^T pointwise. This means if f and g are functions from T to R, then (f + g)(t) = f(t) + g(t) and (f·g)(t) = f(t)g(t). Show that under these operations R^T is a ring. Suppose S is a nonvoid set and α : S → T is a function. If f : T → R is a function, define a function α*(f) : S → R by α*(f) = f ◦ α. Show α* : R^T → R^S is a ring homomorphism.

Exercise Now consider the case T = [0, 1] and R = R. Let A ⊂ R^[0,1] be the collection of all C^∞ functions, i.e., A = {f : [0, 1] → R : f has an infinite number of derivatives}. Show A is a ring. Notice that much of the work has been done in the previous exercise. It is only necessary to show that A is a subring of the ring R^[0,1].
Polynomial Rings

In calculus, we consider real functions f which are polynomials, f(x) = a₀ + a₁x + ⋯ + aₙxⁿ. The sum and product of polynomials are again polynomials, and it is easy to see that the collection of polynomial functions forms a commutative ring. We can do the same thing formally in a purely algebraic setting.
Definition Suppose R is a commutative ring and x is a "variable" or "symbol". The polynomial ring R[x] is the collection of all polynomials f = a₀ + a₁x + ⋯ + aₙxⁿ where aᵢ ∈ R. Under the obvious addition and multiplication, R[x] is a commutative ring. The degree of a nonzero polynomial f is the largest integer n such that aₙ ≠ 0̄, and is denoted by n = deg(f). If the top term aₙ = 1̄, then f is said to be monic.
To be more formal, think of a polynomial a₀ + a₁x + ⋯ as an infinite sequence (a₀, a₁, ...) such that each aᵢ ∈ R and only a finite number are nonzero. Then

(a₀, a₁, ...) + (b₀, b₁, ...) = (a₀ + b₀, a₁ + b₁, ...) and

(a₀, a₁, ...)·(b₀, b₁, ...) = (a₀b₀, a₀b₁ + a₁b₀, a₀b₂ + a₁b₁ + a₂b₀, ...).

Note that on the right, the ring multiplication a·b is written simply as ab, as is often done for convenience.
Theorem If R is a domain, R[x] is also a domain.

Proof Suppose f and g are nonzero polynomials. Then deg(f) + deg(g) = deg(fg) and thus fg is not 0̄. Another way to prove this theorem is to look at the bottom terms instead of the top terms. Let aᵢxⁱ and bⱼxʲ be the first nonzero terms of f and g. Then aᵢbⱼxⁱ⁺ʲ is the first nonzero term of fg.
Theorem (The Division Algorithm) Suppose R is a commutative ring, f ∈ R[x] has degree ≥ 1 and its top coefficient is a unit in R. (If R is a field, the top coefficient of f will always be a unit.) Then for any g ∈ R[x], ∃! h, r ∈ R[x] such that g = fh + r with r = 0̄ or deg(r) < deg(f).

Proof This theorem states the existence and uniqueness of polynomials h and r. We outline the proof of existence and leave uniqueness as an exercise. Suppose f = a₀ + a₁x + ⋯ + aₘxᵐ where m ≥ 1 and aₘ is a unit in R. For any g with deg(g) < m, set h = 0̄ and r = g. For the general case, the idea is to divide f into g until the remainder has degree less than m. The proof is by induction on the degree of g. Suppose n ≥ m and the result holds for any polynomial of degree less than n. Suppose g is a polynomial of degree n. Now ∃ a monomial bxᵗ with t = n − m and deg(g − fbxᵗ) < n. By induction, ∃ h₁ and r with fh₁ + r = (g − fbxᵗ) and deg(r) < m. The result follows from the equation f(h₁ + bxᵗ) + r = g.
Note If r = 0̄ we say that f divides g. Note that f = x − c divides g iff c is a root of g, i.e., g(c) = 0̄. More generally, x − c divides g with remainder g(c).
Theorem Suppose R is a domain, n > 0, and g(x) = a₀ + a₁x + ⋯ + aₙxⁿ is a polynomial of degree n with at least one root in R. Then g has at most n roots. Let c₁, c₂, .., cₖ be the distinct roots of g in the ring R. Then ∃ a unique sequence of positive integers n₁, n₂, .., nₖ and a unique polynomial h with no root in R so that g(x) = (x − c₁)^{n₁} ⋯ (x − cₖ)^{nₖ} h(x). (If h has degree 0, i.e., if h = aₙ, then we say "all the roots of g belong to R". If g = aₙxⁿ, we say "all the roots of g are 0̄".)

Proof Uniqueness is easy so let's prove existence. The theorem is clearly true for n = 1. Suppose n > 1 and the theorem is true for any polynomial of degree less than n. Now suppose g is a polynomial of degree n and c₁ is a root of g. Then ∃ a polynomial h₁ with g(x) = (x − c₁)h₁. Since h₁ has degree less than n, the result follows by induction.
Note If g is any nonconstant polynomial in C[x], all the roots of g belong to C,
i.e., C is an algebraically closed ﬁeld. This is called The Fundamental Theorem of
Algebra, and it is assumed without proof for this textbook.
Exercise Suppose g is a nonconstant polynomial in R[x]. Show that if g has odd degree then it has a real root. Also show that if g(x) = x² + bx + c, then it has a real root iff b² ≥ 4c, and in that case both roots belong to R.
Definition A domain T is a principal ideal domain (PID) if, given any ideal I, ∃ t ∈ T such that I = tT. Note that Z is a PID and any field is a PID.
Theorem Suppose F is a field, I is a proper ideal of F[x], and n is the smallest positive integer such that I contains a polynomial of degree n. Then I contains a unique polynomial of the form f = a₀ + a₁x + ⋯ + aₙ₋₁xⁿ⁻¹ + xⁿ and it has the property that I = fF[x]. Thus F[x] is a PID. Furthermore, each coset of I can be written uniquely in the form (c₀ + c₁x + ⋯ + cₙ₋₁xⁿ⁻¹ + I).

Proof This is a good exercise in the use of the division algorithm. Note this is similar to showing that a subgroup of Z is generated by one element (see page 15).
Theorem Suppose R is a subring of a commutative ring C and c ∈ C. Then ∃! homomorphism h : R[x] → C with h(x) = c and h(r) = r for all r ∈ R. It is defined by h(a₀ + a₁x + ⋯ + aₙxⁿ) = a₀ + a₁c + ⋯ + aₙcⁿ, i.e., h sends f(x) to f(c). The image of h is the smallest subring of C containing R and c.
This map h is called an evaluation map. The theorem says that adding two
polynomials in R[x] and evaluating is the same as evaluating and then adding in C.
Also multiplying two polynomials in R[x] and evaluating is the same as evaluating
and then multiplying in C. In street language the theorem says you are free to send
x wherever you wish and extend to a ring homomorphism on R[x].
Exercise Let C = {a + bi : a, b ∈ R}. Since R is a subring of C, there exists a
homomorphism h : R[x] → C which sends x to i, and this h is surjective. Show
ker(h) = (x^2 + 1)R[x] and thus R[x]/(x^2 + 1) ≈ C. This is a good way to look
at the complex numbers, i.e., to obtain C, adjoin x to R and set x^2 = −1.
Exercise Z_2[x]/(x^2 + x + 1) has 4 elements. Write out the multiplication table
for this ring and show that it is a field.
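This exercise can be checked mechanically. The following Python sketch is an aside, not part of the text; the encoding of a + bx as a pair (a, b) is our own convention. It builds the multiplication table using the relation x^2 = x + 1, which holds in the quotient since x^2 + x + 1 = 0̄ there.

```python
# Sketch: multiplication in Z_2[x]/(x^2 + x + 1).
# An element a + bx is stored as the pair (a, b); arithmetic is mod 2,
# and x^2 is replaced by x + 1 (since x^2 + x + 1 = 0 in the quotient).

def mul(p, q):
    a, b = p
    c, d = q
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, then x^2 -> x + 1
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

elements = [(0, 0), (1, 0), (0, 1), (1, 1)]          # 0, 1, x, 1 + x
table = {(p, q): mul(p, q) for p in elements for q in elements}
```

Every nonzero element turns out to have a multiplicative inverse in the table, which is the point of the exercise.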
Exercise Show that, if R is a domain, the units of R[x] are just the units of R.
Thus if F is a field, the units of F[x] are the nonzero constants. Show that [1] + [2]x
is a unit in Z_4[x].
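For the last part, note (1 + 2x)^2 = 1 + 4x + 4x^2 = 1 in Z_4[x], so [1] + [2]x is its own inverse. A short Python sketch (an aside; polynomials are coefficient lists mod 4, lowest degree first) confirms this:

```python
# Sketch: check that 1 + 2x is its own inverse in Z_4[x].

def poly_mul_mod4(f, g):
    result = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            result[i + j] = (result[i + j] + a * b) % 4
    while len(result) > 1 and result[-1] == 0:   # drop trailing zeros
        result.pop()
    return result

square = poly_mul_mod4([1, 2], [1, 2])   # (1 + 2x)^2, reduced mod 4
```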
In this chapter we do not prove F[x] is a unique factorization domain, nor do
we even deﬁne unique factorization domain. The next deﬁnition and theorem are
included merely for reference, and should not be studied at this stage.
Deﬁnition Suppose F is a ﬁeld and f ∈ F[x] has degree ≥ 1. The statement
that g is an associate of f means ∃ a unit u ∈ F[x] such that g = uf. The statement
that f is irreducible means that if h is a nonconstant polynomial which divides f,
then h is an associate of f.
We do not develop the theory of F[x] here. However, the development is easy
because it corresponds to the development of Z in Chapter 1. The Division
Algorithm corresponds to the Euclidean Algorithm. Irreducible polynomials correspond
to prime integers. The degree function corresponds to the absolute value function.
One difference is that the units of F[x] are nonzero constants, while the units of Z
are just ±1. Thus the associates of f are all cf with c ≠ 0̄, while the associates of an
integer n are just ±n. Here is the basic theorem. (This theory is developed in full in
the Appendix under the topic of Euclidean domains.)
Theorem Suppose F is a ﬁeld and f ∈ F[x] has degree ≥ 1. Then f factors as the
product of irreducibles, and this factorization is unique up to order and associates.
Also the following are equivalent.
1) F[x]/(f) is a domain.
2) F[x]/(f) is a ﬁeld.
3) f is irreducible.
Definition Now suppose x and y are "variables". If a ∈ R and n, m ≥ 0, then
ax^n y^m = ay^m x^n is called a monomial. Define an element of R[x, y] to be any finite
sum of monomials.
Theorem R[x, y] is a commutative ring and (R[x])[y] ≈ R[x, y] ≈ (R[y])[x]. In
other words, any polynomial in x and y with coeﬃcients in R may be written as a
polynomial in y with coeﬃcients in R[x], or as a polynomial in x with coeﬃcients in
R[y].
Side Comment It is true that if F is a field, each f ∈ F[x, y] factors as the
product of irreducibles. However F[x, y] is not a PID. For example, the ideal
I = xF[x, y] + yF[x, y] = {f ∈ F[x, y] : f(0̄, 0̄) = 0̄} is not principal.
If R is a commutative ring and n ≥ 2, the concept of a polynomial ring in
n variables works fine without a hitch. If a ∈ R and v_1, v_2, ..., v_n are nonnegative
integers, then ax_1^{v_1} x_2^{v_2} ··· x_n^{v_n} is called a monomial. Order does not matter here.
Define an element of R[x_1, x_2, ..., x_n] to be any finite sum of monomials. This
gives a commutative ring and there is a canonical isomorphism R[x_1, x_2, ..., x_n] ≈
(R[x_1, x_2, ..., x_{n−1}])[x_n]. Using this and induction on n, it is easy to prove the
following theorem.
Theorem If R is a domain, R[x_1, x_2, ..., x_n] is a domain and its units are just the
units of R.
Exercise Suppose R is a commutative ring and f : R[x, y] → R[x] is the
evaluation map which sends y to 0̄. This means f(p(x, y)) = p(x, 0̄). Show f is a ring
homomorphism whose kernel is the ideal (y) = yR[x, y]. Use the fact that "the
domain mod the kernel is isomorphic to the image" to show R[x, y]/(y) is isomorphic
to R[x]. That is, if you adjoin y to R[x] and then factor it out, you get R[x] back.
Product of Rings
The product of rings works ﬁne, just as does the product of groups.
Theorem Suppose T is an index set and for each t ∈ T, R_t is a ring. On the
additive abelian group ∏_{t∈T} R_t = ∏ R_t, define multiplication by {r_t}{s_t} = {r_t s_t}.
Then ∏ R_t is a ring and each projection π_s : ∏ R_t → R_s is a ring homomorphism.
Suppose R is a ring. Under the natural bijection from {functions f : R → ∏ R_t}
to {sequences of functions {f_t}_{t∈T} where f_t : R → R_t}, f is a ring homomorphism
iff each f_t is a ring homomorphism.
Proof We already know f is a group homomorphism iff each f_t is a group
homomorphism (see page 36). Note that {1̄_t} is the multiplicative identity of ∏ R_t, and
f(1̄_R) = {1̄_t} iff f_t(1̄_R) = 1̄_t for each t ∈ T. Finally, since multiplication is defined
coordinatewise, f is a ring homomorphism iff each f_t is a ring homomorphism.
Exercise Suppose R and S are rings. Note that R × 0̄ is not a subring of R × S
because it does not contain (1̄_R, 1̄_S). Show R × 0̄ is an ideal and (R × S)/(R × 0̄) ≈ S.
Suppose I ⊂ R and J ⊂ S are ideals. Show I × J is an ideal of R × S and every
ideal of R × S is of this form.
Exercise Suppose R and S are commutative rings. Show T = R × S is not a
domain. Let e = (1, 0) ∈ R × S and show e^2 = e, (1 − e)^2 = (1 − e), R × 0 = eT,
and 0 × S = (1 − e)T.
Exercise If T is any ring, an element e of T is called an idempotent provided
e^2 = e. The elements 0 and 1 are idempotents called the trivial idempotents. Suppose
T is a commutative ring and e ∈ T is an idempotent with 0 ≠ e ≠ 1. Let R = eT
and S = (1 − e)T. Show each of the ideals R and S is a ring with identity, and
f : T → R × S defined by f(t) = (et, (1 − e)t) is a ring isomorphism. This shows that
a commutative ring T splits as the product of two rings iff it contains a nontrivial
idempotent.
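A concrete instance, checked in Python (an aside; the choice e = 3 in Z_6 is our own example — it is a nontrivial idempotent since 3 · 3 = 9 = 3 in Z_6):

```python
# Sketch: the idempotent e = 3 splits Z_6 as eT x (1 - e)T.

T = range(6)
e = 3
assert (e * e) % 6 == e                        # e is idempotent

R = sorted({(e * t) % 6 for t in T})           # eT
S = sorted({((1 - e) * t) % 6 for t in T})     # (1 - e)T

# f(t) = (et, (1 - e)t); distinct values on all 6 elements means bijection
f = {t: ((e * t) % 6, ((1 - e) * t) % 6) for t in T}
```

Here eT has 2 elements and (1 − e)T has 3, so the splitting mirrors Z_6 ≈ Z_2 × Z_3.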
The Chinese Remainder Theorem
The natural map from Z to Z_m × Z_n is a group homomorphism and also a ring
homomorphism. If m and n are relatively prime, this map is surjective with kernel
mnZ, and thus Z_{mn} and Z_m × Z_n are isomorphic as groups and as rings. The next
theorem is a classical generalization of this. (See exercise three on page 35.)
Theorem Suppose n_1, ..., n_t are integers, each n_i > 1, and (n_i, n_j) = 1 for all
i ≠ j. Let f_i : Z → Z_{n_i} be defined by f_i(a) = [a]. (Note that the bracket symbol is
used ambiguously.) Then the ring homomorphism f = (f_1, ..., f_t) : Z → Z_{n_1} × ··· × Z_{n_t}
is surjective. Furthermore, the kernel of f is nZ, where n = n_1 n_2 ··· n_t. Thus Z_n
and Z_{n_1} × ··· × Z_{n_t} are isomorphic as rings, and thus also as groups.
Proof We wish to show that the order of f(1) is n, and thus f(1) is a group
generator, and thus f is surjective. The element f(1)m = ([1], ..., [1])m = ([m], ..., [m])
is zero iff m is a multiple of each of n_1, ..., n_t. Since their least common multiple is n,
the order of f(1) is n. (See the fourth exercise on page 36 for the case t = 3.)
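The case t = 2 with n_1 = 4 and n_2 = 9 can be checked directly; the following Python sketch (an aside) verifies that a → ([a], [a]) is a bijection from Z_36 to Z_4 × Z_9:

```python
# Sketch: the map a -> (a mod 4, a mod 9) from Z_36 to Z_4 x Z_9
# is a bijection, illustrating the Chinese Remainder Theorem for t = 2.

moduli = (4, 9)      # pairwise relatively prime
n = 36               # their product

images = {a: tuple(a % m for m in moduli) for a in range(n)}
surjective = len(set(images.values())) == n   # 36 distinct pairs = onto
```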
Exercise Show that if a is an integer and p is a prime, then [a] = [a^p] in Z_p
(Fermat's Little Theorem). Use this and the Chinese Remainder Theorem to show
that if b is a positive integer, it has the same last digit as b^5.
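The conclusion of the exercise can at least be tested numerically; this Python sketch (an aside — a check, not a proof) compares last digits:

```python
# Sketch: every positive integer b has the same last digit as b^5,
# as the exercise asserts via Fermat's Little Theorem and the CRT.

def same_last_digit(b):
    return b % 10 == (b ** 5) % 10

all_match = all(same_last_digit(b) for b in range(1, 1000))
```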
Characteristic
The following theorem is just an observation, but it shows that in ring theory, the
ring of integers is a “cornerstone”.
Theorem If R is a ring, there is one and only one ring homomorphism f : Z → R.
It is given by f(m) = m1̄ = m̄. Thus the subgroup of R generated by 1̄ is a subring
of R isomorphic to Z or isomorphic to Z_n for some positive integer n.
Definition Suppose R is a ring and f : Z → R is the natural ring homomorphism
f(m) = m1̄ = m̄. The nonnegative integer n with ker(f) = nZ is called the
characteristic of R. Thus f is injective iff R has characteristic 0 iff 1̄ has infinite
order. If f is not injective, the characteristic of R is the order of 1̄.
It is an interesting fact that, if R is a domain, all the nonzero elements of R
have the same order. (See page 23 for the deﬁnition of order.)
Theorem Suppose R is a domain. If R has characteristic 0, then each nonzero
a ∈ R has infinite order. If R has finite characteristic n, then n is a prime and each
nonzero a ∈ R has order n.
Proof Suppose R has characteristic 0, a is a nonzero element of R, and m is a
positive integer. Then ma = m̄a cannot be 0̄ because m̄, a ≠ 0̄ and R is a domain.
Thus o(a) = ∞. Now suppose R has characteristic n. Then R contains Z_n as a
subring, and thus Z_n is a domain and n is a prime. If a is a nonzero element of R,
na = n̄a = 0̄a = 0̄ and thus o(a)|n and thus o(a) = n.
Exercise Show that if F is a field of characteristic 0, F contains Q as a subring.
That is, show that the injective homomorphism f : Z → F extends to an injective
homomorphism f̄ : Q → F.
Boolean Rings
This section is not used elsewhere in this book. However it ﬁts easily here, and is
included for reference.
Definition A ring R is a Boolean ring if for each a ∈ R, a^2 = a, i.e., each
element of R is an idempotent.
Theorem Suppose R is a Boolean ring.
1) R has characteristic 2. If a ∈ R, 2a = a + a = 0̄, and so a = −a.
Proof (a + a) = (a + a)^2 = a^2 + 2a^2 + a^2 = 4a. Thus 2a = 0̄.
2) R is commutative.
Proof (a + b) = (a + b)^2 = a^2 + ab + ba + b^2
= a + ab + ba + b. Thus ab = −ba = ba.
3) If R is a domain, R ≈ Z_2.
Proof Suppose a ≠ 0̄. Then a(1̄ − a) = 0̄ and so a = 1̄.
4) The image of a Boolean ring is a Boolean ring. That is, if I is an ideal
of R with I ≠ R, then every element of R/I is idempotent and thus
R/I is a Boolean ring. It follows from 3) that R/I is a domain iff R/I
is a field iff R/I ≈ Z_2. (In the language of Chapter 6, I is a prime
ideal iff I is a maximal ideal iff R/I ≈ Z_2.)
Suppose X is a nonvoid set. If a is a subset of X, let a′ = (X − a) be the complement
of a in X. Now suppose R is a nonvoid collection of subsets of X. Consider the
following properties which the collection R may possess.
1) a ∈ R ⇒ a′ ∈ R.
2) a, b ∈ R ⇒ (a ∩ b) ∈ R.
3) a, b ∈ R ⇒ (a ∪ b) ∈ R.
4) ∅ ∈ R and X ∈ R.
Theorem If 1) and 2) are satisfied, then 3) and 4) are satisfied. In this case, R
is called a Boolean algebra of sets.
Proof Suppose 1) and 2) are true, and a, b ∈ R. Then a ∪ b = (a′ ∩ b′)′ belongs to
R and so 3) is true. Since R is nonvoid, it contains some element a. Then ∅ = a ∩ a′
and X = a ∪ a′ belong to R, and so 4) is true.
Theorem Suppose R is a Boolean algebra of sets. Define an addition on R by
a + b = (a ∪ b) − (a ∩ b). Under this addition, R is an abelian group with 0̄ = ∅ and
a = −a. Define a multiplication on R by ab = a ∩ b. Under this multiplication R
becomes a Boolean ring with 1̄ = X.
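The ring identities asserted here can be spot-checked in Python (an aside; the three-element X is our own example, with frozensets standing for the elements of R):

```python
# Sketch: for subsets of X, symmetric difference as addition and
# intersection as multiplication satisfy the Boolean ring identities.

from itertools import combinations

X = frozenset({1, 2, 3})
subsets = [frozenset(c) for r in range(len(X) + 1)
           for c in combinations(sorted(X), r)]

def add(a, b):       # a + b = (a U b) - (a ∩ b)
    return (a | b) - (a & b)

def mul(a, b):       # a * b = a ∩ b
    return a & b

idempotent = all(mul(a, a) == a for a in subsets)           # a^2 = a
char_two = all(add(a, a) == frozenset() for a in subsets)   # a + a = 0
```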
Exercise Let X = {1, 2, ..., n} and let R be the Boolean ring of all subsets of
X. Note that o(R) = 2^n. Define f_i : R → Z_2 by f_i(a) = [1] iff i ∈ a. Show each
f_i is a homomorphism and thus f = (f_1, ..., f_n) : R → Z_2 × Z_2 × ··· × Z_2 is a ring
homomorphism. Show f is an isomorphism. (See exercises 1) and 4) on page 12.)
Exercise Use the last exercise on page 49 to show that any finite Boolean ring is
isomorphic to Z_2 × Z_2 × ··· × Z_2, and thus also to the Boolean ring of subsets above.
Note Suppose R is a Boolean ring. It is a classical theorem that ∃ a Boolean
algebra of sets whose Boolean ring is isomorphic to R. So let’s just suppose R is
a Boolean algebra of sets which is a Boolean ring with addition and multiplication
deﬁned as above. Now deﬁne a ∨ b = a ∪ b and a ∧ b = a ∩ b. These operations cup
and cap are associative, commutative, have identity elements, and each distributes
over the other. With these two operations (along with complement), R is called a
Boolean algebra. R is not a group under cup or cap. Anyway, it is a classical fact
that, if you have a Boolean ring (algebra), you have a Boolean algebra (ring). The
advantage of the algebra is that it is symmetric in cup and cap. The advantage of
the ring viewpoint is that you can draw from the rich theory of commutative rings.
Chapter 4
Matrices and Matrix Rings
We ﬁrst consider matrices in full generality, i.e., over an arbitrary ring R. However,
after the ﬁrst few pages, it will be assumed that R is commutative. The topics,
such as invertible matrices, transpose, elementary matrices, systems of equations,
and determinant, are all classical. The highlight of the chapter is the theorem that a
square matrix is a unit in the matrix ring iﬀ its determinant is a unit in the ring.
This chapter concludes with the theorem that similar matrices have the same
determinant, trace, and characteristic polynomial. This will be used in the next chapter
to show that an endomorphism on a finitely generated vector space has a well-defined
determinant, trace, and characteristic polynomial.
Definition Suppose R is a ring and m and n are positive integers. Let R_{m,n} be
the collection of all m × n matrices

A = (a_{i,j}) = [ a_{1,1} ··· a_{1,n} ]
                [    ⋮           ⋮   ]
                [ a_{m,1} ··· a_{m,n} ]

where each entry a_{i,j} ∈ R.
A matrix may be viewed as m n-dimensional row vectors or as n m-dimensional
column vectors. A matrix is said to be square if it has the same number of rows
as columns. Square matrices are so important that they have a special notation,
R_n = R_{n,n}. R^n is defined to be the additive abelian group R × R × ··· × R.
To emphasize that R^n does not have a ring structure, we use the "sum" notation,
R^n = R ⊕ R ⊕ ··· ⊕ R. Our convention is to write elements of R^n as column vectors,
i.e., to identify R^n with R_{n,1}. If the elements of R^n are written as row vectors, R^n is
identified with R_{1,n}.
Addition of matrices To "add" two matrices, they must have the same number
of rows and the same number of columns, i.e., addition is a binary operation
R_{m,n} × R_{m,n} → R_{m,n}. The addition is defined by (a_{i,j}) + (b_{i,j}) = (a_{i,j} + b_{i,j}), i.e., the i, j term
of the sum is the sum of the i, j terms. The following theorem is just an observation.
Theorem R_{m,n} is an additive abelian group. Its "zero" is the matrix 0 = 0_{m,n}
all of whose terms are zero. Also −(a_{i,j}) = (−a_{i,j}). Furthermore, as additive groups,
R_{m,n} ≈ R^{mn}.
Scalar multiplication An element of R is called a scalar. A matrix may be
"multiplied" on the right or left by a scalar. Right scalar multiplication is defined
by (a_{i,j})c = (a_{i,j}c). It is a function R_{m,n} × R → R_{m,n}. Note in particular that
scalar multiplication is defined on R^n. Of course, if R is commutative, there is no
distinction between right and left scalar multiplication.
Theorem Suppose A, B ∈ R_{m,n} and c, d ∈ R. Then
(A + B)c = Ac + Bc
A(c + d) = Ac + Ad
A(cd) = (Ac)d
and A1̄ = A.
This theorem is entirely transparent. In the language of the next chapter, it merely
states that R_{m,n} is a right module over the ring R.
Multiplication of Matrices The matrix product AB is defined iff the number
of columns of A is equal to the number of rows of B. The matrix AB will have the
same number of rows as A and the same number of columns as B, i.e., multiplication
is a function R_{m,n} × R_{n,p} → R_{m,p}. The product (a_{i,j})(b_{i,j}) is defined to be the matrix
whose (s, t) term is a_{s,1}b_{1,t} + ··· + a_{s,n}b_{n,t}, i.e., the dot product of row s of A
with column t of B.
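The definition translates directly into code; here is a Python sketch (an aside) computing each (s, t) term as the dot product of row s of A with column t of B:

```python
# Sketch: matrix multiplication exactly as defined above.
# A is m x n, B is n x p; the result AB is m x p.

def mat_mul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A)   # columns of A = rows of B
    return [[sum(A[s][k] * B[k][t] for k in range(n)) for t in range(p)]
            for s in range(m)]
```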
Exercise Consider real matrices A = [a b; c d], U = [2 0; 0 1], V = [0 1; 1 0],
and W = [1 2; 0 1]. Find the matrices AU, UA, AV, VA, AW, and WA.
Definition The identity matrix I_n ∈ R_n is the square matrix whose diagonal terms
are 1 and whose off-diagonal terms are 0.
Theorem Suppose A ∈ R_{m,n}.
1) 0_{p,m}A = 0_{p,n} and A0_{n,p} = 0_{m,p}.
2) I_m A = A = AI_n.
Theorem (The distributive laws) (A + B)C = AC + BC and
C(A + B) = CA + CB whenever the operations are defined.
Theorem (The associative law for matrix multiplication) Suppose A ∈ R_{m,n},
B ∈ R_{n,p}, and C ∈ R_{p,q}. Then (AB)C = A(BC). Note that ABC ∈ R_{m,q}.
Proof We must show that the (s, t) terms are equal. The proof involves writing
it out and changing the order of summation. Let (x_{i,j}) = AB and (y_{i,j}) = BC.
Then the (s, t) term of (AB)C is

∑_i x_{s,i} c_{i,t} = ∑_i ∑_j a_{s,j} b_{j,i} c_{i,t} = ∑_{i,j} a_{s,j} b_{j,i} c_{i,t}
= ∑_j a_{s,j} ∑_i b_{j,i} c_{i,t} = ∑_j a_{s,j} y_{j,t}

which is the (s, t) term of A(BC).
Theorem For each ring R and integer n ≥ 1, R_n is a ring.
Proof This elegant little theorem is immediate from the theorems above. The
units of R_n are called invertible or nonsingular matrices. They form a group under
multiplication called the general linear group and denoted by GL_n(R) = (R_n)*.
Exercise Recall that if A is a ring and a ∈ A, then aA is a right ideal of A. Let
A = R_2 and a = (a_{i,j}) where a_{1,1} = 1 and the other entries are 0. Find aR_2 and R_2a.
Show that the only ideal of R_2 containing a is R_2 itself.
Multiplication by blocks Suppose A, E ∈ R_n, B, F ∈ R_{n,m}, C, G ∈ R_{m,n}, and
D, H ∈ R_m. Then multiplication in R_{n+m} is given by

[ A B ] [ E F ]   [ AE + BG  AF + BH ]
[ C D ] [ G H ] = [ CE + DG  CF + DH ].
Transpose
Notation For the remainder of this chapter on matrices, suppose R is a
commutative ring. Of course, for n > 1, R_n is noncommutative.
Transpose is a function from R_{m,n} to R_{n,m}. If A ∈ R_{m,n}, A^t ∈ R_{n,m} is the matrix
whose (i, j) term is the (j, i) term of A. So row i (column i) of A becomes column
i (row i) of A^t. If A is an n-dimensional row vector, then A^t is an n-dimensional
column vector. If A is a square matrix, A^t is also square.
Theorem
1) (A^t)^t = A.
2) (A + B)^t = A^t + B^t.
3) If c ∈ R, (Ac)^t = A^t c.
4) (AB)^t = B^t A^t.
5) If A ∈ R_n, then A is invertible iff A^t is invertible.
In this case (A^{−1})^t = (A^t)^{−1}.
Proof of 5) Suppose A is invertible. Then I = I^t = (AA^{−1})^t = (A^{−1})^t A^t.
Exercise Characterize those invertible matrices A ∈ R_2 which have A^{−1} = A^t.
Show that they form a subgroup of GL_2(R).
Triangular Matrices
If A ∈ R_n, then A is upper (lower) triangular provided a_{i,j} = 0 for all i > j (all
j > i). A is strictly upper (lower) triangular provided a_{i,j} = 0 for all i ≥ j (all j ≥ i).
A is diagonal if it is upper and lower triangular, i.e., a_{i,j} = 0 for all i ≠ j. Note
that if A is upper (lower) triangular, then A^t is lower (upper) triangular.
Theorem If A ∈ R_n is strictly upper (or lower) triangular, then A^n = 0.
Proof The way to understand this is just to multiply it out for n = 2 and n = 3.
The geometry of this theorem will become transparent later in Chapter 5 when the
matrix A defines an R-module endomorphism on R^n (see page 93).
Definition If T is any ring, an element t ∈ T is said to be nilpotent provided ∃ n
such that t^n = 0. In this case, (1 − t) is a unit with inverse 1 + t + t^2 + ··· + t^{n−1}.
Thus if T = R_n and B is a nilpotent matrix, I − B is invertible.
Exercise Let R = Z. Find the inverse of

[ 1 2 −3 ]
[ 0 1  4 ]
[ 0 0  1 ].
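One way to check an answer to this exercise: the matrix is I − B with B strictly upper triangular, so by the theorem above B^3 = 0 and the inverse is I + B + B^2. A Python sketch (an aside):

```python
# Sketch: invert I - B via the geometric series I + B + B^2,
# valid because B is strictly upper triangular (so B^3 = 0).

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
M = [[1, 2, -3], [0, 1, 4], [0, 0, 1]]          # the exercise's matrix, M = I - B
B = [[i - m for i, m in zip(ri, rm)] for ri, rm in zip(I, M)]
B2 = mat_mul(B, B)
inverse = mat_add(mat_add(I, B), B2)            # I + B + B^2
```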
Exercise Suppose A is the diagonal matrix

[ a_1          0  ]
[     a_2         ]
[          ⋱      ]
[ 0          a_n ],

B ∈ R_{m,n}, and C ∈ R_{n,p}. Show that BA is obtained from B by multiplying column i of B
by a_i. Show AC is obtained from C by multiplying row i of C by a_i. Show A is a
unit in R_n iff each a_i is a unit in R.
Scalar matrices A scalar matrix is a diagonal matrix for which all the diagonal
terms are equal, i.e., a matrix of the form cI_n. The map R → R_n which sends c to
cI_n is an injective ring homomorphism, and thus we may consider R to be a subring
of R_n. Multiplying by a scalar is the same as multiplying by a scalar matrix, and
thus scalar matrices commute with everything, i.e., if B ∈ R_n, (cI_n)B = cB = Bc =
B(cI_n). Recall we are assuming R is a commutative ring.
Exercise Suppose A ∈ R_n and for each B ∈ R_n, AB = BA. Show A is a scalar
matrix. For n > 1, this shows how noncommutative R_n is.
Elementary Operations and Elementary Matrices
Suppose R is a commutative ring and A is a matrix over R. There are 3 types of
elementary row and column operations on the matrix A. A need not be square.
Type 1 Multiply row i by some unit a ∈ R. (Column version: multiply column i
by some unit a ∈ R.)
Type 2 Interchange row i and row j. (Column version: interchange column i and
column j.)
Type 3 Add a times row j to row i, where i ≠ j and a is any element of R. (Column
version: add a times column i to column j, where i ≠ j and a is any element of R.)
Elementary Matrices Elementary matrices are square and invertible. There
are three types. They are obtained by performing row or column operations on the
identity matrix.

Type 1 B is the identity matrix with one diagonal entry replaced by a unit
a ∈ R, e.g.

[ 1 0 0 ]
[ 0 a 0 ]
[ 0 0 1 ].

Type 2 B is the identity matrix with two rows (equivalently, two columns)
interchanged, e.g.

[ 0 1 0 ]
[ 1 0 0 ]
[ 0 0 1 ].

Type 3 B is the identity matrix with one off-diagonal entry a_{i,j} inserted, where
i ≠ j and a_{i,j} is any element of R, e.g.

[ 1 0 a_{1,3} ]
[ 0 1 0      ]
[ 0 0 1      ].

In type 1, all the off-diagonal elements are zero. In type 2, there are two nonzero
off-diagonal elements. In type 3, there is at most one nonzero off-diagonal element,
and it may be above or below the diagonal.
Exercise Show that if B is an elementary matrix of type 1, 2, or 3, then B is
invertible and B^{−1} is an elementary matrix of the same type.
The following theorem is handy when working with matrices.
Theorem Suppose A is a matrix. It need not be square. To perform an
elementary row (column) operation on A, perform the operation on an identity matrix to
obtain an elementary matrix B, and multiply on the left (right). That is, BA = row
operation on A and AB = column operation on A. (See the exercise on page 54.)
Exercise Suppose F is a field and A ∈ F_{m,n}.
1) Show ∃ invertible matrices B ∈ F_m and C ∈ F_n such that BAC = (d_{i,j})
where d_{1,1} = ··· = d_{t,t} = 1 and all other entries are 0. The integer t is
called the rank of A. (See page 89 of Chapter 5.)
2) Suppose A ∈ F_n is invertible. Show A is the product of elementary
matrices.
3) A matrix T is said to be in row echelon form if, for each 1 ≤ i < m, the
first nonzero term of row (i + 1) is to the right of the first nonzero
term of row i. Show ∃ an invertible matrix B ∈ F_m such that BA is in
row echelon form.
4) Let A = [3 11; 0 4] and D = [3 11; 1 4]. Write A and D as products
of elementary matrices over Q. Is it possible to write them as products
of elementary matrices over Z?
For 1), perform row and column operations on A to reach the desired form. This
shows the matrices B and C may be selected as products of elementary matrices.
Part 2) also follows from this procedure. For part 3), use only row operations. Notice
that if T is in row echelon form, the number of nonzero rows is the rank of T.
Systems of Equations
Suppose A = (a_{i,j}) ∈ R_{m,n} and C is the column vector (c_1, ..., c_m)^t ∈ R^m = R_{m,1}.
The system

a_{1,1}x_1 + ··· + a_{1,n}x_n = c_1
       ⋮                       ⋮
a_{m,1}x_1 + ··· + a_{m,n}x_n = c_m

of m equations in n unknowns can be written as one matrix equation in one unknown,
namely as AX = C, where X is the column vector (x_1, ..., x_n)^t.
Define f : R^n → R^m by f(D) = AD. Then f is a group homomorphism and also
f(Dc) = f(D)c for any c ∈ R. In the language of the next chapter, this says that
f is an R-module homomorphism. The next theorem summarizes what we already
know about solutions of linear equations in this setting.
Theorem
1) AX = 0 is called the homogeneous equation. Its solution set is ker(f).
2) AX = C has a solution iff C ∈ image(f). If D ∈ R^n is one
solution, the solution set f^{−1}(C) is the coset D + ker(f) in R^n.
(See part 7 of the theorem on homomorphisms in Chapter 2, page 28.)
3) Suppose B ∈ R_m is invertible. Then AX = C and (BA)X = BC have
the same set of solutions. Thus we may perform any row operation
on both sides of the equation and not change the solution set.
4) If m = n and A ∈ R_m is invertible, then AX = C has the unique
solution X = A^{−1}C.
The geometry of systems of equations over a field will not become really
transparent until the development of linear algebra in Chapter 5.
Determinants
The concept of determinant is one of the most amazing in all of mathematics.
The proper development of this concept requires a study of multilinear forms, which
is given in Chapter 6. In this section we simply present the basic properties.
For each n ≥ 1 and each commutative ring R, determinant is a function from R_n
to R. For n = 1, |(a)| = a. For n = 2, |[a b; c d]| = ad − bc.
Definition Let A = (a_{i,j}) ∈ R_n. If σ is a permutation on {1, 2, ..., n}, let sign(σ) =
1 if σ is an even permutation, and sign(σ) = −1 if σ is an odd permutation. The
determinant is defined by |A| = ∑_{all σ} sign(σ) a_{1,σ(1)} a_{2,σ(2)} ··· a_{n,σ(n)}. Check that for
n = 2, this agrees with the definition above. (Note that here we are writing the
permutation functions as σ(i) and not as (i)σ.)
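The definition can be computed literally, summing over all n! permutations; this Python sketch (an aside; the sign is computed by counting inversions) does so. It is hopelessly slow for large n, but it is the definition verbatim.

```python
# Sketch: the determinant directly from the definition above,
# summing sign(sigma) * a_{1,sigma(1)} * ... * a_{n,sigma(n)}.

from itertools import permutations

def det(A):
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        # sign(sigma) = (-1)^(number of inversions)
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if sigma[i] > sigma[j])
        sign = -1 if inversions % 2 else 1
        prod = 1
        for row in range(n):
            prod *= A[row][sigma[row]]
        total += sign * prod
    return total
```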
For each σ, a_{1,σ(1)} a_{2,σ(2)} ··· a_{n,σ(n)} contains exactly one factor from each row and
one factor from each column. Since R is commutative, we may rearrange the factors
so that the first comes from the first column, the second from the second column, etc.
This means that there is a permutation τ on {1, 2, ..., n} such that a_{1,σ(1)} ··· a_{n,σ(n)} =
a_{τ(1),1} ··· a_{τ(n),n}. We wish to show that τ = σ^{−1} and thus sign(σ) = sign(τ). To
reduce the abstraction, suppose σ(2) = 5. Then the first expression will contain
the factor a_{2,5}. In the second expression, it will appear as a_{τ(5),5}, and so τ(5) = 2.
Anyway, τ is the inverse of σ and thus there are two ways to define determinant. It
follows that the determinant of a matrix is equal to the determinant of its transpose.
Theorem |A| = ∑_{all σ} sign(σ) a_{1,σ(1)} a_{2,σ(2)} ··· a_{n,σ(n)}
= ∑_{all τ} sign(τ) a_{τ(1),1} a_{τ(2),2} ··· a_{τ(n),n}.
Corollary |A| = |A^t|.
You may view an n × n matrix A as a sequence of n column vectors or as a
sequence of n row vectors. Here we will use column vectors. This means we write the
matrix A as A = (A_1, A_2, ..., A_n) where each A_i ∈ R_{n,1} = R^n.
Theorem If two columns of A are equal, then |A| = 0̄.
Proof For simplicity, assume the first two columns are equal, i.e., A_1 = A_2.
Now |A| = ∑_{all τ} sign(τ) a_{τ(1),1} a_{τ(2),2} ··· a_{τ(n),n} and this summation has n! terms and
n! is an even number. Let γ be the transposition which interchanges one and two.
Then for any τ, a_{τ(1),1} a_{τ(2),2} ··· a_{τ(n),n} = a_{τγ(1),1} a_{τγ(2),2} ··· a_{τγ(n),n}. This pairs up
the n! terms of the summation, and since sign(τ) = −sign(τγ), these pairs cancel
in the summation. Therefore |A| = 0̄.
Theorem Suppose 1 ≤ r ≤ n, C_r ∈ R_{n,1}, and a, c ∈ R. Then
|(A_1, ..., A_{r−1}, aA_r + cC_r, A_{r+1}, ..., A_n)| = a|(A_1, ..., A_n)|
+ c|(A_1, ..., A_{r−1}, C_r, A_{r+1}, ..., A_n)|.
Proof This is immediate from the definition of determinant and the distributive
law of multiplication in the ring R.
Summary Determinant is a function d : R_n → R. In the language used in the
Appendix, the two previous theorems say that d is an alternating multilinear form.
The next two theorems show that alternating implies skew-symmetric (see page 129).
Theorem Interchanging two columns of A multiplies the determinant by minus
one.
Proof For simplicity, show that |(A_2, A_1, A_3, ..., A_n)| = −|A|. We know
0̄ = |(A_1 + A_2, A_1 + A_2, A_3, ..., A_n)| = |(A_1, A_1, A_3, ..., A_n)| + |(A_1, A_2, A_3, ..., A_n)| +
|(A_2, A_1, A_3, ..., A_n)| + |(A_2, A_2, A_3, ..., A_n)|. Since the first and last of these four
terms are zero, the result follows.
Theorem If τ is a permutation of (1, 2, ..., n), then
|A| = sign(τ)|(A_{τ(1)}, A_{τ(2)}, ..., A_{τ(n)})|.
Proof The permutation τ is the finite product of transpositions.
Exercise Rewrite the four preceding theorems using rows instead of columns.
The following theorem is just a summary of some of the work done so far.
Theorem Multiplying any row or column of a matrix by a scalar c ∈ R multiplies
the determinant by c. Interchanging two rows or two columns multiplies the
determinant by −1. Adding c times one row to another row, or adding c times one column
to another column, does not change the determinant. If a matrix has two rows equal
or two columns equal, its determinant is zero. More generally, if one row is c times
another row, or one column is c times another column, then the determinant is zero.
There are 2n ways to compute |A|: expansion by any row or expansion by any
column. Let M_{i,j} be the determinant of the (n − 1) × (n − 1) matrix obtained by
removing row i and column j from A. Let C_{i,j} = (−1)^{i+j} M_{i,j}. M_{i,j} and C_{i,j} are
called the (i, j) minor and cofactor of A. The following theorem is useful but the
proof is a little tedious and should not be done as an exercise.
Theorem For any 1 ≤ i ≤ n, |A| = a_{i,1}C_{i,1} + a_{i,2}C_{i,2} + ··· + a_{i,n}C_{i,n}. For any
1 ≤ j ≤ n, |A| = a_{1,j}C_{1,j} + a_{2,j}C_{2,j} + ··· + a_{n,j}C_{n,j}. Thus if any row or any column is
zero, the determinant is zero.
Exercise Let A =

[ a_1 a_2 a_3 ]
[ b_1 b_2 b_3 ]
[ c_1 c_2 c_3 ].

The determinant of A is the sum of six terms.
Write out the determinant of A expanding by the first column and also expanding by
the second row.
Theorem If A is an upper or lower triangular matrix, |A| is the product of the
diagonal elements. If A is an elementary matrix of type 2, |A| = −1. If A is an
elementary matrix of type 3, |A| = 1.
Proof We will prove the first statement for upper triangular matrices. If A ∈ R_2
is an upper triangular matrix, then its determinant is the product of the diagonal
elements. Suppose n > 2 and the theorem is true for matrices in R_{n−1}. Suppose
A ∈ R_n is upper triangular. The result follows by expanding by the first column.
An elementary matrix of type 3 is a special type of upper or lower triangular
matrix, so its determinant is 1. An elementary matrix of type 2 is obtained from the
identity matrix by interchanging two rows or columns, and thus has determinant −1.
Theorem (Determinant by blocks) Suppose A ∈ R_n, B ∈ R_{n,m}, and D ∈ R_m.
Then the determinant of

[ A B ]
[ O D ]

is |A||D|.
Proof Expand by the first column and use induction on n.
The following remarkable theorem takes some work to prove. We assume it here
without proof. (For the proof, see page 130 of the Appendix.)
Theorem The determinant of the product is the product of the determinants,
i.e., if A, B ∈ R_n, |AB| = |A||B|. Thus |AB| = |BA| and if C is invertible,
|C^{−1}AC| = |ACC^{−1}| = |A|.
Corollary If A is a unit in R_n, then |A| is a unit in R and |A^{−1}| = |A|^{−1}.
Proof 1 = |I| = |AA^{−1}| = |A||A^{−1}|.
One of the major goals of this chapter is to prove the converse of the preceding
corollary.
Classical adjoint Suppose R is a commutative ring and A ∈ R_n. The classical
adjoint of A is (C_{i,j})^t, i.e., the matrix whose (j, i) term is the (i, j) cofactor. Before
we consider the general case, let's examine 2 × 2 matrices.
If A = [a b; c d] then (C_{i,j}) = [d −c; −b a] and so (C_{i,j})^t = [d −b; −c a]. Then
A(C_{i,j})^t = (C_{i,j})^t A = [|A| 0; 0 |A|] = |A|I. Thus if |A| is a unit in R, A is
invertible and A^{−1} = |A|^{−1}(C_{i,j})^t. In particular, if |A| = 1, A^{−1} = [d −b; −c a].
Here is the general case.
Theorem If R is commutative and A ∈ R_n, then A(C_{i,j})^t = (C_{i,j})^t A = |A|I.
Proof We must show that the diagonal elements of the product A(C_{i,j})^t are all
|A| and the other elements are 0. The (s, s) term is the dot product of row s of A
with row s of (C_{i,j}) and is thus |A| (computed by expansion by row s). For s ≠ t,
the (s, t) term is the dot product of row s of A with row t of (C_{i,j}). Since this is the
determinant of a matrix with row s = row t, the (s, t) term is 0. The proof that
(C_{i,j})^t A = |A|I is similar and is left as an exercise.
We are now ready for one of the most beautiful and useful theorems in all of
mathematics.
Theorem Suppose R is a commutative ring and A ∈ R_n. Then A is a unit in
R_n iff |A| is a unit in R. (Thus if R is a field, A is invertible iff |A| ≠ 0̄.) If A is
invertible, then A^{−1} = |A|^{−1}(C_{i,j})^t. Thus if |A| = 1, A^{−1} = (C_{i,j})^t, the classical
adjoint of A.
Proof This follows immediately from the preceding theorem.
Exercise Show that any right inverse of A is also a left inverse. That is, suppose
A, B ∈ R_n and AB = I. Show A is invertible with A^{−1} = B, and thus BA = I.
Similarity
Suppose A, B ∈ R_n. B is said to be similar to A if ∃ an invertible C ∈ R_n such
that B = C^{-1}AC, i.e., B is similar to A iff B is a conjugate of A.
Theorem  B is similar to B. B is similar to A iff A is similar to B. If D is
similar to B and B is similar to A, then D is similar to A. "Similarity" is an
equivalence relation on R_n.
Proof This is a good exercise using the deﬁnition.
Theorem  Suppose A and B are similar. Then |A| = |B| and thus A is invertible
iff B is invertible.

Proof  Suppose B = C^{-1}AC. Then |B| = |C^{-1}AC| = |ACC^{-1}| = |A|.
Trace  Suppose A = (a_{i,j}) ∈ R_n. Then the trace is defined by
trace(A) = a_{1,1} + a_{2,2} + ··· + a_{n,n}. That is, the trace of A is the sum of its
diagonal terms.
One of the most useful properties of trace is that trace(AB) = trace(BA) whenever
AB and BA are defined. For example, suppose A = (a_1, a_2, ..., a_n) and
B = (b_1, b_2, ..., b_n)^t. Then AB is the scalar a_1 b_1 + ··· + a_n b_n while BA is
the n × n matrix (b_i a_j). Note that trace(AB) = trace(BA). Here is the theorem
in full generality.
Theorem  Suppose A ∈ R_{m,n} and B ∈ R_{n,m}. Then AB and BA are square
matrices with trace(AB) = trace(BA).
Proof  This proof involves a change in the order of summation. By definition,

    trace(AB) = Σ_{1≤i≤m} (a_{i,1} b_{1,i} + ··· + a_{i,n} b_{n,i})
              = Σ_{1≤i≤m, 1≤j≤n} a_{i,j} b_{j,i}
              = Σ_{1≤j≤n} (b_{j,1} a_{1,j} + ··· + b_{j,m} a_{m,j})
              = trace(BA).
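The change-of-summation argument can be checked on a concrete rectangular pair. Below is a minimal plain-Python sketch (helper names are mine): A is 2 × 3 and B is 3 × 2, so AB is 2 × 2 and BA is 3 × 3, yet the traces agree.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2, 3],
     [4, 5, 6]]        # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]         # 3 x 2

print(trace(matmul(A, B)))   # 212
print(trace(matmul(B, A)))   # 212, even though BA is a different size
```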
Theorem  If A, B ∈ R_n, then trace(A + B) = trace(A) + trace(B) and
trace(AB) = trace(BA).
Proof The ﬁrst part of the theorem is immediate, and the second part is a special
case of the previous theorem.
Theorem If A and B are similar, then trace(A) = trace(B).
Proof  trace(B) = trace(C^{-1}AC) = trace(ACC^{-1}) = trace(A).
Summary  Determinant and trace are functions from R_n to R. Determinant is a
multiplicative homomorphism and trace is an additive homomorphism. Furthermore
|AB| = |BA| and trace(AB) = trace(BA). If A and B are similar, |A| = |B| and
trace(A) = trace(B).
Exercise  Suppose A ∈ R_n and a ∈ R. Find |aA| and trace(aA).
Characteristic polynomials  If A ∈ R_n, the characteristic polynomial
CP_A(x) ∈ R[x] is defined by CP_A(x) = |xI − A|. Any λ ∈ R which is a root of
CP_A(x) is called a characteristic root of A.
Theorem  CP_A(x) = a_0 + a_1 x + ··· + a_{n−1} x^{n−1} + x^n where
trace(A) = −a_{n−1} and |A| = (−1)^n a_0.
Proof This follows from a direct computation of the determinant.
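For a 2 × 2 matrix the computation is short enough to spell out: |xI − A| = (x − a)(x − d) − bc = x^2 − (a + d)x + (ad − bc), so a_1 = −trace(A) and a_0 = |A|. A plain-Python sketch of this special case (the function names are mine):

```python
def char_poly_2x2(A):
    # CP_A(x) = |xI - A| = x^2 - (a + d)x + (ad - bc) for A = [[a, b], [c, d]]
    (a, b), (c, d) = A
    a1 = -(a + d)            # coefficient of x
    a0 = a * d - b * c       # constant term
    return a0, a1            # CP_A(x) = a0 + a1*x + x^2

A = [[2, 1], [3, 4]]
a0, a1 = char_poly_2x2(A)
trace_A = A[0][0] + A[1][1]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(trace_A == -a1)              # True: trace(A) = -a_{n-1}
print(det_A == (-1) ** 2 * a0)     # True: |A| = (-1)^n a_0, with n = 2
```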
Theorem  If A and B are similar, then they have the same characteristic
polynomials.
Proof  Suppose B = C^{-1}AC. CP_B(x) = |xI − C^{-1}AC| = |C^{-1}(xI − A)C| =
|xI − A| = CP_A(x).
Exercise  Suppose R is a commutative ring, A = [ a  b ] is a matrix in R_2, and
                                               [ c  d ]
CP_A(x) = a_0 + a_1 x + x^2. Find a_0 and a_1 and show that a_0 I + a_1 A + A^2 = 0,
i.e., show A satisfies its characteristic polynomial. In other words, CP_A(A) = 0.
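The exercise can at least be spot-checked numerically before it is proved in general. A plain-Python sketch for one integer matrix (the helpers are mine; this does not substitute for the algebra):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def scale(A, r):
    return [[A[i][j] * r for j in range(len(A[0]))] for i in range(len(A))]

A = [[2, 1], [3, 4]]
a0 = 2 * 4 - 1 * 3        # |A| = 5
a1 = -(2 + 4)             # -trace(A) = -6
I = [[1, 0], [0, 1]]

# a0*I + a1*A + A^2 should be the zero matrix
result = add(add(scale(I, a0), scale(A, a1)), matmul(A, A))
print(result)             # [[0, 0], [0, 0]]
```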
Exercise  Suppose F is a field and A ∈ F_2. Show the following are equivalent.

1)  A^2 = 0.

2)  |A| = trace(A) = 0.

3)  CP_A(x) = x^2.

4)  ∃ an elementary matrix C such that C^{-1}AC is strictly upper triangular.
Note  This exercise is a special case of a more general theorem. A square matrix
over a field is nilpotent iff all its characteristic roots are 0 iff it is similar to a
strictly upper triangular matrix. This remarkable result cannot be proved by matrix
theory alone, but depends on linear algebra (see pages 93, 94, and 98).
Chapter 5
Linear Algebra
The exalted position held by linear algebra is based upon the subject’s ubiquitous
utility and ease of application. The basic theory is developed here in full generality,
i.e., modules are deﬁned over an arbitrary ring R and not just over a ﬁeld. The
elementary facts about cosets, quotients, and homomorphisms follow the same
pattern as in the chapters on groups and rings. We give a simple proof that if R is a
commutative ring and f : R^n → R^n is a surjective R-module homomorphism, then
f is an isomorphism. This shows that finitely generated free R-modules have a well
defined dimension, and simplifies some of the development of linear algebra. It is in
this chapter that the concepts about functions, solutions of equations, matrices, and
generating sets come together in one unified theory.
After the general theory, we restrict our attention to vector spaces, i.e., modules
over a ﬁeld. The key theorem is that any vector space V has a free basis, and thus
if V is ﬁnitely generated, it has a well deﬁned dimension, and incredible as it may
seem, this single integer determines V up to isomorphism. Also any endomorphism
f : V →V may be represented by a matrix, and any change of basis corresponds to
conjugation of that matrix. One of the goals in linear algebra is to select a basis so
that the matrix representing f has a simple form. For example, if f is not injective,
then f may be represented by a matrix whose ﬁrst column is zero. As another
example, if f is nilpotent, then f may be represented by a strictly upper triangular
matrix. The theorem on Jordan canonical form is not proved in this chapter, and
should not be considered part of this chapter. It is stated here in full generality only
for reference and completeness. The proof is given in the Appendix. This chapter
concludes with the study of real inner product spaces, and with the beautiful theory
relating orthogonal matrices and symmetric matrices.
Definition  Suppose R is a ring and M is an additive abelian group. The statement
that M is a right R-module means there is a scalar multiplication

    M × R → M          satisfying     (a_1 + a_2)r = a_1 r + a_2 r
    (m, r) → mr                       a(r_1 + r_2) = ar_1 + ar_2
                                      a(r_1 r_2) = (ar_1)r_2
                                      a·1 = a

for all a, a_1, a_2 ∈ M and r, r_1, r_2 ∈ R.
The statement that M is a left R-module means there is a scalar multiplication

    R × M → M          satisfying     r(a_1 + a_2) = ra_1 + ra_2
    (r, m) → rm                       (r_1 + r_2)a = r_1 a + r_2 a
                                      (r_1 r_2)a = r_1(r_2 a)
                                      1·a = a

Note that the plus sign is used ambiguously, as addition in M and as addition in R.
Notation  The fact that M is a right (left) R-module will be denoted by M = M_R
(M = _RM). If R is commutative and M = M_R, then left scalar multiplication
defined by ra = ar makes M into a left R-module. Thus for commutative rings, we
may write the scalars on either side. In this text we stick to right R-modules.
Convention  Unless otherwise stated, it is assumed that R is a ring and the word
"R-module" (or sometimes just "module") means "right R-module".
Theorem  Suppose M is an R-module.

1) If r ∈ R, then f : M → M defined by f(a) = ar is a homomorphism of
   additive groups. In particular (0_M)r = 0_M.

2) If a ∈ M, then a·0_R = 0_M.

3) If a ∈ M and r ∈ R, then (−a)r = −(ar) = a(−r).

Proof  This is a good exercise in using the axioms for an R-module.
Submodules  If M is an R-module, the statement that a subset N ⊂ M is a
submodule means it is a subgroup which is closed under scalar multiplication, i.e., if
a ∈ N and r ∈ R, then ar ∈ N. In this case N will be an R-module because the
axioms will automatically be satisfied. Note that 0 and M are submodules, called
the improper submodules of M.
Theorem  Suppose M is an R-module, T is an index set, and for each t ∈ T,
N_t is a submodule of M.

1)  ∩_{t∈T} N_t is a submodule of M.

2)  If {N_t} is a monotonic collection, ∪_{t∈T} N_t is a submodule.

3)  +_{t∈T} N_t = {all finite sums a_1 + ··· + a_m : each a_i belongs
    to some N_t} is a submodule. If T = {1, 2, ..., n},
    then this submodule may be written as
    N_1 + N_2 + ··· + N_n = {a_1 + a_2 + ··· + a_n : each a_i ∈ N_i}.
Proof  We know from page 22 that versions of 1) and 2) hold for subgroups, and
in particular for subgroups of additive abelian groups. To finish the proofs it is only
necessary to check scalar multiplication, which is immediate. Also the proof of 3) is
immediate. Note that if N_1 and N_2 are submodules of M, then N_1 + N_2 is the
smallest submodule of M containing N_1 ∪ N_2.
Exercise  Suppose T is a nonvoid set, N is an R-module, and N^T is the collection
of all functions f : T → N with addition defined by (f + g)(t) = f(t) + g(t) and
scalar multiplication defined by (fr)(t) = f(t)r. Show N^T is an R-module. (We
know from the last exercise in Chapter 2 that N^T is a group, and so it is only
necessary to check scalar multiplication.) This simple fact is quite useful in linear
algebra. For example, in 5) of the theorem below, it is stated that Hom_R(M, N)
forms an abelian group. So it is only necessary to show that Hom_R(M, N) is a
subgroup of N^M. Also in 8) it is only necessary to show that Hom_R(M, N) is a
submodule of N^M.
Homomorphisms
Suppose M and N are R-modules. A function f : M → N is a homomorphism
(i.e., an R-module homomorphism) provided it is a group homomorphism and if
a ∈ M and r ∈ R, then f(ar) = f(a)r. On the left, scalar multiplication is in M and
on the right it is in N. The basic facts about homomorphisms are listed below.
Much of this work has already been done in the chapter on groups (see page 28).
Theorem
1) The zero map M →N is a homomorphism.
2) The identity map I : M →M is a homomorphism.
3) The composition of homomorphisms is a homomorphism.
4) The sum of homomorphisms is a homomorphism. If f, g : M →N are
homomorphisms, deﬁne (f +g) : M →N by (f +g)(a) = f(a) +g(a).
Then f +g is a homomorphism. Also (−f) deﬁned by (−f)(a) = −f(a)
is a homomorphism. If h : N →P is a homomorphism,
h ◦ (f +g) = (h ◦ f) + (h ◦ g). If k : P →M is a homomorphism,
(f +g ) ◦ k = (f ◦ k) + (g ◦ k).
5) Hom_R(M, N) = Hom(M_R, N_R), the set of all homomorphisms from M
   to N, forms an abelian group under addition. Hom_R(M, M), with
   multiplication defined to be composition, is a ring.

6) If a bijection f : M → N is a homomorphism, then f^{-1} : N → M is also
   a homomorphism. In this case f and f^{-1} are called isomorphisms. A
   homomorphism f : M → M is called an endomorphism. An isomorphism
   f : M → M is called an automorphism. The units of the endomorphism
   ring Hom_R(M, M) are the automorphisms. Thus the automorphisms on
   M form a group under composition. We will see later that if M = R^n,
   Hom_R(R^n, R^n) is just the matrix ring R_n and the automorphisms
   are merely the invertible matrices.
7) If R is commutative and r ∈ R, then g : M → M defined by g(a) = ar
   is a homomorphism. Furthermore, if f : M → N is a homomorphism,
   fr defined by (fr)(a) = f(ar) = f(a)r is a homomorphism.

8) If R is commutative, Hom_R(M, N) is an R-module.

9) Suppose f : M → N is a homomorphism, G ⊂ M is a submodule,
   and H ⊂ N is a submodule. Then f(G) is a submodule of N
   and f^{-1}(H) is a submodule of M. In particular, image(f) is a
   submodule of N and ker(f) = f^{-1}(0) is a submodule of M.
Proof This is just a series of observations.
Abelian groups are Z-modules  On page 21, it is shown that any additive
group M admits a scalar multiplication by integers, and if M is abelian, the
properties are satisfied to make M a Z-module. Note that this is the only way M
can be a Z-module, because a1 = a, a2 = a + a, etc. Furthermore, if f : M → N is
a group homomorphism of abelian groups, then f is also a Z-module homomorphism.

Summary  Additive abelian groups are "the same things" as Z-modules. While
group theory in general is quite separate from linear algebra, the study of additive
abelian groups is a special case of the study of R-modules.

Exercise  R-modules are also Z-modules and R-module homomorphisms are also
Z-module homomorphisms. If M and N are Q-modules and f : M → N is a
Z-module homomorphism, must it also be a Q-module homomorphism?
Homomorphisms on R^n

R^n as an R-module  On page 54 it was shown that the additive abelian
group R_{m,n} admits a scalar multiplication by elements in R. The properties listed
there were exactly those needed to make R_{m,n} an R-module. Of particular
importance is the case R^n = R ⊕ ··· ⊕ R = R_{n,1} (see page 53). We begin with
the case n = 1.

R as a right R-module  Let M = R and define scalar multiplication on the right
by ar = a·r. That is, scalar multiplication is just ring multiplication. This makes
R a right R-module denoted by R_R (or just R). This is the same as the definition
before for R^n when n = 1.
Theorem  Suppose R is a ring and N is a subset of R. Then N is a submodule
of R_R (_RR) iff N is a right (left) ideal of R.

Proof  The definitions are the same except expressed in different language.
Theorem  Suppose M = M_R and f, g : R → M are homomorphisms with
f(1) = g(1). Then f = g. Furthermore, if m ∈ M, ∃! homomorphism h : R → M
with h(1) = m. In other words, Hom_R(R, M) ≈ M.

Proof  Suppose f(1) = g(1). Then f(r) = f(1·r) = f(1)r = g(1)r = g(1·r) = g(r).
Given m ∈ M, h : R → M defined by h(r) = mr is a homomorphism. Thus
evaluation at 1 gives a bijection from Hom_R(R, M) to M, and this bijection is
clearly a group isomorphism. If R is commutative, it is an isomorphism of
R-modules.
In the case M = R, the above theorem states that multiplication on the left by
some m ∈ R defines a right R-module homomorphism from R to R, and every
module homomorphism is of this form. The element m should be thought of as a
1 × 1 matrix. We now consider the case where the domain is R^n.
Homomorphisms on R^n  Define e_i ∈ R^n to be the column vector whose i-th
coordinate is 1 and whose other coordinates are 0. Note that any column vector
(r_1, ..., r_n)^t can be written uniquely as e_1 r_1 + ··· + e_n r_n. The sequence
{e_1, ..., e_n} is called the canonical free basis or standard basis for R^n.
Theorem  Suppose M = M_R and f, g : R^n → M are homomorphisms with
f(e_i) = g(e_i) for 1 ≤ i ≤ n. Then f = g. Furthermore, if m_1, m_2, ..., m_n ∈ M,
∃! homomorphism h : R^n → M with h(e_i) = m_i for 1 ≤ i ≤ n. The
homomorphism h is defined by h(e_1 r_1 + ··· + e_n r_n) = m_1 r_1 + ··· + m_n r_n.
Proof  The proof is straightforward. Note this theorem gives a bijection from
Hom_R(R^n, M) to M^n = M × M × ··· × M, and this bijection is a group
isomorphism. We will see later that the product M^n is an R-module with scalar
multiplication defined by (m_1, m_2, ..., m_n)r = (m_1 r, m_2 r, ..., m_n r). If R is
commutative so that Hom_R(R^n, M) is an R-module, this theorem gives an
R-module isomorphism from Hom_R(R^n, M) to M^n.
This theorem reveals some of the great simplicity of linear algebra. It does not
matter how complicated the ring R is, or which R-module M is selected. Any
R-module homomorphism from R^n to M is determined by its values on the basis,
and any function from that basis to M extends uniquely to a homomorphism from
R^n to M.
Exercise  Suppose R is a field and f : R_R → M is a nonzero homomorphism.
Show f is injective.
Now let's examine the special case M = R^m and show Hom_R(R^n, R^m) ≈ R_{m,n}.
Theorem  Suppose A = (a_{i,j}) ∈ R_{m,n}. Then f : R^n → R^m defined by
f(B) = AB is a homomorphism with f(e_i) = column i of A. Conversely, if
v_1, ..., v_n ∈ R^m, define A ∈ R_{m,n} to be the matrix with column i = v_i. Then
f defined by f(B) = AB is the unique homomorphism from R^n to R^m with
f(e_i) = v_i.
Even though this follows easily from the previous theorem and properties of
matrices, it is one of the great classical facts of linear algebra. Matrices over R give
R-module homomorphisms! Furthermore, addition of matrices corresponds to
addition of homomorphisms, and multiplication of matrices corresponds to
composition of homomorphisms. These properties are made explicit in the next two
theorems.
Theorem  If f, g : R^n → R^m are given by matrices A, C ∈ R_{m,n}, then f + g is
given by the matrix A + C. Thus Hom_R(R^n, R^m) and R_{m,n} are isomorphic as
additive groups. If R is commutative, they are isomorphic as R-modules.
Theorem  If f : R^n → R^m is the homomorphism given by A ∈ R_{m,n} and
g : R^m → R^p is the homomorphism given by C ∈ R_{p,m}, then g ∘ f : R^n → R^p
is given by CA ∈ R_{p,n}. That is, composition of homomorphisms corresponds to
multiplication of matrices.
Proof This is just the associative law of matrix multiplication, C(AB) = (CA)B.
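The correspondence can be watched in a small example. Here f : R^2 → R^3 and g : R^3 → R^2 are given by matrices A and C, and the matrix of g ∘ f is CA (plain Python, with column vectors written as n × 1 lists; all names are my own):

```python
def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

A = [[1, 2], [3, 4], [5, 6]]     # A in R_{3,2} gives f : R^2 -> R^3
C = [[1, 0, 1], [0, 1, 1]]       # C in R_{2,3} gives g : R^3 -> R^2

f = lambda B: matmul(A, B)
g = lambda B: matmul(C, B)

B = [[7], [8]]                   # an element of R^2 as a 2 x 1 column
print(g(f(B)))                   # [[106], [136]]
print(matmul(matmul(C, A), B))   # [[106], [136]]: g∘f is given by CA
```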
The previous theorem reveals where matrix multiplication comes from. It is the
matrix which represents the composition of the functions. In the case where the
domain and range are the same, we have the following elegant corollary.
Corollary  Hom_R(R^n, R^n) and R_n are isomorphic as rings. The automorphisms
correspond to the invertible matrices.
This corollary shows one way noncommutative rings arise, namely as
endomorphism rings. Even if R is commutative, R_n is never commutative unless
n = 1.
We now return to the general theory of modules (over some given ring R).
Cosets and Quotient Modules
After seeing quotient groups and quotient rings, quotient modules go through
without a hitch. As before, R is a ring and module means R-module.
Theorem  Suppose M is a module and N ⊂ M is a submodule. Since N is a
normal subgroup of M, the additive abelian quotient group M/N is defined. Scalar
multiplication defined by (a + N)r = (ar + N) is well-defined and gives M/N the
structure of an R-module. The natural projection π : M → M/N is a surjective
homomorphism with kernel N. Furthermore, if f : M → M̄ is a surjective
homomorphism with ker(f) = N, then M/N ≈ M̄ (see below).
Proof On the group level, this is all known from Chapter 2 (see pages 27 and 29).
It is only necessary to check the scalar multiplication, which is obvious.
The relationship between quotients and homomorphisms for modules is the same
as for groups and rings, as shown by the next theorem.
Theorem  Suppose f : M → M̄ is a homomorphism and N is a submodule of M.
If N ⊂ ker(f), then f̄ : (M/N) → M̄ defined by f̄(a + N) = f(a) is a well-defined
homomorphism making the following diagram commute.

           f
    M ---------> M̄
      \         ↗
    π  \       /  f̄
        ↘     /
         M/N

Thus defining a homomorphism on a quotient module is the same as defining a
homomorphism on the numerator that sends the denominator to 0. The image of f̄
is the image of f, and the kernel of f̄ is ker(f)/N. Thus if N = ker(f), f̄ is injective,
and thus (M/N) ≈ image(f). Therefore for any homomorphism f,
(domain(f)/ker(f)) ≈ image(f).
Proof  On the group level this is all known from Chapter 2 (see page 29). It is
only necessary to check that f̄ is a module homomorphism, and this is immediate.
Theorem  Suppose M is an R-module and K and L are submodules of M.

i)  The natural homomorphism K → (K + L)/L is surjective with kernel
    K ∩ L. Thus K/(K ∩ L) → (K + L)/L is an isomorphism.

ii) Suppose K ⊂ L. The natural homomorphism M/K → M/L is surjective
    with kernel L/K. Thus (M/K)/(L/K) → M/L is an isomorphism.

Examples  These two examples are for the case R = Z, i.e., for abelian groups.

1)  M = Z, K = 3Z, L = 5Z, K ∩ L = 15Z, K + L = Z.
    K/(K ∩ L) = 3Z/15Z ≈ Z/5Z = (K + L)/L.

2)  M = Z, K = 6Z, L = 3Z (K ⊂ L).
    (M/K)/(L/K) = (Z/6Z)/(3Z/6Z) ≈ Z/3Z = M/L.
Products and Coproducts
Infinite products work fine for modules, just as they do for groups and rings.
This is stated below in full generality, although the student should think of the finite
case. In the finite case something important holds for modules that does not hold
for nonabelian groups or rings – namely, the finite product is also a coproduct. This
makes the structure of module homomorphisms much simpler. For the finite case
we may use either the product or sum notation, i.e.,
M_1 × M_2 × ··· × M_n = M_1 ⊕ M_2 ⊕ ··· ⊕ M_n.
Theorem  Suppose T is an index set and for each t ∈ T, M_t is an R-module. On
the additive abelian group Π_{t∈T} M_t = Π M_t, define scalar multiplication by
{m_t}r = {m_t r}. Then Π M_t is an R-module and, for each s ∈ T, the natural
projection π_s : Π M_t → M_s is a homomorphism. Suppose M is a module. Under
the natural 1-1 correspondence from {functions f : M → Π M_t} to {sequences of
functions {f_t}_{t∈T} where f_t : M → M_t}, f is a homomorphism iff each f_t is a
homomorphism.
Proof  We already know from Chapter 2 that f is a group homomorphism iff each
f_t is a group homomorphism. Since scalar multiplication is defined coordinatewise,
f is a module homomorphism iff each f_t is a module homomorphism.
Definition  If T is finite, the coproduct and product are the same module. If T
is infinite, the coproduct or sum Σ_{t∈T} M_t = ⊕_{t∈T} M_t = ⊕M_t is the
submodule of Π M_t consisting of all sequences {m_t} with only a finite number of
nonzero terms. For each s ∈ T, the inclusion homomorphism i_s : M_s → ⊕M_t is
defined by i_s(a) = {a_t} where a_t = 0 for t ≠ s and a_s = a. Thus each M_s may
be considered to be a submodule of ⊕M_t.
Theorem  Suppose M is an R-module. There is a 1-1 correspondence between
{homomorphisms g : ⊕M_t → M} and {sequences of homomorphisms {g_t}_{t∈T}
where g_t : M_t → M}. Given g, g_t is defined by g_t = g ∘ i_t. Given {g_t}, g is
defined by g({m_t}) = Σ_t g_t(m_t). Since there are only a finite number of nonzero
terms, this sum is well defined.
For T = {1, 2} the product and sum properties are displayed in the following
commutative diagrams: a homomorphism f : M → M_1 ⊕ M_2 corresponds to the
pair (f_1, f_2) with π_s ∘ f = f_s, and a homomorphism g : M_1 ⊕ M_2 → M
corresponds to the pair (g_1, g_2) with g ∘ i_s = g_s.
Theorem  For finite T, the 1-1 correspondences in the above theorems actually
produce group isomorphisms. If R is commutative, they give isomorphisms of
R-modules.

    Hom_R(M, M_1 ⊕ ··· ⊕ M_n) ≈ Hom_R(M, M_1) ⊕ ··· ⊕ Hom_R(M, M_n)   and

    Hom_R(M_1 ⊕ ··· ⊕ M_n, M) ≈ Hom_R(M_1, M) ⊕ ··· ⊕ Hom_R(M_n, M)
Proof  Let's look at this theorem for products with n = 2. All it says is that if
f = (f_1, f_2) and h = (h_1, h_2), then f + h = (f_1 + h_1, f_2 + h_2). If R is
commutative, so that the objects are R-modules and not merely additive groups,
then the isomorphisms are module isomorphisms. This says merely that
fr = (f_1, f_2)r = (f_1 r, f_2 r).
Exercise  Suppose M and N are R-modules. Show that M ⊕ N is isomorphic to
N ⊕ M. Now suppose A ⊂ M, B ⊂ N are submodules and show (M ⊕ N)/(A ⊕ B)
is isomorphic to (M/A) ⊕ (N/B). In particular, if a ∈ R and b ∈ R, then
(R ⊕ R)/(aR ⊕ bR) is isomorphic to (R/aR) ⊕ (R/bR). For example, the abelian
group (Z ⊕ Z)/(2Z ⊕ 3Z) is isomorphic to Z_2 ⊕ Z_3. These isomorphisms are
transparent and are used routinely in algebra without comment (see Th 4, page 118).
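For the final example, the isomorphism is induced by the surjection φ : Z ⊕ Z → Z_2 ⊕ Z_3, φ(a, b) = (a mod 2, b mod 3), whose kernel is 2Z ⊕ 3Z. A plain-Python sketch checking this on a finite sample (the sample range is an arbitrary choice of mine):

```python
from itertools import product

def phi(a, b):
    return (a % 2, b % 3)      # Python's % gives a nonnegative result here

sample = range(-6, 7)

# phi is surjective: every element of Z_2 x Z_3 is hit
hits = {phi(a, b) for a, b in product(sample, repeat=2)}
print(hits == set(product(range(2), range(3))))     # True

# kernel check: phi(a, b) = (0, 0) exactly when a is in 2Z and b is in 3Z
ok = all((phi(a, b) == (0, 0)) == (a % 2 == 0 and b % 3 == 0)
         for a, b in product(sample, repeat=2))
print(ok)                                           # True
```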
Exercise  Suppose R is a commutative ring, M is an R-module, and n ≥ 1. Define
a function α : Hom_R(R^n, M) → M^n which is an R-module isomorphism.
Summands
One basic question in algebra is “When does a module split as the sum of two
modules?”. Before deﬁning summand, here are two theorems for background.
Theorem  Consider M_1 = M_1 ⊕ 0 as a submodule of M_1 ⊕ M_2. Then the
projection map π_2 : M_1 ⊕ M_2 → M_2 is a surjective homomorphism with kernel
M_1. Thus (M_1 ⊕ M_2)/M_1 is isomorphic to M_2. (See page 35 for the group
version.)
This is exactly what you would expect, and the next theorem is almost as intuitive.
Theorem  Suppose K and L are submodules of M and f : K ⊕ L → M is the
natural homomorphism, f(k, l) = k + l. Then the image of f is K + L and the
kernel of f is {(a, −a) : a ∈ K ∩ L}. Thus f is an isomorphism iff K + L = M and
K ∩ L = 0. In this case we write K ⊕ L = M. This abuse of notation allows us to
avoid talking about "internal" and "external" direct sums.

Definition  Suppose K is a submodule of M. The statement that K is a summand
of M means ∃ a submodule L of M with K ⊕ L = M. According to the previous
theorem, this is the same as saying there exists a submodule L with K + L = M
and K ∩ L = 0. If such an L exists, it need not be unique, but it will be unique up
to isomorphism, because L ≈ M/K. Of course, M and 0 are always summands of M.
Exercise  Suppose M is a module and K = {(m, m) : m ∈ M} ⊂ M ⊕ M. Show
K is a submodule of M ⊕ M which is a summand.
Exercise R is a module over Q, and Q ⊂ R is a submodule. Is Q a summand of
R? With the material at hand, this is not an easy question. Later on, it will be easy.
Exercise  Answer the following questions about abelian groups, i.e., Z-modules.
(See the third exercise on page 35.)

1)  Is 2Z a summand of Z?

2)  Is 2Z_4 a summand of Z_4?

3)  Is 3Z_{12} a summand of Z_{12}?

4)  Suppose m, n > 1. When is nZ_{mn} a summand of Z_{mn}?
Exercise  If T is a ring, define the center of T to be the subring
{t : ts = st for all s ∈ T}. Let R be a commutative ring and T = R_n. There is an
exercise on page 57 to show that the center of T is the subring of scalar matrices.
Show R^n is a left T-module and find Hom_T(R^n, R^n).
Independence, Generating Sets, and Free Basis
This section is a generalization and abstraction of the brief section
Homomorphisms on R^n. These concepts work fine for an infinite index set T
because linear combination means finite linear combination. However, to avoid
dizziness, the student should first consider the case where T is finite.
Definition  Suppose M is an R-module, T is an index set, and for each t ∈ T,
s_t ∈ M. Let S be the sequence {s_t}_{t∈T} = {s_t}. The statement that S is
dependent means ∃ a finite number of distinct elements t_1, ..., t_n in T, and
elements r_1, ..., r_n in R, not all zero, such that the linear combination
s_{t_1} r_1 + ··· + s_{t_n} r_n = 0. Otherwise, S is independent. Note that if some
s_t = 0, then S is dependent. Also if ∃ distinct elements t_1 and t_2 in T with
s_{t_1} = s_{t_2}, then S is dependent.
Let SR be the set of all linear combinations s_{t_1} r_1 + ··· + s_{t_n} r_n. SR is a
submodule of M called the submodule generated by S. If S is independent and
generates M, then S is said to be a basis or free basis for M. In this case any
v ∈ M can be written uniquely as a linear combination of elements in S. An
R-module M is said to be a free R-module if it is zero or if it has a basis. The next
two theorems are obvious, except for the confusing notation. You might try first
the case T = {1, 2, ..., n} and ⊕R_t = R^n (see p 72).
Theorem  For each t ∈ T, let R_t = R_R, and for each c ∈ T, let
e_c ∈ ⊕R_t = ⊕_{t∈T} R_t be e_c = {r_t} where r_c = 1 and r_t = 0 for t ≠ c. Then
{e_c}_{c∈T} is a basis for ⊕R_t called the canonical basis or standard basis.
Theorem  Suppose N is an R-module and M is a free R-module with a basis
{s_t}. Then ∃ a 1-1 correspondence between the set of all functions g : {s_t} → N
and the set of all homomorphisms f : M → N. Given g, define f by
f(s_{t_1} r_1 + ··· + s_{t_n} r_n) = g(s_{t_1}) r_1 + ··· + g(s_{t_n}) r_n. Given f, define
g by g(s_t) = f(s_t). In other words, f is completely determined by what it does on
the basis S, and you are "free" to send the basis any place you wish and extend to a
homomorphism.
Recall that we have already had the preceding theorem in the case S is the
canonical basis for M = R^n (p 72). The next theorem is so basic in linear algebra
that it is used without comment. Although the proof is easy, it should be worked
carefully.
Theorem  Suppose N is a module, M is a free module with basis S = {s_t}, and
f : M → N is a homomorphism. Let f(S) be the sequence {f(s_t)} in N.

1) f(S) generates N iff f is surjective.

2) f(S) is independent in N iff f is injective.

3) f(S) is a basis for N iff f is an isomorphism.

4) If h : M → N is a homomorphism, then f = h iff f|S = h|S.
Exercise  Let (A_1, ..., A_n) be a sequence of n vectors with each A_i ∈ Z^n.
Show this sequence is linearly independent over Z iff it is linearly independent over
Q. Is it true the sequence is linearly independent over Z iff it is linearly independent
over R? This question is difficult until we learn more linear algebra.
Characterization of Free Modules
Any free R-module is isomorphic to one of the canonical free R-modules ⊕R_t.
This is just an observation, but it is a central fact in linear algebra.
Theorem  A nonzero R-module M is free iff ∃ an index set T such that
M ≈ ⊕_{t∈T} R_t. In particular, M has a finite free basis of n elements iff M ≈ R^n.

Proof  If M is isomorphic to ⊕R_t, then M is certainly free. So now suppose M
has a free basis {s_t}. Then the homomorphism f : M → ⊕R_t with f(s_t) = e_t
sends the basis for M to the canonical basis for ⊕R_t. By 3) in the preceding
theorem, f is an isomorphism.
Exercise  Suppose R is a commutative ring, A ∈ R_n, and the homomorphism
f : R^n → R^n defined by f(B) = AB is surjective. Show f is an isomorphism, i.e.,
show A is invertible. This is a key theorem in linear algebra, although it is usually
stated only for the case where R is a field. Use the fact that {e_1, ..., e_n} is a free
basis for R^n.
The next exercise is routine, but still informative.
Exercise  Let R = Z, A = [ 2  1   0 ], and let f : Z^3 → Z^2 be the group
                         [ 3  2  -5 ]
homomorphism defined by A. Find a nontrivial linear combination of the columns
of A which is 0. Also find a nonzero element of kernel(f).
If R is a commutative ring, you can relate properties of R as an R-module to
properties of R as a ring.

Exercise  Suppose R is a commutative ring and v ∈ R, v ≠ 0.

1)  v is independent iff v is ______.

2)  v is a basis for R iff v generates R iff v is ______.

Note that 2) here is essentially the first exercise for the case n = 1. That is, if
f : R → R is a surjective R-module homomorphism, then f is an isomorphism.
Relating these concepts to matrices
The theorem stated below gives a summary of results we have already had. It
shows that certain concepts about matrices, linear independence, injective
homomorphisms, and solutions of equations are all the same — they are merely
stated in different language. Suppose A ∈ R_{m,n} and f : R^n → R^m is the
homomorphism associated with A, i.e., f(B) = AB. Let v_1, ..., v_n ∈ R^m be the
columns of A, i.e., f(e_i) = v_i = column i of A. Let B = (b_1, ..., b_n)^t represent
an element of R^n and C = (c_1, ..., c_m)^t represent an element of R^m.
Theorem
1)  The element f(B) is a linear combination of the columns of A, that is,
    f(B) = f(e_1 b_1 + ··· + e_n b_n) = v_1 b_1 + ··· + v_n b_n. Thus the image
    of f is generated by the columns of A. (See bottom of page 89.)

2)  {v_1, ..., v_n} generates R^m iff f is surjective iff (for any C ∈ R^m,
    AX = C has a solution).

3)  {v_1, ..., v_n} is independent iff f is injective iff AX = 0 has a unique
    solution iff (∃ C ∈ R^m such that AX = C has a unique solution).

4)  {v_1, ..., v_n} is a basis for R^m iff f is an isomorphism iff (for any
    C ∈ R^m, AX = C has a unique solution).
Relating these concepts to square matrices
We now look at the preceding theorem in the special case where n = m and R
is a commutative ring. So far in this chapter we have just been cataloging. Now we
prove something more substantial, namely that if f : R^n → R^n is surjective, then
f is injective. Later on we will prove that if R is a field, injective implies surjective.
Theorem  Suppose R is a commutative ring, A ∈ R_n, and f : R^n → R^n is
defined by f(B) = AB. Let v_1, ..., v_n ∈ R^n be the columns of A, and let
w_1, ..., w_n ∈ R^n = R_{1,n} be the rows of A. Then the following are equivalent.

1)   f is an automorphism.

2)   A is invertible, i.e., |A| is a unit in R.

3)   {v_1, ..., v_n} is a basis for R^n.

4)   {v_1, ..., v_n} generates R^n.

5)   f is surjective.

2^t) A^t is invertible, i.e., |A^t| is a unit in R.

3^t) {w_1, ..., w_n} is a basis for R^n.

4^t) {w_1, ..., w_n} generates R^n.
Proof  Suppose 5) is true and show 2). Since f is onto, ∃ u_1, ..., u_n ∈ R^n with
f(u_i) = e_i. Let g : R^n → R^n be the homomorphism satisfying g(e_i) = u_i.
Then f ∘ g is the identity. Now g comes from some matrix D, and thus AD = I.
This shows that A has a right inverse and is thus invertible. Recall that the proof of
this fact uses determinant, which requires that R be commutative (see the exercise
on page 64). We already know the first three properties are equivalent, 4) and 5)
are equivalent, and 3) implies 4). Thus the first five are equivalent. Furthermore,
applying this result to A^t shows that the last three properties are equivalent to
each other. Since |A| = |A^t|, 2) and 2^t) are equivalent.
Uniqueness of Dimension
There exists a ring R with R^2 ≈ R^3 as R-modules, but this is of little interest.
If R is commutative, this is impossible, as shown below. First we make a convention.

Convention  For the remainder of this chapter, R will be a commutative ring.
Theorem  If f : R^m → R^n is a surjective R-module homomorphism, then m ≥ n.

Proof  Suppose k = n − m is positive. Define h : (R^m ⊕ R^k = R^n) → R^n by
h(u, v) = f(u). Then h is a surjective homomorphism, and by the previous section,
also injective. This is a contradiction and thus m ≥ n.
Corollary   If f : R^m → R^n is an isomorphism, then m = n.

Proof   Each of f and f^{-1} is surjective, so m = n by the previous theorem.

Corollary   If {v_1, .., v_m} generates R^n, then m ≥ n.

Proof   The hypothesis implies there is a surjective homomorphism R^m → R^n. So
this follows from the first theorem.
Lemma   Suppose M is a f.g. module (i.e., a finitely generated R-module). Then
if M has a basis, that basis is finite.

Proof   Suppose U ⊂ M is a finite generating set and S is a basis. Then any
element of U is a finite linear combination of elements of S, so some finite subset
of S generates M; any further element of S would then be a linear combination of
that subset, contradicting independence. Thus S is finite.

Theorem   Suppose M is a f.g. module. If M has a basis, that basis is finite
and any other basis has the same number of elements. This number is denoted by
dim(M), the dimension of M. (By convention, 0̄ is a free module of dimension 0.)

Proof   By the previous lemma, any basis for M must be finite. M has a basis of
n elements iff M ≈ R^n. The result follows because R^n ≈ R^m iff n = m.
Change of Basis

Before changing basis, we recall what a basis is. Previously we defined generating,
independence, and basis for sequences, not for collections. For the concept of
generating it matters not whether you use sequences or collections, but for independence
and basis, you must use sequences. Consider the columns of the real matrix

A = ( 2  3  2 )
    ( 1  4  1 ).

If we consider the column vectors of A as a collection, there are only two of them,
yet we certainly don't wish to say the columns of A form a basis for R^2. In a set or
collection, there is no concept of repetition. In order to make sense, we must consider
the columns of A as an ordered triple of vectors, and this sequence is dependent. In
the definition of basis on page 78, basis is defined for sequences, not for sets or
collections.
Two sequences cannot begin to be equal unless they have the same index set.
Here we follow the classical convention that an index set with n elements will be
{1, 2, .., n}, and thus a basis for M with n elements is a sequence S = {u_1, .., u_n},
or if you wish, S = (u_1, .., u_n) ∈ M^n. Suppose M is an R-module with a basis of
n elements. Recall there is a bijection α : Hom_R(R^n, M) → M^n defined by α(h) =
(h(e_1), .., h(e_n)). Now h : R^n → M is an isomorphism iff α(h) is a basis for M.

Summary   The point of all this is that selecting a basis of n elements for M
is the same as selecting an isomorphism from R^n to M, and from this viewpoint,
change of basis can be displayed by the diagram below.
Endomorphisms on R^n are represented by square matrices, and thus have a
determinant and trace. Now suppose M is a f.g. free module and f : M → M is a
homomorphism. In order to represent f by a matrix, we must select a basis for M
(i.e., an isomorphism with R^n). We will show that this matrix is well defined up to
similarity, and thus the determinant, trace, and characteristic polynomial of f are
well-defined.
Definition   Suppose M is a free module, S = {u_1, .., u_n} is a basis for M, and
f : M → M is a homomorphism. The matrix A = (a_{i,j}) ∈ R_n of f w.r.t. the basis
S is defined by f(u_i) = u_1a_{1,i} + ··· + u_na_{n,i}. (Note that if M = R^n and u_i = e_i,
A is the usual matrix associated with f.)
Theorem   Suppose T = {v_1, .., v_n} is another basis for M and B ∈ R_n is the
matrix of f w.r.t. T. Define C = (c_{i,j}) ∈ R_n by v_i = u_1c_{1,i} + ··· + u_nc_{n,i}. Then C is
invertible and B = C^{-1}AC, i.e., A and B are similar. Therefore |A| = |B|,
trace(A) = trace(B), and A and B have the same characteristic polynomial (see page
66 of chapter 4).

Conversely, suppose C = (c_{i,j}) ∈ R_n is invertible. Define T = {v_1, .., v_n} by
v_i = u_1c_{1,i} + ··· + u_nc_{n,i}. Then T is a basis for M and the matrix of f w.r.t. T is
B = C^{-1}AC. In other words, conjugation of matrices corresponds to change of basis.
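A small numerical check of the theorem (an assumed example, not from the text): conjugating the matrix of f by an invertible C leaves the trace and determinant unchanged, as they are the trace and determinant of f itself.

```python
# Sketch: verify that B = C^{-1} A C has the same trace and determinant as A,
# using exact rational arithmetic for the inverse.
from fractions import Fraction

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inv2(X):  # 2x2 inverse via the adjugate formula
    d = Fraction(X[0][0] * X[1][1] - X[0][1] * X[1][0])
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

A = [[1, 2], [3, 4]]         # matrix of f w.r.t. the basis S
C = [[2, 1], [1, 1]]         # columns express the new basis T in terms of S
B = mul(inv2(C), mul(A, C))  # matrix of f w.r.t. T

assert B[0][0] + B[1][1] == A[0][0] + A[1][1]        # trace preserved
assert (B[0][0]*B[1][1] - B[0][1]*B[1][0]
        == A[0][0]*A[1][1] - A[0][1]*A[1][0])        # determinant preserved
```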
Proof   The proof follows by seeing that the diagram below is commutative. The
upper vertical maps are given by C, the middle horizontal maps by B and A, and
the lower vertical maps are the isomorphisms R^n ≈ M with e_i ↦ u_i; the composite
vertical maps send e_i to v_i.

               B
        R^n ------> R^n
         |           |
       C |           | C
         ↓     A     ↓
        R^n ------> R^n
         |           |
       ≈ |  e_i ↦ u_i  | ≈
         ↓     f     ↓
         M  ------>  M
The diagram also explains what it means for A to be the matrix of f w.r.t. the
basis S. Let h : R^n → M be the isomorphism with h(e_i) = u_i for 1 ≤ i ≤ n. Then
the matrix A ∈ R_n is the one determined by the endomorphism h^{-1} ◦ f ◦ h : R^n → R^n.
In other words, column i of A is h^{-1}(f(h(e_i))).
An important special case is where M = R^n and f : R^n → R^n is given by some
matrix W. Then h is given by the matrix U whose i-th column is u_i, and A =
U^{-1}WU. In other words, W represents f w.r.t. the standard basis, and U^{-1}WU
represents f w.r.t. the basis {u_1, .., u_n}.
Definition   Suppose M is a f.g. free module and f : M → M is a homomorphism.
Define |f| to be |A|, trace(f) to be trace(A), and CP_f(x) to be CP_A(x), where A is
the matrix of f w.r.t. some basis. By the previous theorem, all three are well-defined,
i.e., do not depend upon the choice of basis.
Exercise   Let R = Z and let f : Z^2 → Z^2 be defined by

f(D) = ( 3   3 ) D.
       ( 0  −1 )

Find the matrix of f w.r.t. the basis {(2, 1)^t, (3, 1)^t}.
Exercise   Let L ⊂ R^2 be the line L = {(r, 2r)^t : r ∈ R}. Show there is one
and only one homomorphism f : R^2 → R^2 which is the identity on L and has
f((−1, 1)^t) = (1, −1)^t. Find the matrix A ∈ R_2 which represents f with respect
to the basis {(1, 2)^t, (−1, 1)^t}. Find the determinant, trace, and characteristic
polynomial of f. Also find the matrix B ∈ R_2 which represents f with respect to
the standard basis. Finally, find an invertible matrix C ∈ R_2 with B = C^{-1}AC.
Vector Spaces

So far in this chapter we have been developing the theory of linear algebra in
general. The previous theorem, for example, holds for any commutative ring R, but
it must be assumed that the module M is free. Endomorphisms in general will not
have a determinant, trace, or characteristic polynomial. We now focus on the case
where R is a field F, and show that in this case, every F-module is free. Thus any
finitely generated F-module will have a well-defined dimension, and endomorphisms
on it will have well-defined determinant, trace, and characteristic polynomial.

In this section, F is a field. F-modules may also be called vector spaces and
F-module homomorphisms may also be called linear transformations.
Theorem   Suppose M is an F-module and v ∈ M. Then v ≠ 0̄ iff v is independent.
That is, if v ∈ M and r ∈ F, vr = 0̄ implies v = 0̄ in M or r = 0̄ in F.

Proof   Suppose vr = 0̄ and r ≠ 0̄. Then 0̄ = (vr)r^{-1} = v1̄ = v.
Theorem   Suppose M ≠ 0̄ is an F-module and v ∈ M. Then v generates M iff v
is a basis for M. Furthermore, if these conditions hold, then M ≈ F_F, any non-zero
element of M is a basis, and any two elements of M are dependent.

Proof   Suppose v generates M. Then v ≠ 0̄ and is thus independent by the
previous theorem. In this case M ≈ F, any non-zero element of F is a basis, and
any two elements of F are dependent.
Theorem   Suppose M ≠ 0̄ is a finitely generated F-module. If S = {v_1, .., v_m}
generates M, then any maximal independent subsequence of S is a basis for M. Thus
any finite independent sequence can be extended to a basis. In particular, M has a
finite free basis, and thus is a free F-module.

Proof   Suppose, for notational convenience, that {v_1, .., v_n} is a maximal
independent subsequence of S, and n < i ≤ m. It must be shown that v_i is a linear
combination of {v_1, .., v_n}. Since {v_1, .., v_n, v_i} is dependent, ∃ r_1, .., r_n, r_i not all
zero, such that v_1r_1 + ··· + v_nr_n + v_ir_i = 0̄. Then r_i ≠ 0̄ and v_i = −(v_1r_1 + ··· + v_nr_n)r_i^{-1}.
Thus {v_1, .., v_n} generates S and thus all of M. Now suppose T is a finite independent
sequence. T may be extended to a finite generating sequence, and inside that
sequence it may be extended to a maximal independent sequence. Thus T extends
to a basis.
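The proof is constructive, and the construction can be sketched in code (an assumed illustration, not from the text): walk through the generating sequence, keeping a vector only when it is independent of those already kept. Independence is tested by comparing ranks via exact Gaussian elimination over Q.

```python
# Sketch: greedily extract a maximal independent subsequence of a generating
# sequence; by the theorem, the result is a basis.
from fractions import Fraction

def rank(rows):
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def maximal_independent(S):
    chosen = []
    for v in S:
        if rank(chosen + [v]) > rank(chosen):  # v independent of those chosen
            chosen.append(v)
    return chosen

assert maximal_independent([(1, 0), (2, 0), (0, 1)]) == [(1, 0), (0, 1)]
```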
After so many routine theorems, it is nice to have one with real power. It not
only says any ﬁnite independent sequence can be extended to a basis, but it can be
extended to a basis inside any ﬁnite generating set containing it. This is one of the
theorems that makes linear algebra tick. The key hypothesis here is that the ring
is a ﬁeld. If R = Z, then Z is a free module over itself, and the element 2 of Z is
independent. However it certainly cannot be extended to a basis. Also the ﬁniteness
hypothesis in this theorem is only for convenience, as will be seen momentarily.
Since F is a commutative ring, any two bases of M must have the same number
of elements, and thus the dimension of M is well deﬁned (see theorem on page 83).
Theorem   Suppose M is an F-module of dimension n, and {v_1, .., v_m} is an
independent sequence in M. Then m ≤ n, and if m = n, {v_1, .., v_m} is a basis.

Proof   {v_1, .., v_m} extends to a basis with n elements.
The next theorem is just a collection of observations.
Theorem Suppose M and N are ﬁnitely generated Fmodules.
1)  M ≈ F^n iff dim(M) = n.
2)  M ≈ N iff dim(M) = dim(N).
3)  F^m ≈ F^n iff n = m.
4)  dim(M ⊕ N) = dim(M) + dim(N).
Here is the basic theorem for vector spaces in full generality.
Theorem   Suppose M ≠ 0̄ is an F-module and S = {v_t}_{t∈T} generates M.

1)  Any maximal independent subsequence of S is a basis for M.
2)  Any independent subsequence of S may be extended to a maximal
    independent subsequence of S, and thus to a basis for M.
3)  Any independent subsequence of M can be extended to a basis for M.
    In particular, M has a free basis, and thus is a free F-module.

Proof   The proof of 1) is the same as in the case where S is finite. Part 2) will
follow from the Hausdorff Maximality Principle. An independent subsequence of S is
contained in a maximal monotonic tower of independent subsequences. The union of
these independent subsequences is still independent, and so the result follows. Part
3) follows from 2) because an independent sequence can always be extended to a
generating sequence.
Theorem   Suppose M is an F-module and K ⊂ M is a submodule.

1)  K is a summand of M, i.e., ∃ a submodule L of M with K ⊕ L = M.
2)  If M is f.g., then dim(K) ≤ dim(M) and K = M iff dim(K) = dim(M).

Proof   Let T be a basis for K. Extend T to a basis S for M. Then S − T generates
a submodule L with K ⊕ L = M. Part 2) follows from 1).

Corollary   Q is a summand of R. In other words, ∃ a Q-submodule V ⊂ R
with Q ⊕ V = R as Q-modules. (See exercise on page 77.)

Proof   Q is a field, R is a Q-module, and Q is a submodule of R.
Corollary   Suppose M is a f.g. F-module, N is an F-module, and f : M → N
is a homomorphism. Then dim(M) = dim(ker(f)) + dim(image(f)).

Proof   Let K = ker(f) and L ⊂ M be a submodule with K ⊕ L = M. Then
f|L : L → image(f) is an isomorphism.
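A concrete instance of the corollary (an assumed example, not from the text): for f : R^3 → R^2 given by f(x) = (x_1 + x_3, x_2 + x_3), the image is all of R^2 and the kernel is the line spanned by (1, 1, −1), so the dimensions add up to 3.

```python
# Sketch: check dim(M) = dim(ker f) + dim(image f) on a specific map.

def f(x):
    return (x[0] + x[2], x[1] + x[2])

# (1, 1, -1) spans the kernel
assert f((1, 1, -1)) == (0, 0)
# f is surjective: both standard basis vectors of R^2 are hit
assert f((1, 0, 0)) == (1, 0) and f((0, 1, 0)) == (0, 1)

dim_kernel, dim_image, dim_domain = 1, 2, 3
assert dim_kernel + dim_image == dim_domain
```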
Exercise   Suppose R is a domain with the property that, for R-modules, every
submodule is a summand. Show R is a field.

Exercise   Find a free Z-module which has a generating set containing no basis.
Exercise   The real vector space R^2 is generated by the sequence S =
{(π, 0), (2, 1), (3, 2)}. Show there are three maximal independent subsequences of
S, and each is a basis for R^2. (Row vectors are used here just for convenience.)
The real vector space R^3 is generated by S = {(1, 1, 2), (1, 2, 1), (3, 4, 5), (1, 2, 0)}.
Show there are three maximal independent subsequences of S and each is a basis
for R^3. You may use determinant.
Square matrices over fields

This theorem is just a summary of what we have for square matrices over fields.

Theorem   Suppose A ∈ F_n and f : F^n → F^n is defined by f(B) = AB. Let
v_1, .., v_n ∈ F^n be the columns of A, and w_1, .., w_n ∈ F^n = F_{1,n} be the rows of A.
Then the following are equivalent.

1)  {v_1, .., v_n} is independent, i.e., f is injective.
2)  {v_1, .., v_n} is a basis for F^n, i.e., f is an automorphism, i.e., A is
    invertible, i.e., |A| ≠ 0̄.
3)  {v_1, .., v_n} generates F^n, i.e., f is surjective.
1^t)  {w_1, .., w_n} is independent.
2^t)  {w_1, .., w_n} is a basis for F^n, i.e., A^t is invertible, i.e., |A^t| ≠ 0̄.
3^t)  {w_1, .., w_n} generates F^n.
Proof   Except for 1) and 1^t), this theorem holds for any commutative ring R.
(See the section Relating these concepts to square matrices, pages 81 and 82.)
Parts 1) and 1^t) follow from the preceding section.

Exercise   Add to this theorem more equivalent statements in terms of solutions
of n equations in n unknowns.
Overview Suppose each of X and Y is a set with n elements and f : X →Y is a
function. By the pigeonhole principle, f is injective iﬀ f is bijective iﬀ f is surjective.
Now suppose each of U and V is a vector space of dimension n and f : U → V is
a linear transformation. It follows from the work done so far that f is injective iﬀ
f is bijective iﬀ f is surjective. This shows some of the simple and deﬁnitive nature
of linear algebra.
Exercise   Let A = (A_1, .., A_n) be an n × n matrix over Z with column i = A_i ∈
Z^n. Let f : Z^n → Z^n be defined by f(B) = AB and f̄ : R^n → R^n be defined by
f̄(C) = AC. Show the following are equivalent. (See the exercise on page 79.)

1)  f : Z^n → Z^n is injective.
2)  The sequence (A_1, .., A_n) is linearly independent over Z.
3)  |A| ≠ 0.
4)  f̄ : R^n → R^n is injective.
5)  The sequence (A_1, .., A_n) is linearly independent over R.
Rank of a matrix   Suppose A ∈ F_{m,n}. The row (column) rank of A is defined
to be the dimension of the submodule of F^n (F^m) generated by the rows (columns)
of A.
Theorem   If C ∈ F_m and D ∈ F_n are invertible, then the row (column) rank of
A is the same as the row (column) rank of CAD.

Proof   Suppose f : F^n → F^m is defined by f(B) = AB. Each column of A
is a vector in the range F^m, and we know from page 81 that each f(B) is a linear
combination of those vectors. Thus the image of f is the submodule of F^m generated
by the columns of A, and its dimension is the column rank of A. This dimension
is the same as the dimension of the image of g ◦ f ◦ h : F^n → F^m, where h is any
automorphism on F^n and g is any automorphism on F^m. This proves the theorem
for column rank. The theorem for row rank follows using transpose.
Theorem   If A ∈ F_{m,n}, the row rank and the column rank of A are equal. This
number is called the rank of A and is ≤ min{m, n}.

Proof   By the theorem above, elementary row and column operations change
neither the row rank nor the column rank. By row and column operations, A may be
changed to a matrix H where h_{1,1} = ··· = h_{t,t} = 1̄ and all other entries are 0̄ (see the
first exercise on page 59). Thus row rank = t = column rank.
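For a rank-one example the theorem can be seen directly (an assumed illustration, not from the text): when every row is proportional to one fixed row and every column proportional to one fixed column, both ranks are 1. Proportionality of two vectors is checked by vanishing 2 × 2 cross products.

```python
# Sketch: row rank = column rank = 1 for a 3x2 matrix of rank one.

A = [(1, 2), (2, 4), (3, 6)]
rows = A
cols = list(zip(*A))          # transpose: columns of A as tuples

def proportional(u, v):
    # u, v are proportional iff all 2x2 minors of the pair vanish
    return all(u[i]*v[j] == u[j]*v[i] for i in range(len(u)) for j in range(len(u)))

assert all(proportional(r, rows[0]) for r in rows)   # row rank 1
assert all(proportional(c, cols[0]) for c in cols)   # column rank 1
```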
Exercise   Suppose A has rank t. Show that it is possible to select t rows and t
columns of A such that the determined t × t matrix is invertible. Show that the rank
of A is the largest integer t such that this is possible.

Exercise   Suppose A ∈ F_{m,n} has rank t. What is the dimension of the solution
set of AX = 0̄?
Definition   If N and M are finite dimensional vector spaces and f : N → M is a
linear transformation, the rank of f is the dimension of the image of f. If f : F^n → F^m
is given by a matrix A, then the rank of f is the same as the rank of the matrix A.
Geometric Interpretation of Determinant

Suppose V ⊂ R^n is some nice subset. For example, if n = 2, V might be the
interior of a square or circle. There is a concept of the n-dimensional volume of V.
For n = 1, it is length. For n = 2, it is area, and for n = 3 it is “ordinary volume”.
Suppose A ∈ R_n and f : R^n → R^n is the homomorphism given by A. The volume of
V does not change under translation, i.e., V and V + p have the same volume. Thus
f(V) and f(V + p) = f(V) + f(p) have the same volume. In street language, the next
theorem says that “f multiplies volume by the absolute value of its determinant”.

Theorem   The n-dimensional volume of f(V) is ±|A|·(the n-dimensional volume
of V). Thus if |A| = ±1, f preserves volume.
Proof   If |A| = 0, image(f) has dimension < n and thus f(V) has n-dimensional
volume 0. If |A| ≠ 0 then A is the product of elementary matrices (see page 59),
and for elementary matrices the theorem is obvious. The result follows because the
determinant of the composition is the product of the determinants.

Corollary   If P is the n-dimensional parallelepiped determined by the columns
v_1, .., v_n of A, then the n-dimensional volume of P is ±|A|.
Proof   Let V = [0, 1] × ··· × [0, 1] = {e_1t_1 + ··· + e_nt_n : 0 ≤ t_i ≤ 1}. Then
P = f(V) = {v_1t_1 + ··· + v_nt_n : 0 ≤ t_i ≤ 1}.
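For n = 2 the corollary can be checked directly (an assumed numerical illustration, not from the text): the image of the unit square under the map given by A is the parallelogram spanned by the columns of A, and its area, computed by the shoelace formula, equals |det A|.

```python
# Sketch: area of f(unit square) equals |det A| for a 2x2 matrix A.

def shoelace(pts):  # signed area of a polygon with vertices in order
    n = len(pts)
    s = sum(pts[i][0]*pts[(i+1) % n][1] - pts[(i+1) % n][0]*pts[i][1]
            for i in range(n))
    return s / 2

A = [[3, 1], [1, 2]]
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]          # det A = 5

def f(p):  # the homomorphism given by A
    return (A[0][0]*p[0] + A[0][1]*p[1], A[1][0]*p[0] + A[1][1]*p[1])

square = [(0, 0), (1, 0), (1, 1), (0, 1)]        # unit square, counterclockwise
image = [f(p) for p in square]                   # the parallelepiped P
assert abs(shoelace(image)) == abs(det) == 5
```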
Linear functions approximate differentiable functions locally

We continue with the special case F = R. Linear functions arise naturally in
business, science, and mathematics. However this is not the only reason that linear
algebra is so useful. It is a central fact that smooth phenomena may be approximated
locally by linear phenomena. Without this great simplification, the world
of technology as we know it today would not exist. Of course, linear transformations
send the origin to the origin, so they must be adjusted by a translation. As
a simple example, suppose h : R → R is differentiable and p is a real number. Let
f : R → R be the linear transformation f(x) = h′(p)x. Then h is approximated near
p by g(x) = h(p) + f(x − p) = h(p) + h′(p)(x − p).
Now suppose V ⊂ R^2 is some nice subset and h = (h_1, h_2) : V → R^2 is injective
and differentiable. Define the Jacobian by

J(h)(x, y) = ( ∂h_1/∂x   ∂h_1/∂y )
             ( ∂h_2/∂x   ∂h_2/∂y )

and for each (x, y) ∈ V, let f(x, y) : R^2 → R^2 be the homomorphism defined by
J(h)(x, y). Then for any (p_1, p_2) ∈ V, h is approximated near (p_1, p_2) (after
translation) by f(p_1, p_2). The area of V is ∫∫_V 1 dxdy. From the previous section
we know that any homomorphism f multiplies area by |f|. The student may now
understand the following theorem from calculus. (Note that if h is the restriction of
a linear transformation from R^2 to R^2, this theorem is immediate from the previous
section.)

Theorem   Suppose the determinant of J(h)(x, y) is non-negative for each
(x, y) ∈ V. Then the area of h(V) is ∫∫_V |J(h)| dxdy.
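The familiar polar-coordinate computation is an instance of this theorem, and it can be checked numerically (an assumed illustration, not from the text): for h(r, t) = (r cos t, r sin t) on V = [0, 1] × [0, 2π], the Jacobian determinant is r, and a Riemann sum of r over V approximates the area π of the unit disk h(V).

```python
# Sketch: change of variables for polar coordinates; det J(h)(r, t) = r.
import math

def det_jacobian(r, t):
    # J = ( cos t   -r sin t )
    #     ( sin t    r cos t ),  so det = r (cos^2 t + sin^2 t) = r
    return math.cos(t) * r * math.cos(t) - (-r * math.sin(t)) * math.sin(t)

n = 400
dr, dt = 1.0 / n, 2 * math.pi / n
area = sum(det_jacobian((i + 0.5) * dr, (j + 0.5) * dt) * dr * dt
           for i in range(n) for j in range(n))
assert abs(area - math.pi) < 1e-3      # area of the unit disk
```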
The Transpose Principle

We now return to the case where F is a field (of arbitrary characteristic). F-modules
may also be called vector spaces and submodules may be called subspaces.
The study of R-modules in general is important and complex. However the study of
F-modules is short and simple – every vector space is free and every subspace is a
summand. The core of classical linear algebra is not the study of vector spaces, but
the study of homomorphisms, and in particular, of endomorphisms. One goal is to
show that if f : V → V is a homomorphism with some given property, there exists
a basis of V so that the matrix representing f displays that property in a prominent
manner. The next theorem is an illustration of this.
Theorem   Let F be a field and n be a positive integer.

1)  Suppose V is an n-dimensional vector space and f : V → V is a
    homomorphism with |f| = 0̄. Then ∃ a basis of V such that the matrix
    representing f has its first row zero.
2)  Suppose A ∈ F_n has |A| = 0̄. Then ∃ an invertible matrix C such that
    C^{-1}AC has its first row zero.
3)  Suppose V is an n-dimensional vector space and f : V → V is a
    homomorphism with |f| = 0̄. Then ∃ a basis of V such that the matrix
    representing f has its first column zero.
4)  Suppose A ∈ F_n has |A| = 0̄. Then ∃ an invertible matrix D such that
    D^{-1}AD has its first column zero.
We first wish to show that these 4 statements are equivalent. We know that
1) and 2) are equivalent and also that 3) and 4) are equivalent, because change of
basis corresponds to conjugation of the matrix. Now suppose 2) is true and show
4) is true. Suppose |A| = 0̄. Then |A^t| = 0̄ and by 2) ∃ C such that C^{-1}A^tC has
first row zero. Thus (C^{-1}A^tC)^t = C^tA(C^t)^{-1} has first column zero. The result
follows by defining D = (C^t)^{-1}. Also 4) implies 2).

This is an example of the transpose principle. Loosely stated, it is that theorems
about change of basis correspond to theorems about conjugation of matrices, and
theorems about the rows of a matrix correspond to theorems about the columns of a
matrix, using transpose. In the remainder of this chapter, this will be used without
further comment.
Proof of the theorem   We are free to select any of the 4 parts, and we select
part 3). Since |f| = 0̄, f is not injective and ∃ a non-zero v_1 ∈ V with f(v_1) = 0̄.
Now v_1 is independent and extends to a basis {v_1, .., v_n}. Then the matrix of f w.r.t.
this basis has first column zero.
Exercise   Let

A = ( 3π  6 )
    ( 2π  4 ).

Find an invertible matrix C ∈ R_2 so that C^{-1}AC has first row zero. Also let

A = ( 0  0  0 )
    ( 1  3  4 )
    ( 2  1  4 )

and find an invertible matrix D ∈ R_3 so that D^{-1}AD has first column zero.
Exercise   Suppose M is an n-dimensional vector space over a field F, k is an
integer with 0 < k < n, and f : M → M is an endomorphism of rank k. Show
there is a basis for M so that the matrix representing f has its first n − k rows zero.
Also show there is a basis for M so that the matrix representing f has its first n − k
columns zero. Work these out directly without using the transpose principle.
Nilpotent Homomorphisms

In this section it is shown that an endomorphism f is nilpotent iff all of its
characteristic roots are 0̄ iff it may be represented by a strictly upper triangular matrix.

Definition   An endomorphism f : V → V is nilpotent if ∃ m with f^m = 0̄. Any
f represented by a strictly upper triangular matrix is nilpotent (see page 56).

Theorem   Suppose V is an n-dimensional vector space and f : V → V is a
nilpotent homomorphism. Then f^n = 0̄ and ∃ a basis of V such that the matrix
representing f w.r.t. this basis is strictly upper triangular. Thus the characteristic
polynomial of f is CP_f(x) = x^n.
Proof   Suppose f ≠ 0̄ is nilpotent. Let t be the largest positive integer with
f^t ≠ 0̄. Then f^t(V) ⊂ f^{t−1}(V) ⊂ ··· ⊂ f(V) ⊂ V. Since f is nilpotent, all of these
inclusions are proper. Therefore t < n and f^n = 0̄. Construct a basis for V by
starting with a basis for f^t(V), extending it to a basis for f^{t−1}(V), etc. Then the
matrix of f w.r.t. this basis is strictly upper triangular.

Note   To obtain a matrix which is strictly lower triangular, reverse the order of
the basis.
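The claim that a strictly upper triangular matrix is nilpotent, with f^n = 0̄, is easy to watch happen (an assumed numerical example, not from the text): each multiplication pushes the non-zero entries one diagonal further up, so after n steps nothing remains.

```python
# Sketch: N strictly upper triangular 3x3 implies N^3 = 0 (but N^2 may be nonzero).

def mul(X, Y):
    n = len(X)
    return [[sum(X[i][k]*Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

N = [[0, 2, 5],
     [0, 0, 7],
     [0, 0, 0]]
N2 = mul(N, N)
N3 = mul(N2, N)
assert N2 != [[0]*3 for _ in range(3)]     # f^2 is not yet zero
assert N3 == [[0]*3 for _ in range(3)]     # but f^3 = 0
```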
Exercise   Use the transpose principle to write 3 other versions of this theorem.

Theorem   Suppose V is an n-dimensional vector space and f : V → V is a
homomorphism. Then f is nilpotent iff CP_f(x) = x^n. (See the exercise at the end
of Chapter 4 for the case n = 2.)
Proof   Suppose CP_f(x) = x^n. For n = 1 this implies f = 0̄, so suppose n > 1.
Since the constant term of CP_f(x) is 0̄, the determinant of f is 0̄. Thus ∃ a basis
of V such that the matrix A representing f has its first column zero. Let B ∈ F_{n−1}
be the matrix obtained from A by removing its first row and first column. Now
CP_A(x) = x^n = x·CP_B(x). Thus CP_B(x) = x^{n−1} and by induction on n, B is
nilpotent, and so ∃ C such that C^{-1}BC is strictly upper triangular. Then, in block
form,

( 1     0   ) ( 0  ∗ ··· ∗ ) ( 1   0 )     ( 0   ∗ ··· ∗  )
(           ) (             ) (        )  =  (               )
( 0  C^{-1} ) ( 0     B     ) ( 0   C )     ( 0   C^{-1}BC  )

is strictly upper triangular.
Exercise   Suppose F is a field, A ∈ F_3 is a strictly lower triangular matrix of
rank 2, and

B = ( 0  0  0 )
    ( 1  0  0 )
    ( 0  1  0 ).

Using conjugation by elementary matrices, show there is an invertible matrix C so
that C^{-1}AC = B. Now suppose V is a 3-dimensional vector space and f : V → V
is a nilpotent endomorphism of rank 2. We know f can be represented by a strictly
lower triangular matrix. Show there is a basis {v_1, v_2, v_3} for V so that B is the
matrix representing f. Also show that f(v_1) = v_2, f(v_2) = v_3, and f(v_3) = 0̄. In
other words, there is a basis for V of the form {v, f(v), f^2(v)} with f^3(v) = 0̄.
Exercise   Suppose V is a 3-dimensional vector space and f : V → V is a nilpotent
endomorphism of rank 1. Show there is a basis for V so that the matrix representing
f is

( 0  0  0 )
( 1  0  0 )
( 0  0  0 ).
Eigenvalues

Our standing hypothesis is that V is an n-dimensional vector space over a field F
and f : V → V is a homomorphism.

Definition   An element λ ∈ F is an eigenvalue of f if ∃ a non-zero v ∈ V with
f(v) = λv. Any such v is called an eigenvector. E_λ ⊂ V is defined to be the set of
all eigenvectors for λ (plus 0̄). Note that E_λ = ker(λI − f) is a subspace of V. The
next theorem shows the eigenvalues of f are just the characteristic roots of f.
Theorem   If λ ∈ F then the following are equivalent.

1)  λ is an eigenvalue of f, i.e., (λI − f) : V → V is not injective.
2)  |λI − f| = 0̄.
3)  λ is a characteristic root of f, i.e., a root of the characteristic
    polynomial CP_f(x) = |xI − A|, where A is any matrix representing f.

Proof   It is immediate that 1) and 2) are equivalent, so let's show 2) and 3)
are equivalent. The evaluation map F[x] → F which sends h(x) to h(λ) is a ring
homomorphism (see theorem on page 47). So evaluating (xI − A) at x = λ and
taking determinant gives the same result as taking the determinant of (xI − A) and
evaluating at x = λ. Thus 2) and 3) are equivalent.
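For a 2 × 2 matrix this equivalence is concrete (an assumed example, not from the text): CP_A(x) = x^2 − trace(A)x + |A|, and each root λ of that polynomial makes |λI − A| vanish.

```python
# Sketch: the eigenvalues of A are the roots of x^2 - trace*x + det, and each
# root l satisfies det(l*I - A) = 0.

A = [[4, 1], [2, 3]]
tr = A[0][0] + A[1][1]                       # 7
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]      # 10
roots = [2, 5]                               # roots of x^2 - 7x + 10
for l in roots:
    assert l*l - tr*l + det == 0             # l is a characteristic root
    M = [[l - A[0][0], -A[0][1]],
         [-A[1][0], l - A[1][1]]]            # l*I - A
    assert M[0][0]*M[1][1] - M[0][1]*M[1][0] == 0   # det(l*I - A) = 0
```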
The nicest thing you can say about a matrix is that it is similar to a diagonal
matrix. Here is one case where that happens.

Theorem   Suppose λ_1, .., λ_k are distinct eigenvalues of f, and v_i is an eigenvector
of λ_i for 1 ≤ i ≤ k. Then the following hold.

1)  {v_1, .., v_k} is independent.
2)  If k = n, i.e., if CP_f(x) = (x − λ_1)···(x − λ_n), then {v_1, .., v_n} is a
    basis for V. The matrix of f w.r.t. this basis is the diagonal matrix whose
    (i, i) term is λ_i.
Proof   Suppose {v_1, .., v_k} is dependent. Suppose t is the smallest positive integer
such that {v_1, .., v_t} is dependent, and v_1r_1 + ··· + v_tr_t = 0̄ is a non-trivial linear
combination. Note that at least two of the coefficients must be non-zero. Now
(f − λ_t)(v_1r_1 + ··· + v_tr_t) = v_1(λ_1 − λ_t)r_1 + ··· + v_{t−1}(λ_{t−1} − λ_t)r_{t−1} + 0̄ = 0̄ is a shorter
non-trivial linear combination. This is a contradiction and proves 1). Part 2) follows
from 1) because dim(V) = n.
Exercise   Let

A = (  0  1 ) ∈ R_2.
    ( −1  0 )

Find an invertible C ∈ C_2 such that C^{-1}AC is diagonal. Show that C cannot be
selected in R_2. Find the characteristic polynomial of A.
Exercise   Suppose V is a 3-dimensional vector space and f : V → V is an
endomorphism with CP_f(x) = (x − λ)^3. Show that (f − λI) has characteristic polynomial
x^3 and is thus a nilpotent endomorphism. Show there is a basis for V so that the
matrix representing f is

( λ  0  0 )     ( λ  0  0 )       ( λ  0  0 )
( 1  λ  0 ),    ( 1  λ  0 ),  or  ( 0  λ  0 )
( 0  1  λ )     ( 0  0  λ )       ( 0  0  λ ).
We could continue and ﬁnally give an ad hoc proof of the Jordan canonical form,
but in this chapter we prefer to press on to inner product spaces. The Jordan form
will be developed in Chapter 6 as part of the general theory of ﬁnitely generated
modules over Euclidean domains. The next section is included only as a convenient
reference.
Jordan Canonical Form
This section should be just skimmed or omitted entirely. It is unnecessary for the
rest of this chapter, and is not properly part of the ﬂow of the chapter. The basic
facts of Jordan form are summarized here simply for reference.
The statement that a square matrix B over a field F is a Jordan block means that
∃ λ ∈ F such that B is a lower triangular matrix of the form

B = ( λ             0 )
    ( 1  λ            )
    (    ⋱   ⋱        )
    ( 0       1     λ ).

B gives a homomorphism g : F^m → F^m with g(e_m) = λe_m and g(e_i) = e_{i+1} + λe_i
for 1 ≤ i < m. Note that CP_B(x) = (x − λ)^m and so λ is the
only eigenvalue of B, and B satisfies its characteristic polynomial, i.e., CP_B(B) = 0̄.
Definition   A matrix D ∈ F_n is in Jordan form if ∃ Jordan blocks B_1, .., B_t such
that

D = ( B_1               0 )
    (      B_2            )
    (           ⋱         )
    ( 0               B_t ).

Suppose D is of this form and B_i ∈ F_{n_i} has eigenvalue λ_i. Then n_1 + ··· + n_t = n
and CP_D(x) = (x − λ_1)^{n_1} ··· (x − λ_t)^{n_t}. Note that
a diagonal matrix is a special case of Jordan form. D is a diagonal matrix iff each
n_i = 1, i.e., iff each Jordan block is a 1 × 1 matrix.
Theorem   If A ∈ F_n, the following are equivalent.

1)  ∃ an invertible C ∈ F_n such that C^{-1}AC is in Jordan form.
2)  ∃ λ_1, .., λ_n ∈ F (not necessarily distinct) such that
    CP_A(x) = (x − λ_1)···(x − λ_n). (In this case we say that all the
    eigenvalues of A belong to F.)

Theorem   Jordan form (when it exists) is unique. This means that if A and D are
similar matrices in Jordan form, they have the same Jordan blocks, except possibly
in different order.
The reader should use the transpose principle to write three other versions of the
first theorem. Also note that we know one special case of this theorem, namely that
if A has n distinct eigenvalues in F, then A is similar to a diagonal matrix. Later on
it will be shown that if A is a symmetric real matrix, then A is similar to a diagonal
matrix.

Let's look at the classical case A ∈ R_n. The complex numbers are algebraically
closed. This means that CP_A(x) will factor completely in C[x], and thus ∃ C ∈ C_n
with C^{-1}AC in Jordan form. C may be selected to be in R_n iff all the eigenvalues
of A are real.
Exercise   Find all real matrices in Jordan form that have the following
characteristic polynomials: x(x − 2), (x − 2)^2, (x − 2)(x − 3)(x − 4), (x − 2)(x − 3)^2,
(x − 2)^2(x − 3)^2, (x − 2)(x − 3)^3.
Exercise   Suppose D ∈ F_n is in Jordan form and has characteristic polynomial
a_0 + a_1x + ··· + x^n. Show a_0I + a_1D + ··· + D^n = 0̄, i.e., show CP_D(D) = 0̄.
Exercise   (Cayley-Hamilton Theorem)   Suppose E is a field and A ∈ E_n.
Assume the theorem that there is a field F containing E such that CP_A(x) factors
completely in F[x]. Thus ∃ an invertible C ∈ F_n such that D = C^{-1}AC is in Jordan
form. Use this to show CP_A(A) = 0̄. (See the second exercise on page 66.)
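The n = 2 case of Cayley-Hamilton can be verified by hand or in a few lines of code (an assumed numerical check, not from the text): CP_A(x) = x^2 − trace(A)x + |A|, and substituting A for x gives the zero matrix.

```python
# Sketch: A^2 - trace(A)*A + |A|*I = 0 for a 2x2 integer matrix.

A = [[1, 2], [3, 4]]
tr = A[0][0] + A[1][1]
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
A2 = [[sum(A[i][k]*A[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
I = [[1, 0], [0, 1]]
CPA = [[A2[i][j] - tr*A[i][j] + det*I[i][j] for j in range(2)] for i in range(2)]
assert CPA == [[0, 0], [0, 0]]
```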
Exercise   Suppose A ∈ F_n is in Jordan form. Show A is nilpotent iff A^n = 0̄
iff CP_A(x) = x^n. (Note how easy this is in Jordan form.)
. (Note how easy this is in Jordan form.)
Inner Product Spaces
The two most important ﬁelds for mathematics and science in general are the
real numbers and the complex numbers. Finitely generated vector spaces over R or
C support inner products and are thus geometric as well as algebraic objects. The
theories for the real and complex cases are quite similar, and both could have been
treated here. However, for simplicity, attention is restricted to the case F = R.
In the remainder of this chapter, the power and elegance of linear algebra become
transparent for all to see.
Definition   Suppose V is a real vector space. An inner product (or dot product)
on V is a function V × V → R which sends (u, v) to u·v and satisfies

1)  (u_1r_1 + u_2r_2)·v = (u_1·v)r_1 + (u_2·v)r_2      for all u_1, u_2, v ∈ V
    v·(u_1r_1 + u_2r_2) = (v·u_1)r_1 + (v·u_2)r_2      and r_1, r_2 ∈ R.
2)  u·v = v·u   for all u, v ∈ V.
3)  u·u ≥ 0, and u·u = 0 iff u = 0̄,   for all u ∈ V.
Theorem   Suppose V has an inner product.

1)  If v ∈ V, f : V → R defined by f(u) = u·v is a homomorphism.
    Thus 0̄·v = 0.
2)  Schwarz' inequality. If u, v ∈ V, (u·v)^2 ≤ (u·u)(v·v).

Proof of 2)   Let a = √(v·v) and b = √(u·u). If a or b is 0, the result is obvious.
Suppose neither a nor b is 0. Now 0 ≤ (ua ± vb)·(ua ± vb) = (u·u)a^2 ± 2ab(u·v) +
(v·v)b^2 = b^2a^2 ± 2ab(u·v) + a^2b^2. Dividing by 2ab yields 0 ≤ ab ± (u·v), or |u·v| ≤ ab.
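Schwarz' inequality is easy to spot-check with the standard inner product on R^3 (an assumed numerical illustration, not from the text); equality occurs exactly when the two vectors are dependent.

```python
# Sketch: (u.v)^2 <= (u.u)(v.v) on sample integer vectors.

def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

samples = [((1, 2, 3), (4, -5, 6)), ((1, 0, 0), (0, 7, 0)), ((2, 2, 2), (3, 3, 3))]
for u, v in samples:
    assert dot(u, v)**2 <= dot(u, u) * dot(v, v)

# equality holds for the dependent pair (2,2,2) and (3,3,3)
assert dot((2, 2, 2), (3, 3, 3))**2 == dot((2, 2, 2), (2, 2, 2)) * dot((3, 3, 3), (3, 3, 3))
```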
Theorem   Suppose V has an inner product. Define the norm or length of a vector
v by ‖v‖ = √(v·v). The following properties hold.

1)  ‖v‖ = 0 iff v = 0̄.
2)  ‖vr‖ = ‖v‖|r|.
3)  |u·v| ≤ ‖u‖‖v‖.   (Schwarz' inequality)
4)  ‖u + v‖ ≤ ‖u‖ + ‖v‖.   (The triangle inequality)

Proof of 4)   ‖u + v‖^2 = (u + v)·(u + v) = ‖u‖^2 + 2(u·v) + ‖v‖^2 ≤ ‖u‖^2 +
2‖u‖‖v‖ + ‖v‖^2 = (‖u‖ + ‖v‖)^2.
Definition   An Inner Product Space (IPS) is a real vector space with an
inner product. Suppose V is an IPS. A sequence {v_1, .., v_n} is orthogonal provided
v_i·v_j = 0 when i ≠ j. The sequence is orthonormal if it is orthogonal and each
vector has length 1, i.e., v_i·v_j = δ_{i,j} for 1 ≤ i, j ≤ n.
Theorem If S = ¦v
1
, .., v
n
¦ is an orthogonal sequence of nonzero vectors in an
IPS V, then S is independent. Furthermore
v
1
v
1

, ,
v
n
v
n

¸
is orthonormal.
Proof Suppose v₁r₁ + ⋯ + vₙrₙ = 0̄. Then 0 = (v₁r₁ + ⋯ + vₙrₙ) · vᵢ = rᵢ(vᵢ · vᵢ)
and thus rᵢ = 0. Thus S is independent. The second statement is transparent.
It is easy to define an inner product, as is shown by the following theorem.

Theorem Suppose V is a real vector space with a basis S = {v₁, .., vₙ}. Then
there is a unique inner product on V which makes S an orthonormal basis. It is
given by the formula (v₁r₁ + ⋯ + vₙrₙ) · (v₁s₁ + ⋯ + vₙsₙ) = r₁s₁ + ⋯ + rₙsₙ.
Convention Rⁿ will be assumed to have the standard inner product defined by
(r₁, .., rₙ)ᵗ · (s₁, .., sₙ)ᵗ = r₁s₁ + ⋯ + rₙsₙ. S = {e₁, .., eₙ} will be called the canonical
or standard orthonormal basis (see page 72). The next theorem shows that this
inner product has an amazing geometry.

Theorem If u, v ∈ Rⁿ, u · v = ‖u‖‖v‖ cos Θ where Θ is the angle between u
and v.
Proof Let u = (r₁, .., rₙ) and v = (s₁, .., sₙ). By the law of cosines ‖u − v‖² =
‖u‖² + ‖v‖² − 2‖u‖‖v‖ cos Θ. So (r₁ − s₁)² + ⋯ + (rₙ − sₙ)² = r₁² + ⋯ + rₙ² + s₁² +
⋯ + sₙ² − 2‖u‖‖v‖ cos Θ. Thus r₁s₁ + ⋯ + rₙsₙ = ‖u‖‖v‖ cos Θ.
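Assuming the standard inner product on Rⁿ, the formula above can be solved for Θ and computed directly. A small sketch (our own function names, not from the text):

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def angle(u, v):
    # Theta = arccos( (u . v) / (||u|| ||v||) ), from  u . v = ||u|| ||v|| cos Theta
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

# e1 and e2 in R^2 are perpendicular: Theta = pi/2
print(angle([1, 0], [0, 1]))   # pi/2
# (1,0) and (1,1) make an angle of pi/4
print(angle([1, 0], [1, 1]))   # pi/4
```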
Exercise This is a simple exercise to observe that hyperplanes in Rⁿ are cosets.
Suppose f : Rⁿ → R is a nonzero homomorphism given by a matrix A = (a₁, .., aₙ) ∈
R₁,ₙ. Then L = ker(f) is the set of all solutions to a₁x₁ + ⋯ + aₙxₙ = 0, i.e., the
set of all vectors perpendicular to A. Now suppose b ∈ R and C = (c₁, .., cₙ)ᵗ ∈ Rⁿ
has f(C) = b. Then f⁻¹(b) is the set of all solutions to a₁x₁ + ⋯ + aₙxₙ = b, which
is the coset L + C, and this is the set of all solutions to a₁(x₁ − c₁) + ⋯ + aₙ(xₙ − cₙ) = 0.
Gram-Schmidt orthonormalization
Theorem (Fourier series) Suppose W is an IPS with an orthonormal basis
{w₁, .., wₙ}. Then if v ∈ W, v = w₁(v · w₁) + ⋯ + wₙ(v · wₙ).

Proof v = w₁r₁ + ⋯ + wₙrₙ and v · wᵢ = (w₁r₁ + ⋯ + wₙrₙ) · wᵢ = rᵢ.
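In Rⁿ with the standard inner product, the theorem says a vector's coordinates with respect to an orthonormal basis are just dot products. A quick check with a hypothetical orthonormal basis of R² (the 45° rotation of {e₁, e₂}):

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# An orthonormal basis of R^2
s = 1 / math.sqrt(2)
w1 = [s, s]
w2 = [-s, s]

v = [3.0, 1.0]

# Fourier expansion: v = w1 (v . w1) + w2 (v . w2)
c1, c2 = dot(v, w1), dot(v, w2)
recovered = [w1[i] * c1 + w2[i] * c2 for i in range(2)]
print(recovered)  # approximately [3.0, 1.0]
```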
Theorem Suppose W is an IPS, Y ⊂ W is a subspace with an orthonormal basis
{w₁, .., wₖ}, and v ∈ W − Y. Define the projection of v onto Y by p(v) = w₁(v · w₁) +
⋯ + wₖ(v · wₖ), and let w = v − p(v). Then (w · wᵢ) = (v − w₁(v · w₁) − ⋯ − wₖ(v · wₖ)) · wᵢ = 0.
Thus if wₖ₊₁ = w/‖w‖, then {w₁, .., wₖ₊₁} is an orthonormal basis for the subspace
generated by {w₁, .., wₖ, v}. If {w₁, .., wₖ, v} is already orthonormal, wₖ₊₁ = v.
Theorem (Gram-Schmidt) Suppose W is an IPS with a basis {v₁, .., vₙ}.
Then W has an orthonormal basis {w₁, .., wₙ}. Moreover, any orthonormal sequence
in W extends to an orthonormal basis of W.

Proof Let w₁ = v₁/‖v₁‖. Suppose inductively that {w₁, .., wₖ} is an orthonormal
basis for Y, the subspace generated by {v₁, .., vₖ}. Let w = vₖ₊₁ − p(vₖ₊₁) and
wₖ₊₁ = w/‖w‖. Then by the previous theorem, {w₁, .., wₖ₊₁} is an orthonormal basis
for the subspace generated by {w₁, .., wₖ, vₖ₊₁}. In this manner an orthonormal basis
for W is constructed. Notice that this construction defines a function h which sends
a basis for W to an orthonormal basis for W (see topology exercise on page 103).

Now suppose W has dimension n and {w₁, .., wₖ} is an orthonormal sequence in
W. Since this sequence is independent, it extends to a basis {w₁, .., wₖ, vₖ₊₁, .., vₙ}.
The process above may be used to modify this to an orthonormal basis {w₁, .., wₙ}.
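The inductive construction in the proof translates directly into an algorithm. Here is a minimal sketch for Rⁿ with the standard inner product (function names are ours):

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gram_schmidt(vectors):
    """Turn a basis {v1, .., vn} of R^n into an orthonormal basis {w1, .., wn}."""
    ws = []
    for v in vectors:
        # w = v - p(v), where p(v) is the projection of v onto span(ws)
        w = list(v)
        for u in ws:
            c = dot(v, u)
            w = [wi - ui * c for wi, ui in zip(w, u)]
        length = math.sqrt(dot(w, w))
        ws.append([wi / length for wi in w])
    return ws

basis = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ortho = gram_schmidt(basis)
# Check orthonormality: wi . wj = delta_ij
for i, wi in enumerate(ortho):
    for j, wj in enumerate(ortho):
        assert abs(dot(wi, wj) - (1.0 if i == j else 0.0)) < 1e-9
```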
Exercise Let f : R³ → R be the homomorphism defined by the matrix (2,1,3).
Find an orthonormal basis for the kernel of f. Find the projection of (e₁ + e₂) onto
ker(f). Find the angle between e₁ + e₂ and the plane ker(f).
Exercise Let W = R³ have the standard inner product and Y ⊂ W be the
subspace generated by {w₁, w₂} where w₁ = (1, 0, 0)ᵗ and w₂ = (0, 1, 0)ᵗ. W is
generated by the sequence {w₁, w₂, v} where v = (1, 2, 3)ᵗ. As in the first theorem
of this section, let w = v − p(v), where p(v) is the projection of v onto Y, and set
w₃ = w/‖w‖. Find w₃ and show that for any t with 0 ≤ t ≤ 1, {w₁, w₂, (1 − t)v + tw₃}
is a basis for W. This is a key observation for an exercise on page 103 showing O(n)
is a deformation retract of GLₙ(R).
Isometries Suppose each of U and V is an IPS. A homomorphism f : U → V
is said to be an isometry provided it is an isomorphism and for any u₁, u₂ in U,
(u₁ · u₂)_U = (f(u₁) · f(u₂))_V.
Theorem Suppose each of U and V is an n-dimensional IPS, {u₁, .., uₙ} is an
orthonormal basis for U, and f : U → V is a homomorphism. Then f is an isometry
iff {f(u₁), .., f(uₙ)} is an orthonormal sequence in V.

Proof Isometries certainly preserve orthonormal sequences. So suppose T =
{f(u₁), .., f(uₙ)} is an orthonormal sequence in V. Then T is independent and thus
T is a basis for V and thus f is an isomorphism (see the second theorem on page 79).
It is easy to check that f preserves inner products.
We now come to one of the definitive theorems in linear algebra. It is that, up to
isometry, there is only one inner product space for each dimension.
Theorem Suppose each of U and V is an n-dimensional IPS. Then ∃ an isometry
f : U → V. In particular, U is isometric to Rⁿ with its standard inner product.

Proof There exist orthonormal bases {u₁, .., uₙ} for U and {v₁, .., vₙ} for V.
By the first theorem on page 79, there exists a homomorphism f : U → V with
f(uᵢ) = vᵢ, and by the previous theorem, f is an isometry.
Exercise Let f : R³ → R be the homomorphism defined by the matrix (2,1,3).
Find a linear transformation h : R² → R³ which gives an isometry from R² to ker(f).
Orthogonal Matrices

As noted earlier, linear algebra is not so much the study of vector spaces as it is
the study of endomorphisms. We now wish to study isometries from Rⁿ to Rⁿ.
We know from a theorem on page 90 that an endomorphism preserves volume iff
its determinant is ±1. Isometries preserve inner product, and thus preserve angle and
distance, and so certainly preserve volume.
Theorem Suppose A ∈ Rₙ and f : Rⁿ → Rⁿ is the homomorphism defined by
f(B) = AB. Then the following are equivalent.

1) The columns of A form an orthonormal basis for Rⁿ, i.e., AᵗA = I.

2) The rows of A form an orthonormal basis for Rⁿ, i.e., AAᵗ = I.

3) f is an isometry.

Proof A left inverse of a matrix is also a right inverse (see the exercise on
page 64). Thus 1) and 2) are equivalent because each of them says A is invertible
with A⁻¹ = Aᵗ. Now {e₁, .., eₙ} is the canonical orthonormal basis for Rⁿ, and
f(eᵢ) is column i of A. Thus by the previous section, 1) and 3) are equivalent.
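The equivalence of 1) and 2) can be observed numerically: for a rotation matrix, both AᵗA and AAᵗ come out to the identity. A small sketch with hand-rolled 2×2 matrix arithmetic (helper names are ours, not from the text):

```python
import math

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_identity(m, eps=1e-12):
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < eps
               for i in range(len(m)) for j in range(len(m)))

t = math.pi / 6
A = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]

# Columns orthonormal iff A^t A = I; rows orthonormal iff A A^t = I
assert is_identity(matmul(transpose(A), A))
assert is_identity(matmul(A, transpose(A)))
```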
Definition If A ∈ Rₙ satisfies these three conditions, A is said to be orthogonal.
The set of all such A is denoted by O(n), and is called the orthogonal group.
Theorem

1) If A is orthogonal, |A| = ±1.

2) If A is orthogonal, A⁻¹ is orthogonal. If A and C are orthogonal, AC is
orthogonal. Thus O(n) is a multiplicative subgroup of GLₙ(R).

3) Suppose A is orthogonal and f is defined by f(B) = AB. Then f preserves
distances and angles. This means that if u, v ∈ Rⁿ, ‖u − v‖ =
‖f(u) − f(v)‖ and the angle between u and v is equal to the angle between
f(u) and f(v).

Proof Part 1) follows from |A|² = |A| |Aᵗ| = |I| = 1. Part 2) is immediate,
because isometries clearly form a subgroup of the multiplicative group of
all automorphisms. For part 3) assume f : Rⁿ → Rⁿ is an isometry. Then
‖u − v‖² = (u − v) · (u − v) = f(u − v) · f(u − v) = ‖f(u − v)‖² = ‖f(u) − f(v)‖².
The proof that f preserves angles follows from u · v = ‖u‖‖v‖ cos Θ.
Exercise Show that if A ∈ O(2) has |A| = 1, then

A =  ( cos Θ  −sin Θ )
     ( sin Θ   cos Θ )

for some number Θ. (See the exercise on page 56.)
Exercise (topology) Let Rₙ ≈ Rⁿ² have its usual metric topology. This means
a sequence of matrices {Aᵢ} converges to A iff it converges coordinatewise. Show
GLₙ(R) is an open subset and O(n) is closed and compact. Let h : GLₙ(R) →
O(n) be defined by Gram-Schmidt. Show H : GLₙ(R) × [0, 1] → GLₙ(R) defined
by H(A, t) = (1 − t)A + th(A) is a deformation retract of GLₙ(R) to O(n).
Diagonalization of Symmetric Matrices

We continue with the case F = R. Our goals are to prove that, if A is a symmetric
matrix, all of its eigenvalues are real and that ∃ an orthogonal matrix C such that
C⁻¹AC is diagonal. As background, we first note that symmetric is the same as
self-adjoint.
Theorem Suppose A ∈ Rₙ and u, v ∈ Rⁿ. Then (Aᵗu) · v = u · (Av).

Proof If y, z ∈ Rⁿ, then the dot product y · z is the matrix product yᵗz, and
matrix multiplication is associative. Thus (Aᵗu) · v = (uᵗA)v = uᵗ(Av) = u · (Av).
Definition Suppose A ∈ Rₙ. A is said to be symmetric provided Aᵗ = A. Note
that any diagonal matrix is symmetric. A is said to be self-adjoint if (Au) · v = u · (Av)
for all u, v ∈ Rⁿ. The next theorem is just an exercise using the previous theorem.

Theorem A is symmetric iff A is self-adjoint.
Theorem Suppose A ∈ Rₙ is symmetric. Then ∃ real numbers λ₁, .., λₙ (not
necessarily distinct) such that CP_A(x) = (x − λ₁)(x − λ₂) ⋯ (x − λₙ). That is, all
the eigenvalues of A are real.

Proof We know CP_A(x) factors into linears over C. If µ = a + bi is a complex
number, its conjugate is defined by µ̄ = a − bi. If h : C → C is defined by h(µ) = µ̄,
then h is a ring isomorphism which is the identity on R. If w = (aᵢⱼ) is a complex
matrix or vector, its conjugate is defined by w̄ = (āᵢⱼ). Since A ∈ Rₙ is a real
symmetric matrix, A = Aᵗ = Āᵗ. Now suppose λ is a complex eigenvalue of A
and v ∈ Cⁿ is an eigenvector with Av = λv. Then λ(vᵗv̄) = (λv)ᵗv̄ = (Av)ᵗv̄ =
(vᵗA)v̄ = vᵗ(Av̄) = vᵗ(λ̄v̄) = λ̄(vᵗv̄), where Av̄ = λ̄v̄ because A is real and Av = λv.
Thus λ = λ̄ and λ ∈ R. Or
you can define a complex inner product on Cⁿ by (w · v) = wᵗv̄. The proof then
reads as λ(v · v) = (λv · v) = (Av · v) = (v · Av) = (v · λv) = λ̄(v · v). Either way,
λ is a real number.
We know that eigenvectors belonging to distinct eigenvalues are linearly independent.
For symmetric matrices, we show more, namely that they are perpendicular.

Theorem Suppose A is symmetric, λ₁, λ₂ ∈ R are distinct eigenvalues of A, and
Au = λ₁u and Av = λ₂v. Then u · v = 0.

Proof λ₁(u · v) = (Au) · v = u · (Av) = λ₂(u · v).
Review Suppose A ∈ Rₙ and f : Rⁿ → Rⁿ is defined by f(B) = AB. Then A
represents f w.r.t. the canonical orthonormal basis. Let S = {v₁, .., vₙ} be another
basis and C ∈ Rₙ be the matrix with vᵢ as column i. Then C⁻¹AC is the matrix
representing f w.r.t. S. Now S is an orthonormal basis iff C is an orthogonal matrix.

Summary Representing f w.r.t. an orthonormal basis is the same as conjugating
A by an orthogonal matrix.
Theorem Suppose A ∈ Rₙ and C ∈ O(n). Then A is symmetric iff C⁻¹AC
is symmetric.

Proof Suppose A is symmetric. Then (C⁻¹AC)ᵗ = CᵗA(C⁻¹)ᵗ = C⁻¹AC.
The next theorem has geometric and physical implications, but for us, just the
incredibility of it all will suffice.
Theorem If A ∈ Rₙ, the following are equivalent.

1) A is symmetric.

2) ∃ C ∈ O(n) such that C⁻¹AC is diagonal.

Proof By the previous theorem, 2) ⇒ 1). Show 1) ⇒ 2). Suppose A is a
symmetric 2×2 matrix. Let λ be an eigenvalue for A and {v₁, v₂} be an orthonormal
basis for R² with Av₁ = λv₁. Then w.r.t. this basis, the transformation determined
by A is represented by

( λ  b )
( 0  d ).

Since this matrix is symmetric, b = 0.

Now suppose by induction that the theorem is true for symmetric matrices in
Rₜ for t < n, and suppose A is a symmetric n×n matrix. Denote by λ₁, .., λₖ the
distinct eigenvalues of A, k ≤ n. If k = n, the proof is immediate, because then there
is a basis of eigenvectors of length 1, and they must form an orthonormal basis. So
suppose k < n. Let v₁, .., vₖ be eigenvectors for λ₁, .., λₖ with each ‖vᵢ‖ = 1. They
may be extended to an orthonormal basis v₁, .., vₙ. With respect to this basis, the
transformation determined by A is represented by the block matrix

( diag(λ₁, .., λₖ)  (B) )
(       (0)        (D) ).

Since this is a symmetric matrix, B = 0 and D is a symmetric matrix of smaller
size. By induction, ∃ an orthogonal C such that C⁻¹DC is diagonal. Thus conjugating
by

( I  0 )
( 0  C )

makes the entire matrix diagonal.
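For the 2×2 case the theorem can be checked by hand. For the symmetric matrix A = ( 2 2 ; 2 2 ) the eigenvalues are 4 and 0, and normalizing the eigenvectors (1, 1)ᵗ and (1, −1)ᵗ gives an orthogonal C. This sketch verifies that C⁻¹AC = CᵗAC is diagonal (helper names are ours):

```python
import math

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[2.0, 2.0],
     [2.0, 2.0]]

# Columns of C are the normalized eigenvectors of A
s = 1 / math.sqrt(2)
C = [[s,  s],
     [s, -s]]

# C is orthogonal, so C^{-1} = C^t, and C^{-1} A C = C^t A C
D = matmul(transpose(C), matmul(A, C))
print(D)  # approximately [[4, 0], [0, 0]]
```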
This theorem is so basic we state it again in different terminology. If V is an IPS, a
linear transformation f : V → V is said to be self-adjoint provided (u · f(v)) = (f(u) · v)
for all u, v ∈ V.

Theorem If V is an n-dimensional IPS and f : V → V is a linear transformation,
then the following are equivalent.

1) f is self-adjoint.

2) ∃ an orthonormal basis {v₁, ..., vₙ} for V with each vᵢ an eigenvector of f.
Exercise Let A =

( 2  2 )
( 2  2 ).

Find an orthogonal C such that C⁻¹AC is diagonal. Do the same for A =

( 2  1 )
( 1  2 ).

Exercise Suppose A, D ∈ Rₙ are symmetric. Under what conditions are A and D
similar? Show that, if A and D are similar, ∃ an orthogonal C such that D = C⁻¹AC.
Exercise Suppose V is an n-dimensional real vector space. We know that V is
isomorphic to Rⁿ. Suppose f and g are isomorphisms from V to Rⁿ and A is a subset
of V. Show that f(A) is an open subset of Rⁿ iff g(A) is an open subset of Rⁿ. This
shows that V, an algebraic object, has a god-given topology. Of course, if V has
an inner product, it automatically has a metric, and this metric will determine that
same topology. Finally, suppose V and W are finite-dimensional real vector spaces
and h : V → W is a linear transformation. Show that h is continuous.
Exercise Define E : Cₙ → Cₙ by E(A) = e^A = I + A + (1/2!)A² + ⋯. This series
converges and thus E is a well defined function. If AB = BA, then E(A + B) =
E(A)E(B). Since A and −A commute, I = E(0̄) = E(A − A) = E(A)E(−A), and
thus E(A) is invertible with E(A)⁻¹ = E(−A). Furthermore E(Aᵗ) = E(A)ᵗ, and
if C is invertible, E(C⁻¹AC) = C⁻¹E(A)C. Now use the results of this section to
prove the statements below. (For part 1, assume the Jordan form, i.e., assume any
A ∈ Cₙ is similar to a lower triangular matrix.)

1) If A ∈ Cₙ, then |e^A| = e^trace(A). Thus if A ∈ Rₙ, |e^A| = 1
iff trace(A) = 0.

2) ∃ a nonzero matrix N ∈ R₂ with e^N = I.

3) If N ∈ Rₙ is symmetric, then e^N = I iff N = 0̄.

4) If A ∈ Rₙ and Aᵗ = −A, then e^A ∈ O(n).
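Statement 4) can be observed numerically: for the skew-symmetric 2×2 matrix A = ( 0 −t ; t 0 ), a truncated exponential series gives (to machine precision) the rotation matrix ( cos t −sin t ; sin t cos t ), an element of O(2). This is only a sketch under a truncation assumption, not a proof:

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def expm(a, terms=30):
    # Truncation of e^A = I + A + (1/2!)A^2 + ...
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]  # A^0 = I
    fact = 1.0
    for k in range(1, terms):
        power = matmul(power, a)
        fact *= k
        result = [[result[i][j] + power[i][j] / fact for j in range(n)]
                  for i in range(n)]
    return result

t = 0.7
A = [[0.0, -t],
     [t,  0.0]]   # A^t = -A

E = expm(A)
# e^A is rotation by t, which lies in O(2)
assert abs(E[0][0] - math.cos(t)) < 1e-12
assert abs(E[1][0] - math.sin(t)) < 1e-12
```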
Chapter 6
Appendix
The five previous chapters were designed for a year undergraduate course in algebra.
In this appendix, enough material is added to form a basic first year graduate course.
Two of the main goals are to characterize finitely generated abelian groups and to
prove the Jordan canonical form. The style is the same as before, i.e., everything is
right down to the nub. The organization is mostly a linearly ordered sequence except
for the last two sections on determinants and dual spaces. These are independent
sections added on at the end.

Suppose R is a commutative ring. An R-module M is said to be cyclic if it can
be generated by one element, i.e., M ≈ R/I where I is an ideal of R. The basic
theorem of this chapter is that if R is a Euclidean domain and M is a finitely generated
R-module, then M is the sum of cyclic modules. Thus if M is torsion free, it is a
free R-module. Since Z is a Euclidean domain, finitely generated abelian groups
are the sums of cyclic groups – one of the jewels of abstract algebra.

Now suppose F is a field and V is a finitely generated F-module. If T : V → V is
a linear transformation, then V becomes an F[x]-module by defining vx = T(v). Now
F[x] is a Euclidean domain and so V_F[x] is the sum of cyclic modules. This classical
and very powerful technique allows an easy proof of the canonical forms. There is a
basis for V so that the matrix representing T is in Rational canonical form. If the
characteristic polynomial of T factors into the product of linear polynomials, then
there is a basis for V so that the matrix representing T is in Jordan canonical form.
This always holds if F = C. A matrix in Jordan form is a lower triangular matrix
with the eigenvalues of T displayed on the diagonal, so this is a powerful concept.

In the chapter on matrices, it is stated without proof that the determinant of the
product is the product of the determinants. A proof of this, which depends upon the
classification of certain types of alternating multilinear forms, is given in this chapter.
The final section gives the fundamentals of dual spaces.
The Chinese Remainder Theorem
On page 50 in the chapter on rings, the Chinese Remainder Theorem was proved
for the ring of integers. In this section this classical topic is presented in full generality.
Surprisingly, the theorem holds even for noncommutative rings.
Definition Suppose R is a ring and A₁, A₂, ..., Aₘ are ideals of R. Then the sum
A₁ + A₂ + ⋯ + Aₘ is the set of all a₁ + a₂ + ⋯ + aₘ with aᵢ ∈ Aᵢ. The product
A₁A₂ ⋯ Aₘ is the set of all finite sums of elements a₁a₂ ⋯ aₘ with aᵢ ∈ Aᵢ. Note
that the sum and product of ideals are ideals and A₁A₂ ⋯ Aₘ ⊂ (A₁ ∩ A₂ ∩ ⋯ ∩ Aₘ).
Definition Ideals A and B of R are said to be comaximal if A + B = R.

Theorem If A and B are ideals of a ring R, then the following are equivalent.

1) A and B are comaximal.

2) ∃ a ∈ A and b ∈ B with a + b = 1̄.

3) π(A) = R/B where π : R → R/B is the projection.
Theorem If A₁, A₂, ..., Aₘ and B are ideals of R with Aᵢ and B comaximal for
each i, then A₁A₂ ⋯ Aₘ and B are comaximal. Thus A₁ ∩ A₂ ∩ ⋯ ∩ Aₘ and B
are comaximal.

Proof Consider π : R → R/B. Then π(A₁A₂ ⋯ Aₘ) = π(A₁)π(A₂) ⋯ π(Aₘ) =
(R/B)(R/B) ⋯ (R/B) = R/B.
Chinese Remainder Theorem Suppose A₁, A₂, ..., Aₙ are pairwise comaximal
ideals of R, with each Aᵢ ≠ R. Then the natural map π : R → R/A₁ × R/A₂ × ⋯ × R/Aₙ
is a surjective ring homomorphism with kernel A₁ ∩ A₂ ∩ ⋯ ∩ Aₙ.

Proof There exists aᵢ ∈ Aᵢ and bᵢ ∈ A₁A₂ ⋯ Aᵢ₋₁Aᵢ₊₁ ⋯ Aₙ with aᵢ + bᵢ = 1̄. Note
that π(bᵢ) = (0, .., 0, 1̄, 0, .., 0), with the 1̄ in coordinate i. If (r₁ + A₁, r₂ + A₂, ..., rₙ + Aₙ) is an element of the
range, it is the image of r₁b₁ + r₂b₂ + ⋯ + rₙbₙ = r₁(1̄ − a₁) + r₂(1̄ − a₂) + ⋯ + rₙ(1̄ − aₙ).
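For R = Z and ideals Aᵢ = nᵢZ with the nᵢ pairwise relatively prime, the proof is constructive: the elements bᵢ come from writing aᵢ + bᵢ = 1 via the extended Euclidean algorithm. A sketch for the integers (function names are ours):

```python
def extended_gcd(a, b):
    # Returns (g, x, y) with a*x + b*y = g = gcd(a, b)
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def crt(residues, moduli):
    """Solve x = r_i (mod n_i) for pairwise relatively prime moduli n_i."""
    n = 1
    for m in moduli:
        n *= m
    x = 0
    for r, m in zip(residues, moduli):
        # other * inv = 1 mod m and other = 0 mod the remaining moduli,
        # so other * inv plays the role of the element b_i in the proof
        other = n // m
        g, inv, _ = extended_gcd(other, m)
        assert g == 1  # comaximality
        x += r * other * inv
    return x % n

print(crt([2, 3, 2], [3, 5, 7]))  # 23: 23 = 2 mod 3, 3 mod 5, 2 mod 7
```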
Theorem If R is commutative and A₁, A₂, ..., Aₙ are pairwise comaximal ideals
of R, then A₁A₂ ⋯ Aₙ = A₁ ∩ A₂ ∩ ⋯ ∩ Aₙ.

Proof for n = 2. Show A₁ ∩ A₂ ⊂ A₁A₂. ∃ a₁ ∈ A₁ and a₂ ∈ A₂ with a₁ + a₂ = 1̄.
If c ∈ A₁ ∩ A₂, then c = c(a₁ + a₂) ∈ A₁A₂.
Prime and Maximal Ideals and UFDs

In the first chapter on background material, it was shown that Z is a unique
factorization domain. Here it will be shown that this property holds for any principal
ideal domain. Later on it will be shown that every Euclidean domain is a principal
ideal domain. Thus every Euclidean domain is a unique factorization domain.
Definition Suppose R is a commutative ring and I ⊂ R is an ideal.

I is prime means I ≠ R and if a, b ∈ R have ab ∈ I, then a or b ∈ I.

I is maximal means I ≠ R and there are no ideals properly between I and R.
Theorem 0̄ is a prime ideal of R iff R is __________.

0̄ is a maximal ideal of R iff R is __________.

Theorem Suppose J ⊂ R is an ideal, J ≠ R.

J is a prime ideal iff R/J is __________.

J is a maximal ideal iff R/J is __________.

Corollary Maximal ideals are prime.

Proof Every field is a domain.
Theorem If a ∈ R is not a unit, then ∃ a maximal ideal I of R with a ∈ I.

Proof This is a classical application of the Hausdorff Maximality Principle. Consider
{J : J is an ideal of R containing a with J ≠ R}. This collection contains a
maximal monotonic collection {Vₜ}, t ∈ T. The ideal V = ⋃_{t∈T} Vₜ does not contain 1̄ and
thus is not equal to R. Therefore V is equal to some Vₜ and is a maximal ideal
containing a.
Note To properly appreciate this proof, the student should work the exercise in
group theory at the end of this section (see page 114).
Definition Suppose R is a domain and a, b ∈ R. Then we say a ∼ b iff there
exists a unit u with au = b. Note that ∼ is an equivalence relation. If a ∼ b, then a
and b are said to be associates.

Examples If R is a domain, the associates of 1̄ are the units of R, while the only
associate of 0̄ is 0̄ itself. If n ∈ Z is not zero, then its associates are n and −n.
If F is a field and g ∈ F[x] is a nonzero polynomial, then the associates of g are
all cg where c is a nonzero constant.
The following theorem is elementary, but it shows how associates fit into the
scheme of things. An element a divides b (a|b) if ∃! c ∈ R with ac = b.

Theorem Suppose R is a domain and a, b ∈ (R − 0̄). Then the following are
equivalent.

1) a ∼ b.

2) a|b and b|a.

3) aR = bR.

Parts 1) and 3) above show there is a bijection from the associate classes of R to
the principal ideals of R. Thus if R is a PID, there is a bijection from the associate
classes of R to the ideals of R. If an element of a domain generates a nonzero prime
ideal, it is called a prime element.

Definition Suppose R is a domain and a ∈ R is a nonzero nonunit.

1) a is irreducible if it does not factor, i.e., a = bc ⇒ b or c is a unit.

2) a is prime if it generates a prime ideal, i.e., a|bc ⇒ a|b or a|c.
Note If a is a prime and a|c₁c₂ ⋯ cₙ, then a|cᵢ for some i. This follows from the
definition and induction on n. If each cⱼ is irreducible, then a ∼ cᵢ for some i.

Note If a ∼ b, then a is irreducible (prime) iff b is irreducible (prime). In other
words, if a is irreducible (prime) and u is a unit, then au is irreducible (prime).

Note a is prime ⇒ a is irreducible. This is immediate from the definitions.
Theorem Factorization into primes is unique up to order and associates, i.e., if
d = b₁b₂ ⋯ bₙ = c₁c₂ ⋯ cₘ with each bᵢ and each cᵢ prime, then n = m and for some
permutation σ of the indices, bᵢ and c_σ(i) are associates for every i. Note also ∃ a unit
u and primes p₁, p₂, . . . , pₜ where no two are associates and du = p₁^s₁ p₂^s₂ ⋯ pₜ^sₜ.
Proof This follows from the notes above.
Definition R is a factorization domain (FD) means that R is a domain and if a is
a nonzero nonunit element of R, then a factors into a finite product of irreducibles.

Definition R is a unique factorization domain (UFD) means R is a FD in which
factorization is unique (up to order and associates).

Theorem If R is a UFD and a is a nonzero nonunit of R, then a is irreducible
⇔ a is prime. Thus in a UFD, elements factor as the product of primes.

Proof Suppose R is a UFD, a is an irreducible element of R, and a|bc. If either
b or c is a unit or is zero, then a divides one of them, so suppose each of b and c is
a nonzero nonunit element of R. There exists an element d with ad = bc. Each of
b and c factors as the product of irreducibles and the product of these products is
the factorization of bc. It follows from the uniqueness of the factorization of ad = bc,
that one of these irreducibles is an associate of a, and thus a|b or a|c. Therefore
the element a is a prime.
Theorem Suppose R is a FD. Then the following are equivalent.

1) R is a UFD.

2) Every irreducible element of R is prime, i.e., a irreducible ⇔ a is prime.

Proof We already know 1) ⇒ 2). Part 2) ⇒ 1) because factorization into primes
is always unique.

This is a revealing and useful theorem. If R is a FD, then R is a UFD iff each
irreducible element generates a prime ideal. Fortunately, principal ideal domains
have this property, as seen in the next theorem.

Theorem Suppose R is a PID and a ∈ R is nonzero nonunit. Then the following
are equivalent.

1) aR is a maximal ideal.

2) aR is a prime ideal, i.e., a is a prime element.

3) a is irreducible.

Proof Every maximal ideal is a prime ideal, so 1) ⇒ 2). Every prime element is
an irreducible element, so 2) ⇒ 3). Now suppose a is irreducible and show aR is a
maximal ideal. If I is an ideal containing aR, ∃ b ∈ R with I = bR. Since b divides
a, the element b is a unit or an associate of a. This means I = R or I = aR.
Our goal is to prove that a PID is a UFD. Using the two theorems above, it
only remains to show that a PID is a FD. The proof will not require that ideals be
principally generated, but only that they be finitely generated. This turns out to
be equivalent to the property that any collection of ideals has a “maximal” element.
We shall see below that this is a useful concept which fits naturally into the study of
unique factorization domains.

Theorem Suppose R is a commutative ring. Then the following are equivalent.

1) If I ⊂ R is an ideal, ∃ a finite set {a₁, a₂, ..., aₙ} ⊂ R such that I =
a₁R + a₂R + ⋯ + aₙR, i.e., each ideal of R is finitely generated.

2) Any nonvoid collection of ideals of R contains an ideal I which is maximal in
the collection. This means if J is an ideal in the collection with J ⊃ I, then
J = I. (The ideal I is maximal only in the sense described. It need not contain
all the ideals of the collection, nor need it be a maximal ideal of the ring R.)

3) If I₁ ⊂ I₂ ⊂ I₃ ⊂ ... is a monotonic sequence of ideals, ∃ t₀ ≥ 1 such that Iₜ = Iₜ₀
for all t ≥ t₀.
Proof Suppose 1) is true and show 3). The ideal I = I₁ ∪ I₂ ∪ . . . is finitely
generated and ∃ t₀ ≥ 1 such that Iₜ₀ contains those generators. Thus 3) is true. Now
suppose 2) is true and show 1). Let I be an ideal of R, and consider the collection
of all finitely generated ideals contained in I. By 2) there is a maximal one, and it
must be I itself, and thus 1) is true. We now have 2) ⇒ 1) ⇒ 3), so suppose 2) is false
and show 3) is false. So there is a collection of ideals of R such that any ideal in the
collection is properly contained in another ideal of the collection. Thus it is possible
to construct a sequence of ideals I₁ ⊂ I₂ ⊂ I₃ . . . with each properly contained in
the next, and therefore 3) is false. (Actually this construction requires the Hausdorff
Maximality Principle or some form of the Axiom of Choice, but we slide over that.)
Definition If R satisfies these properties, R is said to be Noetherian, or it is said
to satisfy the ascending chain condition. This property is satisfied by many of the
classical rings in mathematics. Having three definitions makes this property useful
and easy to use. For example, see the next theorem.

Theorem A Noetherian domain is a FD. In particular, a PID is a FD.

Proof Suppose there is a nonzero nonunit element that does not factor as the
finite product of irreducibles. Consider all ideals dR where d does not factor. Since
R is Noetherian, ∃ a maximal one cR. The element c must be reducible, i.e., c = ab
where neither a nor b is a unit. Each of aR and bR properly contains cR, and so each
of a and b factors as a finite product of irreducibles. This gives a finite factorization
of c into irreducibles, which is a contradiction.
Corollary A PID is a UFD. So Z is a UFD and if F is a field, F[x] is a UFD.
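Since Z is a UFD, every nonzero nonunit of Z factors into primes uniquely up to order and sign. A trial-division sketch for positive integers, illustrating the corollary (not part of the text):

```python
def prime_factors(n):
    """Factor n > 1 into a nondecreasing list of primes.

    The list is unique, by the UFD property of Z.
    """
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # leftover factor is prime
    return factors

print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
```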
You see the basic structure of UFDs is quite easy. It takes more work to prove
the following theorems, which are stated here only for reference.
Theorem If R is a UFD then R[x₁, ..., xₙ] is a UFD. Thus if F is a field,
F[x₁, ..., xₙ] is a UFD. (This theorem goes all the way back to Gauss.)

If R is a PID, then the formal power series R[[x₁, ..., xₙ]] is a UFD. Thus if F
is a field, F[[x₁, ..., xₙ]] is a UFD. (There is a UFD R where R[[x]] is not a UFD.
See page 566 of Commutative Algebra by N. Bourbaki.)
Theorem Germs of analytic functions on Cⁿ form a UFD.

Proof See Theorem 6.6.2 of An Introduction to Complex Analysis in Several
Variables by L. Hörmander.

Theorem Suppose R is a commutative ring. Then R is Noetherian ⇒ R[x₁, ..., xₙ]
and R[[x₁, ..., xₙ]] are Noetherian. (This is the famous Hilbert Basis Theorem.)

Theorem If R is Noetherian and I ⊂ R is a proper ideal, then R/I is Noetherian.
(This follows immediately from the definition. This and the previous theorem show
that Noetherian is a ubiquitous property in ring theory.)
Domains With Nonunique Factorizations Next are presented two of the
standard examples of Noetherian domains that are not unique factorization domains.
Exercise Let R = Z(√5) = {n + m√5 : n, m ∈ Z}. Show that R is a subring of
the field of real numbers which is not a UFD. In particular 2 · 2 = (1 − √5) · (−1 − √5) are two distinct
irreducible factorizations of 4. Show R is isomorphic to Z[x]/(x² − 5), where (x² − 5)
represents the ideal (x² − 5)Z[x], and R/(2) is isomorphic to Z₂[x]/(x² − [5]) =
Z₂[x]/(x² + [1]), which is not a domain.
Exercise Let R = R[x, y, z]/(x² − yz). Show x² − yz is irreducible and thus
prime in R[x, y, z]. If u ∈ R[x, y, z], let ū ∈ R be the coset containing u. Show R
is not a UFD. In particular x̄ · x̄ = ȳ · z̄ are two distinct irreducible factorizations
of x̄². Show R/(x̄) is isomorphic to R[y, z]/(yz), which is not a domain. An easier
approach is to let f : R[x, y, z] → R[x, y] be the ring homomorphism defined by
f(x) = xy, f(y) = x², and f(z) = y². Then S = R[xy, x², y²] is the image of
f and S is isomorphic to R. Note that xy, x², and y² are irreducible in S and
(xy)(xy) = (x²)(y²) are two distinct irreducible factorizations of (xy)² in S.
Exercise In Group Theory If G is an additive abelian group, a subgroup H
of G is said to be maximal if H ≠ G and there are no subgroups properly between
H and G. Show that H is maximal iff G/H ≈ Zₚ for some prime p. For simplicity,
consider the case G = Q. Which one of the following is true?

1) If a ∈ Q, then there is a maximal subgroup H of Q which contains a.

2) Q contains no maximal subgroups.
Splitting Short Exact Sequences

Suppose B is an R-module and K is a submodule of B. As defined in the chapter
on linear algebra, K is a summand of B provided ∃ a submodule L of B with
K + L = B and K ∩ L = 0̄. In this case we write K ⊕ L = B. When is K a summand
of B? It turns out that K is a summand of B iff there is a splitting map from
B/K to B. In particular, if B/K is free, K must be a summand of B. This is used
below to show that if R is a PID, then every submodule of Rⁿ is free.

Theorem 1 Suppose R is a ring, B and C are R-modules, and g : B → C is a
surjective homomorphism with kernel K. Then the following are equivalent.

1) K is a summand of B.

2) g has a right inverse, i.e., ∃ a homomorphism h : C → B with g ◦ h = I : C → C.
(h is called a splitting map.)

Proof Suppose 1) is true, i.e., suppose ∃ a submodule L of B with K ⊕ L = B.
Then (g|L) : L → C is an isomorphism. If i : L → B is inclusion, then h defined
by h = i ◦ (g|L)⁻¹ is a right inverse of g. Now suppose 2) is true and h : C → B
is a right inverse of g. Then h is injective, K + h(C) = B and K ∩ h(C) = 0̄.
Thus K ⊕ h(C) = B.
Definition Suppose f : A → B and g : B → C are R-module homomorphisms.
The statement that 0 → A → B → C → 0 is a short exact sequence (s.e.s.) means
f is injective, g is surjective and f(A) = ker(g). The canonical split s.e.s. is A →
A ⊕ C → C where f = i₁ and g = π₂. A short exact sequence is said to split if ∃
an isomorphism B ≈ A ⊕ C such that the following diagram commutes.

            f       g
    0 → A  →   B   →  C → 0
         i₁ ↘  ↓ ≈  ↗ π₂
             A ⊕ C

We now restate the previous theorem in this terminology.
We now restate the previous theorem in this terminology.
Theorem 1.1 A short exact sequence 0 → A → B → C → 0 splits iﬀ f(A) is
a summand of B, iﬀ B → C has a splitting map. If C is a free Rmodule, there is
a splitting map and thus the sequence splits.
Proof We know from the previous theorem f(A) is a summand of B iﬀ B → C
has a splitting map. Showing these properties are equivalent to the splitting of the
sequence is a good exercise in the art of diagram chasing. Now suppose C has a free
basis T ⊂ C, and g : B → C is surjective. There exists a function h : T → B such
that g ◦ h(c) = c for each c ∈ T. The function h extends to a homomorphism from
C to B which is a right inverse of g.
Theorem 2 If R is a domain, then the following are equivalent.

1) R is a PID.

2) Every submodule of R_R is a free R-module of dimension ≤ 1.

This theorem restates the ring property of PID as a module property. Although
this theorem is transparent, 1) ⇒ 2) is a precursor to the following classical result.
Theorem 3 If R is a PID and A ⊂ R
n
is a submodule, then A is a free Rmodule
of dimension ≤ n. Thus subgroups of Z
n
are free Zmodules of dimension ≤ n.
Proof  From the previous theorem we know this is true for n = 1. Suppose
n > 1 and the theorem is true for submodules of R^{n-1}. Suppose A ⊂ R^n is a
submodule. Consider the following short exact sequences, where
f : R^{n-1} → R^{n-1} ⊕ R is inclusion and g = π : R^{n-1} ⊕ R → R is the
projection.

    0 → R^{n-1} --f→ R^{n-1} ⊕ R --π→ R → 0

    0 → A ∩ R^{n-1} → A → π(A) → 0

By induction, A ∩ R^{n-1} is free of dimension ≤ n - 1. If π(A) = 0̄, then
A ⊂ R^{n-1}. If π(A) ≠ 0̄, it is free of dimension 1 and thus the sequence
splits by Theorem 1.1. In either case, A is a free submodule of dimension ≤ n.
Exercise  Let A ⊂ Z^2 be the subgroup generated by {(6, 24), (16, 64)}. Show
A is a free Z-module of dimension 1. Also show the s.e.s.
Z_4 --×3→ Z_12 → Z_3 splits, but Z --×2→ Z → Z_2 and Z_2 --×2→ Z_4 → Z_2 do
not (see top of page 78).
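The first part of the exercise can be checked mechanically. A minimal Python sketch, using the observation that both generators lie on the line y = 4x, so the subgroup is the set of multiples of (1, 4) by elements of the ideal 6Z + 16Z:

```python
from math import gcd

g1, g2 = (6, 24), (16, 64)

# Both generators are integer multiples of (1, 4):
# (6, 24) = 6*(1, 4) and (16, 64) = 16*(1, 4).
assert g1 == (6 * 1, 6 * 4) and g2 == (16 * 1, 16 * 4)

# So A = {(a*6 + b*16)*(1, 4) : a, b in Z}, and the coefficients that occur
# form the ideal 6Z + 16Z = gcd(6, 16)Z = 2Z.
d = gcd(6, 16)             # = 2
basis = (d * 1, d * 4)     # A = Z*(2, 8), a free Z-module of dimension 1
print(basis)               # -> (2, 8)
```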
Euclidean Domains
The ring Z possesses the Euclidean algorithm and the polynomial ring F[x] has
the division algorithm (pages 14 and 45). The concept of Euclidean domain is an
abstraction of these properties, and the efficiency of this abstraction is
displayed in this section. Furthermore, the first axiom, φ(a) ≤ φ(ab), is used
only in Theorem 2, and is sometimes omitted from the definition. Anyway, it is
possible to just play around with matrices and get some deep results. If R is a
Euclidean domain and M is a finitely generated R-module, then M is the sum of
cyclic modules. This is one of the great classical theorems of abstract algebra,
and you don't have to worry about it becoming obsolete. Here N will denote the
set of all nonnegative integers, not just the set of positive integers.
Definition  A domain R is a Euclidean domain provided ∃ φ : (R - 0̄) → N such
that if a, b ∈ (R - 0̄), then

1) φ(a) ≤ φ(ab).

2) ∃ q, r ∈ R such that a = bq + r with r = 0̄ or φ(r) < φ(b).

Examples of Euclidean Domains

Z with φ(n) = |n|.

A field F with φ(a) = 1 ∀ a ≠ 0̄, or with φ(a) = 0 ∀ a ≠ 0̄.

F[x] where F is a field, with φ(f = a_0 + a_1x + ··· + a_nx^n) = deg(f).

Z[i] = {a + bi : a, b ∈ Z} = Gaussian integers, with φ(a + bi) = a^2 + b^2.
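For Z[i] the second axiom can be checked computationally: round the exact quotient a/b to the nearest Gaussian integer q, and the remainder r = a - bq then satisfies N(r) ≤ N(b)/2 < N(b). A small Python sketch, with Gaussian integers modeled as (x, y) pairs (a convention chosen here for illustration):

```python
def gauss_divmod(a, b):
    """Division algorithm in Z[i]: return (q, r) with a = b*q + r and
    N(r) < N(b), where N(x + yi) = x^2 + y^2.  Gaussian integers are
    modeled as (x, y) pairs; q is the lattice point nearest to a/b."""
    ax, ay = a
    bx, by = b
    n = bx * bx + by * by                      # N(b), assumed nonzero
    # exact quotient a/b = a * conj(b) / N(b)
    px, py = ax * bx + ay * by, ay * bx - ax * by
    qx, qy = round(px / n), round(py / n)      # nearest Gaussian integer
    rx = ax - (bx * qx - by * qy)              # r = a - b*q
    ry = ay - (bx * qy + by * qx)
    return (qx, qy), (rx, ry)

q, r = gauss_divmod((27, 23), (8, 1))
print(q, r)                                    # -> (4, 2) (-3, 3)
assert r[0] ** 2 + r[1] ** 2 < 8 ** 2 + 1 ** 2
```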
Theorem 1  If R is a Euclidean domain, then R is a PID and thus a UFD.

Proof  If I is a nonzero ideal, then ∃ b ∈ I - 0̄ satisfying φ(b) ≤ φ(a)
∀ a ∈ I - 0̄. Then b generates I because if a ∈ I - 0̄, ∃ q, r with a = bq + r.
Now r ∈ I and r ≠ 0̄ ⇒ φ(r) < φ(b), which is impossible. Thus r = 0̄ and
a ∈ bR, so I = bR.
Theorem 2  If R is a Euclidean domain and a, b ∈ R - 0̄, then

1) φ(1̄) is the smallest integer in the image of φ.

2) a is a unit in R iff φ(a) = φ(1̄).

3) a and b are associates ⇒ φ(a) = φ(b).

Proof  This is a good exercise. However, it is unnecessary for Theorem 3 below.
The following remarkable theorem is the foundation for the results of this section.

Theorem 3  If R is a Euclidean domain and (a_{i,j}) ∈ R_{n,t} is a nonzero
matrix, then by elementary row and column operations (a_{i,j}) can be
transformed to

    ⎡ d_1                ⎤
    ⎢      d_2           ⎥
    ⎢           ⋱        ⎥
    ⎢             d_m    ⎥
    ⎣                  0 ⎦

(all entries off the diagonal are 0̄), where each d_i ≠ 0̄, and d_i | d_{i+1}
for 1 ≤ i < m. Also d_1 generates the ideal of R generated by the entries
of (a_{i,j}).
Proof  Let I ⊂ R be the ideal generated by the elements of the matrix
A = (a_{i,j}). If E ∈ R_n, then the ideal J generated by the elements of EA has
J ⊂ I. If E is invertible, then J = I. In the same manner, if E ∈ R_t is
invertible and J is the ideal generated by the elements of AE, then J = I. This
means that row and column operations on A do not change the ideal I. Since R is
a PID, there is an element d_1 with I = d_1R, and this will turn out to be the
d_1 displayed in the theorem.

The matrix (a_{i,j}) has at least one nonzero element d with φ(d) a minimum.
However, row and column operations on (a_{i,j}) may produce elements with
smaller φ values. To consolidate this approach, consider matrices obtained from
(a_{i,j}) by a finite number of row and column operations. Among these, let
(b_{i,j}) be one which has an entry d_1 ≠ 0 with φ(d_1) a minimum. By
elementary operations of type 2, the entry d_1 may be moved to the (1, 1) place
in the matrix. Then d_1 will divide the other entries in the first row, else we
could obtain an entry with a smaller φ value. Thus by column operations of
type 3, the other entries of the first row may be made zero. In a similar
manner, by row operations of type 3, the matrix may be changed to the following
form.

    ⎡ d_1   0  ···  0 ⎤
    ⎢  0              ⎥
    ⎢  ⋮   (c_{i,j})  ⎥
    ⎣  0              ⎦

Note that d_1 divides each c_{i,j}, and thus I = d_1R. The proof now follows by
induction on the size of the matrix.
This is an example of a theorem that is easy to prove playing around at the
blackboard. Yet it must be a deep theorem because the next two theorems are easy
consequences.
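The proof is really an algorithm, and for R = Z with φ the absolute value it can be played out directly. The following Python sketch performs the reduction of Theorem 3 on small integer matrices; it is an illustration of the argument, not an efficient implementation, and the helper name is invented for this sketch.

```python
def smith_diagonal(A):
    """Diagonalize an integer matrix by elementary row and column
    operations, following the proof above with R = Z and phi = abs.
    Returns [d_1, d_2, ...] with each d_i != 0 and d_i | d_{i+1}."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])

    def reduce(t):
        # clear row t and column t, leaving a pivot at (t, t)
        while True:
            entries = [(abs(A[i][j]), i, j) for i in range(t, m)
                       for j in range(t, n) if A[i][j] != 0]
            if not entries:
                return False                  # the rest of the matrix is zero
            _, pi, pj = min(entries)          # entry with minimal phi value
            A[t], A[pi] = A[pi], A[t]         # move it to (t, t): type 2 ops
            for row in A:
                row[t], row[pj] = row[pj], row[t]
            p = A[t][t]
            done = True
            for i in range(t + 1, m):         # type 3 row operations
                q = A[i][t] // p
                for j in range(t, n):
                    A[i][j] -= q * A[t][j]
                done = done and A[i][t] == 0
            for j in range(t + 1, n):         # type 3 column operations
                q = A[t][j] // p
                for i in range(t, m):
                    A[i][j] -= q * A[i][t]
                done = done and A[t][j] == 0
            if done:
                # enforce d_t | (every remaining entry)
                for i in range(t + 1, m):
                    if any(A[i][j] % p for j in range(t + 1, n)):
                        for k in range(t, n):   # add row i to row t, redo
                            A[t][k] += A[i][k]
                        done = False
                        break
            if done:
                return True

    diags, t = [], 0
    while t < min(m, n) and reduce(t):
        diags.append(abs(A[t][t]))
        t += 1
    return diags

print(smith_diagonal([[3, 11], [0, 4]]))   # -> [1, 12]
```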
Theorem 4  Suppose R is a Euclidean domain, B is a finitely generated free
R-module and A ⊂ B is a nonzero submodule. Then ∃ free bases
{a_1, a_2, ..., a_t} for A and {b_1, b_2, ..., b_n} for B, with t ≤ n, and such
that each a_i = d_ib_i, where each d_i ≠ 0̄, and d_i | d_{i+1} for 1 ≤ i < t.
Thus

    B/A ≈ R/d_1 ⊕ R/d_2 ⊕ ··· ⊕ R/d_t ⊕ R^{n-t}.
Proof  By Theorem 3 in the section Splitting Short Exact Sequences, A has a
free basis {v_1, v_2, ..., v_t}. Let {w_1, w_2, ..., w_n} be a free basis for B,
where n ≥ t. The composition

    R^t --≈→ A --⊂→ B --≈→ R^n        (e_i ↦ v_i and w_i ↦ e_i)

is represented by a matrix (a_{i,j}) ∈ R_{n,t} where
v_i = a_{1,i}w_1 + a_{2,i}w_2 + ··· + a_{n,i}w_n. By the previous theorem,
∃ invertible matrices U ∈ R_n and V ∈ R_t such that

    U(a_{i,j})V = ⎡ d_1                ⎤
                  ⎢      d_2           ⎥
                  ⎢           ⋱        ⎥
                  ⎢             d_t    ⎥
                  ⎣                  0 ⎦

with d_i | d_{i+1}. Since changing the isomorphisms R^t ≈→ A and B ≈→ R^n
corresponds to changing the bases {v_1, v_2, ..., v_t} and {w_1, w_2, ..., w_n},
the theorem follows.
Theorem 5  If R is a Euclidean domain and M is a finitely generated R-module,
then M ≈ R/d_1 ⊕ R/d_2 ⊕ ··· ⊕ R/d_t ⊕ R^m where each d_i ≠ 0̄, and
d_i | d_{i+1} for 1 ≤ i < t.

Proof  By hypothesis ∃ a finitely generated free module B and a surjective
homomorphism B → M → 0. Let A be the kernel, so 0 → A --⊂→ B → M → 0 is a
s.e.s. and B/A ≈ M. The result now follows from the previous theorem.
The way Theorem 5 is stated, some or all of the elements d_i may be units, and
for such d_i, R/d_i = 0̄. If we assume that no d_i is a unit, then the elements
d_1, d_2, ..., d_t are called invariant factors. They are unique up to
associates, but we do not bother with that here. If R = Z and we select the d_i
to be positive, they are unique. If R = F[x] and we select the d_i to be monic,
then they are unique. The splitting in Theorem 5 is not the ultimate because the
modules R/d_i may split into the sum of other cyclic modules. To prove this we
need the following lemma.
Lemma  Suppose R is a PID and b and c are nonzero nonunit elements of R.
Suppose b and c are relatively prime, i.e., there is no prime common to their
prime factorizations. Then bR and cR are comaximal ideals. (See p 108 for
comaximal.)

Proof  There exists an a ∈ R with aR = bR + cR. Since a|b and a|c, a is a
unit, so R = bR + cR.
Theorem 6  Suppose R is a PID and d is a nonzero nonunit element of R.
Assume d = p_1^{s_1} p_2^{s_2} ··· p_t^{s_t} is the prime factorization of d
(see bottom of p 110). Then the natural map
R/d ≈→ R/p_1^{s_1} ⊕ ··· ⊕ R/p_t^{s_t} is an isomorphism of R-modules. (The
elements p_i^{s_i} are called elementary divisors of R/d.)

Proof  If i ≠ j, p_i^{s_i} and p_j^{s_j} are relatively prime. By the Lemma
above, they are comaximal and thus by the Chinese Remainder Theorem, the natural
map is a ring isomorphism (page 108). Since the natural map is also an R-module
homomorphism, it is an R-module isomorphism.
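For R = Z and d = 12 = 2^2 · 3, the theorem says Z/12 ≈ Z/4 ⊕ Z/3. A quick Python check of the natural map n ↦ (n mod 4, n mod 3):

```python
def natural_map(n):
    """The natural map Z/12 -> Z/4 (+) Z/3 of Theorem 6, d = 12 = 2^2 * 3."""
    return (n % 4, n % 3)

# injective on 12 elements, hence bijective
assert len({natural_map(n) for n in range(12)}) == 12

# additive and multiplicative, so a ring (and Z-module) isomorphism
for a in range(12):
    for b in range(12):
        pa, qa = natural_map(a)
        pb, qb = natural_map(b)
        assert natural_map((a + b) % 12) == ((pa + pb) % 4, (qa + qb) % 3)
        assert natural_map((a * b) % 12) == ((pa * pb) % 4, (qa * qb) % 3)
```

Note that the same check fails for Z/4 against Z/2 ⊕ Z/2, where the factors are not relatively prime.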
This theorem carries the splitting as far as it can go, as seen by the next exercise.
Exercise  Suppose R is a PID, p ∈ R is a prime element, and s ≥ 1. Then the
R-module R/p^s has no proper submodule which is a summand.
Torsion Submodules  This will give a little more perspective to this section.

Definition  Suppose M is a module over a domain R. An element m ∈ M is said
to be a torsion element if ∃ r ∈ R with r ≠ 0̄ and mr = 0̄. This is the same as
saying m is dependent. If R = Z, it is the same as saying m has finite order.
Denote by T(M) the set of all torsion elements of M. If T(M) = 0̄, we say that M
is torsion free.
Theorem 7 Suppose M is a module over a domain R. Then T(M) is a submodule
of M and M/T(M) is torsion free.
Proof This is a simple exercise.
Theorem 8  Suppose R is a Euclidean domain and M is a finitely generated
R-module which is torsion free. Then M is a free R-module, i.e., M ≈ R^m.
Proof This follows immediately from Theorem 5.
Theorem 9  Suppose R is a Euclidean domain and M is a finitely generated
R-module. Then the following s.e.s. splits.

    0 → T(M) → M → M/T(M) → 0

Proof  By Theorem 7, M/T(M) is torsion free. By Theorem 8, M/T(M) is a free
R-module, and thus there is a splitting map. Of course this theorem is
transparent anyway, because Theorem 5 gives a splitting of M into a torsion part
and a free part.
Note  It follows from Theorem 9 that ∃ a free submodule V of M such that
T(M) ⊕ V = M. The first summand T(M) is unique, but the complementary summand V
is not unique. V depends upon the splitting map and is unique only up to
isomorphism.
To complete this section, here are two more theorems that follow from the work
we have done.
Theorem 10  Suppose T is a domain and T* is the multiplicative group of units
of T. If G is a finite subgroup of T*, then G is a cyclic group. Thus if F is a
finite field, the multiplicative group F* is cyclic. Thus if p is a prime,
(Z_p)* is cyclic.

Proof  This is a corollary to Theorem 5 with R = Z. The multiplicative group G
is isomorphic to an additive group Z/d_1 ⊕ Z/d_2 ⊕ ··· ⊕ Z/d_t where each
d_i > 1 and d_i | d_{i+1} for 1 ≤ i < t. Every u in the additive group has the
property that ud_t = 0̄. So every g ∈ G is a solution to x^{d_t} - 1̄ = 0̄. If
t > 1, the equation will have degree less than the number of roots, which is
impossible. Thus t = 1 and so G is cyclic.
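For F = Z_p this is easy to watch numerically. A small Python sketch that finds a generator of (Z_p)* by brute force (the helper names are invented for this illustration):

```python
def is_generator(g, p):
    """Does g generate the multiplicative group (Z_p)*?"""
    seen, x = set(), 1
    for _ in range(p - 1):
        x = (x * g) % p
        seen.add(x)
    return len(seen) == p - 1        # the powers of g hit all p - 1 units

def find_generator(p):
    """Smallest generator of (Z_p)*, for p an odd prime."""
    return next(g for g in range(2, p) if is_generator(g, p))

print([find_generator(p) for p in (5, 7, 11, 13)])   # -> [2, 3, 2, 2]
```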
Exercise  For which primes p and q is the group of units (Z_p ⊕ Z_q)* a
cyclic group?
We know from Exercise 2) on page 59 that an invertible matrix over a field is
the product of elementary matrices. This result also holds for any invertible
matrix over a Euclidean domain.

Theorem 11  Suppose R is a Euclidean domain and A ∈ R_n is a matrix with
nonzero determinant. Then by elementary row and column operations, A may be
transformed to a diagonal matrix

    ⎡ d_1            ⎤
    ⎢      d_2       ⎥
    ⎢           ⋱    ⎥
    ⎣            d_n ⎦

where each d_i ≠ 0̄ and d_i | d_{i+1} for 1 ≤ i < n. Also d_1 generates the
ideal generated by the entries of A. Furthermore A is invertible iff each d_i is
a unit. Thus if A is invertible, A is the product of elementary matrices.

Proof  It follows from Theorem 3 that A may be transformed to a diagonal
matrix with d_i | d_{i+1}. Since the determinant of A is not zero, it follows
that each d_i ≠ 0̄. Furthermore, the matrix A is invertible iff the diagonal
matrix is invertible, which is true iff each d_i is a unit. If each d_i is a
unit, then the diagonal matrix is the product of elementary matrices of type 1.
Therefore if A is invertible, it is the product of elementary matrices.

Exercise  Let R = Z,  A = ⎡ 3  11 ⎤  and  D = ⎡ 3  11 ⎤ .
                          ⎣ 0   4 ⎦           ⎣ 1   4 ⎦

Perform elementary operations on A and D to obtain diagonal matrices where the
first diagonal element divides the second diagonal element. Write D as the
product of elementary matrices. Find the characteristic polynomials of A and D.
Find an elementary matrix B over Z such that B^{-1}AB is diagonal. Find an
invertible matrix C in R_2 such that C^{-1}DC is diagonal. Show C cannot be
selected in Q_2.
Jordan Blocks
In this section, we define the two special types of square matrices used in
the Rational and Jordan canonical forms. Note that the Jordan block B(q) is the
sum of a scalar matrix and a nilpotent matrix. A Jordan block displays its
eigenvalue on the diagonal, and is more interesting than the companion matrix
C(q). But as we shall see later, the Rational canonical form will always exist,
while the Jordan canonical form will exist iff the characteristic polynomial
factors as the product of linear polynomials.

Suppose R is a commutative ring,
q = a_0 + a_1x + ··· + a_{n-1}x^{n-1} + x^n ∈ R[x] is a monic polynomial of
degree n ≥ 1, and V is the R[x]-module V = R[x]/q. V is a torsion module over
the ring R[x], but as an R-module, V has a free basis {1, x, x^2, ..., x^{n-1}}.
(See the last part of the last theorem on page 46.) Multiplication by x defines
an R-module endomorphism on V, and C(q) will be the matrix of this endomorphism
with respect to this basis. Let T : V → V be defined by T(v) = vx. If
h(x) ∈ R[x], h(T) is the R-module homomorphism given by multiplication by h(x).
The homomorphism from R[x]/q to R[x]/q given by multiplication by h(x) is zero
iff h(x) ∈ qR[x]. That is to say q(T) = a_0I + a_1T + ··· + T^n is the zero
homomorphism, and h(T) is the zero homomorphism iff h(x) ∈ qR[x]. All of this is
supposed to make the next theorem transparent.
Theorem  Let V have the free basis {1, x, x^2, ..., x^{n-1}}. The companion
matrix representing T is

    C(q) = ⎡ 0  ···  ···  0  -a_0     ⎤
           ⎢ 1   0   ···  0  -a_1     ⎥
           ⎢ 0   1        0  -a_2     ⎥
           ⎢ ⋮        ⋱   ⋮    ⋮      ⎥
           ⎣ 0  ···  ···  1  -a_{n-1} ⎦

The characteristic polynomial of C(q) is q, and |C(q)| = (-1)^n a_0. Finally,
if h(x) ∈ R[x], h(C(q)) is zero iff h(x) ∈ qR[x].
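The relation q(C(q)) = 0̄ is easy to experiment with. A Python sketch with plain integer matrices (the helper names are invented for this illustration):

```python
def companion(coeffs):
    """C(q) for the monic q = a_0 + a_1 x + ... + a_{n-1} x^{n-1} + x^n,
    given coeffs = [a_0, ..., a_{n-1}]: 1's below the diagonal and the
    negated coefficients in the last column, as displayed above."""
    n = len(coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1
    for i in range(n):
        C[i][n - 1] = -coeffs[i]
    return C

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def q_of(coeffs, M):
    """q(M) = a_0 I + a_1 M + ... + a_{n-1} M^{n-1} + M^n."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    P = [[int(i == j) for j in range(n)] for i in range(n)]   # M^0 = I
    for a in coeffs:
        result = [[result[i][j] + a * P[i][j] for j in range(n)]
                  for i in range(n)]
        P = mat_mul(P, M)
    return [[result[i][j] + P[i][j] for j in range(n)] for i in range(n)]

q = [6, -5]                              # q(x) = x^2 - 5x + 6
C = companion(q)
assert q_of(q, C) == [[0, 0], [0, 0]]    # q(C(q)) is the zero matrix
```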
Theorem  Suppose λ ∈ R and q(x) = (x - λ)^n. Let V have the free basis
{1, (x - λ), (x - λ)^2, ..., (x - λ)^{n-1}}. Then the matrix representing T is

    B(q) = ⎡ λ  0  ···  ···  0 ⎤
           ⎢ 1  λ   0   ···  0 ⎥
           ⎢ 0  1   λ          ⎥
           ⎢ ⋮      ⋱   ⋱    ⋮ ⎥
           ⎣ 0  ···  ···  1  λ ⎦

The characteristic polynomial of B(q) is q, and |B(q)| = λ^n = (-1)^n a_0.
Finally, if h(x) ∈ R[x], h(B(q)) is zero iff h(x) ∈ qR[x].
Note  For n = 1, C(a_0 + x) = B(a_0 + x) = (-a_0). This is the only case
where a block matrix may be the zero matrix.

Note  In B(q), if you wish to have the 1's above the diagonal, reverse the
order of the basis for V.
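The decomposition of B(q) into a scalar plus a nilpotent matrix can also be watched directly: N = B(q) - λI has 1's just below the diagonal and N^n = 0, so q(B(q)) = (B(q) - λI)^n = 0. A small Python check:

```python
def jordan_block(lam, n):
    """B(q) for q = (x - lam)^n: lam on the diagonal, 1's just below it."""
    return [[lam if i == j else (1 if i == j + 1 else 0) for j in range(n)]
            for i in range(n)]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

lam, n = 4, 3
B = jordan_block(lam, n)
# B is the sum of the scalar matrix lam*I and the nilpotent N = B - lam*I
N = [[B[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
P = N
for _ in range(n - 1):
    P = mat_mul(P, N)
assert P == [[0] * n for _ in range(n)]   # N^n = 0, so (B - lam*I)^n = 0
```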
Jordan Canonical Form
We are finally ready to prove the Rational and Jordan forms. Using the
previous sections, all that's left to do is to put the pieces together. (For an
overview of Jordan form, read first the section in Chapter 5, page 96.)

Suppose R is a commutative ring, V is an R-module, and T : V → V is an
R-module homomorphism. Define a scalar multiplication V × R[x] → V by

    v(a_0 + a_1x + ··· + a_rx^r) = va_0 + T(v)a_1 + ··· + T^r(v)a_r.
Theorem 1  Under this scalar multiplication, V is an R[x]-module.

This is just an observation, but it is one of the great tricks in mathematics.
Questions about the transformation T are transferred to questions about the
module V over the ring R[x]. And in the case R is a field, R[x] is a Euclidean
domain and so we know almost everything about V as an R[x]-module.

Now in this section, we suppose R is a field F, V is a finitely generated
F-module, T : V → V is a linear transformation and V is an F[x]-module with
vx = T(v). Our goal is to select a basis for V such that the matrix representing
T is in some simple form. A submodule of V_{F[x]} is a submodule of V_F which is
invariant under T. We know V_{F[x]} is the sum of cyclic modules from Theorems 5
and 6 in the section on Euclidean Domains. Since V is finitely generated as an
F-module, the free part of this decomposition will be zero. In the section on
Jordan Blocks, a basis is selected for these cyclic modules and the matrix
representing T is described. This gives the Rational Canonical Form and that is
all there is to it. If all the eigenvalues for T are in F, we pick another basis
for each of the cyclic modules (see the second theorem in the section on Jordan
Blocks). Then the matrix representing T is called the Jordan Canonical Form.
Now we say all this again with a little more detail.
From Theorem 5 in the section on Euclidean Domains, it follows that

    V_{F[x]} ≈ F[x]/d_1 ⊕ F[x]/d_2 ⊕ ··· ⊕ F[x]/d_t

where each d_i is a monic polynomial of degree ≥ 1, and d_i | d_{i+1}. Pick
{1, x, x^2, ..., x^{m-1}} as the F-basis for F[x]/d_i where m is the degree of
the polynomial d_i.
Theorem 2  With respect to this basis, the matrix representing T is

    ⎡ C(d_1)                   ⎤
    ⎢        C(d_2)            ⎥
    ⎢               ⋱          ⎥
    ⎣                   C(d_t) ⎦

The characteristic polynomial of T is p = d_1d_2 ··· d_t and p(T) = 0̄. This
is a type of canonical form but it does not seem to have a name.
Now we apply Theorem 6 to each F[x]/d_i. This gives

    V_{F[x]} ≈ F[x]/p_1^{s_1} ⊕ ··· ⊕ F[x]/p_r^{s_r}

where the p_i are irreducible monic polynomials of degree at least 1. The p_i
need not be distinct. Pick an F-basis for each F[x]/p_i^{s_i} as before.

Theorem 3  With respect to this basis, the matrix representing T is

    ⎡ C(p_1^{s_1})                            ⎤
    ⎢             C(p_2^{s_2})                ⎥
    ⎢                          ⋱              ⎥
    ⎣                            C(p_r^{s_r}) ⎦

The characteristic polynomial of T is p = p_1^{s_1} ··· p_r^{s_r} and
p(T) = 0̄. This is called the Rational canonical form for T.
Now suppose the characteristic polynomial of T factors in F[x] as the product
of linear polynomials. Thus in the theorem above, p_i = x - λ_i and

    V_{F[x]} ≈ F[x]/(x - λ_1)^{s_1} ⊕ ··· ⊕ F[x]/(x - λ_r)^{s_r}

is an isomorphism of F[x]-modules. Pick
{1, (x - λ_i), (x - λ_i)^2, ..., (x - λ_i)^{m-1}} as the F-basis for
F[x]/(x - λ_i)^{s_i} where m is s_i.

Theorem 4  With respect to this basis, the matrix representing T is

    ⎡ B((x - λ_1)^{s_1})                               ⎤
    ⎢                 B((x - λ_2)^{s_2})               ⎥
    ⎢                                ⋱                 ⎥
    ⎣                              B((x - λ_r)^{s_r})  ⎦

The characteristic polynomial of T is p = (x - λ_1)^{s_1} ··· (x - λ_r)^{s_r}
and p(T) = 0̄. This is called the Jordan canonical form for T. Note that the
λ_i need not be distinct.
Note  A diagonal matrix is in Rational canonical form and in Jordan canonical
form. This is the case where each block is one by one. Of course a diagonal
matrix is about as canonical as you can get. Note also that if a matrix is in
Jordan form, its trace is the sum of the eigenvalues and its determinant is the
product of the eigenvalues. Finally, this section is loosely written, so it is
important to use the transpose principle to write three other versions of the
last two theorems.
Exercise  Suppose F is a field of characteristic 0 and T ∈ F_n has
trace(T^i) = 0̄ for 0 < i ≤ n. Show T is nilpotent. Let p ∈ F[x] be the
characteristic polynomial of T. The polynomial p may not factor into linears in
F[x], and thus T may have no conjugate in F_n which is in Jordan form. However
this exercise can still be worked using Jordan form. This is based on the fact
that there exists a field F̄ containing F as a subfield, such that p factors
into linears in F̄[x]. This fact is not proved in this book, but it is assumed
for this exercise. So ∃ an invertible matrix U ∈ F̄_n so that U^{-1}TU is in
Jordan form, and of course, T is nilpotent iff U^{-1}TU is nilpotent. The point
is that it suffices to consider the case where T is in Jordan form, and to show
the diagonal elements are all zero.

So suppose T is in Jordan form and trace(T^i) = 0̄ for 1 ≤ i ≤ n. Since the
traces of the powers of T vanish, trace(p(T)) = a_0n where a_0 is the constant
term of p(x). We know p(T) = 0̄ and thus trace(p(T)) = 0̄, and thus a_0n = 0̄.
Since the field has characteristic 0, a_0 = 0̄ and so 0̄ is an eigenvalue of T.
This means that one block of T is a strictly lower triangular matrix. Removing
this block leaves a smaller matrix which still satisfies the hypothesis, and the
result follows by induction on the size of T. This exercise illustrates the
power and facility of Jordan form. It also has a cute corollary.
Corollary  Suppose F is a field of characteristic 0, n ≥ 1, and
(λ_1, λ_2, ..., λ_n) ∈ F^n satisfies λ_1^i + λ_2^i + ··· + λ_n^i = 0̄ for each
1 ≤ i ≤ n. Then λ_i = 0̄ for 1 ≤ i ≤ n.
Minimal polynomials  To conclude this section, here are a few comments on the
minimal polynomial of a linear transformation. This part should be studied only
if you need it. Suppose V is an n-dimensional vector space over a field F and
T : V → V is a linear transformation. As before, we make V a module over F[x]
with T(v) = vx.

Definition  Ann(V_{F[x]}) is the set of all h ∈ F[x] which annihilate V,
i.e., which satisfy Vh = 0̄. This is a nonzero ideal of F[x] and is thus
generated by a unique monic polynomial u(x) ∈ F[x], Ann(V_{F[x]}) = uF[x]. The
polynomial u is called the minimal polynomial of T. Note that u(T) = 0̄ and if
h(x) ∈ F[x], h(T) = 0̄ iff h is a multiple of u in F[x]. If p(x) ∈ F[x] is the
characteristic polynomial of T, p(T) = 0̄ and thus p is a multiple of u.
Now we state this again in terms of matrices. Suppose A ∈ F_n is a matrix
representing T. Then u(A) = 0̄ and if h(x) ∈ F[x], h(A) = 0̄ iff h is a
multiple of u in F[x]. If p(x) ∈ F[x] is the characteristic polynomial of A,
then p(A) = 0̄ and thus p is a multiple of u. The polynomial u is also called
the minimal polynomial of A. Note that these properties hold for any matrix
representing T, and thus similar matrices have the same minimal polynomial. If
A is given to start with, use the linear transformation T : F^n → F^n
determined by A to define the polynomial u.
Now suppose q ∈ F[x] is a monic polynomial and C(q) ∈ F_n is the companion
matrix defined in the section Jordan Blocks. Whenever q(x) = (x - λ)^n, let
B(q) ∈ F_n be the Jordan block matrix also defined in that section. Recall that
q is the characteristic polynomial and the minimal polynomial of each of these
matrices. This together with the rational form and the Jordan form will allow
us to understand the relation of the minimal polynomial to the characteristic
polynomial.
Exercise  Suppose A_i ∈ F_{n_i} has q_i as its characteristic polynomial and
its minimal polynomial, and

    A = ⎡ A_1                ⎤
        ⎢      A_2           ⎥
        ⎢           ⋱        ⎥
        ⎣                A_r ⎦

Find the characteristic polynomial and the minimal polynomial of A.
Exercise  Suppose A ∈ F_n.

1) Suppose A is the matrix displayed in Theorem 2 above. Find the
   characteristic and minimal polynomials of A.

2) Suppose A is the matrix displayed in Theorem 3 above. Find the
   characteristic and minimal polynomials of A.

3) Suppose A is the matrix displayed in Theorem 4 above. Find the
   characteristic and minimal polynomials of A.

4) Suppose λ ∈ F. Show λ is a root of the characteristic polynomial of A iff
   λ is a root of the minimal polynomial of A. Show that if λ is a root, its
   order in the characteristic polynomial is at least as large as its order in
   the minimal polynomial.

5) Suppose F̄ is a field containing F as a subfield. Show that the minimal
   polynomial of A ∈ F_n is the same as the minimal polynomial of A considered
   as a matrix in F̄_n. (This funny looking exercise is a little delicate.)

6) Let F = R and A = ⎡  5  -1   3 ⎤
                     ⎢  0   2   0 ⎥
                     ⎣ -3   1  -1 ⎦ .

   Find the characteristic and minimal polynomials of A.
Determinants
In the chapter on matrices, it is stated without proof that the determinant of
the product is the product of the determinants (see page 63). The purpose of
this section is to give a proof of this. We suppose R is a commutative ring, C
is an R-module, n ≥ 2, and B_1, B_2, ..., B_n is a sequence of R-modules.
Definition  A map f : B_1 ⊕ B_2 ⊕ ··· ⊕ B_n → C is R-multilinear means that
if 1 ≤ i ≤ n and b_j ∈ B_j is fixed for each j ≠ i, then the function
b_i ↦ f(b_1, b_2, ..., b_i, ..., b_n) defines an R-linear map from B_i to C.
Theorem  The set of all R-multilinear maps is an R-module.

Proof  From the first exercise in Chapter 5, the set of all functions from
B_1 ⊕ B_2 ⊕ ··· ⊕ B_n to C is an R-module (see page 69). It must be seen that
the R-multilinear maps form a submodule. It is easy to see that if f_1 and f_2
are R-multilinear, so is f_1 + f_2. Also if f is R-multilinear and r ∈ R, then
(fr) is R-multilinear.

From here on, suppose B_1 = B_2 = ··· = B_n = B.
Definition

1) f is symmetric means f(b_1, ..., b_n) = f(b_{τ(1)}, ..., b_{τ(n)}) for all
   permutations τ on {1, 2, ..., n}.

2) f is skew-symmetric if f(b_1, ..., b_n) = sign(τ)f(b_{τ(1)}, ..., b_{τ(n)})
   for all τ.

3) f is alternating if f(b_1, ..., b_n) = 0̄ whenever some b_i = b_j for i ≠ j.
Theorem

i)   Each of these three types defines a submodule of the set of all
     R-multilinear maps.

ii)  Alternating ⇒ skew-symmetric.

iii) If no element of C has order 2, then alternating ⇐⇒ skew-symmetric.

Proof  Part i) is immediate. To prove ii), assume f is alternating. It
suffices to show that f(b_1, ..., b_n) = -f(b_{τ(1)}, ..., b_{τ(n)}) where τ is
a transposition. For simplicity, assume τ = (1, 2). Then
0̄ = f(b_1 + b_2, b_1 + b_2, b_3, ..., b_n) = f(b_1, b_2, b_3, ..., b_n) +
f(b_2, b_1, b_3, ..., b_n) and the result follows. To prove iii), suppose f is
skew-symmetric and no element of C has order 2, and show f is alternating.
Suppose for convenience that b_1 = b_2 and show f(b_1, b_1, b_3, ..., b_n) = 0̄.
If we let τ be the transposition (1, 2), we get f(b_1, b_1, b_3, ..., b_n) =
-f(b_1, b_1, b_3, ..., b_n), and so 2f(b_1, b_1, b_3, ..., b_n) = 0̄, and the
result follows.
Now we are ready for determinant. Suppose C = R. In this case multilinear
maps are usually called multilinear forms. Suppose B is R^n with the canonical
basis {e_1, e_2, ..., e_n}. (We think of a matrix A ∈ R_n as n column vectors,
i.e., as an element of B ⊕ B ⊕ ··· ⊕ B.) First we recall the definition of
determinant.

Suppose A = (a_{i,j}) ∈ R_n. Define d : B ⊕ B ⊕ ··· ⊕ B → R by

    d(a_{1,1}e_1 + a_{2,1}e_2 + ··· + a_{n,1}e_n, ....., a_{1,n}e_1 + a_{2,n}e_2 + ··· + a_{n,n}e_n)

        = Σ_{all τ} sign(τ)(a_{τ(1),1} a_{τ(2),2} ··· a_{τ(n),n}) = |A|.
The next theorem follows from the section on determinants on page 61.
Theorem  d is an alternating multilinear form with d(e_1, e_2, ..., e_n) = 1̄.

If c ∈ R, dc is an alternating multilinear form, because the set of
alternating forms is an R-module. It turns out that this is all of them, as
seen by the following theorem.
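The definition of d is itself an algorithm, and for small matrices the permutation sum can be evaluated literally. A Python sketch (0-indexed, so τ runs over permutations of {0, ..., n-1}):

```python
from itertools import permutations
from math import prod

def sign(tau):
    """sign(tau) = (-1)^(number of inversions of tau)."""
    return (-1) ** sum(tau[i] > tau[j]
                       for i in range(len(tau))
                       for j in range(i + 1, len(tau)))

def det(A):
    """|A| as the alternating sum over all permutations, exactly as in
    the definition of d above (indices shifted to start at 0)."""
    n = len(A)
    return sum(sign(tau) * prod(A[tau[j]][j] for j in range(n))
               for tau in permutations(range(n)))

# d(e_1, ..., e_n) = 1: the determinant of the identity matrix
assert det([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) == 1
# alternating: two equal columns give 0
assert det([[2, 2, 5], [1, 1, 7], [4, 4, 9]]) == 0
```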
Theorem  Suppose f : B ⊕ B ⊕ ··· ⊕ B → R is an alternating multilinear form.
Then f = d·f(e_1, e_2, ..., e_n). This means f is the multilinear form d times
the scalar f(e_1, e_2, ..., e_n). In other words, if A = (a_{i,j}) ∈ R_n, then

    f(a_{1,1}e_1 + a_{2,1}e_2 + ··· + a_{n,1}e_n, ....., a_{1,n}e_1 + a_{2,n}e_2 + ··· + a_{n,n}e_n)
        = |A|f(e_1, e_2, ..., e_n).

Thus the set of alternating forms is a free R-module of dimension 1, and the
determinant is a generator.
Proof  For n = 2, you can simply write it out.

    f(a_{1,1}e_1 + a_{2,1}e_2, a_{1,2}e_1 + a_{2,2}e_2)
        = a_{1,1}a_{1,2}f(e_1, e_1) + a_{1,1}a_{2,2}f(e_1, e_2)
          + a_{2,1}a_{1,2}f(e_2, e_1) + a_{2,1}a_{2,2}f(e_2, e_2)
        = (a_{1,1}a_{2,2} - a_{1,2}a_{2,1})f(e_1, e_2) = |A|f(e_1, e_2).

For the general case,

    f(a_{1,1}e_1 + a_{2,1}e_2 + ··· + a_{n,1}e_n, ....., a_{1,n}e_1 + a_{2,n}e_2 + ··· + a_{n,n}e_n)
        = Σ a_{i_1,1} a_{i_2,2} ··· a_{i_n,n} f(e_{i_1}, e_{i_2}, ..., e_{i_n})

where the sum is over all 1 ≤ i_1 ≤ n, 1 ≤ i_2 ≤ n, ..., 1 ≤ i_n ≤ n.
However, if any i_s = i_t for s ≠ t, that term is 0̄ because f is alternating.
Therefore the sum is just

    Σ_{all τ} a_{τ(1),1} a_{τ(2),2} ··· a_{τ(n),n} f(e_{τ(1)}, e_{τ(2)}, ..., e_{τ(n)})
        = Σ_{all τ} sign(τ) a_{τ(1),1} a_{τ(2),2} ··· a_{τ(n),n} f(e_1, e_2, ..., e_n)
        = |A|f(e_1, e_2, ..., e_n).
This incredible classification of these alternating forms makes the proof of
the following theorem easy. (See the third theorem on page 63.)

Theorem  If C, A ∈ R_n, then |CA| = |C||A|.
Proof  Suppose C ∈ R_n. Define f : R_n → R by f(A) = |CA|. In the notation
of the previous theorem, B = R^n and R_n = R^n ⊕ R^n ⊕ ··· ⊕ R^n. If A ∈ R_n,
A = (A_1, A_2, ..., A_n) where A_i ∈ R^n is column i of A, and
f : R^n ⊕ ··· ⊕ R^n → R has f(A_1, A_2, ..., A_n) = |CA|. Use the fact that
CA = (CA_1, CA_2, ..., CA_n) to show that f is an alternating multilinear form.
By the previous theorem, f(A) = |A|f(e_1, e_2, ..., e_n). Since
f(e_1, e_2, ..., e_n) = |CI| = |C|, it follows that |CA| = f(A) = |A||C|.
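The theorem itself is easy to test on sample integer matrices with the permutation-sum determinant. A self-contained Python check:

```python
from itertools import permutations
from math import prod

def det(A):
    """Permutation-sum determinant, as in the definition of d above."""
    n = len(A)
    def sign(tau):
        return (-1) ** sum(tau[i] > tau[j]
                           for i in range(n) for j in range(i + 1, n))
    return sum(sign(t) * prod(A[t[j]][j] for j in range(n))
               for t in permutations(range(n)))

def mat_mul(C, A):
    n = len(C)
    return [[sum(C[i][k] * A[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

C = [[1, 2, 0], [3, -1, 4], [2, 2, 1]]
A = [[0, 1, 5], [1, 1, -2], [3, 0, 2]]
assert det(mat_mul(C, A)) == det(C) * det(A)   # |CA| = |C||A|
```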
Dual Spaces
The concept of dual module is basic, not only in algebra, but also in other
areas such as differential geometry and topology. If V is a finitely generated
vector space over a field F, its dual V* is defined as V* = Hom_F(V, F). V* is
isomorphic to V, but in general there is no natural isomorphism from V to V*.
However there is a natural isomorphism from V to V**, and so V* is the dual of
V and V may be considered to be the dual of V*. This remarkable fact has many
expressions in mathematics. For example, a tangent plane to a differentiable
manifold is a real vector space. The union of these spaces is the tangent
bundle, while the union of the dual spaces is the cotangent bundle. Thus the
tangent (cotangent) bundle may be considered to be the dual of the cotangent
(tangent) bundle. The sections of the tangent bundle are called vector fields
while the sections of the cotangent bundle are called 1-forms.

In algebraic topology, homology groups are derived from chain complexes, while
cohomology groups are derived from the dual chain complexes. The sum of the
cohomology groups forms a ring, while the sum of the homology groups does not.
Thus the concept of dual module has considerable power. We develop here the
basic theory of dual modules.

Suppose R is a commutative ring and W is an R-module.
Definition  If M is an R-module, let H(M) be the R-module
H(M) = Hom_R(M, W). If M and N are R-modules and g : M → N is an R-module
homomorphism, let H(g) : H(N) → H(M) be defined by H(g)(f) = f ◦ g, as in the
chain

    M --g→ N --f→ W,        H(g)(f) = f ◦ g.

Note that H(g) is an R-module homomorphism.
Theorem

i)   If M_1 and M_2 are R-modules, H(M_1 ⊕ M_2) ≈ H(M_1) ⊕ H(M_2).

ii)  If I : M → M is the identity, then H(I) : H(M) → H(M) is the identity.

iii) If M_1 --g→ M_2 --h→ M_3 are R-module homomorphisms, then
     H(g) ◦ H(h) = H(h ◦ g). If f : M_3 → W is a homomorphism, then

         (H(g) ◦ H(h))(f) = H(h ◦ g)(f) = f ◦ h ◦ g,

     as displayed in the chain M_1 --g→ M_2 --h→ M_3 --f→ W.
Note  In the language of category theory, H is a contravariant functor from
the category of R-modules to itself.

Theorem  If M and N are R-modules and g : M → N is an isomorphism, then
H(g) : H(N) → H(M) is an isomorphism with H(g^{-1}) = H(g)^{-1}.
Proof

    I_{H(N)} = H(I_N) = H(g ◦ g^{-1}) = H(g^{-1}) ◦ H(g)

    I_{H(M)} = H(I_M) = H(g^{-1} ◦ g) = H(g) ◦ H(g^{-1})
Theorem

i)   If g : M → N is a surjective homomorphism, then H(g) : H(N) → H(M) is
     injective.

ii)  If g : M → N is an injective homomorphism and g(M) is a summand of N,
     then H(g) : H(N) → H(M) is surjective.

iii) If R is a field and g : M → N is a homomorphism, then g is surjective
     (injective) iff H(g) is injective (surjective).

Proof  This is a good exercise.
For the remainder of this section, suppose W = R_R. In this case
H(M) = Hom_R(M, R) is denoted by H(M) = M* and H(g) is denoted by H(g) = g*.
Theorem  Suppose M has a finite free basis {v_1, ..., v_n}. Define
v*_i ∈ M* by v*_i(v_1r_1 + ··· + v_nr_n) = r_i. Thus v*_i(v_j) = δ_{i,j}.
Then {v*_1, ..., v*_n} is a free basis for M*, called the dual basis.
Therefore M* is free and is isomorphic to M.
Proof  First consider the case of R^n = R_{n,1}, with basis {e_1, ..., e_n}
where e_i is the column vector with 1̄ in the i-th place and 0̄ elsewhere. We
know (R^n)* ≈ R_{1,n}, i.e., any homomorphism from R^n to R is given by a
1 × n matrix. Now R_{1,n} is free with dual basis {e*_1, ..., e*_n} where
e*_i = (0, ..., 0, 1, 0, ..., 0) with the 1̄ in the i-th place. For the general
case, let g : R^n ≈→ M be given by g(e_i) = v_i. Then g* : M* → (R^n)* sends
v*_i to e*_i. Since g* is an isomorphism, {v*_1, ..., v*_n} is a basis for M*.
Theorem  Suppose M is a free module with a basis {v_1, ..., v_m}, N is a
free module with a basis {w_1, ..., w_n}, and g : M → N is the homomorphism
given by A = (a_{i,j}) ∈ R_{n,m}. This means
g(v_j) = a_{1,j}w_1 + ··· + a_{n,j}w_n. Then the matrix of g* : N* → M* with
respect to the dual bases is given by A^t.
Proof  Note that g*(w*_i) is a homomorphism from M to R. Evaluation on v_j
gives

    g*(w*_i)(v_j) = (w*_i ◦ g)(v_j) = w*_i(g(v_j))
                  = w*_i(a_{1,j}w_1 + ··· + a_{n,j}w_n) = a_{i,j}.

Thus g*(w*_i) = a_{i,1}v*_1 + ··· + a_{i,m}v*_m, and thus g* is represented
by A^t.
Exercise   If U is an R-module, define φ_U : U^* ⊕ U → R by φ_U(f, u) = f(u). Show that φ_U is R-bilinear. Suppose g : M → N is an R-module homomorphism, f ∈ N^*, and v ∈ M. Show that φ_N(f, g(v)) = φ_M(g^*(f), v). Now suppose M = N = R^n and g : R^n → R^n is represented by a matrix A ∈ R_n. Suppose f ∈ (R^n)^* and v ∈ R^n. Use the theorem above to show that φ : (R^n)^* ⊕ R^n → R has the property φ(f, Av) = φ(A^t f, v). This is with the elements of R^n and (R^n)^* written as column vectors. If the elements of R^n are written as column vectors and the elements of (R^n)^* are written as row vectors, the formula is φ(f, Av) = φ(fA, v). Of course this is just the matrix product fAv. Dual spaces are confusing, and this exercise should be worked out completely.
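Both forms of the identity in the exercise can be checked on random data. Again this is an outside illustration (Python with numpy, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
f = rng.standard_normal(4)   # element of (R^4)^*
v = rng.standard_normal(4)   # element of R^4

# phi(f, v) = f(v) is the matrix product f v.
phi = lambda f, v: f @ v

print(np.isclose(phi(f, A @ v), phi(A.T @ f, v)))  # True: column-vector form
print(np.isclose(phi(f, A @ v), phi(f @ A, v)))    # True: row-vector form
```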
Definition   “Double dual” is a “covariant” functor, i.e., if g : M → N is a homomorphism, then g^{**} : M^{**} → N^{**}. For any module M, define α : M → M^{**} by letting α(m) : M^* → R be the homomorphism which sends f ∈ M^* to f(m) ∈ R, i.e., α(m) is given by evaluation at m. Note that α is a homomorphism.
Theorem   If M and N are R-modules and g : M → N is a homomorphism, then the following diagram is commutative.

               α
        M ---------> M^{**}
        |               |
      g |               | g^{**}
        v               v
        N ---------> N^{**}
               α

Proof   On M, α is given by α(v) = φ_M(−, v). On N, α(u) = φ_N(−, u). The proof follows from the equation φ_N(f, g(v)) = φ_M(g^*(f), v).
Theorem   If M is a free R-module with a finite basis {v_1, ..., v_n}, then α : M → M^{**} is an isomorphism.

Proof   {α(v_1), ..., α(v_n)} is the dual basis of {v_1^*, ..., v_n^*}, i.e., α(v_i) = (v_i^*)^*.
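For M = R^n the content of α(v_i) = (v_i^*)^* can be seen concretely: taking the dual basis twice returns the original basis. The sketch below is an outside illustration (Python with numpy; the helper `dual` is my own).

```python
import numpy as np

B = np.array([[2.0, 1.0],
              [1.0, 1.0]])           # a basis of R^2, as columns

# Dual basis vectors, written as columns: the transposed rows of B^{-1}.
dual = lambda C: np.linalg.inv(C).T

# Applying "take the dual basis" twice recovers the original basis,
# which is the matrix form of alpha(v_i) = (v_i^*)^*.
print(np.allclose(dual(dual(B)), B))  # True
```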
Note Suppose R is a ﬁeld and C is the category of ﬁnitely generated vector spaces
over R. In the language of category theory, α is a natural equivalence between the
identity functor and the double dual functor.
Note   For finitely generated vector spaces, α is used to identify V and V^{**}. Under this identification V^* is the dual of V and V is the dual of V^*. Also, if {v_1, ..., v_n} is a basis for V and {v_1^*, ..., v_n^*} its dual basis, then {v_1, ..., v_n} is the dual basis for {v_1^*, ..., v_n^*}.
In general there is no natural way to identify V and V^*. However for real inner product spaces there is.
Theorem   Let R = R and V be an n-dimensional real inner product space. Then β : V → V^* given by β(v) = (v, −) is an isomorphism.

Proof   β is injective, and V and V^* have the same dimension.
Note   If β is used to identify V with V^*, then φ_V : V^* ⊕ V → R is just the dot product V ⊕ V → R.
Note   If {v_1, ..., v_n} is any orthonormal basis for V, then {β(v_1), ..., β(v_n)} is the dual basis of {v_1, ..., v_n}, that is, β(v_i) = v_i^*. The isomorphism β : V → V^* defines an inner product on V^*, and under this structure, β is an isometry. If {v_1, ..., v_n} is an orthonormal basis for V, {v_1^*, ..., v_n^*} is an orthonormal basis for V^*. Also, if U is another n-dimensional IPS and f : V → U is an isometry, then f^* : U^* → V^* is an isometry and the following diagram commutes.

               β
        V ---------> V^*
        |               ^
      f |               | f^*
        v               |
        U ---------> U^*
               β
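The statement β(v_i) = v_i^* for an orthonormal basis is exactly the matrix identity V^t V = I, since β identifies each v_i with the covector "dot product with v_i". A small outside illustration (Python with numpy; the example basis is mine):

```python
import numpy as np

# An orthonormal basis of R^2 (columns): rotation by 45 degrees.
c = 1.0 / np.sqrt(2.0)
V = np.array([[c, -c],
              [c,  c]])

# beta(v_i) = (v_i, -) is the row v_i^t, so the dual-basis property
# beta(v_i)(v_j) = delta_{i,j} is the statement V^t V = I, i.e. that
# the basis is orthonormal.
print(np.allclose(V.T @ V, np.eye(2)))  # True
```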
Exercise   Suppose R is a commutative ring, T is an infinite index set, and for each t ∈ T, R_t = R. Show (⊕_{t∈T} R_t)^* is isomorphic to R^T = ∏_{t∈T} R_t. Now let T = Z^+, R = R, and M = ⊕_{t∈T} R_t. Show M^* is not isomorphic to M.
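The second half of the exercise can be attacked by comparing dimensions. The following sketch is not from the text; it is the standard cardinality argument, stated here only as a hint.

```latex
% M = \bigoplus_{t \in \mathbb{Z}^+} \mathbb{R} has the countable basis
% \{e_1, e_2, \ldots\}, so \dim M = \aleph_0.  On the other hand
% M^* \cong \mathbb{R}^{\mathbb{Z}^+} contains the family
\{\,(1,\, t,\, t^2,\, t^3, \ldots) : t \in \mathbb{R}\,\},
% which is linearly independent (a Vandermonde argument), so
\dim M^* \;\ge\; 2^{\aleph_0} \;>\; \aleph_0 \;=\; \dim M .
% Since dimension is an isomorphism invariant, M^* is not isomorphic to M.
```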
Index
Abelian group, 20, 71
Algebraically closed ﬁeld, 46, 97
Alternating group, 32
Ascending chain condition, 112
Associate elements in a domain, 47, 109
Automorphism
of groups, 29
of modules, 70
of rings, 43
Axiom of choice, 10
Basis or free basis
canonical or standard for R^n, 72, 79
of a module, 78, 83
Bijective or one-to-one correspondence, 7
Binary operation, 19
Boolean algebras, 52
Boolean rings, 51
Cancellation law
in a group, 20
in a ring, 39
Cartesian product, 2, 11
Cayley’s theorem, 31
Cayley-Hamilton theorem, 66, 98, 125
Center of group, 22
Change of basis, 83
Characteristic of a ring, 50
Characteristic polynomial
of a homomorphism, 85, 95
of a matrix, 66
Chinese remainder theorem, 50, 108
Classical adjoint of a matrix, 63
Cofactor of a matrix, 62
Comaximal ideals, 108, 120
Commutative ring, 37
Complex numbers, 1, 40, 46, 47, 97, 104
Conjugate, 64
Conjugation by a unit, 44
Contravariant functor, 131
Coproduct or sum of modules, 76
Coset, 24, 42, 74
Cycle, 32
Cyclic
group, 23
module, 107
Determinant
of a homomorphism, 85
of a matrix, 60, 128
Diagonal matrix, 56
Dimension of a free module, 83
Division algorithm, 45
Domain
euclidean, 116
integral domain, 39
of a function, 5
principal ideal, 46
unique factorization, 111
Dual basis, 132
Dual spaces, 130
Eigenvalues, 95
Eigenvectors, 95
Elementary divisors, 119, 120
Elementary matrices, 58
Elementary operations, 57, 122
Endomorphism of a module, 70
Equivalence class, 4
Equivalence relation, 4
Euclidean algorithm, 14
Euclidean domain, 116
Evaluation map, 47, 49
Even permutation, 32
Exponential of a matrix, 106
Factorization domain (FD), 111
Fermat’s little theorem, 50
Field, 39
Formal power series, 113
Fourier series, 100
Free basis, 72, 78, 79, 83
Free R-module, 78
Function or map, 6
bijective, 7
injective, 7
surjective, 7
Function space Y^T
as a group, 22, 36
as a module, 69
as a ring, 44
as a set, 12
Fundamental theorem of algebra, 46
Gauss, 113
General linear group GL_n(R), 55
Generating sequence in a module, 78
Generators of Z_n, 40
Geometry of determinant, 90
Gram-Schmidt orthonormalization, 100
Graph of a function, 6
Greatest common divisor, 15
Group, 19
abelian, 20
additive, 20
cyclic, 23
multiplicative, 19
symmetric, 31
Hausdorﬀ maximality principle, 3, 87,
109
Hilbert, 113
Homogeneous equation, 60
Homomorphism
of groups, 23
of rings, 42
of modules, 69
Homomorphism of quotient
group, 29
module, 74
ring, 44
Ideal
left, 41
maximal, 109
of a ring, 41
prime, 109
principal, 42, 46
right, 41
Idempotent element in a ring, 49, 51
Image of a function, 7
Independent sequence in a module, 78
Index of a subgroup, 25
Index set, 2
Induction, 13
Injective or one-to-one, 7, 79
Inner product spaces, 98
Integers mod n, 27, 40
Integers, 1, 14
Invariant factors, 119
Inverse image, 7
Invertible or nonsingular matrix, 55
Irreducible element, 47, 110
Isometries of a square, 26, 34
Isometry, 101
Isomorphism
of groups, 29
of modules, 70
of rings, 43
Jacobian matrix, 91
Jordan block, 96, 123
Jordan canonical form, 96, 123, 125
Kernel, 28, 43, 70
Least common multiple, 17, 18
Linear combination, 78
Linear ordering, 3
Linear transformation, 85
Matrix
elementary, 58
invertible, 55
representing a linear transformation,
84
triangular, 56
Maximal
ideal, 109
independent sequence, 86, 87
monotonic subcollection, 4
subgroup, 114
Minimal polynomial, 127
Minor of a matrix, 62
Module over a ring, 68
Monomial, 48
Monotonic collection of sets, 4
Multilinear forms, 129
Multiplicative group of a ﬁnite ﬁeld, 121
Nilpotent
element, 56
homomorphism, 93
Noetherian ring, 112
Normal subgroup, 26
Odd permutation, 32
Onto or surjective, 7, 79
Order of an element or group, 23
Orthogonal group O(n), 102
Orthogonal vectors, 99
Orthonormal sequence, 99
Partial ordering, 3
Partition of a set, 5
Permutation, 31
Pigeonhole principle, 8, 39
Polynomial ring, 45
Power set, 12
Prime
element, 110
ideal, 109
integer, 16
Principal ideal domain (PID), 46
Principal ideal, 42
Product
of groups, 34, 35
of modules, 75
of rings, 49
of sets, 2, 11
Projection maps, 11
Quotient group, 27
Quotient module, 74
Quotient ring, 42
Range of a function, 6
Rank of a matrix, 59, 89
Rational canonical form, 107, 125
Relation, 3
Relatively prime
integers, 16
elements in a PID, 119
Right and left inverses of functions, 10
Ring, 38
Root of a polynomial, 46
Row echelon form, 59
Scalar matrix, 57
Scalar multiplication, 21, 38, 54, 71
Self adjoint, 103, 105
Short exact sequence, 115
Sign of a permutation, 60
Similar matrices, 64
Solutions of equations, 9, 59, 81
Splitting map, 114
Standard basis for R^n, 72, 79
Strips (horizontal and vertical), 8
Subgroup, 14, 21
Submodule, 69
Subring, 41
Summand of a module, 77, 115
Surjective or onto, 7, 79
Symmetric groups, 31
Symmetric matrix, 103
Torsion element of a module, 121
Trace
of a homomorphism, 85
of a matrix, 65
Transpose of a matrix, 56, 103, 132
Transposition, 32
Unique factorization,
in principal ideal domains, 113
of integers, 16
Unique factorization domain (UFD), 111
Unit in a ring, 38
Vector space, 67, 85
Volume preserving homomorphism, 90
Zero divisor in a ring, 39
take abstract algebra courses. Something is wrong here, and one thing wrong is that the courses try to do too much group and ring theory and not enough matrix theory and linear algebra.
5) To oﬀer an alternative for computer science majors to the standard discrete mathematics courses. Most of the material in the ﬁrst four chapters of this text is covered in various discrete mathematics courses. Computer science majors might beneﬁt by seeing this material organized from a purely mathematical viewpoint.
Over the years I used the five chapters that were typed as a base for my algebra courses, supplementing them as I saw fit. In 1996 I wrote a sixth chapter, giving enough material for a full first year graduate course. This chapter was written in the same “style” as the previous chapters, i.e., everything was right down to the nub. It hung together pretty well except for the last two sections on determinants and dual spaces. These were independent topics stuck on at the end. In the academic year 1997-98 I revised all six chapters and had them typed in LaTeX. This is the personal background of how this book came about.

It is difficult to do anything in life without help from friends, and many of my friends have contributed to this text. My sincere gratitude goes especially to Marilyn Gonzalez, Lourdes Robles, Marta Alpar, John Zweibel, Dmitry Gokhman, Brian Coomes, Huseyin Kocak, and Shulim Kaliman. To these and all who contributed, this book is fondly dedicated.

This book is a survey of abstract algebra with emphasis on linear algebra. It is intended for students in mathematics, computer science, and the physical sciences. The first three or four chapters can stand alone as a one semester course in abstract algebra. However they are structured to provide the background for the chapter on linear algebra. Chapter 2 is the most difficult part of the book because groups are written in additive and multiplicative notation, and the concept of coset is confusing at first. After Chapter 2 the book gets easier as you go along. Indeed, after the first four chapters, the linear algebra follows easily. Finishing the chapter on linear algebra gives a basic one year undergraduate course in abstract algebra. Chapter 6 continues the material to complete a first year graduate course. Classes with little background can do the first three chapters in the first semester, and chapters 4 and 5 in the second semester. More advanced classes can do four chapters the first semester and chapters 5 and 6 the second semester. As bare as the first four chapters are, you still have to truck right along to finish them in one semester.

The presentation is compact and tightly organized, but still somewhat informal. The proofs of many of the elementary theorems are omitted. These proofs are to be provided by the professor in class or assigned as homework exercises. There is a nontrivial theorem stated without proof in Chapter 4, namely the determinant of the product is the product of the determinants. For the proper flow of the course, this theorem should be assumed there without proof. The proof is contained in Chapter 6. The Jordan form should not be considered part of Chapter 5. It is stated there only as a reference for undergraduate courses. Finally, Chapter 6 is not written primarily for reference, but as an additional chapter for more advanced courses.
This text is written with the conviction that it is more effective to teach abstract and linear algebra as one coherent discipline rather than as two separate ones. Teaching abstract algebra and linear algebra as distinct courses results in a loss of synergy and a loss of momentum.

Also with this text the professor does not extract the course from the text, but rather builds the course upon it. I am convinced it is easier to build a course from a base than to extract it from a big book. Because after you extract it, you still have to build it. Basic algebra is a subject of incredible elegance and utility, but it requires a lot of organization. This book is my attempt at that organization. Every effort has been extended to make the subject move rapidly and to make the flow from one topic to the next as seamless as possible. The bare bones nature of this book adds to its flexibility, because you can build whatever course you want around it. The professor picks which topics to assign for serious study and which ones to “wave arms at”. The student has limited time during the semester for serious study, and this time should be allocated with care. When using this text, each assignment should include the study of the next few pages. A few minutes of preparation does wonders to leverage classroom learning, and the student already has the outline of the next lecture. Study forward, not just back, because mathematics is learned in hindsight. The goal is to stay focused and go forward.

When students come to class cold and spend the period taking notes, they participate little and learn little. This leads to a dead class and also to the bad psychology of “OK, I am here, so teach me the subject.” Mathematics is not taught, it is learned, and many students never learn how to learn. Professors should give more direction in that regard. The purpose of class is to learn, not to do transcription work.

Unfortunately mathematics is a difficult and heavy subject. The style and approach of this book is to make it a little lighter, and this book is intended to be used in that manner. This book works best when viewed lightly and read as a story. I would have made the book shorter, but I did not have any more time. I hope the students and professors who try it, enjoy it.

E. H. Connell
Department of Mathematics
University of Miami
Coral Gables, FL 33124
ec@math.miami.edu
Outline

Chapter 1  Background and Fundamentals of Mathematics
    Sets, Cartesian products                                            1
    Relations, partial orderings, Hausdorff maximality principle,
        equivalence relations                                           3
    Functions, bijections, strips, solutions of equations,
        right and left inverses, projections                            5
    Notation for the logic of mathematics                              13
    Integers, subgroups, unique factorization                          14

Chapter 2  Groups
    Groups, scalar multiplication for additive groups                  19
    Subgroups, order, cosets                                           21
    Normal subgroups, quotient groups, the integers mod n              25
    Homomorphisms                                                      27
    Permutations, the symmetric groups                                 31
    Product of groups                                                  34

Chapter 3  Rings
    Rings                                                              37
    Units, domains, fields                                             38
    The integers mod n                                                 40
    Ideals and quotient rings                                          41
    Homomorphisms                                                      42
    Polynomial rings                                                   45
    Product of rings                                                   49
    The Chinese remainder theorem                                      50
    Characteristic                                                     50
    Boolean rings                                                      51

Chapter 4  Matrices and Matrix Rings
    Addition and multiplication of matrices, invertible matrices       53
    Transpose                                                          56
    Triangular, diagonal, and scalar matrices                          56
    Elementary operations and elementary matrices                      57
    Systems of equations                                               59
    Determinants, the classical adjoint                                60
    Similarity, trace, and characteristic polynomial                   64

Chapter 5  Linear Algebra
    Modules, submodules                                                68
    Homomorphisms                                                      69
    Homomorphisms on R^n                                               71
    Cosets and quotient modules                                        74
    Products and coproducts                                            75
    Summands                                                           77
    Independence, generating sets, and free basis                      78
    Characterization of free modules                                   79
    Uniqueness of dimension                                            82
    Change of basis                                                    83
    Vector spaces, square matrices over fields, rank of a matrix       85
    Geometric interpretation of determinant                            90
    Linear functions approximate differentiable functions locally      91
    The transpose principle                                            92
    Nilpotent homomorphisms                                            93
    Eigenvalues, characteristic roots                                  95
    Jordan canonical form                                              96
    Inner product spaces, Gram-Schmidt orthonormalization              98
    Orthogonal matrices, the orthogonal group                         102
    Diagonalization of symmetric matrices                             103

Chapter 6  Appendix
    The Chinese remainder theorem                                     108
    Prime and maximal ideals and UFDs                                 109
    Splitting short exact sequences                                   114
    Euclidean domains                                                 116
    Jordan blocks                                                     122
    Jordan canonical form                                             123
    Determinants                                                      128
    Dual spaces                                                       130
Abstract algebra is not only a major subject of science, but it is also magic and fun. Abstract algebra is not all work and no play, and it is certainly not a dull boy. See, for example, the neat card trick on page 18. This trick is based, not on sleight of hand, but rather on a theorem in abstract algebra. Anyone can do it, but to understand it you need some group theory. And before beginning the course, you might first try your skills on the famous (some would say infamous) tile puzzle. In this puzzle, a frame has 12 spaces, the first 11 with numbered tiles and the last vacant.

     1    2    3    4
     5    6    7    8
     9   11   10

The last two tiles are out of order. Is it possible to slide the tiles around to get them all in order, and end again with the last space vacant? After giving up on this, you can study permutation groups and learn the answer!
Chapter 1

Background and Fundamentals of Mathematics

This chapter is fundamental, not just for algebra, but for all fields related to mathematics. The basic concepts are products of sets, partial orderings, equivalence relations, functions, and the integers. An equivalence relation on a set A is shown to be simply a partition of A into disjoint subsets. There is an emphasis on the concept of function, and the properties of surjective, injective, and bijective. The notion of a solution of an equation is central in mathematics, and most properties of functions can be stated in terms of solutions of equations. In elementary courses the section on the Hausdorff Maximality Principle should be ignored. The final section gives a proof of the unique factorization theorem for the integers.

Notation   Mathematics has its own universally accepted shorthand. The symbol ∃ means “there exists” and ∃! means “there exists a unique”. The symbol ∀ means “for each” and ⇒ means “implies”. Some sets (or collections) are so basic they have their own proprietary symbols. Five of these are listed below.

    N = Z^+ = the set of positive integers = {1, 2, 3, ...}
    Z = the ring of integers = {..., -2, -1, 0, 1, 2, ...}
    Q = the field of rational numbers = {a/b : a, b ∈ Z, b ≠ 0}
    R = the field of real numbers
    C = the field of complex numbers = {a + bi : a, b ∈ R}   (i^2 = -1)

Sets   Suppose A, B, C,... are sets. We use the standard notation for intersection and union.

    A ∩ B = {x : x ∈ A and x ∈ B} = the set of all x which are elements of A and B
    A ∪ B = {x : x ∈ A or x ∈ B} = the set of all x which are elements of A or B

Definition   Suppose each of A and B is a set. The statement that A is a subset of B (A ⊂ B) means that if a is an element of A, then a is an element of B. That is, a ∈ A ⇒ a ∈ B. If A ⊂ B we may say A is contained in B, or B contains A.

Exercise   Suppose each of A and B is a set. The statement that A is not a subset of B means ______.

Let ∅ be the null set. If A ∩ B = ∅, then A and B are said to be disjoint.

Suppose T is an index set and for each t ∈ T, A_t is a set. (Any set called an index set is assumed to be non-void.) Then

    ∪_{t∈T} A_t = {x : ∃ t ∈ T with x ∈ A_t}
    ∩_{t∈T} A_t = {x : if t ∈ T, x ∈ A_t} = {x : ∀ t ∈ T, x ∈ A_t}

If C ⊂ S (i.e., if C is a subset of S), let C′, the complement of C in S, be defined by C′ = S − C = {x ∈ S : x ∉ C}.

Theorem (De Morgan's laws)   Suppose S is a set. Then for any A, B ⊂ S,

    (A ∩ B)′ = A′ ∪ B′   and   (A ∪ B)′ = A′ ∩ B′

Cartesian Products   If X and Y are sets, the Cartesian product of X and Y is defined to be the set of all ordered pairs whose first term is in X and whose second term is in Y. In other words, X × Y = {(x, y) : x ∈ X and y ∈ Y}.

Example   R × R = R^2 = the plane.
Definition   If each of X_1, ..., X_n is a set, X_1 × ··· × X_n = {(x_1, ..., x_n) : x_i ∈ X_i for 1 ≤ i ≤ n} = the set of all ordered n-tuples whose i-th term is in X_i.

Example   R × ··· × R = R^n = real n-space.

Question   Is (R × R^2) = (R^2 × R) = R^3?

Relations

If A is a non-void set, a non-void subset R ⊂ A × A is called a relation on A. If (a, b) ∈ R we say that a is related to b, and we write this fact by the expression a ∼ b. Here are several properties which a relation may possess.

    1)  If a ∈ A, then a ∼ a.                  (reflexive)
    2)  If a ∼ b, then b ∼ a.                  (symmetric)
    2′) If a ∼ b and b ∼ a, then a = b.        (anti-symmetric)
    3)  If a ∼ b and b ∼ c, then a ∼ c.        (transitive)

Definition   A relation which satisfies 1), 2′), and 3) is called a partial ordering. In this case we write a ∼ b as a ≤ b. Then

    1)  If a ∈ A, then a ≤ a.
    2′) If a ≤ b and b ≤ a, then a = b.
    3)  If a ≤ b and b ≤ c, then a ≤ c.

Definition   A linear ordering is a partial ordering with the additional property that, if a, b ∈ A, then a ≤ b or b ≤ a.

Example   A = R with the ordinary ordering, is a linear ordering.

Example   A = all subsets of R^2, with a ≤ b defined by a ⊂ b, is a partial ordering.

Hausdorff Maximality Principle (HMP)   Suppose S is a non-void subset of A and ∼ is a relation on A. This defines a relation on S. If the relation satisfies any of the properties 1), 2), 2′), or 3) on A, the relation also satisfies these properties when restricted to S. In particular, a partial ordering on A defines a partial ordering on S.
However the ordering may be linear on S but not linear on A. The HMP is that any linearly ordered subset of a partially ordered set is contained in a maximal linearly ordered subset.

Exercise   Define a relation on A = R^2 by (a, b) ∼ (c, d) provided a ≤ c and b ≤ d. Show this is a partial ordering which is linear on S = {(a, a) : a < 0}. Find at least two maximal linearly ordered subsets of R^2 which contain S.

One of the most useful applications of the HMP is to obtain maximal monotonic collections of subsets.

Definition   A collection of sets is said to be monotonic if, given any two sets of the collection, one is contained in the other.

Corollary to HMP   Suppose X is a non-void set and A is some non-void collection of subsets of X, and S is a subcollection of A which is monotonic. Then ∃ a maximal monotonic subcollection of A which contains S.

Proof   Define a partial ordering on A by V ≤ W iff V ⊂ W, and apply HMP.

The HMP is used twice in this book. First, to show that infinitely generated vector spaces have free bases, and second, in the Appendix, to show that rings have maximal ideals (see pages 87 and 109). In each of these applications, the maximal monotonic subcollection will have a maximal element. In elementary courses, these results may be assumed, and thus the HMP may be ignored.

Equivalence Relations

A relation satisfying properties 1), 2), and 3) is called an equivalence relation.

Exercise   Define a relation on A = Z by n ∼ m iff n − m is a multiple of 3. Show this is an equivalence relation.

Definition   If ∼ is an equivalence relation on A and a ∈ A, we define the equivalence class containing a by cl(a) = {x ∈ A : a ∼ x}.
Theorem
    1) If b ∈ cl(a), then cl(b) = cl(a). Thus we may speak of a subset of A being an equivalence class with no mention of any element contained in it.
    2) Each element of A is an element of one and only one equivalence class.
    3) If each of U, V ⊂ A is an equivalence class and U ∩ V ≠ ∅, then U = V.

Definition   A partition of A is a collection of disjoint non-void subsets whose union is A. In other words, a collection of non-void subsets of A is a partition of A provided any a ∈ A is an element of one and only one subset of the collection. Note that if A has an equivalence relation, the equivalence classes form a partition of A.

Theorem   Suppose A is a non-void set with a partition. Define a relation on A by a ∼ b iff a and b belong to the same subset of the partition. Then ∼ is an equivalence relation, and the equivalence classes are just the subsets of the partition.

Summary   There are two ways of viewing an equivalence relation. One is as a relation on A satisfying 1), 2), and 3), and the other is as a partition of A into disjoint subsets.

Exercise   Define an equivalence relation on Z by n ∼ m iff n − m is a multiple of 3. What are the equivalence classes?

Exercise   Is there a relation on R satisfying 1), 2), 2′) and 3)? That is, is there an equivalence relation on R which is also a partial ordering?

Exercise   Let H ⊂ R^2 be the line H = {(a, 2a) : a ∈ R}. Consider the collection of all translates of H, i.e., all lines in the plane with slope 2. Find the equivalence relation on R^2 defined by this partition of R^2.

Functions

Just as there are two ways of viewing an equivalence relation, there are two ways of defining a function. One is the “intuitive” definition, and the other is the “graph” or “ordered pairs” definition. In either case, domain and range are inherent parts of the definition. We use the “intuitive” definition because everyone thinks that way.
Definition   If X and Y are (non-void) sets, a function or mapping or map with domain X and range Y, is an ordered triple (X, Y, f) where f assigns to each x ∈ X a well defined element f(x) ∈ Y. The statement that (X, Y, f) is a function is written as f : X → Y.

Definition   The graph of a function (X, Y, f) is the subset Γ ⊂ X × Y defined by Γ = {(x, f(x)) : x ∈ X}. The connection between the “intuitive” and “graph” viewpoints is given in the next theorem.

Theorem   If f : X → Y, then the graph Γ ⊂ X × Y has the property that each x ∈ X is the first term of one and only one ordered pair in Γ. Conversely, if Γ is a subset of X × Y with the property that each x ∈ X is the first term of one and only one ordered pair in Γ, then ∃! f : X → Y whose graph is Γ. The function is defined by “f(x) is the second term of the ordered pair in Γ whose first term is x.”

Example   Identity functions.  Here X = Y and f : X → X is defined by f(x) = x for all x ∈ X. The identity on X is denoted by I_X or just I : X → X.

Example   Constant functions.  Suppose y_0 ∈ Y. Define f : X → Y by f(x) = y_0 for all x ∈ X.

Restriction   Given f : X → Y and a non-void subset S of X, define f|S : S → Y by (f|S)(s) = f(s) for all s ∈ S.

Inclusion   If S is a non-void subset of X, define the inclusion i : S → X by i(s) = s for all s ∈ S. Note that inclusion is a restriction of the identity.

Composition   Given f : W → X and g : X → Y, define g ◦ f : W → Y by (g ◦ f)(x) = g(f(x)).

Theorem (The associative law of composition)   If f : V → W, g : W → X, and h : X → Y, then h ◦ (g ◦ f) = (h ◦ g) ◦ f. This may be written as h ◦ g ◦ f.
Definitions   Suppose f : X → Y.

    1) If S ⊂ X, the image of S is a subset of Y, f(S) = {f(s) : s ∈ S} = {y ∈ Y : ∃ s ∈ S with f(s) = y}.
    2) The image of f is the image of X, i.e., image(f) = f(X) = {f(x) : x ∈ X} = {y ∈ Y : ∃ x ∈ X with f(x) = y}.
    3) If T ⊂ Y, the inverse image of T is a subset of X, f^{-1}(T) = {x ∈ X : f(x) ∈ T}.
    4) f : X → Y is surjective or onto provided image(f) = Y, i.e., the image is the range, i.e., if y ∈ Y, f^{-1}(y) is a non-void subset of X.
    5) f : X → Y is injective or 1-1 provided (x_1 ≠ x_2) ⇒ f(x_1) ≠ f(x_2), i.e., if x_1 and x_2 are distinct elements of X, then f(x_1) and f(x_2) are distinct elements of Y.
    6) f : X → Y is bijective or is a 1-1 correspondence provided f is surjective and injective. In this case, there is a function f^{-1} : Y → X with f^{-1} ◦ f = I_X : X → X and f ◦ f^{-1} = I_Y : Y → Y. Note that f^{-1} : Y → X is also bijective and (f^{-1})^{-1} = f.

Examples
    1) f : R → R defined by f(x) = sin(x) is neither surjective nor injective.
    2) f : R → [−1, 1] defined by f(x) = sin(x) is surjective but not injective.
    3) f : [0, π/2] → R defined by f(x) = sin(x) is injective but not surjective.
    4) f : [0, π/2] → [0, 1] defined by f(x) = sin(x) is bijective. (f^{-1}(x) is written as arcsin(x) or sin^{-1}(x).)
    5) f : R → (0, ∞) defined by f(x) = e^x is bijective. (f^{-1}(x) is written as ln(x).)

Note   There is no such thing as “the function sin(x).” A function is not defined unless the domain and range are specified.
Exercise   Show there are natural bijections from (R × R^2) to (R^2 × R) and from (R^2 × R) to R × R × R. These three sets are disjoint, but the bijections between them are so natural that we sometimes identify them.

Exercise   Suppose f : X → Y is a function, S ⊂ X and T ⊂ Y. Find the relationship between S and f^{-1}(f(S)), and show that if f is injective, S = f^{-1}(f(S)). Also find the relationship between T and f(f^{-1}(T)), and show that if f is surjective, T = f(f^{-1}(T)).

Pigeonhole Principle   Suppose X is a finite set with m elements, Y is a finite set with n elements, and f : X → Y is a function.
    1) If m = n, then f is injective iff f is surjective iff f is bijective.
    2) If m > n, then f is not injective.
    3) If m < n, then f is not surjective.

If you are placing 6 pigeons in 6 holes, and you run out of pigeons before you fill the holes, then you have placed 2 pigeons in one hole. In other words, in part 1) for m = n = 6, if f is not surjective then f is not injective. Of course, the pigeonhole principle does not hold for infinite sets, as can be seen by the following exercise.

Exercise   Show there is a function f : Z^+ → Z^+ which is injective but not surjective. Also show there is one which is surjective but not injective.

Exercise   Suppose X is a set with 6 elements and Y is a finite set with n elements.
    1) There exists an injective f : X → Y iff n ______.
    2) There exists a surjective f : X → Y iff n ______.
    3) There exists a bijective f : X → Y iff n ______.

Exercise   Suppose f : [−2, 2] → R is defined by f(x) = x^2. Find f^{-1}(f([1, 2])). Also find f(f^{-1}([3, 5])).

Strips

We now define the vertical and horizontal strips of X × Y. If x_0 ∈ X, {(x_0, y) : y ∈ Y} = (x_0 × Y) is called a vertical strip. If y_0 ∈ Y, {(x, y_0) : x ∈ X} = (X × y_0) is called a horizontal strip.

Theorem   Suppose S ⊂ X × Y. The subset S is the graph of a function with domain X and range Y iff each vertical strip intersects S in exactly one point. This is just a restatement of the property of a graph of a function. The purpose of the next theorem is to restate properties of functions in terms of horizontal strips.

Theorem   Suppose f : X → Y has graph Γ. Then
    1) Each horizontal strip intersects Γ in at least one point iff f is ______.
    2) Each horizontal strip intersects Γ in at most one point iff f is ______.
    3) Each horizontal strip intersects Γ in exactly one point iff f is ______.

Solutions of Equations

Now we restate these properties in terms of solutions of equations. Suppose f : X → Y and y_0 ∈ Y. Consider the equation f(x) = y_0. Here y_0 is given and x is considered to be a “variable”. A solution to this equation is any x_0 ∈ X with f(x_0) = y_0. Note that the set of all solutions to f(x) = y_0 is f^{-1}(y_0). Also f(x) = y_0 has a solution iff y_0 ∈ image(f) iff f^{-1}(y_0) is non-void.

Theorem   Suppose f : X → Y.
    1) The equation f(x) = y_0 has at least one solution for each y_0 ∈ Y iff f is ______.
    2) The equation f(x) = y_0 has at most one solution for each y_0 ∈ Y iff f is ______.
    3) The equation f(x) = y_0 has a unique solution for each y_0 ∈ Y iff f is ______.

Right and Left Inverses

One way to understand functions is to study right and left inverses, which are defined after the next theorem.

Theorem   Suppose f : X → Y and g : Y → W are functions.
    1) If g ◦ f is injective, then f is injective.
    2) If g ◦ f is surjective, then g is surjective.
    3) If g ◦ f is bijective, then f is injective and g is surjective.

Example   X = W = {p}, Y = {p, q}, f(p) = p, and g(p) = g(q) = p. Here g ◦ f is the identity, but f is not surjective and g is not injective.

Definition   Suppose f : X → Y is a function. A left inverse of f is a function g : Y → X such that g ◦ f = I_X : X → X. A right inverse of f is a function h : Y → X such that f ◦ h = I_Y : Y → Y.

Theorem   Suppose f : X → Y is a function.
    1) f has a left inverse iff f is injective. Any such left inverse must be surjective.
    2) f has a right inverse iff f is surjective.

For completeness, note that if you worked 1) of the theorem above, any such right inverse must be injective, since f ◦ h = I_Y is injective.

The Axiom of Choice   If f : X → Y is surjective, f has a right inverse. That is, for each y ∈ Y, it is possible to choose an x ∈ f^{-1}(y) and thus to define h(y) = x.

Note   It is a classical theorem in set theory that the Axiom of Choice and the Hausdorff Maximality Principle are equivalent. For our purposes it is assumed that the Axiom of Choice and the HMP are true.

Corollary   Suppose each of X and Y is a non-void set. Then a function from X to Y is bijective iff it has a left inverse and a right inverse.

Exercise   Suppose f : X → Y is a function. Define a relation on X by a ∼ b iff f(a) = f(b). Show this is an equivalence relation. If y belongs to the image of f, then f^{-1}(y) is an equivalence class, and every equivalence class is of this form. In the next chapter, where f is a group homomorphism, these equivalence classes will be called cosets.
Background If g ◦ f is surjective. A left inverse of f is a function g : Y → X such that g ◦ f = IX : X → X. f has a left inverse iﬀ f is injective. Y = {p. Deﬁne a relation on X by a ∼ b if Exercise f (a) = f (b). but f is not surjective and g is not injective. Then ∃ an injective f : X → Y iﬀ ∃ a surjective g : Y → X. you unknowingly used one version of it. Suppose f : X → Y is a function. then f is injective and g is surjective. However in this text we do not go that deeply into set theory. we state this part of 1) again. these equivalence classes will be called cosets. Note The Axiom of Choice is not discussed in this book. then f has a right inverse h. then g is surjective. Any such right inverse must be injective.
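These one-sided inverse theorems are easy to experiment with on finite sets. The sketch below is our own encoding, not from the text: a function X → Y is represented as a Python dict, and a right inverse for a surjective f is built by choosing one element of f⁻¹(y) for each y — the finite analogue of the Axiom of Choice.

```python
# A finite sketch of injectivity, surjectivity, and one-sided inverses,
# with a function X -> Y stored as a dict {x: f(x)}.

def is_injective(f):
    return len(set(f.values())) == len(f)

def is_surjective(f, Y):
    return set(f.values()) == set(Y)

def right_inverse(f, Y):
    """For surjective f, choose one x in f^{-1}(y) for each y in Y."""
    h = {}
    for x, y in f.items():
        h.setdefault(y, x)          # keep the first preimage encountered
    return {y: h[y] for y in Y}

X, Y = [1, 2, 3], ['a', 'b']
f = {1: 'a', 2: 'b', 3: 'b'}        # surjective, not injective
h = right_inverse(f, Y)
print(all(f[h[y]] == y for y in Y))  # f ∘ h = I_Y, so this prints True
```

As the theorem predicts, the right inverse h built here is injective, and no left inverse of this f can exist since f is not injective.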
[Diagram: the pairing f = (f₁, f₂) : Y → X₁ × X₂, with projections π₁ : X₁ × X₂ → X₁ and π₂ : X₁ × X₂ → X₂ and components f₁ = π₁ ◦ f, f₂ = π₂ ◦ f.]
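The correspondence pictured here — a function into a product is exactly a pair of component functions — can be illustrated concretely (our own sketch, not from the text):

```python
# A function f : Y -> X1 x X2 is the same data as the pair (f1, f2),
# where f1 = pi1 ∘ f and f2 = pi2 ∘ f.

pi1 = lambda p: p[0]
pi2 = lambda p: p[1]

def pair(f1, f2):
    """Build f = (f1, f2) from its components."""
    return lambda y: (f1(y), f2(y))

f1 = lambda y: y + 1
f2 = lambda y: y * y
f = pair(f1, f2)

# Round trip: the components of f are the functions we started with.
print(all(pi1(f(y)) == f1(y) and pi2(f(y)) == f2(y) for y in range(5)))
```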
Projections If X₁ and X₂ are nonvoid sets, we define the projection maps π₁ : X₁ × X₂ → X₁ and π₂ : X₁ × X₂ → X₂ by πᵢ(x₁, x₂) = xᵢ.

Theorem If Y, X₁, and X₂ are nonvoid sets, there is a 1-1 correspondence between {functions f : Y → X₁ × X₂} and {ordered pairs of functions (f₁, f₂) where f₁ : Y → X₁ and f₂ : Y → X₂}.

Proof Given f, define f₁ = π₁ ◦ f and f₂ = π₂ ◦ f. Given f₁ and f₂, define f : Y → X₁ × X₂ by f(y) = (f₁(y), f₂(y)). Thus a function from Y to X₁ × X₂ is merely a pair of functions from Y to X₁ and Y to X₂. This concept is displayed in the diagram above, and it is summarized by the equation f = (f₁, f₂).

One nice thing about this concept is that it works fine for infinite Cartesian products.

Definition Suppose T is an index set, and for each t ∈ T, Xₜ is a nonvoid set. Then the product ∏_{t∈T} Xₜ is the collection of all sequences {xₜ}_{t∈T} = {xₜ} where xₜ ∈ Xₜ. Formally these sequences are functions α from T to ∪ Xₜ with each α(t) in Xₜ, written as α(t) = xₜ. If T = {1, 2, ..., n}, then {xₜ} is the ordered n-tuple (x₁, x₂, ..., xₙ). If T = Z⁺, then {xₜ} is the sequence (x₁, x₂, ...). For any T and any s in T, the projection map π_s : ∏_{t∈T} Xₜ → X_s is defined by π_s({xₜ}) = x_s.

Theorem If Y is any nonvoid set, there is a 1-1 correspondence between {functions f : Y → ∏_{t∈T} Xₜ} and {sequences of functions {fₜ}_{t∈T} where fₜ : Y → Xₜ}. Given f, the sequence {fₜ} is defined by fₜ = πₜ ◦ f. Given {fₜ}, f is defined by f(y) = {fₜ(y)}.

Exercise This exercise is not used elsewhere in this text and may be omitted. It is included here for students who wish to do a little more set theory. Suppose T is a nonvoid set. For any set Y, define Yᵀ to be the collection of all functions with domain T and range Y.

1) If Y is a nonvoid set, let Yₜ be a copy of Y for each t ∈ T. Then Yᵀ = ∏_{t∈T} Yₜ. Show that if T and Y are finite sets with m and n elements, then Yᵀ has nᵐ elements. In particular, when T = {1, 2, 3}, Yᵀ = Y × Y × Y has n³ elements. Show that if n ≥ 3, the subset of Y^{1,2,3} of all injective functions has n(n−1)(n−2) elements. These injective functions are called permutations on Y taken 3 at a time. If T = N, then Yᵀ is the infinite product Y × Y × ···, i.e., Yᴺ is the set of all infinite sequences (y₁, y₂, ...) where each yᵢ ∈ Y.

2) Suppose each of Y₁ and Y₂ is a nonvoid set. Show there is a natural bijection from (Y₁ × Y₂)ᵀ to Y₁ᵀ × Y₂ᵀ. (This is the fundamental property of Cartesian products presented in the two previous theorems.)

3) Define P(T), the power set of T, to be the collection of all subsets of T (including the null set). Show that if T is a finite set with m elements, P(T) has 2ᵐ elements.

4) If S is any subset of T, define its characteristic function χ_S : T → {0, 1} by letting χ_S(t) be 1 when t ∈ S, and be 0 when t ∉ S. Define α : P(T) → {0, 1}ᵀ by α(S) = χ_S, and define β : {0, 1}ᵀ → P(T) by β(f) = f⁻¹(1). Show that if S ⊂ T then β ◦ α(S) = S, and if f : T → {0, 1} then α ◦ β(f) = f. Thus α is a bijection and β = α⁻¹, i.e., P(T) ←→ {0, 1}ᵀ.

5) Suppose γ : T → {0, 1}ᵀ is a function, and show that it cannot be surjective. If t ∈ T, denote γ(t) by γ(t) = fₜ : T → {0, 1}. Define f : T → {0, 1} by f(t) = 0 if fₜ(t) = 1, and f(t) = 1 if fₜ(t) = 0. Show that f is not in the image of γ, and thus γ cannot be surjective. This shows that if T is an infinite set, then the set {0, 1}ᵀ represents a "higher order of infinity" than T.

6) An infinite set Y is said to be countable if there is a bijection from the positive integers N to Y. Show Q is countable, but the following three collections are not.
   i) P(N), the collection of all subsets of N.
   ii) {0, 1}ᴺ, the collection of all functions f : N → {0, 1}.
   iii) The collection of all sequences (y₁, y₂, ...) where each yᵢ is 0 or 1.
We know that ii) and iii) are equal and there is a natural bijection between i) and ii). We also know there is no surjective map from N to {0, 1}ᴺ, i.e., {0, 1}ᴺ is uncountable. Finally, show there is a bijection from {0, 1}ᴺ to the real numbers R. (This is not so easy. To start with, you have to decide what the real numbers are.)

A Calculus Exercise Let A be the collection of all functions f : [0, 1] → R which have an infinite number of derivatives. Let A₀ ⊂ A be the subcollection of those functions f with f(0) = 0. Define D : A₀ → A by D(f) = df/dx. Use the mean value theorem to show that D is injective, and use the fundamental theorem of calculus to show that D is surjective.

Notation for the Logic of Mathematics Mathematical symbols are shorthand for phrases and sentences in the English language. For example, "x ∈ B" means "x is an element of the set B." If A is the statement "x ∈ Z⁺" and B is the statement "x² ∈ Z⁺", then "A ⇒ B" means "If x is a positive integer, then x² is a positive integer". The important thing to remember is that thoughts and expressions flow through the language.

Suppose A and B are statements. A ⇒ B (A implies B) means "If A is true, then B is true." Each of the words "Lemma", "Theorem", and "Corollary" means "true statement". A theorem may be stated in any of the following ways:

Theorem  Hypothesis Statement A. Conclusion Statement B.
Theorem  Suppose A is true. Then B is true.
Theorem  If A is true, then B is true.
Theorem  A ⇒ B.

There are two ways to prove such a theorem — to suppose A is true and show B is true, or to suppose B is false and show A is false. The expressions "A ⇔ B", "A is equivalent to B", and "A is true iff B is true" have the same meaning (namely, that A ⇒ B and B ⇒ A).

Mathematical Induction is based upon the fact that if S ⊂ Z⁺ is a nonvoid subset, then S contains a smallest element.
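Parts 4) and 5) of the set-theory exercise above can be checked on a small finite T. The names `alpha`, `beta`, and `diagonal` below are our own, mirroring α, β, and the diagonal construction; tuples indexed by T stand in for functions T → {0, 1}.

```python
from itertools import combinations

T = (0, 1, 2)

def alpha(S):
    """The characteristic function chi_S, stored as a tuple indexed by T."""
    return tuple(1 if t in S else 0 for t in T)

def beta(f):
    """beta(f) = f^{-1}(1), the subset where f takes the value 1."""
    return frozenset(t for t in T if f[t] == 1)

# P(T): all 2^3 = 8 subsets of T
power_set = [frozenset(c) for r in range(len(T) + 1) for c in combinations(T, r)]

def diagonal(gamma):
    """Flip gamma(t) at position t: the result is not gamma(t) for any t."""
    return tuple(1 - gamma(t)[t] for t in T)

gamma = lambda t: alpha({t})     # one arbitrary choice of gamma : T -> {0,1}^T
missed = diagonal(gamma)
print(missed not in [gamma(t) for t in T])   # True: gamma is not surjective
```

The same flip argument works verbatim for infinite T, which is the point of part 5).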
Theorem Suppose P(n) is a statement for each n = 1, 2, 3, .... Suppose P(1) is true, and for each n ≥ 1, P(n) ⇒ P(n + 1). Then for each n ≥ 1, P(n) is true.

Proof If the theorem is false, then ∃ a smallest positive integer m such that P(m) is false. Since P(m − 1) is true, this is impossible.

Exercise Use induction to show that, for each n ≥ 1, 1 + 2 + ··· + n = n(n + 1)/2.

The Integers In this section, lower case letters a, b, c, ..., m, n, ... will represent integers, i.e., elements of Z. Here we will establish the following three basic properties of the integers.

1) If G is a subgroup of Z, then ∃ n ≥ 0 such that G = nZ.
2) If a and b are integers, not both zero, and G is the collection of all linear combinations of a and b, then G is a subgroup of Z, and its positive generator is the greatest common divisor of a and b.
3) If n ≥ 2, then n factors uniquely as the product of primes.

All of this will follow from long division, which we now state formally.

Euclidean Algorithm Given a, b with b ≠ 0, ∃! m and r with 0 ≤ r < |b| and a = bm + r. In other words, b divides a "m times with a remainder of r". For example, if a = −17 and b = 5, then m = −4 and r = 3, i.e., −17 = 5(−4) + 3.

Definition If r = 0, we say that b divides a, or a is a multiple of b. This fact is written as b | a. Note that b | a ⇔ the rational number a/b is an integer ⇔ ∃! m such that a = bm ⇔ a ∈ bZ.

Note Anything (except 0) divides 0. 0 does not divide anything, and ±1 divides anything. If n ≠ 0, the set of integers which n divides is nZ = {nm : m ∈ Z} = {..., −2n, −n, 0, n, 2n, ...}. Also n divides a and b with the same remainder iff n divides (a − b).

Definition A nonvoid subset G ⊂ Z is a subgroup provided (g ∈ G ⇒ −g ∈ G) and (g₁, g₂ ∈ G ⇒ (g₁ + g₂) ∈ G). We say that G is closed under negation and closed under addition.

Theorem If n ∈ Z then nZ is a subgroup. Thus if n ≠ 0, the set of integers which n divides is a subgroup of Z.

The next theorem states that every subgroup of Z is of this form.

Theorem Suppose G ⊂ Z is a subgroup. Then

1) 0 ∈ G.
2) If g₁ and g₂ ∈ G, then (m₁g₁ + m₂g₂) ∈ G for all integers m₁, m₂.
3) ∃! non-negative integer n such that G = nZ. In fact, if G ≠ {0} and n is the smallest positive integer in G, then G = nZ.

Proof Since G is nonvoid, ∃ g ∈ G. Now (−g) ∈ G, and thus 0 = g + (−g) belongs to G, and so 1) is true. Part 2) is straightforward, so consider 3). If G ≠ 0, it must contain a positive element. Let n be the smallest positive integer in G. If g ∈ G, then g = nm + r where 0 ≤ r < n. Since r ∈ G, it must be 0, and thus g ∈ nZ. Therefore G = nZ.

Now suppose a, b ∈ Z and at least one of a and b is nonzero.

Theorem Let G be the set of all linear combinations of a and b, G = {ma + nb : m, n ∈ Z}. Then

1) G contains a and b.
2) G is a subgroup. In fact, it is the smallest subgroup containing a and b. It is called the subgroup generated by a and b.
3) Denote by (a, b) the smallest positive integer in G. Then G = (a, b)Z, and thus (a, b) | a and (a, b) | b. Also note that ∃ m, n such that ma + nb = (a, b). The integer (a, b) is called the greatest common divisor of a and b.
4) If n is an integer which divides a and b, then n also divides (a, b).

Proof of 4) Suppose n | a and n | b, i.e., a, b ∈ nZ. Since G is the smallest subgroup containing a and b, nZ ⊃ (a, b)Z, and thus n | (a, b).

Corollary The following are equivalent.

1) a and b have no common divisors, i.e., (n | a and n | b) ⇒ n = ±1.
2) (a, b) = 1, i.e., the subgroup generated by a and b is all of Z.
3) ∃ m, n ∈ Z with ma + nb = 1.

Definition If any one of these three conditions is satisfied, we say that a and b are relatively prime.

Theorem If a and b are relatively prime with a | bc, then a | c.

Proof Suppose a and b are relatively prime with a | bc. Then ∃ m, n ∈ Z with ma + nb = 1, and thus mac + nbc = c. Now a | mac and a | nbc. Thus a | (mac + nbc), and so a | c.

Definition A prime is an integer p > 1 which does not factor, i.e., if p = ab then a = ±1 or a = ±p. The first few primes are 2, 3, 5, 7, 11, 13, 17, ....

This next theorem is the basis for unique factorization.

Theorem Suppose p is a prime.

1) If a is an integer which is not a multiple of p, then (p, a) = 1. In other words, if a is any integer, (p, a) = p or (p, a) = 1.
2) If p | ab, then p | a or p | b.
3) If p | a₁a₂···aₙ, then p divides some aᵢ. Thus if each aᵢ is a prime, then p is equal to some aᵢ.

Proof Part 1) follows immediately from the definition of prime. Now suppose p | ab. If p does not divide a, then (p, a) = 1, and by the previous theorem, p must divide b. Thus 2) is true. Part 3) follows from 2) and induction on n.

The Unique Factorization Theorem Suppose a is an integer which is not 0, 1, or −1. Then a may be factored into the product of primes and, except for order, this factorization is unique. That is, ∃ a unique collection of distinct primes p₁, p₂, ..., pₖ and positive integers s₁, s₂, ..., sₖ such that a = ±p₁^s₁ p₂^s₂ ··· pₖ^sₖ.

Proof Factorization into primes is obvious, and uniqueness follows from 3) in the theorem above. The power of this theorem is uniqueness, not existence.
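The proofs above are effective: running the Euclidean algorithm backwards produces the m, n with ma + nb = (a, b). A sketch — the standard extended Euclidean algorithm, not spelled out in the text:

```python
# Extended Euclidean algorithm: returns (g, m, n) with g = (a, b) and
# m*a + n*b = g, i.e. g is the positive generator of {ma + nb : m, n in Z}.

def extended_gcd(a, b):
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r     # long division step
        old_m, m = m, old_m - q * m     # carry the coefficients along
        old_n, n = n, old_n - q * n
    if old_r < 0:                       # normalize to the positive generator
        old_r, old_m, old_n = -old_r, -old_m, -old_n
    return old_r, old_m, old_n

g, m, n = extended_gcd(252, 198)
print(g, m, n)      # 18 4 -5, since 4*252 - 5*198 = 18
```

The invariant maintained by the loop is exactly part 2) of the subgroup theorem: every remainder is an integer linear combination of a and b.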
Now that we have unique factorization and part 3) above, the picture becomes transparent. Here are some of the basic properties of the integers in this light.

Theorem (Summary)

1) Suppose |a| > 1 has prime factorization a = ±p₁^s₁ ··· pₖ^sₖ. Then the only divisors of a are of the form ±p₁^t₁ ··· pₖ^tₖ where 0 ≤ tᵢ ≤ sᵢ for i = 1, ..., k.
2) If |a| > 1 and |b| > 1, then (a, b) = 1 iff there is no common prime in their factorizations. Thus if there is no common prime in their factorizations, ∃ m, n with ma + nb = 1.
3) Suppose |a| > 1 and |b| > 1. Let {p₁, p₂, ..., pₖ} be the union of the distinct primes of their factorizations. Thus a = ±p₁^s₁ ··· pₖ^sₖ where 0 ≤ sᵢ, and b = ±p₁^t₁ ··· pₖ^tₖ where 0 ≤ tᵢ. Let uᵢ be the minimum of sᵢ and tᵢ. Then (a, b) = p₁^u₁ ··· pₖ^uₖ. For example, (2³·5·11, 2²·5⁴·7) = 2²·5.
4) Let vᵢ be the maximum of sᵢ and tᵢ. Then c = p₁^v₁ ··· pₖ^vₖ is the least (positive) common multiple of a and b. Note that c is a multiple of a and b, and if n is a multiple of a and b, then n is a multiple of c. Finally, if a and b are positive, their least common multiple is c = ab/(a, b), and if in addition a and b are relatively prime, then their least common multiple is just their product.
5) There is an infinite number of primes. (Proof: Suppose there were only a finite number of primes p₁, p₂, ..., pₖ. Then no prime would divide (p₁p₂···pₖ + 1). This is impossible, because any integer greater than 1 is divisible by some prime.)
6) Suppose c is an integer greater than 1. Then √c is rational iff √c is an integer. In particular, √2 and √3 are irrational. (See the fifth exercise below.) (Proof: If √c is rational, ∃ positive integers a and b with √c = a/b and (a, b) = 1, and also (a², b²) = 1. If b > 1, then it is divisible by some prime, and since cb² = a², this prime will also appear in the prime factorization of a. This is a contradiction, and thus b = 1 and √c is an integer.)

Exercise Find (180, 28), i.e., find the greatest common divisor of 180 and 28, i.e., find the positive generator of the subgroup generated by {180, 28}. Find integers m and n such that 180m + 28n = (180, 28). Find the least common multiple of 180 and 28, and show that it is equal to (180 · 28)/(180, 28).

Exercise We have defined the greatest common divisor (gcd) and the least common multiple (lcm) of a pair of integers. Now suppose n ≥ 2 and S = {a₁, a₂, ..., aₙ} is a finite collection of integers with |aᵢ| > 1 for 1 ≤ i ≤ n. Define the gcd and the lcm of the elements of S and develop their properties. Express the gcd and the lcm in terms of the prime factorizations of the aᵢ. When is the lcm of S equal to the product a₁a₂···aₙ? Show that the set of all linear combinations of the elements of S is a subgroup of Z, and its positive generator is the gcd of the elements of S.

Exercise Show that the gcd of S = {90, 70, 42} is 2, and find integers n₁, n₂, n₃ such that 90n₁ + 70n₂ + 42n₃ = 2. Also find the lcm of the elements of S. Show that if each of G₁, G₂, ..., Gₘ is a subgroup of Z, then G₁ ∩ G₂ ∩ ··· ∩ Gₘ is also a subgroup of Z. Now let G = (90Z) ∩ (70Z) ∩ (42Z), and find the positive integer n with G = nZ.

Exercise Show that if the nth root of an integer is a rational number, then it is an integer. That is, suppose c and n are integers greater than 1. There is a unique positive real number x with xⁿ = c. Show that if x is rational, then it is an integer. Thus if p is a prime, its nth root is an irrational number.

Exercise Show that a positive integer is divisible by 3 iff the sum of its digits is divisible by 3. More generally, let a = aₙaₙ₋₁...a₀ = aₙ10ⁿ + aₙ₋₁10ⁿ⁻¹ + ··· + a₀ where 0 ≤ aᵢ ≤ 9. Now let b = aₙ + aₙ₋₁ + ··· + a₀, and show that 3 divides a and b with the same remainder. Although this is a straightforward exercise in long division, it will be more transparent later on. In the language of the next chapter, it says that [a] = [b] in Z₃.

Card Trick Ask friends to pick out seven cards from a deck and then to select one to look at without showing it to you. Take the six cards face down in your left hand and the selected card in your right hand, and announce you will place the selected card in with the other six, but they are not to know where. Put your hands behind your back, place the selected card on top, and bring the seven cards in front in your left hand. Ask your friends to give you a number between one and seven (not allowing one). Suppose they say three. You move the top card to the bottom, then the second card to the bottom, and then you turn over the third card, leaving it face up on top. Then repeat the process, moving the top two cards to the bottom and turning the third card face up on top. Continue until there is only one card face down, and this will be the selected card. Magic? Stay tuned for Chapter 2, where it is shown that any nonzero element of Z₇ has order 7.
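The trick can be simulated. The model below is our own reading of the procedure: positions sit on a fixed cycle of 7, the selected card starts on top (position 0), and in each round k − 1 cards pass to the bottom before the next card is turned face up. The turned positions are then the successive multiples of k − 1 modulo 7, and the selected card survives precisely because every nonzero element of Z₇ generates Z₇.

```python
# Simplified simulation of the card trick (our model of the procedure).

def card_trick(k, n=7):
    """Return the position of the last face-down card; 0 is the selected card."""
    face_down = set(range(n))
    p = 0                        # current top of the packet
    while len(face_down) > 1:
        p = (p + k - 1) % n      # k-1 cards pass to the bottom
        face_down.discard(p)     # the k-th card is turned face up
    return face_down.pop()

print([card_trick(k) for k in range(2, 8)])   # [0, 0, 0, 0, 0, 0]
```

For every allowed count k, the survivor is position 0 — the selected card. With a composite packet size (say n = 6 and k = 4), k − 1 need not generate Zₙ and the trick can fail.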
Chapter 2  Groups

Groups are the central objects of algebra. In later chapters we will define rings and modules and see that they are special cases of groups. Also ring homomorphisms and module homomorphisms are special cases of group homomorphisms. Even though the definition of group is simple, it leads to a rich and amazing theory. Everything presented here is standard, except that the product of groups is given in the additive notation. This is the notation used in later chapters for the products of rings and modules. This chapter and the next two chapters are restricted to the most basic topics. The approach is to do quickly the fundamentals of groups, rings, and matrices, and to push forward to the chapter on linear algebra. This chapter is, by far and above, the most difficult chapter in the book, because group operations may be written as addition or multiplication, and also the concept of coset is confusing at first.

Definition Suppose G is a nonvoid set and φ : G × G → G is a function. φ is called a binary operation, and we will write φ(a, b) = a·b or φ(a, b) = a + b. Consider the following properties.

1) If a, b, c ∈ G, then a·(b·c) = (a·b)·c.  If a, b, c ∈ G, then a + (b + c) = (a + b) + c.
2) ∃ e = e_G ∈ G such that if a ∈ G, e·a = a·e = a.  ∃ 0̄ = 0̄_G ∈ G such that if a ∈ G, 0̄ + a = a + 0̄ = a.
3) If a ∈ G, ∃ b ∈ G with a·b = b·a = e (b is written as b = a⁻¹).  If a ∈ G, ∃ b ∈ G with a + b = b + a = 0̄ (b is written as b = −a).
4) If a, b ∈ G, then a·b = b·a.  If a, b ∈ G, then a + b = b + a.

Definition If properties 1), 2), and 3) hold, (G, φ) is said to be a group. If we write φ(a, b) = a·b, we say it is a multiplicative group. If we write φ(a, b) = a + b, we say it is an additive group. If in addition property 4) holds, we say the group is abelian or commutative.

Theorem Let (G, φ) be a multiplicative group.

(i) Suppose a, c, c̄ ∈ G. Then a·c = a·c̄ ⇒ c = c̄. Also c·a = c̄·a ⇒ c = c̄. In other words, if f : G → G is defined by f(c) = a·c, then f is injective. Also f is bijective with f⁻¹ given by f⁻¹(c) = a⁻¹·c.
(ii) e is unique, i.e., if ē ∈ G satisfies 2), then e = ē. In fact, if a, b ∈ G, then (a·b = a) ⇒ (b = e) and (a·b = b) ⇒ (a = e). Recall that b is an identity in G provided it is a right and left identity for any a in G. However, group structure is so rigid that if ∃ a ∈ G such that b is a right identity for a, then b = e.
(iii) Every right inverse is an inverse, i.e., if a·b = e then b = a⁻¹. Also if b·a = e, then b = a⁻¹. Thus inverses are unique.
(iv) If a ∈ G, then (a⁻¹)⁻¹ = a.
(v) The multiplication a₁·a₂·a₃ = a₁·(a₂·a₃) = (a₁·a₂)·a₃ is well-defined. In general, a₁·a₂···aₙ is well-defined.
(vi) Suppose a, b ∈ G. Then (a·b)⁻¹ = b⁻¹·a⁻¹. Also (a₁·a₂···aₙ)⁻¹ = aₙ⁻¹·aₙ₋₁⁻¹···a₁⁻¹.
(vii) Suppose a ∈ G. Let a⁰ = e, and if n > 0, aⁿ = a···a (n times) and a⁻ⁿ = a⁻¹···a⁻¹ (n times). If n₁, n₂, ..., nₜ ∈ Z, then aⁿ¹·aⁿ²···aⁿᵗ = aⁿ¹⁺···⁺ⁿᵗ. Also (aⁿ)ᵐ = aⁿᵐ. Finally, if G is abelian and a, b ∈ G, then (a·b)ⁿ = aⁿ·bⁿ.

Exercise. Write out the above theorem where G is an additive group. Note that part (vii) states that G has a scalar multiplication over Z. This means that if a is in G and n is an integer, there is defined an element an in G. This is so basic that we state it explicitly.

Theorem. Suppose G is an additive group. Let a0 = 0̄, and if n > 0, let an = (a + ··· + a) where the sum is n times, and a(−n) = (−a) + (−a) + ··· + (−a), which we write as (−a − a ··· − a). Then the following properties hold in general, except the first requires that G be abelian.

(a + b)n = an + bn
a(n + m) = an + am
a(nm)   = (an)m
a1      = a

Note that the plus sign is used ambiguously — sometimes for addition in G and sometimes for addition in Z. In the language used in Chapter 5, this theorem states that any additive abelian group is a Z-module. (See page 71.)

Exercise Suppose G is a nonvoid set with a binary operation φ(a, b) = a·b which satisfies 1), 2) and [3') If a ∈ G, ∃ b ∈ G with a·b = e]. Show (G, φ) is a group, i.e., show b·a = e. In other words, the group axioms are stronger than necessary. If every element has a right inverse, then every element has a two-sided inverse.

Examples G = R, G = Q, or G = Z with φ(a, b) = a + b is an additive abelian group.

Examples G = R − 0 or G = Q − 0 with φ(a, b) = ab is a multiplicative abelian group. G = Z − 0 with φ(a, b) = ab is not a group. Note that G satisfies 1), 2), and 4) but not 3), and thus G is not a group. G = R⁺ = {r ∈ R : r > 0} with φ(a, b) = ab is a multiplicative abelian group.

Exercise Suppose G is the set of all functions from Z to Z, with multiplication defined by composition, f·g = f ◦ g. Note that G satisfies 1) and 2) but not 3), and thus G is not a group. Show that f has a right inverse in G iff f is surjective, and f has a left inverse in G iff f is injective (see page 10). Also show that the set of all bijections from Z to Z is a group under composition.
Subgroups

Theorem Suppose G is a multiplicative group and H ⊂ G is a nonvoid subset satisfying

1) if a, b ∈ H then a·b ∈ H, and
2) if a ∈ H then a⁻¹ ∈ H.

Then e ∈ H and H is a group under multiplication. H is called a subgroup of G.

Proof Since H is nonvoid, ∃ a ∈ H. By 2), a⁻¹ ∈ H, and so by 1), e ∈ H. The associative law is immediate, and so H is a group.

Example G is a subgroup of G, and e is a subgroup of G. These are called the improper subgroups of G.

Example If G = Z under addition, and n ∈ Z, then H = nZ is a subgroup of Z. By a theorem in the section on the integers in Chapter 1, every subgroup of Z is of this form (see page 15). This is a key property of the integers.

Exercises Suppose G is a multiplicative group.

1) Let H be the center of G, i.e., H = {h ∈ G : g·h = h·g for all g ∈ G}. Show H is a subgroup of G.
2) Suppose H₁ and H₂ are subgroups of G. Show H₁ ∩ H₂ is a subgroup of G.
3) Suppose H₁ and H₂ are subgroups of G, with neither H₁ nor H₂ contained in the other. Show H₁ ∪ H₂ is not a subgroup of G.
4) Suppose T is an index set and for each t ∈ T, Hₜ is a subgroup of G. Show ∩_{t∈T} Hₜ is a subgroup of G.
5) Furthermore, if {Hₜ} is a monotonic collection, show ∪_{t∈T} Hₜ is a subgroup of G.
6) Suppose G = {all functions f : [0, 1] → R}. Define an addition on G by (f + g)(t) = f(t) + g(t) for all t ∈ [0, 1]. This makes G into an abelian group. Let K be the subset of G composed of all differentiable functions, and let H be the subset of G composed of all continuous functions. What theorems in calculus show that H and K are subgroups of G? What theorem shows that K is a subset (and thus subgroup) of H?

Order

Suppose G is a multiplicative group. If G has an infinite number of elements, we say that o(G), the order of G, is infinite. If G has n elements, then o(G) = n. Suppose a ∈ G and H = {aⁱ : i ∈ Z}. H is an abelian subgroup of G called the subgroup generated by a. We define the order of the element a to be the order of H, the order of the subgroup generated by a. Note that e is the only element of order 1.

Theorem Suppose a is an element of a multiplicative group G, and H = {aⁱ : i ∈ Z}. If ∃ distinct integers i and j with aⁱ = aʲ, then a has some finite order n, i.e., the order of a is the smallest positive integer n with aⁿ = e. In this case H has n distinct elements, H = {a⁰, a¹, ..., aⁿ⁻¹}, and aᵐ = e iff n | m.

Proof Suppose j < i and aⁱ = aʲ. Then aⁱ⁻ʲ = e, and thus ∃ a smallest positive integer n with aⁿ = e. This implies that the elements of {a⁰, a¹, ..., aⁿ⁻¹} are distinct, and we must show they are all of H. If m ∈ Z, the Euclidean algorithm states that ∃ integers q and r with 0 ≤ r < n and m = nq + r. Thus aᵐ = aⁿq·aʳ = aʳ, and so H = {a⁰, a¹, ..., aⁿ⁻¹}.

Note Let f : Z → H be the surjective function defined by f(m) = aᵐ. Note that f(k + l) = f(k)·f(l), where the addition is in Z and the multiplication is in the group H. The theorem says that the element a has finite order iff f is not injective, and in this case f⁻¹(e) = nZ. Later in this chapter we will see that f is a homomorphism from an additive group to a multiplicative group, and that, in additive notation, H is isomorphic to Z or Zn.

Exercise Write out this theorem for G an additive group, i.e., in additive notation, the order of a is the smallest positive integer n with an = 0̄, and am = 0̄ iff n | m.

Exercise Show that if G is a finite group of even order, then G has an odd number of elements of order 2.

Definition A group G is cyclic if ∃ an element of G which generates G.

We come now to the first real theorem in group theory.

Theorem If G is cyclic and H is a subgroup of G, then H is cyclic.

Proof Suppose G = {aⁱ : i ∈ Z} is a cyclic group and H is a subgroup of G. If H = e, then H is cyclic, so suppose H ≠ e. Now there is a smallest positive integer m with aᵐ ∈ H. If t is an integer with aᵗ ∈ H, then by the Euclidean algorithm, m divides t, and thus aᵐ generates H. Note that in the case G has finite order n, then aⁿ = e ∈ H, and thus the positive integer m divides n. In either case, we have a clear picture of the subgroups of G. Also note that this theorem was proved on page 15 for the additive group Z.
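The order of an element can be computed directly from the definition. As a concrete check (our own, using the multiplicative group of nonzero elements of Z₇):

```python
# Order of an element: the smallest n >= 1 with a^n = e.

def order(a, op, e):
    """Assumes the order of a is finite."""
    n, x = 1, a
    while x != e:
        x = op(x, a)
        n += 1
    return n

mult_mod7 = lambda x, y: (x * y) % 7
orders = {a: order(a, mult_mod7, 1) for a in range(1, 7)}
print(orders)    # {1: 1, 2: 3, 3: 6, 4: 3, 5: 6, 6: 2}
```

The subgroup generated by a has exactly order(a) elements; since 3 and 5 have order 6, each generates the whole group, so this group of order 6 is cyclic.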
Cosets

Suppose H is a subgroup of a group G. It will be shown below that H partitions G into right cosets. It also partitions G into left cosets, and in general these partitions are distinct.

Theorem If H is a subgroup of a multiplicative group G, then a ∼ b defined by a ∼ b iff a·b⁻¹ ∈ H is an equivalence relation. If a ∈ G, cl(a) = {b ∈ G : a ∼ b} = {h·a : h ∈ H} = Ha. Note that a·b⁻¹ ∈ H iff b·a⁻¹ ∈ H.

If H is a subgroup of an additive group G, then a ∼ b defined by a ∼ b iff (a − b) ∈ H is an equivalence relation. If a ∈ G, cl(a) = {b ∈ G : a ∼ b} = {h + a : h ∈ H} = H + a. Note that (a − b) ∈ H iff (b − a) ∈ H.

Definition These equivalence classes are called right cosets. If the relation is defined by a ∼ b iff b⁻¹·a ∈ H, then the equivalence classes are cl(a) = aH, and they are called left cosets. H is a left and right coset. If G is abelian, there is no distinction between right and left cosets. Note that b⁻¹·a ∈ H iff a⁻¹·b ∈ H.

In the theorem above, H is used to define an equivalence relation on G, and thus a partition of G. We now do the same thing a different way. We define the right cosets directly and show they form a partition of G. You might find this easier.

Theorem Suppose H is a subgroup of a multiplicative group G. If a ∈ G, define the right coset containing a to be Ha = {h·a : h ∈ H}. Then the following hold.

1) Ha = H iff a ∈ H.
2) If b ∈ Ha, then Hb = Ha, i.e., if h ∈ H, then H(h·a) = (Hh)a = Ha.
3) If Hc ∩ Ha ≠ ∅, then Hc = Ha.
4) The right cosets form a partition of G, i.e., each a in G belongs to one and only one right coset.
5) Elements a and b belong to the same right coset iff a·b⁻¹ ∈ H iff b·a⁻¹ ∈ H.

Proof There is no better way to develop facility with cosets than to prove this theorem. Also write this theorem for G an additive group.
Theorem Suppose H is a subgroup of a multiplicative group G.

1) Any two right cosets have the same number of elements. That is, if a, b ∈ G, f : Ha → Hb defined by f(h·a) = h·b is a bijection. Also any two left cosets have the same number of elements. Since H is a right and left coset, any two cosets have the same number of elements.
2) G has the same number of right cosets as left cosets. The function F defined by F(Ha) = a⁻¹H is a bijection from the collection of right cosets to the left cosets. The number of right (or left) cosets is called the index of H in G.
3) If G is finite, o(H)·(index of H) = o(G), and so o(H) | o(G). In other words, o(G)/o(H) = the number of right cosets = the number of left cosets.
4) If G is finite and a ∈ G, then o(a) | o(G). (Proof: The order of a is the order of the subgroup generated by a, and by 3) this divides the order of G.)
5) If G has prime order, then G is cyclic, and any element (except e) is a generator. (Proof: Suppose o(G) = p and a ∈ G, a ≠ e. Then o(a) | p, and thus o(a) = p.)
6) If o(G) = n and a ∈ G, then aⁿ = e. (Proof: a^o(a) = e and n = o(a)·(o(G)/o(a)).)
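Parts 4) and 6) can be checked in a small non-abelian example (our own construction): the group of all bijections of {0, 1, 2} under composition, which has order 6.

```python
# S3 as tuples: p represents the bijection i -> p[i] of {0, 1, 2}.
from itertools import permutations

G = list(permutations(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
e = (0, 1, 2)

def order(p):
    n, x = 1, p
    while x != e:
        x = compose(x, p)
        n += 1
    return n

orders = [order(p) for p in G]
print(sorted(orders))      # [1, 2, 2, 2, 3, 3]: every order divides o(G) = 6
```

The identity has order 1, the three transpositions have order 2, and the two 3-cycles have order 3 — each dividing 6, as part 4) requires.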
Exercises

i) Suppose G is a cyclic group of order 4, G = {e, a, a², a³} with a⁴ = e. Find the order of each element of G. Find all the subgroups of G.
ii) Suppose G is the additive group Z and H = 3Z. Find the cosets of H.
iii) Think of a circle as the interval [0, 1] with end points identified. Suppose G = R under addition and H = Z. Show that the collection of all the cosets of H can be thought of as a circle.
iv) Let G = R² under addition, and H be the subgroup defined by H = {(a, 2a) : a ∈ R}. Find the cosets of H. (See the last exercise on p 5.)

Normal Subgroups

We would like to make a group out of the collection of cosets of a subgroup H. In
general, there is no natural way to do that. However, it is easy to do in case H is a normal subgroup, which is described below.

Theorem If H is a subgroup of a group G, then the following are equivalent.

1) If a ∈ G, then aHa⁻¹ = H.
2) If a ∈ G, then aHa⁻¹ ⊂ H.
3) If a ∈ G, then aH = Ha.
4) Every right coset is a left coset, i.e., if a ∈ G, ∃ b ∈ G with Ha = bH.
Proof 1) ⇒ 2) is obvious. Suppose 2) is true and show 3). We have (aHa−1 )a ⊂ Ha so aH ⊂ Ha. Also a(a−1 Ha) ⊂ aH so Ha ⊂ aH. Thus aH = Ha. 3) ⇒ 4) is obvious. Suppose 4) is true and show 3). Ha = bH contains a, so bH = aH because a coset is an equivalence class. Thus aH = Ha. Finally, suppose 3) is true and show 1). Multiply aH = Ha on the right by a−1 . Deﬁnition If H satisﬁes any of the four conditions above, then H is said to be a normal subgroup of G. (This concept goes back to Evariste Galois in 1831.) Note For any group G, G and e are normal subgroups. If G is an abelian group, then every subgroup of G is normal. Exercise Show that if H is a subgroup of G with index 2, then H is normal.
Exercise Show the intersection of a collection of normal subgroups of G is a normal subgroup of G. Show the union of a monotonic collection of normal subgroups of G is a normal subgroup of G.

Exercise Let A ⊂ R² be the square with vertices (−1, 1), (1, 1), (1, −1), and (−1, −1), and G be the collection of all "isometries" of A onto itself. These are bijections of A onto itself which preserve distance and angles, i.e., which preserve dot product. Show that with multiplication defined as composition, G is a multiplicative group. Show that G has four rotations, two reflections about the axes, and two reflections about the diagonals, for a total of eight elements. Show the collection of rotations is a cyclic subgroup of order four which is a normal subgroup of G. Show that the reflection about the x-axis together with the identity form a cyclic subgroup of order two which is not a normal subgroup of G. Find the four right cosets of this subgroup. Finally, find the four left cosets of this subgroup.
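The isometry exercise above can be checked by brute force. In this sketch (assuming Python) each isometry is linear, so it is represented by a 2×2 matrix stored as a nested tuple; that representation is a convenience chosen for the sketch, not part of the exercise.

```python
def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def inv(A):
    # these matrices are orthogonal, so the inverse is the transpose
    return ((A[0][0], A[1][0]), (A[0][1], A[1][1]))

I = ((1, 0), (0, 1))
R90 = ((0, -1), (1, 0))          # rotation by 90 degrees
Fx = ((1, 0), (0, -1))           # reflection about the x-axis

rotations = {I, R90, mul(R90, R90), mul(R90, mul(R90, R90))}
G = rotations | {mul(Fx, S) for S in rotations}   # four rotations, four reflections
assert len(G) == 8

def is_normal(H, G):
    return all(mul(g, mul(h, inv(g))) in H for g in G for h in H)

assert is_normal(rotations, G)      # the rotations form a normal subgroup
assert not is_normal({I, Fx}, G)    # {identity, x-reflection} is not normal
```

Conjugating the x-reflection by the 90-degree rotation produces the y-reflection, which is exactly why the two-element subgroup fails to be normal.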
Quotient Groups

Suppose N is a normal subgroup of G, and C and D are cosets. We wish to define a coset E which is the product of C and D. If c ∈ C and d ∈ D, define E to be the coset containing c·d, i.e., E = N(c·d). The coset E does not depend upon the choice of c and d. This is made precise in the next theorem.

Theorem Suppose G is a multiplicative group, N is a normal subgroup, and G/N is the collection of all cosets. Then (Na)·(Nb) = N(a·b) is a well defined multiplication (binary operation) on G/N, and with this multiplication, G/N is a group. Its identity is N and (Na)⁻¹ = (Na⁻¹). Furthermore, if G is finite, o(G/N) = o(G)/o(N).

Proof Multiplication of elements in G/N is multiplication of subsets in G: (Na)·(Nb) = N(aN)b = N(Na)b = N(a·b). Once multiplication is well defined, the group axioms are immediate.

Example Suppose G = Z under +, n > 1, and N = nZ. Zn, the group of integers mod n, is defined by Zn = Z/nZ. If a is an integer, the coset a + nZ is denoted by [a]. Note that [a] + [b] = [a + b], −[a] = [−a], and [a] = [a + nl] for any integer l. Any additive abelian group has a scalar multiplication over Z (see the fifth exercise on page 18), and in this case it is just [a]m = [am]. Note that [a] = [r] where r is the remainder of a divided by n, and thus the distinct elements of Zn are [0], [1], ..., [n − 1]. Also Zn is cyclic because each of [1] and [−1] = [n − 1] is a generator. We already know that if p is a prime, any nonzero element of Zp is a generator, because Zp has p elements.

Theorem If n > 1 and a is any integer, then [a] is a generator of Zn iff (a, n) = 1.

Proof The element [a] is a generator iff the subgroup generated by [a] contains [1] iff ∃ an integer k such that [a]k = [1] iff ∃ integers k and l such that ak + nl = 1.

Exercise Show that a positive integer is divisible by 3 iff the sum of its digits is divisible by 3. Note that [10] = [1] in Z3.

Exercise Write out the above theorem for G an additive group. In the additive abelian group R/Z, determine those elements of finite order.
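Both statements above are easy to spot-check. This is a sketch assuming Python; n = 12 is an arbitrary illustrative modulus.

```python
from math import gcd

n = 12

def generates(a):
    # [a] generates Z_n iff its multiples fill out all of Z_n
    return {(a * k) % n for k in range(n)} == set(range(n))

assert all(generates(a) == (gcd(a, n) == 1) for a in range(n))

# the digit test for divisibility by 3, on a finite sample
assert all((m % 3 == 0) == (sum(map(int, str(m))) % 3 == 0) for m in range(1, 300))
```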
Homomorphisms

Homomorphisms are functions between groups that commute with the group operations. It follows that they honor identities and inverses. In this section we list the basic properties. Properties 11), 12), and 13) show the connections between coset groups and homomorphisms, and should be considered as the cornerstones of abstract algebra. As always, the student should rewrite the material in additive notation.

Definition If G and Ḡ are multiplicative groups, a function f : G → Ḡ is a homomorphism if, for all a, b ∈ G, f(a·b) = f(a)·f(b). On the left side, the group operation is in G, while on the right side it is in Ḡ. The kernel of f is defined by ker(f) = f⁻¹(ē) = {a ∈ G : f(a) = ē}. In other words, the kernel is the set of solutions to the equation f(x) = ē. (If Ḡ is an additive group, ker(f) = f⁻¹(0̄).)

Examples The constant map f : G → Ḡ defined by f(a) = ē is a homomorphism. If H is a subgroup of G, the inclusion i : H → G is a homomorphism. The function f : Z → Z defined by f(t) = 2t is a homomorphism of additive groups, while the function defined by f(t) = t + 2 is not a homomorphism. The function h : Z → R − 0 defined by h(t) = 2ᵗ is a homomorphism from an additive group to a multiplicative group.

We now catalog the basic properties of homomorphisms. These will be helpful later on in the study of ring homomorphisms and module homomorphisms.

Theorem Suppose G and Ḡ are groups and f : G → Ḡ is a homomorphism.
1) f(e) = ē.
2) f(a⁻¹) = f(a)⁻¹. The first inverse is in G, and the second is in Ḡ.
3) f is injective ⇔ ker(f) = e.
4) If H is a subgroup of G, f(H) is a subgroup of Ḡ. In particular, image(f) is a subgroup of Ḡ.
5) If H̄ is a subgroup of Ḡ, f⁻¹(H̄) is a subgroup of G. Furthermore, if H̄ is normal in Ḡ, then f⁻¹(H̄) is normal in G.
6) The kernel of f is a normal subgroup of G.
7) If ḡ ∈ Ḡ, f⁻¹(ḡ) is void or is a coset of ker(f), i.e., if f(g) = ḡ, then f⁻¹(ḡ) = Ng where N = ker(f). In other words, if the equation f(x) = ḡ has a solution, then the set of all solutions is a coset of N = ker(f).
This is a key fact which is used routinely in topics such as systems of equations and linear differential equations.

8) The composition of homomorphisms is a homomorphism, i.e., if h : Ḡ → G̿ is a homomorphism, then h ◦ f : G → G̿ is a homomorphism.
9) If f : G → Ḡ is a bijection, then the function f⁻¹ : Ḡ → G is a homomorphism. In this case, f is called an isomorphism, and we write G ≈ Ḡ. In the case G = Ḡ, f is also called an automorphism.
10) Isomorphisms preserve all algebraic properties. For example, if f is an isomorphism and H ⊂ G is a subset, then H is a subgroup of G iff f(H) is a subgroup of Ḡ, H is normal in G iff f(H) is normal in Ḡ, G is cyclic iff Ḡ is cyclic, etc. Of course, this is somewhat of a cop-out, because an algebraic property is one that, by definition, is preserved under isomorphisms.
11) Suppose H is a normal subgroup of G. Then π : G → G/H defined by π(a) = Ha is a surjective homomorphism with kernel H.
12) Suppose H is a normal subgroup of G. If H ⊂ ker(f), then f̄ : G/H → Ḡ defined by f̄(Ha) = f(a) is a well-defined homomorphism making the following diagram commute.

(Diagram: f : G → Ḡ equals the projection π : G → G/H followed by f̄ : G/H → Ḡ, i.e., f = f̄ ◦ π.)

Thus defining a homomorphism on a quotient group is the same as defining a homomorphism on the numerator which sends the denominator to e. The image of f̄ is the image of f, and the kernel of f̄ is ker(f)/H.
13) Given any group homomorphism f, domain(f)/ker(f) ≈ image(f). Thus if H = ker(f), then G/H ≈ image(f). In particular, if f : G → Ḡ is a surjective homomorphism with kernel H, then G/H ≈ Ḡ (see below). This is the fundamental connection between quotient groups and homomorphisms.
14) Suppose K is a group. Then K is an infinite cyclic group iff K is isomorphic to the integers under addition. K is a cyclic group of order n iff K ≈ Zn.

Proof of 14) Suppose K is generated by some element a. Then f : Z → K defined by f(m) = aᵐ is a homomorphism from an additive group to a multiplicative group. If o(a) is infinite, f is an isomorphism, i.e., K ≈ Z. If o(a) = n, ker(f) = nZ and f̄ : Zn → K is an isomorphism.

Exercise If a is an element of a group G, there is always a homomorphism from Z to G which sends 1 to a. When is there a homomorphism from Zn to G which sends [1] to a? What are the homomorphisms from Z2 to Z6? What are the homomorphisms from Z4 to Z8?

Exercise Suppose G is a group and g is an element of G.
1) Under what conditions on g is there a homomorphism f : Z7 → G with f([1]) = g?
2) Under what conditions on g is there a homomorphism f : Z15 → G with f([1]) = g?
3) Under what conditions on G is there an injective homomorphism f : Z15 → G?
4) Under what conditions on G is there a surjective homomorphism f : Z15 → G?

Exercise We know every finite group of prime order is cyclic and thus abelian. Show that every group of order four is abelian.

Exercise Let G = {h : [0, 1] → R : h has an infinite number of derivatives}. Then G is a group under addition. Define f : G → G by f(h) = dh/dt = h′. Show f is a homomorphism and find its kernel and image. Let g : [0, 1] → R be defined by g(t) = t³ − 3t + 4. Find f⁻¹(g) and show it is a coset of ker(f).

Exercise Let G be as above and g ∈ G. Define f : G → G by f(h) = h″ + 5h′ + 6t²h. Then f is a group homomorphism, and the differential equation h″ + 5h′ + 6t²h = g has a solution iff g lies in the image of f. Now suppose this equation has a solution and S ⊂ G is the set of all solutions. For which subgroup H of G is S an H-coset?
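Both themes of the exercises above appear in miniature below, assuming Python. For additive groups, a homomorphism Zm → Zk sending [1] to [a] exists iff m·a = 0 in Zk, and (by property 7) the solution set of f(x) = g is a coset of ker(f).

```python
def homs(m, k):
    # the values [a] for which [1] -> [a] extends to a homomorphism Z_m -> Z_k
    return [a for a in range(k) if (m * a) % k == 0]

assert homs(2, 6) == [0, 3]            # the homomorphisms Z_2 -> Z_6
assert homs(4, 8) == [0, 2, 4, 6]      # the homomorphisms Z_4 -> Z_8

# f(x) = 3x on Z_12: the solutions of f(x) = 6 form the coset 2 + ker(f)
n = 12
f = lambda x: (3 * x) % n
kernel = {x for x in range(n) if f(x) == 0}
solutions = {x for x in range(n) if f(x) == 6}
assert kernel == {0, 4, 8}
assert solutions == {(2 + h) % n for h in kernel}
```

The same coset picture is exactly what makes "particular solution plus homogeneous solutions" work for the differential equation in the last exercise.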
Exercise Suppose G is a multiplicative group and a ∈ G. Define f : G → G to be conjugation by a, i.e., f(g) = a⁻¹·g·a. Show that f is a homomorphism. Also show f is an automorphism, and find its inverse.

Permutations

Suppose X is a (non-void) set. A bijection f : X → X is called a permutation on X, and the collection of all these permutations is denoted by S = S(X). In this setting, variables are written on the left, i.e., f = (x)f. Therefore the composition f ◦ g means "f followed by g". S(X) forms a multiplicative group under composition.

Exercise Show that if there is a bijection between X and Y, there is an isomorphism between S(X) and S(Y). Thus if each of X and Y has n elements, S(X) ≈ S(Y).

The Symmetric Groups

Now let n ≥ 2 and let Sn be the group of all permutations on {1, 2, ..., n}. These groups are called the symmetric groups on n elements, and they are all denoted by the one symbol Sn.

Exercise Show that o(Sn) = n!. Let X = {1, 2, ..., n}, and H = {f ∈ Sn : (n)f = n}. Show H is a subgroup of Sn which is isomorphic to Sn−1. Let g be any permutation on X with (n)g = 1. Find g⁻¹Hg.

The next theorem shows that the symmetric groups are incredibly rich and complex.

Theorem (Cayley's Theorem) Suppose G is a multiplicative group with n elements and Sn is the group of all permutations on the set G. Then G is isomorphic to a subgroup of Sn.

Proof Let h : G → Sn be the function which sends a to the bijection ha : G → G defined by (g)ha = g·a. The proof follows from the following observations.
1) For each given a, ha is a bijection from G to G.
2) h is a homomorphism, i.e., ha·b = ha ◦ hb.
3) h is injective, and thus G is isomorphic to image(h) ⊂ Sn.
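Cayley's construction can be carried out concretely for a toy example. This sketch, assuming Python, uses G = Z6 under addition; each a in G is sent to the permutation ha : g → g + a, written as a tuple.

```python
n = 6

def h(a):
    return tuple((g + a) % n for g in range(n))   # the permutation h_a

def compose(p, q):
    # variables on the left, as in the text: (g)(p ∘ q) = ((g)p)q
    return tuple(q[p[g]] for g in range(n))

perms = {a: h(a) for a in range(n)}

assert len(set(perms.values())) == n    # h is injective
assert all(perms[(a + b) % n] == compose(perms[a], perms[b])
           for a in range(n) for b in range(n))   # h_{a+b} = h_a ∘ h_b
```

So Z6 sits inside S6 as the six "rotation" permutations, exactly as the theorem promises.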
The following definition shows that each element of Sn may be represented by a matrix.

Definition Suppose 1 < k ≤ n, {a1, a2, ..., ak} is a collection of distinct integers with 1 ≤ ai ≤ n, and {b1, b2, ..., bk} is the same collection in some different order. Then the matrix

  a1 a2 ... ak
  b1 b2 ... bk

represents f ∈ Sn defined by (ai)f = bi for 1 ≤ i ≤ k, and (a)f = a for all other a. The composition of two permutations is computed by applying the matrix on the left first and the matrix on the right second.

There is a special type of permutation called a cycle. For these we have a special notation. The matrix

  a1 a2 ... ak−1 ak
  a2 a3 ...  ak  a1

is called a k-cycle, and is denoted by (a1, a2, ..., ak). A 2-cycle is called a transposition. The cycles (a1, ..., ak) and (c1, ..., cℓ) are disjoint provided ai ≠ cj for all 1 ≤ i ≤ k and 1 ≤ j ≤ ℓ.

Listed here are eight basic properties of permutations. They are all easy except 4), which takes a little work.

Theorem
1) Disjoint cycles commute. (This is obvious.)
2) Every non-identity permutation can be written uniquely (except for order) as the product of disjoint cycles. (This is easy.)
3) Every permutation can be written (non-uniquely) as the product of transpositions. (Proof: I = (1, 2)(1, 2) and (a1, ..., ak) = (a1, a2)(a1, a3)···(a1, ak).)
4) The parity of the number of these transpositions is unique. This means that if f is the product of p transpositions and also of q transpositions, then p is even iff q is even. In this case, f is said to be an even permutation. In the other case, f is an odd permutation.
5) A k-cycle is even (odd) iff k is odd (even). For example, (1, 2, 3) = (1, 2)(1, 3) is an even permutation.
6) Suppose f, g ∈ Sn. If one of f and g is even and the other is odd, then g ◦ f is odd. If f and g are both even or both odd, then g ◦ f is even.
7) The map h : Sn → Z2 defined by h(even) = [0] and h(odd) = [1] is a homomorphism from a multiplicative group to an additive group. Its kernel (the subgroup of even permutations) is denoted by An and is called the alternating group. Thus An is a normal subgroup of index 2, and Sn/An ≈ Z2.
8) If a, b, c and d are distinct integers in {1, 2, ..., n}, then (a, b)(b, c) = (a, c, b) and (a, b)(c, d) = (a, c, d)(a, c, b). It follows that for n ≥ 3, every even permutation is the product of 3-cycles.

The following parts are not included in this course. They are presented here merely for reference.

9) For any n ≠ 4, An is simple, i.e., has no proper normal subgroups.
10) Sn can be generated by two elements. In fact, {(1, 2), (1, 2, ..., n)} generates Sn. (Of course there are subgroups of Sn which cannot be generated by two elements.)

Proof of 4) It suffices to prove that if the product of t transpositions is the identity I on {1, 2, ..., n}, then t is even. Suppose this is false, and I is written as the product of t transpositions, where t is the smallest odd integer for which this is possible. Since t is odd, it is at least 3. Suppose for convenience the first transposition is (a, n). We will rewrite I as a product of transpositions σ1σ2···σt where (n)σi = n for 1 ≤ i < t and (n)σt ≠ n, which will be a contradiction. This can be done by inductively "pushing n to the right" using the equations below. If a, b and c are distinct integers in {1, 2, ..., n − 1}, then (a, n)(b, c) = (b, c)(a, n), (a, n)(a, c) = (a, c)(c, n), and (a, n)(b, n) = (a, b)(a, n). Note that (a, n)(a, n) = I cannot occur here, because it would result in a shorter odd product. Once the rewriting is complete, each of σ1, ..., σt−1 fixes n while σt does not, so (n)I = (n)σt ≠ n, which is impossible. (Now you may solve the tile puzzle on page viii.)

Exercise
1) Write

  1 2 3 4 5 6 7
  6 5 4 3 1 7 2

as the product of disjoint cycles. Write (1, 5, 6, 7)(2, 3, 4)(3, 7, 1) as the product of disjoint cycles. Write (3, 7, 1)(1, 5, 6, 7)(2, 3, 4) as the product of disjoint cycles. Which of these permutations are odd and which are even?
2) Suppose (a1, ..., ak) and (c1, ..., cℓ) are disjoint cycles. What is the order of their product?
3) Suppose σ ∈ Sn. Show that σ⁻¹(1, 2, 3)σ = ((1)σ, (2)σ, (3)σ). This shows that conjugation by σ is just a type of relabeling. Also let τ = (4, 5, 6) and find τ⁻¹(1, 5)τ.
4) Show that H = {σ ∈ S6 : (6)σ = 6} is a subgroup of S6 and find its right cosets and its left cosets.
5) Let A ⊂ R² be the square with vertices (−1, 1), (1, 1), (1, −1), and (−1, −1), and G be the collection of all isometries of A onto itself. We know from a previous exercise that G is a group with eight elements. It follows from Cayley's theorem that G is isomorphic to a subgroup of S8. Show that G is isomorphic to a subgroup of S4.
6) If G is a multiplicative group, define a new multiplication on the set G by a ◦ b = b·a. In other words, the new multiplication is the old multiplication in the opposite order. This defines a new group denoted by Gᵒᵖ, the opposite group. Show that it has the same identity and the same inverses as G, and that f : G → Gᵒᵖ defined by f(a) = a⁻¹ is a group isomorphism. Now consider the special case G = Sn. An element of Sn is a permutation on {1, 2, ..., n} with the variable written on the left. Show that an element of Snᵒᵖ is a permutation on {1, 2, ..., n} with the variable written on the right. (Of course, either Sn or Snᵒᵖ may be called the symmetric group, depending on personal preference or context.)

Product of Groups

The product of groups is usually presented for multiplicative groups. It is presented here for additive groups because this is the form that occurs in later chapters. The two theorems below are transparent and easy, but quite useful. For background, read first the two theorems on page 11. For simplicity we first consider the product of two groups, although the case of infinite products is only slightly more difficult.
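Before turning to products, the cycle and parity properties of permutations above are easy to spot-check. This sketch assumes Python and, for convenience, permutations on {0, ..., n−1} written as tuples with (i)p = p[i].

```python
def cycles(p):
    seen, out = set(), []
    for i in range(len(p)):
        if i not in seen and p[i] != i:
            c, j = [], i
            while j not in seen:
                seen.add(j)
                c.append(j)
                j = p[j]
            out.append(tuple(c))
    return out

def is_even(p):
    # a k-cycle is the product of k - 1 transpositions (property 3)
    return sum(len(c) - 1 for c in cycles(p)) % 2 == 0

p = (1, 2, 0, 4, 3)                    # the permutation (0,1,2)(3,4)
assert cycles(p) == [(0, 1, 2), (3, 4)]
assert not is_even(p)                  # a 3-cycle (even) times a transposition (odd) is odd
```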
Theorem Suppose G1 and G2 are additive groups. Define an addition on G1 × G2 by (a1, a2) + (b1, b2) = (a1 + b1, a2 + b2). This operation makes G1 × G2 into a group. Its "zero" is (01, 02) and −(a1, a2) = (−a1, −a2). The projections π1 : G1 × G2 → G1 and π2 : G1 × G2 → G2 are group homomorphisms. Suppose G is an additive group. We know there is a bijection from {functions f : G → G1 × G2} to {ordered pairs of functions (f1, f2) where f1 : G → G1 and f2 : G → G2}. Under this bijection, f is a group homomorphism iff each of f1 and f2 is a group homomorphism.

Proof It is transparent that the product of groups is a group, so let's prove the last part. Suppose G, G1, and G2 are groups and f = (f1, f2) is a function from G to G1 × G2. Now f(a + b) = (f1(a + b), f2(a + b)) and f(a) + f(b) = (f1(a), f2(a)) + (f1(b), f2(b)) = (f1(a) + f1(b), f2(a) + f2(b)). An examination of these two equations shows that f is a group homomorphism iff each of f1 and f2 is a group homomorphism.

Exercise Show that G1 × G2 and G2 × G1 are isomorphic.

Exercise If o(a1) = m and o(a2) = n, find the order of (a1, a2) in G1 × G2.

Exercise Let R be the reals under addition. Show that the addition in the product R × R is just the usual addition in analytic geometry.

Exercise Suppose G1 and G2 are groups and i1 : G1 → G1 × G2 is defined by i1(g1) = (g1, 02). Show i1 is an injective group homomorphism and its image is a normal subgroup of G1 × G2. Usually G1 is identified with its image under i1, so G1 may be considered to be a normal subgroup of G1 × G2. Let π2 : G1 × G2 → G2 be the projection map defined in the Background chapter. Show π2 is a surjective homomorphism with kernel G1. Therefore (G1 × G2)/G1 ≈ G2, as you would expect.

Exercise Show Z12 is isomorphic to Z4 × Z3. Show Z4 is not isomorphic to Z2 × Z2. Finally, show that Zmn is isomorphic to Zm × Zn iff (m, n) = 1.

Exercise Show that if G is any group of order 4, G is isomorphic to Z4 or Z2 × Z2.

Exercise Suppose n > 2. Is Sn isomorphic to An × G where G is a multiplicative group of order 2?

One nice thing about the product of groups is that it works fine for any finite number, or even any infinite number. The next theorem is stated in full generality.

Theorem Suppose T is an index set, and for each t ∈ T, Gt is an additive group. Define an addition on the product ∏t∈T Gt by {at} + {bt} = {at + bt}. This operation makes the product into a group. Its "zero" is {0t} and −{at} = {−at}. Also, the scalar multiplication on ∏Gt by integers is given coordinatewise, i.e., {at}n = {at·n}. Suppose G is an additive group. Under the natural bijection from {functions f : G → ∏Gt} to {sequences of functions {ft}t∈T where ft : G → Gt}, f is a group homomorphism iff each ft is a group homomorphism.

Proof The addition on ∏Gt is coordinatewise.

Exercise Suppose s is an element of T and πs : ∏Gt → Gs is the projection map defined in the Background chapter. Show πs is a surjective homomorphism and find its kernel.

Exercise Suppose s is an element of T and is : Gs → ∏Gt is defined by is(a) = {at} where at = 0 if t ≠ s and as = a. Show is is an injective homomorphism and its image is a normal subgroup of ∏Gt. Thus each Gs may be considered to be a normal subgroup of ∏Gt.

Exercise Let f : Z → Z30 × Z100 be the homomorphism defined by f(m) = ([4m], [3m]). Find the kernel of f. Find the order of ([4], [3]) in Z30 × Z100.

Exercise Let f : Z → Z90 × Z70 × Z42 be the group homomorphism defined by f(m) = ([m], [m], [m]). Find the kernel of f and show that f is not surjective. Let g : Z → Z45 × Z35 × Z21 be defined by g(m) = ([m], [m], [m]). Find the kernel of g and determine if g is surjective. Note that the gcd of {45, 35, 21} is 1. Now let h : Z → Z8 × Z9 × Z35 be defined by h(m) = ([m], [m], [m]). Find the kernel of h and show that h is surjective. Finally, suppose each of b, c, and d is greater than 1 and f : Z → Zb × Zc × Zd is defined by f(m) = ([m], [m], [m]). Find necessary and sufficient conditions for f to be surjective (see the first exercise on page 18). (For the ring and module versions, see exercises on pages 44 and 69.)

Exercise Suppose T is a non-void set, G is an additive group, and G^T is the collection of all functions f : T → G with addition defined by (f + g)(t) = f(t) + g(t). Show G^T is a group. Note that G^T is just another way of writing ∏t∈T Gt where each Gt = G. Also note that if T = [0, 1] and G = R, the addition defined on G^T is just the usual addition of functions used in calculus.
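The Zmn exercise above can be settled by brute force. This sketch, assuming Python, checks when the map m → ([m], [m]) from Zjk to Zj × Zk is a bijection.

```python
from math import gcd

def is_iso(j, k):
    # m -> (m mod j, m mod k) is injective on Z_{jk} iff it hits all j*k pairs
    return len({(m % j, m % k) for m in range(j * k)}) == j * k

assert is_iso(4, 3)                    # Z_12 ≈ Z_4 x Z_3
assert not is_iso(2, 2)                # Z_4 is not isomorphic to Z_2 x Z_2
assert all(is_iso(j, k) == (gcd(j, k) == 1)
           for j in range(2, 8) for k in range(2, 8))
```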
Chapter 3

Rings

Rings are additive abelian groups with a second operation called multiplication. The connection between the two operations is provided by the distributive law. Assuming the results of Chapter 2, this chapter flows smoothly. This is because ideals are also normal subgroups and ring homomorphisms are also group homomorphisms. We do not show that the polynomial ring F[x] is a unique factorization domain, although with the material at hand, it would be easy to do. Also there is no mention of prime or maximal ideals, because these concepts are unnecessary for our development of linear algebra. These concepts are developed in the Appendix. A section on Boolean rings is included because of their importance in logic and computer science.

Definition Suppose R is an additive abelian group, R ≠ 0̄, and R has a second binary operation (i.e., map from R × R to R) which is denoted by multiplication. Consider the following properties.
1) If a, b, c ∈ R, (a·b)·c = a·(b·c). (The associative property of multiplication.)
2) If a, b, c ∈ R, a·(b + c) = (a·b) + (a·c) and (b + c)·a = (b·a) + (c·a). (The distributive law, which connects addition and multiplication.)
3) R has a multiplicative identity, i.e., there is an element 1 = 1R ∈ R such that if a ∈ R, a·1 = 1·a = a.
4) If a, b ∈ R, a·b = b·a. (The commutative property for multiplication.)

Definition If 1), 2), and 3) are satisfied, R is said to be a ring. If in addition 4) is satisfied, R is said to be a commutative ring.
Since R ≠ 0̄, it follows that 1 ≠ 0̄. The next two theorems show that ring multiplication behaves as you would wish it to. They should be worked as exercises.

Theorem Suppose R is a ring and a, b ∈ R. Then a·0̄ = 0̄·a = 0̄ and (−a)·b = a·(−b) = −(a·b).

Recall that, since R is an additive abelian group, it has a scalar multiplication over Z (page 20). This scalar multiplication can be written on the right or left, i.e., na = an, and the next theorem shows it relates nicely to the ring multiplication.

Theorem Suppose a, b ∈ R and n, m ∈ Z.
1) (na)·(mb) = (nm)(a·b). (This follows from the distributive law and the previous theorem.)
2) Let n̄ = n1 (i.e., 2̄ = 1 + 1, etc.). Then na = n̄·a, that is, scalar multiplication by n is the same as ring multiplication by n̄. Note that n̄ may be 0̄ even though n ≠ 0.
rational numbers Q. ¯ Theorem 0 can never be a unit. Since R = 0. Now suppose R is any ring. In the next chapter. The product of units is a unit with (a · b)−1 = b−1 · a−1 . scalar ¯ ¯ ¯ ¯ ¯ ¯ multiplication by n is the same as ring multiplication by n. x2 . it has a scalar multiplication over Z (page 20). ¯ ¯ Units Deﬁnition An element a of a ring R is a unit provided ∃ an element a−1 ∈ R with a · a−1 = a−1 · a = 1. If n > 1. They should be worked as exercises. This is a basic example of a noncommutative ring. i. since R is an additive abelian group. ¯ Of course. The next two theorems show that ring multiplication behaves as you would wish it to.. and the next theorem shows it relates nicely to the ring multiplication. a · 0 = 0 · a = 0. operations of addition and multiplication of matrices will be deﬁned. More
2)
. If a is a unit. 2 = 1 + 1. and the complex numbers C. Theorem 1) Suppose a. a−1 is also a ¯ −1 ¯ unit with (a )−1 = a. This scalar multiplication can be written on the right or left. Rn is a ring. na = an. (na) · (mb) = (nm)(a · b). even if R is commutative. .
Recall that. xn ]. n may be 0 even though n = 0. Rn is never commutative. the real numbers R. It will be shown later that Zn . the integers mod n. and Rn is the collection of all n×n matrices over R. that is. For example. a polynomical ring in n variables. . Also if R is any commutative ring. 1 is always a unit. . b ∈ R and n. Theorem 1) 2) Suppose R is a ring and a. (This follows from the distributive law and the previous theorem.
. Deﬁnition Suppose R is a commutative ring. (−a) is a unit and (−a)−1 = −(a−1 ).. Theorem Suppose a ∈ R and ∃ elements b and c with b · a = a · c = 1. ¯ by the pigeonhole principle. In other words. The set of all units of R forms a multiplicative group denoted by n−1 1 n R∗ . Theorem Suppose R is a commutative ring and a ∈ (R − 0) is not a zero divisor. . Theorem A ﬁeld is a domain. we ﬁrst consider the concept of zero divisor. it cannot be a zero divisor. A ﬁeld is a commutative ring such that. It suﬃces to require a left inverse and a right inverse. Finally if a is a unit. In order for a to be a unit. then their product is a unit with (a1 · a2 · · · an )−1 = a−1 · a−1 · · · a−1 . if a1 .
Proof A ﬁeld is a domain because a unit cannot be a zero divisor. A ﬁnite domain is a ﬁeld. An element a ∈ R is called a zero divisor provided it is nonzero and ∃ a nonzero element b with a · b = 0. Then f : R → R deﬁned by f (b) = a · b is injective and.Chapter 3
Rings
39
generally. f is surjective. It is surjective iﬀ a is a unit.
. Deﬁnition A domain (or integral domain) is a commutative ring such that.. Then ¯ b = c and so a is a unit with a−1 = b = c. multiplication by a is an injective map from R to R. Note that ¯ if a is a unit. as shown in the next theorem. a is ¯ ¯ a unit. a2 . it must have a twosided inverse.
Corollary
Domains and Fields In order to deﬁne these two types of rings. ¯ Then (a · b = a · c) ⇒ b = c. if a = 0. ¯ ¯ Inverses are unique. Thus a is a unit and so R is a ﬁeld. Proof b = b · 1 = b · (a · c) = (b · a) · c = 1 · c = c. In other words. an are units. R is a ﬁeld if it is commutative and its nonzero elements form a group under multiplication. if a = 0. a is not a zero divisor. Suppose R is a ﬁnite domain and a = 0.
Then n = ab where 1 < a < n. and C are ﬁelds.. 0) and if i = (0. n is a prime. Thus 1) and 3) are equivalent. Then by the previous theorem. the multiplication Proof is welldeﬁned. This is a well deﬁned binary operation which makes Zn into a commutative ring. Show C is a commutative ring which is a ﬁeld. It leads to a neat little theory. Note that 1 = (1. then i2 = −1. the following are equivalent. [2].[n − 1] is a unit.. because Zn is ﬁnite. [a] is a generator of the additive group Zn . ¯ ¯ Examples Z is a domain. However.) We know from page 27 that Zn is an additive abelian group. ad + bc).
. Corollary 1) 2) 3) If n > 1. b) · (c.. Deﬁne a multiplication on Zn by [a] · [b] = [ab]. The Integers Mod n The concept of integers mod n is fundamental in mathematics. because each says ∃ an integer b with [a]b = [1]. Theorem 1) 2) 3) Suppose n > 1 and a ∈ Z. Since [a + kn] · [b + l n] = [ab + n(al + bk + kl n)] = [ab].. (a. R. (See the Chinese Remainder Theorem on page 50.
Proof We already know 1) and 2) are equivalent. and thus 2) is true. as seen by the theorems below. [a] is a unit of the ring Zn .
Proof We already know from page 27 that 1) and 2) are equivalent. Theorem Suppose n > 1. each of [1]. 1). the basic theory cannot be completed until the product of rings is deﬁned. Then the following are equivalent. Deﬁne multiplication by (a. Now suppose 3) is false. The ring axioms are easily veriﬁed. Q.40
Rings
Chapter 3
Exercise Let C be the additive abelian group R2 . Zn is a domain. 1 < b < n. Recall that if b is an integer. d) = (ac − bd. Suppose 3) is true. [a]b = [a] · [b] = [ab]. Zn is a ﬁeld. n) = 1.
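The unit criterion above is easy to confirm by brute force. This sketch assumes Python; the moduli are chosen for illustration.

```python
from math import gcd

def units(n):
    # [a] is a unit iff some [b] satisfies [a][b] = [1]
    return {a for a in range(n) if any((a * b) % n == 1 for b in range(n))}

assert units(12) == {1, 5, 7, 11}
assert units(7) == {1, 2, 3, 4, 5, 6}   # every non-zero element: Z_7 is a field
assert all(units(n) == {a for a in range(n) if gcd(a, n) == 1}
           for n in range(2, 30))
```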
Exercise List the units and their inverses for Z7 and Z12. Show that (Z7)* is a cyclic group but (Z12)* is not. Show that in Z12 the equation x² = 1̄ has four solutions. Finally, show that if R is a domain, x² = 1̄ can have at most two solutions in R (see the first theorem on page 46).

Subrings

Suppose S is a subset of a ring R. The statement that S is a subring of R means that S is a subgroup of the group R, 1 ∈ S, and (a, b ∈ S ⇒ a·b ∈ S). Then clearly S is a ring and has the same multiplicative identity as R. Note that Z is a subring of Q, Q is a subring of R, and R is a subring of C. Note also that Z and Zn have no proper subrings, and thus occupy a special place in ring theory, as well as in group theory. Note that if S is a subring of R and s ∈ S, then s may be a unit in R but not in S. Subrings do not play a role analogous to subgroups. That role is played by ideals, and an ideal is never a subring (unless it is the entire ring).

Ideals and Quotient Rings

Ideals in ring theory play a role analogous to normal subgroups in group theory.

Definition A subset I of a ring R is a right (left, 2-sided) ideal provided it is a subgroup of the additive group R and, if a ∈ R and b ∈ I, then b·a ∈ I (a·b ∈ I, both a·b and b·a ∈ I). The word "ideal" means "2-sided ideal". Of course, if R is commutative, every right or left ideal is an ideal.

Theorem Suppose R is a ring.
1) R and 0̄ are ideals of R. These are called the improper ideals.
2) If {It}t∈T is a collection of right (left, 2-sided) ideals of R, then the intersection of the It is a right (left, 2-sided) ideal of R. (See page 22.)
3) Furthermore, if the collection is monotonic, then the union of the It is a right (left, 2-sided) ideal of R.
4) If a ∈ R, I = aR is a right ideal. Thus if R is commutative, aR is an ideal, called a principal ideal.
5) If I ⊂ R is an ideal, then the following are equivalent: i) I = R, ii) I contains some unit u, iii) I contains 1.

Observation If R = Z and I = nZ, n > 1, then I is an ideal, because it is of the form aR. Thus every subgroup of Z is a principal ideal.

The following theorem is just an observation, but it is in some sense the beginning of ring theory.

Theorem Suppose R is a ring and I ⊂ R is an ideal, I ≠ R. Since I is a normal subgroup of the additive group R, R/I is an additive abelian group. Multiplication of cosets defined by (a + I)·(b + I) = (ab + I) is well-defined and makes R/I a ring.

Proof (a + I)·(b + I) = a·b + aI + Ib + II ⊂ a·b + I. Thus multiplication is well defined. The multiplicative identity is (1 + I), and the ring axioms are easily verified.

Observation If R = Z and I = nZ, the ring structure on Zn = Z/nZ is the same as the one previously defined.

Exercise Suppose R is a commutative ring. Show that R is a field iff R contains no proper ideals.

Homomorphisms

Definition Suppose R and R̄ are rings. A function f : R → R̄ is a ring homomorphism provided
1) f is a group homomorphism,
2) f(1R) = 1R̄, and
3) if a, b ∈ R, then f(a·b) = f(a)·f(b). (On the left, multiplication is in R, while on the right multiplication is in R̄.)

From now on the word "homomorphism" means "ring homomorphism".
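The three conditions are concrete enough to test. This sketch, assuming Python, checks them for the natural projection f : Z → Z6, f(a) = [a], on a finite sample of inputs (an unavoidable restriction in a finite test).

```python
n = 6

def f(a):
    return a % n

sample = range(-20, 21)
assert f(1) == 1                                                   # f(1_R) = 1_R̄
assert all(f(a + b) == (f(a) + f(b)) % n for a in sample for b in sample)  # group homomorphism
assert all(f(a * b) == (f(a) * f(b)) % n for a in sample for b in sample)  # multiplicative
```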
Rings ¯ is in R. f is also called a ring automorphism. then R/I ≈ R (see below). if f : R → R is a surjective ¯ ring homomorphism with kernel I. Then π is a surjective ring ¯ homomorphism with kernel I. π(a) = (a + I). ¯ Suppose f : R → R is a homomorphism and I is an ideal of R.)
43
The kernel of f is the kernel of f considered as a group homomorphism. ¯ If f : R → R is a bijection which is a ring homomorphism. Theorem 1) 2) ¯ Suppose each of R and R is a ring.
The identity map IR : R → R is a ring homomorphism. ¯ ¯ In fact. In the case R = R. ¯ The zero map from R to R is not a ring homomorphism (because it does not send 1R to 1R ). I = R. if f : R → R is a homomorphism and I ⊂ R is an ideal. and π : R → R/I is the natural projection. ¯ ¯ ¯ If I ⊂ ker(f ). ¯
Here is a list of the basic properties of ring homomorphisms. Furthermore. Such an f is called then f : R ¯ a ring isomorphism. −1 ¯ → R is a ring homomorphism. Suppose I is an ideal of R. The kernel of a ring homomorphism is an ideal of the domain. while on the right multiplication is in R.
This means if f and g are functions from T to R. R is a ring. Exercise Suppose T is a nonvoid set.44
Rings
Chapter 3
is a welldeﬁned homomorphism making the following diagram commute. and it is only necessary ¯ to check that f is a ring homomorphism. Thus if I = ker(f ). show that f : R → R deﬁned by f (a) = u−1 · a · u is a ring homomorphism which is an isomorphism. then conjugation by u is an automorphism on R. The image of f is the image of f .
. Proof We know all this on the group level.
Exercise Find a ring R with a proper ideal I and an element b such that b is not a unit in R but (b + I) is a unit in R/I. and RT is the collection of all functions f : T → R.e. Exercise Show that if u is a unit in a ring R.1] be the collection of all C ∞ functions. and ¯ ¯ the kernel of f is ker(f )/I. A ={f : [0. Notice that much of the work has been done in the previous exercise. Let A ⊂ R[0.1] . R π
c &&
f
&
E b & &
¯ R
&
&
&
¯ f
R/I Thus deﬁning a homomorphism on a quotient ring is the same as deﬁning a homomorphism on the numerator which sends the ¯ denominator to zero. Deﬁne addition and multiplication on RT pointwise. Exercise Now consider the case T = [0.. 1] and R = R. It is only necessary to show that A is a subring of the ring R[0. 1] → R : f has an inﬁnite number of derivatives}. domain(f )/ker(f ) ≈ image(f ). deﬁne a function α∗ (f ) : S → R by α∗ (f ) = f ◦ α. If f : T → R is a function. i. Show α∗ : RT → RS is a ring homomorphism. Show that under these operations R T is a ring. That is. and so R/I ≈ image (f ). then (f + g)(t) = f (t) + g(t) and (f · g)(t) = f (t)g(t). 9) Given any ring homomorphism f. which is obvious. Show A is a ring. f is injective. Suppose S is a nonvoid set and α : S → T is a function.
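The well-definedness of coset multiplication in the theorem above can be checked computationally. Here is a minimal sketch for the case R = Z and I = nZ; the helper names are illustrative, not from the text.

```python
# Cosets of nZ in Z, with the multiplication (a + I)(b + I) = ab + I.
# Changing the representatives a and b by multiples of n leaves the
# product coset unchanged, which is exactly the well-definedness claim.

def coset(a, n):
    """Canonical representative of the coset a + nZ."""
    return a % n

def coset_mul(a, b, n):
    return coset(a * b, n)

n = 6
# Replace a by a + 2n and b by b - 3n: the product coset is the same.
assert coset_mul(4, 5, n) == coset_mul(4 + 2 * n, 5 - 3 * n, n)
# The multiplicative identity is the coset 1 + nZ.
assert all(coset_mul(1, b, n) == coset(b, n) for b in range(n))
```

This also illustrates the Observation above: the quotient ring structure on Z/nZ agrees with the familiar arithmetic of Zn.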
Polynomial Rings

In calculus, we consider real functions f which are polynomials, f(x) = a₀ + a₁x + ··· + aₙxⁿ. The sum and product of polynomials are again polynomials, and it is easy to see that the collection of polynomial functions forms a commutative ring. We can do the same thing formally in a purely algebraic setting.

Definition   Suppose R is a commutative ring and x is a "variable" or "symbol". The polynomial ring R[x] is the collection of all polynomials f = a₀ + a₁x + ··· + aₙxⁿ where each aᵢ ∈ R. To be more formal, think of a polynomial a₀ + a₁x + ··· as an infinite sequence (a₀, a₁, ...) such that each aᵢ ∈ R and only a finite number are nonzero. Then

   (a₀, a₁, ...) + (b₀, b₁, ...) = (a₀ + b₀, a₁ + b₁, ...)   and
   (a₀, a₁, ...) · (b₀, b₁, ...) = (a₀b₀, a₀b₁ + a₁b₀, a₀b₂ + a₁b₁ + a₂b₀, ...).

Under the obvious addition and multiplication, R[x] is a commutative ring. Note that on the right, as is often done for convenience, the ring multiplication a · b is written simply as ab. The degree of a nonzero polynomial f is the largest integer n such that aₙ ≠ 0, and is denoted by n = deg(f). If the top term aₙ = 1, then f is said to be monic.

Theorem   If R is a domain, R[x] is also a domain.

Proof   Suppose f and g are nonzero polynomials. Let aᵢxⁱ and bⱼxʲ be the top nonzero terms of f and g. Then aᵢbⱼx^(i+j) is the top nonzero term of fg. Thus deg(f) + deg(g) = deg(fg), and so fg is not 0. Another way to prove this theorem is to look at the bottom terms instead of the top terms.

Theorem (The Division Algorithm)   Suppose R is a commutative ring, f ∈ R[x] has degree ≥ 1, and its top coefficient is a unit in R. (If R is a field, the top coefficient of f will always be a unit.) Then for any g ∈ R[x], ∃! h, r ∈ R[x] such that g = fh + r with r = 0 or deg(r) < deg(f).

Proof   This theorem states the existence and uniqueness of the polynomials h and r. We outline the proof of existence and leave uniqueness as an exercise. The idea is to divide f into g until the remainder has degree less than m. Suppose f = a₀ + a₁x + ··· + aₘxᵐ where m ≥ 1 and aₘ is a unit in R. The proof is by induction on the degree of g. For any g with deg(g) < m, set h = 0 and r = g. For the general case, suppose g has degree n ≥ m and the result holds for any polynomial of degree less than n. Now ∃ a monomial bxᵗ with t = n − m and deg(g − fbxᵗ) < n. By induction, ∃ h₁ and r with fh₁ + r = (g − fbxᵗ) and r = 0 or deg(r) < m. The result follows from the equation f(h₁ + bxᵗ) + r = g.

Note   If r = 0 we say that f divides g. Note that f = x − c divides g iff c is a root of g, i.e., g(c) = 0. More generally, x − c divides g with remainder g(c). This is a good exercise in the use of the division algorithm.

Exercise   Suppose g is a nonconstant polynomial in ℝ[x]. Show that if g has odd degree then it has a real root. Also show that if g(x) = x² + bx + c, then it has a real root iff b² ≥ 4c, and in that case both roots belong to ℝ.

Theorem   Suppose R is a domain, n > 0, and g(x) = a₀ + a₁x + ··· + aₙxⁿ is a polynomial of degree n with at least one root in R. Then g has at most n roots. Let c₁, c₂, ..., cₖ be the distinct roots of g in the ring R. Then ∃ a unique sequence of positive integers n₁, n₂, ..., nₖ and a unique polynomial h with no root in R so that g(x) = (x − c₁)ⁿ¹ ··· (x − cₖ)ⁿᵏ h(x). (If h has degree 0, i.e., if h = aₙ, then we say "all the roots of g belong to R". If g = aₙxⁿ, we say "all the roots of g are 0".)

Proof   The theorem is clearly true for n = 1. Suppose n > 1 and the theorem is true for any polynomial of degree less than n. Now suppose g is a polynomial of degree n and c₁ is a root of g. Then ∃ a polynomial h₁ with g(x) = (x − c₁)h₁. Since h₁ has degree less than n, the result follows by induction.

Note   If g is any nonconstant polynomial in ℂ[x], all the roots of g belong to ℂ, i.e., ℂ is an algebraically closed field. This is called The Fundamental Theorem of Algebra, and it is assumed without proof for this textbook.

Theorem   Suppose F is a field, I is a proper ideal of F[x], and n is the smallest positive integer such that I contains a polynomial of degree n. Then I contains a unique polynomial of the form f = a₀ + a₁x + ··· + aₙ₋₁xⁿ⁻¹ + xⁿ, and it has the property that I = fF[x]. Furthermore, each coset of I can be written uniquely in the form (c₀ + c₁x + ··· + cₙ₋₁xⁿ⁻¹ + I).

Proof   Uniqueness is easy, so let's prove existence. This is a good exercise in the use of the division algorithm, similar to showing that a subgroup of Z is generated by one element (see page 15).

Definition   A domain T is a principal ideal domain (PID) if, given any ideal I, ∃ t ∈ T such that I = tT. Thus F[x] is a PID. Note that Z is a PID and any field is a PID.
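The existence half of the Division Algorithm above is constructive, and can be sketched as code. Here polynomials over Z are coefficient lists [a₀, a₁, ...] and the divisor is assumed monic (so its top coefficient is a unit); the function name is illustrative, not from the text.

```python
# Division algorithm in R[x]: given g and a monic f of degree >= 1,
# produce h, r with g = f*h + r and r = 0 or deg(r) < deg(f).
# This follows the proof: repeatedly subtract f * b * x^t to kill
# the top term of g, where t = deg(g) - deg(f).

def poly_divmod(g, f):
    """Return (h, r) with g = f*h + r; f must be monic of degree >= 1."""
    assert len(f) >= 2 and f[-1] == 1, "f must be monic, degree >= 1"
    g = g[:]                              # work on a copy of g
    h = [0] * max(1, len(g) - len(f) + 1)
    while len(g) >= len(f) and any(g):
        t = len(g) - len(f)               # the shift in the monomial b*x^t
        b = g[-1]
        h[t] = b
        for i, c in enumerate(f):         # subtract f * b * x^t
            g[t + i] -= c * b
        while len(g) > 1 and g[-1] == 0:
            g.pop()                       # drop the cancelled top terms
    return h, g

# g = x^3 + 2x + 5 divided by f = x - 1: the remainder is g(1) = 8,
# illustrating the note that x - c divides g with remainder g(c).
h, r = poly_divmod([5, 2, 0, 1], [-1, 1])
assert r == [8]
```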
Theorem   Suppose R is a subring of a commutative ring C and c ∈ C. Then ∃! homomorphism h : R[x] → C with h(x) = c and h(r) = r for all r ∈ R. It is defined by h(a₀ + a₁x + ··· + aₙxⁿ) = a₀ + a₁c + ··· + aₙcⁿ, i.e., h sends f(x) to f(c). This map h is called an evaluation map. The theorem says that adding two polynomials in R[x] and evaluating is the same as evaluating and then adding in C. Also multiplying two polynomials in R[x] and evaluating is the same as evaluating and then multiplying in C. In street language the theorem says you are free to send x wherever you wish and extend to a ring homomorphism on R[x]. The image of h is the smallest subring of C containing R and c.

Exercise   Let C = {a + bi : a, b ∈ ℝ}. Since ℝ is a subring of C, there exists a homomorphism h : ℝ[x] → C which sends x to i, and this h is surjective. Show ker(h) = (x² + 1)ℝ[x] and thus ℝ[x]/(x² + 1) ≈ C. This is a good way to look at the complex numbers, i.e., to obtain C, adjoin x to ℝ and set x² = −1.

Exercise   Z₂[x]/(x² + x + 1) has 4 elements. Write out the multiplication table for this ring and show that it is a field.

Exercise   Show that, if R is a domain, the units of R[x] are just the units of R. Show that [1] + [2]x is a unit in Z₄[x].

In this chapter we do not prove F[x] is a unique factorization domain, nor do we even define unique factorization domain. The next definition and theorem are included merely for reference, and should not be studied at this stage.

Definition   Suppose F is a field and f ∈ F[x] has degree ≥ 1. The statement that g is an associate of f means ∃ a unit u ∈ F[x] such that g = uf. The statement that f is irreducible means that if h is a nonconstant polynomial which divides f, then h is an associate of f.

We do not develop the theory of F[x] here. However, the development is easy because it corresponds to the development of Z in Chapter 1. The Division Algorithm corresponds to the Euclidean Algorithm. Irreducible polynomials correspond to prime integers. The degree function corresponds to the absolute value function. One difference is that the units of F[x] are the nonzero constants, while the units of Z are just ±1. Thus the associates of f are all cf with c ≠ 0, while the associates of an integer n are just ±n. (This theory is developed in full in the Appendix under the topic of Euclidean domains.)

Theorem   Suppose F is a field and f ∈ F[x] has degree ≥ 1. Then f factors as the product of irreducibles, and this factorization is unique up to order and associates. Also the following are equivalent.

   1) F[x]/(f) is a domain.
   2) F[x]/(f) is a field.
   3) f is irreducible.

Thus if F is a field and f is irreducible, F[x]/(f) is a field.

Definition   Now suppose x and y are "variables". If a ∈ R and n, m ≥ 0, then axⁿyᵐ = ayᵐxⁿ is called a monomial. Order does not matter here. Define an element of R[x, y] to be any finite sum of monomials. Then R[x, y] is a commutative ring and (R[x])[y] ≈ R[x, y] ≈ (R[y])[x]. In other words, any polynomial in x and y with coefficients in R may be written as a polynomial in y with coefficients in R[x], or as a polynomial in x with coefficients in R[y].

Side Comment   It is true that each f ∈ F[x, y] factors as the product of irreducibles. However F[x, y] is not a PID. For example, the ideal I = xF[x, y] + yF[x, y] = {f ∈ F[x, y] : f(0, 0) = 0} is not principal.

More generally, if R is a commutative ring and n ≥ 2, the concept of a polynomial ring in n variables works fine without a hitch. If a ∈ R and v₁, v₂, ..., vₙ are non-negative integers, then ax₁^v¹x₂^v² ··· xₙ^vⁿ is called a monomial. Define an element of R[x₁, x₂, ..., xₙ] to be any finite sum of monomials. This gives a commutative ring and there is a canonical isomorphism R[x₁, x₂, ..., xₙ] ≈ (R[x₁, ..., xₙ₋₁])[xₙ]. Using this and induction on n, it is easy to prove the following theorem.

Theorem   If R is a domain, R[x₁, x₂, ..., xₙ] is a domain and its units are just the units of R.
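The exercise above on Z₂[x]/(x² + x + 1) can be worked out mechanically. In this sketch an element a₀ + a₁x is a pair (a₀, a₁) over Z₂, and products are reduced using x² = x + 1; the helper name is illustrative.

```python
# The 4-element ring Z2[x]/(x^2 + x + 1): multiply two residues
# a0 + a1*x and b0 + b1*x, then reduce with x^2 = x + 1
# (since x^2 + x + 1 = 0 in the quotient).

def mul(p, q):
    a0, a1 = p
    b0, b1 = q
    c0 = a0 * b0            # constant term
    c1 = a0 * b1 + a1 * b0  # x term
    c2 = a1 * b1            # x^2 term, rewritten as x + 1
    return ((c0 + c2) % 2, (c1 + c2) % 2)

elems = [(0, 0), (1, 0), (0, 1), (1, 1)]
# Every nonzero element has a multiplicative inverse, so the quotient
# is a field -- consistent with x^2 + x + 1 being irreducible over Z2.
for p in elems[1:]:
    assert any(mul(p, q) == (1, 0) for q in elems[1:])
# For example, x * (x + 1) = x^2 + x = 1.
assert mul((0, 1), (1, 1)) == (1, 0)
```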
Exercise   Suppose R is a commutative ring and f : R[x, y] → R[x] is the evaluation map which sends y to 0. This means f(p(x, y)) = p(x, 0). Show f is a ring homomorphism whose kernel is the ideal (y) = yR[x, y]. Use the fact that "the domain mod the kernel is isomorphic to the image" to show R[x, y]/(y) is isomorphic to R[x]. That is, if you adjoin y to R[x] and then factor it out, you get R[x] back.

Product of Rings

The product of rings works fine, just as does the product of groups.

Theorem   Suppose T is an index set and for each t ∈ T, Rₜ is a ring. On the additive abelian group ∏(t∈T) Rₜ, define multiplication by {rₜ} · {sₜ} = {rₜ · sₜ}. Then ∏Rₜ is a ring and each projection πₛ : ∏Rₜ → Rₛ is a ring homomorphism. Note that {1ₜ} is the multiplicative identity of ∏Rₜ. Suppose R is a ring. Under the natural bijection from {functions f : R → ∏Rₜ} to {sequences of functions {fₜ}t∈T where fₜ : R → Rₜ}, f is a ring homomorphism iff each fₜ is a ring homomorphism.

Proof   We already know f is a group homomorphism iff each fₜ is a group homomorphism (see page 36). Finally, since multiplication is defined coordinatewise, f is multiplicative iff each fₜ is, and f(1_R) = {1ₜ} iff fₜ(1_R) = 1ₜ for each t ∈ T.

Exercise   Suppose R and S are rings. Note that R × 0 is not a subring of R × S because it does not contain (1_R, 1_S). Show R × 0 is an ideal and (R × S/R × 0) ≈ S. Suppose I ⊂ R and J ⊂ S are ideals. Show I × J is an ideal of R × S and every ideal of R × S is of this form.

Exercise   Suppose R and S are commutative rings. Show T = R × S is not a domain. Let e = (1, 0) ∈ R × S and show e² = e, (1 − e)² = (1 − e), R × 0 = eT, and 0 × S = (1 − e)T.

Exercise   If T is any ring, an element e of T is called an idempotent provided e² = e. The elements 0 and 1 are idempotents called the trivial idempotents. Suppose T is a commutative ring and e ∈ T is an idempotent with 0 ≠ e ≠ 1. Let R = eT and S = (1 − e)T. Show each of the ideals R and S is a ring with identity, and f : T → R × S defined by f(t) = (et, (1 − e)t) is a ring isomorphism. This shows that a commutative ring T splits as the product of two rings iff it contains a nontrivial idempotent.
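The last exercise above can be illustrated concretely in Z₆, which contains the nontrivial idempotent 3. This sketch checks that t ↦ (et, (1 − e)t) is a multiplicative bijection onto eT × (1 − e)T; the variable names are illustrative, not from the text.

```python
# Splitting the commutative ring Z6 via the nontrivial idempotent e = 3:
# e*e = 9 = 3 in Z6, and t -> (e*t, (1-e)*t) identifies Z6 with
# eT x (1-e)T, i.e. Z6 ≅ Z2 x Z3 in disguise.

n = 6
e = 3
assert (e * e) % n == e and e not in (0, 1)   # nontrivial idempotent

f = {t: ((e * t) % n, ((1 - e) * t) % n) for t in range(n)}

eT = {(e * t) % n for t in range(n)}          # the ideal eT
sT = {((1 - e) * t) % n for t in range(n)}    # the ideal (1-e)T
# f is a bijection onto eT x (1-e)T ...
assert len(set(f.values())) == n == len(eT) * len(sT)
# ... and it respects multiplication, coordinatewise.
for a in range(n):
    for b in range(n):
        fa, fb = f[a], f[b]
        assert f[(a * b) % n] == ((fa[0] * fb[0]) % n, (fa[1] * fb[1]) % n)
```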
The Chinese Remainder Theorem

The natural map from Z to Zm × Zn is a group homomorphism and also a ring homomorphism. If m and n are relatively prime, this map is surjective with kernel mnZ, and thus Zmn and Zm × Zn are isomorphic as groups and as rings. The next theorem is a classical generalization of this. (See the fourth exercise on page 36 for the case t = 3.)

Theorem   Suppose n₁, ..., nₜ are integers, each nᵢ > 1, and (nᵢ, nⱼ) = 1 for all i ≠ j. Let fᵢ : Z → Zₙᵢ be defined by fᵢ(a) = [a]. (Note that the bracket symbol is used ambiguously.) Then the ring homomorphism f = (f₁, ..., fₜ) : Z → Zₙ₁ × ··· × Zₙₜ is surjective. Furthermore, the kernel of f is nZ, where n = n₁n₂ ··· nₜ. Thus Zₙ and Zₙ₁ × ··· × Zₙₜ are isomorphic as rings, and thus also as groups.

Proof   We wish to show that the order of f(1) is n, and thus f(1) is a group generator, and thus f is surjective. (See page 23 for the definition of order.) The element f(1)m = ([1], ..., [1])m = ([m], ..., [m]) is zero iff m is a multiple of each of n₁, ..., nₜ. Since their least common multiple is n, the order of f(1) is n. (See exercise three on page 35.)

Exercise   Show that if a is an integer and p is a prime, then [a] = [aᵖ] in Zp (Fermat's Little Theorem). Use this and the Chinese Remainder Theorem to show that if b is a positive integer, it has the same last digit as b⁵.

Characteristic

The following theorem is just an observation, but it shows that in ring theory, the ring of integers is a "cornerstone".

Theorem   If R is a ring, there is one and only one ring homomorphism f : Z → R. It is given by f(m) = m1. Thus the subgroup of R generated by 1 is a subring of R isomorphic to Z or isomorphic to Zn for some positive integer n.

Definition   Suppose R is a ring and f : Z → R is the natural ring homomorphism, f(m) = m1. The non-negative integer n with ker(f) = nZ is called the characteristic of R. Thus f is injective iff R has characteristic 0 iff 1 has infinite order. If f is not injective, the characteristic of R is the order of 1. It is an interesting fact that, if R is a domain, all the nonzero elements of R have the same order.
Theorem   Suppose R is a domain.

   1) If R has characteristic 0, then each nonzero a ∈ R has infinite order.
   2) If R has finite characteristic n, then n is a prime, R contains Zn as a subring, and each nonzero a ∈ R has order n.

Proof   Suppose R has characteristic 0, a is a nonzero element of R, and m is a positive integer. Then ma = m̄ · a cannot be 0 because m̄, a ≠ 0 and R is a domain. Thus o(a) = ∞. Now suppose R has characteristic n. Then R contains Zn as a subring, and thus Zn is a domain and n is a prime. If a is a nonzero element of R, na = n̄ · a = 0 · a = 0, and thus o(a) divides n, and thus o(a) = n.

Exercise   Show that if F is a field of characteristic 0, F contains Q as a subring. That is, show that the injective homomorphism f : Z → F extends to an injective homomorphism f̄ : Q → F.

Boolean Rings

This section is not used elsewhere in this book. However it fits easily here, and is included for reference.

Definition   A ring R is a Boolean ring if for each a ∈ R, a² = a, i.e., each element of R is an idempotent.

Theorem   Suppose R is a Boolean ring.

   1) R has characteristic 2. Proof: 2a = (a + a) = (a + a)² = a² + 2a² + a² = 4a. Thus 2a = 0, and so a = −a.
   2) R is commutative. Proof: (a + b) = (a + b)² = a² + (a · b) + (b · a) + b² = a + (a · b) + (b · a) + b. Thus (a · b) + (b · a) = 0, and so a · b = −(b · a) = b · a.
   3) If R is a domain, R ≈ Z₂. Proof: Suppose a is a nonzero element of R. Then a · (1 − a) = 0 and so a = 1.
   4) The image of a Boolean ring is a Boolean ring. That is, if I is an ideal of R with I ≠ R, then every element of R/I is idempotent and thus R/I is a Boolean ring. It follows from 3) that R/I is a domain iff R/I is a field iff R/I ≈ Z₂. (In the language of Chapter 6, I is a prime ideal iff I is a maximal ideal iff R/I ≈ Z₂.)

Suppose X is a nonvoid set. If a is a subset of X, let a′ = (X − a) be the complement of a in X. Now suppose R is a nonvoid collection of subsets of X. Consider the following properties which the collection R may possess.

   1) a ∈ R ⇒ a′ ∈ R.
   2) a, b ∈ R ⇒ (a ∪ b) ∈ R.
   3) a, b ∈ R ⇒ (a ∩ b) ∈ R.
   4) ∅ ∈ R and X ∈ R.

Theorem   If 1) and 2) are satisfied, then 3) and 4) are satisfied.

Proof   Suppose 1) and 2) are true. Since R is nonvoid, it contains some element a. Then ∅ = a ∩ a′ and X = a ∪ a′ belong to R, and so 4) is true. Suppose a, b ∈ R. Then a ∩ b = (a′ ∪ b′)′ belongs to R, and so 3) is true.

In this case, R is called a Boolean algebra of sets. Define an addition on R by a + b = (a ∪ b) − (a ∩ b), and define a multiplication on R by a · b = a ∩ b. Under this addition, R is an abelian group with 0 = ∅ and a = −a. Under this multiplication R becomes a Boolean ring with 1 = X.

Exercise   Let X = {1, 2, ..., n} and let R be the Boolean ring of all subsets of X. Note that o(R) = 2ⁿ. Define fᵢ : R → Z₂ by fᵢ(a) = [1] iff i ∈ a. Show each fᵢ is a homomorphism and thus f = (f₁, ..., fₙ) : R → Z₂ × Z₂ × ··· × Z₂ is a ring homomorphism. Show f is an isomorphism. (See exercises 1) and 4) on page 12.)

Exercise   Use the last exercise on page 49 to show that any finite Boolean ring is isomorphic to Z₂ × Z₂ × ··· × Z₂, and thus also to the Boolean ring of subsets above.

Note   Suppose R is a Boolean ring. It is a classical theorem that ∃ a Boolean algebra of sets whose Boolean ring is isomorphic to R. So let's just suppose R is a Boolean algebra of sets which is a Boolean ring with addition and multiplication defined as above. Now define a ∨ b = a ∪ b and a ∧ b = a ∩ b. These operations cup and cap are associative, commutative, have identity elements, and each distributes over the other. With these two operations (along with complement), R is called a Boolean algebra. Note that R is not a group under cup or cap. Anyway, it is a classical fact that, if you have a Boolean ring (algebra), you have a Boolean algebra (ring). The advantage of the algebra viewpoint is that it is symmetric in cup and cap. The advantage of the ring viewpoint is that you can draw from the rich theory of commutative rings.
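The Boolean ring of subsets just defined is easy to realize in code: addition is symmetric difference and multiplication is intersection. A minimal sketch for X = {1, 2, 3}:

```python
# The Boolean ring of subsets of X: a + b = (a ∪ b) − (a ∩ b),
# a · b = a ∩ b, with 0 = ∅ and 1 = X, as defined above.

X = frozenset({1, 2, 3})
subsets = [frozenset(s) for s in
           [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]]

add = lambda a, b: (a | b) - (a & b)   # symmetric difference
mul = lambda a, b: a & b               # intersection

for a in subsets:
    assert mul(a, a) == a              # every element is idempotent
    assert add(a, a) == frozenset()    # characteristic 2: a = -a
    assert mul(a, X) == a              # the identity is 1 = X
    for b in subsets:
        assert mul(a, b) == mul(b, a)  # commutative, as proved above
```

Note that o(R) = 2³ = 8 here, matching the exercise with n = 3.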
Chapter 4

Matrices and Matrix Rings

We first consider matrices in full generality, over an arbitrary ring R. However, after the first few pages, it will be assumed that R is commutative. The topics, such as invertible matrices, transpose, elementary matrices, systems of equations, and determinant, are all classical. The highlight of the chapter is the theorem that a square matrix is a unit in the matrix ring iff its determinant is a unit in the ring. This chapter concludes with the theorem that similar matrices have the same determinant, trace, and characteristic polynomial. This will be used in the next chapter to show that an endomorphism on a finitely generated vector space has a well-defined determinant, trace, and characteristic polynomial.

Definition   Suppose R is a ring and m and n are positive integers. Let Rm,n be the collection of all m × n matrices

   A = (ai,j) =  ( a1,1  ...  a1,n )
                 (  .          .   )
                 ( am,1  ...  am,n )

where each entry ai,j ∈ R. A matrix may be viewed as m n-dimensional row vectors or as n m-dimensional column vectors. A matrix is said to be square if it has the same number of rows as columns. Square matrices are so important that they have a special notation, Rn = Rn,n. Rⁿ is defined to be the additive abelian group R × R × ··· × R. To emphasize that Rⁿ does not have a ring structure, we use the "sum" notation, Rⁿ = R ⊕ R ⊕ ··· ⊕ R. Our convention is to write elements of Rⁿ as column vectors, i.e., to identify Rⁿ with Rn,1. If the elements of Rⁿ are written as row vectors, Rⁿ is identified with R1,n.
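The matrix constructions about to be defined can be sketched directly, with matrices as tuples of row tuples over R = Z; the helper names are illustrative, not from the text.

```python
# Entrywise addition and row-by-column multiplication of matrices
# over Z, anticipating the definitions in the following pages.

def madd(A, B):
    return tuple(tuple(a + b for a, b in zip(ra, rb))
                 for ra, rb in zip(A, B))

def mmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(len(B)))
                       for j in range(len(B[0])))
                 for i in range(len(A)))

A = ((1, 2), (0, 1))
B = ((2, 0), (0, 1))
assert madd(A, B) == ((3, 2), (0, 2))
# R2 = R_{2,2} is a ring, but not a commutative one:
assert mmul(A, B) != mmul(B, A)
```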
Addition of matrices   To "add" two matrices, they must have the same number of rows and the same number of columns. The addition is defined by (ai,j) + (bi,j) = (ai,j + bi,j), i.e., the i, j term of the sum is the sum of the i, j terms. Thus addition is a binary operation Rm,n × Rm,n → Rm,n.

Theorem   Rm,n is an additive abelian group. Its "zero" is the matrix 0 = 0m,n all of whose terms are zero. Also −(ai,j) = (−ai,j). Furthermore, as additive groups, Rm,n ≈ R^mn.

Scalar multiplication   An element of R is called a scalar. A matrix may be "multiplied" on the right or left by a scalar. Right scalar multiplication is defined by (ai,j)c = (ai,j · c). It is a function Rm,n × R → Rm,n. Note in particular that scalar multiplication is defined on Rⁿ. Of course, if R is commutative, there is no distinction between right and left scalar multiplication. The following theorem is just an observation.

Theorem   Suppose A, B ∈ Rm,n and c, d ∈ R. Then

   (A + B)c = Ac + Bc        A(c + d) = Ac + Ad
   A(cd) = (Ac)d       and   A1 = A.

This theorem is entirely transparent. In the language of the next chapter, it merely states that Rm,n is a right module over the ring R.

Multiplication of Matrices   The matrix product AB is defined iff the number of columns of A is equal to the number of rows of B. The matrix AB will have the same number of rows as A and the same number of columns as B, i.e., multiplication is a function Rm,n × Rn,p → Rm,p. The product (ai,j)(bi,j) is defined to be the matrix whose (s, t) term is as,1 · b1,t + ··· + as,n · bn,t, i.e., the dot product of row s of A with column t of B.

Exercise   Consider real matrices

   A = ( a b )    U = ( 2 0 )    V = ( 0 1 )    W = ( 1 2 )
       ( c d )        ( 0 1 )        ( 1 0 )        ( 0 1 ).

Find the matrices AU, UA, AV, VA, AW, and WA.
Definition   The identity matrix In ∈ Rn is the square matrix whose diagonal terms are 1 and whose off-diagonal terms are 0.

Theorem

   1) 0p,m A = 0p,n   and   A 0n,p = 0m,p.
   2) Im A = A = A In.

Theorem (The distributive laws)   (A + B)C = AC + BC and C(A + B) = CA + CB whenever the operations are defined.

Theorem (The associative law for matrix multiplication)   Suppose A ∈ Rm,n, B ∈ Rn,p, and C ∈ Rp,q. Then (AB)C = A(BC). Note that ABC ∈ Rm,q.

Proof   We must show that the (s, t) terms are equal. The proof involves writing it out and changing the order of summation. Let (xi,j) = AB and (yi,j) = BC. Then the (s, t) term of (AB)C is

   Σᵢ xs,i ci,t  =  Σᵢ (Σⱼ as,j bj,i) ci,t  =  Σᵢ Σⱼ as,j bj,i ci,t  =  Σⱼ as,j (Σᵢ bj,i ci,t)  =  Σⱼ as,j yj,t,

which is the (s, t) term of A(BC).

Theorem   For each ring R and integer n ≥ 1, Rn is a ring.

Proof   This elegant little theorem is immediate from the theorems above.

The units of Rn are called invertible or non-singular matrices. They form a group under multiplication called the general linear group and denoted by GLn(R) = (Rn)*.

Exercise   Recall that if A is a ring and a ∈ A, then aA is a right ideal of A. Let A = R₂ and a = (ai,j) where a1,1 = 1 and the other entries are 0. Find aR₂ and R₂a. Show that the only ideal of R₂ containing a is R₂ itself.

Multiplication by blocks   Suppose A, E ∈ Rn, B, F ∈ Rn,m, C, G ∈ Rm,n, and D, H ∈ Rm. Then multiplication in Rn+m is given by

   ( A B ) ( E F )   =   ( AE + BG   AF + BH )
   ( C D ) ( G H )       ( CE + DG   CF + DH ).
For the remainder of this chapter on matrices, suppose R is a commutative ring. Of course, for n > 1, Rn is non-commutative.

Transpose   Notation: If A ∈ Rm,n, Aᵗ ∈ Rn,m is the matrix whose (i, j) term is the (j, i) term of A. So row i (column i) of A becomes column i (row i) of Aᵗ. If A is an n-dimensional row vector, then Aᵗ is an n-dimensional column vector. If A is a square matrix, Aᵗ is also square. Transpose is a function from Rm,n to Rn,m.

Theorem

   1) (Aᵗ)ᵗ = A
   2) (A + B)ᵗ = Aᵗ + Bᵗ
   3) If c ∈ R, (Ac)ᵗ = Aᵗc
   4) (AB)ᵗ = BᵗAᵗ
   5) If A ∈ Rn, then A is invertible iff Aᵗ is invertible. In this case (A⁻¹)ᵗ = (Aᵗ)⁻¹.

Proof of 5)   Suppose A is invertible. Then I = Iᵗ = (AA⁻¹)ᵗ = (A⁻¹)ᵗAᵗ.

Exercise   Characterize those invertible matrices A ∈ R₂ which have A⁻¹ = Aᵗ. Show that they form a subgroup of GL₂(R).

Triangular Matrices   If A ∈ Rn, then A is upper (lower) triangular provided ai,j = 0 for all i > j (all j > i). A is strictly upper (lower) triangular provided ai,j = 0 for all i ≥ j (all j ≥ i). A is diagonal if it is upper and lower triangular, i.e., ai,j = 0 for all i ≠ j. Note that if A is upper (lower) triangular, then Aᵗ is lower (upper) triangular.

Theorem   If A ∈ Rn is strictly upper (or lower) triangular, then Aⁿ = 0.

Proof   The way to understand this is just multiply it out for n = 2 and n = 3. The geometry of this theorem will become transparent later in Chapter 5 when the matrix A defines an R-module endomorphism on Rⁿ (see page 93).

Definition   If T is any ring, an element t ∈ T is said to be nilpotent provided ∃n such that tⁿ = 0. In this case, (1 − t) is a unit with inverse 1 + t + t² + ··· + tⁿ⁻¹. Thus if T = Rn and B is a nilpotent matrix, I − B is invertible.
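The geometric-series inverse in the definition above can be checked numerically for a strictly upper triangular B ∈ R₃, which is nilpotent with B³ = 0. A minimal sketch (helper names illustrative):

```python
# If B is strictly upper triangular in R3 then B^3 = 0, and the
# theorem above gives (I - B)^{-1} = I + B + B^2.

def mmul(A, B):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def madd(A, B):
    return tuple(tuple(a + b for a, b in zip(ra, rb))
                 for ra, rb in zip(A, B))

I = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
B = ((0, 2, 5), (0, 0, 3), (0, 0, 0))   # strictly upper triangular
Z = ((0, 0, 0), (0, 0, 0), (0, 0, 0))

assert mmul(B, mmul(B, B)) == Z                     # B^3 = 0
ImB = tuple(tuple(i - b for i, b in zip(ri, rb)) for ri, rb in zip(I, B))
inv = madd(I, madd(B, mmul(B, B)))                  # I + B + B^2
assert mmul(ImB, inv) == I
```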
Scalar matrices   A scalar matrix is a diagonal matrix for which all the diagonal terms are equal, i.e., a matrix of the form cIn. The map R → Rn which sends c to cIn is an injective ring homomorphism, and thus we may consider R to be a subring of Rn. Multiplying by a scalar is the same as multiplying by a scalar matrix, and thus scalar matrices commute with everything, i.e., if B ∈ Rn, (cIn)B = cB = Bc = B(cIn). Recall we are assuming R is a commutative ring.

Exercise   Suppose A ∈ Rn and, for each B ∈ Rn, AB = BA. Show A is a scalar matrix. For n > 1, this shows how non-commutative Rn is.

Exercise   Let R = Z. Find the inverse of

   ( 1 2 −3 )
   ( 0 1  4 )
   ( 0 0  1 ).

Exercise   Suppose A = diag(a₁, a₂, ..., aₙ) is a diagonal matrix, B ∈ Rm,n, and C ∈ Rn,p. Show that BA is obtained from B by multiplying column i of B by aᵢ. Show AC is obtained from C by multiplying row i of C by aᵢ. Show A is a unit in Rn iff each aᵢ is a unit in R.

Elementary Operations and Elementary Matrices

Suppose R is a commutative ring and A is a matrix over R. A need not be square. There are 3 types of elementary row and column operations on the matrix A.

   Type 1   Multiply row i by some unit a ∈ R.   Multiply column i by some unit a ∈ R.
   Type 2   Interchange row i and row j.   Interchange column i and column j.
   Type 3   Add a times row j to row i, where i ≠ j and a is any element of R.   Add a times column i to column j, where i ≠ j and a is any element of R.

Elementary Matrices   Elementary matrices are square and invertible. There are three types. They are obtained by performing row or column operations on the identity matrix.

   Type 1   B is the identity matrix with one diagonal 1 replaced by a unit a ∈ R. All the off-diagonal elements are zero.
   Type 2   B is the identity matrix with rows i and j interchanged. There are two nonzero off-diagonal elements.
   Type 3   B is the identity matrix with one extra entry ai,j, where i ≠ j and ai,j is any element of R. There is at most one nonzero off-diagonal element, and it may be above or below the diagonal.

For example, with n = 3:

   Type 1:  ( 1 0 0 )      Type 2:  ( 0 1 0 )      Type 3:  ( 1 0 a )
            ( 0 a 0 )               ( 1 0 0 )               ( 0 1 0 )
            ( 0 0 1 )               ( 0 0 1 )               ( 0 0 1 ).

The following theorem is handy when working with matrices.

Theorem   Suppose A is a matrix. It need not be square. To perform an elementary row (column) operation on A, perform the operation on an identity matrix to obtain an elementary matrix B, and multiply on the left (right). That is, BA = row operation on A and AB = column operation on A. (See the exercise on page 54.)

Exercise   Show that if B is an elementary matrix of type 1, 2, or 3, then B is invertible and B⁻¹ is an elementary matrix of the same type.
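The theorem above — left multiplication by an elementary matrix performs the corresponding row operation — can be checked for a type-3 matrix over Z. The function names here are illustrative, not from the text.

```python
# A type-3 elementary matrix (identity plus one off-diagonal entry a
# in position (i, j)) adds a * row j to row i when multiplied on the left.

def mmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(m))
                       for j in range(p)) for i in range(n))

def elem3(n, i, j, a):
    """Identity matrix with an extra entry a in position (i, j)."""
    return tuple(tuple(a if (r, c) == (i, j) else int(r == c)
                       for c in range(n)) for r in range(n))

A = ((1, 2), (3, 4))
E = elem3(2, 0, 1, 5)          # add 5 * row 1 to row 0
assert mmul(E, A) == ((1 + 5 * 3, 2 + 5 * 4), (3, 4))
# Its inverse is the type-3 elementary matrix with -a, as in the exercise.
assert mmul(elem3(2, 0, 1, -5), mmul(E, A)) == A
```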
Exercise   Suppose F is a field and A ∈ Fm,n.

   1) Show ∃ invertible matrices B ∈ Fm and C ∈ Fn such that BAC = (di,j) where d1,1 = ··· = dt,t = 1 and all other entries are 0. The integer t is called the rank of A. (See page 89 of Chapter 5.)
   2) Suppose A ∈ Fn is invertible. Show A is the product of elementary matrices.
   3) A matrix T is said to be in row echelon form if, for each 1 ≤ i < m, the first nonzero term of row (i + 1) is to the right of the first nonzero term of row i. Show ∃ an invertible matrix B ∈ Fm such that BA is in row echelon form.
   4) Let

         A = ( 3 11 )    and    D = ( 3 11 )
             ( 0  4 )               ( 1  4 ).

      Write A and D as products of elementary matrices over Q. Is it possible to write them as products of elementary matrices over Z?

For 1), perform row and column operations on A to reach the desired form. This shows the matrices B and C may be selected as products of elementary matrices. Part 2) also follows from this procedure. For part 3), use only row operations. Notice that if T is in row-echelon form, the number of nonzero rows is the rank of T.

Systems of Equations

Suppose A = (ai,j) ∈ Rm,n and C = (c₁, ..., cₘ)ᵗ ∈ Rᵐ = Rm,1. The system

   a1,1 x₁ + ··· + a1,n xₙ = c₁
        ·                     ·
   am,1 x₁ + ··· + am,n xₙ = cₘ

of m equations in n unknowns can be written as one matrix equation in one unknown, namely as (ai,j)(x₁, ..., xₙ)ᵗ = (c₁, ..., cₘ)ᵗ, or AX = C.
Define f : Rⁿ → Rᵐ by f(D) = AD. Then f is a group homomorphism and also f(Dc) = f(D)c for any c ∈ R. In the language of the next chapter, this says that f is an R-module homomorphism. The next theorem summarizes what we already know about solutions of linear equations in this setting.

Theorem
1)  AX = C has a solution iff C ∈ image(f).
2)  If D ∈ Rⁿ is one solution, the solution set f⁻¹(C) is the coset D + ker(f) in Rⁿ. (See part 7 of the theorem on homomorphisms in Chapter 2, page 28.)
3)  AX = 0 is called the homogeneous equation. Its solution set is ker(f).
4)  Suppose B ∈ Rm is invertible. Then AX = C and (BA)X = BC have the same set of solutions. Thus we may perform any row operation on both sides of the equation and not change the solution set. If m = n and A ∈ Rn is invertible, then AX = C has the unique solution X = A⁻¹C.

The geometry of systems of equations over a field will not become really transparent until the development of linear algebra in Chapter 5.

Determinants

The concept of determinant is one of the most amazing in all of mathematics. The proper development of this concept requires a study of multilinear forms, which is given in Chapter 6. In this section we simply present the basic properties.

For each n ≥ 1 and each commutative ring R, determinant is a function from Rn to R. For n = 1, |(a)| = a. For n = 2,

    | a  b |
    | c  d |  =  ad − bc.

Definition  Let A = (ai,j) ∈ Rn. If σ is a permutation on {1, 2, ..., n}, let sign(σ) = 1 if σ is an even permutation, and sign(σ) = −1 if σ is an odd permutation. The determinant is defined by

    |A|  =  Σ (all σ)  sign(σ) · a1,σ(1) · a2,σ(2) ··· an,σ(n).

Check that for n = 2, this agrees with the definition above. (Note that here we are writing the permutation functions as σ(i) and not as (i)σ.)

Theorem  |A| = Σ (all τ) sign(τ) · aτ(1),1 · aτ(2),2 ··· aτ(n),n.

Proof  For each σ, the term a1,σ(1) · a2,σ(2) ··· an,σ(n) contains exactly one factor from each row and one factor from each column. Since R is commutative, we may rearrange the factors so that the first comes from the first column, the second from the second column, etc. This means that there is a permutation τ on {1, 2, ..., n} such that a1,σ(1) ··· an,σ(n) = aτ(1),1 ··· aτ(n),n. We wish to show that τ = σ⁻¹ and thus sign(τ) = sign(σ). To reduce the abstraction, suppose σ(2) = 5. Then the first expression contains the factor a2,5. In the second expression, it will appear as aτ(5),5, and so τ(5) = 2. Anyway, τ is the inverse of σ, and thus there are two ways to define determinant. Since sign(σ) = sign(σ⁻¹), the two summations agree.

Corollary  |A| = |Aᵗ|. That is, the determinant of a matrix is equal to the determinant of its transpose.

You may view an n × n matrix A as a sequence of n column vectors or as a sequence of n row vectors. Here we will use column vectors. This means we write the matrix A as A = (A1, A2, ..., An) where each Ai ∈ Rn,1 = Rⁿ.

Theorem  Suppose 1 ≤ r ≤ n, Cr ∈ Rⁿ, and a, c ∈ R. Then

    |(A1, ..., Ar−1, aAr + cCr, Ar+1, ..., An)|
        = a·|(A1, ..., Ar, ..., An)| + c·|(A1, ..., Ar−1, Cr, Ar+1, ..., An)|.

Proof  This is immediate from the definition of determinant and the distributive law of multiplication in the ring R.

Theorem  If two columns of A are equal, then |A| = 0.

Proof  For simplicity, assume the first two columns are equal, i.e., A1 = A2. Now |A| = Σ (all τ) sign(τ) · aτ(1),1 · aτ(2),2 ··· aτ(n),n, and this summation has n! terms, and n! is an even number. Let γ be the transposition which interchanges one and two. Then for any τ, aτ(1),1 · aτ(2),2 ··· aτ(n),n = aτγ(1),1 · aτγ(2),2 ··· aτγ(n),n. This pairs up the n! terms of the summation, and since sign(τ) = −sign(τγ), these pairs cancel in the summation. Therefore |A| = 0.
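The permutation-sum definition above can be checked directly for small n. The text itself contains no code; the following Python sketch is an illustration only. It computes sign(σ) by counting inversions and confirms the 2 × 2 formula ad − bc, the transpose property, and the two-equal-columns theorem on sample integer matrices.

```python
from itertools import permutations

def sign(p):
    # sign of a permutation given as a tuple: (-1)^(number of inversions)
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(A):
    # |A| = sum over all sigma of sign(sigma) * a[0][sigma(0)] * ... * a[n-1][sigma(n-1)]
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

def transpose(A):
    return [list(row) for row in zip(*A)]

A = [[1, 2], [3, 4]]
assert det(A) == 1 * 4 - 2 * 3                        # ad - bc
B = [[2, 0, 1], [1, 3, 5], [0, 4, 2]]
assert det(B) == det(transpose(B))                    # |A| = |A^t|
assert det([[1, 1, 7], [2, 2, 0], [3, 3, 5]]) == 0    # two equal columns
```

Since the sum has n! terms, this is only practical for tiny n, which is exactly why the expansion and row-reduction methods developed next matter.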
Let Mi. More generally.2 + · · · + ai. expansion by any row or expansion by any column. . A1 + A2 .j . A3 . then A = sign(τ )(Aτ (1) . A2 .j + · · · + an. .62 Theorem one. . An ) + (A2 .j be the determinant of the (n − 1) × (n − 1) matrix obtained by removing row i and column j from A.j are called the (i. or adding c times one column to another column.j C2. Adding c times one row to another row. . . Theorem Proof Exercise If τ is a permutation of (1. Since the ﬁrst and last of these four terms are zero. . A3 . The determinant of A is the sum of six terms. Rewrite the four preceding theorems using rows instead of columns. . We know 0 = ¯ (A1 + A2 .n .j and Ci. . . .1 + ai. . the determinant is zero. . . . An ). For any 1 ≤ j ≤ n. c1 c2 c3
Exercise
. . . its determinant is zero. An ) + (A1 . An ) + (A2 . or one column is c times another column. if one row is c times another row. .
The following theorem is just a summary of some of the work done so far. the result follows. . . Thus if any row or any column is zero. . . then the determinant is zero. If a matrix has two rows equal or two columns equal. does not change the determinant.j = (−1)i+j Mi. Interchanging two rows or two columns multiplies the determinant by −1.  A = ai. A3 . .n Ci. A3 . A3 . . . .1 Ci.j + a2.2 Ci.j . . Theorem Multiplying any row or column of matrix by a scalar c ∈ R.
Matrices
Chapter 4
Interchanging two columns of A multiplies the determinant by minus
Proof For simplicity. j) minor and cofactor of A. show that (A2 . Aτ (2) . An ) = −A. A3 . . The permutation τ is the ﬁnite product of transpositions. 2.j C1. Theorem For any 1 ≤ i ≤ n. . A1 . Let Ci. The following theorem is useful but the proof is a little tedious and should not be done as an exercise. . A2 . An ) = (A1 .j Cn. n). A1 . .
There are 2n ways to compute  A . A1 . Mi. multiplies the determinant by c. Aτ (n) ).  A = a1. a1 a2 a3 Let A = b1 b2 b3 . . .
see page 130 of the Appendix.
The following remarkable theorem takes some work to prove.  A = −1. The result follows by expanding by the ﬁrst column. i. Suppose A ∈ Rn is upper triangular. O D Theorem Proof Expand by the ﬁrst column and use induction on n.  A  is the product of the diagonal elements. i) term is the (i.j )t . B ∈ Rn. Before
.m . i. Suppose n > 2 and the theorem is true for matrices in Rn−1 . j) cofactor. An elementary matrix of type 2 is obtained from the identity matrix by interchanging two rows or columns.Chapter 4
Matrices
63
Write out the determinant of A expanding by the ﬁrst column and also expanding by the second row. then  A  is a unit in R and  A−1  =  A −1 .e.  AB  =  A  B . B ∈ Rn . If A ∈ R2 Proof is an upper triangular matrix. if A.
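The cofactor expansion in the theorem above can be written as a short recursion. This is a hedged illustration (not from the text): it expands along the first row and checks the result against the triangular-matrix rule and a 3 × 3 value computed by hand.

```python
def det_cofactor(A):
    # Expand |A| along the first row: |A| = sum_j a[0][j] * C[0][j],
    # where C[0][j] = (-1)^j * (minor from deleting row 0 and column j).
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det_cofactor(minor)
    return total

A = [[2, 0, 1], [1, 3, 5], [0, 4, 2]]
assert det_cofactor(A) == -24
# upper triangular: the determinant is the product of the diagonal elements
T = [[3, 7, 1], [0, 5, 9], [0, 0, 2]]
assert det_cofactor(T) == 3 * 5 * 2
```

Expansion by any other row or column would give the same value; that is exactly the "2n ways" statement of the theorem.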
Classical adjoint  Suppose R is a commutative ring and A ∈ Rn. The classical adjoint of A is (Ci,j)ᵗ, the matrix whose (j, i) term is the (i, j) cofactor. Before we consider the general case, let's examine 2 × 2 matrices.

If A =

    | a  b |
    | c  d |

then

    (Ci,j)  =  |  d  −c |        and so        (Ci,j)ᵗ  =  |  d  −b |
               | −b   a |                                  | −c   a | .

Check that A(Ci,j)ᵗ = (Ci,j)ᵗA = |A|·I. Thus if |A| is a unit in R, A is invertible and A⁻¹ = |A|⁻¹(Ci,j)ᵗ. In particular, if |A| = 1, A⁻¹ is the matrix (Ci,j)ᵗ displayed above. Here is the general case.

Theorem  If R is commutative and A ∈ Rn, then A(Ci,j)ᵗ = (Ci,j)ᵗA = |A|·I.

Proof  We must show that the diagonal elements of the product A(Ci,j)ᵗ are all |A| and the other elements are 0. The (s, s) term is the dot product of row s of A with row s of (Ci,j), and is thus |A| (computed by expansion by row s). For s ≠ t, the (s, t) term is the dot product of row s of A with row t of (Ci,j). Since this is the determinant of a matrix with row s = row t, the (s, t) term is 0. The proof that (Ci,j)ᵗA = |A|·I is similar and is left as an exercise.

We are now ready for one of the most beautiful and useful theorems in all of mathematics.

Theorem  Suppose R is a commutative ring and A ∈ Rn. Then A is a unit in Rn iff |A| is a unit in R. (Thus if R is a field, A is invertible iff |A| ≠ 0.) If A is invertible, then A⁻¹ = |A|⁻¹(Ci,j)ᵗ.

Proof  This follows immediately from the preceding theorem.

Exercise  Show that any right inverse of A is also a left inverse. That is, suppose A, B ∈ Rn and AB = I. Show A is invertible with A⁻¹ = B, and thus BA = I.

Note also that |AB| = |BA|, and if C is invertible, |C⁻¹AC| = |ACC⁻¹| = |A|.

Similarity  Suppose A, B ∈ Rn. B is said to be similar to A if ∃ an invertible C ∈ Rn such that B = C⁻¹AC, i.e., B is similar to A iff B is a conjugate of A.
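The classical adjoint identity A(Ci,j)ᵗ = |A|·I can be verified numerically. This Python sketch is an illustration only, not part of the text; it builds the adjoint from cofactors and then, since ℚ is a field, produces the inverse with exact rational arithmetic.

```python
from fractions import Fraction

def det(A):
    # cofactor expansion along the first row
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j + 1:] for r in A[1:]])
               for j in range(n))

def classical_adjoint(A):
    # cofactor matrix C with C[i][j] = (-1)^(i+j) * M[i][j], then transpose
    n = len(A)
    C = [[(-1) ** (i + j) *
          det([r[:j] + r[j + 1:] for k, r in enumerate(A) if k != i])
          for j in range(n)] for i in range(n)]
    return [list(col) for col in zip(*C)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 0, 1], [1, 3, 5], [0, 4, 2]]
d = det(A)
adj = classical_adjoint(A)
assert mat_mul(A, adj) == [[d, 0, 0], [0, d, 0], [0, 0, d]]   # A * adj(A) = |A| I
# over the field Q:  A^(-1) = |A|^(-1) * adj(A)
Ainv = [[Fraction(x, d) for x in row] for row in adj]
assert mat_mul(A, Ainv) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Over ℤ the same computation shows why the unit condition matters: here |A| = −24 is not a unit in ℤ, so A is invertible over ℚ but not over ℤ.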
Theorem  B is similar to B. B is similar to A iff A is similar to B. If D is similar to B and B is similar to A, then D is similar to A. Thus "similarity" is an equivalence relation on Rn.

Proof  This is a good exercise using the definition.

Theorem  Suppose A and B are similar. Then |A| = |B|, and thus A is invertible iff B is invertible.

Proof  Suppose B = C⁻¹AC. Then |B| = |C⁻¹AC| = |ACC⁻¹| = |A|.

Trace  Suppose A = (ai,j) ∈ Rn. Then the trace is defined by trace(A) = a1,1 + a2,2 + ··· + an,n. That is, the trace of A is the sum of its diagonal terms.

One of the most useful properties of trace is that trace(AB) = trace(BA) whenever AB and BA are defined. For example, suppose A = (a1, a2, ..., an) and B = (b1, b2, ..., bn)ᵗ. Then AB is the scalar a1b1 + ··· + anbn, while BA is the n × n matrix (bi·aj). Note that trace(AB) = trace(BA). Here is the theorem in full generality.

Theorem  Suppose A ∈ Rm,n and B ∈ Rn,m. Then AB and BA are square matrices with trace(AB) = trace(BA).

Proof  This proof involves a change in the order of summation. By definition,

    trace(AB) = Σ (1≤i≤m) Σ (1≤j≤n) ai,j·bj,i = Σ (1≤j≤n) Σ (1≤i≤m) bj,i·ai,j = trace(BA).

Theorem  If A, B ∈ Rn, trace(A + B) = trace(A) + trace(B) and trace(AB) = trace(BA).

Proof  The first part of the theorem is immediate, and the second part is a special case of the previous theorem.

Theorem  If A and B are similar, then trace(A) = trace(B).

Proof  Suppose B = C⁻¹AC. Then trace(B) = trace(C⁻¹AC) = trace(ACC⁻¹) = trace(A).
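The trace theorem holds even when AB and BA are square matrices of different sizes. A quick hedged illustration in Python (not part of the text), with A ∈ ℤ2,3 and B ∈ ℤ3,2:

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(M):
    # sum of the diagonal terms of a square matrix
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2, 3], [4, 5, 6]]          # 2 x 3
B = [[7, 8], [9, 10], [11, 12]]     # 3 x 2
# AB is 2 x 2 and BA is 3 x 3, yet the traces agree
assert trace(mat_mul(A, B)) == trace(mat_mul(B, A))
```

Both sides are the double sum Σ ai,j·bj,i, summed in the two possible orders, which is exactly the change-of-summation proof above.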
Characteristic polynomials  If A ∈ Rn, the characteristic polynomial CPA(x) ∈ R[x] is defined by CPA(x) = |(xI − A)|. Any λ ∈ R which is a root of CPA(x) is called a characteristic root of A.

Theorem  CPA(x) = a0 + a1x + ··· + an−1x^(n−1) + xⁿ, where trace(A) = −an−1 and |A| = (−1)ⁿ a0.

Proof  This follows from a direct computation of the determinant.

Theorem  If A and B are similar, then they have the same characteristic polynomials.

Proof  Suppose B = C⁻¹AC. Then CPB(x) = |(xI − C⁻¹AC)| = |C⁻¹(xI − A)C| = |(xI − A)| = CPA(x).

Exercise  Suppose R is a commutative ring,

    A =  | a  b |
         | c  d |

is a matrix in R2, and CPA(x) = a0 + a1x + x². Find a0 and a1, and show that a0I + a1A + A² = 0. In other words, show that A satisfies its characteristic polynomial, i.e., CPA(A) = 0.

Exercise  Suppose A ∈ Rn and a ∈ R. Find |aA| and trace(aA).

Exercise  Suppose F is a field and A ∈ F2. Show the following are equivalent.
1)  A² = 0.
2)  |A| = trace(A) = 0.
3)  CPA(x) = x².
4)  ∃ an elementary matrix C such that C⁻¹AC is strictly upper triangular.

Note  This exercise is a special case of a more general theorem: a square matrix over a field is nilpotent iff all its characteristic roots are 0 iff it is similar to a strictly upper triangular matrix. This remarkable result cannot be proved by matrix theory alone, but depends on linear algebra (see pages 93, 94, and 98).

Summary  Determinant and trace are functions from Rn to R. Determinant is a multiplicative homomorphism and trace is an additive homomorphism.
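The 2 × 2 exercise above can be verified mechanically. The following Python sketch is an illustration only: for a 2 × 2 matrix, a0 = |A| = ad − bc and a1 = −trace(A) = −(a + d), and the code checks entrywise that a0I + a1A + A² = 0.

```python
def cp_coeffs_2x2(A):
    # CP_A(x) = a0 + a1*x + x^2  with  a0 = |A|  and  a1 = -trace(A)
    (a, b), (c, d) = A
    return a * d - b * c, -(a + d)

def check_cayley_hamilton_2x2(A):
    # verify a0*I + a1*A + A^2 = 0 entrywise
    (a, b), (c, d) = A
    a0, a1 = cp_coeffs_2x2(A)
    A2 = [[a * a + b * c, a * b + b * d],
          [c * a + d * c, c * b + d * d]]
    I = [[1, 0], [0, 1]]
    return all(a0 * I[i][j] + a1 * A[i][j] + A2[i][j] == 0
               for i in range(2) for j in range(2))

assert check_cayley_hamilton_2x2([[1, 2], [3, 4]])
assert check_cayley_hamilton_2x2([[0, 5], [0, 0]])   # nilpotent: CP_A(x) = x^2
```

The nilpotent example also illustrates the equivalences of the last exercise: A² = 0, |A| = trace(A) = 0, and CPA(x) = x².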
Chapter 5

Linear Algebra

The exalted position held by linear algebra is based upon the subject's ubiquitous utility and ease of application. The basic theory is developed here in full generality, i.e., modules are defined over an arbitrary ring R and not just over a field. The elementary facts about cosets, quotients, and homomorphisms follow the same pattern as in the chapters on groups and rings. We give a simple proof that if R is a commutative ring and f : Rⁿ → Rⁿ is a surjective R-module homomorphism, then f is an isomorphism. This shows that finitely generated free R-modules have a well defined dimension, and simplifies some of the development of linear algebra. It is in this chapter that the concepts about functions, solutions of equations, matrices, and generating sets come together in one unified theory.

After the general theory, we restrict our attention to vector spaces, i.e., modules over a field. The key theorem is that any vector space V has a free basis, and thus if V is finitely generated, it has a well defined dimension, and incredible as it may seem, this single integer determines V up to isomorphism. Also any endomorphism f : V → V may be represented by a matrix, and any change of basis corresponds to conjugation of that matrix. One of the goals in linear algebra is to select a basis so that the matrix representing f has a simple form. For example, if f is not injective, then f may be represented by a matrix whose first column is zero. As another example, if f is nilpotent, then f may be represented by a strictly upper triangular matrix. The theorem on Jordan canonical form is stated here in full generality only for reference and completeness; it is not proved in this chapter and should not be considered part of this chapter. The proof is given in the Appendix. This chapter concludes with the study of real inner product spaces, and with the beautiful theory relating orthogonal matrices and symmetric matrices.
Definition  Suppose R is a ring and M is an additive abelian group. The statement that M is a right R-module means there is a scalar multiplication M × R → M, (m, r) → mr, satisfying

    (a1 + a2)r  =  a1r + a2r
    a(r1 + r2)  =  ar1 + ar2
    a(r1 · r2)  =  (ar1)r2
    a1          =  a

for all a, a1, a2 ∈ M and r, r1, r2 ∈ R.

The statement that M is a left R-module means there is a scalar multiplication R × M → M, (r, m) → rm, satisfying

    r(a1 + a2)  =  ra1 + ra2
    (r1 + r2)a  =  r1a + r2a
    (r1 · r2)a  =  r1(r2a)
    1a          =  a.

Note that the plus sign is used ambiguously, as addition in M and as addition in R.

Notation  The fact that M is a right (left) R-module will be denoted by M = MR (M = RM).

Convention  Unless otherwise stated, it is assumed that R is a ring and the word "R-module" (or sometimes just "module") means "right R-module". In this text we stick to right R-modules.

Theorem  Suppose M is an R-module.
1)  If r ∈ R, then f : M → M defined by f(a) = ar is a homomorphism of additive groups. In particular (0M)r = 0M.
2)  If a ∈ M, a0R = 0M.
3)  If a ∈ M and r ∈ R, then (−a)r = −(ar) = a(−r).

Proof  This is a good exercise in using the axioms for an R-module.

Theorem  If R is commutative and M = MR, then left scalar multiplication defined by ra = ar makes M into a left R-module. Thus for commutative rings, we may write the scalars on either side.
Exercise  Suppose T is a nonvoid set and N is an R-module. Let N^T be the collection of all functions f : T → N, with addition defined by (f + g)(t) = f(t) + g(t) and scalar multiplication defined by (fr)(t) = f(t)r. Show N^T is an R-module. (We know from the last exercise in Chapter 2 that N^T is a group.)

Submodules  If M is an R-module, the statement that a subset N ⊂ M is a submodule means it is a subgroup which is closed under scalar multiplication, i.e., if a ∈ N and r ∈ R, then ar ∈ N. In this case N will be an R-module, because the axioms will automatically be satisfied. Note that 0 and M are submodules, called the improper submodules of M.

Theorem  Suppose M is an R-module, T is an index set, and for each t ∈ T, Nt is a submodule of M.
1)  ∩ (t∈T) Nt is a submodule of M.
2)  If {Nt} is a monotonic collection, ∪ (t∈T) Nt is a submodule.
3)  + (t∈T) Nt = {all finite sums a1 + ·· + am : each ai belongs to some Nt} is a submodule. If T = {1, 2, ..., n}, then this submodule may be written as N1 + N2 + ·· + Nn = {a1 + a2 + ·· + an : each ai ∈ Ni}.

Proof  We know from page 22 that versions of 1) and 2) hold for subgroups, and in particular for subgroups of additive abelian groups. To finish the proofs it is only necessary to check scalar multiplication, which is immediate. Also the proof of 3) is immediate. Note that if N1 and N2 are submodules of M, N1 + N2 is the smallest submodule of M containing N1 ∪ N2.

Homomorphisms  Suppose M and N are R-modules. A function f : M → N is a homomorphism (i.e., an R-module homomorphism) provided it is a group homomorphism and, if a ∈ M and r ∈ R, f(ar) = f(a)r. On the left, scalar multiplication is in M, and on the right it is in N. The basic facts about homomorphisms are listed below. Much of this work has already been done in the chapter on groups (see page 28).
Theorem
1)  The zero map M → N is a homomorphism.
2)  The identity map I : M → M is a homomorphism.
3)  The composition of homomorphisms is a homomorphism.
4)  The sum of homomorphisms is a homomorphism. If f, g : M → N are homomorphisms, define (f + g) : M → N by (f + g)(a) = f(a) + g(a). Then f + g is a homomorphism. Also (−f), defined by (−f)(a) = −f(a), is a homomorphism. Thus HomR(M, N) = Hom(MR, NR), the set of all homomorphisms from M to N, forms an abelian group under addition.
5)  If h : N → P is a homomorphism, then h ◦ (f + g) = (h ◦ f) + (h ◦ g). If k : P → M is a homomorphism, then (f + g) ◦ k = (f ◦ k) + (g ◦ k).
6)  If R is commutative and r ∈ R, then g : M → M defined by g(a) = ar is a homomorphism. Furthermore, if f : M → N is a homomorphism, fr defined by (fr)(a) = f(ar) = f(a)r is a homomorphism. In this case HomR(M, N) is an R-module.
7)  Suppose f : M → N is a homomorphism, G ⊂ M is a submodule, and H ⊂ N is a submodule. Then f(G) is a submodule of N and f⁻¹(H) is a submodule of M. In particular, image(f) is a submodule of N and ker(f) = f⁻¹(0) is a submodule of M.
8)  If a bijection f : M → N is a homomorphism, then f⁻¹ : N → M is also a homomorphism. In this case f and f⁻¹ are called isomorphisms. A homomorphism f : M → M is called an endomorphism, and an isomorphism f : M → M is called an automorphism.
9)  HomR(M, M), with multiplication defined to be composition, is a ring. The units of the endomorphism ring HomR(M, M) are the automorphisms. Thus the automorphisms on M form a group under composition. We will see later that if M = Rⁿ, HomR(Rⁿ, Rⁿ) is just the matrix ring Rn, and the automorphisms are merely the invertible matrices.

Proof  This is just a series of observations. For 4), it is only necessary to show that HomR(M, N) is a subgroup of N^M, and in 6) it is only necessary to show that HomR(M, N) is a submodule of N^M. This simple fact is quite useful in linear algebra.
Theorem  Suppose R is a ring and N is a subset of R. Then N is a submodule of RR (of R as a left module) iff N is a right (left) ideal of R.

Proof  The definitions are the same except expressed in different language.

Abelian groups are Z-modules  On page 21, it is shown that any additive group M admits a scalar multiplication by integers: a1 = a, a2 = a + a, etc. Note that this is the only way M can be a Z-module. If M is abelian, the properties are satisfied to make M a Z-module. Furthermore, if f : M → N is a group homomorphism of abelian groups, then f is also a Z-module homomorphism.

Summary  Additive abelian groups are "the same things" as Z-modules. While group theory in general is quite separate from linear algebra, the study of additive abelian groups is a special case of the study of R-modules.

Exercise  R-modules are also Z-modules, and R-module homomorphisms are also Z-module homomorphisms. If M and N are Q-modules and f : M → N is a Z-module homomorphism, must it also be a Q-module homomorphism?

Homomorphisms on Rⁿ

R as a right R-module  Let M = R and define scalar multiplication on the right by ar = a · r. That is, scalar multiplication is just ring multiplication. This makes R a right R-module, denoted by RR (or just R).

Theorem  Suppose M = MR and f, g : R → M are homomorphisms with f(1) = g(1). Then f = g.

Proof  Suppose f(1) = g(1). Then f(r) = f(1 · r) = f(1)r = g(1)r = g(1 · r) = g(r).

Theorem  Given m ∈ M, ∃! homomorphism h : R → M with h(1) = m, namely h : R → M defined by h(r) = mr. In other words, HomR(R, M) ≈ M: evaluation at 1 gives a bijection from HomR(R, M) to M, and this bijection is clearly a group isomorphism. If R is commutative, HomR(R, M) is an R-module, and evaluation at 1 gives an R-module isomorphism from HomR(R, M) to M. In the case M = R, this theorem states that multiplication on the left by some m ∈ R defines a right R-module homomorphism from R to R, and every module homomorphism is of this form. The element m should be thought of as a 1 × 1 matrix.

Exercise  Suppose R is a field and f : RR → M is a nonzero homomorphism. Show f is injective.

Rⁿ as an R-module  On page 54 it was shown that the additive abelian group Rm,n admits a scalar multiplication by elements in R. The properties listed there were exactly those needed to make Rm,n an R-module. Of particular importance is the case Rⁿ = R ⊕ ·· ⊕ R = Rn,1 (see page 53).

Homomorphisms on Rⁿ  Define ei ∈ Rⁿ to be the column vector with 1 in place i and 0 elsewhere. Note that any column vector (r1, ..., rn)ᵗ can be written uniquely as e1r1 + ·· + enrn. The sequence {e1, ..., en} is called the canonical free basis or standard basis for Rⁿ. We now consider the case where the domain is Rⁿ.

Theorem  Suppose M = MR and f, g : Rⁿ → M are homomorphisms with f(ei) = g(ei) for 1 ≤ i ≤ n. Then f = g. Any R-module homomorphism from Rⁿ to M is determined by its values on the basis.

Proof  The proof is straightforward.

Theorem  Given m1, m2, ..., mn ∈ M, ∃! homomorphism h : Rⁿ → M with h(ei) = mi for 1 ≤ i ≤ n. The homomorphism h is defined by h(e1r1 + ·· + enrn) = m1r1 + ·· + mnrn. Any function from the basis to M extends uniquely to a homomorphism from Rⁿ to M.

This theorem reveals some of the great simplicity of linear algebra. It does not matter how complicated the ring R is, or which R-module M is selected: the theorem gives a bijection from HomR(Rⁿ, M) to Mⁿ = M × M × ·· × M, and this bijection is a group isomorphism. We will see later that the product Mⁿ is an R-module with scalar multiplication defined by (m1, m2, ..., mn)r = (m1r, m2r, ..., mnr). If R is commutative, so that HomR(Rⁿ, M) is an R-module, this theorem gives an R-module isomorphism from HomR(Rⁿ, M) to Mⁿ.

Now let's examine the special case M = Rᵐ and show HomR(Rⁿ, Rᵐ) ≈ Rm,n.

Theorem  Suppose A = (ai,j) ∈ Rm,n. Then f : Rⁿ → Rᵐ defined by f(B) = AB is a homomorphism with f(ei) = column i of A. Conversely, if v1, ..., vn ∈ Rᵐ, define A ∈ Rm,n to be the matrix with column i = vi. Then f defined by f(B) = AB is the unique homomorphism from Rⁿ to Rᵐ with f(ei) = vi.

Thus HomR(Rⁿ, Rᵐ) and Rm,n are isomorphic as additive groups, and if R is commutative, they are isomorphic as R-modules. Matrices over R give R-module homomorphisms, and every homomorphism from Rⁿ to Rᵐ is given by a matrix! Furthermore, addition of matrices corresponds to addition of homomorphisms, and multiplication of matrices corresponds to composition of homomorphisms. These properties are made explicit in the next two theorems.

Theorem  If f, g : Rⁿ → Rᵐ are given by matrices A, C ∈ Rm,n, then f + g is given by the matrix A + C.

Theorem  If f : Rⁿ → Rᵐ is the homomorphism given by A ∈ Rm,n, and g : Rᵐ → Rᵖ is the homomorphism given by C ∈ Rp,m, then g ◦ f : Rⁿ → Rᵖ is given by CA ∈ Rp,n.

Proof  This is just the associative law of matrix multiplication, C(AB) = (CA)B.

In the case where the domain and range are the same, we have the following elegant corollary.

Corollary  HomR(Rⁿ, Rⁿ) and Rn are isomorphic as rings. The automorphisms correspond to the invertible matrices. This corollary shows one way noncommutative rings arise, namely as endomorphism rings. Even if R is commutative, Rn is never commutative unless n = 1.
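The correspondence between homomorphisms and matrices can be checked on a small example. The following Python sketch is illustrative only: a matrix A acts on column vectors by B ↦ AB, and the matrix of the composition g ◦ f is the product CA.

```python
def mat_vec(A, v):
    # the homomorphism f : R^n -> R^m given by A sends v to Av
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def mat_mul(C, A):
    return [[sum(C[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(C))]

A = [[1, 2, 0], [0, 1, 1]]        # f : R^3 -> R^2
C = [[1, 1], [2, 0], [0, 3]]      # g : R^2 -> R^3
v = [5, -1, 2]
# (g o f)(v) computed two ways: composing the maps, or applying the matrix CA
assert mat_vec(C, mat_vec(A, v)) == mat_vec(mat_mul(C, A), v)
```

This is the associative law C(Av) = (CA)v, which is exactly the content of the composition theorem.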
The previous theorem reveals where matrix multiplication comes from: it is the matrix which represents the composition of the functions. Even though this follows easily from the previous theorem and properties of matrices, it is one of the great classical facts of linear algebra. We now return to the general theory of modules (over some given ring R).
Cosets and Quotient Modules

After seeing quotient groups and quotient rings, quotient modules go through without a hitch. As before, R is a ring and module means R-module.

Theorem  Suppose M is a module and N ⊂ M is a submodule. Since N is a normal subgroup of M, the additive abelian quotient group M/N is defined. Scalar multiplication defined by (a + N)r = (ar + N) is well defined and gives M/N the structure of an R-module. The natural projection π : M → M/N is a surjective homomorphism with kernel N. Furthermore, if f : M → M̄ is a surjective homomorphism with ker(f) = N, then M/N ≈ M̄ (see below).

Proof  On the group level, this is all known from Chapter 2 (see pages 27 and 29). It is only necessary to check the scalar multiplication, which is obvious.
The relationship between quotients and homomorphisms for modules is the same as for groups and rings, as shown by the next theorem.

Theorem  Suppose f : M → M̄ is a homomorphism and N is a submodule of M. If N ⊂ ker(f), then f̄ : (M/N) → M̄ defined by f̄(a + N) = f(a) is a well defined homomorphism satisfying f = f̄ ◦ π, where π : M → M/N is the natural projection. Thus defining a homomorphism on a quotient module is the same as defining a homomorphism on the numerator that sends the denominator to 0. The image of f̄ is the image of f, and the kernel of f̄ is ker(f)/N. Thus if N = ker(f), f̄ is injective, and thus (M/N) ≈ image(f).

Proof  On the group level this is all known from Chapter 2 (see page 29). It is only necessary to check that f̄ is a module homomorphism, and this is immediate.

Therefore, for any homomorphism f, (domain(f)/ker(f)) ≈ image(f).

Theorem  Suppose M is an R-module and K and L are submodules of M.
i)  The natural homomorphism K → (K + L)/L is surjective with kernel K ∩ L. Thus (K/K ∩ L) → (K + L)/L is an isomorphism.
ii)  Suppose K ⊂ L. The natural homomorphism M/K → M/L is surjective with kernel L/K. Thus (M/K)/(L/K) → M/L is an isomorphism.

Examples  These two examples are for the case R = Z.
1)  M = Z,  K = 3Z,  L = 5Z,  K ∩ L = 15Z,  K + L = Z.
    K/(K ∩ L) = 3Z/15Z ≈ Z/5Z = (K + L)/L.
2)  M = Z,  K = 6Z,  L = 3Z  (K ⊂ L).
    (M/K)/(L/K) = (Z/6Z)/(3Z/6Z) ≈ Z/3Z = M/L.
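Example 1) can be made concrete with a few lines of Python; this sketch is illustrative only, not part of the text. The map k ↦ k mod 5 restricted to K = 3Z is surjective onto Z/5Z (since K + L = Z), and its kernel inside K is K ∩ L = 15Z.

```python
# K = 3Z, L = 5Z inside M = Z. The map k -> k mod 5 sends K onto Z/5Z,
# and its kernel inside K is K ∩ L = 15Z, so 3Z/15Z ≈ Z/5Z.
K_reps = [3 * i for i in range(5)]            # 0, 3, 6, 9, 12: coset reps of 15Z in 3Z
images = sorted(k % 5 for k in K_reps)
assert images == [0, 1, 2, 3, 4]              # K/(K ∩ L) hits every class of Z/5Z
kernel = [k for k in range(0, 45, 3) if k % 5 == 0]
assert kernel == [0, 15, 30]                  # the multiples of 15, i.e. K ∩ L
```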
Products and Coproducts

Infinite products work fine for modules, just as they do for groups and rings. This is stated below in full generality, although the student should think of the finite case.

Theorem  Suppose T is an index set and for each t ∈ T, Mt is an R-module. On the additive abelian group Π (t∈T) Mt, define scalar multiplication by {mt}r = {mtr}. Then Π Mt is an R-module and, for each s ∈ T, the natural projection πs : Π Mt → Ms is a homomorphism. Under the natural 1-1 correspondence from {functions f : M → Π Mt} to {sequences of functions {ft}t∈T where ft : M → Mt}, f is a module homomorphism iff each ft is a module homomorphism.

Proof  We already know from Chapter 2 that f is a group homomorphism iff each ft is a group homomorphism. Since scalar multiplication is defined coordinatewise, f is a module homomorphism iff each ft is a module homomorphism.

In the finite case something important holds for modules that does not hold for nonabelian groups or rings, namely, the finite product is also a coproduct: M1 × M2 × ·· × Mn = M1 ⊕ M2 ⊕ ·· ⊕ Mn. For the finite case we may use either the product or sum notation. This makes the structure of module homomorphisms much more simple.

For each s ∈ T, the inclusion homomorphism is : Ms → ⊕Mt is defined by is(a) = {at}, where at = 0 if t ≠ s and as = a. Thus each Ms may be considered to be a submodule of ⊕Mt.

Theorem  There is a 1-1 correspondence from {homomorphisms g : ⊕Mt → M} and {sequences of homomorphisms {gt}t∈T where gt : Mt → M}. Given g, gt is defined by gt = g ◦ it. Given {gt}, g is defined by g({mt}) = Σt gt(mt). Since there are only a finite number of nonzero terms, this sum is well defined.

(For T = {1, 2}, the original text displays these universal properties as two commutative diagrams: the product diagram, with projections π1, π2 : M1 ⊕ M2 → Mi and maps f1, f2 : M → Mi inducing f : M → M1 ⊕ M2, and the sum diagram, with inclusions i1, i2 : Mi → M1 ⊕ M2 and maps g1, g2 : Mi → M inducing g : M1 ⊕ M2 → M.)
Definition  If T is finite, the coproduct or sum Σ (t∈T) Mt = ⊕ (t∈T) Mt is the same module as the product. If T is infinite, the coproduct ⊕Mt is the submodule of Π Mt consisting of all sequences {mt} with only a finite number of nonzero terms.

In the above theorems, the 1-1 correspondences actually produce group isomorphisms, and if R is commutative, they give isomorphisms of R-modules.

Theorem  For finite T, HomR(M, M1 ⊕ ·· ⊕ Mn) ≈ HomR(M, M1) ⊕ ·· ⊕ HomR(M, Mn) and HomR(M1 ⊕ ·· ⊕ Mn, M) ≈ HomR(M1, M) ⊕ ·· ⊕ HomR(Mn, M). If R is commutative, so that the objects are R-modules and not merely additive groups, then the isomorphisms are module isomorphisms.

Proof  Let's look at this theorem for products with n = 2. All it says is that if f = (f1, f2) and h = (h1, h2), then f + h = (f1 + h1, f2 + h2). If R is commutative, this says merely that fr = (f1, f2)r = (f1r, f2r).
Exercise  Suppose M and N are R-modules. Show that M ⊕ N is isomorphic to N ⊕ M. Now suppose A ⊂ M and B ⊂ N are submodules, and show (M ⊕ N)/(A ⊕ B) is isomorphic to (M/A) ⊕ (N/B). In particular, if a ∈ R and b ∈ R, then (R ⊕ R)/(aR ⊕ bR) is isomorphic to (R/aR) ⊕ (R/bR). For example, the abelian group (Z ⊕ Z)/(2Z ⊕ 3Z) is isomorphic to Z2 ⊕ Z3. These isomorphisms are transparent and are used routinely in algebra without comment (see Th 4, page 118).

Exercise  Suppose R is a commutative ring, M is an R-module, and n ≥ 1. Define a function α : HomR(Rⁿ, M) → Mⁿ which is an R-module isomorphism.

Summands  One basic question in algebra is "When does a module split as the sum of two modules?". Before defining summand, here are two theorems for background.

Theorem  Consider M1 = M1 ⊕ 0 as a submodule of M1 ⊕ M2. Then the projection map π2 : M1 ⊕ M2 → M2 is a surjective homomorphism with kernel M1. Thus (M1 ⊕ M2)/M1 is isomorphic to M2. (See page 35 for the group version.)

This is exactly what you would expect, and the next theorem is almost as intuitive.

Theorem  Suppose K and L are submodules of M, and f : K ⊕ L → M is the natural homomorphism, f(k, l) = k + l. Then the image of f is K + L, and the kernel of f is {(a, −a) : a ∈ K ∩ L}. Thus f is an isomorphism iff K + L = M and K ∩ L = 0.

Definition  Suppose K is a submodule of M. The statement that K is a summand of M means ∃ a submodule L of M with K ⊕ L = M. According to the previous theorem, this is the same as saying there exists a submodule L with K + L = M and K ∩ L = 0. In this case we write K ⊕ L = M. This abuse of notation allows us to avoid talking about "internal" and "external" direct sums. Of course, M and 0 are always summands of M. If such an L exists, it need not be unique, but it will be unique up to isomorphism, because L ≈ M/K. In general, deciding whether a submodule is a summand is not an easy question with the material at hand; later on, it will be easy.

Exercise  Suppose M is a module and K = {(m, m) : m ∈ M} ⊂ M ⊕ M. Show K is a submodule of M ⊕ M which is a summand.

Exercise  R is a module over Q, and Q ⊂ R is a submodule. Is Q a summand of R?

Exercise  Answer the following questions about abelian groups, i.e., Z-modules. (See the third exercise on page 35.)
1)  Is 2Z a summand of Z?
2)  Is 2Z4 a summand of Z4?
3)  Is 3Z12 a summand of Z12?
4)  Suppose m, n > 1. When is nZmn a summand of Zmn?

Exercise  If T is a ring, define the center of T to be the subring {t : ts = st for all s ∈ T}. Let R be a commutative ring and T = Rn. There is an exercise on page 57 to show that the center of T is the subring of scalar matrices. Show Rⁿ is a left T-module, and find HomT(Rⁿ, Rⁿ).

Independence, Generating Sets, and Free Basis

This section is a generalization and abstraction of the brief section Homomorphisms on Rⁿ. These concepts work fine for an infinite index set T because linear combination means finite linear combination. However, to avoid dizziness, the student should first consider the case where T is finite.

Definition  Suppose M is an R-module, T is an index set, and for each t ∈ T, st ∈ M. Let S be the sequence {st}t∈T = {st}. The statement that S is dependent means ∃ a finite number of distinct elements t1, ..., tn in T, and elements r1, ..., rn in R, not all zero, such that the linear combination st1r1 + ·· + stnrn = 0. Otherwise, S is independent. Note that if some st = 0, then S is dependent. Also, if ∃ distinct elements t1 and t2 in T with st1 = st2, then S is dependent. Let SR be the set of all linear combinations st1r1 + ·· + stnrn. SR is a submodule of M called the submodule generated by S. If S is independent and generates M, then S is said to be a basis or free basis for M. In this case any v ∈ M can be written uniquely as a linear combination of elements in S. An R-module M is said to be a free R-module if it is zero or if it has a basis. The next two theorems are obvious, except for the confusing notation.

Theorem  For each t ∈ T, let Rt = RR, and for each c ∈ T, let ec ∈ ⊕Rt be ec = {rt}, where rc = 1 and rt = 0 if t ≠ c. Then {ec}c∈T is a basis for ⊕Rt, called the canonical basis or standard basis. You might try first the case T = {1, 2, ..., n} and ⊕Rt = Rⁿ (see page 72).

Characterization of Free Modules  Any free R-module is isomorphic to one of the canonical free R-modules ⊕Rt.

Theorem  A nonzero R-module M is free iff ∃ an index set T such that M ≈ ⊕ (t∈T) Rt. In particular, M has a finite free basis of n elements iff M ≈ Rⁿ.

Proof  If M is isomorphic to ⊕Rt, then M is certainly free. So now suppose M has a free basis {st}. Then the homomorphism f : M → ⊕Rt with f(st) = et sends the basis for M to the canonical basis for ⊕Rt. Since f sends a basis to a basis, f is an isomorphism.
Exercise Let (A1 . Then the homomorphism f : M → ⊕Rt with f (st ) = et sends the basis for M to the canonical basis for ⊕Rt . Given f . . Recall that we have already had the preceding theorem in the case S is the canonical basis for M = Rn (p 72). In other words. Given g. By 3) in the preceding theorem.Chapter 5
Linear Algebra
79
Theorem Suppose N is an Rmodule and M is a free Rmodule with a basis {st }. f is completely determined by what it does on the basis S. then f = h iﬀ f  S = h  S. Characterization of Free Modules Any free Rmodule is isomorphic to one of the canonical free Rmodules ⊕Rt .. and f : M → N is a homomorphism. f is an isomorphism. So now suppose M has a free basis {st }. This is just an observation. deﬁne f by f (st1 r1 + · · +stn rn ) = g(st1 )r1 + · · +g(stn )rn . Show this sequence is linearly independent over Z iﬀ it is linearly independent over Q. Is it true the sequence is linearly independent over Z iﬀ it is linearly independent over R? This question is diﬃcult until we learn more linear algebra. In particular. f (S) is independent in N iﬀ f is injective. Let f (S) be the sequence {f (st )} in N . The next theorem is so basic in linear algebra that it is used without comment. Theorem Suppose N is a module. it should be worked carefully..
. M is a free module with basis S = {st }.
Find a nontrivial linear combination of the columns of A which is 0. Also ﬁnd a nonzero element of kernel(f ). i.80
Linear Algebra
Chapter 5
Exercise Suppose R is a commutative ring. f (B) = AB.. That is. vn ∈ Rm be the columns of A.. linear independence. if f : R → R is a surjective Rmodule homomorphism. although it is usually stated only for the case where R is a ﬁeld. f (ei ) = vi b1 c1 = column i of A. This is a key theorem in linear algebra. .e. show A is invertible.
Relating these concepts to matrices The theorem stated below gives a summary of results we have already had. en } is a free basis for Rn ... Show f is an isomorphism. ¯ Exercise Let R = Z. A = If R is a commutative ring. Use the fact that {e1 . The next exercise is routine. bn cm
. you can relate properties of R as an Rmodule to properties of R as a ring. A ∈ Rn . and the homomorphism f : Rn → Rn deﬁned by f (B) = AB is surjective.. represent an element of Rn and C = . . then f is an isomorphism. . 2 1 0 and f: Z3 → Z2 be the group homo3 2 −5 morphism deﬁned by A. Exercise 1) 2) Suppose R is a commutative ring and v ∈ R. Let B = . injective homomorphisms. v = 0.e. It shows that certain concepts about matrices.Let v1 . and solutions of equations.n and f : Rn → Rm is the homomorphism associated with A. but still informative. Suppose A ∈ Rm.e. are all the same — they are merely stated in diﬀerent language.
Note that 2) here is essentially the ﬁrst exercise for the case n = 1... ¯ v is independent iﬀ v is v is a basis for R iﬀ v generates R iﬀ v is
. i. i.
... vn } is a basis for Rm iﬀ f is an isomorphism iﬀ (for any C ∈ Rm . At is invertible.
2)
3)
4)
Relating these concepts to square matrices We now look at the preceding theorem in the special case where n = m and R is a commutative ring. vn } is a basis for Rn . . vn } is independent iﬀ f is injective iﬀ AX = 0 has a unique ¯ solution iﬀ (∃ C ∈ Rm such that AX = C has a unique solution). {v1 .. Theorem 1) The element f (B) is a linear combination of the columns of A. .. ...e.. i. {v1 . Let v1 .. injective implies surjective.
.) {v1 .. .n be the rows of A.e. . {w1 .. AX = C has a solution). vn } generates Rm iﬀ f is surjective iﬀ (for any C ∈ Rm . Then the following are equivalent. .. {v1 . and f : Rn → Rn is deﬁned by f (B) = AB. then f is injective. namely that if f : Rn → Rn is surjective. i. Thus the image of f is generated by the columns of A. A ∈ Rn . (See bottom of page 89. . Now we prove something more substantial. AX = C has a unique solution)..  A  is a unit in R.. . vn } generates Rn .. 1) 2) 3) 4) 5) 2t ) 3t ) f is an automorphism. {v1 . vn ∈ Rn be the columns of A.Chapter 5
Linear Algebra
81
represent an element of Rm . Later on we will prove that if R is a ﬁeld. Theorem Suppose R is a commutative ring. A is invertible.  At  is a unit in R. and w1 . that is f (B) = f (e1 b1 + · · +en bn ) = v1 b1 + · · +vn bn .. So far in this chapter we have just been cataloging. wn ∈ Rn = R1. wn } is a basis for Rn . f is surjective...
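The difference 2) and 3) make between a field and R = Z can be seen numerically. The sketch below (Python with exact rational arithmetic, not part of the text) solves AX = C over Q by Cramer's rule and then checks whether the solution is integral; both matrices are made-up examples.

```python
from fractions import Fraction

def solve2(A, C):
    """Solve the 2x2 system A X = C over Q by Cramer's rule; None if det = 0."""
    det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
    if det == 0:
        return None
    x = Fraction(C[0]*A[1][1] - A[0][1]*C[1], det)
    y = Fraction(A[0][0]*C[1] - A[1][0]*C[0], det)
    return (x, y)

# det = 1, so A is invertible over Z: every AX = C is solvable in Z^2.
A = [[2, 1], [3, 2]]
# det = 2: f(B) = AB is injective on Z^2 but not surjective.
B = [[2, 0], [0, 1]]

assert all(t.denominator == 1 for t in solve2(A, (5, 7)))
x, y = solve2(B, (1, 0))
assert x.denominator != 1   # BX = (1,0)^t has no solution over Z
```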
Relating these concepts to square matrices

We now look at the preceding theorem in the special case where n = m and R is a commutative ring. So far in this chapter we have just been cataloging. Now we prove something more substantial, namely that if f : R^n → R^n is surjective, then f is injective. Later on we will prove that if R is a field, injective implies surjective. (See the bottom of page 89.)

Theorem  Suppose R is a commutative ring, A ∈ R_n, and f : R^n → R^n is defined by f(B) = AB. Let v_1, .., v_n ∈ R^n be the columns of A, and let w_1, .., w_n ∈ R^n = R_{1,n} be the rows of A. Then the following are equivalent.
1)  f is an automorphism.
2)  A is invertible, i.e., |A| is a unit in R.
3)  {v_1, .., v_n} is a basis for R^n.
4)  {v_1, .., v_n} generates R^n.
5)  f is surjective.
2t)  A^t is invertible, i.e., |A^t| is a unit in R.
3t)  {w_1, .., w_n} is a basis for R^n.
4t)  {w_1, .., w_n} generates R^n.

Proof  Suppose 5) is true and show 2). Since f is onto, ∃ u_1, .., u_n ∈ R^n with f(u_i) = e_i. Let g : R^n → R^n be the homomorphism satisfying g(e_i) = u_i. Then f ∘ g is the identity. Now g comes from some matrix D, and thus AD = I. This shows that A has a right inverse and is thus invertible. Recall that the proof of this fact uses determinant, which requires that R be commutative (see the exercise on page 64). We already know the first three properties are equivalent, 3) implies 4), and 4) and 5) are equivalent. Thus the first five are equivalent. Since |A| = |A^t|, 2) and 2t) are equivalent. Furthermore, applying this result to A^t shows that the last three properties are equivalent to each other.

Uniqueness of Dimension

There exists a ring R with R^2 ≈ R^3 as R-modules, but this is of little interest. First we make a convention.

Convention  For the remainder of this chapter, R will be a commutative ring.

Theorem  If f : R^m → R^n is a surjective R-module homomorphism, then m ≥ n.

Proof  Suppose k = n − m is positive. Define h : (R^m ⊕ R^k = R^n) → R^n by h(u, v) = f(u). Then h is a surjective homomorphism, and by the previous theorem, also injective. But this is impossible, because h sends each nonzero element (0, v) to 0. This is a contradiction, and thus m ≥ n.

Corollary  If f : R^m → R^n is an isomorphism, then m = n.

Proof  Each of f and f^(-1) is surjective, so m ≥ n and n ≥ m by the previous theorem.

Corollary  If {v_1, .., v_m} generates R^n, then m ≥ n.

Proof  The hypothesis implies there is a surjective homomorphism R^m → R^n.

Lemma  Suppose M is a f.g. module (i.e., a finitely generated R-module). Then if M has a basis, that basis is finite.

Proof  Suppose U ⊂ M is a finite generating set and S is a basis. Then any element of U is a finite linear combination of elements of S, so some finite subsequence of S generates U and thus all of M. Since S is independent, no element of S is a linear combination of the other elements, and thus S is finite.

Theorem  Suppose M is a f.g. module. Then if M has a basis, that basis is finite, and any other basis has the same number of elements. (By convention, 0 is a free module of dimension 0.)

Proof  By the previous lemma, any basis for M must be finite. By the previous section, M has a basis of n elements iff M ≈ R^n. The result follows because R^n ≈ R^m iff n = m.

Definition  This number is denoted by dim(M), the dimension of M.
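The inequality m ≥ n can be observed computationally over the field Q, a special case of the commutative rings treated here. The sketch below (illustrative, not from the text) row-reduces with exact rational arithmetic: two vectors never generate Q^3, while three independent ones do.

```python
from fractions import Fraction

def span_dim(vectors):
    """Dimension of the span of the given vectors over Q (Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rank, col = 0, 0
    while rank < len(rows) and col < len(rows[0]):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                f = rows[r][col] / rows[rank][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
        col += 1
    return rank

# Two vectors span at most a 2-dimensional submodule of Q^3.
assert span_dim([(1, 2, 3), (4, 5, 6)]) == 2
# Three independent vectors generate Q^3.
assert span_dim([(1, 0, 0), (1, 1, 0), (1, 1, 1)]) == 3
```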
Change of Basis

Before changing basis, we recall what a basis is. Previously we defined generating, independence, and basis for sequences, not for sets or collections. In a set or collection, there is no concept of repetition. For the concept of generating it matters not whether you use sequences or collections, but for independence and basis, sequences must be used in order to make sense. Consider the columns of the real matrix

        A = | 2  3  2 |
            | 1  4  1 | .

If we consider the column vectors of A as a collection, there are only two of them, yet we certainly don't wish to say the columns of A form a basis for R^2. We must consider the columns of A as an ordered triple of vectors, and this sequence is dependent.

Two sequences cannot begin to be equal unless they have the same index set. Here we follow the classical convention that an index set with n elements will be {1, 2, .., n}, and thus a basis for M with n elements is a sequence S = {u_1, .., u_n}, or if you wish, S = (u_1, .., u_n) ∈ M^n.

Suppose M is an R-module with a basis of n elements. Recall there is a bijection α : Hom_R(R^n, M) → M^n defined by α(h) = (h(e_1), .., h(e_n)). Now h : R^n → M is an isomorphism iff α(h) is a basis for M.

Summary  The point of all this is that selecting a basis of n elements for M is the same as selecting an isomorphism from R^n to M, and from this viewpoint, change of basis can be displayed by the diagram below.

Now suppose M is a f.g. free module and f : M → M is a homomorphism. Endomorphisms on R^n are represented by square matrices, and thus have a determinant and trace. In order to represent f by a matrix, we must select a basis for M (i.e., an isomorphism with R^n). We will show that this matrix is well defined up to similarity, and thus the determinant, trace, and characteristic polynomial of f are well-defined.

(Diagram: two stacked commutative squares. The top row is B : R^n → R^n, the middle row is f : M → M, and the bottom row is A : R^n → R^n. The vertical maps are the isomorphisms R^n → M determined by e_i → v_i and by e_i → u_i, together with the isomorphism C : R^n → R^n connecting the two copies of R^n.)

The diagram also explains what it means for A to be the matrix of f w.r.t. the basis S. Let h : R^n → M be the isomorphism with h(e_i) = u_i for 1 ≤ i ≤ n. Then the matrix A ∈ R_n is the one determined by the endomorphism h^(-1) ∘ f ∘ h : R^n → R^n. In other words, column i of A is h^(-1)(f(h(e_i))).

Definition  Suppose M is a free module, S = {u_1, .., u_n} is a basis for M, and f : M → M is a homomorphism. The matrix A = (a_{i,j}) ∈ R_n of f w.r.t. the basis S is defined by f(u_i) = u_1 a_{1,i} + ·· + u_n a_{n,i}. (Note that if M = R^n and u_i = e_i, A is the usual matrix associated with f.)

Theorem  Suppose T = {v_1, .., v_n} is another basis for M and B ∈ R_n is the matrix of f w.r.t. T. Define C = (c_{i,j}) ∈ R_n by v_i = u_1 c_{1,i} + ·· + u_n c_{n,i}. Then C is invertible and B = C^(-1)AC. Conversely, suppose C = (c_{i,j}) ∈ R_n is invertible. Define T = {v_1, .., v_n} by v_i = u_1 c_{1,i} + ·· + u_n c_{n,i}. Then T is a basis for M, and the matrix of f w.r.t. T is B = C^(-1)AC. In other words, conjugation of matrices corresponds to change of basis.

Proof  The proof follows by seeing that the diagram displayed above is commutative.

An important special case is where M = R^n and f : R^n → R^n is given by some matrix W. Then h is given by the matrix U whose i-th column is u_i, and A = U^(-1)WU. In other words, W represents f w.r.t. the standard basis, and U^(-1)WU represents f w.r.t. the basis {u_1, .., u_n}.

Definition  Suppose M is a f.g. free module and f : M → M is a homomorphism. Define |f| to be |A|, trace(f) to be trace(A), and CP_f(x) to be CP_A(x), where A is the matrix of f w.r.t. some basis. By the previous theorem, any other matrix B representing f satisfies B = C^(-1)AC, so |A| = |B|, trace(A) = trace(B), and A and B have the same characteristic polynomial (see page 66 of chapter 4). Thus all three are well-defined, i.e., they do not depend upon the choice of basis.
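The recipe A = U^(-1)WU can be checked with exact rational arithmetic. The matrix W and the basis below are illustrative choices, not taken from the text; the assertions confirm only the basis-independence of trace and determinant claimed in the definition above.

```python
from fractions import Fraction

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inv2(X):
    """Inverse of an integer 2x2 matrix, with Fraction entries."""
    det = X[0][0]*X[1][1] - X[0][1]*X[1][0]
    return [[Fraction(X[1][1], det), Fraction(-X[0][1], det)],
            [Fraction(-X[1][0], det), Fraction(X[0][0], det)]]

W = [[3, 3], [0, -1]]           # f w.r.t. the standard basis
U = [[2, 3], [1, 1]]            # columns u_1 = (2,1)^t, u_2 = (3,1)^t form a basis
A = mat_mul(mat_mul(inv2(U), W), U)   # f w.r.t. the basis {u_1, u_2}

# similarity preserves trace and determinant
assert A[0][0] + A[1][1] == W[0][0] + W[1][1]
assert A[0][0]*A[1][1] - A[0][1]*A[1][0] == W[0][0]*W[1][1] - W[0][1]*W[1][0]
```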
Exercise  Let R = Z and let f : Z^2 → Z^2 be defined by

        f(D) = | 3   3 |
               | 0  -1 |  D .

Find the matrix of f w.r.t. the basis {(2, 1)^t, (3, 1)^t}.

Exercise  Let L ⊂ R^2 be the line L = {(r, 2r)^t : r ∈ R}. Show there is one and only one homomorphism f : R^2 → R^2 which is the identity on L and has f((-1, 1)^t) = (1, -1)^t. Find the matrix A ∈ R_2 which represents f with respect to the basis {(1, 2)^t, (-1, 1)^t}. Also find the matrix B ∈ R_2 which represents f with respect to the standard basis. Find the determinant, trace, and characteristic polynomial of f. Finally, by the previous theorem, find an invertible matrix C ∈ R_2 with B = C^(-1)AC.

Vector Spaces

So far in this chapter we have been developing the theory of linear algebra in general. The previous theorem, for example, holds for any commutative ring R, but it must be assumed that the module M is free. Endomorphisms in general will not have a determinant, trace, or characteristic polynomial. We now focus on the case where R is a field F. In this case every F-module is free, so any finitely generated F-module will have a well-defined dimension, and endomorphisms on it will have well-defined determinant, trace, and characteristic polynomial. F-modules may also be called vector spaces, and F-module homomorphisms may also be called linear transformations.

Theorem  Suppose M is an F-module and v ∈ M. Then v ≠ 0 iff v is independent. That is, vr = 0 implies v = 0 in M or r = 0 in F.

Proof  Suppose vr = 0 and r ≠ 0. Then 0 = (vr)r^(-1) = v1 = v.
Theorem  Suppose M ≠ 0 is an F-module and v ∈ M. Then v generates M iff v is a basis for M. Furthermore, if these conditions hold, then M ≈ F_F, any nonzero element of M is a basis, and any two elements of M are dependent.

Proof  Suppose v generates M. Then v ≠ 0 and is thus independent by the previous theorem, so v is a basis and M ≈ F. Since any nonzero element of F is a basis for F_F, and any two elements of F are dependent, the same holds for M.

Theorem  Suppose M ≠ 0 is a finitely generated F-module. If S = {v_1, .., v_m} generates M, then any maximal independent subsequence of S is a basis for M. Thus any finite independent sequence can be extended to a basis. In particular, M has a finite free basis, and thus is a free F-module.

Proof  Suppose, for notational convenience, that {v_1, .., v_n} is a maximal independent subsequence of S, and n < i ≤ m. It must be shown that v_i is a linear combination of {v_1, .., v_n}. Since {v_1, .., v_n, v_i} is dependent, ∃ r_1, .., r_n, r_i, not all zero, such that v_1 r_1 + ·· + v_n r_n + v_i r_i = 0. Then r_i ≠ 0 and v_i = -(v_1 r_1 + ·· + v_n r_n)r_i^(-1). Thus {v_1, .., v_n} generates S and thus all of M. Now suppose T is a finite independent sequence. T may be extended to a finite generating sequence, and inside that sequence it may be extended to a maximal independent sequence. Thus T extends to a basis.

After so many routine theorems, it is nice to have one with real power. This is one of the theorems that makes linear algebra tick. It not only says any finite independent sequence can be extended to a basis, but that it can be extended to a basis inside any finite generating set containing it. The key hypothesis here is that the ring is a field. If R = Z, then Z is a free module over itself, and the element 2 of Z is independent. However it certainly cannot be extended to a basis. Also the finiteness hypothesis in this theorem is only for convenience, as will be seen momentarily.

Since F is a commutative ring, any two bases of M must have the same number of elements, and thus the dimension of M is well defined (see the theorem on page 83).

The next theorem is just a collection of observations.

Theorem  Suppose M and N are finitely generated F-modules.
1)  M ≈ F^n iff dim(M) = n.
2)  M ≈ N iff dim(M) = dim(N).
3)  F^m ≈ F^n iff n = m.
4)  dim(M ⊕ N) = dim(M) + dim(N).

Theorem  Suppose M is an F-module of dimension n, and {v_1, .., v_m} is an independent sequence in M. Then m ≤ n, and if m = n, {v_1, .., v_m} is a basis.

Proof  {v_1, .., v_m} extends to a basis with n elements.
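The proof above is effectively an algorithm: scan the generating sequence and keep each vector that remains independent of those already kept. A sketch over Q (illustrative, not from the text), with independence tested by row reduction:

```python
from fractions import Fraction

def independent(vectors):
    """True iff the vectors are linearly independent over Q (rank = count)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                f = rows[r][col] / rows[rank][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank == len(rows)

def extract_basis(generating):
    """Greedily keep each vector that stays independent of those already kept."""
    kept = []
    for v in generating:
        if independent(kept + [v]):
            kept.append(v)
    return kept

S = [(1, 1, 1), (2, 2, 2), (1, 2, 3), (3, 4, 5), (0, 0, 1)]
assert extract_basis(S) == [(1, 1, 1), (1, 2, 3), (0, 0, 1)]
```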
Here is the basic theorem for vector spaces in full generality.

Theorem  Suppose M ≠ 0 is an F-module and S = {v_t}_{t∈T} generates M.
1)  Any maximal independent subsequence of S is a basis for M.
2)  Any independent subsequence of S may be extended to a maximal independent subsequence of S, and thus to a basis for M.
3)  Any independent subsequence of M can be extended to a basis for M. In particular, M has a free basis, and thus is a free F-module.

Proof  The proof of 1) is the same as in the case where S is finite. Part 2) will follow from the Hausdorff Maximality Principle. An independent subsequence of S is contained in a maximal monotonic tower of independent subsequences. The union of these independent subsequences is still independent, and so the result follows. Part 3) follows from 2) because an independent sequence can always be extended to a generating sequence.

Theorem  Suppose M is an F-module and K ⊂ M is a submodule.
1)  K is a summand of M, i.e., ∃ a submodule L of M with K ⊕ L = M.
2)  If M is f.g., then dim(K) ≤ dim(M), and K = M iff dim(K) = dim(M).

Proof  Let T be a basis for K. Extend T to a basis S for M. Then S − T generates a submodule L with K ⊕ L = M. Part 2) follows from 1).

Corollary  Q is a summand of R. That is, R is a Q-module, Q is a submodule of R, and ∃ a Q-submodule V ⊂ R with Q ⊕ V = R as Q-modules. (See the exercise on page 77.)

Proof  Q is a field, R is a Q-module, and Q is a submodule of R.

Corollary  Suppose M is a f.g. F-module, N is an F-module, and f : M → N is a homomorphism. Then dim(M) = dim(ker(f)) + dim(image(f)).

Proof  Let K = ker(f) and let L ⊂ M be a submodule with K ⊕ L = M. Then f|L : L → image(f) is an isomorphism.

Exercise  Suppose R is a domain with the property that, for R-modules, every submodule is a summand. Show R is a field.

Exercise  Find a free Z-module which has a generating set containing no basis.

Exercise  The real vector space R^2 is generated by the sequence S = {(π, 0), (2, 1), (1, 2)}. Show there are three maximal independent subsequences of S, and each is a basis for R^2. (Row vectors are used here just for convenience.)

Exercise  The real vector space R^3 is generated by S = {(1, 1, 1), (1, 2, 3), (3, 4, 5), (1, 2, 0)}. Show there are three maximal independent subsequences of S, and each is a basis for R^3. You may use determinant.
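For the plane exercise above (with the sequence as reconstructed here), the count of maximal independent subsequences can be verified by brute force, since independence of a pair of vectors in R^2 amounts to a nonzero 2 × 2 determinant and no triple in R^2 can be independent:

```python
from itertools import combinations
from math import pi

def det2(u, v):
    return u[0]*v[1] - u[1]*v[0]

S = [(pi, 0), (2, 1), (1, 2)]

# Every pair is independent, so the maximal independent subsequences
# are exactly the three pairs, each a basis for R^2.
pairs = [p for p in combinations(S, 2) if det2(*p) != 0]
assert len(pairs) == 3
```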
Square matrices over fields

This theorem is just a summary of what we have for square matrices over fields.

Theorem  Suppose A ∈ F_n and f : F^n → F^n is defined by f(B) = AB. Let v_1, .., v_n ∈ F^n be the columns of A, and let w_1, .., w_n ∈ F^n = F_{1,n} be the rows of A. Then the following are equivalent.
1)  {v_1, .., v_n} is independent, i.e., f is injective.
2)  {v_1, .., v_n} generates F^n, i.e., f is surjective.
3)  {v_1, .., v_n} is a basis for F^n, i.e., f is an automorphism, i.e., A is invertible, i.e., |A| ≠ 0.
1t)  {w_1, .., w_n} is independent.
2t)  {w_1, .., w_n} generates F^n.
3t)  {w_1, .., w_n} is a basis for F^n, i.e., A^t is invertible, i.e., |A^t| ≠ 0.

Proof  Except for 1) and 1t), this theorem holds for any commutative ring R. (See the section Relating these concepts to square matrices, pages 81 and 82.) Parts 1) and 1t) follow from the preceding section.

This shows some of the simple and definitive nature of linear algebra over a field.

Overview  Suppose each of X and Y is a set with n elements and f : X → Y is a function. By the pigeonhole principle, f is injective iff f is bijective iff f is surjective. Now suppose each of U and V is a vector space of dimension n and f : U → V is a linear transformation. It follows from the work done so far that f is injective iff f is bijective iff f is surjective.

Exercise  Add to this theorem more equivalent statements in terms of solutions of n equations in n unknowns.

Exercise  Let A = (A_1, .., A_n) be an n × n matrix over Z with column i = A_i ∈ Z^n. Let f : Z^n → Z^n be defined by f(B) = AB, and let f̄ : R^n → R^n be defined by f̄(C) = AC. (See the exercise on page 79.) Show the following are equivalent.
1)  f : Z^n → Z^n is injective.
2)  The sequence (A_1, .., A_n) is linearly independent over Z.
3)  f̄ : R^n → R^n is injective.
4)  The sequence (A_1, .., A_n) is linearly independent over R.
5)  |A| ≠ 0.

Rank of a matrix

Suppose A ∈ F_{m,n}. The row (column) rank of A is defined to be the dimension of the submodule of F^n (F^m) generated by the rows (columns) of A.

Theorem  If C ∈ F_m and D ∈ F_n are invertible, then the row (column) rank of A is the same as the row (column) rank of CAD.

Proof  Suppose f : F^n → F^m is defined by f(B) = AB. Each column of A is a vector in the range F^m, and we know from page 81 that each f(B) is a linear combination of those vectors. Thus the image of f is the submodule of F^m generated by the columns of A, and its dimension is the column rank of A. This dimension is the same as the dimension of the image of g ∘ f ∘ h : F^n → F^m, where h is any automorphism on F^n and g is any automorphism on F^m. This proves the theorem for column rank. The theorem for row rank follows using transpose.

Theorem  If A ∈ F_{m,n}, then the row rank and the column rank of A are equal. This number is called the rank of A and is ≤ min{m, n}.

Proof  By the theorem above, elementary row and column operations change neither the row rank nor the column rank. By row and column operations, A may be changed to a matrix H where h_{1,1} = ·· = h_{t,t} = 1 and all other entries are 0 (see the first exercise on page 59). Thus row rank = t = column rank.

Exercise  Suppose A has rank t. Show that it is possible to select t rows and t columns of A such that the determined t × t matrix is invertible. Show that the rank of A is the largest integer t such that this is possible.

Exercise  Suppose A ∈ F_{m,n} has rank t. What is the dimension of the solution set of AX = 0?

Definition  If N and M are finite dimensional vector spaces and f : N → M is a linear transformation, the rank of f is the dimension of the image of f.

Theorem  If f : F^n → F^m is given by a matrix A, then the rank of f is the same as the rank of the matrix A.

Geometric Interpretation of Determinant

Suppose V ⊂ R^n is some nice subset. For example, if n = 2, V might be the interior of a square or circle. There is a concept of the n-dimensional volume of V. For n = 1, it is length. For n = 2, it is area, and for n = 3 it is "ordinary volume". Suppose A ∈ R_n and f : R^n → R^n is the homomorphism given by A. In street language, the next theorem says that "f multiplies volume by the absolute value of its determinant".

Theorem  The n-dimensional volume of f(V) is ±|A| (the n-dimensional volume of V). Thus if |A| = ±1, f preserves volume.

Proof  If |A| = 0, image(f) has dimension < n and thus f(V) has n-dimensional volume 0. If |A| ≠ 0, then A is the product of elementary matrices (see page 59), and for elementary matrices the theorem is obvious. The result follows because the determinant of a composition is the product of the determinants.

Note  The volume of V does not change under translation, i.e., V and V + p have the same volume. Thus f(V) and f(V + p) = f(V) + f(p) have the same volume. Of course, linear transformations send the origin to the origin, so in general they must be adjusted by a translation.

Corollary  If P is the n-dimensional parallelepiped determined by the columns v_1, .., v_n of A, then the n-dimensional volume of P is ±|A|.

Proof  Let V = [0, 1] × ·· × [0, 1] = {e_1 t_1 + ·· + e_n t_n : 0 ≤ t_i ≤ 1}. Then P = f(V) = {v_1 t_1 + ·· + v_n t_n : 0 ≤ t_i ≤ 1}, and the volume of V is 1.

Linear functions approximate differentiable functions locally

We continue with the special case F = R. Linear functions arise naturally in business, science, and mathematics. However this is not the only reason that linear algebra is so useful. It is a central fact that smooth phenomena may be approximated locally by linear phenomena. Without this great simplification, the world of technology as we know it today would not exist.

As a simple example, suppose h : R → R is differentiable and p is a real number. Let f : R → R be the linear transformation f(x) = h′(p)x. Then h is approximated near p by g(x) = h(p) + f(x − p) = h(p) + h′(p)(x − p).

Now suppose V ⊂ R^2 is some nice subset and h = (h_1, h_2) : V → R^2 is injective and differentiable. Define the Jacobian matrix by

        J(h)(x, y) = | ∂h_1/∂x  ∂h_1/∂y |
                     | ∂h_2/∂x  ∂h_2/∂y |

and for each (x, y) ∈ V, let J(h)(x, y) : R^2 → R^2 be the homomorphism defined by this matrix. Then for any (p_1, p_2) ∈ V, h is approximated near (p_1, p_2) (after translation) by J(h)(p_1, p_2). The area of V is ∫∫_V 1 dxdy, and from the previous section we know that any homomorphism f multiplies area by |f|. The student may now understand the following theorem from calculus.

Theorem  Suppose the determinant of J(h)(x, y) is non-negative for each (x, y) ∈ V. Then the area of h(V) is

        ∫∫_V |J(h)| dxdy.

(Note that if h is the restriction of a linear transformation from R^2 to R^2, this theorem is immediate from the previous section.)
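The n = 2 case of the volume theorem can be checked directly: map the unit square by a sample matrix A (an illustrative choice, not from the text) and compare the area of the resulting parallelogram, computed by the shoelace formula, with |det A|.

```python
def shoelace(pts):
    """Area of a polygon given by vertices in order (shoelace formula)."""
    n = len(pts)
    s = sum(pts[i][0]*pts[(i+1) % n][1] - pts[(i+1) % n][0]*pts[i][1] for i in range(n))
    return abs(s) / 2

def apply(A, p):
    return (A[0][0]*p[0] + A[0][1]*p[1], A[1][0]*p[0] + A[1][1]*p[1])

A = [[2, 1], [1, 3]]                       # |A| = 5
square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # unit square, area 1
image = [apply(A, p) for p in square]      # parallelogram determined by the columns

assert shoelace(square) == 1
assert shoelace(image) == abs(A[0][0]*A[1][1] - A[0][1]*A[1][0])   # area = |det A|
```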
The Transpose Principle

We now return to the case where F is a field (of arbitrary characteristic). F-modules may also be called vector spaces, and submodules may be called subspaces. The study of R-modules in general is important and complex. However, the study of F-modules is short and simple: every vector space is free, and every subspace is a summand. The core of classical linear algebra is not the study of vector spaces, but the study of homomorphisms, and in particular, of endomorphisms. One goal is to show that if f : V → V is a homomorphism with some given property, there exists a basis of V so that the matrix representing f displays that property in a prominent manner. The next theorem is an illustration of this.

Theorem  Let F be a field and n be a positive integer.
1)  Suppose V is an n-dimensional vector space and f : V → V is a homomorphism with |f| = 0. Then ∃ a basis of V such that the matrix representing f has its first row zero.
2)  Suppose A ∈ F_n has |A| = 0. Then ∃ an invertible matrix C such that C^(-1)AC has its first row zero.
3)  Suppose V is an n-dimensional vector space and f : V → V is a homomorphism with |f| = 0. Then ∃ a basis of V such that the matrix representing f has its first column zero.
4)  Suppose A ∈ F_n has |A| = 0. Then ∃ an invertible matrix D such that D^(-1)AD has its first column zero.

We first wish to show that these 4 statements are equivalent. We know that 1) and 2) are equivalent, and also that 3) and 4) are equivalent, because change of basis corresponds to conjugation of the matrix. Now suppose 2) is true and show 4) is true. Suppose |A| = 0. Then |A^t| = 0, and by 2) ∃ C such that C^(-1)A^tC has first row zero. Thus (C^(-1)A^tC)^t = C^tA(C^t)^(-1) has first column zero. The result follows by defining D = (C^t)^(-1). Also 4) implies 2). This is an example of the transpose principle. Loosely stated, it is that theorems about change of basis correspond to theorems about conjugation of matrices, and theorems about the rows of a matrix correspond to theorems about the columns of a matrix, using transpose. In the remainder of this chapter, this will be used without further comment.
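The matrix identity behind the argument above, (C^(-1)A^tC)^t = C^tA(C^t)^(-1), holds for any invertible C. A numerical spot check with made-up matrices (not from the text):

```python
from fractions import Fraction

def mul(X, Y):
    n = len(X)
    return [[sum(X[i][k]*Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def inv2(X):
    det = X[0][0]*X[1][1] - X[0][1]*X[1][0]
    return [[Fraction(X[1][1], det), Fraction(-X[0][1], det)],
            [Fraction(-X[1][0], det), Fraction(X[0][0], det)]]

A = [[1, 2], [3, 4]]   # arbitrary sample matrices
C = [[1, 1], [0, 1]]

left = transpose(mul(mul(inv2(C), transpose(A)), C))
Ct = transpose(C)
right = mul(mul(Ct, A), inv2(Ct))
assert left == right
```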
Proof of the theorem  We are free to select any of the 4 parts, and we select part 3). Since |f| = 0, f is not injective, and ∃ a nonzero v_1 ∈ V with f(v_1) = 0. Now v_1 is independent and extends to a basis {v_1, .., v_n}. Then the matrix of f w.r.t. this basis has first column zero.

Exercise  Let

        A = | 3π  6 |
            | 2π  4 | .

Find an invertible matrix C ∈ R_2 so that C^(-1)AC has first row zero. Also let

        A = | 0  0  0 |
            | 1  3  4 |
            | 2  1  4 |

and find an invertible matrix D ∈ R_3 so that D^(-1)AD has first column zero.

Exercise  Suppose M is an n-dimensional vector space over a field F, k is an integer with 0 < k < n, and f : M → M is an endomorphism of rank k. Show there is a basis for M so that the matrix representing f has its first n − k rows zero. Also show there is a basis for M so that the matrix representing f has its first n − k columns zero. Work these out directly without using the transpose principle.

Nilpotent Homomorphisms

In this section it is shown that an endomorphism f is nilpotent iff all of its characteristic roots are 0 iff it may be represented by a strictly upper triangular matrix.

Definition  An endomorphism f : V → V is nilpotent if ∃ m with f^m = 0. Any f represented by a strictly upper triangular matrix is nilpotent (see page 56).

Theorem  Suppose V is an n-dimensional vector space and f : V → V is a nilpotent homomorphism. Then f^n = 0, and ∃ a basis of V such that the matrix representing f w.r.t. this basis is strictly upper triangular. Thus the characteristic polynomial of f is CP_f(x) = x^n.

Proof  Suppose f ≠ 0 is nilpotent. Let t be the largest positive integer with f^t ≠ 0. Then f^t(V) ⊂ f^(t-1)(V) ⊂ ·· ⊂ f(V) ⊂ V. Since f is nilpotent, all of these inclusions are proper. Therefore t < n and f^n = 0. Construct a basis for V by starting with a basis for f^t(V), extending it to a basis for f^(t-1)(V), etc. Then the matrix of f w.r.t. this basis is strictly upper triangular.

Note  To obtain a matrix which is strictly lower triangular, reverse the order of the basis.

Exercise  Use the transpose principle to write 3 other versions of this theorem.
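A strictly upper triangular matrix is indeed nilpotent with f^n = 0, as the theorem states. A quick 3 × 3 check (the entries are arbitrary):

```python
def mul(X, Y):
    n = len(X)
    return [[sum(X[i][k]*Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

N = [[0, 4, 7],
     [0, 0, 5],
     [0, 0, 0]]        # strictly upper triangular, hence nilpotent

N2 = mul(N, N)
N3 = mul(N2, N)
assert N2 == [[0, 0, 20], [0, 0, 0], [0, 0, 0]]   # one corner entry survives
assert N3 == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]    # f^n = 0 with n = 3
```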
Theorem  Suppose V is an n-dimensional vector space and f : V → V is a homomorphism. Then f is nilpotent iff CP_f(x) = x^n. (See the exercise at the end of Chapter 4 for the case n = 2.)

Proof  Suppose CP_f(x) = x^n. For n = 1 this implies f = 0, so suppose n > 1. Since the constant term of CP_f(x) is 0, the determinant of f is 0. Thus ∃ a basis of V such that the matrix A representing f has its first column zero. Let B ∈ F_{n-1} be the matrix obtained from A by removing its first row and first column. Now CP_A(x) = x^n = x CP_B(x). Thus CP_B(x) = x^(n-1), and by induction on n, B is nilpotent, and so ∃ C such that C^(-1)BC is strictly upper triangular. Then, writing matrices in block form with a 1 × 1 block in the upper left corner,

        | 1  0 |^(-1)  | 0  * |  | 1  0 |     | 0      *C      |
        | 0  C |       | 0  B |  | 0  C |  =  | 0   C^(-1)BC   |

is strictly upper triangular.

Exercise  Suppose F is a field, A ∈ F_3 is a strictly lower triangular matrix of rank 2, and

        B = | 0  0  0 |
            | 1  0  0 |
            | 0  1  0 | .

Using conjugation by elementary matrices, show there is an invertible matrix C so that C^(-1)AC = B. Now suppose V is a 3-dimensional vector space and f : V → V is a nilpotent endomorphism of rank 2. We know f can be represented by a strictly lower triangular matrix. Show there is a basis {v_1, v_2, v_3} for V so that B is the matrix representing f. Also show that f(v_1) = v_2, f(v_2) = v_3, and f(v_3) = 0. In other words, there is a basis for V of the form {v, f(v), f^2(v)} with f^3(v) = 0.

Exercise  Suppose V is a 3-dimensional vector space and f : V → V is a nilpotent endomorphism of rank 1. Show there is a basis for V so that the matrix representing f is

        | 0  0  0 |
        | 1  0  0 |
        | 0  0  0 | .
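The basis {v, f(v), f^2(v)} promised in the first exercise can be exhibited directly for the block B itself (a sketch, with v chosen as e_1):

```python
def apply(M, v):
    return tuple(sum(M[i][j]*v[j] for j in range(3)) for i in range(3))

def det3(u, v, w):
    return (u[0]*(v[1]*w[2] - v[2]*w[1])
          - u[1]*(v[0]*w[2] - v[2]*w[0])
          + u[2]*(v[0]*w[1] - v[1]*w[0]))

B = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]        # the rank-2 nilpotent block from the exercise

v = (1, 0, 0)
fv = apply(B, v)       # f(v)
ffv = apply(B, fv)     # f^2(v)

assert det3(v, fv, ffv) != 0          # {v, f(v), f^2(v)} is a basis
assert apply(B, ffv) == (0, 0, 0)     # f^3(v) = 0
```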
Our standing hypothesis is that V is an ndimensional vector space over a ﬁeld F and f : V → V is a homomorphism. Deﬁnition An element λ ∈ F is an eigenvalue of f if ∃ a nonzero v ∈ V with f (v) = λv. Any such v is called an eigenvector. Eλ ⊂ V is deﬁned to be the set of all eigenvectors for λ (plus 0). Note that Eλ = ker(λI − f ) is a subspace of V . The ¯ next theorem shows the eigenvalues of f are just the characteristic roots of f . Theorem 1) 2) 3) If λ ∈ F then the following are equivalent. λ is an eigenvalue of f , i.e., (λI − f ) : V → V is not injective.  (λI − f ) = 0. ¯ λ is a characteristic root of f , i.e., a root of the characteristic polynomial CPf (x) =  (xI − A) , where A is any matrix representing f .
Proof It is immediate that 1) and 2) are equivalent, so let’s show 2) and 3) are equivalent. The evaluation map F [x] → F which sends h(x) to h(λ) is a ring homomorphism (see theorem on page 47). So evaluating (xI − A) at x = λ and taking determinant gives the same result as taking the determinant of (xI − A) and evaluating at x = λ. Thus 2) and 3) are equivalent. The nicest thing you can say about a matrix is that it is similar to a diagonal matrix. Here is one case where that happens. Theorem Suppose λ1 , .., λk are distinct eigenvalues of f , and vi is an eigenvector of λi for 1 ≤ i ≤ k. Then the following hold. 1) 2) {v1 , .., vk } is independent. If k = n, i.e., if CPf (x) = (x − λ1 ) · · · (x − λn ), then {v1 , .., vn } is a basis for V . The matrix of f w.r.t. this basis is the diagonal matrix whose (i, i) term is λi .
Proof Suppose {v1, .., vk} is dependent. Let t be the smallest positive integer such that {v1, .., vt} is dependent, and let v1 r1 + ·· + vt rt = 0 be a nontrivial linear combination. Note that at least two of the coefficients must be nonzero. Now (f − λt)(v1 r1 + ·· + vt rt) = v1 (λ1 − λt) r1 + ·· + vt−1 (λt−1 − λt) rt−1 + 0 = 0 is a shorter nontrivial linear combination. This is a contradiction and proves 1). Part 2) follows from 1) because dim(V) = n.

Exercise Let

    A =  0 1
        −1 0

be in R2. Find an invertible C ∈ C2 such that C⁻¹AC is diagonal. Show that C cannot be selected in R2. Find the characteristic polynomial of A.
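The exercise above can be explored numerically. This NumPy sketch (outside the text) finds a complex matrix C diagonalizing A; the eigenvalues are ±i, which is why no real C exists:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

eigvals, C = np.linalg.eig(A)        # complex eigenvalues +-i and eigenvectors
D = np.linalg.inv(C) @ A @ C         # C^{-1} A C

print(np.allclose(D, np.diag(eigvals)))   # True: diagonal over C
```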
Exercise Suppose V is a 3-dimensional vector space and f : V → V is an endomorphism with CPf(x) = (x − λ)³. Show that (f − λI) has characteristic polynomial x³ and is thus a nilpotent endomorphism. Show there is a basis for V so that the matrix representing f is

    λ 0 0        λ 0 0        λ 0 0
    1 λ 0   ,    1 λ 0   or   0 λ 0
    0 1 λ        0 0 λ        0 0 λ

We could continue and finally give an ad hoc proof of the Jordan canonical form, but in this chapter we prefer to press on to inner product spaces. The Jordan form will be developed in Chapter 6 as part of the general theory of finitely generated modules over Euclidean domains. The next section is included only as a convenient reference.

Jordan Canonical Form

This section should be just skimmed or omitted entirely. It is unnecessary for the rest of this chapter, and is not properly part of the flow of the chapter. The basic facts of Jordan form are summarized here simply for reference.

The statement that a square matrix B over a field F is a Jordan block means that ∃ λ ∈ F such that B is a lower triangular matrix of the form
B=
λ 1 λ 0
0 ·
· 1 λ
.
B gives a homomorphism g : F m → F m with g(em ) = λem
and g(ei ) = ei+1 + λei for 1 ≤ i < m. Note that CPB (x) = (x − λ)m and so λ is the only eigenvalue of B, and B satisﬁes its characteristic polynomial, i.e., CPB (B) = 0. ¯
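The fact that CPB(B) = (B − λI)^m = 0 for a Jordan block is easy to confirm numerically. A NumPy sketch (outside the text; the helper name jordan_block is ours):

```python
import numpy as np

lam, m = 2.0, 3
# Lower triangular Jordan block: lam on the diagonal, 1's just below it.
B = lam * np.eye(m) + np.diag(np.ones(m - 1), k=-1)

# CP_B(x) = (x - lam)^m, so CP_B(B) = (B - lam*I)^m must vanish.
N = B - lam * np.eye(m)
print(np.allclose(np.linalg.matrix_power(N, m), 0))  # True
```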
Definition A matrix D ∈ Fn is in Jordan form if ∃ Jordan blocks B1, .., Bt such that

        B1           0
           B2
    D =       ·
                 ·
        0           Bt

Suppose D is of this form and Bi ∈ Fni has eigenvalue λi. Then n1 + ·· + nt = n and CPD(x) = (x − λ1)^n1 ·· (x − λt)^nt. Note that a diagonal matrix is a special case of Jordan form. D is a diagonal matrix iff each ni = 1, i.e., iff each Jordan block is a 1 × 1 matrix.

Exercise Suppose D ∈ Fn is in Jordan form and has characteristic polynomial a0 + a1x + ·· + x^n. Show a0 I + a1 D + ·· + D^n = 0, i.e., show CPD(D) = 0.

Theorem If A ∈ Fn, the following are equivalent.
1)  ∃ an invertible C ∈ Fn such that C⁻¹AC is in Jordan form.
2)  ∃ λ1, .., λn ∈ F (not necessarily distinct) such that CPA(x) = (x − λ1) ·· (x − λn). (In this case we say that all the eigenvalues of A belong to F.)

The reader should use the transpose principle to write three other versions of the first theorem. Also note that we know one special case of this theorem, namely that if A has n distinct eigenvalues in F, then A is similar to a diagonal matrix. Later on it will be shown that if A is a symmetric real matrix, then A is similar to a diagonal matrix.

Let's look at the classical case A ∈ Rn. The complex numbers are algebraically closed. This means that CPA(x) will factor completely in C[x], and thus ∃ C ∈ Cn with C⁻¹AC in Jordan form. C may be selected to be in Rn iff all the eigenvalues of A are real.

Theorem Jordan form (when it exists) is unique. This means that if A and D are similar matrices in Jordan form, they have the same Jordan blocks, except possibly in different order.

Exercise Find all real matrices in Jordan form that have the following characteristic polynomials: x(x − 2), (x − 2)², (x − 2)(x − 3)(x − 4), (x − 2)(x − 3)², (x − 2)(x − 3)³, (x − 2)²(x − 3)².
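The exercise on CPD(D) = 0 can be checked for a concrete matrix in Jordan form. A NumPy sketch (not part of the text; the example blocks and the helper name jordan_block are ours), using D with blocks for eigenvalues 2 (size 2) and 3 (size 1), so CPD(x) = (x − 2)²(x − 3):

```python
import numpy as np

def jordan_block(lam, m):
    """Lower triangular Jordan block with eigenvalue lam."""
    return lam * np.eye(m) + np.diag(np.ones(m - 1), k=-1)

D = np.zeros((3, 3))
D[:2, :2] = jordan_block(2.0, 2)
D[2:, 2:] = jordan_block(3.0, 1)

# Evaluate CP_D at D: (D - 2I)^2 (D - 3I) should be the zero matrix.
CP = np.linalg.matrix_power(D - 2*np.eye(3), 2) @ (D - 3*np.eye(3))
print(np.allclose(CP, 0))  # True
```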
Exercise (Cayley-Hamilton Theorem) Suppose E is a field and A ∈ En. Assume the theorem that there is a field F containing E such that CPA(x) factors completely in F[x]. Thus ∃ an invertible C ∈ Fn such that D = C⁻¹AC is in Jordan form. Use this to show CPA(A) = 0. (Note how easy this is in Jordan form.)

Exercise Suppose A ∈ Fn is in Jordan form. Show A is nilpotent iff A^n = 0 iff CPA(x) = x^n. (See the second exercise on page 66.)

Inner Product Spaces

The two most important fields for mathematics and science in general are the real numbers and the complex numbers. Finitely generated vector spaces over R or C support inner products and are thus geometric as well as algebraic objects. The theories for the real and complex cases are quite similar, and both could have been treated here. However, for simplicity, attention is restricted to the case F = R. In the remainder of this chapter, the power and elegance of linear algebra become transparent for all to see.

Definition Suppose V is a real vector space. An inner product (or dot product) on V is a function V × V → R which sends (u, v) to u·v and satisfies
1)  (u1 r1 + u2 r2)·v = (u1·v) r1 + (u2·v) r2 and v·(u1 r1 + u2 r2) = (v·u1) r1 + (v·u2) r2, for all u1, u2, v ∈ V and r1, r2 ∈ R.
2)  u·v = v·u, for all u, v ∈ V.
3)  u·u ≥ 0 and u·u = 0 iff u = 0, for all u ∈ V.

Theorem Suppose V has an inner product.
1)  If v ∈ V, then f : V → R defined by f(u) = u·v is a homomorphism. Thus 0·v = 0.
2)  (Schwarz' inequality) If u, v ∈ V, then (u·v)² ≤ (u·u)(v·v).

Proof of 2) Let a = √(v·v) and b = √(u·u). If a or b is 0, the result is obvious, so suppose neither a nor b is 0. Now 0 ≤ (ua ± vb)·(ua ± vb) = (u·u)a² ± 2ab(u·v) + (v·v)b² = b²a² ± 2ab(u·v) + a²b². Dividing by 2ab yields 0 ≤ ab ± (u·v), or |u·v| ≤ ab.
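Schwarz' inequality is easy to spot-check with random vectors in Rn. A minimal NumPy sketch (not part of the text; the random trial is ours):

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    u = rng.normal(size=5)
    v = rng.normal(size=5)
    # (u.v)^2 <= (u.u)(v.v), with a small tolerance for floating point
    assert u.dot(v)**2 <= u.dot(u) * v.dot(v) + 1e-12
print("Schwarz holds on all trials")
```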
Theorem Suppose V has an inner product, and define the norm or length of a vector v by ‖v‖ = √(v·v). The following properties hold.
1)  ‖v‖ = 0 iff v = 0.
2)  ‖vr‖ = ‖v‖ |r|.
3)  |u·v| ≤ ‖u‖ ‖v‖.  (Schwarz' inequality)
4)  ‖u + v‖ ≤ ‖u‖ + ‖v‖.  (The triangle inequality)

Proof of 4) ‖u + v‖² = (u + v)·(u + v) = ‖u‖² + 2(u·v) + ‖v‖² ≤ ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)².

Definition An Inner Product Space (IPS) is a real vector space with an inner product. It is easy to define an inner product, as is shown by the following theorem.

Theorem Suppose V is a real vector space with a basis S = {v1, .., vn}. Then there is a unique inner product on V which makes S an orthonormal basis. It is given by the formula (v1 r1 + ·· + vn rn)·(v1 s1 + ·· + vn sn) = r1 s1 + ·· + rn sn.

Convention Rn will be assumed to have the standard inner product defined by (r1, .., rn)t · (s1, .., sn)t = r1 s1 + ·· + rn sn. S = {e1, .., en} will be called the canonical or standard orthonormal basis (see page 72).

Definition Suppose V is an IPS. A sequence {v1, .., vn} is orthogonal provided vi · vj = 0 when i ≠ j. The sequence is orthonormal if it is orthogonal and each vector has length 1, i.e., vi · vj = δi,j for 1 ≤ i, j ≤ n.

Theorem If S = {v1, .., vn} is an orthogonal sequence of nonzero vectors in an IPS V, then S is independent. Furthermore {v1/‖v1‖, .., vn/‖vn‖} is orthonormal.

Proof Suppose v1 r1 + ·· + vn rn = 0. Then 0 = (v1 r1 + ·· + vn rn)·vi = ri (vi · vi) and thus ri = 0. Thus S is independent. The second statement is transparent.

The next theorem shows that this inner product has an amazing geometry.

Theorem If u, v ∈ Rn, then u·v = ‖u‖ ‖v‖ cos Θ where Θ is the angle between u and v.
Proof Let u = (r1, .., rn) and v = (s1, .., sn). By the law of cosines, ‖u − v‖² = ‖u‖² + ‖v‖² − 2‖u‖ ‖v‖ cos Θ. So (r1 − s1)² + ·· + (rn − sn)² = r1² + ·· + rn² + s1² + ·· + sn² − 2‖u‖ ‖v‖ cos Θ. Thus r1 s1 + ·· + rn sn = ‖u‖ ‖v‖ cos Θ.

Exercise This is a simple exercise to observe that hyperplanes in Rn are cosets. Suppose f : Rn → R is a nonzero homomorphism given by a matrix A = (a1, .., an) ∈ R1,n. Then L = ker(f) is the set of all solutions to a1 x1 + ·· + an xn = 0, i.e., the set of all vectors perpendicular to A. Now suppose b ∈ R and C = (c1, .., cn)t ∈ Rn has f(C) = b. Then f⁻¹(b) is the coset L + C, and this is the set of all solutions to a1 x1 + ·· + an xn = b, i.e., to a1 (x1 − c1) + ·· + an (xn − cn) = 0.

Theorem (Fourier series) Suppose W is an IPS with an orthonormal basis {w1, .., wn}. Then if v ∈ W, v = w1 (v·w1) + ·· + wn (v·wn).

Proof v = w1 r1 + ·· + wn rn and v·wi = (w1 r1 + ·· + wn rn)·wi = ri.

Theorem Suppose W is an IPS, Y ⊂ W is a subspace with an orthonormal basis {w1, .., wk}, and v ∈ W − Y. Define the projection of v onto Y by p(v) = w1 (v·w1) + ·· + wk (v·wk), and let w = v − p(v). Then (w·wi) = (v − w1 (v·w1) − ·· − wk (v·wk))·wi = 0. Thus if wk+1 = w/‖w‖, then {w1, .., wk, wk+1} is an orthonormal basis for the subspace generated by {w1, .., wk, v}.

Gram-Schmidt orthonormalization

Theorem (Gram-Schmidt) Suppose W is an IPS with a basis {v1, .., vn}. Then W has an orthonormal basis {w1, .., wn}. Moreover, any orthonormal sequence in W extends to an orthonormal basis of W.
Proof Let w1 = v1/‖v1‖. Suppose inductively that {w1, .., wk} is an orthonormal basis for Y, the subspace generated by {v1, .., vk}. Let w = vk+1 − p(vk+1) and wk+1 = w/‖w‖. Then by the previous theorem, {w1, .., wk, wk+1} is an orthonormal basis for the subspace generated by {v1, .., vk+1}. In this manner an orthonormal basis for W is constructed. Now suppose W has dimension n and {w1, .., wk} is an orthonormal sequence in W. Since this sequence is independent, it extends to a basis {w1, .., wk, vk+1, .., vn}. The process above may be used to modify this to an orthonormal basis {w1, .., wn}.

Notice that this construction defines a function h which sends a basis for W to an orthonormal basis for W. This is a key observation for an exercise on page 103 showing O(n) is a deformation retract of GLn(R).

Exercise Let W = R3 have the standard inner product and Y ⊂ W be the subspace generated by {w1, w2} where w1 = (1, 0, 0)t and w2 = (0, 1, 0)t. As in the first theorem of this section, let v = (1, 2, 3)t and w = v − p(v), where p(v) is the projection of v onto Y. Find w3 = w/‖w‖ and show that for any t with 0 ≤ t ≤ 1, {w1, w2, (1 − t)v + tw3} is a basis for W.

Exercise Let f : R3 → R be the homomorphism defined by the matrix (2, 1, 3). Find an orthonormal basis for the kernel of f. Find the projection of (e1 + e2) onto ker(f). Find the angle between e1 + e2 and the plane ker(f).

Isometries

Suppose each of U and V is an IPS. A homomorphism f : U → V is said to be an isometry provided it is an isomorphism and for any u1, u2 in U, (u1 · u2)U = (f(u1) · f(u2))V.

Theorem Suppose each of U and V is an n-dimensional IPS, {u1, .., un} is an orthonormal basis for U, and f : U → V is a homomorphism. Then f is an isometry iff {f(u1), .., f(un)} is an orthonormal sequence in V.

Proof Isometries certainly preserve orthonormal sequences. So suppose T = {f(u1), .., f(un)} is an orthonormal sequence in V. Then T is independent, thus T is a basis for V, and thus f is an isomorphism (see the second theorem on page 79). It is easy to check that f preserves inner products.
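The Gram-Schmidt construction described earlier in this section translates directly into code. A minimal NumPy sketch (the function name gram_schmidt is ours, not the text's): each new vector has its projection onto the span of the previous ones subtracted, then is normalized.

```python
import numpy as np

def gram_schmidt(basis):
    """Orthonormalize a basis: w = v - p(v), then w / ||w||."""
    ws = []
    for v in basis:
        w = v - sum(w_i * np.dot(v, w_i) for w_i in ws)  # v minus its projection p(v)
        ws.append(w / np.linalg.norm(w))
    return np.array(ws)

basis = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
W = gram_schmidt(basis)
print(np.allclose(W @ W.T, np.eye(3)))  # True: the rows are orthonormal
```

NumPy's numpy.linalg.qr performs an equivalent orthonormalization in production code; the loop above just mirrors the proof's induction.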
We now come to one of the definitive theorems in linear algebra. It is that, up to isometry, there is only one inner product space for each dimension.

Theorem Suppose each of U and V is an n-dimensional IPS. Then ∃ an isometry f : U → V.

Proof There exist orthonormal bases {u1, .., un} for U and {v1, .., vn} for V. By the first theorem on page 79, there exists a homomorphism f : U → V with f(ui) = vi, and by the previous theorem, f is an isometry. In particular, U is isometric to Rn with its standard inner product.

Exercise Let f : R3 → R be the homomorphism defined by the matrix (2, 1, 3). Find a linear transformation h : R2 → R3 which gives an isometry from R2 to ker(f).

Orthogonal Matrices

As noted earlier, linear algebra is not so much the study of vector spaces as it is the study of endomorphisms. We now wish to study isometries from Rn to Rn. We know from a theorem on page 90 that an endomorphism preserves volume iff its determinant is ±1. Isometries preserve inner product, and thus preserve angle and distance, and so certainly preserve volume.

Theorem Suppose A ∈ Rn and f : Rn → Rn is the homomorphism defined by f(B) = AB. Then the following are equivalent.
1)  The columns of A form an orthonormal basis for Rn, i.e., AᵗA = I.
2)  The rows of A form an orthonormal basis for Rn, i.e., AAᵗ = I.
3)  f is an isometry.

Proof A left inverse of a matrix is also a right inverse (see the exercise on page 64). Thus 1) and 2) are equivalent because each of them says A is invertible with A⁻¹ = Aᵗ. Now {e1, .., en} is the canonical orthonormal basis for Rn, and f(ei) is column i of A. Thus by the previous section, 1) and 3) are equivalent.

Definition If A ∈ Rn satisfies these three conditions, A is said to be orthogonal. The set of all such A is denoted by O(n).

Theorem
1)  If A is orthogonal, |A| = ±1.
2)  If A is orthogonal, A⁻¹ is orthogonal. If A and C are orthogonal, AC is orthogonal. Thus O(n) is a multiplicative subgroup of GLn(R), and is called the orthogonal group.
3)  Suppose A is orthogonal and f is defined by f(B) = AB. Then f preserves distances and angles. This means that if u, v ∈ Rn, then ‖u − v‖ = ‖f(u) − f(v)‖ and the angle between u and v is equal to the angle between f(u) and f(v).

Proof Part 1) follows from |A|² = |A| |Aᵗ| = |I| = 1. Part 2) is immediate, because isometries clearly form a subgroup of the multiplicative group of all automorphisms. For part 3), ‖u − v‖² = (u − v)·(u − v) = f(u − v)·f(u − v) = ‖f(u − v)‖² = ‖f(u) − f(v)‖². The proof that f preserves angles follows from u·v = ‖u‖ ‖v‖ cos Θ.

Exercise Show that if A ∈ O(2) has |A| = 1, then

    A = cos Θ  −sin Θ
        sin Θ   cos Θ

for some number Θ.
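A rotation matrix gives a concrete element of O(2). This NumPy sketch (outside the text; the angle 0.7 is an arbitrary choice) checks the three defining properties: AᵗA = I, |A| = 1, and preservation of length:

```python
import numpy as np

theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(A.T @ A, np.eye(2)))     # True: A^t A = I
print(np.isclose(np.linalg.det(A), 1.0))   # True: |A| = 1

u = np.array([3.0, 4.0])
print(np.isclose(np.linalg.norm(A @ u), np.linalg.norm(u)))  # True: lengths preserved
```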
Exercise (topology) Let Rn ≈ Rn² have its usual metric topology (see the exercise on page 56). This means a sequence of matrices {Ai} converges to A iff it converges coordinatewise. Show GLn(R) is an open subset and O(n) is closed and compact. Let h : GLn(R) → O(n) be defined by Gram-Schmidt. Show H : GLn(R) × [0, 1] → GLn(R) defined by H(A, t) = (1 − t)A + th(A) is a deformation retract of GLn(R) to O(n).

Diagonalization of Symmetric Matrices

We continue with the case F = R. Our goals are to prove that, if A is a symmetric matrix, all of its eigenvalues are real and that ∃ an orthogonal matrix C such that C⁻¹AC is diagonal. As background, we first note that symmetric is the same as self-adjoint.

Definition Suppose A ∈ Rn. A is said to be symmetric provided Aᵗ = A. Note that any diagonal matrix is symmetric. A is said to be self-adjoint if (Au)·v = u·(Av) for all u, v ∈ Rn.

Theorem Suppose A ∈ Rn and u, v ∈ Rn. Then (Aᵗu)·v = u·(Av).

Proof If y, z ∈ Rn, then the dot product y·z is the matrix product yᵗz, and matrix multiplication is associative. Thus (Aᵗu)·v = (uᵗA)v = uᵗ(Av) = u·(Av).

The next theorem is just an exercise using the previous theorem.

Theorem A is symmetric iff A is self-adjoint.

Review Suppose A ∈ Rn and f : Rn → Rn is defined by f(B) = AB. Then A represents f w.r.t. the canonical orthonormal basis. Let S = {v1, .., vn} be another basis and C ∈ Rn be the matrix with vi as column i. Then C⁻¹AC is the matrix representing f w.r.t. S, and S is an orthonormal basis iff C is an orthogonal matrix.

Summary Representing f w.r.t. an orthonormal basis is the same as conjugating A by an orthogonal matrix.

Theorem Suppose A ∈ Rn and C ∈ O(n). Then A is symmetric iff C⁻¹AC is symmetric.

Proof Suppose A is symmetric. Then (C⁻¹AC)ᵗ = CᵗAᵗ(C⁻¹)ᵗ = C⁻¹AC. The converse follows because A = C(C⁻¹AC)C⁻¹.
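The self-adjoint identity (Au)·v = u·(Av) for a symmetric A can be spot-checked numerically. A NumPy sketch (not part of the text; the random test data is ours):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
A = A + A.T                     # force A to be symmetric
u = rng.normal(size=4)
v = rng.normal(size=4)

print(np.isclose((A @ u) @ v, u @ (A @ v)))  # True: (Au).v = u.(Av)
```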
The next theorem has geometric and physical implications, but for us, just the incredibility of it all will suffice.

Theorem Suppose A ∈ Rn is symmetric. Then all the eigenvalues of A are real. That is, ∃ real numbers λ1, .., λn (not necessarily distinct) such that CPA(x) = (x − λ1)(x − λ2)···(x − λn).

Proof We know CPA(x) factors into linears over C. If µ = a + bi is a complex number, its conjugate is defined by µ̄ = a − bi. If h : C → C is defined by h(µ) = µ̄, then h is a ring isomorphism which is the identity on R. If w = (ai,j) is a complex matrix or vector, its conjugate is defined by w̄ = (āi,j). Now suppose λ is a complex eigenvalue of A and v ∈ Cn is an eigenvector with Av = λv. Since A is a real symmetric matrix, Ā = A = Aᵗ, and conjugating Av = λv gives Av̄ = λ̄v̄. Then λ(vᵗv̄) = (λv)ᵗv̄ = (Av)ᵗv̄ = (vᵗAᵗ)v̄ = vᵗ(Av̄) = vᵗ(λ̄v̄) = λ̄(vᵗv̄). Since vᵗv̄ ≠ 0, it follows that λ = λ̄ and thus λ ∈ R. Or you can define a complex inner product on Cn by (w·v) = wᵗv̄. The proof then reads as λ(v·v) = (λv·v) = (Av·v) = (v·Av) = (v·λv) = λ̄(v·v). Either way, λ is a real number.

We know that eigenvectors belonging to distinct eigenvalues are linearly independent. For symmetric matrices, we show more, namely that they are perpendicular.

Theorem Suppose A is symmetric, λ1, λ2 ∈ R are distinct eigenvalues of A, and Au = λ1 u and Av = λ2 v. Then u·v = 0.

Proof λ1 (u·v) = (Au)·v = u·(Av) = λ2 (u·v). Since λ1 ≠ λ2, u·v = 0.

Theorem If A ∈ Rn, the following are equivalent.
1)  A is symmetric.
2)  ∃ C ∈ O(n) such that C⁻¹AC is diagonal.

Proof By the previous section, 2) ⇒ 1), because a diagonal matrix is symmetric and conjugation by an orthogonal matrix preserves symmetry. Show 1) ⇒ 2). First suppose A is a symmetric 2 × 2 matrix. Let λ be an eigenvalue for A and {v1, v2} be an orthonormal basis for R2 with Av1 = λv1. Then w.r.t. this basis, the transformation determined by A is represented by

    λ b
    0 d

Since this matrix is symmetric, b = 0.

Now suppose by induction that the theorem is true for symmetric matrices in Rt for t < n, and suppose A is a symmetric n × n matrix. Denote by λ1, .., λk the distinct eigenvalues of A, k ≤ n. If k = n, the proof is immediate, because then there is a basis of eigenvectors of length 1, and they must form an orthonormal basis. So suppose k < n. Let v1, .., vk be eigenvectors for λ1, .., λk with each ‖vi‖ = 1. They may be extended to an orthonormal basis v1, .., vn. With respect to this basis, the transformation determined by A is represented by

    λ1
       ·        (B)
         ·
           λk

    (0)         (D)

where the upper left block is the diagonal matrix with entries λ1, .., λk. Since this is a symmetric matrix, B = 0 and D is a symmetric matrix of smaller size. By induction, ∃ an orthogonal C such that C⁻¹DC is diagonal. Thus conjugating by

    I 0
    0 C

makes the entire matrix diagonal.

This theorem is so basic we state it again in different terminology.

Definition If V is an IPS, a linear transformation f : V → V is said to be self-adjoint provided (u·f(v)) = (f(u)·v) for all u, v ∈ V.

Theorem If V is an n-dimensional IPS and f : V → V is a linear transformation, then the following are equivalent.
1)  f is self-adjoint.
2)  ∃ an orthonormal basis {v1, .., vn} for V with each vi an eigenvector of f.
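Numerically, the diagonalization guaranteed by the theorem is exactly what numpy.linalg.eigh computes for symmetric matrices: real eigenvalues and an orthogonal matrix C of eigenvectors. A sketch (outside the text):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # symmetric

eigvals, C = np.linalg.eigh(A)   # real eigenvalues, orthonormal eigenvectors
D = C.T @ A @ C                  # C^{-1} A C, using C^{-1} = C^t

print(np.allclose(D, np.diag(eigvals)))   # True: C^{-1} A C is diagonal
print(np.allclose(C.T @ C, np.eye(2)))    # True: C is orthogonal
```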
Exercise Let

    A = 2 1
        1 2

Find an orthogonal C such that C⁻¹AC is diagonal. Do the same for

    A = 2 2
        2 2

Exercise Suppose A, D ∈ Rn are symmetric. Under what conditions are A and D similar? Show that, if A and D are similar, ∃ an orthogonal C such that D = C⁻¹AC.

Exercise Suppose V is an n-dimensional real vector space. We know that V is isomorphic to Rn. Suppose f and g are isomorphisms from V to Rn and A is a subset of V. Show that f(A) is an open subset of Rn iff g(A) is an open subset of Rn. This shows that V, an algebraic object, has a god-given topology. Of course, if V has an inner product, it automatically has a metric, and this metric will determine that same topology. Finally, suppose V and W are finite-dimensional real vector spaces and h : V → W is a linear transformation. Show that h is continuous.

Exercise Define E : Cn → Cn by E(A) = e^A = I + A + (1/2!)A² + ··. This series converges and thus E is a well defined function. If AB = BA, then E(A + B) = E(A)E(B). Since A and −A commute, I = E(0) = E(A − A) = E(A)E(−A), and thus E(A) is invertible with E(A)⁻¹ = E(−A). Furthermore E(Aᵗ) = E(A)ᵗ, and if C is invertible, E(C⁻¹AC) = C⁻¹E(A)C. Now use the results of this section to prove the statements below.
1)  If A ∈ Cn, then |e^A| = e^trace(A). (For part 1, assume any A ∈ Cn is similar to a lower triangular matrix, i.e., assume the Jordan form.)
2)  Thus if A ∈ Rn, |e^A| = 1 iff trace(A) = 0.
3)  If N ∈ Rn is symmetric, then e^N = I iff N = 0. However, ∃ a nonzero matrix N ∈ R2 with e^N = I.
4)  If A ∈ Rn and Aᵗ = −A, then e^A ∈ O(n).
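Parts 2) and 4) of the exponential exercise can be illustrated numerically. A NumPy sketch (not part of the text): the helper expm truncates the defining series, which is adequate for a small matrix, and the skew matrix A below has Aᵗ = −A and trace 0, so e^A should be orthogonal with determinant 1.

```python
import numpy as np

def expm(A, terms=30):
    """E(A) = I + A + A^2/2! + ..., truncated; fine for small matrices."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])   # A^t = -A, trace(A) = 0

E = expm(A)
print(np.isclose(np.linalg.det(E), 1.0))   # True: |e^A| = e^trace(A) = 1
print(np.allclose(E.T @ E, np.eye(2)))     # True: e^A is orthogonal
```

For this particular A, e^A is the rotation by one radian, another way to see that it lies in O(2).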
Chapter 6

Appendix

The five previous chapters were designed for a one-year undergraduate course in algebra. In this appendix, enough material is added to form a basic first-year graduate course. Two of the main goals are to characterize finitely generated abelian groups and to prove the Jordan canonical form. The organization is mostly a linearly ordered sequence except for the last two sections on determinants and dual spaces, which are independent sections added on at the end. In the chapter on matrices, it is stated without proof that the determinant of the product is the product of the determinants. A proof of this, which depends upon the classification of certain types of alternating multilinear forms, is given in this appendix. The final section gives the fundamentals of dual spaces. The style is the same as before: everything is right down to the nub.

The basic theorem of this chapter is that if R is a Euclidean domain and M is a finitely generated R-module, then M is the sum of cyclic modules. Suppose R is a commutative ring. An R-module M is said to be cyclic if it can be generated by one element, i.e., M ≈ R/I where I is an ideal of R. Thus if M is torsion free, it is a free R-module. Since Z is a Euclidean domain, finitely generated abelian groups are the sums of cyclic groups, one of the jewels of abstract algebra. Now suppose F is a field and V is a finitely generated F-module. If T : V → V is a linear transformation, then V becomes an F[x]-module by defining vx = T(v). Now F[x] is a Euclidean domain, and so V over F[x] is the sum of cyclic modules. This classical and very powerful technique allows an easy proof of the canonical forms. There is a basis for V so that the matrix representing T is in rational canonical form. If the characteristic polynomial of T factors into the product of linear polynomials, then there is a basis for V so that the matrix representing T is in Jordan canonical form. This always holds if F = C. A matrix in Jordan form is a lower triangular matrix with the eigenvalues of T displayed on the diagonal, so this is a powerful concept.
Chinese Remainder Theorem

On page 50 in the chapter on rings, the Chinese Remainder Theorem was proved for the ring of integers. In this section this classical topic is presented in full generality. Surprisingly, the theorem holds even for noncommutative rings.

Definition Suppose R is a ring and A1, A2, .., Am are ideals of R. Then the sum A1 + A2 + ··· + Am is the set of all a1 + a2 + ··· + am with ai ∈ Ai. The product A1 A2 ··· Am is the set of all finite sums of elements a1 a2 ··· am with ai ∈ Ai. Note that the sum and product of ideals are ideals and A1 A2 ··· Am ⊂ (A1 ∩ A2 ∩ ··· ∩ Am).

Definition Ideals A and B of R are said to be comaximal if A + B = R.

Theorem If A and B are ideals of a ring R, then the following are equivalent.
1)  A and B are comaximal.
2)  ∃ a ∈ A and b ∈ B with a + b = 1.
3)  π(A) = R/B where π : R → R/B is the projection.

Theorem If R is commutative and A1, A2, .., Am and B are ideals of R with Ai and B comaximal for each i, then A1 A2 ··· Am and B are comaximal. Thus A1 ∩ A2 ∩ ··· ∩ Am and B are comaximal.

Proof Consider π : R → R/B. Then π(A1 A2 ··· Am) = π(A1)π(A2)···π(Am) = (R/B)(R/B)···(R/B) = R/B.

Theorem Suppose A1, A2, .., An are pairwise comaximal ideals of R, with each Ai ≠ R. Then the natural map π : R → R/A1 × R/A2 × ··· × R/An is a surjective ring homomorphism with kernel A1 ∩ A2 ∩ ··· ∩ An.

Proof There exist ai ∈ Ai and bi ∈ A1 A2 ··· Ai−1 Ai+1 ··· An with ai + bi = 1. Note that π(bi) = (0, .., 0, 1i, 0, .., 0). If (r1 + A1, r2 + A2, .., rn + An) is an element of the range, it is the image of r1 b1 + r2 b2 + ··· + rn bn = r1 (1 − a1) + r2 (1 − a2) + ··· + rn (1 − an).

Theorem If A1, A2, .., An are pairwise comaximal ideals of R, then A1 A2 ··· An = A1 ∩ A2 ∩ ··· ∩ An.

Proof for n = 2. ∃ a1 ∈ A1 and a2 ∈ A2 with a1 + a2 = 1. If c ∈ A1 ∩ A2, then c = c(a1 + a2) ∈ A1 A2. This shows A1 ∩ A2 ⊂ A1 A2, and the other inclusion always holds.
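For R = Z the theorem specializes to the classical statement: given pairwise coprime moduli, any list of residues is realized by a single integer. A pure-Python sketch (not part of the text; the function name crt is ours) that mirrors the proof by building the solution one modulus at a time:

```python
from math import gcd

def crt(residues, moduli):
    """Solve x = r_i (mod n_i) for pairwise comaximal (coprime) moduli."""
    x, n = 0, 1
    for r, m in zip(residues, moduli):
        assert gcd(n, m) == 1, "moduli must be pairwise comaximal"
        inv = pow(n, -1, m)              # n is invertible mod m by comaximality
        x = (x + (r - x) * inv % m * n) % (n * m)
        n *= m
    return x

print(crt([2, 3, 2], [3, 5, 7]))  # 23
```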
Prime and Maximal Ideals and UFDs

In the first chapter on background material, it was shown that Z is a unique factorization domain. Here it will be shown that this property holds for any principal ideal domain. Later on it will be shown that every Euclidean domain is a principal ideal domain, and thus every Euclidean domain is a unique factorization domain.

Definition Suppose R is a commutative ring and I ⊂ R is an ideal. I is prime means I ≠ R and if a, b ∈ R have ab ∈ I, then a or b ∈ I. I is maximal means I ≠ R and there are no ideals properly between I and R.

Theorem 0 is a prime ideal of R iff R is a domain. 0 is a maximal ideal of R iff R is a field.

Theorem Suppose J ⊂ R is an ideal, J ≠ R. Then J is a prime ideal iff R/J is a domain, and J is a maximal ideal iff R/J is a field.

Theorem Every field is a domain.

Corollary Maximal ideals are prime.

Proof If I is a maximal ideal, then R/I is a field. Thus R/I is a domain and I is a prime ideal.

Theorem If a ∈ R is not a unit, then ∃ a maximal ideal I of R with a ∈ I.

Proof This is a classical application of the Hausdorff Maximality Principle. Consider {J : J is an ideal of R containing a with J ≠ R}. This collection contains a maximal monotonic collection {Vt}, t ∈ T. The ideal V, the union of the Vt, does not contain 1 and thus is not equal to R. Therefore V is equal to some Vt and is a maximal ideal containing a.

Note To properly appreciate this proof, the student should work the exercise in group theory at the end of this section (see page 114).

Definition Suppose R is a domain and a, b ∈ R. Then we say a ∼ b iff there exists a unit u with au = b. Note that ∼ is an equivalence relation. If a ∼ b, then a and b are said to be associates.
Examples If n ∈ Z is not zero, then its associates are n and −n. If F is a field and g ∈ F[x] is a nonzero polynomial, then the associates of g are all cg where c is a nonzero constant. The associates of 1 are the units of R, while the only associate of 0 is 0 itself.

Definition Suppose R is a domain and a, b ∈ R. An element a divides b (a|b) if ∃ c ∈ R with ac = b.

The following theorem is elementary, but it shows how associates fit into the scheme of things.

Theorem Suppose R is a domain and a, b ∈ (R − 0). Then the following are equivalent.
1)  a ∼ b.
2)  a|b and b|a.
3)  aR = bR.

Parts 1) and 3) above show there is a bijection from the associate classes of R to the principal ideals of R. Thus if R is a PID, there is a bijection from the associate classes of R to the ideals of R.

Definition Suppose R is a domain and a ∈ R is a nonzero nonunit.
1)  a is irreducible if it does not factor, i.e., a = bc ⇒ b or c is a unit.
2)  a is prime if it generates a prime ideal, i.e., a|bc ⇒ a|b or a|c. If an element of a domain generates a nonzero prime ideal, it is called a prime element.

Note This is immediate from the definitions: if a ∼ b, then a is irreducible (prime) iff b is irreducible (prime), i.e., if a is irreducible (prime) and u is a unit, then au is irreducible (prime).

Note If a is a prime and a|c1 c2 ··· cn, then a divides one of them, i.e., a|ci for some i. This follows from the definition and induction on n. If each cj is irreducible, then a ∼ ci for some i.

Theorem Factorization into primes is unique up to order and associates. That is, if d = b1 b2 ··· bn = c1 c2 ··· cm with each bi and each ci prime, then n = m and for some permutation σ of the indices, bi and cσ(i) are associates for every i.

Proof This follows from the notes above.

Note also that in this case ∃ a unit u and primes p1, p2, .., pt where no two are associates and du = p1^s1 p2^s2 ··· pt^st.
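The prototype of unique factorization is Z itself. A short pure-Python sketch (outside the text; the function name factor is ours) producing the prime factorization by trial division:

```python
def factor(n):
    """Factor a positive integer into primes (Z is a UFD)."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(factor(360))  # [2, 2, 2, 3, 3, 5]
```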
Theorem Suppose R is a PID and a ∈ R is a nonzero nonunit. Then the following are equivalent.
1)  aR is a maximal ideal.
2)  aR is a prime ideal.
3)  a is a prime element.
4)  a is irreducible.

Proof Every maximal ideal is a prime ideal, so 1) ⇒ 2). Parts 2) and 3) are equivalent by definition, and 3) ⇒ 4) because every prime element is an irreducible element. Now suppose a is irreducible and show aR is a maximal ideal. If I is an ideal containing aR, ∃ b ∈ R with I = bR. Since b divides a, the element b is a unit or an associate of a. This means I = R or I = aR.

Definition R is a factorization domain (FD) means that R is a domain and if a is a nonzero nonunit element of R, then a factors into a finite product of irreducibles.

Definition R is a unique factorization domain (UFD) means R is a FD in which factorization is unique (up to order and associates).

Theorem If R is a UFD and a is a nonzero nonunit of R, then a is irreducible ⇔ a is prime.

Proof Every prime element is an irreducible element. So suppose a is irreducible, a|bc, and show a|b or a|c. If either b or c is a unit or is zero, the result is obvious, so suppose each of b and c is a nonzero nonunit element of R. There exists an element d with ad = bc. Each of b and c factors as the product of irreducibles, and the product of these products is the factorization of bc. It follows from the uniqueness of the factorization of ad = bc that one of these irreducibles is an associate of a. Thus a|b or a|c, and therefore the element a is a prime.

This is a revealing and useful theorem, as seen in the next theorem.

Theorem Suppose R is a FD. Then the following are equivalent.
1)  R is a UFD.
2)  Every irreducible element of R is prime.

Proof We already know 1) ⇒ 2). Part 2) ⇒ 1) because factorization into primes is always unique. Thus if R is a FD, then R is a UFD iff each irreducible element generates a prime ideal.
Our goal is to prove that a PID is a UFD. Using the two theorems above, it only remains to show that a PID is a FD. This turns out to be equivalent to the property that any collection of ideals has a "maximal" element. This property is satisfied by many of the classical rings in mathematics. We shall see below that this is a useful concept which fits naturally into the study of unique factorization domains.

Theorem Suppose R is a commutative ring. Then the following are equivalent.
1)  If I ⊂ R is an ideal, ∃ a finite set {a1, a2, .., an} ⊂ R such that I = a1R + a2R + ··· + anR, i.e., each ideal of R is finitely generated.
2)  Any nonvoid collection of ideals of R contains an ideal I which is maximal in the collection. (It need not contain all the ideals of the collection, nor need it be a maximal ideal of the ring R. The ideal I is maximal only in the sense described. This means if J is an ideal in the collection with J ⊃ I, then J = I.)
3)  If I1 ⊂ I2 ⊂ I3 ⊂ ··· is a monotonic sequence of ideals, ∃ t0 ≥ 1 such that It = It0 for all t ≥ t0.

Proof Suppose 1) is true and show 3). The ideal I = I1 ∪ I2 ∪ ··· is finitely generated and ∃ t0 ≥ 1 such that It0 contains those generators. Thus 3) is true. Now suppose 2) is true and show 1). Let I be an ideal of R, and consider the collection of all finitely generated ideals contained in I. By 2) there is a maximal one, and it must be I itself. Thus 1) is true. We now have 2) ⇒ 1) ⇒ 3), so suppose 2) is false and show 3) is false. There is a collection of ideals of R such that any ideal in the collection is properly contained in another ideal of the collection. Thus it is possible to construct a sequence of ideals I1 ⊂ I2 ⊂ I3 ⊂ ··· with each properly contained in the next, and therefore 3) is false. (Actually this construction requires the Hausdorff Maximality Principle or some form of the Axiom of Choice, but we slide over that.)

Definition If R satisfies these properties, R is said to be Noetherian, or it is said to satisfy the ascending chain condition. Having three definitions makes this property useful and easy to use. For example, see the next theorem. Its proof will not require that ideals be principally generated, but only that they be finitely generated.

Theorem A Noetherian domain is a FD. In particular, a PID is a FD.

Proof Suppose there is a nonzero nonunit element that does not factor as the finite product of irreducibles. Consider all ideals dR where d does not factor. ∃ a maximal one cR. The element c must be reducible, i.e., c = ab where neither a nor b is a unit. Each of aR and bR properly contains cR, and so each of a and b factors as a finite product of irreducibles. This gives a finite factorization of c into irreducibles, which is a contradiction.
and R/(2) is isomorphic to Z2 [x]/(x2 − [5]) = Z2 [x]/(x2 + [1]). (There is a UFD R where R[[x]] is not a UFD.)
Domains With Nonunique Factorizations Next are presented two of the standard examples of Noetherian domains that are not unique factorization domains... Show that R is a subring of √ R which is not a UFD.) Theorem If R is Noetherian and I ⊂ R is a proper ideal. F [[x1 . where (x2 − 5) represents the ideal (x2 − 5)Z[x].. m ∈√ Z}. √ √ Exercise Let R = Z( 5) = {n + m 5 : n. xn ]] is a UFD. then R/I is Noetherian.2 of An Introduction to Complex Analysis in Several Variables by L. (This follows immediately from the deﬁnition. o Theorem Suppose R is a commutative ring. . xn ]] are Noetherian.
Proof See Theorem 6.. which is a contradiction. Theorem If R is a UFD then R[x1 .) If R is a PID. F [x] is a UFD.. which are stated here only for reference. xn ] is a UFD.
.. . Show R is isomorphic to Z[x]/(x2 − 5). xn ] is a UFD. So Z is a UFD and if F is a ﬁeld. .. (This is the famous Hilbert Basis Theorem. . xn ] and R[[x1 ..6.. It takes more work to prove the following theorems. Corollary A PID is a UFD. .. Bourbaki. Then R is Noetherian ⇒ R[x1 . F [x1 . H¨rmander. Thus if F is a ﬁeld. This and the previous theorem show that Noetherian is a ubiquitous property in ring theory.. In particular 2 · 2 = (1 − 5) · (−1 − 5) are two distinct irreducible factorizations of 4.Chapter 6
Appendix
113
of a and b factors as a ﬁnite product of irreducibles.. . Thus if F is a ﬁeld. See page 566 of Commutative Algebra by N. which is not a domain.
You see the basic structure of UFDs is quite easy..) Theorem Germs of analytic functions on Cn form a UFD.. This gives a ﬁnite factorization of c into irreducibles. (This theorem goes all the way back to Gauss. then the formal power series R[[x1 ..... xn ]] is a UFD.
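The failed uniqueness in the exercise can be checked by direct computation. In the sketch below (helper names are ours, not from the text), elements n + m√5 are integer pairs (n, m); the finite norm search only suggests, and does not prove, that 2 is irreducible.

```python
def mul(u, v):
    """(n1 + m1*sqrt5)(n2 + m2*sqrt5) in Z(sqrt 5), as integer pairs."""
    (n1, m1), (n2, m2) = u, v
    return (n1 * n2 + 5 * m1 * m2, n1 * m2 + m1 * n2)

def norm(u):
    """|n^2 - 5 m^2|; multiplicative, and equal to 1 exactly on units."""
    n, m = u
    return abs(n * n - 5 * m * m)

print(mul((2, 0), (2, 0)))      # -> (4, 0):  2 * 2 = 4
print(mul((1, -1), (-1, -1)))   # -> (4, 0):  (1 - sqrt5)(-1 - sqrt5) = 4
# no element of small height has norm 2, consistent with 2 being irreducible
print(any(norm((n, m)) == 2
          for n in range(-50, 51) for m in range(-50, 51)))  # -> False
```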
Exercise  Let R̄ = R[x, y, z]/(x² − yz), where the coefficients are the real numbers. Show x² − yz is irreducible, and thus prime, in R[x, y, z]. If u ∈ R[x, y, z], let ū ∈ R̄ be the coset containing u. Show R̄ is not a UFD. In particular, x̄ · x̄ = ȳ · z̄ are two distinct irreducible factorizations of x̄². Show R̄/(x̄) is isomorphic to R[y, z]/(yz), which is not a domain. An easier approach is to let f : R[x, y, z] → R[x, y] be the ring homomorphism defined by f(x) = xy, f(y) = x², and f(z) = y². Then S = R[xy, x², y²] is the image of f, and S is isomorphic to R̄. Note that xy, x², and y² are irreducible in S, and (xy)(xy) = (x²)(y²) are two distinct irreducible factorizations of (xy)² in S.

Exercise In Group Theory  If G is an additive abelian group, a subgroup H of G is said to be maximal if H ≠ G and there are no subgroups properly between H and G. Show that H is maximal iff G/H ≈ Zp for some prime p. For simplicity, consider the case G = Q. Which one of the following is true?

1)  If a ∈ Q, then there is a maximal subgroup H of Q which contains a.

2)  Q contains no maximal subgroups.

Splitting Short Exact Sequences  Suppose B is an R-module and K ⊂ B is a submodule. As defined in the chapter on linear algebra, K is a summand of B provided ∃ a submodule L of B with K + L = B and K ∩ L = 0. In this case we write K ⊕ L = B. When is K a summand of B? It turns out that K is a summand of B iff there is a splitting map from B/K to B. In particular, if B/K is free, K must be a summand of B. This is used below to show that if R is a PID, then every submodule of Rⁿ is free.

Theorem 1  Suppose R is a ring, B and C are R-modules, and g : B → C is a surjective homomorphism with kernel K. Then the following are equivalent.

1)  K is a summand of B.

2)  g has a right inverse, i.e., ∃ a homomorphism h : C → B with g ◦ h = I : C → C. (h is called a splitting map.)

Proof  Suppose 1) is true, i.e., suppose ∃ a submodule L of B with K ⊕ L = B. Then (g|L) : L → C is an isomorphism. If i : L → B is inclusion, then h defined by h = i ◦ (g|L)⁻¹ is a right inverse of g. Now suppose 2) is true and h : C → B is a right inverse of g. Then h is injective, K + h(C) = B, and K ∩ h(C) = 0. Thus K ⊕ h(C) = B.
Definition  Suppose f : A → B and g : B → C are R-module homomorphisms. The statement that 0 → A → B → C → 0 is a short exact sequence (s.e.s.) means f is injective, g is surjective, and f(A) = ker(g). The canonical split s.e.s. is A → A ⊕ C → C where f = i1 and g = π2. A short exact sequence is said to split if ∃ an isomorphism B ≈ A ⊕ C such that the following diagram commutes, i.e., the composition A → B ≈ A ⊕ C is i1, and the composition B ≈ A ⊕ C → C is π2.

              f         g
    0 −→ A −−→ B −−→ C −→ 0
          i1 ↘    ↓≈    ↗ π2
              A ⊕ C

We now restate the previous theorem in this terminology.

Theorem 1.1  A short exact sequence 0 → A → B → C → 0 splits iff f(A) is a summand of B, iff g : B → C has a splitting map. If C is a free R-module, there is a splitting map, and thus the sequence splits. (Showing these properties are equivalent to the splitting of the sequence is a good exercise in the art of diagram chasing.)

Proof  We know from the previous theorem that f(A) is a summand of B iff B → C has a splitting map. Now suppose C has a free basis T ⊂ C. There exists a function h : T → B such that g ◦ h(c) = c for each c ∈ T. The function h extends to a homomorphism from C to B which is a right inverse of g.

Theorem 2  If R is a domain, then the following are equivalent.

1)  R is a PID.

2)  Every submodule of RR is a free R-module of dimension ≤ 1.

This theorem restates the ring property of PID as a module property. Although this theorem is transparent, 1) ⇒ 2) is a precursor to the following classical result.

Theorem 3  If R is a PID and A ⊂ Rⁿ is a submodule, then A is a free R-module of dimension ≤ n. Thus subgroups of Zⁿ are free Z-modules of dimension ≤ n.

Proof  From the previous theorem we know this is true for n = 1. Suppose n > 1, the theorem is true for submodules of Rⁿ⁻¹, and A ⊂ Rⁿ is a submodule.
Consider the following short exact sequences, where f : Rⁿ⁻¹ → Rⁿ⁻¹ ⊕ R is inclusion and g = π : Rⁿ⁻¹ ⊕ R → R is the projection.

    0 −→ Rⁿ⁻¹ −→ Rⁿ⁻¹ ⊕ R −→ R −→ 0
    0 −→ A ∩ Rⁿ⁻¹ −→ A −→ π(A) −→ 0

By induction, A ∩ Rⁿ⁻¹ is free of dimension ≤ n − 1. If π(A) = 0, then A ⊂ Rⁿ⁻¹. If π(A) ≠ 0, it is free of dimension 1, and thus the sequence splits by Theorem 1.1. In either case, A is a free submodule of dimension ≤ n.

Exercise  Let A ⊂ Z² be the subgroup generated by {(6, 24), (16, 64)}. Show A is a free Z-module of dimension 1. Also show the s.e.s. Z4 −×3→ Z12 −→ Z3 splits, but Z −×2→ Z −→ Z2 and Z2 −×2→ Z4 −→ Z2 do not (see top of page 78).

Euclidean Domains  The ring Z possesses the Euclidean algorithm, and the polynomial ring F[x] has the division algorithm (pages 14 and 45). The concept of Euclidean domain is an abstraction of these properties, and the efficiency of this abstraction is displayed in this section. If R is a Euclidean domain and M is a finitely generated R-module, then M is the sum of cyclic modules. This is one of the great classical theorems of abstract algebra, and you don't have to worry about it becoming obsolete. Anyway, it is possible to just play around with matrices and get some deep results. Here N will denote the set of all non-negative integers, not just the set of positive integers.

Definition  A domain R is a Euclidean domain provided ∃ φ : (R − 0) −→ N such that if a, b ∈ (R − 0), then

1)  φ(a) ≤ φ(ab).

2)  ∃ q, r ∈ R such that a = bq + r with r = 0 or φ(r) < φ(b).

The first axiom, φ(a) ≤ φ(ab), is used only in Theorem 2; it is unnecessary for Theorem 3 below, and is sometimes omitted from the definition.

Examples of Euclidean Domains
    Z with φ(n) = |n|.
    A field F with φ(a) = 1 ∀ a ≠ 0, or with φ(a) = 0 ∀ a ≠ 0.
    F[x], where F is a field, with φ(f = a0 + a1x + · · · + anxⁿ) = deg(f).
    Z[i] = {a + bi : a, b ∈ Z} = Gaussian integers, with φ(a + bi) = a² + b².

Theorem 1  If R is a Euclidean domain, then R is a PID and thus a UFD.

Proof  If I is a nonzero ideal, then ∃ b ∈ I − 0 satisfying φ(b) ≤ φ(a) ∀ a ∈ I − 0. Then b generates I, because if a ∈ I − 0, ∃ q, r with a = bq + r. Now r ∈ I, and r ≠ 0 ⇒ φ(r) < φ(b), which is impossible. Thus r = 0 and a ∈ bR, so I = bR.

Theorem 2  If R is a Euclidean domain and a, b ∈ R − 0, then φ(1) is the smallest integer in the image of φ, a is a unit in R iff φ(a) = φ(1), and a and b are associates ⇒ φ(a) = φ(b).

Proof  This is a good exercise.

The following remarkable theorem is the foundation for the results of this section.

Theorem 3  If R is a Euclidean domain and (ai,j) ∈ Rn,t is a nonzero matrix, then by elementary row and column operations (ai,j) may be transformed to

    ( d1              )
    (    d2           )
    (        ...      )
    (            dm   )
    ( 0             0 )

where each di ≠ 0 and di | di+1 for 1 ≤ i < m. Also d1 generates the ideal of R generated by the entries of (ai,j).

Proof  Let I ⊂ R be the ideal generated by the elements of the matrix A = (ai,j). Since R is a PID, there is an element d1 with I = d1R, and this will turn out to be the d1 displayed in the theorem. If E ∈ Rn, then the ideal J generated by the elements of EA has J ⊂ I. If E is invertible, then J = I. In the same manner, if E ∈ Rt is invertible and J is the ideal generated by the elements of AE, then J = I. This means that row and column operations on A do not change the ideal I. The matrix (ai,j) has at least one nonzero element d with φ(d) a minimum. However, row and column operations on (ai,j) may produce elements with smaller φ values. To consolidate this approach, consider matrices obtained from (ai,j) by a finite number of row and column operations. Among these, let (bi,j) be one which has an entry d1 ≠ 0 with φ(d1) a minimum. By elementary operations of type 2, the entry d1 may be moved to the (1, 1) place in the matrix. Then d1 will divide the other entries in the first row, else we could obtain an entry with a smaller φ value. Thus by column operations of type 3, the other entries of the first row may be made zero. In a similar manner, by row operations of type 3, the matrix may be changed to the following form.

    ( d1  0 · · · 0 )
    (  0            )
    (  ·    ci,j    )
    (  0            )

Note that d1 divides each ci,j, and thus I = d1R. The proof now follows by induction on the size of the matrix.

This is an example of a theorem that is easy to prove playing around at the blackboard. Yet it must be a deep theorem, because the next two theorems are easy consequences.
Theorem 4  Suppose R is a Euclidean domain, B is a finitely generated free R-module, and A ⊂ B is a nonzero submodule. Then ∃ free bases {a1, a2, ..., at} for A and {b1, b2, ..., bn} for B, with t ≤ n, such that each ai = dibi, where each di ≠ 0 and di | di+1 for 1 ≤ i < t. Thus B/A ≈ R/d1 ⊕ R/d2 ⊕ · · · ⊕ R/dt ⊕ Rⁿ⁻ᵗ.

Proof  By Theorem 3 in the section Splitting Short Exact Sequences, A has a free basis {v1, v2, ..., vt}. Let {w1, w2, ..., wn} be a free basis for B, where n ≥ t. The composition

    Rᵗ ≈→ A ⊂→ B ≈→ Rⁿ        ei −→ vi ,   wi −→ ei

is represented by a matrix (ai,j) ∈ Rn,t where vi = a1,iw1 + a2,iw2 + · · · + an,iwn. By the previous theorem, ∃ invertible matrices U ∈ Rn and V ∈ Rt such that

    U (ai,j) V =  ( d1              )
                  (    d2           )
                  (        ...      )
                  (            dt   )
                  ( 0             0 )

with di | di+1. Since changing the isomorphisms Rᵗ ≈→ A and B ≈→ Rⁿ corresponds to changing the bases {v1, v2, ..., vt} and {w1, w2, ..., wn}, the theorem follows.

Theorem 5  If R is a Euclidean domain and M is a finitely generated R-module, then M ≈ R/d1 ⊕ R/d2 ⊕ · · · ⊕ R/dt ⊕ Rᵐ, where each di ≠ 0 and di | di+1 for 1 ≤ i < t.

Proof  By hypothesis, ∃ a finitely generated free module B and a surjective homomorphism B −→ M −→ 0. Let A be the kernel, so 0 −→ A ⊂→ B −→ M −→ 0 is a s.e.s. and B/A ≈ M. The result now follows from the previous theorem.

The way Theorem 5 is stated, some or all of the elements di may be units, and for such di, R/di = 0. If we assume that no di is a unit, then the elements d1, d2, ..., dt are called invariant factors. They are unique up to associates, but we do not bother with that here. If R = Z and we select the di to be positive, then they are unique. If R = F[x] and we select the di to be monic, then they are unique.

The splitting in Theorem 5 is not the ultimate, because the modules R/di may split into the sum of other cyclic modules. To prove this we need the following lemma.

Lemma  Suppose R is a PID and b and c are nonzero nonunit elements of R. Suppose b and c are relatively prime, i.e., there is no prime common to their prime factorizations. Then bR and cR are comaximal ideals. (See p 108 for comaximal.)

Proof  There exists an a ∈ R with aR = bR + cR. Since a|b and a|c, a is a unit, so R = bR + cR.

Theorem 6  Suppose R is a PID and d is a nonzero nonunit element of R. Assume d = p1^s1 p2^s2 · · · pt^st is the prime factorization of d (see bottom of p 110). Then the natural map

    R/d ≈→ R/p1^s1 ⊕ · · · ⊕ R/pt^st

is an isomorphism of R-modules. (The elements pi^si are called the elementary divisors of R/d.)

Proof  If i ≠ j, pi^si and pj^sj are relatively prime. By the lemma above, they are comaximal, and thus by the Chinese Remainder Theorem, the natural map is a ring isomorphism (page 108). Since the natural map is also an R-module homomorphism, it is an R-module isomorphism.
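Theorem 6 can be checked concretely for R = Z and d = 12 = 2² · 3. The snippet below (our own illustration, not from the text) verifies that the natural map Z/12 → Z/4 ⊕ Z/3 is an additive bijection.

```python
# Finite check of Theorem 6 for R = Z, d = 12 = 2^2 * 3:
# the natural map n + 12Z -> (n + 4Z, n + 3Z) should be an isomorphism.

images = [(n % 4, n % 3) for n in range(12)]

# bijective: 12 distinct images out of the 12 possible pairs
bijective = len(set(images)) == 12

# additive: the image of a sum is the componentwise sum of the images
additive = all(
    ((m + n) % 4, (m + n) % 3) == ((m % 4 + n % 4) % 4, (m % 3 + n % 3) % 3)
    for m in range(12) for n in range(12)
)
print(bijective, additive)   # -> True True
```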
Torsion Submodules  This will give a little more perspective to this section.

Definition  Suppose M is a module over a domain R. An element m ∈ M is said to be a torsion element if ∃ r ∈ R with r ≠ 0 and mr = 0. This is the same as saying m is dependent. If R = Z, it is the same as saying m has finite order. Denote by T(M) the set of all torsion elements of M. If T(M) = 0, we say that M is torsion free.

Theorem 7  Suppose M is a module over a domain R. Then T(M) is a submodule of M, and M/T(M) is torsion free.

Proof  This is a simple exercise.

Theorem 8  Suppose R is a Euclidean domain and M is a finitely generated R-module which is torsion free. Then M is a free R-module, M ≈ Rᵐ.

Proof  This follows immediately from Theorem 5.

Theorem 9  Suppose R is a Euclidean domain and M is a finitely generated R-module. Then the following s.e.s. splits.

    0 −→ T(M) −→ M −→ M/T(M) −→ 0

Proof  By Theorem 7, M/T(M) is torsion free. By Theorem 8, it is a free R-module, and thus there is a splitting map. Of course this theorem is transparent anyway, because Theorem 5 gives a splitting of M into a torsion part and a free part.

This theorem carries the splitting as far as it can go, as seen by the next exercise.

Exercise  Suppose R is a PID, p ∈ R is a prime element, and s ≥ 1. Then the R-module R/pˢ has no proper submodule which is a summand.
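Theorems 5 and 9 decompose a finitely generated Z-module into cyclic pieces plus a free part. For the subgroup A ⊂ Z² generated by (6, 24) and (16, 64) (the exercise above), Z²/A ≈ Z/d1 ⊕ Z/d2. The sketch below (our own helpers) computes the di from gcds of minors, a standard fact used here in place of the row-and-column reduction of Theorem 3; a zero entry signals a free summand Z.

```python
from functools import reduce
from itertools import combinations
from math import gcd

def det(m):
    """Determinant by first-row expansion (fine for tiny integer matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def minor_gcd(a, k):
    """gcd of all k x k minors of a."""
    rs, cs = range(len(a)), range(len(a[0]))
    minors = [det([[a[i][j] for j in js] for i in ks])
              for ks in combinations(rs, k) for js in combinations(cs, k)]
    return reduce(gcd, (abs(v) for v in minors), 0)

def diagonal_entries(a):
    """d1, d2, ... with di | di+1, via quotients of gcds of minors."""
    n = min(len(a), len(a[0]))
    g = [1] + [minor_gcd(a, k) for k in range(1, n + 1)]
    return [g[k] // g[k - 1] if g[k] else 0 for k in range(1, n + 1)]

# rows generate A inside Z^2, so Z^2/A = Z/2 (+) Z: torsion part Z/2, free part Z
print(diagonal_entries([[6, 24], [16, 64]]))   # -> [2, 0]
```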
Note  It follows from Theorem 9 that ∃ a free submodule V of M such that T(M) ⊕ V = M. The first summand T(M) is unique, but the complementary summand V is not unique. V depends upon the splitting map and is unique only up to isomorphism.

To complete this section, here are two more theorems that follow from the work we have done.

Theorem 10  Suppose T is a domain and T* is the multiplicative group of units of T. If G is a finite subgroup of T*, then G is a cyclic group. Thus if F is a finite field, the multiplicative group F* is cyclic. Thus if p is a prime, (Zp)* is cyclic.

Proof  This is a corollary to Theorem 5 with R = Z. The multiplicative group G is isomorphic to an additive group Z/d1 ⊕ Z/d2 ⊕ · · · ⊕ Z/dt, where each di > 1 and di | di+1 for 1 ≤ i < t. Every u in the additive group has the property that udt = 0. So every g ∈ G is a solution to x^dt − 1 = 0. If t > 1, the equation will have degree less than the number of roots, which is impossible. Thus t = 1, and so G is cyclic.

Exercise  For which primes p and q is the group of units (Zp × Zq)* a cyclic group?

We know from Exercise 2) on page 59 that an invertible matrix over a field is the product of elementary matrices. This result also holds for any invertible matrix over a Euclidean domain.

Theorem 11  Suppose R is a Euclidean domain and A ∈ Rn is a matrix with nonzero determinant. Then by elementary row and column operations, A may be transformed to a diagonal matrix

    ( d1              )
    (    d2           )
    (        ...      )
    (             dn  )

where each di ≠ 0 and di | di+1 for 1 ≤ i < n. Also d1 generates the ideal generated by the entries of A. Furthermore, A is invertible iff each di is a unit. Thus if A is invertible, A is the product of elementary matrices.
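Theorem 10 with F = Z7: a brute-force search (our own check, not from the text) finds the generators of the cyclic group (Z7)*.

```python
# Search for generators of the multiplicative group (Z7)*, which Theorem 10
# says is cyclic.  order(g, p) is the multiplicative order of g mod p.

def order(g, p):
    k, x = 1, g % p
    while x != 1:
        x = x * g % p
        k += 1
    return k

p = 7
orders = {g: order(g, p) for g in range(1, p)}
generators = [g for g, k in orders.items() if k == p - 1]
print(generators)   # -> [3, 5]
```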
Proof  It follows from Theorem 3 that A may be transformed to a diagonal matrix with di | di+1. Since the determinant of A is not zero, it follows that each di ≠ 0. The matrix A is invertible iff the diagonal matrix is invertible, which is true iff each di is a unit. If each di is a unit, then the diagonal matrix is the product of elementary matrices of type 1. Therefore, if A is invertible, it is the product of elementary matrices.

Exercise  Let R = Z, A = ( 3 11 ; 0 4 ), and D = ( 3 11 ; 1 4 ). Perform elementary operations on A and D to obtain diagonal matrices where the first diagonal element divides the second diagonal element. Write D as the product of elementary matrices. Find the characteristic polynomials of A and D. Find an elementary matrix B over Z such that B⁻¹AB is diagonal. Find an invertible matrix C in R2 such that C⁻¹DC is diagonal. Show C cannot be selected in Q2.

Jordan Blocks  In this section, we define the two special types of square matrices used in the Rational and Jordan canonical forms. Suppose R is a commutative ring, q = a0 + a1x + · · · + an−1xⁿ⁻¹ + xⁿ ∈ R[x] is a monic polynomial of degree n ≥ 1, and V is the R[x]-module V = R[x]/q. V is a torsion module over the ring R[x], but as an R-module, V has a free basis {1, x, x², ..., xⁿ⁻¹}. (See the last part of the last theorem on page 46.) Multiplication by x defines an R-module endomorphism on V, and C(q) will be the matrix of this endomorphism with respect to this basis. Let T : V → V be defined by T(v) = vx. If h(x) ∈ R[x], then h(T) is the R-module homomorphism given by multiplication by h(x), and h(T) is the zero homomorphism iff h(x) ∈ qR[x]. That is to say, q(T) = a0I + a1T + · · · + Tⁿ is the zero homomorphism, and in general the homomorphism from R[x]/q to R[x]/q given by multiplication by h(x) is zero iff h(x) ∈ qR[x]. All of this is supposed to make the next theorem transparent.

Theorem  Let V have the free basis {1, x, x², ..., xⁿ⁻¹}. The companion matrix representing T is

    C(q) =  ( 0 0 · · · 0  −a0   )
            ( 1 0 · · · 0  −a1   )
            ( 0 1 · · · 0  −a2   )
            ( ·         ·   ·    )
            ( 0 0 · · · 1  −an−1 )

The characteristic polynomial of C(q) is q, and |C(q)| = (−1)ⁿ a0. Finally, if h(x) ∈ R[x], h(C(q)) is zero iff h(x) ∈ qR[x].

Theorem  Suppose λ ∈ R and q(x) = (x − λ)ⁿ. Let V have the free basis {1, (x − λ), (x − λ)², ..., (x − λ)ⁿ⁻¹}. Then the matrix representing T is the Jordan block

    B(q) =  ( λ 0 · · · 0 0 )
            ( 1 λ · · · 0 0 )
            ( ·    ...    · )
            ( 0 0 · · · 1 λ )

The characteristic polynomial of B(q) is q, and |B(q)| = λⁿ = (−1)ⁿ a0. Finally, if h(x) ∈ R[x], h(B(q)) is zero iff h(x) ∈ qR[x].

Note  For n = 1, C(a0 + x) = B(a0 + x) = (−a0). This is the only case where a block matrix may be the zero matrix.

Note  In B(q), if you wish to have the 1s above the diagonal, reverse the order of the basis for V. A Jordan block displays its eigenvalue on the diagonal, and is more interesting than the companion matrix C(q). Note that the Jordan block B(q) is the sum of a scalar matrix and a nilpotent matrix. As we shall see later, the Rational canonical form will always exist, while the Jordan canonical form will exist iff the characteristic polynomial factors as the product of linear polynomials.
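A small computation (ours, not from the text) illustrating the first theorem above: for q(x) = 6 − 5x + x², the companion matrix C(q) satisfies q(C(q)) = 0.

```python
def companion(a):
    """C(q) for monic q = a[0] + a[1] x + ... + x^n, 1s below the diagonal."""
    n = len(a)
    return [[(1 if i == j + 1 else 0) + (-a[i] if j == n - 1 else 0)
             for j in range(n)] for i in range(n)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, m):
    """Evaluate a polynomial (constant term first, leading 1 included) at m."""
    n = len(m)
    acc = [[0] * n for _ in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]   # m^0 = I
    for c in coeffs:
        acc = [[acc[i][j] + c * power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, m)
    return acc

C = companion([6, -5])                 # q(x) = 6 - 5x + x^2
print(C)                               # -> [[0, -6], [1, 5]]
print(poly_of_matrix([6, -5, 1], C))   # -> [[0, 0], [0, 0]]
```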
Jordan Canonical Form  We are finally ready to prove the Rational and Jordan forms. Using the previous sections, all that's left to do is to put the pieces together. (For an overview of Jordan form, read first the section in Chapter 5, page 96.)

Suppose R is a commutative ring, V is an R-module, and T : V → V is an R-module homomorphism. Define a scalar multiplication V × R[x] → V by

    v(a0 + a1x + · · · + arxʳ) = va0 + T(v)a1 + · · · + Tʳ(v)ar.

Theorem 1  Under this scalar multiplication, V is an R[x]-module.

This is just an observation, but it is one of the great tricks in mathematics. Questions about the transformation T are transferred to questions about the module V over the ring R[x]. And in the case R is a field, R[x] is a Euclidean domain, and so we know almost everything about V as an R[x]-module. Now we say all this again with a little more detail.

From here on in this section, we suppose R is a field F, V is a finitely generated F-module, T : V → V is a linear transformation, and V is an F[x]-module with vx = T(v). A submodule of VF[x] is a submodule of VF which is invariant under T. Our goal is to select a basis for V such that the matrix representing T is in some simple form.

We know VF[x] is the sum of cyclic modules from Theorems 5 and 6 in the section on Euclidean Domains. Since V is finitely generated as an F-module, the free part of this decomposition will be zero. From Theorem 5, it follows that

    VF[x] ≈ F[x]/d1 ⊕ F[x]/d2 ⊕ · · · ⊕ F[x]/dt

where each di is a monic polynomial of degree ≥ 1, and di | di+1. Now a basis is selected for these cyclic modules, and the matrix representing T is described. Pick {1, x, x², ..., xᵐ⁻¹} as the F-basis for F[x]/di, where m is the degree of the polynomial di.

Theorem 2  With respect to this basis, the matrix representing T is

    ( C(d1)                  )
    (       C(d2)            )
    (             ...        )
    (                  C(dt) )

The characteristic polynomial of T is p = d1d2 · · · dt, and p(T) = 0. This is called the Rational canonical form for T. This gives the Rational Canonical Form, and that is all there is to it.

Now we apply Theorem 6 to each F[x]/di. This gives VF[x] ≈ F[x]/p1^s1 ⊕ · · · ⊕ F[x]/pr^sr, where the pi are irreducible monic polynomials of degree at least 1. The pi need not be distinct. Pick an F-basis for each F[x]/pi^si as before.

Theorem 3  With respect to this basis, the matrix representing T is

    ( C(p1^s1)                      )
    (          C(p2^s2)             )
    (                   ...         )
    (                      C(pr^sr) )

The characteristic polynomial of T is p = p1^s1 · · · pr^sr, and p(T) = 0. This is a type of canonical form, but it does not seem to have a name.

Now suppose the characteristic polynomial of T factors in F[x] as the product of linear polynomials. Thus in the theorem above, pi = x − λi and

    VF[x] ≈ F[x]/(x − λ1)^s1 ⊕ · · · ⊕ F[x]/(x − λr)^sr

is an isomorphism of F[x]-modules. Now we pick another basis for each of the cyclic modules (see the second theorem in the section on Jordan Blocks): pick {1, (x − λi), (x − λi)², ..., (x − λi)ᵐ⁻¹} as the F-basis for F[x]/(x − λi)^si, where m is si.

Theorem 4  With respect to this basis, the matrix representing T is

    ( B((x − λ1)^s1)                                 )
    (                B((x − λ2)^s2)                  )
    (                               ...              )
    (                                  B((x − λr)^sr) )
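A quick check of the block structure (our own example): B(q) for q = (x − 2)³ is the scalar matrix 2I plus a nilpotent N, and (B − 2I)³ = 0.

```python
def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

lam, n = 2, 3
B = [[lam if i == j else (1 if i == j + 1 else 0) for j in range(n)]
     for i in range(n)]                     # eigenvalue on the diagonal, 1s below
N = [[B[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
N2 = matmul(N, N)
N3 = matmul(N2, N)
print(B)    # -> [[2, 0, 0], [1, 2, 0], [0, 1, 2]]
print(N3)   # -> [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```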
The characteristic polynomial of T is p = (x − λ1)^s1 · · · (x − λr)^sr, and p(T) = 0. This is called the Jordan canonical form for T. Note that the λi need not be distinct.

Note  A diagonal matrix is in Rational canonical form and in Jordan canonical form. This is the case where each block is one by one. Of course, a diagonal matrix is about as canonical as you can get. Note also that if a matrix is in Jordan form, its trace is the sum of the eigenvalues and its determinant is the product of the eigenvalues. Finally, this section is loosely written, so it is important to use the transpose principle to write three other versions of the last two theorems. This part should be studied only if you need it.

Exercise  Suppose F is a field of characteristic 0 and T ∈ Fn has trace(Tⁱ) = 0 for 0 < i ≤ n. Show T is nilpotent. The characteristic polynomial p of T may not factor into linears in F[x], and thus T may have no conjugate in Fn which is in Jordan form. However, this exercise can still be worked using Jordan form. This is based on the fact that there exists a field F̄ containing F as a subfield, such that p factors into linears in F̄[x]. This fact is not proved in this book, but it is assumed for this exercise. So ∃ an invertible matrix U ∈ F̄n so that U⁻¹TU is in Jordan form, and of course, T is nilpotent iff U⁻¹TU is nilpotent. The point is that it suffices to consider the case where T is in Jordan form. So suppose T is in Jordan form and trace(Tⁱ) = 0 for 1 ≤ i ≤ n. Let p ∈ F[x] be the characteristic polynomial of T. We know p(T) = 0, and thus trace(p(T)) = 0. Since the traces of the powers of T vanish, trace(p(T)) = a0n, where a0 is the constant term of p(x). Thus a0n = 0, and since the field has characteristic 0, a0 = 0, and so 0 is an eigenvalue of T. This means that one block of T is a strictly lower triangular matrix. Removing this block leaves a smaller matrix which still satisfies the hypothesis, and the result follows by induction on the size of T.

This exercise illustrates the power and facility of Jordan form. It also has a cute corollary.

Corollary  Suppose F is a field of characteristic 0, n ≥ 1, and (λ1, λ2, ..., λn) ∈ Fⁿ satisfies λ1ⁱ + λ2ⁱ + · · · + λnⁱ = 0 for each 1 ≤ i ≤ n. Then λi = 0 for 1 ≤ i ≤ n.
Minimal Polynomials  To conclude this section, here are a few comments on the minimal polynomial of a linear transformation. Suppose V is an n-dimensional vector space over a field F and T : V → V is a linear transformation. As before, we make V a module over F[x] with vx = T(v).

Definition  Ann(VF[x]) is the set of all h ∈ F[x] which annihilate V, i.e., which satisfy V h = 0. This is a nonzero ideal of F[x], and is thus generated by a unique monic polynomial u(x) ∈ F[x], Ann(VF[x]) = uF[x]. The polynomial u is called the minimal polynomial of T. Note that u(T) = 0, and if h(x) ∈ F[x], h(T) = 0 iff h is a multiple of u in F[x]. If p(x) ∈ F[x] is the characteristic polynomial of T, then p(T) = 0, and thus p is a multiple of u.

Now we state this again in terms of matrices. Suppose A ∈ Fn is a matrix representing T. Then u(A) = 0, and if h(x) ∈ F[x], h(A) = 0 iff h is a multiple of u in F[x]. The polynomial u is also called the minimal polynomial of A. Note that these properties hold for any matrix representing T, and thus similar matrices have the same minimal polynomial. If A is given to start with, use the linear transformation T : Fⁿ → Fⁿ determined by A to define the polynomial u. If p(x) ∈ F[x] is the characteristic polynomial of A, then p(A) = 0, and thus p is a multiple of u.

Now suppose q ∈ F[x] is a monic polynomial and C(q) ∈ Fn is the companion matrix defined in the section Jordan Blocks. Whenever q(x) = (x − λ)ⁿ, let B(q) ∈ Fn be the Jordan block matrix also defined in that section. Recall that q is the characteristic polynomial and the minimal polynomial of each of these matrices. This, together with the Rational form and the Jordan form, will allow us to understand the relation of the minimal polynomial to the characteristic polynomial.

Exercise  Suppose Ai ∈ Fni has qi as its characteristic polynomial and its minimal polynomial, and

    A =  ( A1             )
         (     A2         )
         (         ...    )
         (             Ar )

Find the characteristic polynomial and the minimal polynomial of A.

Exercise

1)  Suppose A is the matrix displayed in Theorem 2 above. Find the characteristic and minimal polynomials of A.

2)  Suppose A is the matrix displayed in Theorem 3 above. Find the characteristic and minimal polynomials of A.

3)  Suppose A is the matrix displayed in Theorem 4 above. Find the characteristic and minimal polynomials of A.

4)  Suppose λ ∈ F. Show λ is a root of the characteristic polynomial of A iff λ is a root of the minimal polynomial of A. Show that if λ is a root, its order in the characteristic polynomial is at least as large as its order in the minimal polynomial.

5)  Suppose F̄ is a field containing F as a subfield. Show that the minimal polynomial of A ∈ Fn is the same as the minimal polynomial of A considered as a matrix in F̄n. (This funny looking exercise is a little delicate.)

6)  Let F = R and A = ( 0 −3 ; 1 −1 ). Find the characteristic and minimal polynomials of A.
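In the spirit of the block-diagonal exercise, here is an example of ours where the minimal polynomial is strictly smaller than the characteristic polynomial: for A = diag(C(q), C(q)) with q = x² − 1, the characteristic polynomial is q², but A² − I = 0 already, so the minimal polynomial is q.

```python
def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Cq = [[0, 1], [1, 0]]                      # companion matrix of q = x^2 - 1
A = [[Cq[i % 2][j % 2] if i // 2 == j // 2 else 0 for j in range(4)]
     for i in range(4)]                    # block diagonal diag(Cq, Cq)
ident = [[int(i == j) for j in range(4)] for i in range(4)]
A2 = matmul(A, A)
# A^2 = I, so q(A) = 0; A is not a scalar matrix, so no degree-1 polynomial
# annihilates it, and the minimal polynomial is exactly q
print(A2 == ident)   # -> True
```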
Determinants  In the chapter on matrices, it is stated without proof that the determinant of the product is the product of the determinants (see page 63). The purpose of this section is to give a proof of this. We suppose R is a commutative ring, C is an R-module, n ≥ 2, and B1, B2, ..., Bn is a sequence of R-modules.

Definition  A map f : B1 ⊕ B2 ⊕ · · · ⊕ Bn → C is R-multilinear means that if 1 ≤ i ≤ n, and bj ∈ Bj for j ≠ i, then f, as a function of the i-th coordinate with the others held fixed, defines an R-linear map from Bi to C.

Theorem  The set of all R-multilinear maps is an R-module.

Proof  From the first exercise in Chapter 5, the set of all functions from B1 ⊕ B2 ⊕ · · · ⊕ Bn to C is an R-module (see page 69). It must be seen that the R-multilinear maps form a submodule. It is easy to see that if f1 and f2 are R-multilinear, so is f1 + f2. Also, if f is R-multilinear and r ∈ R, then (f r) is R-multilinear.

From here on, suppose B1 = B2 = · · · = Bn = B.

Definition

1)  f is symmetric means f(b1, ..., bn) = f(bτ(1), ..., bτ(n)) for all permutations τ on {1, 2, ..., n}.

2)  f is skew-symmetric if f(b1, ..., bn) = sign(τ) f(bτ(1), ..., bτ(n)) for all τ.

3)  f is alternating if f(b1, ..., bn) = 0 whenever some bi = bj for i ≠ j.

Theorem

i)  Each of these three types defines a submodule of the set of all R-multilinear maps.

ii)  Alternating ⇒ skew-symmetric.

iii)  If no element of C has order 2, then alternating ⇐⇒ skew-symmetric.

Proof  Part i) is immediate. To prove ii), assume f is alternating. It suffices to show that f(b1, ..., bn) = −f(bτ(1), ..., bτ(n)) where τ is a transposition. For simplicity, assume τ = (1, 2). Then 0 = f(b1 + b2, b1 + b2, b3, ..., bn) = f(b1, b2, b3, ..., bn) + f(b2, b1, b3, ..., bn), and the result follows. To prove iii), suppose f is skew-symmetric and no element of C has order 2, and show f is alternating. Suppose for convenience that b1 = b2, and show f(b1, b1, b3, ..., bn) = 0. If we let τ be the transposition (1, 2), we get f(b1, b1, b3, ..., bn) = −f(b1, b1, b3, ..., bn), and so 2f(b1, b1, b3, ..., bn) = 0. Thus f(b1, b1, b3, ..., bn) = 0.

Now we are ready for determinant. Suppose C = R; in this case, multilinear maps are usually called multilinear forms. Suppose B is Rⁿ with the canonical basis {e1, e2, ..., en}. (We think of a matrix A ∈ Rn as n column vectors, i.e., as an element of B ⊕ B ⊕ · · · ⊕ B.) First we recall the definition of determinant. Define d : B ⊕ B ⊕ · · · ⊕ B → R by

    d(a1,1e1 + a2,1e2 + · · · + an,1en, ..., a1,ne1 + a2,ne2 + · · · + an,nen)
        = Σ over all τ of sign(τ)(aτ(1),1 aτ(2),2 · · · aτ(n),n).

In other words, if A = (ai,j) ∈ Rn, then d(A1, A2, ..., An) = |A|. The next theorem follows from the section on determinants on page 61.

Theorem  d is an alternating multilinear form with d(e1, e2, ..., en) = 1.

If c ∈ R, then dc is an alternating multilinear form, because the set of alternating forms is an R-module. It turns out that this is all of them, as seen by the following theorem.

Theorem  Suppose f : B ⊕ B ⊕ · · · ⊕ B → R is an alternating multilinear form. Then f = d f(e1, e2, ..., en). This means f is the multilinear form d times the scalar f(e1, e2, ..., en). In other words, if A = (ai,j) ∈ Rn, then f(A1, ..., An) = |A| f(e1, ..., en). Thus the set of alternating forms is a free R-module of dimension 1, and the determinant is a generator.
en ).. ...2 f (e2 . . Dual Spaces The concept of dual module is basic. This incredible classiﬁcation of these alternating forms makes the proof of the following theorem easy. a tangent plane to a diﬀerentiable manifold is a real vector space.1 en . Use the fact that CA = (CA1 . ... e2 . en ) = Af (e1 . f (A) = Af (e1 ..2 e2 ) = a1.1 a2. In the notation of the previous theorem. but in general there is no natural isomorphism from V to V ∗ .n en ) = ai1 . In algebraic topology. en ). Deﬁne f : Rn → R by f (A) = CA. if any is = it for s = t. . By the previous theorem. Therefore the sum is just all τ aτ (1). while the union of the dual spaces is the cotangent bundle.2 f (e2 . e2 . . . (See the third theorem on page 63.1 aτ (2). .2 · · · aτ (n). eτ (n) ) = all τ sign(τ )aτ (1). The sections of the tangent bundle are called vector ﬁelds while the sections of the cotangent bundle are called 1forms.. e1 ) + a2. For example.n f (eτ (1) . . ...1 a1.1 ai2 . F ). Thus the tangent (cotangent) bundle may be considered to be the dual of the cotangent (tangent) bundle.2 − a1. The sum of the cohomology groups forms a ring. CA2 .1 aτ (2).1 e2 . V ∗ is isomorphic to V . 1 ≤ in ≤ n. A = (A1 . it follows that CA = f (A) = AC. e2 ) = Af (e1 .1 e2 + · · · + an. A2 .2 f (e1 ..1 e1 + a2. However there is a natural isomorphism from V to V ∗∗ . A2 .1 a2.. This remarkable fact has many expressions in mathematics.1 a1.. If V is a ﬁnitely generated vector space over a ﬁeld F ..n f (e1 .n e1 + a2. e2 ).
...2 · · · aτ (n). you can simply write it out. eτ (2) ..n f (ei1 .. . not only in algebra. . For the general case. but also in other areas such as diﬀerential geometry and topology. that term is 0 because f is alternating.. If A ∈ Rn ..130
Appendix
Chapter 6
Proof For n = 2. en ) = CI = C.2 a2. The union of these spaces is the tangent bundle. A ∈ Rn .1 a2. 1 ≤ i2 ≤ n. An ) = CA.. ein ) where the sum is over all 1 ≤ i1 ≤ n.2 f (e1 .
Proof Suppose C ∈ Rn .) Theorem If C.. e1 ) + a1. . and f : Rn ⊕ · · · ⊕ Rn → R has f (A1 . . .2 · · · ain . e2 ) + a2. its dual V ∗ is deﬁned as V ∗ = HomF (V... and so V ∗ is the dual of V and V may be considered to be the dual of V ∗ . while the sum of the homology groups does not. e2 ) = (a1.n e2 + · · · + an. f (a1. Since f (e1 . e2 .1 e1 + a2. a1. However.. then CA = CA. ..2 e1 + a2. a1. B = Rn and Rn = Rn ⊕ Rn ⊕ · · · ⊕ Rn . f (a1.1 )f (e1 . .. . CAn ) to show that f is an alternating multilinear form. homology groups are derived from chain complexes. while cohomology groups are derived from the dual chain complexes... An ) where Ai ∈ Rn is column i of A. e2 .. ei2 .
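The permutation expansion of d and the product rule |CA| = |C| |A| are easy to check numerically. Here is a minimal stdlib Python sketch (the function names det, sign, and matmul are our own, and the sample matrices are arbitrary):

```python
from itertools import permutations
from math import prod

def sign(p):
    # sign of a permutation p of {0,...,n-1}: (-1)^(number of inversions)
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(A):
    # d(A) = sum over all permutations tau of sign(tau) a_{tau(1),1} ... a_{tau(n),n}
    n = len(A)
    return sum(sign(p) * prod(A[p[j]][j] for j in range(n)) for p in permutations(range(n)))

def matmul(C, A):
    # product of two n x n matrices
    n = len(C)
    return [[sum(C[i][k] * A[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

# d is alternating: two equal columns force determinant 0
assert det([[2, 5, 2], [1, 3, 1], [4, 7, 4]]) == 0

# |CA| = |C||A|
C = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
A = [[2, 1, 1], [0, 3, 1], [1, 0, 2]]
assert det(matmul(C, A)) == det(C) * det(A)
```

With integer entries this is exact arithmetic over the commutative ring Z, so the assertions are genuine identities rather than floating-point approximations.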
Thus the concept of dual module has considerable power. We develop here the basic theory of dual modules.

Suppose R is a commutative ring and W is an R-module.

Definition  If M is an R-module, let H(M) be the R-module H(M) = HomR(M, W). If M and N are R-modules and g : M → N is an R-module homomorphism, let H(g) : H(N) → H(M) be defined by H(g)(f) = f ◦ g. Note that H(g) is an R-module homomorphism.

(Diagram:  g : M → N and f : N → W, with H(g)(f) = f ◦ g : M → W.)

Theorem

   i)  If M1 and M2 are R-modules, then H(M1 ⊕ M2) ≈ H(M1) ⊕ H(M2).

   ii)  If I : M → M is the identity, then H(I) : H(M) → H(M) is the identity.

   iii)  If g : M1 → M2 and h : M2 → M3 are R-module homomorphisms, then H(g) ◦ H(h) = H(h ◦ g).

Proof  Parts i) and ii) are immediate. To prove iii), suppose f : M3 → W is a homomorphism. Then (H(g) ◦ H(h))(f) = H(g)(f ◦ h) = f ◦ h ◦ g = H(h ◦ g)(f).

(Diagram:  g : M1 → M2, h : M2 → M3, and f : M3 → W, with f ◦ h : M2 → W and f ◦ h ◦ g : M1 → W.)

Note  In the language of category theory, H is a contravariant functor from the category of R-modules to itself.
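The contravariant identity H(g) ◦ H(h) = H(h ◦ g) is just associativity of composition, and can be seen concretely by precomposing additive maps on Z-modules. A small Python sketch (the particular maps g, h, f are our own choices):

```python
def H(g):
    # H(g) sends f in Hom(N, W) to f o g in Hom(M, W): precomposition with g
    return lambda f: (lambda x: f(g(x)))

g = lambda x: (x, 2 * x)          # g : Z -> Z + Z, an additive map
h = lambda p: p[0] + 3 * p[1]     # h : Z + Z -> Z
f = lambda y: 5 * y               # f : Z -> W = Z

lhs = H(g)(H(h)(f))               # (H(g) o H(h))(f)
rhs = H(lambda x: h(g(x)))(f)     # H(h o g)(f)
assert all(lhs(x) == rhs(x) == f(h(g(x))) for x in range(-20, 21))
```

Note how the composite goes through in the opposite order: g and h feed M1 forward into M3, while H pulls f backward from H(M3) to H(M1).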
Theorem  If M and N are R-modules and g : M → N is an isomorphism, then H(g) : H(N) → H(M) is an isomorphism with H(g)−1 = H(g−1).

Proof
   IH(N) = H(IN) = H(g ◦ g−1) = H(g−1) ◦ H(g)
   IH(M) = H(IM) = H(g−1 ◦ g) = H(g) ◦ H(g−1)

Theorem

   i)  If g : M → N is a surjective homomorphism, then H(g) : H(N) → H(M) is injective.

   ii)  If g : M → N is an injective homomorphism and g(M) is a summand of N, then H(g) : H(N) → H(M) is surjective.

   iii)  If R is a field and g : M → N is a homomorphism, then g is surjective (injective) iff H(g) is injective (surjective).

Proof  This is a good exercise.

For the remainder of this section, suppose W = RR, i.e., W is the ring R considered as a module over itself. In this case H(M) = HomR(M, R) is denoted by H(M) = M∗ and H(g) is denoted by H(g) = g∗.

Theorem  Suppose M is a free module with a basis {v1, . . . , vn}. Define vi∗ ∈ M∗ by vi∗(v1 r1 + · · · + vn rn) = ri. Then {v1∗, . . . , vn∗} is a free basis for M∗, called the dual basis. Therefore M∗ is free and is isomorphic to M.

Proof  First consider the case M = Rn with its canonical basis {e1, . . . , en}. We know (Rn)∗ ≈ R1,n, i.e., any homomorphism from Rn to R is given by a 1 × n matrix. Now R1,n is free with basis {e1∗, . . . , en∗}, where ei∗ is the row matrix (0, . . . , 0, 1i, 0, . . . , 0). Thus ei∗(ej) = δi,j. For the general case, let g : Rn → M be the isomorphism given by g(ei) = vi. Then g∗ : M∗ → (Rn)∗ sends vi∗ to ei∗. Since g∗ is an isomorphism, {v1∗, . . . , vn∗} is a basis for M∗.
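For M = Rn the dual basis is concrete: ei∗ is the row matrix with 1 in position i, and any functional is determined by its values on the basis. A quick Python check (the sample functional f is our own):

```python
def e(i, n):
    # the standard column basis vector e_i of R^n (0-indexed)
    return [1 if k == i else 0 for k in range(n)]

def estar(i):
    # e_i* as a function: the 1 x n row matrix (0,...,0,1_i,0,...,0)
    return lambda v: v[i]

n = 3
# dual basis property: e_i*(e_j) = delta_{i,j}
assert all(estar(i)(e(j, n)) == (1 if i == j else 0)
           for i in range(n) for j in range(n))

# any f in (R^n)* is given by the row matrix (f(e_1),...,f(e_n))
f = lambda v: 2 * v[0] - 7 * v[1] + 4 * v[2]
row = [f(e(i, n)) for i in range(n)]
v = [5, 1, -2]
assert f(v) == sum(row[i] * v[i] for i in range(n))
```

The second assertion is exactly the statement (Rn)∗ ≈ R1,n from the proof above: evaluating f on the basis recovers the 1 × n matrix that represents it.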
Theorem  Suppose M is a free module with a basis {v1, . . . , vm}, N is a free module with a basis {w1, . . . , wn}, and g : M → N is the homomorphism given by the matrix A = (ai,j) ∈ Rn,m, i.e., g(vj) = a1,j w1 + · · · + an,j wn. Then the matrix of g∗ : N∗ → M∗ with respect to the dual bases is given by At.

Proof  Note that g∗(wi∗) is a homomorphism from M to R. Evaluation on vj gives g∗(wi∗)(vj) = (wi∗ ◦ g)(vj) = wi∗(g(vj)) = wi∗(a1,j w1 + · · · + an,j wn) = ai,j. Thus g∗(wi∗) = ai,1 v1∗ + · · · + ai,m vm∗, and thus g∗ is represented by At.

Now suppose M = N = Rn and g : Rn → Rn is represented by a matrix A ∈ Rn. Then g∗ : (Rn)∗ → (Rn)∗ is represented by At. This is with the elements of Rn and (Rn)∗ written as column vectors. If the elements of Rn are written as column vectors and the elements of (Rn)∗ are written as row vectors, then g∗ sends the row vector f to the row vector fA.

Exercise  If U is an R-module, define φU : U∗ ⊕ U → R by φU(f, u) = f(u). Show that φU is R-bilinear. Suppose g : M → N is an R-module homomorphism, f ∈ N∗, and v ∈ M. Show that φN(f, g(v)) = φM(g∗(f), v). Now suppose f ∈ (Rn)∗ and v ∈ Rn. Use the theorem above to show that φ : (Rn)∗ ⊕ Rn → R has the property φ(f, Av) = φ(At f, v). With f written as a row vector, this reads φ(f, Av) = φ(fA, v); of course each side is just the matrix product fAv. Dual spaces are confusing, and this exercise should be worked out completely.

"Double dual" is a "covariant" functor.

Definition  For any module M, define α : M → M∗∗ by letting α(m) : M∗ → R be the homomorphism which sends f ∈ M∗ to f(m) ∈ R, i.e., α(m) is given by evaluation at m. Note that α is a homomorphism.

Theorem  If M and N are R-modules and g : M → N is a homomorphism, then the following diagram is commutative.

(Diagram:  α : M → M∗∗ and α : N → N∗∗ vertically, with g : M → N and g∗∗ : M∗∗ → N∗∗ horizontally.)

Proof  On M, α is given by α(v) = φM(−, v), and on N, α is given by α(u) = φN(−, u). The proof follows from the equation φN(f, g(v)) = φM(g∗(f), v).

Theorem  If M is a free R-module with a finite basis {v1, . . . , vn}, then α : M → M∗∗ is an isomorphism.

Proof  {α(v1), . . . , α(vn)} is the dual basis of {v1∗, . . . , vn∗}, i.e., α(vi) = (vi∗)∗.
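Both the transpose theorem and the identity φ(f, Av) = φ(At f, v) from the exercise can be checked with a sample matrix. A minimal Python sketch (matvec, transpose, phi, and the numbers are our own choices):

```python
def matvec(A, v):
    # apply the matrix A to the column vector v
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def transpose(A):
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

def phi(f, v):
    # phi(f, v) = f(v), with f written as a row vector
    return sum(a * b for a, b in zip(f, v))

A = [[1, 2], [3, 4]]          # g : R^2 -> R^2
f = [5, -1]                   # an element of (R^2)*
v = [2, 7]

# phi(f, Av) = phi(A^t f, v)
assert phi(f, matvec(A, v)) == phi(matvec(transpose(A), f), v)

# the matrix of g* with respect to the dual bases is A^t:
# entry (j, i) is g*(w_i*)(v_j) = a_{i,j}
e = lambda j, n: [1 if k == j else 0 for k in range(n)]
Gstar = [[matvec(A, e(j, 2))[i] for i in range(2)] for j in range(2)]
assert Gstar == transpose(A)
```

Working the symbolic exercise out by hand is still the point; the sketch only confirms the bookkeeping of rows, columns, and transposes on one example.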
Note  Suppose R is a field and C is the category of finitely generated vector spaces over R. In the language of category theory, α is a natural equivalence between the identity functor and the double dual functor on C.

Note  For finitely generated vector spaces, α is used to identify V and V∗∗. In general there is no natural way to identify V and V∗. However, for real inner product spaces there is.

Theorem  Let R = R and V be an n-dimensional real inner product space. Then β : V → V∗ given by β(v) = (v, −) is an isomorphism.

Proof  β is injective, and V and V∗ have the same dimension.

Note  If {v1, . . . , vn} is any orthonormal basis for V, then {β(v1), . . . , β(vn)} is the dual basis of {v1, . . . , vn}, that is, β(vi) = vi∗. The isomorphism β : V → V∗ defines an inner product on V∗, and under this structure, β is an isometry and {v1∗, . . . , vn∗} is an orthonormal basis for V∗. Also, if U is another n-dimensional inner product space and f : V → U is an isometry, then f∗ : U∗ → V∗ is an isometry and the following diagram commutes.

(Diagram:  β : V → V∗ and β : U → U∗ vertically, with f : V → U and f∗ : U∗ → V∗ horizontally.)

Note  If β is used to identify V with V∗, then f∗ : U∗ → V∗ may be considered as an isometry from U to V. Under this identification, V∗ is the dual of V and V is the dual of V∗.

Exercise  Suppose R is a commutative ring, T is an infinite index set, and for each t ∈ T, Rt = R. Show (⊕t∈T Rt)∗ is isomorphic to RT = ∏t∈T Rt. Now let T = Z+ and M = ⊕t∈T Rt. Show M∗ is not isomorphic to M.
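The statement β(vi) = vi∗ for an orthonormal basis amounts to (vi, vj) = δi,j, which is easy to verify numerically in R2 with a rotated standard basis (the angle t below is an arbitrary choice):

```python
import math

def inner(u, v):
    # the usual inner product on R^2
    return sum(a * b for a, b in zip(u, v))

t = 0.6
v1 = (math.cos(t), math.sin(t))
v2 = (-math.sin(t), math.cos(t))

# beta(v_i) = (v_i, -) takes the value delta_{i,j} on v_j,
# so beta carries an orthonormal basis to its dual basis
basis = (v1, v2)
for i, vi in enumerate(basis):
    for j, vj in enumerate(basis):
        expected = 1.0 if i == j else 0.0
        assert abs(inner(vi, vj) - expected) < 1e-12
```

Floating point forces the small tolerance here; the algebraic statement itself is exact for any orthonormal basis.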
23 module. 46. 39 of a function. 108. 45 Domain euclidean. 64 Conjugation by a unit. 37 Complex numbers. 56 Dimension of a free module. 74 Cycle. 46. 58
. 19 Boolean algebras. 120 Elementary matrices. 32 Cyclic group.7 Binary operation. 76 Coset. 85 of a matrix. 125 Center of group. 50. 83 Division algorithm. 128 Diagonal matrix.Index
Abelian group. 66 Chinese remainder theorem. 39 Cartesian product. 95 Eigenvectors. 97. 32 Ascending chain condition. 1. 85. 29 of modules. 71 Algebraically closed ﬁeld. 79 of a module. 52 Boolean rings. 104 Conjugate. 83 Bijective or onetoone correspondence. 20 in a ring. 116 integral domain. 131 Coproduct or sum of modules. 20. 120 Commutative ring. 31 CayleyHamilton theorem. 50 Characteristic polynomial of a homomorphism. 132 Dual spaces. 40. 47. 98. 111 Dual basis. 107 Determinant of a homomorphism. 130 Eigenvalues. 60. 43 Axiom of choice. 2. 46 unique factorization. 47. 95 Elementary divisors. 22 Change of basis. 109 Automorphism of groups. 63 135 Cofactor of a matrix. 66. 11 Cayley’s theorem. 72. 70 of rings. 97 Alternating group. 119. 5 principal ideal. 108 Classical adjoint of a matrix. 112 Associate elements in a domain. 95 of a matrix. 24. 62 Comaximal ideals. 78. 42. 83 Characteristic of a ring. 44 Contravariant functor. 51 Cancellation law in a group. 10 Basis or free basis canonical or standard for Rn .
40 Integers. 4 Euclidean algorithm. 34 Isometry. 111 Fermat’s little theorem. 36 as a module. 7 injective. 4 Equivalence relation. 15 Group. 7 Function space Y T as a group. 106 Factorization domain (FD). 70 Equivalence class. 60 Homormophism of groups. 6 bijective. 109 of a ring. 42 of modules. 3. 55 Irreducible element. 79. 41 prime. 26. 42. 78 Function or map. 19 symmetric. 78. 87. 23 multiplicative. 29 module. 44 as a set. 109 Hilbert. 50 Field. 116 Evaluation map. 22.136 Elementary operations. 119 Inverse image. 55 Generating sequence in a module. 79 Inner product spaces. 27. 39 Formal power series. 1. 2 Induction. 90 GramSchmidt orthonormalization. 44 Ideal left. 110 Isometries of a square. 19 abelian. 100 Free basis. 7 surjective. 51 Image of a function. 101 Isomorphism
. 46 Gauss. 41 maximal. 72. 14 Invariant factors. 20 cyclic. 32 Exponential of a matrix. 25 Index set. 6 Greatest common divisor. 46 right. 40 Geometry of determinant. 49. 69 Homomorphism of quotient group. 7 Invertible or nonsingular matrix. 78 Generators of Zn . 57. 49 Even permutation. 69 as a ring. 12 Fundamental theorem of algebra. 113 Fourier series. 13 Injective or onetoone. 109 principal. 23 of rings. 7 Independent sequence in a module. 41 Idempotent element in a ring. 122 Endomorphism of a module. 113 Homogeneous equation. 47. 78 Index of a subgroup. 83 Free Rmodule. 20 additive. 113 General linear group GLn (R). 31
Index
Hausdorﬀ maximality principle. 14 Euclidean domain. 98 Integers mod n. 47. 100 Graph of a function. 74 ring. 7.
70 Least common multiple. 85 Matrix elementary. 59. 55 representing a linear transformation. 119 Right and left inverses of functions. 46 Row echelon form. 10 Ring. 56 Maximal ideal. 87 monotonic subcollection. 16 Principal ideal domain (PID). 96. 26 Odd permutation. 123. 34. 5 Permutation. 125 Kernel. 107. 39 Polynomial ring. 12 Prime element. 129 Multiplicative group of a ﬁnite ﬁeld. 109 integer. 125 Relation. 57
. 62 Module over a ring. 18 Linear combination. 42 Product of groups. 70 of rings.Index of groups. 56 homomorphism. 102 Orthogonal vectors. 109 independent sequence. 16 elements in a PID. 93 Noetherian ring. 43 Jacobian matrix. 91 Jordan block. 58 invertible. 68 Monomial. 11 Quotient group. 27 Quotient module. 32 Onto or surjective. 96. 17. 43. 121 Nilpotent element. 110 ideal. 28. 7. 79 Order of an element or group. 6 Rank of a matrix. 45 Power set. 112 Normal subgroup. 42
137
Range of a function. 31 Pigeonhole principle. 11 Projection maps. 123 Jordan canonical form. 59 Scalar matrix. 23 Orthogonal group O(n). 99 Partial ordering. 75 of rings. 3 Partition of a set. 48 Monotonic collection of sets. 49 of sets. 78 Linear ordering. 3 Linear transformation. 89 Rational canonical form. 35 of modules. 4 subgroup. 2. 8. 74 Quotient ring. 3 Relatively prime integers. 86. 114 Minimal polynomial. 46 Principal ideal. 99 Orthonormal sequence. 84 triangular. 127 Minor of a matrix. 29 of modules. 4 Multilinear forms. 38 Root of a polynomial.
81 Splitting map. 105 Short exact sequence. 67. 71 Self adjoint. 59. 9. 115 Surjective or onto. 115 Sign of a permutation. 85 Volume preserving homomorphism. 31 Symmetric matrix. 103. 77. 38 Vector space. 121 Trace of a homormophism. 7. 85 of a matrix. 103 Torsion element of a module. 32 Unique factorization. 21. 54. 65 Transpose of a matrix. 132 Transposition. 113 of integers. 16 Unique factorization domain (UFD).138 Scalar multiplication. 114 Standard basis for Rn . 14. 8 Subgroup. 64 Solutions of equations. 103. 41 Summand of a module. in principal ideal domains. 69 Subring. 56. 79 Symmetric groups. 79 Strips (horizontal and vertical). 39
Index
. 90 Zero divisor in a ring. 38. 21 Submodule. 60 Similar matrices. 72. 111 Unit in a ring.