Elements of

Abstract and Linear Algebra
E. H. Connell
E.H. Connell
Department of Mathematics
University of Miami
P.O. Box 249085
Coral Gables, Florida 33124 USA
ec@math.miami.edu
Mathematical Subject Classifications (1991): 12-01, 13-01, 15-01, 16-01, 20-01
© 1999 E.H. Connell
November 30, 2000 [http://www.math.miami.edu/~ec/book/]
Introduction
In 1965 I first taught an undergraduate course in abstract algebra. It was fun to
teach because the material was interesting and the class was outstanding. Five of
those students later earned a Ph.D. in mathematics. Since then I have taught the
course about a dozen times from various texts. Over the years I developed a set of
lecture notes and in 1985 I had them typed so they could be used as a text. They
now appear (in modified form) as the first five chapters of this book. Here were some
of my motives at the time.
1) To have something as short and inexpensive as possible. In my experience,
students like short books.
2) To avoid all innovation. To organize the material in the most simple-minded
straightforward manner.
3) To order the material linearly. To the extent possible, each section should use
the previous sections and be used in the following sections.
4) To omit as many topics as possible. This is a foundational course, not a topics
course. If a topic is not used later, it should not be included. There are three
good reasons for this. First, linear algebra has top priority. It is better to go
forward and do more linear algebra than to stop and do more group and ring
theory. Second, it is more important that students learn to organize and write
proofs themselves than to cover more subject matter. Algebra is a perfect place
to get started because there are many “easy” theorems to prove. There are
many routine theorems stated here without proofs, and they may be considered
as exercises for the students. Third, the material should be so fundamental
that it be appropriate for students in the physical sciences and in computer
science. Zillions of students take calculus and cookbook linear algebra, but few
take abstract algebra courses. Something is wrong here, and one thing wrong
is that the courses try to do too much group and ring theory and not enough
matrix theory and linear algebra.
5) To offer an alternative for computer science majors to the standard discrete
mathematics courses. Most of the material in the first four chapters of this text
is covered in various discrete mathematics courses. Computer science majors
might benefit by seeing this material organized from a purely mathematical
viewpoint.
Over the years I used the five chapters that were typed as a base for my algebra
courses, supplementing them as I saw fit. In 1996 I wrote a sixth chapter, giving
enough material for a full first year graduate course. This chapter was written in the
same “style” as the previous chapters, i.e., everything was right down to the nub. It
hung together pretty well except for the last two sections on determinants and dual
spaces. These were independent topics stuck on at the end. In the academic year
1997-98 I revised all six chapters and had them typed in LaTeX. This is the personal
background of how this book came about.
It is difficult to do anything in life without help from friends, and many of my
friends have contributed to this text. My sincere gratitude goes especially to Marilyn
Gonzalez, Lourdes Robles, Marta Alpar, John Zweibel, Dmitry Gokhman, Brian
Coomes, Huseyin Kocak, and Shulim Kaliman. To these and all who contributed,
this book is fondly dedicated.
This book is a survey of abstract algebra with emphasis on linear algebra. It is
intended for students in mathematics, computer science, and the physical sciences.
The first three or four chapters can stand alone as a one semester course in abstract
algebra. However they are structured to provide the background for the chapter on
linear algebra. Chapter 2 is the most difficult part of the book because groups are
written in additive and multiplicative notation, and the concept of coset is confusing
at first. After Chapter 2 the book gets easier as you go along. Indeed, after the
first four chapters, the linear algebra follows easily. Finishing the chapter on linear
algebra gives a basic one year undergraduate course in abstract algebra. Chapter 6
continues the material to complete a first year graduate course. Classes with little
background can do the first three chapters in the first semester, and chapters 4 and 5
in the second semester. More advanced classes can do four chapters the first semester
and chapters 5 and 6 the second semester. As bare as the first four chapters are, you
still have to truck right along to finish them in one semester.
The presentation is compact and tightly organized, but still somewhat informal.
The proofs of many of the elementary theorems are omitted. These proofs are to
be provided by the professor in class or assigned as homework exercises. There is a
non-trivial theorem stated without proof in Chapter 4, namely the determinant of the
product is the product of the determinants. For the proper flow of the course, this
theorem should be assumed there without proof. The proof is contained in Chapter 6.
The Jordan form should not be considered part of Chapter 5. It is stated there only
as a reference for undergraduate courses. Finally, Chapter 6 is not written primarily
for reference, but as an additional chapter for more advanced courses.
This text is written with the conviction that it is more effective to teach abstract
and linear algebra as one coherent discipline rather than as two separate ones. Teaching abstract algebra and linear algebra as distinct courses results in a loss of synergy
and a loss of momentum. Also I am convinced it is easier to build a course from a
base than to extract it from a big book. Because after you extract it, you still have to
build it. Basic algebra is a subject of incredible elegance and utility, but it requires
a lot of organization. This book is my attempt at that organization. Every effort
has been extended to make the subject move rapidly and to make the flow from one
topic to the next as seamless as possible. The goal is to stay focused and go forward,
because mathematics is learned in hindsight. I would have made the book shorter,
but I did not have any more time.
Unfortunately mathematics is a difficult and heavy subject. The style and
approach of this book are to make it a little lighter. This book works best when
viewed lightly and read as a story. I hope the students and professors who try it
enjoy it.
E. H. Connell
Department of Mathematics
University of Miami
Coral Gables, FL 33124
ec@math.miami.edu
Outline

Chapter 1 Background and Fundamentals of Mathematics
Sets, Cartesian products
Relations, partial orderings, Hausdorff maximality principle, equivalence relations
Functions, bijections, strips, solutions of equations, right and left inverses, projections
Notation for the logic of mathematics
Integers, subgroups, unique factorization

Chapter 2 Groups
Groups, scalar multiplication for additive groups
Subgroups, order, cosets
Normal subgroups, quotient groups, the integers mod n
Homomorphisms
Permutations, the symmetric groups
Product of groups

Chapter 3 Rings
Rings
Units, domains, fields
The integers mod n
Ideals and quotient rings
Homomorphisms
Polynomial rings
Product of rings
The Chinese remainder theorem
Characteristic
Boolean rings

Chapter 4 Matrices and Matrix Rings
Addition and multiplication of matrices, invertible matrices
Transpose
Triangular, diagonal, and scalar matrices
Elementary operations and elementary matrices
Systems of equations
Determinants, the classical adjoint
Similarity, trace, and characteristic polynomial

Chapter 5 Linear Algebra
Modules, submodules
Homomorphisms
Homomorphisms on R^n
Cosets and quotient modules
Products and coproducts
Summands
Independence, generating sets, and free basis
Characterization of free modules
Uniqueness of dimension
Change of basis
Vector spaces, square matrices over fields, rank of a matrix
Geometric interpretation of determinant
Linear functions approximate differentiable functions locally
The transpose principle
Nilpotent homomorphisms
Eigenvalues, characteristic roots
Jordan canonical form
Inner product spaces, Gram-Schmidt orthonormalization
Orthogonal matrices, the orthogonal group
Diagonalization of symmetric matrices

Chapter 6 Appendix
The Chinese remainder theorem
Prime and maximal ideals and UFDs
Splitting short exact sequences
Euclidean domains
Jordan blocks
Jordan canonical form
Determinants
Dual spaces
Chapter 1
Background and Fundamentals of
Mathematics
This chapter is fundamental, not just for algebra, but for all fields related to mathematics. The basic concepts are products of sets, partial orderings, equivalence relations, functions, and the integers. An equivalence relation on a set A is shown to be
simply a partition of A into disjoint subsets. There is an emphasis on the concept
of function, and the properties of surjective, injective, and bijective. The notion of a
solution of an equation is central in mathematics, and most properties of functions
can be stated in terms of solutions of equations. In elementary courses the section
on the Hausdorff Maximality Principle should be ignored. The final section gives a
proof of the unique factorization theorem for the integers.
Notation Mathematics has its own universally accepted shorthand. The symbol
∃ means “there exists” and ∃! means “there exists a unique”. The symbol ∀ means
“for each” and ⇒ means “implies”. Some sets (or collections) are so basic they have
their own proprietary symbols. Five of these are listed below.
N = Z^+ = the set of positive integers = {1, 2, 3, ...}
Z = the ring of integers = {..., −2, −1, 0, 1, 2, ...}
Q = the field of rational numbers = {a/b : a, b ∈ Z, b ≠ 0}
R = the field of real numbers
C = the field of complex numbers = {a + bi : a, b ∈ R} (i^2 = −1)
Sets Suppose A, B, C,... are sets. We use the standard notation for intersection
and union.

A ∩ B = {x : x ∈ A and x ∈ B} = the set of all x which are elements of A and B.

A ∪ B = {x : x ∈ A or x ∈ B} = the set of all x which are elements of A or B.
Any set called an index set is assumed to be non-void. Suppose T is an index set and
for each t ∈ T, A_t is a set.

∪_{t∈T} A_t = {x : ∃ t ∈ T with x ∈ A_t}

∩_{t∈T} A_t = {x : if t ∈ T, x ∈ A_t} = {x : ∀ t ∈ T, x ∈ A_t}
Let ∅ be the null set. If A ∩ B = ∅, then A and B are said to be disjoint.
Definition Suppose each of A and B is a set. The statement that A is a subset
of B (A ⊂ B) means that if a is an element of A, then a is an element of B. That
is, a ∈ A ⇒ a ∈ B.
Exercise Suppose each of A and B is a set. The statement that A is not a subset
of B means .
Theorem (De Morgan’s laws) Suppose S is a set. If C ⊂ S (i.e., if C is a subset
of S), let C′, the complement of C in S, be defined by C′ = S − C = {x ∈ S : x ∉ C}.
Then for any A, B ⊂ S,

(A ∩ B)′ = A′ ∪ B′   and   (A ∪ B)′ = A′ ∩ B′
Cartesian Products If X and Y are sets, X × Y = {(x, y) : x ∈ X and y ∈ Y}.
In other words, the Cartesian product of X and Y is defined to be the set of all
ordered pairs whose first term is in X and whose second term is in Y.

Example R × R = R^2 = the plane.
Definition If each of X_1, ..., X_n is a set, X_1 × ··· × X_n = {(x_1, ..., x_n) : x_i ∈ X_i
for 1 ≤ i ≤ n} = the set of all ordered n-tuples whose i-th term is in X_i.

Example R × ··· × R = R^n = real n-space.

Question Is (R × R^2) = (R^2 × R) = R^3 ?
Relations

If A is a non-void set, a non-void subset R ⊂ A × A is called a relation on A. If
(a, b) ∈ R we say that a is related to b, and we write this fact by the expression a ∼ b.
Here are several properties which a relation may possess.

1) If a ∈ A, then a ∼ a. (reflexive)
2) If a ∼ b, then b ∼ a. (symmetric)
2′) If a ∼ b and b ∼ a, then a = b. (anti-symmetric)
3) If a ∼ b and b ∼ c, then a ∼ c. (transitive)
Definition A relation which satisfies 1), 2′), and 3) is called a partial ordering.
In this case we write a ∼ b as a ≤ b. Then

1) If a ∈ A, then a ≤ a.
2′) If a ≤ b and b ≤ a, then a = b.
3) If a ≤ b and b ≤ c, then a ≤ c.
Definition A linear ordering is a partial ordering with the additional property
that, if a, b ∈ A, then a ≤ b or b ≤ a.
Example A = R with the ordinary ordering is a linear ordering.

Example A = all subsets of R^2, with a ≤ b defined by a ⊂ b, is a partial ordering.
Hausdorff Maximality Principle (HMP) Suppose S is a non-void subset of A
and ∼ is a relation on A. This defines a relation on S. If the relation satisfies any
of the properties 1), 2), 2′), or 3) on A, the relation also satisfies these properties
when restricted to S. In particular, a partial ordering on A defines a partial ordering
on S. However the ordering may be linear on S but not linear on A. The HMP is
that any linearly ordered subset of a partially ordered set is contained in a maximal
linearly ordered subset.
Exercise Define a relation on A = R^2 by (a, b) ∼ (c, d) provided a ≤ c and
b ≤ d. Show this is a partial ordering which is linear on S = {(a, a) : a < 0}. Find at
least two maximal linearly ordered subsets of R^2 which contain S.
In this book, the only applications of the HMP are to obtain maximal monotonic
collections of subsets.
Definition A collection of sets is said to be monotonic if, given any two sets of
the collection, one is contained in the other.
Corollary to HMP Suppose X is a non-void set and A is some non-void
collection of subsets of X, and S is a subcollection of A which is monotonic. Then ∃
a maximal monotonic subcollection of A which contains S.
Proof Define a partial ordering on A by V ≤ W iff V ⊂ W, and apply HMP.
The HMP is used twice in this book. First, to show that infinitely generated
vector spaces have free bases, and second, in the Appendix, to show that rings have
maximal ideals (see pages 87 and 109). In each of these applications, the maximal
monotonic subcollection will have a maximal element. In elementary courses, these
results may be assumed, and thus the HMP may be ignored.
Equivalence Relations A relation satisfying properties 1), 2), and 3) is called
an equivalence relation.
Exercise Define a relation on A = Z by n ∼ m iff n − m is a multiple of 3.
Show this is an equivalence relation.
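Before proving such properties, they can be spot-checked by machine. A minimal Python sketch (the sample range is arbitrary) testing the three properties for this relation:

    def rel(n, m):          # n ~ m iff n - m is a multiple of 3
        return (n - m) % 3 == 0

    sample = range(-10, 11)
    assert all(rel(n, n) for n in sample)                                # reflexive
    assert all(rel(m, n) for n in sample for m in sample if rel(n, m))   # symmetric
    assert all(rel(n, p) for n in sample for m in sample for p in sample
               if rel(n, m) and rel(m, p))                               # transitive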
Definition If ∼ is an equivalence relation on A and a ∈ A, we define the equivalence class containing a by cl(a) = {x ∈ A : a ∼ x}.
Theorem
1) If b ∈ cl(a) then cl(b) = cl(a). Thus we may speak of a subset of A
being an equivalence class with no mention of any element contained
in it.
2) If each of U, V ⊂ A is an equivalence class and U ∩ V ≠ ∅, then
U = V.
3) Each element of A is an element of one and only one equivalence class.
Definition A partition of A is a collection of disjoint non-void subsets whose union
is A. In other words, a collection of non-void subsets of A is a partition of A provided
any a ∈ A is an element of one and only one subset of the collection. Note that if A
has an equivalence relation, the equivalence classes form a partition of A.
Theorem Suppose A is a non-void set with a partition. Define a relation on A by
a ∼ b iff a and b belong to the same subset of the partition. Then ∼ is an equivalence
relation, and the equivalence classes are just the subsets of the partition.
Summary There are two ways of viewing an equivalence relation — one is as a
relation on A satisfying 1), 2), and 3), and the other is as a partition of A into
disjoint subsets.
Exercise Define an equivalence relation on Z by n ∼ m iff n−m is a multiple of 3.
What are the equivalence classes?
Exercise Is there a relation on R satisfying 1), 2), 2′) and 3)? That is, is there
an equivalence relation on R which is also a partial ordering?
Exercise Let H ⊂ R^2 be the line H = {(a, 2a) : a ∈ R}. Consider the collection
of all translates of H, i.e., all lines in the plane with slope 2. Find the equivalence
relation on R^2 defined by this partition of R^2.
Functions
Just as there are two ways of viewing an equivalence relation, there are two ways
of defining a function. One is the “intuitive” definition, and the other is the “graph”
or “ordered pairs” definition. In either case, domain and range are inherent parts of
the definition. We use the “intuitive” definition because everyone thinks that way.
Definition If X and Y are (non-void) sets, a function or mapping or map with
domain X and range Y, is an ordered triple (X, Y, f) where f assigns to each x ∈ X
a well defined element f(x) ∈ Y. The statement that (X, Y, f) is a function is written
as f : X → Y or X —f→ Y.
Definition The graph of a function (X, Y, f) is the subset Γ ⊂ X × Y defined
by Γ = {(x, f(x)) : x ∈ X}. The connection between the “intuitive” and “graph”
viewpoints is given in the next theorem.

Theorem If f : X → Y, then the graph Γ ⊂ X × Y has the property that each
x ∈ X is the first term of one and only one ordered pair in Γ. Conversely, if Γ is a
subset of X × Y with the property that each x ∈ X is the first term of one and only
one ordered pair in Γ, then ∃! f : X → Y whose graph is Γ. The function is defined by
“f(x) is the second term of the ordered pair in Γ whose first term is x.”
Example Identity functions Here X = Y and f : X → X is defined by
f(x) = x for all x ∈ X. The identity on X is denoted by I_X or just I : X → X.
Example Constant functions Suppose y_0 ∈ Y. Define f : X → Y by f(x) = y_0
for all x ∈ X.
Restriction Given f : X → Y and a non-void subset S of X, define f | S : S → Y
by (f | S)(s) = f(s) for all s ∈ S.
Inclusion If S is a non-void subset of X, define the inclusion i : S → X by
i(s) = s for all s ∈ S. Note that inclusion is a restriction of the identity.
Composition Given W —f→ X —g→ Y define g ◦ f : W → Y by
(g ◦ f)(x) = g(f(x)).
Theorem (The associative law of composition) If V —f→ W —g→ X —h→ Y, then
h ◦ (g ◦ f) = (h ◦ g) ◦ f. This may be written as h ◦ g ◦ f.
Definitions Suppose f : X → Y.

1) If T ⊂ Y, the inverse image of T is a subset of X, f^{-1}(T) = {x ∈ X : f(x) ∈ T}.

2) If S ⊂ X, the image of S is a subset of Y, f(S) = {f(s) : s ∈ S} =
{y ∈ Y : ∃ s ∈ S with f(s) = y}.

3) The image of f is the image of X, i.e., image(f) = f(X) =
{f(x) : x ∈ X} = {y ∈ Y : ∃ x ∈ X with f(x) = y}.

4) f : X → Y is surjective or onto provided image(f) = Y, i.e., the image
is the range, i.e., if y ∈ Y, f^{-1}(y) is a non-void subset of X.

5) f : X → Y is injective or 1-1 provided (x_1 ≠ x_2) ⇒ f(x_1) ≠ f(x_2), i.e.,
if x_1 and x_2 are distinct elements of X, then f(x_1) and f(x_2) are
distinct elements of Y.

6) f : X → Y is bijective or is a 1-1 correspondence provided f is surjective
and injective. In this case, there is a function f^{-1} : Y → X with f^{-1} ◦ f =
I_X : X → X and f ◦ f^{-1} = I_Y : Y → Y. Note that f^{-1} : Y → X is
also bijective and (f^{-1})^{-1} = f.
Examples

1) f : R → R defined by f(x) = sin(x) is neither surjective nor injective.
2) f : R → [−1, 1] defined by f(x) = sin(x) is surjective but not injective.
3) f : [0, π/2] → R defined by f(x) = sin(x) is injective but not surjective.
4) f : [0, π/2] → [0, 1] defined by f(x) = sin(x) is bijective. (f^{-1}(x) is
written as arcsin(x) or sin^{-1}(x).)
5) f : R → (0, ∞) defined by f(x) = e^x is bijective. (f^{-1}(x) is written as
ln(x).)
Note There is no such thing as “the function sin(x).” A function is not defined
unless the domain and range are specified.
Exercise Show there are natural bijections from (R × R^2) to (R^2 × R) and
from (R^2 × R) to R × R × R. These three sets are disjoint, but the bijections
between them are so natural that we sometimes identify them.
Exercise Suppose X is a set with 6 elements and Y is a finite set with n elements.

1) There exists an injective f : X → Y iff n          .
2) There exists a surjective f : X → Y iff n          .
3) There exists a bijective f : X → Y iff n          .
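For finite sets the three blanks can be explored by brute force. A minimal Python sketch (small sizes chosen so the enumeration stays cheap) that enumerates all functions from X to Y and tests for injectivity and surjectivity:

    from itertools import product

    def maps(X, Y):
        """Yield every function X -> Y, each encoded as a dict."""
        X = list(X)
        for values in product(Y, repeat=len(X)):
            yield dict(zip(X, values))

    def exists_injective(X, Y):
        return any(len(set(f.values())) == len(f) for f in maps(X, Y))

    def exists_surjective(X, Y):
        return any(set(f.values()) == set(Y) for f in maps(X, Y))

    X, Y = range(3), range(4)          # |X| = 3, |Y| = 4
    print(exists_injective(X, Y))      # True,  since 3 <= 4
    print(exists_surjective(X, Y))     # False, since 3 < 4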
Pigeonhole Principle Suppose X is a set with n elements, Y is a set with m
elements, and f : X →Y is a function.
1) If n = m, then f is injective iff f is surjective iff f is bijective.
2) If n > m, then f is not injective.
3) If n < m, then f is not surjective.
If you are placing 6 pigeons in 6 holes, and you run out of pigeons before you fill
the holes, then you have placed 2 pigeons in one hole. In other words, in part 1) for
n = m = 6, if f is not surjective then f is not injective. Of course, the pigeonhole
principle does not hold for infinite sets, as can be seen by the following exercise.
Exercise Show there is a function f : Z^+ → Z^+ which is injective but not
surjective. Also show there is one which is surjective but not injective.
Exercise Suppose f : [−2, 2] → R is defined by f(x) = x^2. Find f^{-1}(f([1, 2])).
Also find f(f^{-1}([3, 5])).
Exercise Suppose f : X → Y is a function, S ⊂ X and T ⊂ Y. Find the
relationship between S and f^{-1}(f(S)). Show that if f is injective, S = f^{-1}(f(S)).
Also find the relationship between T and f(f^{-1}(T)). Show that if f is surjective,
T = f(f^{-1}(T)).
Strips If x_0 ∈ X, {(x_0, y) : y ∈ Y} = (x_0, Y) is called a vertical strip.
If y_0 ∈ Y, {(x, y_0) : x ∈ X} = (X, y_0) is called a horizontal strip.

Theorem Suppose S ⊂ X × Y. The subset S is the graph of a function with
domain X and range Y iff each vertical strip intersects S in exactly one point.
This is just a restatement of the property of a graph of a function. The purpose
of the next theorem is to restate properties of functions in terms of horizontal strips.
Theorem Suppose f : X → Y has graph Γ. Then

1) Each horizontal strip intersects Γ in at least one point iff f is          .
2) Each horizontal strip intersects Γ in at most one point iff f is          .
3) Each horizontal strip intersects Γ in exactly one point iff f is          .
Solutions of Equations Now we restate these properties in terms of solutions of
equations. Suppose f : X → Y and y_0 ∈ Y. Consider the equation f(x) = y_0. Here
y_0 is given and x is considered to be a “variable”. A solution to this equation is any
x_0 ∈ X with f(x_0) = y_0. Note that the set of all solutions to f(x) = y_0 is f^{-1}(y_0).
Also f(x) = y_0 has a solution iff y_0 ∈ image(f) iff f^{-1}(y_0) is non-void.
Theorem Suppose f : X → Y.

1) The equation f(x) = y_0 has at least one solution for each y_0 ∈ Y iff
f is          .

2) The equation f(x) = y_0 has at most one solution for each y_0 ∈ Y iff
f is          .

3) The equation f(x) = y_0 has a unique solution for each y_0 ∈ Y iff
f is          .
Right and Left Inverses One way to understand functions is to study right and
left inverses, which are defined after the next theorem.
Theorem Suppose X —f→ Y —g→ W are functions.

1) If g ◦ f is injective, then f is injective.

2) If g ◦ f is surjective, then g is surjective.

3) If g ◦ f is bijective, then f is injective and g is surjective.
Example X = W = {p}, Y = {p, q}, f(p) = p, and g(p) = g(q) = p. Here
g ◦ f is the identity, but f is not surjective and g is not injective.
Definition Suppose f : X → Y is a function. A left inverse of f is a function
g : Y → X such that g ◦ f = I_X : X → X. A right inverse of f is a function
h : Y → X such that f ◦ h = I_Y : Y → Y.
Theorem Suppose f : X → Y is a function.

1) f has a right inverse iff f is surjective. Any such right inverse must be
injective.

2) f has a left inverse iff f is injective. Any such left inverse must be
surjective.

Corollary Suppose each of X and Y is a non-void set. Then ∃ an injective
f : X → Y iff ∃ a surjective g : Y → X. Also a function from X to Y is bijective
iff it has a left inverse and a right inverse.
Note The Axiom of Choice is not discussed in this book. However, if you worked
1) of the theorem above, you unknowingly used one version of it. For completeness,
we state this part of 1) again.
The Axiom of Choice If f : X → Y is surjective, then f has a right inverse
h. That is, for each y ∈ Y, it is possible to choose an x ∈ f^{-1}(y) and thus to define
h(y) = x.
Note It is a classical theorem in set theory that the Axiom of Choice and the
Hausdorff Maximality Principle are equivalent. However in this text we do not go
that deeply into set theory. For our purposes it is assumed that the Axiom of Choice
and the HMP are true.
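For a finite surjection the choice can be made explicitly. A minimal Python sketch (the sets and the function are hypothetical) building a right inverse h with f ◦ h = I_Y:

    # Hypothetical finite surjection f : X -> Y.
    X = [0, 1, 2, 3, 4]
    Y = ["a", "b"]
    f = {0: "a", 1: "b", 2: "a", 3: "b", 4: "a"}

    # For each y, choose some x in the fiber f^{-1}(y) and set h(y) = x.
    h = {y: next(x for x in X if f[x] == y) for y in Y}

    assert all(f[h[y]] == y for y in Y)    # f o h = identity on Y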
Exercise Suppose f : X → Y is a function. Define a relation on X by a ∼ b if
f(a) = f(b). Show this is an equivalence relation. If y belongs to the image of f, then
f^{-1}(y) is an equivalence class and every equivalence class is of this form. In the next
chapter where f is a group homomorphism, these equivalence classes will be called
cosets.
Projections If X_1 and X_2 are non-void sets, we define the projection maps
π_1 : X_1 × X_2 → X_1 and π_2 : X_1 × X_2 → X_2 by π_i(x_1, x_2) = x_i.
Theorem If Y, X_1, and X_2 are non-void sets, there is a 1-1 correspondence
between {functions f : Y → X_1 × X_2} and {ordered pairs of functions (f_1, f_2) where
f_1 : Y → X_1 and f_2 : Y → X_2}.
Proof Given f, define f_1 = π_1 ◦ f and f_2 = π_2 ◦ f. Given f_1 and f_2 define
f : Y → X_1 × X_2 by f(y) = (f_1(y), f_2(y)). Thus a function from Y to X_1 × X_2 is
merely a pair of functions from Y to X_1 and Y to X_2. This concept is displayed in
the diagram below. It is summarized by the equation f = (f_1, f_2).

[Diagram: Y maps to X_1 × X_2 by f, with π_1 ◦ f = f_1 : Y → X_1 and π_2 ◦ f = f_2 : Y → X_2.]
One nice thing about this concept is that it works fine for infinite Cartesian
products.
Definition Suppose T is an index set and for each t ∈ T, X_t is a non-void set.
Then the product ∏_{t∈T} X_t = ∏ X_t is the collection of all “sequences” {x_t}_{t∈T} = {x_t}
where x_t ∈ X_t. (Thus if T = Z^+, {x_t} = {x_1, x_2, ...}.) For each s ∈ T, the projection
map π_s : ∏ X_t → X_s is defined by π_s({x_t}) = x_s.
Theorem If Y is any non-void set, there is a 1-1 correspondence between
{functions f : Y → ∏ X_t} and {sequences of functions {f_t}_{t∈T} where f_t : Y → X_t}.
Given f, the sequence {f_t} is defined by f_t = π_t ◦ f. Given {f_t}, f is defined by
f(y) = {f_t(y)}.
A Calculus Exercise Let A be the collection of all functions f : [0, 1] → R
which have an infinite number of derivatives. Let A_0 ⊂ A be the subcollection of
those functions f with f(0) = 0. Define D : A_0 → A by D(f) = df/dx. Use the mean
value theorem to show that D is injective. Use the fundamental theorem of calculus
to show that D is surjective.
Exercise This exercise is not used elsewhere in this text and may be omitted. It
is included here for students who wish to do a little more set theory. Suppose T is a
non-void set.
1) If Y is a non-void set, define Y^T to be the collection of all functions with domain
T and range Y. Show that if T and Y are finite sets with n and m elements, then
Y^T has m^n elements. In particular, when T = {1, 2, 3}, Y^T = Y × Y × Y has
m^3 elements. Show that if m ≥ 3, the subset of Y^{{1,2,3}} of all injective functions has
m(m − 1)(m − 2) elements. These injective functions are called permutations on Y
taken 3 at a time. If T = N, then Y^T is the infinite product Y × Y × ···. That is,
Y^N is the set of all infinite sequences (y_1, y_2, ...) where each y_i ∈ Y. For any Y and
T, let Y_t be a copy of Y for each t ∈ T. Then Y^T = ∏_{t∈T} Y_t.
2) Suppose each of Y_1 and Y_2 is a non-void set. Show there is a natural bijection
from (Y_1 × Y_2)^T to Y_1^T × Y_2^T. (This is the fundamental property of Cartesian products
presented in the two previous theorems.)
3) Define P(T), the power set of T, to be the collection of all subsets of T (including
the null set). Show that if T is a finite set with n elements, P(T) has 2^n elements.
4) If S is any subset of T, define its characteristic function χ_S : T → {0, 1} by
letting χ_S(t) be 1 when t ∈ S, and be 0 when t ∉ S. Define α : P(T) → {0, 1}^T by
α(S) = χ_S. Define β : {0, 1}^T → P(T) by β(f) = f^{-1}(1). Show that if S ⊂ T then
β ◦ α(S) = S, and if f : T → {0, 1} then α ◦ β(f) = f. Thus α is a bijection and
β = α^{-1}. (A computational sketch of this bijection is given after this exercise.)

P(T) ←→ {0, 1}^T
5) Suppose γ : T → {0, 1}^T is a function and show that it cannot be surjective. If
t ∈ T, denote γ(t) by γ(t) = f_t : T → {0, 1}. Define f : T → {0, 1} by f(t) = 0 if
f_t(t) = 1, and f(t) = 1 if f_t(t) = 0. Show that f is not in the image of γ and thus
γ cannot be surjective. This shows that if T is an infinite set, then the set {0, 1}^T
represents a “higher order of infinity”.
6) A set Y is said to be countable if it is finite or if there is a bijection from N to
Y. Consider the following three collections.

i) P(N), the collection of all subsets of N.

ii) {0, 1}^N, the collection of all functions f : N → {0, 1}.

iii) The collection of all sequences (y_1, y_2, ...) where each y_i is 0 or 1.

We know that ii) and iii) are equal and there is a natural bijection between i)
and ii). We also know there is no surjective map from N to {0, 1}^N, i.e., {0, 1}^N is
uncountable. Show there is a bijection from {0, 1}^N to the real numbers R. (This is
not so easy.)
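Here is a minimal Python sketch of the bijection in part 4) for a small finite T (the choice T = {0, 1, 2} is arbitrary), with each function T → {0, 1} encoded as a dict:

    from itertools import product

    T = [0, 1, 2]

    def alpha(S):       # a subset S of T  ->  its characteristic function
        return {t: 1 if t in S else 0 for t in T}

    def beta(f):        # a function T -> {0, 1}  ->  the subset f^{-1}(1)
        return {t for t in T if f[t] == 1}

    subsets = [{t for t, bit in zip(T, bits) if bit}
               for bits in product([0, 1], repeat=len(T))]
    funcs = [dict(zip(T, bits)) for bits in product([0, 1], repeat=len(T))]

    assert all(beta(alpha(S)) == S for S in subsets)   # beta o alpha = identity
    assert all(alpha(beta(f)) == f for f in funcs)     # alpha o beta = identity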
Notation for the Logic of Mathematics
Each of the words “Lemma”, “Theorem”, and “Corollary” means “true state-
ment”. Suppose A and B are statements. A theorem may be stated in any of the
following ways:
Theorem Hypothesis Statement A.
Conclusion Statement B.
Theorem Suppose A is true. Then B is true.
Theorem If A is true, then B is true.
Theorem A ⇒ B (A implies B).
There are two ways to prove the theorem — to suppose A is true and show B is
true, or to suppose B is false and show A is false. The expressions “A ⇔ B”, “A is
equivalent to B”, and “A is true iff B is true” have the same meaning (namely, that
A ⇒ B and B ⇒ A).
The important thing to remember is that thoughts and expressions flow through
the language. Mathematical symbols are shorthand for phrases and sentences in the
English language. For example, “x ∈ B” means “x is an element of the set B.” If A
is the statement “x ∈ Z^+” and B is the statement “x^2 ∈ Z^+”, then “A ⇒ B” means
“If x is a positive integer, then x^2 is a positive integer”.
Mathematical Induction is based upon the fact that if S ⊂ Z^+ is a non-void
subset, then S contains a smallest element.
Theorem Suppose P(n) is a statement for each n = 1, 2, ... . Suppose P(1) is true
and for each n ≥ 1, P(n) ⇒ P(n + 1). Then for each n ≥ 1, P(n) is true.

Proof If the theorem is false, then ∃ a smallest positive integer m such that
P(m) is false. Since P(m − 1) is true, this is impossible.
Exercise Use induction to show that, for each n ≥ 1, 1 + 2 + ··· + n = n(n + 1)/2.
The Integers
In this section, lower case letters a, b, c, ... will represent integers, i.e., elements
of Z. Here we will establish the following three basic properties of the integers.
1) If G is a subgroup of Z, then ∃ n ≥ 0 such that G = nZ.
2) If a and b are integers, not both zero, and G is the collection of all linear
combinations of a and b, then G is a subgroup of Z, and its
positive generator is the greatest common divisor of a and b.
3) If n ≥ 2, then n factors uniquely as the product of primes.
All of this will follow from long division, which we now state formally.
Euclidean Algorithm Given a, b with b ≠ 0, ∃! m and r with 0 ≤ r < |b| and
a = bm + r. In other words, b divides a “m times with a remainder of r”. For
example, if a = −17 and b = 5, then m = −4 and r = 3, −17 = 5(−4) + 3.
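Python's divmod(a, b) returns a remainder with the sign of b, so for b > 0 it already satisfies 0 ≤ r < |b|; a small wrapper (a sketch) handles negative b as well:

    def long_division(a, b):
        """Return (m, r) with a = b*m + r and 0 <= r < |b|."""
        assert b != 0
        m, r = divmod(a, b)        # Python's r has the sign of b
        if r < 0:                  # only possible when b < 0
            m, r = m + 1, r - b
        return m, r

    assert long_division(-17, 5) == (-4, 3)    # -17 = 5*(-4) + 3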
Definition If r = 0, we say that b divides a or a is a multiple of b. This fact is
written as b | a. Note that b | a ⇔ the rational number a/b is an integer ⇔ ∃! m
such that a = bm ⇔ a ∈ bZ.

Note Anything (except 0) divides 0. 0 does not divide anything.
±1 divides anything. If n ≠ 0, the set of integers which n divides
is nZ = {nm : m ∈ Z} = {..., −2n, −n, 0, n, 2n, ...}. Also n divides
a and b with the same remainder iff n divides (a − b).
Definition A non-void subset G ⊂ Z is a subgroup provided (g ∈ G ⇒ −g ∈ G)
and (g_1, g_2 ∈ G ⇒ (g_1 + g_2) ∈ G). We say that G is closed under negation and closed
under addition.
Theorem If n ∈ Z then nZ is a subgroup. Thus if n ≠ 0, the set of integers
which n divides is a subgroup of Z.
The next theorem states that every subgroup of Z is of this form.
Theorem Suppose G ⊂ Z is a subgroup. Then

1) 0 ∈ G.

2) If g_1 and g_2 ∈ G, then (m_1 g_1 + m_2 g_2) ∈ G for all integers m_1, m_2.

3) ∃! non-negative integer n such that G = nZ. In fact, if G ≠ {0}
and n is the smallest positive integer in G, then G = nZ.

Proof Since G is non-void, ∃ g ∈ G. Now (−g) ∈ G and thus 0 = g + (−g)
belongs to G, and so 1) is true. Part 2) is straightforward, so consider 3). If G ≠ {0},
it must contain a positive element. Let n be the smallest positive integer in G. If
g ∈ G, g = nm + r where 0 ≤ r < n. Since r ∈ G, it must be 0, and g ∈ nZ.
Now suppose a, b ∈ Z and at least one of a and b is non-zero.
Theorem Let G be the set of all linear combinations of a and b, i.e., G =
{ma + nb : m, n ∈ Z}. Then

1) G contains a and b.

2) G is a subgroup. In fact, it is the smallest subgroup containing a and b.
It is called the subgroup generated by a and b.

3) Denote by (a, b) the smallest positive integer in G. By the previous
theorem, G = (a, b)Z, and thus (a, b) | a and (a, b) | b. Also note that
∃ m, n such that ma + nb = (a, b). The integer (a, b) is called
the greatest common divisor of a and b.

4) If n is an integer which divides a and b, then n also divides (a, b).

Proof of 4) Suppose n | a and n | b, i.e., suppose a, b ∈ nZ. Since G is the
smallest subgroup containing a and b, nZ ⊃ (a, b)Z, and thus n | (a, b).
Corollary The following are equivalent.

1) a and b have no common divisors, i.e., (n | a and n | b) ⇒ n = ±1.

2) (a, b) = 1, i.e., the subgroup generated by a and b is all of Z.

3) ∃ m, n ∈ Z with ma + nb = 1.
Definition If any one of these three conditions is satisfied, we say that a and b
are relatively prime.
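The integers m, n with ma + nb = (a, b) can be produced by the extended Euclidean algorithm, which runs long division backwards. A minimal Python sketch:

    def extended_gcd(a, b):
        """Return (g, m, n) with g = (a, b) and m*a + n*b = g."""
        if b == 0:
            return abs(a), (1 if a >= 0 else -1), 0
        q, r = divmod(a, b)            # a = b*q + r
        g, m, n = extended_gcd(b, r)   # g = m*b + n*r
        return g, n, m - q * n         # so g = n*a + (m - q*n)*b

    g, m, n = extended_gcd(180, 28)
    assert g == 4 and m * 180 + n * 28 == 4    # (180, 28) = 4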
We are now ready for our first theorem with any guts.
Theorem If a and b are relatively prime and a | bc, then a | c.

Proof Suppose a and b are relatively prime, c ∈ Z and a | bc. Then there exist
m, n with ma + nb = 1, and thus mac + nbc = c. Now a | mac and a | nbc. Thus
a | (mac + nbc) and so a | c.
Definition A prime is an integer p > 1 which does not factor, i.e., if p = ab then
a = ±1 or a = ±p. The first few primes are 2, 3, 5, 7, 11, 13, 17,... .
Theorem Suppose p is a prime.

1) If a is an integer which is not a multiple of p, then (p, a) = 1. In other
words, if a is any integer, (p, a) = p or (p, a) = 1.

2) If p | ab then p | a or p | b.

3) If p | a_1 a_2 ··· a_n then p divides some a_i. Thus if each a_i is a prime,
then p is equal to some a_i.

Proof Part 1) follows immediately from the definition of prime. Now suppose
p | ab. If p does not divide a, then by 1), (p, a) = 1 and by the previous theorem, p
must divide b. Thus 2) is true. Part 3) follows from 2) and induction on n.
The Unique Factorization Theorem Suppose n is an integer which is not 0, 1,
or −1. Then n may be factored into the product of primes and, except for order, this
factorization is unique. That is, ∃ a unique collection of distinct primes p_1, ..., p_k and
positive integers s_1, s_2, ..., s_k such that n = ±p_1^{s_1} p_2^{s_2} ··· p_k^{s_k}.

Proof Factorization into primes is obvious, and uniqueness follows from 3) in the
theorem above. The power of this theorem is uniqueness, not existence.
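Existence is constructive for small n: trial division peels off the prime factors in increasing order. A minimal Python sketch:

    def factor(n):
        """Return the prime factorization of n >= 2 as {prime: exponent}."""
        assert n >= 2
        powers, p = {}, 2
        while p * p <= n:
            while n % p == 0:
                powers[p] = powers.get(p, 0) + 1
                n //= p
            p += 1
        if n > 1:                  # whatever remains is itself prime
            powers[n] = powers.get(n, 0) + 1
        return powers

    assert factor(360) == {2: 3, 3: 2, 5: 1}    # 360 = 2^3 * 3^2 * 5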
Now that we have unique factorization and part 3) above, the picture becomes
transparent. Here are some of the basic properties of the integers in this light.

Theorem (Summary)

1) Suppose |a| > 1 has prime factorization a = ±p_1^{s_1} ··· p_k^{s_k}. Then the only
divisors of a are of the form ±p_1^{t_1} ··· p_k^{t_k} where 0 ≤ t_i ≤ s_i for i = 1, ..., k.

2) If |a| > 1 and |b| > 1, then (a, b) = 1 iff there is no common prime in
their factorizations. Thus if there is no common prime in their
factorizations, ∃ m, n with ma + nb = 1.

3) Suppose |a| > 1 and |b| > 1. Let {p_1, ..., p_k} be the union of the distinct
primes of their factorizations. Thus a = ±p_1^{s_1} ··· p_k^{s_k} where 0 ≤ s_i and
b = ±p_1^{t_1} ··· p_k^{t_k} where 0 ≤ t_i. Let u_i be the minimum of s_i and t_i. Then
(a, b) = p_1^{u_1} ··· p_k^{u_k}. For example (2^3·5·11, 2^2·5^4·7) = 2^2·5.

3′) Let v_i be the maximum of s_i and t_i. Then c = p_1^{v_1} ··· p_k^{v_k} is the least
common multiple of a and b. Note that c is a multiple of a and b,
and if n is a multiple of a and b, then n is a multiple of c.
Finally, the least common multiple of a and b is c = ab/(a, b). In
particular, if a and b are relatively prime, then their least common
multiple is just their product. (A computational sketch follows this theorem.)

4) There is an infinite number of primes. (Proof: Suppose there were only
a finite number of primes p_1, p_2, ..., p_k. Then no prime would divide
(p_1 p_2 ··· p_k + 1).)

5) √2 is irrational. (Proof: Suppose √2 = m/n where (m, n) = 1. Then
2n^2 = m^2 and if n > 1, n and m have a common prime factor.
Since this is impossible, n = 1, and so √2 is an integer. This is a
contradiction and therefore √2 is irrational.)

6) Suppose c is an integer greater than 1. Then √c is rational iff √c is an
integer.
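A minimal Python sketch of 3) and 3′), building the gcd and lcm from given factorizations by taking minimum and maximum exponents (the factorizations are those of the example in 3)):

    from math import prod

    # a = 2^3 * 5 * 11 and b = 2^2 * 5^4 * 7, as {prime: exponent}.
    fa = {2: 3, 5: 1, 11: 1}
    fb = {2: 2, 5: 4, 7: 1}

    primes = set(fa) | set(fb)
    g = prod(p ** min(fa.get(p, 0), fb.get(p, 0)) for p in primes)   # gcd: min exponents
    c = prod(p ** max(fa.get(p, 0), fb.get(p, 0)) for p in primes)   # lcm: max exponents

    a, b = 2**3 * 5 * 11, 2**2 * 5**4 * 7
    assert g == 2**2 * 5          # (a, b) = 2^2 * 5, as computed in 3)
    assert c == a * b // g        # lcm = ab/(a, b), as stated in 3')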
Exercise Find (180, 28), i.e., find the greatest common divisor of 180 and 28,
i.e., find the positive generator of the subgroup generated by {180, 28}. Find integers
m and n such that 180m + 28n = (180, 28). Find the least common multiple of 180
and 28, and show that it is equal to (180·28)/(180, 28).
Exercise We have defined the greatest common divisor (gcd) and the least
common multiple (lcm) of a pair of integers. Now suppose n ≥ 2 and S = {a_1, a_2, ..., a_n}
is a finite collection of integers with |a_i| > 1 for 1 ≤ i ≤ n. Define the gcd and
the lcm of the elements of S and develop their properties. Express the gcd and the
lcm in terms of the prime factorizations of the a_i. Show that the set of all linear
combinations of the elements of S is a subgroup of Z, and its positive generator is
the gcd of the elements of S.
Exercise Show that the gcd of S = {90, 70, 42} is 2, and find integers n_1, n_2, n_3
such that 90n_1 + 70n_2 + 42n_3 = 2. Also find the lcm of the elements of S.
Exercise Show that if each of G_1, G_2, ..., G_m is a subgroup of Z, then
G = G_1 ∩ G_2 ∩ ··· ∩ G_m is also a subgroup of Z. Now let G = (90Z) ∩ (70Z) ∩ (42Z)
and find the positive integer n with G = nZ.
Exercise Show that if the nth root of an integer is a rational number, then it
itself is an integer. That is, suppose c and n are integers greater than 1. There is a
unique positive real number x with x^n = c. Show that if x is rational, then it is an
integer. Thus if p is a prime, its nth root is an irrational number.
Exercise Show that a positive integer is divisible by 3 iff the sum of its digits is
divisible by 3. More generally, let a = a_n a_{n−1} ... a_0 = a_n 10^n + a_{n−1} 10^{n−1} + ··· + a_0
where 0 ≤ a_i ≤ 9. Now let b = a_n + a_{n−1} + ··· + a_0, and show that 3 divides a and b
with the same remainder. Although this is a straightforward exercise in long division,
it will be more transparent later on. In the language of the next chapter, it says that
[a] = [b] in Z_3.
Card Trick Ask friends to pick out seven cards from a deck and then to select one
to look at without showing it to you. Take the six cards face down in your left hand
and the selected card in your right hand, and announce you will place the selected
card in with the other six, but they are not to know where. Put your hands behind
your back and place the selected card on top, and bring the seven cards in front in
your left hand. Ask your friends to give you a number between one and seven (not
allowing one). Suppose they say three. You move the top card to the bottom, then
the second card to the bottom, and then you turn over the third card, leaving it face
up on top. Then repeat the process, moving the top two cards to the bottom and
turning the third card face up on top. Continue until there is only one card face
down, and this will be the selected card. Magic? Stay tuned for Chapter 2, where it
is shown that any non-zero element of Z_7 has order 7.
Chapter 2
Groups
Groups are the central objects of algebra. In later chapters we will define rings and
modules and see that they are special cases of groups. Also ring homomorphisms and
module homomorphisms are special cases of group homomorphisms. Even though
the definition of group is simple, it leads to a rich and amazing theory. Everything
presented here is standard, except that the product of groups is given in the additive
notation. This is the notation used in later chapters for the products of rings and
modules. This chapter and the next two chapters are restricted to the most basic
topics. The approach is to do quickly the fundamentals of groups, rings, and matrices,
and to push forward to the chapter on linear algebra. This chapter is, by far and
above, the most difficult chapter in the book, because all the concepts are new.
Definition Suppose G is a non-void set and φ : G × G → G is a function. φ is
called a binary operation, and we will write φ(a, b) = a·b or φ(a, b) = a + b. Consider
the following properties.
1) If a, b, c ∈ G then a·(b·c) = (a·b)·c.      If a, b, c ∈ G then a + (b + c) = (a + b) + c.

2) ∃ e = e_G ∈ G such that if a ∈ G,           ∃ 0̄ = 0̄_G ∈ G such that if a ∈ G,
   e·a = a·e = a.                              0̄ + a = a + 0̄ = a.

3) If a ∈ G, ∃ b ∈ G with a·b = b·a = e        If a ∈ G, ∃ b ∈ G with a + b = b + a = 0̄
   (b is written as b = a^{-1}).               (b is written as b = −a).

4) If a, b ∈ G, then a·b = b·a.                If a, b ∈ G, then a + b = b + a.
Definition If properties 1), 2), and 3) hold, (G, φ) is said to be a group. If we
write φ(a, b) = a·b, we say it is a multiplicative group. If we write φ(a, b) = a + b,
we say it is an additive group. If in addition, property 4) holds, we say the group is
abelian or commutative.
Theorem Let (G, φ) be a multiplicative group.

(i) Suppose a, c, c̄ ∈ G. Then a·c = a·c̄ ⇒ c = c̄. Also c·a = c̄·a ⇒ c = c̄.
In other words, if f : G → G is defined by f(c) = a·c, then f is injective.
Also f is bijective with f^{-1} given by f^{-1}(c) = a^{-1}·c.

(ii) e is unique, i.e., if ē ∈ G satisfies 2), then e = ē. In fact,
if a, b ∈ G then (a·b = a) ⇒ (b = e) and (a·b = b) ⇒ (a = e).
Recall that b is an identity in G provided it is a right and left
identity for any a in G. However group structure is so rigid that if
∃ a ∈ G such that b is a right identity for a, then b = e.
Of course, this is just a special case of the cancellation law in (i).

(iii) Every right inverse is an inverse, i.e., if a·b = e then b = a^{-1}. Also
if b·a = e then b = a^{-1}. Thus inverses are unique.

(iv) If a ∈ G, then (a^{-1})^{-1} = a.

(v) If a, b ∈ G, (a·b)^{-1} = b^{-1}·a^{-1}. Also (a_1·a_2 ··· a_n)^{-1} =
a_n^{-1}·a_{n−1}^{-1} ··· a_1^{-1}.

(vi) The multiplication a_1·a_2·a_3 = a_1·(a_2·a_3) = (a_1·a_2)·a_3 is well defined.
In general, a_1·a_2 ··· a_n is well defined.

(vii) Suppose a ∈ G. Let a^0 = e and if n > 0, a^n = a·a ··· a (n times)
and a^{-n} = a^{-1}·a^{-1} ··· a^{-1} (n times). If n_1, n_2, ..., n_t ∈ Z then
a^{n_1}·a^{n_2} ··· a^{n_t} = a^{n_1 + ··· + n_t}. Also (a^n)^m = a^{nm}.
Finally, if G is abelian and a, b ∈ G, then (a·b)^n = a^n·b^n.
Exercise. Write out the above theorem where G is an additive group. Note that
part (vii) states that G has a scalar multiplication over Z. This means that if a is in
G and n is an integer, there is defined an element an in G. This is so basic, that we
state it explicitly.
Theorem. Suppose G is an additive group. If a ∈ G, let a0 = 0̄ and if n > 0,
let an = (a + ··· + a) where the sum is n times, and a(−n) = (−a) + (−a) + ··· + (−a),
which we write as (−a − a − ··· − a). Then the following properties hold in general,
except the first requires that G be abelian.

(a + b)n = an + bn
a(n + m) = an + am
a(nm) = (an)m
a1 = a

Note that the plus sign is used ambiguously — sometimes for addition in G
and sometimes for addition in Z. In the language used in Chapter 5, this theorem
states that any additive abelian group is a Z-module. (See page 71.)
Exercise Suppose G is a non-void set with a binary operation φ(a, b) = a·b which
satisfies 1), 2) and [3′) If a ∈ G, ∃ b ∈ G with a·b = e]. Show (G, φ) is a group,
i.e., show b·a = e. In other words, the group axioms are stronger than necessary. If
every element has a right inverse, then every element has a two-sided inverse.
Exercise Suppose G is the set of all functions from Z to Z with multiplication
defined by composition, i.e., f·g = f ◦ g. Note that G satisfies 1) and 2) but not 3),
and thus G is not a group. Show that f has a right inverse in G iff f is surjective,
and f has a left inverse in G iff f is injective. Also show that the set of all bijections
from Z to Z is a group under composition.
Examples G = R, G = Q, or G = Z with φ(a, b) = a + b is an additive
abelian group.

Examples G = R − 0 or G = Q − 0 with φ(a, b) = ab is a multiplicative abelian
group.

G = Z − 0 with φ(a, b) = ab is not a group.

G = R^+ = {r ∈ R : r > 0} with φ(a, b) = ab is a multiplicative
abelian group.
Subgroups

Theorem Suppose G is a multiplicative group and H ⊂ G is a non-void subset
satisfying

1) if a, b ∈ H then a·b ∈ H
and 2) if a ∈ H then a^{-1} ∈ H.
Then e ∈ H and H is a group under multiplication. H is called a subgroup of G.

Proof Since H is non-void, ∃ a ∈ H. By 2), a^{-1} ∈ H and so by 1), e ∈ H. The
associative law is immediate and so H is a group.
Example G is a subgroup of G and e is a subgroup of G. These are called the
improper subgroups of G.
Example If G = Z under addition, and n ∈ Z, then H = nZ is a subgroup of
Z. By a theorem in the section on the integers in Chapter 1, every subgroup of Z is
of this form. This is a key property of the integers.
Exercises Suppose G is a multiplicative group.
1) Let H be the center of G, i.e., H = {h ∈ G : g·h = h·g for all g ∈ G}. Show
H is a subgroup of G.
2) Suppose H_1 and H_2 are subgroups of G. Show H_1 ∩ H_2 is a subgroup of G.

3) Suppose H_1 and H_2 are subgroups of G, with neither H_1 nor H_2 contained in
the other. Show H_1 ∪ H_2 is not a subgroup of G.
4) Suppose T is an index set and for each t ∈ T, H_t is a subgroup of G.
Show ∩_{t∈T} H_t is a subgroup of G.

5) Furthermore, if {H_t} is a monotonic collection, then ∪_{t∈T} H_t is a subgroup of G.
6) Suppose G = {all functions f : [0, 1] → R}. Define an addition on G by
(f + g)(t) = f(t) + g(t) for all t ∈ [0, 1]. This makes G into an abelian group.
Let K be the subset of G composed of all differentiable functions. Let H
be the subset of G composed of all continuous functions. What theorems
in calculus show that H and K are subgroups of G? What theorem shows that
K is a subset (and thus subgroup) of H?
Order Suppose G is a multiplicative group. If G has an infinite number of
elements, we say that o(G), the order of G, is infinite. If G has n elements, then
o(G) = n. Suppose a ∈ G and H = {a^i : i ∈ Z}. H is an abelian subgroup of G
called the subgroup generated by a. We define the order of the element a to be the
order of H, i.e., the order of the subgroup generated by a. Let f : Z → H be the
surjective function defined by f(m) = a^m. Note that f(k + l) = f(k)·f(l) where
the addition is in Z and the multiplication is in the group H. We come now to the
first real theorem in group theory. It says that the element a has finite order iff f
is not injective, and in this case, the order of a is the smallest positive integer n
with a^n = e.
Theorem Suppose a is an element of a multiplicative group G, and
H = {a^i : i ∈ Z}. If ∃ distinct integers i and j with a^i = a^j, then a has some finite
order n. In this case H has n distinct elements, H = {a^0, a^1, ..., a^{n−1}}, and a^m = e
iff n | m. In particular, the order of a is the smallest positive integer n with a^n = e,
and f^{-1}(e) = nZ.

Proof Suppose j < i and a^i = a^j. Then a^{i−j} = e and thus ∃ a smallest positive
integer n with a^n = e. This implies that the elements of {a^0, a^1, ..., a^{n−1}} are distinct,
and we must show they are all of H. If m ∈ Z, the Euclidean algorithm states that
∃ integers q and r with 0 ≤ r < n and m = nq + r. Thus a^m = a^{nq}·a^r = a^r, and
so H = {a^0, a^1, ..., a^{n−1}}, and a^m = e iff n | m. Later in this chapter we will see that
f is a homomorphism from an additive group to a multiplicative group and that,
in additive notation, H is isomorphic to Z or Z_n.
Exercise Write out this theorem for G an additive group. To begin, suppose a is
an element of an additive group G, and H = {ai : i ∈ Z}.
Exercise Show that if G is a finite group of even order, then G has an odd number
of elements of order 2. Note that e is the only element of order 1.
Definition A group G is cyclic if ∃ an element of G which generates G.
Theorem If G is cyclic and H is a subgroup of G, then H is cyclic.

Proof Suppose G is a cyclic group of order n. Then ∃ a ∈ G with G =
{a^0, a^1, ..., a^{n−1}}. Suppose H is a subgroup of G with more than one element. Let
m be the smallest integer with 0 < m < n and a^m ∈ H. Then m | n and a^m generates
H. The case where G is an infinite cyclic group is left as an exercise. Note that Z
is an additive cyclic group and it was shown in the previous chapter that subgroups
of Z are cyclic.
Cosets Suppose H is a subgroup of a group G. It will be shown below that H
partitions G into right cosets. It also partitions G into left cosets, and in general
these partitions are distinct.
Theorem If H is a subgroup of a multiplicative group G, then a ∼ b defined by
a ∼ b iff a·b^{-1} ∈ H is an equivalence relation. If a ∈ G, cl(a) = {b ∈ G : a ∼ b} =
{h·a : h ∈ H} = Ha. Note that a·b^{-1} ∈ H iff b·a^{-1} ∈ H.

If H is a subgroup of an additive group G, then a ∼ b defined by a ∼ b iff
(a − b) ∈ H is an equivalence relation. If a ∈ G, cl(a) = {b ∈ G : a ∼ b} = {h + a :
h ∈ H} = H + a. Note that (a − b) ∈ H iff (b − a) ∈ H.
Definition These equivalence classes are called right cosets. If the relation is
defined by a ∼ b iff b^{-1}·a ∈ H, then the equivalence classes are cl(a) = aH and
they are called left cosets. H is a left and right coset. If G is abelian, there is no
distinction between right and left cosets. Note that b^{-1}·a ∈ H iff a^{-1}·b ∈ H.
In the theorem above, we used H to define an equivalence relation on G, and thus
a partition of G. We now do the same thing a different way. We define the right
cosets directly and show they form a partition of G. This is really much easier.
Theorem Suppose H is a subgroup of a multiplicative group G. If a ∈ G, define
the right coset containing a to be Ha = {h·a : h ∈ H}. Then the following hold.

1) Ha = H iff a ∈ H.

2) If b ∈ Ha, then Hb = Ha, i.e., if h ∈ H, then H(h·a) = (Hh)a = Ha.

3) If Hc ∩ Ha ≠ ∅, then Hc = Ha.

4) The right cosets form a partition of G, i.e., each a in G belongs to one and
only one right coset.

5) Elements a and b belong to the same right coset iff a·b^{-1} ∈ H iff b·a^{-1} ∈ H.
Proof There is no better way to develop facility with cosets than to prove this
theorem. Also write this theorem for G an additive group.
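The partition is easy to see in a small example. A minimal Python sketch listing the cosets of H = {0, 3, 6, 9} in the additive group Z_12 (the example is ours, not the text's):

    n = 12
    H = frozenset(range(0, n, 3))          # H = {0, 3, 6, 9}, a subgroup of Z_12

    # The coset H + a for each a in Z_12; distinct cosets partition the group.
    cosets = {frozenset((h + a) % n for h in H) for a in range(n)}
    print(sorted(sorted(c) for c in cosets))
    # [[0, 3, 6, 9], [1, 4, 7, 10], [2, 5, 8, 11]]

    assert len(cosets) * len(H) == n       # o(H) * (index of H) = o(G)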
Theorem Suppose H is a subgroup of a multiplicative group G.

1) Any two right cosets have the same number of elements. That is, if a, b ∈ G,
f : Ha → Hb defined by f(h·a) = h·b is a bijection. Also any two left cosets
have the same number of elements. Since H is a right and left coset, any
two cosets have the same number of elements.

2) G has the same number of right cosets as left cosets. The bijection is given by
F(Ha) = a^{-1}H. The number of right (or left) cosets is called the index of
H in G.

3) If G is finite, o(H)·(index of H) = o(G) and so o(H) | o(G). In other words,
o(G)/o(H) = the number of right cosets = the number of left cosets.

4) If G is finite, and a ∈ G, then o(a) | o(G). (Proof: The order of a is the order
of the subgroup generated by a, and by 3) this divides the order of G.)

5) If G has prime order, then G is cyclic, and any element (except e) is a generator.
(Proof: Suppose o(G) = p and a ∈ G, a ≠ e. Then o(a) | p and thus o(a) = p.)

6) If o(G) = n and a ∈ G, then a^n = e. (Proof: a^{o(a)} = e and n = o(a)·(o(G)/o(a)).)
Exercises

i) Suppose G is a cyclic group of order 4, G = {e, a, a^2, a^3} with a^4 = e. Find the
order of each element of G. Find all the subgroups of G.

ii) Suppose G is the additive group Z and H = 3Z. Find the cosets of H.

iii) Think of a circle as the interval [0, 1] with end points identified. Suppose G = R
under addition and H = Z. Show that the collection of all the cosets of H
can be thought of as a circle.

iv) Let G = R^2 under addition, and H be the subgroup defined by
H = {(a, 2a) : a ∈ R}. Find the cosets of H. (See the last exercise on p. 5.)
Normal Subgroups
We would like to make a group out of the collection of cosets of a subgroup H. In
general, there is no natural way to do that. However, it is easy to do in case H is a
normal subgroup, which is described below.
Theorem If H is a subgroup of G, then the following are equivalent.

1) If a ∈ G, then aHa^{-1} = H.

2) If a ∈ G, then aHa^{-1} ⊂ H.

3) If a ∈ G, then aH = Ha.

4) Every right coset is a left coset, i.e., if a ∈ G, ∃ b ∈ G with Ha = bH.

Proof 1) ⇒ 2) is obvious. Suppose 2) is true and show 3). We have (aHa^{-1})a ⊂
Ha so aH ⊂ Ha. Also a(a^{-1}Ha) ⊂ aH so Ha ⊂ aH. Thus aH = Ha.

3) ⇒ 4) is obvious. Suppose 4) is true and show 3). Ha = bH contains a, so
bH = aH because a coset is an equivalence class.

Finally, suppose 3) is true and show 1). Multiply aH = Ha on the right by a^{-1}.
Definition If H satisfies any of the four conditions above, then H is said to be a
normal subgroup of G.
Note For any group G, G and e are normal subgroups. If G is an abelian group,
then every subgroup of G is normal.
Exercise Show that if H is a subgroup of G with index 2, then H is normal.
Exercise Show the intersection of a collection of normal subgroups of G is a
normal subgroup of G. Show the union of a monotonic collection of normal subgroups
of G is a normal subgroup of G.
Exercise Let A ⊂ R^2 be the square with vertices (−1, 1), (1, 1), (1, −1), and
(−1, −1), and G be the collection of all “isometries” of A onto itself. These are
bijections of A onto itself which preserve distance and angles, i.e., which preserve dot
product. Show that with multiplication defined as composition, G is a multiplicative
group. Show that G has four rotations, two reflections about the axes, and two
reflections about the diagonals, for a total of eight elements. Show the collection of
rotations is a cyclic subgroup of order four which is a normal subgroup of G. Show
that the reflection about the x-axis together with the identity form a cyclic subgroup
of order two which is not a normal subgroup of G. Find the four right cosets of this
subgroup. Finally, find the four left cosets of this subgroup.
Quotient Groups Suppose N is a normal subgroup of G, and C and D are
cosets. We wish to define a coset E which is the product of C and D. If c ∈ C and
d ∈ D, define E to be the coset containing c·d, i.e., E = N(c·d). The coset E does
not depend upon the choice of c and d. This is made precise in the next theorem,
which is quite easy.
Theorem Suppose G is a multiplicative group, N is a normal subgroup, and
G/N is the collection of all cosets. Then (Na)·(Nb) = N(a·b) is a well defined
multiplication (binary operation) on G/N, and with this multiplication, G/N is a
group. Its identity is N and (Na)^{-1} = (Na^{-1}). Furthermore, if G is finite, o(G/N) =
o(G)/o(N).

Proof Multiplication of elements in G/N is multiplication of subsets in G.
(Na)·(Nb) = N(aN)b = N(Na)b = N(a·b). Once multiplication is well defined,
the group axioms are immediate.
Exercise Write out the above theorem for G an additive abelian group.
Example Suppose G = Z under +, n > 1, and N = nZ. Z_n, the group of
integers mod n, is defined by Z_n = Z/nZ. If a is an integer, the coset a + nZ is
denoted by [a]. Note that [a] + [b] = [a + b], −[a] = [−a], and [a] = [a + nl] for any
integer l. Any additive abelian group has a scalar multiplication over Z, and in this
case it is just [a]m = [am]. Note that [a] = [r] where r is the remainder of a divided
by n, and thus the distinct elements of Z_n are [0], [1], ..., [n − 1]. Also Z_n is cyclic
because each of [1] and [−1] = [n − 1] is a generator. We already know that if p is a
prime, any non-zero element of Z_p is a generator, because Z_p has p elements.
Theorem If n > 1 and a is any integer, then [a] is a generator of Z_n iff (a, n) = 1.

Proof The element [a] is a generator iff the subgroup generated by [a] contains
[1] iff ∃ an integer k such that [a]k = [1] iff ∃ integers k and l such that ak + nl = 1.
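The theorem gives a machine-checkable criterion. A minimal Python sketch comparing the actual generators of Z_n with the elements satisfying (a, n) = 1 (n = 12 is an arbitrary choice):

    from math import gcd

    def generates(a, n):
        """Does [a] generate the additive group Z_n?"""
        return {(a * k) % n for k in range(n)} == set(range(n))

    n = 12
    gens = {a for a in range(n) if generates(a, n)}
    assert gens == {a for a in range(n) if gcd(a, n) == 1}   # {1, 5, 7, 11}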
Exercise Show that a positive integer is divisible by 3 iff the sum of its digits is
divisible by 3. Note that [10] = [1] in Z_3. (See the fifth exercise on page 18.)
Homomorphisms
Homomorphisms are functions between groups that commute with the group op-
erations. It follows that they honor identities and inverses. In this section we list
the basic properties. Properties 11), 12), and 13) show the connections between coset
groups and homomorphisms, and should be considered as the cornerstones of abstract
algebra.
Definition If G and Ḡ are multiplicative groups, a function f : G → Ḡ is a
homomorphism if, for all a, b ∈ G, f(a·b) = f(a)·f(b). On the left side, the group
operation is in G, while on the right side it is in Ḡ. The kernel of f is defined by
ker(f) = f^{-1}(ē) = {a ∈ G : f(a) = ē}. In other words, the kernel is the set of
solutions to the equation f(x) = ē. (If Ḡ is an additive group, ker(f) = f^{-1}(0̄).)
Examples The constant map f : G → Ḡ defined by f(a) = ē is a homomorphism.
If H is a subgroup of G, the inclusion i : H → G is a homomorphism. The function
f : Z → Z defined by f(t) = 2t is a homomorphism of additive groups, while the
function defined by f(t) = t + 2 is not a homomorphism. The function h : Z → R − 0
defined by h(t) = 2^t is a homomorphism from an additive group to a multiplicative
group.
We now catalog the basic properties of homomorphisms. These will be helpful
later on when we study ring homomorphisms and module homomorphisms.
Theorem Suppose G and Ḡ are groups and f : G → Ḡ is a homomorphism.
1) f(e) = ē.
2) f(a^{-1}) = f(a)^{-1}.
3) f is injective ⇔ ker(f) = e.
4) If H is a subgroup of G, f(H) is a subgroup of Ḡ. In particular, image(f) is
a subgroup of Ḡ.
5) If H̄ is a subgroup of Ḡ, f^{-1}(H̄) is a subgroup of G. Furthermore, if H̄ is
normal in Ḡ, then f^{-1}(H̄) is normal in G.
6) The kernel of f is a normal subgroup of G.
7) If ḡ ∈ Ḡ, f^{-1}(ḡ) is void or is a coset of ker(f), i.e., if f(g) = ḡ then
f^{-1}(ḡ) = Ng where N = ker(f). In other words, if the equation f(x) = ḡ has a
solution, then the set of all solutions is a coset of N= ker(f). This is a key fact
which is used routinely in topics such as systems of equations and linear
differential equations.
8) The composition of homomorphisms is a homomorphism, i.e., if h : Ḡ → G̿ is a
homomorphism, then h ∘ f : G → G̿ is a homomorphism.
9) If f : G → Ḡ is a bijection, then the function f^{-1} : Ḡ → G is a homomorphism.
In this case, f is called an isomorphism, and we write G ≈ Ḡ. In the case
G = Ḡ, f is also called an automorphism.
10) Isomorphisms preserve all algebraic properties. For example, if f is an
isomorphism and H ⊂ G is a subset, then H is a subgroup of G
iff f(H) is a subgroup of Ḡ, H is normal in G iff f(H) is normal in Ḡ, G is
cyclic iff Ḡ is cyclic, etc. Of course, this is somewhat of a cop-out, because an
algebraic property is one that, by definition, is preserved under isomorphisms.
11) Suppose H is a normal subgroup of G. Then π : G → G/H defined by
π(a) = Ha is a surjective homomorphism with kernel H. Furthermore, if
f : G → Ḡ is a surjective homomorphism with kernel H, then G/H ≈ Ḡ
(see below).
12) Suppose H is a normal subgroup of G. If H ⊂ ker(f), then f̄ : G/H → Ḡ
defined by f̄(Ha) = f(a) is a well-defined homomorphism making the
triangular diagram commute, i.e., f = f̄ ∘ π.
Thus defining a homomorphism on a quotient group is the same as defining a
homomorphism on the numerator which sends the denominator to ē. The
image of f̄ is the image of f and the kernel of f̄ is ker(f)/H. Thus if H = ker(f),
f̄ is injective, and thus G/H ≈ image(f).
13) Given any group homomorphism f, domain(f)/ker(f) ≈ image(f). This is
the fundamental connection between quotient groups and homomorphisms.
14) Suppose K is a group. Then K is an infinite cyclic group iff K is isomorphic to
the integers under addition, i.e., K ≈ Z. K is a cyclic group of order n iff
K ≈ Z_n.
Proof of 14) Suppose Ḡ = K is generated by some element a. Then f : Z → K
defined by f(m) = a^m is a homomorphism from an additive group to a multiplicative
group. If o(a) is infinite, f is an isomorphism. If o(a) = n, ker(f) = nZ and
f̄ : Z_n → K is an isomorphism.
Exercise If a is an element of a group G, there is always a homomorphism from Z
to G which sends 1 to a. When is there a homomorphism from Z_n to G which sends [1]
to a? What are the homomorphisms from Z_2 to Z_6? What are the homomorphisms
from Z_4 to Z_8?
Exercise Suppose G is a group and g is an element of G, g ≠ e.
1) Under what conditions on g is there a homomorphism f : Z_7 → G with
f([1]) = g?
2) Under what conditions on g is there a homomorphism f : Z_15 → G with
f([1]) = g?
3) Under what conditions on G is there an injective homomorphism f : Z_15 → G?
4) Under what conditions on G is there a surjective homomorphism f : Z_15 → G?
Exercise We know every finite group of prime order is cyclic and thus abelian.
Show that every group of order four is abelian.
Exercise Let G = {h : [0, 1] → R : h has an infinite number of derivatives}.
Then G is a group under addition. Define f : G → G by f(h) = dh/dt = h′. Show f
is a homomorphism and find its kernel and image. Let g : [0, 1] → R be defined by
g(t) = t³ − 3t + 4. Find f^{-1}(g) and show it is a coset of ker(f).
Exercise Let G be as above and g ∈ G. Define f : G → G by f(h) = h″ + 5h′ +
6t²h. Then f is a group homomorphism and the differential equation h″ + 5h′ + 6t²h =
g has a solution iff g lies in the image of f. Now suppose this equation has a solution
and S ⊂ G is the set of all solutions. For which subgroup H of G is S an H-coset?
Exercise Suppose G is a multiplicative group and a ∈ G. Define f : G → G to
be conjugation by a, i.e., f(g) = a^{-1}·g·a. Show that f is a homomorphism. Also
show f is an automorphism and find its inverse.
Permutations
Suppose X is a (non-void) set. A bijection f : X → X is called a permutation
on X, and the collection of all these permutations is denoted by S = S(X). In this
setting, variables are written on the left, i.e., the image of x under f is written (x)f.
Therefore the composition f ∘ g means “f followed by g”. S(X) forms a multiplicative
group under composition.
Exercise Show that if there is a bijection between X and Y , there is an iso-
morphism between S(X) and S(Y ). Thus if each of X and Y has n elements,
S(X) ≈ S(Y), and these groups are called the symmetric groups on n elements.
They are all denoted by the one symbol S_n.
Exercise Show that o(S_n) = n!. Let X = {1, 2, ..., n}, S_n = S(X), and H =
{f ∈ S_n : (n)f = n}. Show H is a subgroup of S_n which is isomorphic to S_{n−1}. Let
g be any permutation on X with (n)g = 1. Find g^{-1}Hg.
The next theorem shows that the symmetric groups are incredibly rich and com-
plex.
Theorem (Cayley’s Theorem) Suppose G is a multiplicative group with n
elements and S_n is the group of all permutations on the set G. Then G is isomorphic
to a subgroup of S_n.
Proof Let h : G → S_n be the function which sends a to the bijection h_a : G → G
defined by (g)h_a = g·a. The proof follows from the following observations.
1) For each given a, h_a is a bijection from G to G.
2) h is a homomorphism, i.e., h_{a·b} = h_a ∘ h_b.
3) h is injective and thus G is isomorphic to image(h) ⊂ S_n.
The Symmetric Groups Now let n ≥ 2 and let S_n be the group of all permu-
tations on {1, 2, ..., n}. The following definition shows that each element of S_n may
be represented by a matrix.
Definition Suppose 1 < k ≤ n, {a_1, a_2, ..., a_k} is a collection of distinct integers
with 1 ≤ a_i ≤ n, and {b_1, b_2, ..., b_k} is the same collection in some different order. Then
the matrix

    ( a_1  a_2  ...  a_k )
    ( b_1  b_2  ...  b_k )

represents f ∈ S_n defined by (a_i)f = b_i for 1 ≤ i ≤ k,
and (a)f = a for all other a. The composition of two permutations is computed by
applying the matrix on the left first and the matrix on the right second.
There is a special type of permutation called a cycle. For these we have a special
notation.
Definition The matrix

    ( a_1  a_2  ...  a_{k−1}  a_k )
    ( a_2  a_3  ...  a_k      a_1 )

is called a k-cycle, and is denoted by (a_1, a_2, ..., a_k).
A 2-cycle is called a transposition. The cycles (a_1, ..., a_k) and (c_1, ..., c_ℓ) are disjoint
provided a_i ≠ c_j for all 1 ≤ i ≤ k and 1 ≤ j ≤ ℓ.
Listed here are seven basic properties of permutations. They are all easy except
4), which is rather delicate. Properties 8), 9), and 10) are listed solely for reference.
Theorem
1) Disjoint cycles commute. (This is obvious.)
2) Every permutation can be written uniquely (except for order) as the product of
disjoint cycles. (This is easy.)
3) Every permutation can be written (non-uniquely) as the product of transpositions.
(Proof: (a_1, ..., a_n) = (a_1, a_2)(a_1, a_3) ⋯ (a_1, a_n).)
4) The parity of the number of these transpositions is unique. This means that if
f is the product of p transpositions and also of q transpositions, then p is
even iff q is even. In this case, f is said to be an even permutation. In the other
case, f is an odd permutation.
5) A k-cycle is even (odd) iff k is odd (even). For example (1, 2, 3) = (1, 2)(1, 3) is
an even permutation.
6) Suppose f, g ∈ S_n. If one of f and g is even and the other is odd, then g ∘ f is
odd. If f and g are both even or both odd, then g ∘ f is even. (Obvious.)
7) The map h : S_n → Z_2 defined by h(even) = [0] and h(odd) = [1] is a
homomorphism from a multiplicative group to an additive group. Its kernel (the
subgroup of even permutations) is denoted by A_n and is called the alternating
group. Thus A_n is a normal subgroup of index 2, and S_n/A_n ≈ Z_2.
The following parts are not included in this course. They are presented here merely
for reference.
8) For any n ≠ 4, A_n is simple, i.e., has no proper normal subgroups.
9) For any n ≥ 3, A_n is generated by its 3-cycles.
10) S_n can be generated by two elements. In fact, {(1, 2), (1, 2, ..., n)} generates S_n.
(Of course there are subgroups of S_n which cannot be generated by two
elements.)
Proof of 4) The proof presented here uses polynomials in n variables with real
coefficients. Since polynomials will not be introduced until Chapter 3, the student
may skip the proof until after that chapter. Suppose S = {1, ..., n}. If σ is a
permutation on S and p = p(x_1, ..., x_n) is a polynomial in n variables, define σ(p)
to be the polynomial p(x_{(1)σ}, ..., x_{(n)σ}). Thus if p = x_1x_2² + x_1x_3, and σ is the trans-
position (1, 2), then σ(p) = x_2x_1² + x_2x_3. Note that if σ_1 and σ_2 are permutations,
σ_2(σ_1(p)) = (σ_1σ_2)(p). Now let p be the product of all (x_i − x_j) where 1 ≤ i < j ≤ n.
(For example, if n = 3, p = (x_1 − x_2)(x_1 − x_3)(x_2 − x_3).) If σ is a permutation on S,
then for each 1 ≤ i, j ≤ n with i ≠ j, σ(p) has (x_i − x_j) or (x_j − x_i) as a factor. Thus
σ(p) = ±p. A careful examination shows that if σ_i is a transposition, σ_i(p) = −p.
Any permutation σ is the product of transpositions, σ = σ_1σ_2 ⋯ σ_t. Thus if σ(p) = p,
t must be even, and if σ(p) = −p, t must be odd.
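The parity of a permutation may also be computed without forming the polynomial p:
write the permutation as disjoint cycles and use the fact, from property 3), that a
k-cycle is a product of k − 1 transpositions. The Python sketch below (an illustration;
the list encoding is ours) does exactly this count.

    def parity(perm):
        """Parity of a permutation of {0, ..., n-1}, given as a list: perm[i] = (i)sigma.
        Each cycle of length k contributes k - 1 transpositions."""
        n, seen, transpositions = len(perm), set(), 0
        for start in range(n):
            if start in seen:
                continue
            length, i = 0, start
            while i not in seen:            # walk the cycle containing start
                seen.add(i)
                i = perm[i]
                length += 1
            transpositions += length - 1
        return "even" if transpositions % 2 == 0 else "odd"

    # The 3-cycle (1, 2, 3), written on {0, 1, 2}, is even:
    assert parity([1, 2, 0]) == "even"
    # A transposition is odd:
    assert parity([1, 0]) == "odd"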
Exercise
1) Write
       ( 1 2 3 4 5 6 7 )
       ( 6 5 4 3 1 7 2 )
   as the product of disjoint cycles.
   Write (1,5,6,7)(2,3,4)(3,7,1) as the product of disjoint cycles.
   Write (3,7,1)(1,5,6,7)(2,3,4) as the product of disjoint cycles.
   Which of these permutations are odd and which are even?
2) Suppose (a_1, . . . , a_k) and (c_1, . . . , c_ℓ) are disjoint cycles. What is the order of
their product?
3) Suppose σ ∈ S_n. Show that σ^{-1}(1, 2, 3)σ = ((1)σ, (2)σ, (3)σ). This shows
that conjugation by σ is just a type of relabeling. Also let τ = (4, 5, 6) and
find τ^{-1}(1, 2, 3, 4, 5)τ.
4) Show that H = {σ ∈ S_6 : (6)σ = 6} is a subgroup of S_6 and find its right
cosets and its left cosets.
5) Let A ⊂ R² be the square with vertices (−1, 1), (1, 1), (1, −1), and (−1, −1),
and G be the collection of all isometries of A onto itself. We know from a
previous exercise that G is a group with eight elements. It follows from Cayley’s
theorem that G is isomorphic to a subgroup of S_8. Show that G is isomorphic
to a subgroup of S_4.
6) If G is a multiplicative group, define a new multiplication on the set G by
a ∘ b = b·a. In other words, the new multiplication is the old multiplication
in the opposite order. This defines a new group denoted by G^op, the opposite
group. Show that it has the same identity and the same inverses as G, and
that f : G → G^op defined by f(a) = a^{-1} is a group isomorphism. Now consider
the special case G = S_n. The convention used in this section is that an element
of S_n is a permutation on {1, 2, . . . , n} with the variable written on the left.
Show that an element of S_n^op is a permutation on {1, 2, . . . , n} with the variable
written on the right. (Of course, either S_n or S_n^op may be called the symmetric
group, depending on personal preference or context.)
Product of Groups
The product of groups is usually presented for multiplicative groups. It is pre-
sented here for additive groups because this is the form that occurs in later chapters.
As an exercise, this section should be rewritten using multiplicative notation. The
two theorems below are transparent and easy, but quite useful. For simplicity we
first consider the product of two groups, although the case of infinite products is only
slightly more difficult. For background, read the two theorems on page 11.
Theorem Suppose G_1 and G_2 are additive groups. Define an addition on G_1 × G_2
by (a_1, a_2) + (b_1, b_2) = (a_1 + b_1, a_2 + b_2). This operation makes G_1 × G_2 into a group.
Its “zero” is (0̄_1, 0̄_2) and −(a_1, a_2) = (−a_1, −a_2). The projections π_1 : G_1 × G_2 → G_1
and π_2 : G_1 × G_2 → G_2 are group homomorphisms. Suppose G is an additive group.
We know there is a bijection from {functions f : G → G_1 × G_2} to {ordered pairs of
functions (f_1, f_2) where f_1 : G → G_1 and f_2 : G → G_2}. Under this bijection, f is a
group homomorphism iff each of f_1 and f_2 is a group homomorphism.
Proof It is transparent that the product of groups is a group, so let’s prove
the last part. Suppose G, G_1, and G_2 are groups and f = (f_1, f_2) is a function
from G to G_1 × G_2. Now f(a + b) = (f_1(a + b), f_2(a + b)) and f(a) + f(b) =
(f_1(a), f_2(a)) + (f_1(b), f_2(b)) = (f_1(a) + f_1(b), f_2(a) + f_2(b)). An examination of these
two equations shows that f is a group homomorphism iff each of f_1 and f_2 is a group
homomorphism.
Exercise Suppose G_1 and G_2 are groups. Show G_1 × G_2 and G_2 × G_1 are isomor-
phic.
Exercise If o(a_1) = n and o(a_2) = m, find the order of (a_1, a_2) in G_1 × G_2.
Exercise Show that if G is any group of order 4, G is isomorphic to Z_4 or Z_2 × Z_2.
Show Z_4 is not isomorphic to Z_2 × Z_2. Show that Z_{mn} is isomorphic to Z_n × Z_m iff
(n, m) = 1.
Exercise Suppose G_1 and G_2 are groups and i_1 : G_1 → G_1 × G_2 is defined by
i_1(g_1) = (g_1, 0̄_2). Show i_1 is an injective group homomorphism and its image is a
normal subgroup of G_1 × G_2. Usually G_1 is identified with its image under i_1, so G_1
may be considered to be a normal subgroup of G_1 × G_2. Let π_2 : G_1 × G_2 → G_2
be the projection map defined in the Background chapter. Show π_2 is a surjective
homomorphism with kernel G_1. Therefore (G_1 × G_2)/G_1 ≈ G_2.
Exercise Let R be the reals under addition. Show that the addition in the
product R × R is just the usual addition in analytic geometry.
Exercise Suppose n > 2. Is S_n isomorphic to A_n × G where G is a multiplicative
group of order 2?
One nice thing about the product of groups is that it works fine for any finite
number, or even any infinite number. The next theorem is stated in full generality.
Theorem Suppose T is an index set, and for any t ∈ T, G_t is an additive
group. Define an addition on ∏_{t∈T} G_t = ∏ G_t by {a_t} + {b_t} = {a_t + b_t}. This op-
eration makes the product into a group. Its “zero” is {0̄_t} and −{a_t} = {−a_t}.
Each projection π_s : ∏ G_t → G_s is a group homomorphism. Suppose G is an ad-
ditive group. Under the natural bijection from {functions f : G → ∏ G_t} to
{sequences of functions {f_t}_{t∈T} where f_t : G → G_t}, f is a group homomorphism
iff each f_t is a group homomorphism. Finally, the scalar multiplication on ∏ G_t
by integers is given coordinatewise, i.e., {a_t}n = {a_t n}.

Proof The addition on ∏ G_t is coordinatewise.
Exercise Suppose s is an element of T and π_s : ∏ G_t → G_s is the projection map
defined in the Background chapter. Show π_s is a surjective homomorphism and find
its kernel.
Exercise Suppose s is an element of T and i_s : G_s → ∏ G_t is defined by i_s(a) =
{a_t} where a_t = 0̄ if t ≠ s and a_s = a. Show i_s is an injective homomorphism
and its image is a normal subgroup of ∏ G_t. Thus each G_s may be considered to be
a normal subgroup of ∏ G_t.
Exercise Let f : Z → Z_30 × Z_100 be the homomorphism defined by f(m) =
([4m], [3m]). Find the kernel of f. Find the order of ([4], [3]) in Z_30 × Z_100.
Exercise Let f : Z → Z_90 × Z_70 × Z_42 be the group homomorphism defined by
f(m) = ([m], [m], [m]). Find the kernel of f and show that f is not surjective. Let
g : Z → Z_45 × Z_35 × Z_21 be defined by g(m) = ([m], [m], [m]). Find the kernel of
g and determine if g is surjective. Note that the gcd of {45, 35, 21} is 1. Now let
h : Z → Z_8 × Z_9 × Z_35 be defined by h(m) = ([m], [m], [m]). Find the kernel of h
and show that h is surjective. Finally suppose each of b, c, and d is greater than 1
and f : Z → Z_b × Z_c × Z_d is defined by f(m) = ([m], [m], [m]). Find necessary and
sufficient conditions for f to be surjective.
Exercise Suppose T is a non-void set, G is an additive group, and G^T is the
collection of all functions f : T → G with addition defined by (f + g)(t) = f(t) + g(t).
Show G^T is a group. For each t ∈ T, let G_t = G. Note that G^T is just another way
of writing ∏_{t∈T} G_t. Also note that if T = [0, 1] and G = R, the addition defined on
G^T is just the usual addition of functions used in calculus. (See exercises on pages 44
and 69.)
Chapter 3
Rings
Rings are additive abelian groups with a second operation called multiplication. The
connection between the two operations is provided by the distributive law. Assuming
the results of Chapter 2, this chapter flows smoothly. This is because ideals are also
normal subgroups and ring homomorphisms are also group homomorphisms. We do
not show that the polynomial ring F[x] is a unique factorization domain, although
with the material at hand, it would be easy to do. Also there is no mention of prime
or maximal ideals, because these concepts are unnecessary for our development of
linear algebra. These concepts are developed in the Appendix. A section on Boolean
rings is included because of their importance in logic and computer science.
Suppose R is an additive abelian group, R ≠ 0̄, and R has a second binary
operation (i.e., map from R × R to R) which is denoted by multiplication. Consider
the following properties.
1) If a, b, c ∈ R, (a·b)·c = a·(b·c). (The associative property of multiplication.)
2) If a, b, c ∈ R, a·(b + c) = (a·b) + (a·c) and (b + c)·a = (b·a) + (c·a).
(The distributive law, which connects addition and multiplication.)
3) R has a multiplicative identity, i.e., an element 1̄ = 1̄_R ∈ R such that
if a ∈ R, a·1̄ = 1̄·a = a.
4) If a, b ∈ R, a·b = b·a. (The commutative property for multiplication.)

Definition If 1), 2), and 3) are satisfied, R is said to be a ring. If in addition 4)
is satisfied, R is said to be a commutative ring.
Examples The basic commutative rings in mathematics are the integers Z, the
rational numbers Q, the real numbers R, and the complex numbers C. It will be shown
later that Z_n, the integers mod n, has a natural multiplication under which it is a
commutative ring. Also if R is any commutative ring, we will define R[x_1, x_2, . . . , x_n],
a polynomial ring in n variables. Now suppose R is any ring, n ≥ 1, and R_n is the
collection of all n × n matrices over R. In the next chapter, operations of addition and
multiplication of matrices will be defined. Under these operations, R_n is a ring. This
is a basic example of a non-commutative ring. If n > 1, R_n is never commutative,
even if R is commutative.
The next two theorems show that ring multiplication behaves as you would wish
it to. They should be worked as exercises.
Theorem Suppose R is a ring and a, b ∈ R.
1) a·0̄ = 0̄·a = 0̄. Therefore 1̄ ≠ 0̄.
2) (−a)·b = a·(−b) = −(a·b).
Recall that, since R is an additive abelian group, it has a scalar multiplication
over Z. This scalar multiplication can be written on the right or left, i.e., na = an,
and the next theorem shows it relates nicely to the ring multiplication.
Theorem Suppose a, b ∈ R and n, m ∈ Z.
1) (na)·(mb) = (nm)(a·b). (This follows from the distributive law and the
previous theorem.)
2) Let n̄ = n1̄. For example, 2̄ = 1̄ + 1̄. Then na = n̄·a, that is, scalar
multiplication by n is the same as ring multiplication by n̄.
Of course, n̄ may be 0̄ even though n ≠ 0.
Units
Definition An element a of a ring R is a unit provided ∃ an element a^{-1} ∈ R
with a·a^{-1} = a^{-1}·a = 1̄.
Theorem 0̄ can never be a unit. 1̄ is always a unit. If a is a unit, a^{-1} is also a
unit with (a^{-1})^{-1} = a. The product of units is a unit with (a·b)^{-1} = b^{-1}·a^{-1}. More
generally, if a_1, a_2, ..., a_n are units, then their product is a unit with (a_1·a_2 ⋯ a_n)^{-1} =
a_n^{-1}·a_{n−1}^{-1} ⋯ a_1^{-1}. The set of all units of R forms a multiplicative group denoted by
R*. Finally if a is a unit, (−a) is a unit and (−a)^{-1} = −(a^{-1}).
In order for a to be a unit, it must have a two-sided inverse. It suffices to require
a left inverse and a right inverse, as shown in the next theorem.
Theorem Suppose a ∈ R and ∃ elements b and c with b·a = a·c = 1̄. Then
b = c and so a is a unit with a^{-1} = b = c.

Proof b = b·1̄ = b·(a·c) = (b·a)·c = 1̄·c = c.
Corollary Inverses are unique.
Domains and Fields In order to define these two types of rings, we first consider
the concept of zero divisor.
Definition Suppose R is a commutative ring. A non-zero element a ∈ R is called
a zero divisor provided ∃ a non-zero element b with a·b = 0̄. Note that if a is a unit,
it cannot be a zero divisor.

Theorem Suppose R is a commutative ring and a ∈ (R − 0̄) is not a zero divisor.
Then (a·b = a·c) ⇒ b = c. In other words, multiplication by a is an injective map
from R to R. It is surjective iff a is a unit.

Definition A domain (or integral domain) is a commutative ring such that, if
a ≠ 0̄, a is not a zero divisor. A field is a commutative ring such that, if a ≠ 0̄, a is a
unit. In other words, R is a field if it is commutative and its non-zero elements form
a group under multiplication.

Theorem A field is a domain. A finite domain is a field.

Proof A field is a domain because a unit cannot be a zero divisor. Suppose R is
a finite domain and a ≠ 0̄. Then f : R → R defined by f(b) = a·b is injective and,
by the pigeonhole principle, f is surjective. Thus a is a unit and so R is a field.
Exercise Let C be the additive abelian group R². Define multiplication by
(a, b)·(c, d) = (ac − bd, ad + bc). Show C is a commutative ring which is a field.
Note that 1̄ = (1, 0) and if i = (0, 1), then i² = −1̄.
Examples Z is a domain. Q, R, and C are fields.
The Integers Mod n
The concept of integers mod n is fundamental in mathematics. It leads to a neat
little theory, as seen by the theorems below. However, the basic theory cannot be
completed until the product of rings is defined. (See the Chinese Remainder Theorem
on page 50.) We know from Chapter 2 that Z_n is an additive abelian group.

Theorem Suppose n > 1. Define a multiplication on Z_n by [a]·[b] = [ab]. This
is a well defined binary operation which makes Z_n into a commutative ring.

Proof Since [a + kn]·[b + ln] = [ab + n(al + bk + kln)] = [ab], the multiplication
is well defined. The ring axioms are easily verified.
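The well-definedness just proved is what makes the following Python sketch legitimate:
any representative of a coset may be used. This is a minimal illustration (the class
and method names are ours), not a library.

    class Zn:
        """Elements [a] of the ring Z_n, stored by the remainder of a divided by n."""
        def __init__(self, a, n):
            self.a, self.n = a % n, n
        def __add__(self, other):
            return Zn(self.a + other.a, self.n)
        def __mul__(self, other):
            return Zn(self.a * other.a, self.n)
        def __eq__(self, other):
            return self.n == other.n and self.a == other.a
        def __repr__(self):
            return f"[{self.a}] mod {self.n}"

    # [a][b] = [ab] does not depend on the representatives chosen:
    assert Zn(7, 12) * Zn(5, 12) == Zn(7 + 12, 12) * Zn(5 - 24, 12)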
Theorem Suppose n > 1 and a ∈ Z. Then the following are equivalent.
1) [a] is a generator of the additive group Z_n.
2) (a, n) = 1.
3) [a] is a unit of the ring Z_n.

Proof We already know 1) and 2) are equivalent. Recall that if b is an integer,
[a]b = [a]·[b] = [ab]. Thus 1) and 3) are equivalent, because each says ∃ an integer
b with [a]b = [1].
Corollary If n > 1, the following are equivalent.
1) Z_n is a domain.
2) Z_n is a field.
3) n is a prime.

Proof We already know 1) and 2) are equivalent, because Z_n is finite. Suppose
3) is true. Then by the previous theorem, each of [1], [2], ..., [n − 1] is a unit, and
thus 2) is true. Now suppose 3) is false. Then n = ab where 1 < a < n, 1 < b < n,
[a][b] = [0], and thus [a] is a zero divisor and 1) is false.
Exercise List the units and their inverses for Z_7 and Z_12. Show that (Z_7)* is
a cyclic group but (Z_12)* is not. Show that in Z_12 the equation x² = 1̄ has four
solutions. Finally show that if R is a domain, x² = 1̄ can have at most two solutions
in R.
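The units and inverses asked for here can be tabulated by brute force. The sketch
below (an aid for checking answers, not the requested proofs) works for any n.

    def units_with_inverses(n):
        """Return {[a]: [a]^-1} for the ring Z_n, by searching for inverses."""
        table = {}
        for a in range(1, n):
            for b in range(1, n):
                if (a * b) % n == 1:
                    table[a] = b
                    break
        return table

    print(units_with_inverses(7))    # all of 1..6 are units, since 7 is prime
    print(units_with_inverses(12))   # only 1, 5, 7, 11, each its own inverse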
Subrings Suppose S is a subset of a ring R. The statement that S is a subring
of R means that S is a subgroup of the group R, 1̄ ∈ S, and (a, b ∈ S ⇒ a·b ∈ S).
Then clearly S is a ring and has the same multiplicative identity as R. Note that Z
is a subring of Q, Q is a subring of R, and R is a subring of C. Subrings do not play
a role analogous to subgroups. That role is played by ideals, and an ideal is never a
subring (unless it is the entire ring). Note that if S is a subring of R and s ∈ S, then
s may be a unit in R but not in S. Note also that Z and Z_n have no proper subrings,
and thus occupy a special place in ring theory, as well as in group theory.
Ideals and Quotient Rings
Ideals in ring theory play a role analogous to normal subgroups in group theory.
Definition A subset I of a ring R is a left (right, 2-sided) ideal provided it is a
subgroup of the additive group R and if a ∈ R and b ∈ I, then a·b ∈ I (respectively
b·a ∈ I, respectively both a·b and b·a ∈ I). The word “ideal” means “2-sided ideal”.
Of course, if R is commutative, every right or left ideal is an ideal.
Theorem Suppose R is a ring.
1) R and 0̄ are ideals of R. These are called the improper ideals.
2) If {I_t}_{t∈T} is a collection of right (left, 2-sided) ideals of R, then ∩_{t∈T} I_t is a
right (left, 2-sided) ideal of R.
3) Furthermore, if the collection is monotonic, then ∪_{t∈T} I_t is a right (left, 2-sided)
ideal of R.
4) If a ∈ R, I = aR is a right ideal. Thus if R is commutative, aR is an ideal,
called a principal ideal. Thus every subgroup of Z is a principal ideal,
because it is of the form nZ.
5) If R is a commutative ring and I ⊂ R is an ideal, then the following are
equivalent.
i) I = R.
ii) I contains some unit u.
iii) I contains 1̄.
Exercise Suppose R is a commutative ring. Show that R is a field iff R contains
no proper ideals.
The following theorem is just an observation, but it is in some sense the beginning
of ring theory.
Theorem Suppose R is a ring and I ⊂ R is an ideal, I ≠ R. Since I is a normal
subgroup of the additive group R, R/I is an additive abelian group. Multiplication
of cosets defined by (a + I)·(b + I) = (ab + I) is well defined and makes R/I a ring.

Proof (a + I)·(b + I) = a·b + aI + Ib + II ⊂ a·b + I. Thus multiplication
is well defined, and the ring axioms are easily verified. The multiplicative identity is
(1̄ + I).

Observation If R = Z and I = nZ, the ring structure on Z_n = Z/nZ is the
same as the one previously defined.
Homomorphisms
Definition Suppose R and R̄ are rings. A function f : R → R̄ is a ring homo-
morphism provided
1) f is a group homomorphism,
2) f(1̄_R) = 1̄_R̄, and
3) if a, b ∈ R then f(a·b) = f(a)·f(b). (On the left, multiplication
is in R, while on the right multiplication is in R̄.)

The kernel of f is the kernel of f considered as a group homomorphism, namely
f^{-1}(0̄).
Here is a list of the basic properties of ring homomorphisms. Much of this
work has already been done in the theorem in group theory on page 28.
Theorem Suppose each of R and R̄ is a ring.
1) The identity map I_R : R → R is a ring homomorphism.
2) The zero map from R to R̄ is not a ring homomorphism
(because it does not send 1̄ to 1̄).
3) The composition of ring homomorphisms is a ring homomorphism.
4) If f : R → R̄ is a bijection which is a ring homomorphism,
then f^{-1} : R̄ → R is a ring homomorphism. Such an f is called
a ring isomorphism. In the case R = R̄, f is also called a
ring automorphism.
5) The image of a ring homomorphism is a subring of the range.
6) The kernel of a ring homomorphism is an ideal of the domain.
In fact, if f : R → R̄ is a homomorphism and I ⊂ R̄ is an ideal,
then f^{-1}(I) is an ideal of R.
7) Suppose I is an ideal of R, I ≠ R, and π : R → R/I is the
natural projection, π(a) = (a + I). Then π is a surjective ring
homomorphism with kernel I. Furthermore, if f : R → R̄ is a surjective
ring homomorphism with kernel I, then R/I ≈ R̄ (see below).
8) From now on the word “homomorphism” means “ring homomorphism”.
Suppose f : R → R̄ is a homomorphism and I is an ideal of R, I ≠ R.
If I ⊂ ker(f), then f̄ : R/I → R̄ defined by f̄(a + I) = f(a)
is a well defined homomorphism making the triangular diagram commute,
i.e., f = f̄ ∘ π. Thus defining a homomorphism on a quotient ring is the same as
defining a homomorphism on the numerator which sends the
denominator to zero. The image of f̄ is the image of f, and
the kernel of f̄ is ker(f)/I. Thus if I = ker(f), f̄ is
injective, and so R/I ≈ image(f).

Proof We know all this on the group level, and it is only necessary
to check that f̄ is a ring homomorphism, which is obvious.
9) Given any ring homomorphism f, domain(f)/ker(f) ≈ image(f).
Exercise Find a ring R with an ideal I and an element b such that b is not a unit
in R but (b +I) is a unit in R/I.
Exercise Show that if u is a unit in a ring R, then conjugation by u is an
automorphism on R. That is, show that f : R → R defined by f(a) = u^{-1}·a·u is
a ring homomorphism which is an isomorphism.
Exercise Suppose R is a ring, T is a non-void set, and R^T is the collection of
all functions f : T → R. Define addition and multiplication on R^T point-wise. This
means if f and g are functions from T to R, then (f + g)(t) = f(t) + g(t) and
(f·g)(t) = f(t)g(t). Show that under these operations R^T is a ring. Suppose S is a
non-void set and α : S → T is a function. If f : T → R is a function, define a function
α*(f) : S → R by α*(f) = f ∘ α. Show α* : R^T → R^S is a ring homomorphism.
Exercise Now consider the case T = [0, 1] and R = R. Let A ⊂ R^{[0,1]} be the
collection of all C^∞ functions, i.e., A = {f : [0, 1] → R : f has an infinite number of
derivatives}. Show A is a ring. Notice that much of the work has been done in the
previous exercise. It is only necessary to show that A is a subring of the ring R^{[0,1]}.
Polynomial Rings
In calculus, we consider real functions f which are polynomials, f(x) = a_0 + a_1x +
⋯ + a_nx^n. The sum and product of polynomials are again polynomials, and it is easy
to see that the collection of polynomial functions forms a commutative ring. We can
do the same thing formally in a purely algebraic setting.

Definition Suppose R is a commutative ring and x is a “variable” or “symbol”.
The polynomial ring R[x] is the collection of all polynomials f = a_0 + a_1x + ⋯ + a_nx^n
where a_i ∈ R. Under the obvious addition and multiplication, R[x] is a commutative
ring. The degree of a non-zero polynomial f is the largest integer n such that a_n ≠ 0̄,
and is denoted by n = deg(f). If a_n = 1̄, then f is said to be monic.
To be more formal, think of a polynomial a_0 + a_1x + ⋯ as an infinite sequence
(a_0, a_1, ...) such that each a_i ∈ R and only a finite number are non-zero. Then
(a_0, a_1, ...) + (b_0, b_1, ...) = (a_0 + b_0, a_1 + b_1, ...) and
(a_0, a_1, ...)·(b_0, b_1, ...) = (a_0b_0, a_0b_1 + a_1b_0, a_0b_2 + a_1b_1 + a_2b_0, ...).
Note that on the right, the ring multiplication a·b is written simply as ab, as is
often done for convenience.
Theorem If R is a domain, R[x] is also a domain.

Proof Suppose f and g are non-zero polynomials. Then deg(f) + deg(g) = deg(fg)
and thus fg is not 0̄. Another way to prove this theorem is to look at the bottom
terms instead of the top terms. Let a_ix^i and b_jx^j be the first non-zero terms of f and
g. Then a_ib_jx^{i+j} is the first non-zero term of fg.
Theorem (The Division Algorithm) Suppose R is a commutative ring, f ∈
R[x] has degree ≥ 1 and its top coefficient is a unit in R. (If R is a field, the top
coefficient of f will always be a unit.) Then for any g ∈ R[x], ∃! h, r ∈ R[x] such that
g = fh + r with r = 0̄ or deg(r) < deg(f).

Proof This theorem states the existence and uniqueness of polynomials h and
r. We outline the proof of existence and leave uniqueness as an exercise. Suppose
f = a_0 + a_1x + ⋯ + a_mx^m where m ≥ 1 and a_m is a unit in R. For any g with
deg(g) < m, set h = 0̄ and r = g. For the general case, the idea is to divide f into g
until the remainder has degree less than m. The proof is by induction on the degree
of g. Suppose n ≥ m and the result holds for any polynomial of degree less than
n. Suppose g is a polynomial of degree n. Now ∃ a monomial bx^t with t = n − m
and deg(g − fbx^t) < n. By induction, ∃ h_1 and r with fh_1 + r = (g − fbx^t) and
deg(r) < m. The result follows from the equation f(h_1 + bx^t) + r = g.
Note If r = 0 we say that f divides g. Note that f = x − c divides g iff c is a
root of g, i.e., g(c) = 0. More generally, x −c divides g with remainder g(c).
Theorem Suppose R is a domain, n > 0, and g(x) = a_0 + a_1x + ⋯ + a_nx^n is a
polynomial of degree n with at least one root in R. Then g has at most n roots. Let
c_1, c_2, .., c_k be the distinct roots of g in the ring R. Then ∃ a unique sequence of
positive integers n_1, n_2, .., n_k and a unique polynomial h with no root in R so that
g(x) = (x − c_1)^{n_1} ⋯ (x − c_k)^{n_k} h(x). (If h has degree 0, i.e., if h = a_n, then we say
“all the roots of g belong to R”. If g = a_nx^n, we say “all the roots of g are 0̄”.)
Proof Uniqueness is easy so let’s prove existence. The theorem is clearly true
for n = 1. Suppose n > 1 and the theorem is true for any polynomial of degree less
than n. Now suppose g is a polynomial of degree n and c_1 is a root of g. Then ∃
a polynomial h_1 with g(x) = (x − c_1)h_1. Since h_1 has degree less than n, the result
follows by induction.
Note If g is any non-constant polynomial in C[x], all the roots of g belong to C,
i.e., C is an algebraically closed field. This is called The Fundamental Theorem of
Algebra, and it is assumed without proof for this textbook.
Exercise Suppose g is a non-constant polynomial in R[x]. Show that if g has
odd degree then it has a real root. Also show that if g(x) = x² + bx + c, then it has
a real root iff b² ≥ 4c, and in that case both roots belong to R.
Definition A domain T is a principal ideal domain (PID) if, given any ideal I,
∃ t ∈ T such that I = tT. Note that Z is a PID and any field is a PID.
Theorem Suppose F is a field, I is a proper ideal of F[x], and n is the smallest
positive integer such that I contains a polynomial of degree n. Then I contains a
unique polynomial of the form f = a_0 + a_1x + ⋯ + a_{n−1}x^{n−1} + x^n and it has the
property that I = fF[x]. Thus F[x] is a PID. Furthermore, each coset of I can be
written uniquely in the form (c_0 + c_1x + ⋯ + c_{n−1}x^{n−1} + I).

Proof This is a good exercise in the use of the division algorithm. Note this is
similar to showing that a subgroup of Z is generated by one element (see page 15).
Theorem Suppose R is a subring of a commutative ring C and c ∈ C. Then ∃!
homomorphism h : R[x] → C with h(x) = c and h(r) = r for all r ∈ R. It is defined
by h(a_0 + a_1x + ⋯ + a_nx^n) = a_0 + a_1c + ⋯ + a_nc^n, i.e., h sends f(x) to f(c). The image
of h is the smallest subring of C containing R and c.

This map h is called an evaluation map. The theorem says that adding two
polynomials in R[x] and evaluating is the same as evaluating and then adding in C.
Also multiplying two polynomials in R[x] and evaluating is the same as evaluating
and then multiplying in C. In street language the theorem says you are free to send
x wherever you wish and extend to a ring homomorphism on R[x].
Exercise Let C = {a + bi : a, b ∈ R}. Since R is a subring of C, there exists a
homomorphism h : R[x] → C which sends x to i, and this h is surjective. Show
ker(h) = (x² + 1)R[x] and thus R[x]/(x² + 1) ≈ C. This is a good way to look
at the complex numbers, i.e., to obtain C, adjoin x to R and set x² = −1.
Exercise Z_2[x]/(x² + x + 1) has 4 elements. Write out the multiplication table
for this ring and show that it is a field.
Exercise Show that, if R is a domain, the units of R[x] are just the units of R.
Thus if F is a field, the units of F[x] are the non-zero constants. Show that [1] + [2]x
is a unit in Z_4[x].
In this chapter we do not prove F[x] is a unique factorization domain, nor do
we even define unique factorization domain. The next definition and theorem are
included merely for reference, and should not be studied at this stage.
Definition Suppose F is a field and f ∈ F[x] has degree ≥ 1. The statement
that g is an associate of f means ∃ a unit u ∈ F[x] such that g = uf. The statement
that f is irreducible means that if h is a non-constant polynomial which divides f,
then h is an associate of f.
We do not develop the theory of F[x] here. However, the development is easy
because it corresponds to the development of Z in Chapter 1. The Division Algo-
rithm corresponds to the Euclidean Algorithm. Irreducible polynomials correspond
to prime integers. The degree function corresponds to the absolute value function.
One difference is that the units of F[x] are non-zero constants, while the units of Z
are just ±1. Thus the associates of f are all cf with c ≠ 0̄ while the associates of an
integer n are just ±n. Here is the basic theorem. (This theory is developed in full in
the Appendix under the topic of Euclidean domains.)
Theorem Suppose F is a field and f ∈ F[x] has degree ≥ 1. Then f factors as the
product of irreducibles, and this factorization is unique up to order and associates.
Also the following are equivalent.
1) F[x]/(f) is a domain.
2) F[x]/(f) is a field.
3) f is irreducible.
Definition Now suppose x and y are “variables”. If a ∈ R and n, m ≥ 0, then
ax^ny^m = ay^mx^n is called a monomial. Define an element of R[x, y] to be any finite
sum of monomials.
Theorem R[x, y] is a commutative ring and (R[x])[y] ≈ R[x, y] ≈ (R[y])[x]. In
other words, any polynomial in x and y with coefficients in R may be written as a
polynomial in y with coefficients in R[x], or as a polynomial in x with coefficients in
R[y].

Side Comment It is true that if F is a field, each f ∈ F[x, y] factors as the
product of irreducibles. However F[x, y] is not a PID. For example, the ideal
I = xF[x, y] + yF[x, y] = {f ∈ F[x, y] : f(0̄, 0̄) = 0̄} is not principal.
If R is a commutative ring and n ≥ 2, the concept of a polynomial ring in
n variables works fine without a hitch. If a ∈ R and v_1, v_2, ..., v_n are non-negative
integers, then ax_1^{v_1}x_2^{v_2} ⋯ x_n^{v_n} is called a monomial. Order does not matter here.
Define an element of R[x_1, x_2, ..., x_n] to be any finite sum of monomials. This
gives a commutative ring and there is a canonical isomorphism R[x_1, x_2, ..., x_n] ≈
(R[x_1, x_2, ..., x_{n−1}])[x_n]. Using this and induction on n, it is easy to prove the fol-
lowing theorem.

Theorem If R is a domain, R[x_1, x_2, ..., x_n] is a domain and its units are just the
units of R.
Exercise Suppose R is a commutative ring and f : R[x, y] → R[x] is the eval-
uation map which sends y to 0̄. This means f(p(x, y)) = p(x, 0̄). Show f is a ring
homomorphism whose kernel is the ideal (y) = yR[x, y]. Use the fact that “the do-
main mod the kernel is isomorphic to the image” to show R[x, y]/(y) is isomorphic
to R[x].
Product of Rings
The product of rings works fine, just as does the product of groups.
Theorem Suppose T is an index set and for each t ∈ T, R_t is a ring. On the
additive abelian group ∏_{t∈T} R_t = ∏ R_t, define multiplication by {r_t}·{s_t} = {r_ts_t}.
Then ∏ R_t is a ring and each projection π_s : ∏ R_t → R_s is a ring homomorphism.
Suppose R is a ring. Under the natural bijection from {functions f : R → ∏ R_t}
to {sequences of functions {f_t}_{t∈T} where f_t : R → R_t}, f is a ring homomorphism
iff each f_t is a ring homomorphism.

Proof We already know f is a group homomorphism iff each f_t is a group homo-
morphism (see page 36). Note that {1̄_t} is the multiplicative identity of ∏ R_t, and
f(1̄_R) = {1̄_t} iff f_t(1̄_R) = 1̄_t for each t ∈ T. Finally, since multiplication is defined
coordinatewise, f is a ring homomorphism iff each f_t is a ring homomorphism.
Exercise Suppose R and S are rings. Note that R × 0 is not a subring of R × S
because it does not contain (1̄_R, 1̄_S). Show R × 0̄ is an ideal and (R × S)/(R × 0̄) ≈ S.
Suppose I ⊂ R and J ⊂ S are ideals. Show I × J is an ideal of R × S and every ideal
of R × S is of this form.
Exercise Suppose R and S are commutative rings. Show T = R × S is not a
domain. Let e = (1, 0) ∈ R × S and show e² = e, (1 − e)² = (1 − e), R × 0 = eT, and
0 × S = (1 − e)T.
Exercise If T is any ring, an element e of T is called an idempotent provided
e² = e. The elements 0 and 1 are idempotents called the trivial idempotents. Suppose
T is a commutative ring and e ∈ T is an idempotent with 0 ≠ e ≠ 1. Let R = eT
and S = (1 − e)T. Show each of the ideals R and S is a ring with identity, and
f : T → R × S defined by f(t) = (et, (1 − e)t) is a ring isomorphism. This shows that
a commutative ring T splits as the product of two rings iff it contains a non-trivial
idempotent.
The Chinese Remainder Theorem
Suppose n and m are relatively prime integers with n, m > 1. There is an exercise
in Chapter 2 to show that Z_{nm} and Z_n × Z_m are isomorphic as groups. It will now be
shown that they are also isomorphic as rings. (For a useful and elegant generalization
of this theorem, see the Appendix.)
Theorem Suppose n_1, ..., n_t are integers, each n_i > 1, and (n_i, n_j) = 1 for all
i ≠ j. Let f_i : Z → Z_{n_i} be defined by f_i(a) = [a]. (Note that the bracket symbol is
used ambiguously.) Then the ring homomorphism f = (f_1, .., f_t) : Z → Z_{n_1} × ⋯ × Z_{n_t}
is surjective. Furthermore, the kernel of f is nZ, where n = n_1n_2 ⋯ n_t. Thus Z_n and
Z_{n_1} × ⋯ × Z_{n_t} are isomorphic rings.

Proof We wish to show that the order of f(1) is n, and thus f(1) is a group
generator, and thus f is surjective. The element f(1)m = ([1], .., [1])m = ([m], .., [m])
is zero iff m is a multiple of each of n_1, .., n_t. Since their least common multiple is n,
the order of f(1) is n. (See the fourth exercise on page 36.)
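Surjectivity can be observed directly for small moduli. The Python sketch below
(illustrative only) checks that m ↦ ([m], ..., [m]) hits every tuple when the moduli
are pairwise relatively prime.

    from itertools import product
    from math import prod

    def crt_image(mods):
        """All tuples ([m] mod n_1, ..., [m] mod n_t) for 0 <= m < n_1 * ... * n_t."""
        n = prod(mods)
        return {tuple(m % ni for ni in mods) for m in range(n)}

    mods = (8, 9, 35)                      # pairwise relatively prime
    assert crt_image(mods) == set(product(*(range(ni) for ni in mods)))
    print(len(crt_image(mods)))            # 2520 = 8 * 9 * 35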
Exercise Show that if a is an integer and p is a prime, then [a] = [a^p] in Z_p
(Fermat’s Little Theorem). Use this and the Chinese Remainder Theorem to show
that if b is a positive integer, it has the same last digit as b^5.
Characteristic
The following theorem is just an observation, but it shows that in ring theory, the
ring of integers is a “cornerstone”.
Theorem If R is a ring, there is one and only one ring homomorphism f : Z → R.
It is given by f(m) = m1̄ = m̄. Thus the subgroup of R generated by 1̄ is a subring
of R isomorphic to Z or isomorphic to Z_n for some positive integer n.

Definition Suppose R is a ring and f : Z → R is the natural ring homomorphism
f(m) = m1̄ = m̄. The non-negative integer n with ker(f) = nZ is called the charac-
teristic of R. Thus f is injective iff R has characteristic 0 iff 1̄ has infinite order.
If f is not injective, the characteristic of R is the order of 1̄.
It is an interesting fact that, if R is a domain, all the non-zero elements of R
have the same order.
Theorem Suppose R is a domain. If R has characteristic 0, then each non-zero
a ∈ R has infinite order. If R has finite characteristic n, then n is a prime and each
non-zero a ∈ R has order n.

Proof Suppose R has characteristic 0, a is a non-zero element of R, and m is a
positive integer. Then ma = m̄·a cannot be 0̄ because m̄, a ≠ 0̄ and R is a domain.
Thus o(a) = ∞. Now suppose R has characteristic n. Then R contains Z_n as a
subring, and thus Z_n is a domain and n is a prime. If a is a non-zero element of R,
na = n̄·a = 0̄·a = 0̄ and thus o(a) = n.
Exercise Show that if F is a field of characteristic 0, F contains Q as a subring.
That is, show that the injective homomorphism f : Z → F extends to an injective
homomorphism f̄ : Q → F.
Boolean Rings
This section is not used elsewhere in this book. However it fits easily here, and is
included for reference.
Definition A ring R is a Boolean ring if for each a ∈ R, a² = a, i.e., each element
of R is an idempotent.
Theorem Suppose R is a Boolean ring.
1) R has characteristic 2. If a ∈ R, 2a = a + a = 0̄, and so a = −a.
Proof (a + a) = (a + a)² = a² + 2a² + a² = 4a. Thus 2a = 0̄.
2) R is commutative.
Proof (a + b) = (a + b)² = a² + (a·b) + (b·a) + b²
= a + (a·b) − (b·a) + b. Thus a·b = b·a.
3) If R is a domain, R ≈ Z_2.
Proof Suppose a ≠ 0̄. Then a·(1̄ − a) = 0̄ and so a = 1̄.
4) The image of a Boolean ring is a Boolean ring. That is, if I is an ideal
of R with I ≠ R, then every element of R/I is idempotent and thus R/I
is a Boolean ring. It follows from 3) that R/I is a domain iff R/I is a
field iff R/I ≈ Z_2. (In the language of Chapter 6, I is a prime ideal
iff I is a maximal ideal iff R/I ≈ Z_2.)
Suppose X is a non-void set. If a is a subset of X, let a′ = (X − a) be the complement
of a in X. Now suppose R is a non-void collection of subsets of X. Consider the
following properties for R.
1) a ∈ R ⇒ a′ ∈ R.
2) a, b ∈ R ⇒ (a ∩ b) ∈ R.
3) a, b ∈ R ⇒ (a ∪ b) ∈ R.
4) ∅ ∈ R and X ∈ R.

Theorem If 1) and 2) are satisfied, then 3) and 4) are satisfied. In this case, R
is called a Boolean algebra of sets.

Proof Suppose 1) and 2) are true, and a, b ∈ R. Then a ∪ b = (a′ ∩ b′)′ belongs to
R and so 3) is true. Since R is non-void, it contains some element a. Then ∅ = a ∩ a′
and X = a ∪ a′ belong to R, and so 4) is true.
Theorem Suppose R is a Boolean algebra of sets. Define an addition on R by
a + b = (a ∪ b) − (a ∩ b). Under this addition, R is an abelian group with 0̄ = ∅ and
a = −a. Define a multiplication on R by a·b = a ∩ b. Under this multiplication R
becomes a Boolean ring with 1̄ = X.
Note Suppose R is a Boolean ring. It is a classical theorem that ∃ a Boolean
algebra of sets whose Boolean ring is isomorphic to R. So let’s just suppose R is
a Boolean algebra of sets which is a Boolean ring with addition and multiplication
defined as above. Now define a ∨ b = a ∪ b and a ∧ b = a ∩ b. These operations cup
and cap are associative, commutative, have identity elements, and each distributes
over the other. With these two operations (along with complement), R is called a
Boolean algebra. R is not a group under cup or cap. Anyway, it is a classical fact
that, if you have a Boolean ring (algebra), you have a Boolean algebra (ring). The
advantage of the algebra is that it is symmetric in cup and cap. The advantage of
the ring viewpoint is that you can draw from the rich theory of commutative rings.
Exercise Let X = {1, 2, ..., n} and let R be the Boolean ring of all subsets of
X. Note that o(R) = 2^n. Define f_i : R → Z_2 by f_i(a) = [1] iff i ∈ a. Show each
f_i is a homomorphism and thus f = (f_1, ..., f_n) : R → Z_2 × Z_2 × ⋯ × Z_2 is a ring
homomorphism. Show f is an isomorphism.
Exercise Suppose R is a finite Boolean ring. Show that R ≈ Z_2 × Z_2 × ⋯ × Z_2.
Chapter 4
Matrices and Matrix Rings
We first consider matrices in full generality, i.e., over an arbitrary ring R. However,
after the first few pages, it will be assumed that R is commutative. The topics,
such as invertible matrices, transpose, elementary matrices, systems of equations,
and determinant, are all classical. The highlight of the chapter is the theorem that a
square matrix is a unit in the matrix ring iff its determinant is a unit in the ring.
This chapter concludes with the theorem that similar matrices have the same deter-
minant, trace, and characteristic polynomial. This will be used in the next chapter
to show that an endomorphism on a finitely generated vector space has a well defined
determinant, trace, and characteristic polynomial.
Definition Suppose m and n are positive integers. Let R_{m,n} be the collection of
all m × n matrices

    A = (a_{i,j}) = ( a_{1,1} ... a_{1,n} )
                    (   ⋮          ⋮     )
                    ( a_{m,1} ... a_{m,n} )

where each entry a_{i,j} ∈ R.
A matrix may be viewed as m n-dimensional row vectors or as n m-dimensional
column vectors. A matrix is said to be square if it has the same number of rows
as columns. Square matrices are so important that they have a special notation,
R_n = R_{n,n}. R^n is defined to be the additive abelian group R × R × ⋯ × R.
To emphasize that R^n does not have a ring structure, we use the “sum” notation,
R^n = R ⊕ R ⊕ ⋯ ⊕ R. Our convention is to write elements of R^n as column vectors,
i.e., to identify R^n with R_{n,1}. If the elements of R^n are written as row vectors, R^n is
identified with R_{1,n}.
Addition of matrices To “add” two matrices, they must have the same number
of rows and the same number of columns, i.e., addition is a binary operation R_{m,n} ×
R_{m,n} → R_{m,n}. The addition is defined by (a_{i,j}) + (b_{i,j}) = (a_{i,j} + b_{i,j}), i.e., the i, j term
of the sum is the sum of the i, j terms. The following theorem is just an observation.
Theorem R_{m,n} is an additive abelian group. Its “zero” is the matrix 0 = 0_{m,n}
all of whose terms are zero. Also −(a_{i,j}) = (−a_{i,j}). Furthermore, as additive groups,
R_{m,n} ≈ R^{mn}.
Scalar multiplication An element of R is called a scalar. A matrix may be
“multiplied” on the right or left by a scalar. Right scalar multiplication is defined
by (a_{i,j})c = (a_{i,j}c). It is a function R_{m,n} × R → R_{m,n}. Note in particular that
scalar multiplication is defined on R^n. Of course, if R is commutative, there is no
distinction between right and left scalar multiplication.

Theorem Suppose A, B ∈ R_{m,n} and c, d ∈ R. Then
    (A + B)c = Ac + Bc
    A(c + d) = Ac + Ad
    A(cd) = (Ac)d
    and A1 = A

This theorem is entirely transparent. In the language of the next chapter, it merely
states that R_{m,n} is a right module over the ring R.
Multiplication of Matrices The matrix product AB is defined iff the number
of columns of A is equal to the number of rows of B. The matrix AB will have the
same number of rows as A and the same number of columns as B, i.e., multiplication
is a function R_{m,n} × R_{n,p} → R_{m,p}. The product (a_{i,j})(b_{i,j}) is defined to be the matrix
whose (s, t) term is a_{s,1}b_{1,t} + ⋯ + a_{s,n}b_{n,t}, i.e., the dot product of row s of A
with column t of B.
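The defining dot-product formula translates directly into code. This Python sketch
(lists of rows, our own encoding) multiplies matrices whose entries support + and ·:

    def mat_mul(A, B):
        """Product of an m x n matrix A and an n x p matrix B, both lists of rows.
        Entry (s, t) is the dot product of row s of A with column t of B."""
        m, n, p = len(A), len(B), len(B[0])
        assert all(len(row) == n for row in A)     # columns of A = rows of B
        return [[sum(A[s][i] * B[i][t] for i in range(n)) for t in range(p)]
                for s in range(m)]

    A = [[1, 2], [3, 4]]
    B = [[0, 1], [1, 0]]
    print(mat_mul(A, B))   # [[2, 1], [4, 3]] -- the columns of A interchanged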
Exercise Consider real matrices

    A = ( a b )    U = ( 2 0 )    V = ( 0 1 )    W = ( 1 2 )
        ( c d ),       ( 0 1 ),       ( 1 0 ),       ( 0 1 ).

Find the matrices AU, UA, AV, VA, AW, and WA.
Definition The identity matrix I_n ∈ R_n is the square matrix whose diagonal terms
are 1 and whose off-diagonal terms are 0.
Theorem Suppose A ∈ R_{m,n}.
1) 0_{p,m}A = 0_{p,n} and A0_{n,p} = 0_{m,p}.
2) I_mA = A = AI_n.

Theorem (The distributive laws) (A + B)C = AC + BC and
C(A + B) = CA + CB whenever the operations are defined.

Theorem (The associative law for matrix multiplication) Suppose A ∈ R_{m,n},
B ∈ R_{n,p}, and C ∈ R_{p,q}. Then (AB)C = A(BC). Note that ABC ∈ R_{m,q}.
Proof We must show that the (s, t) terms are equal. The proof involves writing
it out and changing the order of summation. Let (x_{i,j}) = AB and (y_{i,j}) = BC.
Then the (s, t) term of (AB)C is

    Σ_i x_{s,i}c_{i,t} = Σ_i (Σ_j a_{s,j}b_{j,i}) c_{i,t} = Σ_{i,j} a_{s,j}b_{j,i}c_{i,t}
                       = Σ_j a_{s,j} (Σ_i b_{j,i}c_{i,t}) = Σ_j a_{s,j}y_{j,t}

which is the (s, t) term of A(BC).
Theorem For each ring R and integer n ≥ 1, R_n is a ring.

Proof This elegant little theorem is immediate from the theorems above. The
units of R_n are called invertible or non-singular matrices. They form a group under
multiplication called the general linear group and denoted by Gl_n(R) = (R_n)*.
Exercise Recall that if A is a ring and a ∈ A, then aA is a right ideal of A. Let
A = R_2 and a = (a_{i,j}) where a_{1,1} = 1 and the other entries are 0. Find aR_2 and R_2a.
Show that the only ideal of R_2 containing a is R_2 itself.
Multiplication by blocks Suppose A, E ∈ R_n, B, F ∈ R_{n,m}, C, G ∈ R_{m,n}, and
D, H ∈ R_m. Then multiplication in R_{n+m} is given by

    ( A B ) ( E F )   ( AE + BG  AF + BH )
    ( C D ) ( G H ) = ( CE + DG  CF + DH ).
Transpose
Notation For the remainder of this chapter on matrices, suppose R is a commu-
tative ring. Of course, for n > 1, R_n is non-commutative.

Transpose is a function from R_{m,n} to R_{n,m}. If A ∈ R_{m,n}, A^t ∈ R_{n,m} is the matrix
whose (i, j) term is the (j, i) term of A. So row i (column i) of A becomes column
i (row i) of A^t. If A is an n-dimensional row vector, then A^t is an n-dimensional
column vector. If A is a square matrix, A^t is also square.
Theorem
1) (A^t)^t = A.
2) (A + B)^t = A^t + B^t.
3) If c ∈ R, (Ac)^t = A^tc.
4) (AB)^t = B^tA^t.
5) If A ∈ R_n, then A is invertible iff A^t is invertible.
In this case (A^{-1})^t = (A^t)^{-1}.

Proof of 5) Suppose A is invertible. I = I^t = (AA^{-1})^t = (A^{-1})^tA^t.
Exercise Characterize those invertible matrices A ∈ R_2 which have A^{-1} = A^t.
Show that they form a subgroup of Gl_2(R).
Triangular Matrices
If A ∈ R_n, then A is upper (lower) triangular provided a_{i,j} = 0 for all i > j (all
j > i). A is strictly upper (lower) triangular provided a_{i,j} = 0 for all i ≥ j (all j ≥ i).
A is diagonal if it is upper and lower triangular, i.e., a_{i,j} = 0 for all i ≠ j. Note
that if A is upper (lower) triangular, then A^t is lower (upper) triangular.
Theorem If A ∈ R_n is strictly upper (or lower) triangular, then A^n = 0.

Proof The way to understand this is just multiply it out for n = 2 and n = 3.
The geometry of this theorem will become transparent later in Chapter 5 when the
matrix A defines an R-module endomorphism on R^n.

Definition If T is any ring, an element t ∈ T is said to be nilpotent provided ∃ n
such that t^n = 0. In this case, (1 − t) is a unit with inverse 1 + t + t² + ⋯ + t^{n−1}.
Thus if T = R_n and B is a nilpotent matrix, I − B is invertible.
Exercise Let R = Z. Find the inverse of

    ( 1 2 −3 )
    ( 0 1  4 )
    ( 0 0  1 ).
Exercise Suppose A is the diagonal matrix with diagonal entries a_1, a_2, ..., a_n
(and all other entries 0), B ∈ R_{m,n}, and C ∈ R_{n,p}. Show that BA is obtained from
B by multiplying column i of B by a_i. Show AC is obtained from C by multiplying
row i of C by a_i. Show A is a unit in R_n iff each a_i is a unit in R.
Scalar matrices A scalar matrix is a diagonal matrix for which all the diagonal
terms are equal, i.e., a matrix of the form cI_n. The map R → R_n which sends c to
cI_n is an injective ring homomorphism, and thus we may consider R to be a subring
of R_n. Multiplying by a scalar is the same as multiplying by a scalar matrix, and
thus scalar matrices commute with everything, i.e., if B ∈ R_n, (cI_n)B = cB = Bc =
B(cI_n). Recall we are assuming R is a commutative ring.
Exercise Suppose A ∈ R_n and for each B ∈ R_n, AB = BA. Show A is a scalar
matrix. For n > 1, this shows how non-commutative R_n is.
Elementary Operations and Elementary Matrices
There are 3 types of elementary row and column operations on a matrix A. A
need not be square.
Type 1 Multiply row i by some unit a ∈ R. (Multiply column i by some
unit a ∈ R.)
Type 2 Interchange row i and row j. (Interchange column i and column j.)
Type 3 Add a times row j to row i where i ≠ j and a is any element of R.
(Add a times column i to column j where i ≠ j and a is any element of R.)
Elementary Matrices Elementary matrices are square and invertible. There
are three types. They are obtained by performing row or column operations on the
identity matrix.

Type 1 B is the identity matrix with one diagonal entry replaced by a unit a ∈ R.

Type 2 B is the identity matrix with two of its rows (equivalently, two of its
columns) interchanged.

Type 3 B is the identity matrix with one additional non-zero entry a_{i,j}, i ≠ j,
off the diagonal, where a_{i,j} is any element of R.

In type 1, all the off-diagonal elements are zero. In type 2, there are two non-zero
off-diagonal elements. In type 3, there is at most one non-zero off-diagonal element,
and it may be above or below the diagonal.
Exercise Show that if B is an elementary matrix of type 1, 2, or 3, then B is
invertible and B^{-1} is an elementary matrix of the same type.
The following theorem is handy when working with matrices.
Theorem Suppose A is a matrix. It need not be square. To perform an elementary row (column) operation on A, perform the operation on an identity matrix to obtain an elementary matrix B, and multiply on the left (right). That is, BA = row operation on A and AB = column operation on A. (See the exercise on page 54.)
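As a quick computational illustration (a sketch, with a hypothetical matrix A), the following Python fragment performs a type 3 row operation directly and by left multiplication by the corresponding elementary matrix:

    import numpy as np

    A = np.array([[1, 2],
                  [3, 4],
                  [5, 6]])

    B = np.eye(3)                      # perform the operation on an identity matrix:
    B[1, 0] = 2                        # add 2 times row 0 to row 1

    direct = A.astype(float)
    direct[1] += 2 * direct[0]         # the same row operation done directly

    print(np.array_equal(B @ A, direct))   # True: BA = row operation on A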
Exercise Suppose F is a field and A ∈ F_{m,n}.

1) Show ∃ invertible matrices B ∈ F_m and C ∈ F_n such that BAC = (d_{i,j}) where d_{1,1} = ⋯ = d_{t,t} = 1 and all other entries are 0. The integer t is called the rank of A. (See page 89 of Chapter 5.)

2) Suppose A ∈ F_n is invertible. Show A is the product of elementary matrices.

3) A matrix T is said to be in row echelon form if, for each 1 ≤ i < m, the first non-zero term of row (i + 1) is to the right of the first non-zero term of row i. Show ∃ an invertible matrix B ∈ F_m such that BA is in row echelon form.

4) Let A = ( 3 11 ; 0 4 ) and D = ( 3 11 ; 1 4 ). Write A and D as products of elementary matrices over Q. Is it possible to write them as products of elementary matrices over Z?

For 1), perform row and column operations on A to reach the desired form. This shows the matrices B and C may be selected as products of elementary matrices. Part 2) also follows from this procedure. For part 3), use only row operations. Notice that if T is in row echelon form, the number of non-zero rows is the rank of T.
Systems of Equations
Suppose A = (a_{i,j}) ∈ R_{m,n} and C = (c_1, ..., c_m)^t ∈ R^m = R_{m,1}. The system

    a_{1,1}x_1 + ⋯ + a_{1,n}x_n = c_1
        ⋮
    a_{m,1}x_1 + ⋯ + a_{m,n}x_n = c_m

of m equations in n unknowns can be written as one matrix equation in one unknown, namely as (a_{i,j})(x_1, ..., x_n)^t = (c_1, ..., c_m)^t, or AX = C.
Define f : R^n → R^m by f(D) = AD. Then f is a group homomorphism and also f(Dc) = f(D)c for any c ∈ R. In the language of the next chapter, this says that f is an R-module homomorphism. The following theorem summarizes what we already know about solutions of linear equations in this setting.
Theorem

1) AX = 0 is called the homogeneous equation. Its solution set is ker(f).

2) AX = C has a solution iff C ∈ image(f). If D ∈ R^n is one solution, the solution set is the coset D + ker(f) in R^n. (See part 7 of the section on Homomorphisms in Chapter 2.)

3) Suppose B ∈ R_m is invertible. Then AX = C and (BA)X = BC have the same set of solutions. Thus we may perform any row operation on both sides of the equation and not change the solution set.

4) If A ∈ R_m is invertible, then AX = C has the unique solution X = A^{-1}C.
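When A happens to be invertible, part 4) is the familiar numerical procedure; a minimal sketch in Python (the matrix and right-hand side are hypothetical examples):

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 5.]])           # |A| = -1, so A is invertible
    C = np.array([4., 7.])

    X = np.linalg.solve(A, C)          # the unique solution of AX = C
    print(X, A @ X)                    # A @ X reproduces C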
The geometry of systems of equations over a field will not become really transparent until the development of linear algebra in Chapter 5.
Determinants
The concept of determinant is one of the most amazing in all of mathematics.
The proper development of this concept requires a study of multilinear forms, which
is given in Chapter 6. In this section we simply present the basic properties.
For each n ≥ 1 and each commutative ring R, determinant is a function from R_n to R. For n = 1, |(a)| = a. For n = 2, |( a b ; c d )| = ad − bc.
Definition Let A = (a_{i,j}) ∈ R_n. If σ is a permutation on (1, 2, ..., n), let sign(σ) = 1 if σ is an even permutation, and sign(σ) = −1 if σ is an odd permutation. The determinant is defined by

    |A| = Σ_{all σ} sign(σ) a_{1,σ(1)} a_{2,σ(2)} ⋯ a_{n,σ(n)}.

Check that for n = 2, this agrees with the definition above. (Note that here we are writing the permutation functions as σ(i) and not as (i)σ.)
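The definition transcribes directly into code. The following Python sketch (an illustration; hopelessly slow for large n) computes |A| by summing over all permutations:

    from itertools import permutations

    def sign(p):
        # Sign of a permutation, computed by counting inversions.
        s = 1
        for i in range(len(p)):
            for j in range(i + 1, len(p)):
                if p[i] > p[j]:
                    s = -s
        return s

    def det_by_definition(A):
        n = len(A)
        total = 0
        for p in permutations(range(n)):
            term = sign(p)
            for i in range(n):
                term *= A[i][p[i]]    # the factor a_{i, sigma(i)}
            total += term
        return total

    print(det_by_definition([[1, 2], [3, 4]]))   # -2, agreeing with ad - bc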
For each σ, a_{1,σ(1)} a_{2,σ(2)} ⋯ a_{n,σ(n)} contains exactly one factor from each row and one factor from each column. Since R is commutative, we may rearrange the factors so that the first comes from the first column, the second from the second column, etc. This means that there is a permutation τ on (1, 2, ..., n) such that a_{1,σ(1)} ⋯ a_{n,σ(n)} = a_{τ(1),1} ⋯ a_{τ(n),n}. We wish to show that τ = σ^{-1} and thus sign(σ) = sign(τ). To reduce the abstraction, suppose σ(2) = 5. Then the first expression will contain the factor a_{2,5}. In the second expression, it will appear as a_{τ(5),5}, and so τ(5) = 2. Anyway, τ is the inverse of σ and thus there are two ways to define determinant. It follows that the determinant of a matrix is equal to the determinant of its transpose.
Theorem

    |A| = Σ_{all σ} sign(σ) a_{1,σ(1)} a_{2,σ(2)} ⋯ a_{n,σ(n)} = Σ_{all τ} sign(τ) a_{τ(1),1} a_{τ(2),2} ⋯ a_{τ(n),n}.

Corollary |A| = |A^t|.
You may view an n × n matrix A as a sequence of n column vectors or as a sequence of n row vectors. Here we will use column vectors. This means we write the matrix A as A = (A_1, A_2, ..., A_n) where each A_i ∈ R_{n,1} = R^n.
Theorem If two columns of A are equal, then |A| = 0̄.

Proof For simplicity, assume the first two columns are equal, i.e., A_1 = A_2. Now |A| = Σ_{all τ} sign(τ) a_{τ(1),1} a_{τ(2),2} ⋯ a_{τ(n),n}, and this summation has n! terms and n! is an even number. Let γ be the transposition which interchanges one and two. Then for any τ, a_{τ(1),1} a_{τ(2),2} ⋯ a_{τ(n),n} = a_{τγ(1),1} a_{τγ(2),2} ⋯ a_{τγ(n),n}. This pairs up the n! terms of the summation, and since sign(τ) = −sign(τγ), these pairs cancel in the summation. Therefore |A| = 0̄.
Theorem Suppose 1 ≤ r ≤ n, C_r ∈ R_{n,1}, and a, c ∈ R. Then

    |(A_1, ..., A_{r−1}, aA_r + cC_r, A_{r+1}, ..., A_n)| = a|(A_1, ..., A_n)| + c|(A_1, ..., A_{r−1}, C_r, A_{r+1}, ..., A_n)|.

Proof This is immediate from the definition of determinant and the distributive law of multiplication in the ring R.
Summary Determinant is a function d : R_n → R. In the language used in the Appendix, the two previous theorems say that d is an alternating multilinear form. The next two theorems say that d is skew-symmetric.
Theorem Interchanging two columns of A multiplies the determinant by minus one.

Proof For simplicity, show that |(A_2, A_1, A_3, ..., A_n)| = −|A|. We know

    0̄ = |(A_1 + A_2, A_1 + A_2, A_3, ..., A_n)|
      = |(A_1, A_1, A_3, ..., A_n)| + |(A_1, A_2, A_3, ..., A_n)| + |(A_2, A_1, A_3, ..., A_n)| + |(A_2, A_2, A_3, ..., A_n)|.

Since the first and last of these four terms are zero, the result follows.
Theorem If τ is a permutation of (1, 2, ..., n), then

    |A| = sign(τ) |(A_{τ(1)}, A_{τ(2)}, ..., A_{τ(n)})|.

Proof The permutation τ is the finite product of transpositions.
Exercise Rewrite the four preceding theorems using rows instead of columns.
The following theorem is just a summary of some of the work done so far.
Theorem Multiplying any row or column of a matrix by a scalar c ∈ R multiplies the determinant by c. Interchanging two rows or two columns multiplies the determinant by −1. Adding c times one row to another row, or adding c times one column to another column, does not change the determinant. If a matrix has two rows equal or two columns equal, its determinant is zero. More generally, if one row is c times another row, or one column is c times another column, then the determinant is zero.
There are 2n ways to compute |A|: expansion by any row or expansion by any column. Let M_{i,j} be the determinant of the (n − 1) × (n − 1) matrix obtained by removing row i and column j from A. Let C_{i,j} = (−1)^{i+j} M_{i,j}. M_{i,j} and C_{i,j} are called the (i, j) minor and cofactor of A. The following theorem is useful but the proof is a little tedious and should not be done as an exercise.
Theorem For any 1 ≤ i ≤ n, |A| = a_{i,1}C_{i,1} + a_{i,2}C_{i,2} + ⋯ + a_{i,n}C_{i,n}. For any 1 ≤ j ≤ n, |A| = a_{1,j}C_{1,j} + a_{2,j}C_{2,j} + ⋯ + a_{n,j}C_{n,j}. Thus if any row or any column is zero, the determinant is zero.
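Expansion by the first row translates into a short recursive Python sketch (illustrative only):

    def det(A):
        # Determinant by cofactor expansion along the first row.
        n = len(A)
        if n == 1:
            return A[0][0]
        total = 0
        for j in range(n):
            minor = [row[:j] + row[j+1:] for row in A[1:]]   # delete row 1, column j
            total += (-1) ** j * A[0][j] * det(minor)        # a_{1,j} C_{1,j}
        return total

    print(det([[1, 2, -3], [0, 1, 4], [0, 0, 1]]))           # 1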
Exercise Let A = ( a_1 a_2 a_3 ; b_1 b_2 b_3 ; c_1 c_2 c_3 ). The determinant of A is the sum of six terms. Write out the determinant of A expanding by the first column and also expanding by the second row.
Theorem If A is an upper or lower triangular matrix, |A| is the product of the diagonal elements. If A is an elementary matrix of type 2, |A| = −1. If A is an elementary matrix of type 3, |A| = 1.

Proof We will prove the first statement for upper triangular matrices. If A ∈ R_2 is an upper triangular matrix, then its determinant is the product of the diagonal elements. Suppose n > 2 and the theorem is true for matrices in R_{n−1}. Suppose A ∈ R_n is upper triangular. The result follows by expanding by the first column.

An elementary matrix of type 3 is a special type of upper or lower triangular matrix, so its determinant is 1. An elementary matrix of type 2 is obtained from the identity matrix by interchanging two rows or columns, and thus has determinant −1.
Theorem (Determinant by blocks) Suppose A ∈ R_n, B ∈ R_{n,m}, and D ∈ R_m. Then the determinant of ( A B ; O D ) is |A| |D|.

Proof Expand by the first column and use induction on n.
The following remarkable theorem takes some work to prove. We assume it here
without proof. (For the proof, see page 130 of the Appendix.)
Theorem The determinant of the product is the product of the determinants, i.e., if A, B ∈ R_n, |AB| = |A| |B|. Thus |AB| = |BA| and if C is invertible, |C^{-1}AC| = |ACC^{-1}| = |A|.

Corollary If A is a unit in R_n then |A| is a unit in R and |A^{-1}| = |A|^{-1}.

Proof 1 = |I| = |AA^{-1}| = |A| |A^{-1}|.
One of the major goals of this chapter is to prove the converse of the preceding
corollary.
Classical adjoint Suppose R is a commutative ring and A ∈ R_n. The classical adjoint of A is (C_{i,j})^t, i.e., the matrix whose (j, i) term is the (i, j) cofactor. Before we consider the general case, let's examine 2 × 2 matrices.

If A = ( a b ; c d ), then (C_{i,j}) = ( d −c ; −b a ) and so (C_{i,j})^t = ( d −b ; −c a ). Then

    A(C_{i,j})^t = (C_{i,j})^t A = ( |A| 0 ; 0 |A| ) = |A| I.

Thus if |A| is a unit in R, A is invertible and A^{-1} = |A|^{-1}(C_{i,j})^t. In particular, if |A| = 1, A^{-1} = ( d −b ; −c a ).
Here is the general case.
Theorem If R is commutative and A ∈ R_n, then A(C_{i,j})^t = (C_{i,j})^t A = |A| I.

Proof We must show that the diagonal elements of the product A(C_{i,j})^t are all |A| and the other elements are 0. The (s, s) term is the dot product of row s of A with row s of (C_{i,j}) and is thus |A| (computed by expansion by row s). For s ≠ t, the (s, t) term is the dot product of row s of A with row t of (C_{i,j}). Since this is the determinant of a matrix with row s = row t, the (s, t) term is 0. The proof that (C_{i,j})^t A = |A| I is left as an exercise.
We are now ready for one of the most beautiful and useful theorems in all of
mathematics.
Theorem Suppose R is a commutative ring and A ∈ R_n. Then A is a unit in R_n iff |A| is a unit in R. (Thus if R is a field, A is invertible iff |A| ≠ 0̄.) If A is invertible, then A^{-1} = |A|^{-1}(C_{i,j})^t. Thus if |A| = 1, A^{-1} = (C_{i,j})^t, the classical adjoint of A.

Proof This follows immediately from the preceding theorem.
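A Python sketch of the formula (illustrative; the 2 × 2 matrix at the end is a hypothetical example with |A| = 1, so the inverse has entries in Z):

    from fractions import Fraction

    def det(A):
        # Cofactor expansion along the first row.
        if len(A) == 1:
            return A[0][0]
        return sum((-1) ** j * A[0][j] *
                   det([row[:j] + row[j+1:] for row in A[1:]])
                   for j in range(len(A)))

    def adjugate_inverse(A):
        # A^{-1} = |A|^{-1} (C_{i,j})^t, valid whenever |A| is a unit.
        n, d = len(A), det(A)
        inv = [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                minor = [row[:j] + row[j+1:]
                         for k, row in enumerate(A) if k != i]
                inv[j][i] = Fraction((-1) ** (i + j) * det(minor), d)
        return inv

    print(adjugate_inverse([[2, 1], [1, 1]]))   # [[1, -1], [-1, 2]] as Fractions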
Exercise Show that any right inverse of A is also a left inverse. That is, suppose A, B ∈ R_n and AB = I. Show A is invertible with A^{-1} = B, and thus BA = I.
Similarity

Suppose A, B ∈ R_n. B is said to be similar to A if ∃ an invertible C ∈ R_n such that B = C^{-1}AC, i.e., B is similar to A iff B is a conjugate of A.

Theorem B is similar to B.
B is similar to A iff A is similar to B.
If D is similar to B and B is similar to A, then D is similar to A.
“Similarity” is an equivalence relation on R_n.
Proof This is a good exercise using the definition.
Theorem Suppose A and B are similar. Then |A| = |B| and thus A is invertible iff B is invertible.

Proof Suppose B = C^{-1}AC. Then |B| = |C^{-1}AC| = |ACC^{-1}| = |A|.
Trace Suppose A = (a_{i,j}) ∈ R_n. Then the trace is defined by trace(A) = a_{1,1} + a_{2,2} + ⋯ + a_{n,n}. That is, the trace of A is the sum of its diagonal terms.

One of the most useful properties of trace is trace(AB) = trace(BA) whenever AB and BA are defined. For example, suppose A = (a_1, a_2, ..., a_n) and B = (b_1, b_2, ..., b_n)^t. Then AB is the scalar a_1b_1 + ⋯ + a_nb_n while BA is the n × n matrix (b_i a_j). Note that trace(AB) = trace(BA). Here is the theorem in full generality.
Theorem Suppose A ∈ R_{m,n} and B ∈ R_{n,m}. Then AB and BA are square matrices with trace(AB) = trace(BA).

Proof This proof involves a change in the order of summation. By definition,

    trace(AB) = Σ_{1≤i≤m} (a_{i,1}b_{1,i} + ⋯ + a_{i,n}b_{n,i})
              = Σ_{1≤i≤m, 1≤j≤n} a_{i,j}b_{j,i}
              = Σ_{1≤j≤n} (b_{j,1}a_{1,j} + ⋯ + b_{j,m}a_{m,j})
              = trace(BA).
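A quick numerical check of the rectangular case (an illustrative sketch):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-5, 5, size=(2, 3))   # A in R_{2,3}
    B = rng.integers(-5, 5, size=(3, 2))   # B in R_{3,2}

    # AB is 2 x 2 and BA is 3 x 3, yet the traces agree.
    print(np.trace(A @ B), np.trace(B @ A))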
Theorem If A, B ∈ R_n, trace(A + B) = trace(A) + trace(B) and trace(AB) = trace(BA).

Proof The first part of the theorem is immediate, and the second part is a special case of the previous theorem.
Theorem If A and B are similar, then trace(A) = trace(B).

Proof trace(B) = trace(C^{-1}AC) = trace(ACC^{-1}) = trace(A).
Summary Determinant and trace are functions from R_n to R. Determinant is a multiplicative homomorphism and trace is an additive homomorphism. Furthermore |AB| = |BA| and trace(AB) = trace(BA). If A and B are similar, |A| = |B| and trace(A) = trace(B).

Exercise Suppose A ∈ R_n and a ∈ R. Find |aA| and trace(aA).
Characteristic polynomials If A ∈ R_n, the characteristic polynomial CP_A(x) ∈ R[x] is defined by CP_A(x) = |xI − A|. Any λ ∈ R which is a root of CP_A(x) is called a characteristic root of A.

Theorem CP_A(x) = a_0 + a_1x + ⋯ + a_{n−1}x^{n−1} + x^n where trace(A) = −a_{n−1} and |A| = (−1)^n a_0.

Proof This follows from a direct computation of the determinant.
Theorem If A and B are similar, then they have the same characteristic polynomials.

Proof Suppose B = C^{-1}AC. CP_B(x) = |xI − C^{-1}AC| = |C^{-1}(xI − A)C| = |xI − A| = CP_A(x).
Exercise Suppose R is a commutative ring, A = ( a b ; c d ) is a matrix in R_2, and CP_A(x) = a_0 + a_1x + x^2. Find a_0 and a_1 and show that a_0I + a_1A + A^2 = 0, i.e., show A satisfies its characteristic polynomial. In other words, CP_A(A) = 0.
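The identity in this exercise can be checked numerically; the sketch below (a check, not a proof, with a hypothetical matrix) verifies CP_A(A) = 0 for one integer matrix:

    import numpy as np

    A = np.array([[2, 1],
                  [3, 4]])

    a0 = round(np.linalg.det(A))    # a_0 = |A|, here ad - bc = 5
    a1 = -np.trace(A)               # a_1 = -trace(A) = -(a + d) = -6

    # CP_A(x) = a_0 + a_1 x + x^2; substitute the matrix A for x.
    print(a0 * np.eye(2) + a1 * A + A @ A)   # the zero matrix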
Exercise Suppose F is a field and A ∈ F_2. Show the following are equivalent.

1) A^2 = 0.
2) |A| = trace(A) = 0.
3) CP_A(x) = x^2.
4) ∃ an elementary matrix C such that C^{-1}AC is strictly upper triangular.
Note This exercise is a special case of a more general theorem. A square matrix over a field is nilpotent iff all its characteristic roots are 0̄ iff it is similar to a strictly upper triangular matrix. This remarkable result cannot be proved by matrix theory alone, but depends on linear algebra (see pages 93 and 98).
Chapter 5
Linear Algebra
The exalted position held by linear algebra is based upon the subject’s ubiquitous
utility and ease of application. The basic theory is developed here in full generality,
i.e., modules are defined over an arbitrary ring R and not just over a field. The
elementary facts about cosets, quotients, and homomorphisms follow the same pattern as in the chapters on groups and rings. We give a simple proof that if R is a commutative ring and f : R^n → R^n is a surjective R-module homomorphism, then
f is an isomorphism. This shows that finitely generated free R-modules have a well
defined dimension, and simplifies much of the development of linear algebra. It is in
this chapter that the concepts about functions, solutions of equations, matrices, and
generating sets come together in one unified theory.
After the general theory, we restrict our attention to vector spaces, i.e., modules
over a field. The key theorem is that any vector space V has a free basis, and thus
if V is finitely generated, it has a well defined dimension, and incredible as it may
seem, this single integer determines V up to isomorphism. Also any endomorphism
f : V →V may be represented by a matrix, and any change of basis corresponds to
conjugation of that matrix. One of the goals in linear algebra is to select a basis so
that the matrix representing f has a simple form. For example, if f is not injective,
then f may be represented by a matrix whose first column is zero. As another
example, if f is nilpotent, then f may be represented by a strictly upper triangular
matrix. The theorem on Jordan canonical form is not proved in this chapter, and
should not be considered part of this chapter. It is stated here in full generality only
for reference and completeness. The proof is given in the Appendix. This chapter
concludes with the study of real inner product spaces, and with the beautiful theory
relating orthogonal matrices and symmetric matrices.
Definition Suppose R is a ring and M is an additive abelian group. The statement that M is a right R-module means there is a scalar multiplication M × R → M, denoted (a, r) → ar, satisfying

    (a_1 + a_2)r = a_1r + a_2r
    a(r_1 + r_2) = ar_1 + ar_2
    a(r_1r_2) = (ar_1)r_2
    a1̄ = a

for all a, a_1, a_2 ∈ M and r, r_1, r_2 ∈ R.
The statement that M is a left R-module means there is a scalar multiplication R × M → M, denoted (r, m) → rm, satisfying

    r(a_1 + a_2) = ra_1 + ra_2
    (r_1 + r_2)a = r_1a + r_2a
    (r_1r_2)a = r_1(r_2a)
    1̄a = a
Note that the plus sign is used ambiguously, as addition in M and as addition in R.
Notation The fact that M is a right (left) R-module will be denoted by M = M_R (M = _R M). If R is commutative and M = M_R then left scalar multiplication defined by ra = ar makes M into a left R-module. Thus for commutative rings, we may write the scalars on either side.
Convention Unless otherwise stated, the word “R-module” (or sometimes just
“module”) will mean “right R-module”.
Theorem Suppose M is an R-module.

1) If r ∈ R, then f : M → M defined by f(a) = ar is a homomorphism of additive groups. In particular 0̄_M r = 0̄_M.

2) If a ∈ M, a0̄_R = 0̄_M.

3) If a ∈ M and r ∈ R, then (−a)r = −(ar) = a(−r).
Proof This is a good exercise in using the axioms for an R-module.
Submodules If M is an R-module, the statement that a subset N ⊂ M is a submodule means it is a subgroup which is closed under scalar multiplication, i.e., if a ∈ N and r ∈ R, then ar ∈ N. In this case N will be a module because the axioms will be satisfied. Note that 0̄ and M are submodules, called the improper submodules of M.
Theorem Suppose M is an R-module, T is an index set, and for each t ∈ T, N_t is a submodule of M.

1) ∩_{t∈T} N_t is a submodule.

2) If {N_t} is a monotonic collection, ∪_{t∈T} N_t is a submodule.

3) +_{t∈T} N_t = {all finite sums a_1 + ⋯ + a_m : each a_i belongs to some N_t} is a submodule. If T = {1, 2, ..., n}, then this submodule may be written as N_1 + N_2 + ⋯ + N_n = {a_1 + a_2 + ⋯ + a_n : each a_i ∈ N_i}.

Proof We know from page 22 that versions of 1) and 2) hold for subgroups, and in particular for subgroups of additive abelian groups. To finish the proofs it is only necessary to check scalar multiplication, which is immediate. Also the proof of 3) is immediate. Note that if N_1 and N_2 are submodules of M, N_1 + N_2 is the smallest submodule of M containing N_1 ∪ N_2.
Exercise Suppose T is a non-void set, N is an R-module, and N^T is the collection of all functions f : T → N with addition defined by (f + g)(t) = f(t) + g(t), and scalar multiplication defined by (fr)(t) = f(t)r. Show N^T is an R-module. (We know from the last exercise in Chapter 2 that N^T is a group, and so it is only necessary to check scalar multiplication.) This simple fact is quite useful in linear algebra. For example, in 5) of the theorem below, it is stated that Hom_R(M, N) forms an abelian group. So it is only necessary to show that Hom_R(M, N) is a subgroup of N^M. Also in 8) it is only necessary to show that Hom_R(M, N) is a submodule of N^M.
Homomorphisms
Suppose M and N are R-modules. A function f : M → N is a homomorphism (i.e., an R-module homomorphism) provided it is a group homomorphism and if a ∈ M and r ∈ R, f(ar) = f(a)r. On the left, scalar multiplication is in M and on the right it is in N. The basic facts about homomorphisms are listed below. Much of this work has already been done in the chapter on groups (see page 28).
Theorem

1) The zero map M → N is a homomorphism.

2) The identity map I : M → M is a homomorphism.

3) The composition of homomorphisms is a homomorphism.

4) The sum of homomorphisms is a homomorphism. If f, g : M → N are homomorphisms, define (f + g) : M → N by (f + g)(a) = f(a) + g(a). Then f + g is a homomorphism. Also (−f) defined by (−f)(a) = −f(a) is a homomorphism. If h : N → P is a homomorphism, h ◦ (f + g) = (h ◦ f) + (h ◦ g). If k : P → M is a homomorphism, (f + g) ◦ k = (f ◦ k) + (g ◦ k).

5) Hom_R(M, N) = Hom(M_R, N_R), the set of all homomorphisms from M to N, forms an abelian group under addition. Hom_R(M, M), with multiplication defined to be composition, is a ring.

6) If a bijection f : M → N is a homomorphism, then f^{-1} : N → M is also a homomorphism. In this case f and f^{-1} are called isomorphisms. A homomorphism f : M → M is called an endomorphism. An isomorphism f : M → M is called an automorphism. The units of the endomorphism ring Hom_R(M, M) are the automorphisms. Thus the automorphisms on M form a group under composition. We will see later that if M = R^n, Hom_R(R^n, R^n) is just the matrix ring R_n and the automorphisms are merely the invertible matrices.

7) If R is commutative and r ∈ R, then g : M → M defined by g(a) = ar is a homomorphism. Furthermore, if f : M → N is a homomorphism, fr defined by (fr)(a) = f(ar) = f(a)r is a homomorphism.

8) If R is commutative, Hom_R(M, N) is an R-module.

9) Suppose f : M → N is a homomorphism, G ⊂ M is a submodule, and H ⊂ N is a submodule. Then f(G) is a submodule of N and f^{-1}(H) is a submodule of M. In particular, image(f) is a submodule of N and ker(f) = f^{-1}(0̄) is a submodule of M.
Proof This is just a series of observations.
Abelian groups are Z-modules On page 21, it is shown that any additive
group M admits a scalar multiplication by integers, and if M is abelian, the properties
are satisfied to make M a Z-module. Note that this is the only way M can be a Z-
module, because a1 = a, a2 = a + a, etc. Furthermore, if f : M → N is a group
homomorphism of abelian groups, then f is also a Z-module homomorphism.
Summary Additive abelian groups are “the same things” as Z-modules. While
group theory in general is quite separate from linear algebra, the study of additive
abelian groups is a special case of the study of R-modules.
Exercise If R is a subring of a ring T, then T, with scalar multiplication defined
by ring multiplication, is an R-module. In particular, R is a Q-module. If f : Q →R
is a Z-module homomorphism, must f be a Q-module homomorphism?
Homomorphisms on R^n

R^n as an R-module In Chapter 4 it was shown that the additive abelian group R_{m,n} admits a scalar multiplication by elements in R. The properties listed there were exactly those needed to make R_{m,n} an R-module. Of particular importance is the case R^n = R ⊕ ⋯ ⊕ R = R_{n,1}. We begin with the case n = 1.

R as a right R-module Let M = R and define scalar multiplication on the right by ar = a·r. That is, scalar multiplication is just ring multiplication. This makes R a right R-module denoted by R_R (or just R). This is the same as the definition before for R^n when n = 1.
Theorem Suppose N is a subset of R. Then N is a submodule of R_R (_R R) iff N is a right (left) ideal of R.

Proof The definitions are the same except expressed in different language.
Theorem Suppose M = M_R and f, g : R → M are homomorphisms with f(1̄) = g(1̄). Then f = g. If m ∈ M, ∃! homomorphism h : R → M with h(1̄) = m. In other words, Hom_R(R, M) ≈ M.

Proof Suppose f(1̄) = g(1̄). Then f(r) = f(1̄r) = f(1̄)r = g(1̄)r = g(1̄r) = g(r). Given m ∈ M, h : R → M defined by h(r) = mr is a homomorphism. Thus evaluation at 1̄ gives a bijection from Hom_R(R, M) to M, and this bijection is clearly a group isomorphism. If R is commutative, it is an isomorphism of R-modules.
In the case M = R, the above theorem states that multiplication on the left by some m ∈ R defines a right R-module homomorphism from R to R, and every module homomorphism is of this form. The element m should be thought of as a 1 × 1 matrix. We now consider the case where the domain is R^n.
Homomorphisms on R^n Define e_i ∈ R^n to be the column vector with 1̄ in position i and 0̄ elsewhere. Note that any column vector (r_1, ..., r_n)^t can be written uniquely as e_1r_1 + ⋯ + e_nr_n. The sequence {e_1, ..., e_n} is called the canonical free basis or standard basis for R^n.
Theorem Suppose M = M_R and f, g : R^n → M are homomorphisms with f(e_i) = g(e_i) for 1 ≤ i ≤ n. Then f = g. If m_1, m_2, ..., m_n ∈ M, ∃! homomorphism h : R^n → M with h(e_i) = m_i for 1 ≤ i ≤ n. The homomorphism h is defined by h(e_1r_1 + ⋯ + e_nr_n) = m_1r_1 + ⋯ + m_nr_n.
Proof The proof is straightforward. Note this theorem gives a bijection from Hom_R(R^n, M) to M^n = M × M × ⋯ × M, and this bijection is a group isomorphism. We will see later that the product M^n is an R-module with scalar multiplication defined by (m_1, m_2, ..., m_n)r = (m_1r, m_2r, ..., m_nr). If R is commutative so that Hom_R(R^n, M) is an R-module, this theorem gives an R-module isomorphism from Hom_R(R^n, M) to M^n.
This theorem reveals some of the great simplicity of linear algebra. It does not matter how complicated the ring R is, or which R-module M is selected. Any R-module homomorphism from R^n to M is determined by its values on the basis, and any function from that basis to M extends uniquely to a homomorphism from R^n to M.
Exercise Suppose R is a field and f : R_R → M is a non-zero homomorphism. Show f is injective.
Now let’s examine the special case M = R^m and show Hom_R(R^n, R^m) ≈ R_{m,n}.
Theorem Suppose A = (a_{i,j}) ∈ R_{m,n}. Then f : R^n → R^m defined by f(B) = AB is a homomorphism with f(e_i) = column i of A. Conversely, if m_1, ..., m_n ∈ R^m, define A ∈ R_{m,n} to be the matrix with column i = m_i. Then f defined by f(B) = AB is the unique homomorphism from R^n to R^m with f(e_i) = m_i.
Even though this follows easily from the previous theorem and properties of matrices, it is one of the great classical facts of linear algebra. Matrices over R give R-module homomorphisms! Furthermore, addition of matrices corresponds to addition of homomorphisms, and multiplication of matrices corresponds to composition of homomorphisms. These properties are made explicit in the next two theorems.
Theorem If f, g : R^n → R^m are given by matrices A, C ∈ R_{m,n}, then f + g is given by the matrix A + C. Thus Hom_R(R^n, R^m) and R_{m,n} are isomorphic as additive groups. If R is commutative, they are isomorphic as R-modules.
Theorem If f : R^n → R^m is the homomorphism given by A ∈ R_{m,n} and g : R^m → R^p is the homomorphism given by C ∈ R_{p,m}, then g ◦ f : R^n → R^p is given by CA ∈ R_{p,n}. That is, composition of homomorphisms corresponds to multiplication of matrices.

Proof This is just the associative law of matrix multiplication, C(AB) = (CA)B.
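A small Python sketch of this correspondence (illustrative; the matrices are hypothetical examples):

    import numpy as np

    A = np.array([[1, 2, 0],
                  [0, 1, -1]])      # f : R^3 -> R^2, f(B) = AB
    C = np.array([[2, 1],
                  [1, 0],
                  [0, 3]])          # g : R^2 -> R^3, g(B) = CB

    f = lambda B: A @ B
    g = lambda B: C @ B

    B = np.array([1, 2, 3])
    print(np.array_equal(g(f(B)), (C @ A) @ B))   # True: g o f is given by CA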
The previous theorem reveals where matrix multiplication comes from. It is the matrix which represents the composition of the functions. In the case where the domain and range are the same, we have the following elegant corollary.

Corollary Hom_R(R^n, R^n) and R_n are isomorphic as rings. The automorphisms correspond to the invertible matrices.

This corollary shows one way non-commutative rings arise, namely as endomorphism rings. Even if R is commutative, R_n is never commutative unless n = 1.
We now return to the general theory of modules (over some given ring R).
Cosets and Quotient Modules
After seeing quotient groups and quotient rings, quotient modules go through
without a hitch. As before, R is a ring and module means R-module.
Theorem Suppose M is a module and N ⊂ M is a submodule. Since N is a normal subgroup of M, the additive abelian quotient group M/N is defined. Scalar multiplication defined by (a + N)r = (ar + N) is well defined and gives M/N the structure of an R-module. The natural projection π : M → M/N is a surjective homomorphism with kernel N. Furthermore, if f : M → M̄ is a surjective homomorphism with ker(f) = N, then M/N ≈ M̄ (see below).

Proof On the group level, this is all known from Chapter 2. It is only necessary to check the scalar multiplication, which is obvious.
The relationship between quotients and homomorphisms for modules is the same
as for groups and rings, as shown by the next theorem.
Theorem Suppose f : M → M̄ is a homomorphism and N is a submodule of M. If N ⊂ ker(f), then f̄ : (M/N) → M̄ defined by f̄(a + N) = f(a) is a well defined homomorphism making the following diagram commute.

    M ──────f─────> M̄
      \            ↗
      π \        / f̄
         ↘     /
          M/N

Thus defining a homomorphism on a quotient module is the same as defining a homomorphism on the numerator that sends the denominator to 0̄. The image of f̄ is the image of f, and the kernel of f̄ is ker(f)/N. Thus if N = ker(f), f̄ is injective, and thus (M/N) ≈ image(f). Therefore for any homomorphism f, (domain(f)/ker(f)) ≈ image(f).

Proof On the group level this is all known from Chapter 2 (see page 29). It is only necessary to check that f̄ is a module homomorphism, and this is immediate.
Theorem Suppose M is an R-module and K and L are submodules of M.

i) The natural homomorphism K → (K + L)/L is surjective with kernel K ∩ L. Thus K/(K ∩ L) → (K + L)/L is an isomorphism.

ii) Suppose K ⊂ L. The natural homomorphism M/K → M/L is surjective with kernel L/K. Thus (M/K)/(L/K) → M/L is an isomorphism.

Examples These two examples are for the case R = Z.

1) M = Z, K = 3Z, L = 5Z, K ∩ L = 15Z, K + L = Z, and K/(K ∩ L) = 3Z/15Z ≈ Z/5Z = (K + L)/L.

2) M = Z, K = 6Z, L = 3Z (K ⊂ L), and (M/K)/(L/K) = (Z/6Z)/(3Z/6Z) ≈ Z/3Z = M/L.
Products and Coproducts
Infinite products work fine for modules, just as they do for groups and rings. This is stated below in full generality, although the student should think of the finite case. In the finite case something important holds for modules that does not hold for non-abelian groups or rings – namely, the finite product is also a coproduct. This makes the structure of module homomorphisms much more simple. For the finite case we may use either the product or sum notation, i.e., M_1 × M_2 × ⋯ × M_n = M_1 ⊕ M_2 ⊕ ⋯ ⊕ M_n.
Theorem Suppose T is an index set and for each t ∈ T, M_t is an R-module. On the additive abelian group Π_{t∈T} M_t = Π M_t define scalar multiplication by {m_t}r = {m_t r}. Then Π M_t is an R-module and, for each s ∈ T, the natural projection π_s : Π M_t → M_s is a homomorphism. Suppose M is a module. Under the natural 1-1 correspondence from {functions f : M → Π M_t} to {sequences of functions {f_t}_{t∈T} where f_t : M → M_t}, f is a homomorphism iff each f_t is a homomorphism.

Proof We already know from Chapter 2 that f is a group homomorphism iff each f_t is a group homomorphism. Since scalar multiplication is defined coordinatewise, f is a module homomorphism iff each f_t is a module homomorphism.
Definition If T is finite, the coproduct and product are the same module. If T is infinite, the coproduct or sum Σ_{t∈T} M_t = ⊕_{t∈T} M_t = ⊕M_t is the submodule of Π M_t consisting of all sequences {m_t} with only a finite number of non-zero terms. For each s ∈ T, the inclusion homomorphism i_s : M_s → ⊕M_t is defined by i_s(a) = {a_t} where a_t = 0̄ if t ≠ s and a_s = a. Thus each M_s may be considered to be a submodule of ⊕M_t.
Theorem Suppose M is an R-module. There is a 1-1 correspondence from {homomorphisms g : ⊕M_t → M} and {sequences of homomorphisms {g_t}_{t∈T} where g_t : M_t → M}. Given g, g_t is defined by g_t = g ◦ i_t. Given {g_t}, g is defined by g({m_t}) = Σ_t g_t(m_t). Since there are only a finite number of non-zero terms, this sum is well defined.
For T = {1, 2} the product and sum properties are displayed in two commutative diagrams. (In the product diagram, a homomorphism f : M → M_1 ⊕ M_2 corresponds to the pair f_1, f_2 via π_1 ◦ f = f_1 and π_2 ◦ f = f_2; in the sum diagram, a homomorphism g : M_1 ⊕ M_2 → M corresponds to the pair g_1, g_2 via g ◦ i_1 = g_1 and g ◦ i_2 = g_2.)
Theorem For finite T, the 1-1 correspondences in the above theorems actually produce group isomorphisms. If R is commutative, they give isomorphisms of R-modules.

    Hom_R(M, M_1 ⊕ ⋯ ⊕ M_n) ≈ Hom_R(M, M_1) ⊕ ⋯ ⊕ Hom_R(M, M_n)   and
    Hom_R(M_1 ⊕ ⋯ ⊕ M_n, M) ≈ Hom_R(M_1, M) ⊕ ⋯ ⊕ Hom_R(M_n, M).

Proof Let's look at this theorem for products with n = 2. All it says is that if f = (f_1, f_2) and h = (h_1, h_2), then f + h = (f_1 + h_1, f_2 + h_2). If R is commutative, so that the objects are R-modules and not merely additive groups, then the isomorphisms are module isomorphisms. This says merely that fr = (f_1, f_2)r = (f_1r, f_2r).
Exercise Suppose M and N are R-modules. Show that M ⊕ N is isomorphic to N ⊕ M.

Exercise Suppose M and N are R-modules, and A ⊂ M, B ⊂ N are submodules. Show (M ⊕ N)/(A ⊕ B) is isomorphic to (M/A) ⊕ (N/B).
Exercise Suppose R is a commutative ring, M is an R-module, and n ≥ 1. Define a function α : Hom_R(R^n, M) → M^n which is an R-module isomorphism.
Summands
One basic question in algebra is “When does a module split as the sum of two
modules?”. Before defining summand, here are two theorems for background.
Theorem Consider M_1 = M_1 ⊕ 0̄ as a submodule of M_1 ⊕ M_2. Then the projection map π_2 : M_1 ⊕ M_2 → M_2 is a surjective homomorphism with kernel M_1. Thus (M_1 ⊕ M_2)/M_1 is isomorphic to M_2.
This is exactly what you would expect, and the next theorem is almost as intuitive.
Theorem Suppose K and L are submodules of M and f : K ⊕ L → M is the natural homomorphism, f(k, l) = k + l. Then the image of f is K + L and the kernel of f is {(a, −a) : a ∈ K ∩ L}. Thus f is an isomorphism iff K + L = M and K ∩ L = 0̄. In this case we write K ⊕ L = M. This abuse of notation allows us to avoid talking about “internal” and “external” direct sums.

Definition Suppose K is a submodule of M. The statement that K is a summand of M means ∃ a submodule L of M with K ⊕ L = M. According to the previous theorem, this is the same as there exists a submodule L with K + L = M and K ∩ L = 0̄. If such an L exists, it need not be unique, but it will be unique up to isomorphism, because L ≈ M/K. Of course, M and 0̄ are always summands of M.
Exercise Suppose M is a module and K = {(m, m) : m ∈ M} ⊂ M ⊕ M. Show K is a submodule of M ⊕ M which is a summand.

Exercise R is a module over Q, and Q ⊂ R is a submodule. Is Q a summand of R? With the material at hand, this is not an easy question. Later on, it will be easy.
Exercise Answer the following questions about abelian groups.

1) Is 2Z a summand of Z?
2) Is 4Z_8 a summand of Z_8?
3) Is 3Z_12 a summand of Z_12?
4) Suppose n, m > 1. When is nZ_{mn} a summand of Z_{mn}?
Exercise If T is a ring, define the center of T to be the subring {t : ts = st for all s ∈ T}. Let R be a commutative ring and T = R_n. There is an exercise on page 57 to show that the center of T is the subring of scalar matrices. Show R^n is a left T-module and find Hom_T(R^n, R^n).
Independence, Generating Sets, and Free Basis
This section is a generalization and abstraction of the brief section Homomorphisms on R^n. These concepts work fine for an infinite index set T because linear combination means finite linear combination. However, to avoid dizziness, the student should consider the case where T is finite.
Definition Suppose M is an R-module, T is an index set, and for each t ∈ T, s_t ∈ M. Let S be the sequence {s_t}_{t∈T} = {s_t}. The statement that S is dependent means ∃ a finite number of distinct elements t_1, ..., t_n in T, and elements r_1, ..., r_n in R, not all zero, such that the linear combination s_{t_1}r_1 + ⋯ + s_{t_n}r_n = 0̄. Otherwise, S is independent. Note that if some s_t = 0̄, then S is dependent. Also if ∃ distinct elements t_1 and t_2 in T with s_{t_1} = s_{t_2}, then S is dependent.
Let SR be the set of all linear combinations s_{t_1}r_1 + ⋯ + s_{t_n}r_n. SR is a submodule of M called the submodule generated by S. If S is independent and generates M, then S is said to be a basis or free basis for M. In this case any v ∈ M can be written uniquely as a linear combination of elements in S. If ∃ a basis for M, M is said to be a free R-module. The next two theorems are obvious, except for the confusing notation. You might try first the case T = {1, 2, ..., n} and ⊕R_t = R^n.
Theorem For each t ∈ T, let R_t = R_R, and for each c ∈ T, let e_c ∈ ⊕R_t = Σ_{t∈T} R_t be e_c = {r_t} where r_c = 1̄ and r_t = 0̄ if t ≠ c. Then {e_c}_{c∈T} is a basis for ⊕R_t called the canonical basis or standard basis.
Theorem Suppose N is an R-module and M is a free R-module with a basis {s_t}. Then ∃ a 1-1 correspondence from the set of all functions g : {s_t} → N and the set of all homomorphisms f : M → N. Given g, define f by f(s_{t_1}r_1 + ⋯ + s_{t_n}r_n) = g(s_{t_1})r_1 + ⋯ + g(s_{t_n})r_n. Given f, define g by g(s_t) = f(s_t). In other words, f is completely determined by what it does on the basis S, and you are “free” to send the basis any place you wish and extend to a homomorphism.
Recall that we have already had the preceding theorem in the case S is the canonical basis for M = R^n. The next theorem is so basic in linear algebra that it is used without comment. Although the proof is easy, it should be worked carefully.
Theorem Suppose M and N are modules, f : M → N is a homomorphism, and S = {s_t} is a basis for M. Let f(S) be the sequence {f(s_t)} in N.

1) f(S) generates N iff f is surjective.
2) f(S) is independent in N iff f is injective.
3) f(S) is a basis for N iff f is an isomorphism.
4) If h : M → N is a homomorphism, then f = h iff f|S = h|S.
Exercise Let (A_1, ..., A_n) be a sequence of n vectors with each A_i ∈ Z^n. Show this sequence is linearly independent over Z iff it is linearly independent over Q. Is it true the sequence is linearly independent over Z iff it is linearly independent over R? This question is difficult until we learn more linear algebra.
Characterization of Free Modules
It will now be shown that any free R-module is isomorphic to one of the canonical free R-modules.

Theorem An R-module N is free iff ∃ an index set T such that N ≈ Σ_{t∈T} R_t. In particular, N has a finite free basis of n elements iff N ≈ R^n.

Proof If N is isomorphic to a free module, N is certainly free. Now suppose N has a free basis {s_t}. Then the homomorphism f : ⊕R_t → N with f(e_t) = s_t sends the canonical basis for ⊕R_t to the basis for N. By 3) in the preceding theorem, f is an isomorphism.
Exercise Suppose R is a commutative ring, A ∈ R_n, and the homomorphism f : R^n → R^n defined by f(B) = AB is surjective. Show f is an isomorphism, i.e., show A is invertible. This is a key theorem in linear algebra, although it is usually stated only for the case where R is a field. Use the fact that {e_1, ..., e_n} is a free basis for R^n.
The next exercise is routine, but still informative.
Exercise Let R = Z, A = ( 2 1 0 ; 3 2 −5 ), and f : Z^3 → Z^2 be the group homomorphism defined by A. Find a non-trivial linear combination of the columns of A which is 0. Also find a non-zero element of kernel(f).
The next exercise is to relate properties of R as an R-module to properties of R
as a ring.
Exercise Suppose R is a commutative ring and v ∈ R, v ≠ 0̄.

1) v is independent iff v is ______.
2) v is a basis for R iff v generates R iff v is ______.

Note that 2) here is essentially the first exercise for the case n = 1. That is, if f : R → R is a surjective R-module homomorphism, then f is an isomorphism.
Relating these concepts to matrices
The theorem stated below gives a summary of results we have already had. It shows that certain concepts about matrices, linear independence, injective homomorphisms, and solutions of equations are all the same — they are merely stated in different language. Suppose A ∈ R_{m,n} and f : R^n → R^m is the homomorphism associated with A, i.e., f(B) = AB. Let v_1, ..., v_n ∈ R^m be the columns of A, i.e., f(e_i) = v_i = column i of A. Let λ = (λ_1, ..., λ_n)^t represent an element of R^n and C = (c_1, ..., c_m)^t represent an element of R^m.
Theorem

1) f(λ) is the linear combination of the columns of A: f(λ) = f(e_1λ_1 + ⋯ + e_nλ_n) = v_1λ_1 + ⋯ + v_nλ_n.

2) {v_1, ..., v_n} generates R^m iff f is surjective iff (for any C ∈ R^m, AX = C has a solution).

3) {v_1, ..., v_n} is independent iff f is injective iff AX = 0̄ has a unique solution iff (∃ C ∈ R^m such that AX = C has a unique solution).

4) {v_1, ..., v_n} is a basis for R^m iff f is an isomorphism iff (for any C ∈ R^m, AX = C has a unique solution).
Relating these concepts to square matrices
We now look at the preceding theorem in the special case where n = m and R is a commutative ring. So far in this chapter we have just been cataloging. Now we prove something more substantial, namely that if f : R^n → R^n is surjective, then f is injective. Later on we will prove that if R is a field, injective implies surjective.
Theorem Suppose R is a commutative ring, A ∈ R_n, and f : R^n → R^n is defined by f(B) = AB. Let v_1, ..., v_n ∈ R^n be the columns of A, and w_1, ..., w_n ∈ R^n = R_{1,n} be the rows of A. Then the following are equivalent.

1) f is an automorphism.
2) A is invertible, i.e., |A| is a unit in R.
3) {v_1, ..., v_n} is a basis for R^n.
4) {v_1, ..., v_n} generates R^n.
5) f is surjective.
2^t) A^t is invertible, i.e., |A^t| is a unit in R.
3^t) {w_1, ..., w_n} is a basis for R^n.
4^t) {w_1, ..., w_n} generates R^n.
Proof Suppose 5) is true and show 2). Since f is onto, ∃ u_1, ..., u_n ∈ R^n with f(u_i) = e_i. Let g : R^n → R^n be the homomorphism satisfying g(e_i) = u_i. Then f ◦ g is the identity. Now g comes from some matrix D and thus AD = I. This shows that A has a right inverse and is thus invertible. Recall that the proof of this fact uses determinant, which requires that R be commutative.

We already know the first three properties are equivalent, 4) and 5) are equivalent, and 3) implies 4). Thus the first five are equivalent. Furthermore, applying this result to A^t shows that the last three properties are equivalent to each other. Since |A| = |A^t|, 2) and 2^t) are equivalent.
Uniqueness of Dimension
There exists a ring R with R^2 ≈ R^3 as R-modules, but this is of little interest. If
R is commutative, this is impossible, as shown below. First we make a convention.
Convention For the remainder of this chapter, R will be a commutative ring.
Theorem If f : R^m → R^n is a surjective R-module homomorphism, then m ≥ n.

Proof Suppose k = n − m is positive. Define h : (R^m ⊕ R^k = R^n) → R^n by h(u, v) = f(u). Then h is a surjective homomorphism, and by the previous section, also injective. This is a contradiction.
Corollary If f : R^m → R^n is an isomorphism, then m = n.

Proof Each of f and f^{-1} is surjective, so m = n by the previous theorem.

Corollary If {v_1, ..., v_m} generates R^n, then m ≥ n.

Proof The hypothesis implies there is a surjective homomorphism R^m → R^n. So this follows from the first theorem.
Lemma Suppose M is a f.g. module (i.e., a finitely generated R-module). Then if M has a basis, that basis is finite.
Proof Suppose U ⊂ M is a finite generating set and S is a basis. Any element of U is a finite linear combination of elements of S, so some finite subsequence S′ of S generates M. If S had an element s not in S′, then s would be a linear combination of the elements of S′, contradicting the independence of S. Thus S = S′ is finite.
Theorem Suppose M is a f.g. module. If M has a basis, that basis is finite and any other basis has the same number of elements. This number is denoted by dim(M), the dimension of M.

Proof By the previous lemma, any basis for M must be finite. M has a basis of n elements iff M ≈ R^n. The result follows because R^n ≈ R^m iff n = m.
Change of Basis
Before changing basis, we recall what a basis is. Previously we defined generating, independence, and basis for sequences, not for collections. For the concept of generating it matters not whether you use sequences or collections, but for independence and basis, you must use sequences. Consider the columns of the real matrix A = ( 2 3 2 ; 1 4 1 ). If we consider the column vectors of A as a collection, there are only two of them, yet we certainly don't wish to say the columns of A form a basis for R^2. In a set or collection, there is no concept of repetition. In order to make sense, we must consider the columns of A as an ordered triple of vectors. When we originally defined basis, we could have called it “indexed free basis” or even “ordered free basis”.
Two sequences cannot begin to be equal unless they have the same index set. We will follow the classical convention that an index set with n elements must be {1, 2, ..., n}, and thus a basis for M with n elements is a sequence S = {u_1, ..., u_n}, or if you wish, S = (u_1, ..., u_n) ∈ M^n. Suppose M is an R-module with a basis of n elements. Recall there is a bijection α : Hom_R(R^n, M) → M^n defined by α(h) = (h(e_1), ..., h(e_n)). Now h : R^n → M is an isomorphism iff α(h) is a basis for M.

The point of all this is that selecting a basis of n elements for M is the same as selecting an isomorphism from R^n to M, and from this viewpoint, change of basis can be displayed by the diagram below.
Endomorphisms on R^n are represented by square matrices, and thus have a determinant and trace. Now suppose M is a f.g. free module and f : M → M is a homomorphism. In order to represent f by a matrix, we must select a basis for M (i.e., an isomorphism with R^n). We will show that this matrix is well defined up to similarity, and thus the determinant, trace, and characteristic polynomial of f are well defined.
Definition Suppose M is a free module, S = {u_1, ..., u_n} is a basis for M, and f : M → M is a homomorphism. The matrix A = (a_{i,j}) ∈ R_n of f w.r.t. the basis S is defined by f(u_i) = u_1a_{1,i} + ⋯ + u_na_{n,i}. (Note that if M = R^n and u_i = e_i, A is the usual matrix associated with f.)
Theorem Suppose T = {v_1, ..., v_n} is another basis for M and B ∈ R_n is the matrix of f w.r.t. T. Define C = (c_{i,j}) ∈ R_n by v_i = u_1c_{1,i} + ⋯ + u_nc_{n,i}. Then C is invertible and B = C^{-1}AC, i.e., A and B are similar. Therefore |A| = |B|, trace(A) = trace(B), and A and B have the same characteristic polynomial (see Chapter 4).

Conversely, suppose C = (c_{i,j}) ∈ R_n is invertible. Define T = {v_1, ..., v_n} by v_i = u_1c_{1,i} + ⋯ + u_nc_{n,i}. Then T is a basis for M and the matrix of f w.r.t. T is B = C^{-1}AC. In other words, conjugation of matrices corresponds to change of basis.
Proof The proof follows by seeing that the following diagram is commutative. (In the diagram, the isomorphisms R^n → M determined by the two bases send e_i → u_i and e_i → v_i, the map C carries one copy of R^n to the other, f : M → M sits in the middle, and A and B are the induced endomorphisms on the two copies of R^n.)
The diagram also explains what it means for A to be the matrix of f w.r.t. the basis S. Let h : R^n → M be the isomorphism with h(e_i) = u_i for 1 ≤ i ≤ n. Then the matrix A ∈ R_n is the one determined by the endomorphism h^{-1} ◦ f ◦ h : R^n → R^n. In other words, column i of A is h^{-1}(f(h(e_i))).
An important special case is where M = R^n and f : R^n → R^n is given by some matrix W. Then h is given by the matrix U whose i-th column is u_i, and A = U^{-1}WU. In other words, W represents f w.r.t. the standard basis, and U^{-1}WU represents f w.r.t. the basis {u_1, ..., u_n}.
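This special case is easy to experiment with; the following Python sketch (with a hypothetical W and basis matrix U) conjugates W by U and checks that determinant and trace survive the change of basis:

    import numpy as np

    W = np.array([[1., 2.],
                  [0., 3.]])         # f w.r.t. the standard basis
    U = np.array([[1., 1.],
                  [0., 1.]])         # columns u_1, u_2 form the new basis

    A = np.linalg.inv(U) @ W @ U     # f w.r.t. the basis {u_1, u_2}
    print(A)
    print(np.trace(A), np.linalg.det(A))   # 4.0 and 3.0, the same as for W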
Definition Suppose M is a f.g. free module and f : M → M is a homomorphism. Define |f| to be |A|, trace(f) to be trace(A), and CP_f(x) to be CP_A(x), where A is the matrix of f w.r.t. some basis. By the previous theorem, all three are well defined, i.e., do not depend upon the choice of basis.
Exercise Let R = Z and f : Z^2 → Z^2 be defined by f(D) = ( 3 3 ; 0 −1 ) D. Find the matrix of f w.r.t. the basis {(2, 1)^t, (3, 1)^t}.
Exercise Let L ⊂ R^2 be the line L = {(r, 2r) : r ∈ R}. Show there is one and only one homomorphism f : R^2 → R^2 which is the identity on L and has f(−1, 1) = (1, −1). Find the matrix A ∈ R_2 which represents f with respect to the basis {(1, 2), (−1, 1)}. Find the determinant, trace, and characteristic polynomial of f. Also find the matrix B ∈ R_2 which represents f with respect to the standard basis. Finally, find an invertible matrix C ∈ R_2 with B = C^{-1}AC.
Vector Spaces
So far in this chapter we have been developing the theory of linear algebra in
general. The previous theorem, for example, holds for any commutative ring R, but
it must be assumed that the module M is free. Endomorphisms in general will not
have a determinant or trace. We now focus on the case where R is a field, and
show that in this case, every R-module is free. Thus any finitely generated R-module
will have a well defined dimension, and endomorphisms on it will have well defined
determinant, trace, and characteristic polynomial.
In this section, F is a field. F-modules may also be called vector spaces and
F-module homomorphisms may also be called linear transformations.
Theorem Suppose M is an F-module and v ∈ M. Then v ≠ 0̄ iff v is independent. That is, if v ∈ M and r ∈ F, vr = 0̄ implies v = 0̄ or r = 0̄.

Proof Suppose vr = 0̄ and r ≠ 0̄. Then 0̄ = (vr)r^{-1} = v1̄ = v.
Theorem Suppose M ≠ 0̄ is an F-module and v ∈ M. Then v generates M iff v is a basis for M. Furthermore, if these conditions hold, then M ≈ F_F, any non-zero element of M is a basis, and any two elements of M are dependent.

Proof Suppose v generates M. Then v ≠ 0̄ and is thus independent by the previous theorem. In this case M ≈ F, and any non-zero element of F is a basis, and any two elements of F are dependent.
Theorem Suppose M ≠ 0̄ is a finitely generated F-module. If S = {v_1, ..., v_m} generates M, then any maximal independent subsequence of S is a basis for M. Thus any finite independent sequence can be extended to a basis. In particular, M has a finite free basis, and thus is a free F-module.

Proof Suppose, for notational convenience, that {v_1, ..., v_n} is a maximal independent subsequence of S, and n < i ≤ m. It must be shown that v_i is a linear combination of {v_1, ..., v_n}. Since {v_1, ..., v_n, v_i} is dependent, ∃ r_1, ..., r_n, r_i not all zero, such that v_1r_1 + ⋯ + v_nr_n + v_ir_i = 0̄. Then r_i ≠ 0̄ and v_i = −(v_1r_1 + ⋯ + v_nr_n)r_i^{-1}. Thus {v_1, ..., v_n} generates S and thus all of M. Now suppose T is a finite independent sequence. T may be extended to a finite generating sequence, and inside that sequence it may be extended to a maximal independent sequence. Thus T extends to a basis.
After so many routine theorems, it is nice to have one with real power. It not
only says any finite independent sequence can be extended to a basis, but it can be
extended to a basis inside any finite generating set containing it. This is one of the
theorems that makes linear algebra tick. The key hypothesis here is that the ring
is a field. If R = Z, then Z is a free module over itself, and the element 2 of Z is
independent. However it certainly cannot be extended to a basis. Also the finiteness
hypothesis in this theorem is only for convenience, as will be seen momentarily.
Since F is a commutative ring, any two bases of M must have the same number
of elements, and thus the dimension of M is well defined.
Theorem Suppose M is an F-module of dimension n, and {v_1, ..., v_m} is an independent sequence in M. Then m ≤ n and if m = n, {v_1, ..., v_m} is a basis.

Proof {v_1, ..., v_m} extends to a basis with n elements.
The next theorem is just a collection of observations.
Theorem Suppose M and N are finitely generated F-modules.
1) M ≈ F^n iff dim(M) = n.
2) M ≈ N iff dim(M) = dim(N).
3) F^m ≈ F^n iff n = m.
4) dim(M ⊕ N) = dim(M) + dim(N).
Here is the basic theorem in full generality.
Theorem Suppose M ≠ 0̄ is an F-module and S = {v_t}_{t∈T} generates M.

1) Any maximal independent subsequence of S is a basis for M.

2) Any independent subsequence of S may be extended to a maximal independent subsequence of S, and thus to a basis for M.

3) Any independent subsequence of M can be extended to a basis for M. In particular, M has a free basis, and thus is a free F-module.

Proof The proof of 1) is the same as in the case where S is finite. Part 2) will follow from the Hausdorff Maximality Principle. An independent subsequence of S is contained in a maximal monotonic tower of independent subsequences. The union of these independent subsequences is still independent, and so the result follows. Part 3) follows from 2) because an independent sequence can always be extended to a generating sequence.
Theorem Suppose M is an F-module and K ⊂ M is a submodule.
1) K is a summand of M, i.e., ∃ a submodule L of M with K ⊕L = M.
2) If M is f.g., then dim(K) ≤ dim(M) and K = M iff dim(K) = dim(M).
Proof Let T be a basis for K. Extend T to a basis S for M. Then S−T generates
a submodule L with K ⊕L = M. Part 2) follows from 1).
Corollary Q is a summand of R. In other words, ∃ a Q-submodule V ⊂ R
with Q⊕V = R as Q-modules. (See exercise on page 77.)
Proof Q is a field, R is a Q-module, and Q is a submodule of R.
Corollary Suppose M is a f.g. F-module, W is an F-module, and f : M → W
is a homomorphism. Then dim(M) = dim(ker(f)) + dim(image(f)).
88 Linear Algebra Chapter 5
Proof Let K = ker(f) and L ⊂ M be a submodule with K ⊕ L = M. Then f|L : L → image(f) is an isomorphism.
Exercise Suppose R is a domain with the property that, for R-modules, every
submodule is a summand. Show R is a field.
Exercise Find a free Z-module which has a generating set containing no basis.
Exercise The real vector space R^2 is generated by the sequence S = {(π, 0), (2, 1), (3, 2)}. Show there are three maximal independent subsequences of S, and each is a basis for R^2.

The real vector space R^3 is generated by S = {(1, 1, 2), (1, 2, 1), (3, 4, 5), (1, 2, 0)}. Show there are three maximal independent subsequences of S and each is a basis for R^3. You may use determinant.
Square matrices over fields
This theorem is just a summary of what we have for square matrices over fields.
Theorem Suppose A ∈ F_n and f : F^n → F^n is defined by f(B) = AB. Let v_1, ..., v_n ∈ F^n be the columns of A, and w_1, ..., w_n ∈ F^n = F_{1,n} be the rows of A. Then the following are equivalent.

1) {v_1, ..., v_n} is independent, i.e., f is injective.
2) {v_1, ..., v_n} is a basis for F^n, i.e., f is an automorphism, i.e., A is invertible, i.e., |A| ≠ 0̄.
3) {v_1, ..., v_n} generates F^n, i.e., f is surjective.
1^t) {w_1, ..., w_n} is independent.
2^t) {w_1, ..., w_n} is a basis for F^n, i.e., A^t is invertible, i.e., |A^t| ≠ 0̄.
3^t) {w_1, ..., w_n} generates F^n.
Proof Except for 1) and 1^t), this theorem holds for any commutative ring R. (See the section Relating these concepts to square matrices.) Parts 1) and 1^t) follow from the preceding section.
Exercise Add to this theorem more equivalent statements in terms of solutions
of n equations in n unknowns.
Overview Suppose each of X and Y is a set with n elements and f : X →Y is a
function. By the pigeonhole principle, f is injective iff f is bijective iff f is surjective.
Now suppose each of U and V is a vector space of dimension n and f : U → V is a
linear transformation. It follows from the work done so far that f is injective iff f
is bijective iff f is surjective. This shows some of the simple and definitive nature of
linear algebra.
Exercise Let A = (A_1, ..., A_n) be an n × n matrix over Z with column i = A_i ∈ Z^n. Let f : Z^n → Z^n be defined by f(B) = AB and f̄ : R^n → R^n be defined by f̄(C) = AC. Show the following are equivalent. (See the exercise on page 79.)

1) f : Z^n → Z^n is injective.
2) The sequence (A_1, ..., A_n) is linearly independent over Z.
3) |A| ≠ 0.
4) f̄ : R^n → R^n is injective.
5) The sequence (A_1, ..., A_n) is linearly independent over R.
Rank of a matrix Suppose A ∈ F_{m,n}. The row (column) rank of A is defined to be the dimension of the submodule of F^n (F^m) generated by the rows (columns) of A.
Theorem If C ∈ F_m and D ∈ F_n are invertible, then the row (column) rank of A is the same as the row (column) rank of CAD.

Proof Suppose f : F^n → F^m is defined by f(B) = AB. Each column of A is a vector in the range F^m, and if B ∈ F^n, f(B) is a linear combination of those vectors.
Thus the image of f is the submodule of F^m generated by the columns of A, and its dimension is the rank of f. This dimension is the same as the dimension of the image of g ◦ f ◦ h : F^n → F^m, where h is any automorphism on F^n and g is any automorphism on F^m. This proves the theorem for column rank. The theorem for row rank follows using transpose.
Theorem If A ∈ F_{m,n}, the row rank and the column rank of A are equal. This
number is called the rank of A and is ≤ min{m, n}.
Proof By the theorem above, elementary row and column operations change
neither the row rank nor the column rank. By row and column operations, A may be
changed to a matrix H where h_{1,1} = ⋯ = h_{t,t} = 1̄ and all other entries are 0̄ (see the
first exercise on page 59). Thus row rank = t = column rank.
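The equality of row rank and column rank is easy to observe numerically. A small
sketch, assuming Python with the numpy library:

    import numpy as np

    A = np.array([[1., 2., 3., 4.],
                  [2., 4., 6., 8.],
                  [1., 0., 1., 0.]])      # a 3x4 matrix
    print(np.linalg.matrix_rank(A))       # column rank: 2
    print(np.linalg.matrix_rank(A.T))     # row rank: also 2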
Exercise Suppose A has rank t. Show that it is possible to select t rows and t
columns of A such that the determined t×t matrix is invertible. Show that the rank
of A is the largest integer t such that this is possible.
Exercise Suppose A ∈ F_{m,n} has rank t. What is the dimension of the solution
set of AX = 0̄?
Definition Suppose M is a finite dimensional vector space over a field F, and
f : M → M is an endomorphism. The rank of f is defined to be the dimension of
the image of f. It follows from the work above that this is the same as the rank of
any matrix representing f.
Geometric Interpretation of Determinant
Suppose V ⊂ R^n is some nice subset. For example, if n = 2, V might be the
interior of a square or circle. There is a concept of the n-dimensional volume of V.
For n = 1, it is length. For n = 2, it is area, and for n = 3 it is “ordinary volume”.
Suppose A ∈ R_n and f : R^n → R^n is the homomorphism given by A. The volume of
V does not change under translation, i.e., V and V + p have the same volume. Thus
f(V) and f(V + p) = f(V) + f(p) have the same volume. In street language, the next
theorem says that “f multiplies volume by the absolute value of its determinant”.
Theorem The n-dimensional volume of f(V) is ±|A|·(the n-dimensional volume
of V). Thus if |A| = ±1, f preserves volume.
Proof If |A| = 0, image(f) has dimension < n and thus f(V) has n-dimensional
volume 0. If |A| ≠ 0 then A is the product of elementary matrices (see page 59)
and for elementary matrices, the theorem is obvious. The result follows because the
determinant of the composition is the product of the determinants.
Corollary If P is the n-dimensional parallelepiped determined by the columns
v_1, .., v_n of A, then the n-dimensional volume of P is ±|A|.
Proof Let V = [0, 1] × ⋯ × [0, 1] = {e_1t_1 + ⋯ + e_nt_n : 0 ≤ t_i ≤ 1}. Then
P = f(V) = {v_1t_1 + ⋯ + v_nt_n : 0 ≤ t_i ≤ 1}.
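The corollary can be checked numerically. Here is a Monte Carlo sketch (assuming
Python with numpy) estimating the area of the parallelogram P = f([0,1]^2) for one
choice of A:

    import numpy as np

    A = np.array([[2., 1.],
                  [0., 3.]])                 # |A| = 6
    rng = np.random.default_rng(0)
    # P lies in the box [0,3] x [0,3]; sample the box and count the points
    # whose preimage under f lies in the unit square.
    pts = rng.random((200000, 2)) * 3.0
    pre = np.linalg.solve(A, pts.T).T
    inside = np.all((pre >= 0.0) & (pre <= 1.0), axis=1)
    print(inside.mean() * 9.0)               # approximately 6
    print(abs(np.linalg.det(A)))             # exactly 6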
Linear functions approximate differentiable functions locally
We continue with the special case F = R. Linear functions arise naturally in
business, science, and mathematics. However this is not the only reason that linear
algebra is so useful. It is a central fact that smooth phenomena may be approximated
locally by linear phenomena. Without this great simplification, the world
of technology as we know it today would not exist. Of course, linear transformations
send the origin to the origin, so they must be adjusted by a translation. As
a simple example, suppose h : R → R is differentiable and p is a real number. Let
f : R → R be the linear transformation f(x) = h′(p)x. Then h is approximated near
p by g(x) = h(p) + f(x − p) = h(p) + h′(p)(x − p).
Now suppose V ⊂ R^2 is some nice subset and h = (h_1, h_2) : V → R^2 is injective
and differentiable. Define the Jacobian by

    J(h)(x, y) = ( ∂h_1/∂x   ∂h_1/∂y )
                 ( ∂h_2/∂x   ∂h_2/∂y )

and for each (x, y) ∈ V, let f(x, y) : R^2 → R^2 be the homomorphism defined by
J(h)(x, y). Then for any (p_1, p_2) ∈ V, h is approximated near (p_1, p_2) (after
translation) by f(p_1, p_2). The area of V is ∫∫_V 1 dxdy. From the previous section
we know that any homomorphism f multiplies area by |f|. The student may now
understand the following theorem from calculus. (Note that if h is the restriction of
a linear transformation from R^2 to R^2, this theorem is immediate from the previous
section.)

Theorem Suppose the determinant of J(h)(x, y) is non-negative for each
(x, y) ∈ V. Then the area of h(V) is ∫∫_V |J(h)| dxdy.
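This theorem can be illustrated numerically with the polar coordinate map. A
sketch assuming Python with numpy:

    import numpy as np

    # Check area(h(V)) = integral of |J(h)| over V for the polar map
    # h(r, t) = (r cos t, r sin t) on V = [0,1] x [0,2*pi), where |J(h)| = r.
    # h(V) is the unit disk, whose area is pi.
    n = 1000
    r = (np.arange(n) + 0.5) / n             # midpoint quadrature on [0, 1]
    integral = (r.sum() / n) * (2 * np.pi)   # integral of r dr dt over V
    print(integral, np.pi)                   # both approximately 3.14159...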
The Transpose Principle
We now return to the case where F is a field (of arbitrary characteristic). F-
modules may also be called vector spaces and submodules may be called subspaces.
The study of R-modules in general is important and complex. However the study of
F-modules is short and simple – every vector space is free and every subspace is a
summand. The core of classical linear algebra is not the study of vector spaces, but
the study of homomorphisms, and in particular, of endomorphisms. One goal is to
show that if f : V → V is a homomorphism with some given property, there exists
a basis of V so that the matrix representing f displays that property in a prominent
manner. The next theorem is an illustration of this.
Theorem Let F be a field and n be a positive integer.

1) Suppose V is an n-dimensional vector space and f : V → V is a
   homomorphism with |f| = 0̄. Then ∃ a basis of V such that the matrix
   representing f has its first row zero.
2) Suppose A ∈ F_n has |A| = 0̄. Then ∃ an invertible matrix C such that
   C^{-1}AC has its first row zero.
3) Suppose V is an n-dimensional vector space and f : V → V is a
   homomorphism with |f| = 0̄. Then ∃ a basis of V such that the matrix
   representing f has its first column zero.
4) Suppose A ∈ F_n has |A| = 0̄. Then ∃ an invertible matrix D such that
   D^{-1}AD has its first column zero.
We first wish to show that these 4 statements are equivalent. We know that
1) and 2) are equivalent and also that 3) and 4) are equivalent, because change of
basis corresponds to conjugation of the matrix. Now suppose 2) is true and show
4) is true. Suppose |A| = 0̄. Then |A^t| = 0̄ and by 2) ∃ C such that C^{-1}A^tC has
first row zero. Thus (C^{-1}A^tC)^t = C^tA(C^t)^{-1} has first column zero. The result
follows by defining D = (C^t)^{-1}. Also 4) implies 2).
This is an example of the transpose principle. Loosely stated, it is that theorems
about change of basis correspond to theorems about conjugation of matrices and
theorems about the rows of a matrix correspond to theorems about the columns of a
matrix, using transpose. In the remainder of this chapter, this will be used without
further comment.
Proof of the theorem We are free to select any of the 4 parts, and we select
part 3). Since |f| = 0̄, f is not injective and ∃ a non-zero v_1 ∈ V with f(v_1) = 0̄.
Extend v_1 to a basis {v_1, .., v_n}. Then the matrix of f w.r.t. this basis has first
column zero.
Exercise Let

    A = ( 3π  6 )
        ( 2π  4 ).

Find an invertible matrix C ∈ R_2 so that C^{-1}AC has first row zero. Also let

    A = ( 0  0  0 )
        ( 1  3  4 )
        ( 2  1  4 )

and find an invertible matrix D ∈ R_3 so that D^{-1}AD has first column zero.
Exercise Suppose M is an n-dimensional vector space over a field F, k is an
integer with 0 < k < n, and f : M → M is an endomorphism of rank k. Show
there is a basis for M so that the matrix representing f has its first n −k rows zero.
Also show there is a basis for M so that the matrix representing f has its first n −k
columns zero. Do not use the transpose principle.
Nilpotent Homomorphisms
In this section it is shown that an endomorphism f is nilpotent iff all of its
characteristic roots are 0̄ iff it may be represented by a strictly upper triangular
matrix.
Definition An endomorphism f : V → V is nilpotent if ∃ m with f^m = 0̄. Any
f represented by a strictly upper triangular matrix is nilpotent (see page 56).
Theorem Suppose V is an n-dimensional vector space and f : V → V is a
nilpotent homomorphism. Then f^n = 0̄ and ∃ a basis of V such that the matrix
representing f w.r.t. this basis is strictly upper triangular. Thus the characteristic
polynomial of f is CP_f(x) = x^n.
Proof Suppose f ≠ 0̄ is nilpotent. Let t be the largest positive integer with
f^t ≠ 0̄. Then f^t(V) ⊂ f^{t−1}(V) ⊂ ⋯ ⊂ f(V) ⊂ V. Since f is nilpotent, all of these
inclusions are proper. Therefore t < n and f^n = 0̄. Construct a basis for V by
starting with a basis for f^t(V), extending it to a basis for f^{t−1}(V), etc. Then the
matrix of f w.r.t. this basis is strictly upper triangular.
Note To obtain a matrix which is strictly lower triangular, reverse the order of
the basis.
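A small numeric illustration (a sketch assuming Python with numpy) of a strictly
triangular, hence nilpotent, matrix:

    import numpy as np

    # A strictly lower triangular 3x3 matrix N satisfies N^3 = 0
    # and has characteristic polynomial x^3.
    N = np.array([[0, 0, 0],
                  [1, 0, 0],
                  [2, 3, 0]])
    print(np.linalg.matrix_power(N, 3))   # the zero matrix
    print(np.poly(N))                     # [1, 0, 0, 0], i.e. CP_N(x) = x^3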
Exercise Use the transpose principle to write 3 other versions of this theorem.
Theorem Suppose V is an n-dimensional vector space and f : V → V is a
homomorphism. Then f is nilpotent iff CP_f(x) = x^n. (See the exercise at the end of
Chapter 4.)
Proof Suppose CP_f(x) = x^n. For n = 1 this implies f = 0̄, so suppose n > 1.
Since the constant term of CP_f(x) is 0̄, the determinant of f is 0̄. Thus ∃ a basis
of V such that the matrix A representing f has its first column zero. Let B ∈ F_{n−1}
be the matrix obtained from A by removing its first row and first column. Now
CP_A(x) = x^n = x·CP_B(x). Thus CP_B(x) = x^{n−1} and by induction on n, B is
nilpotent and so ∃ C such that C^{-1}BC is strictly upper triangular. Then

    ( 1    0    ) ( 0    ∗ ) ( 1   0 )     ( 0       ∗     )
    ( 0  C^{-1} ) ( 0    B ) ( 0   C )  =  ( 0   C^{-1}BC  )

is strictly upper triangular.
Exercise Suppose F is a field, A ∈ F_3 is a lower triangular matrix of rank 2,
and

    B = ( 0  0  0 )
        ( 1  0  0 )
        ( 0  1  0 ).

Using conjugation by elementary matrices, show there is an
invertible matrix C so that C^{-1}AC = B. Now suppose V is a 3-dimensional vector
space and f : V → V is a nilpotent endomorphism of rank 2. We know f can be
represented by a lower triangular matrix. Show there is a basis {v_1, v_2, v_3} for V so
that B is the matrix representing f. Also show that f(v_1) = v_2, f(v_2) = v_3, and
f(v_3) = 0̄. In other words, there is a basis for V of the form {v, f(v), f^2(v)} with
f^3(v) = 0̄.
Exercise Suppose V is a 3-dimensional vector space and f : V → V is a nilpotent
endomorphism of rank 1. Show there is a basis for V so that the matrix representing
f is

    ( 0  0  0 )
    ( 1  0  0 )
    ( 0  0  0 ).
Eigenvalues
Our standing hypothesis is that V is an n-dimensional vector space over a field F
and f : V →V is a homomorphism.
Definition An element λ ∈ F is an eigenvalue of f if ∃ a non-zero v ∈ V with
f(v) = λv. Any such v is called an eigenvector. E_λ ⊂ V is defined to be the set of
all eigenvectors for λ (plus 0̄). Note that E_λ = ker(λI − f) is a subspace of V. The
next theorem shows the eigenvalues of f are just the characteristic roots of f.
Theorem If λ ∈ F then the following are equivalent.

1) λ is an eigenvalue of f, i.e., (λI − f) : V → V is not injective.
2) |λI − f| = 0̄.
3) λ is a characteristic root of f, i.e., a root of the characteristic
   polynomial CP_f(x) = |xI − A|, where A is any matrix representing f.

Proof It is immediate that 1) and 2) are equivalent, so let’s show 2) and 3)
are equivalent. The evaluation map F[x] → F which sends h(x) to h(λ) is a ring
homomorphism (see theorem on page 47). So evaluating (xI − A) at x = λ and
taking determinant gives the same result as taking the determinant of (xI − A) and
evaluating at x = λ. Thus 2) and 3) are equivalent.
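The equivalence is easy to observe numerically. A sketch assuming Python with
numpy (np.poly returns the coefficients of the characteristic polynomial):

    import numpy as np

    A = np.array([[4., 1.],
                  [2., 3.]])
    coeffs = np.poly(A)             # CP_A(x) = x^2 - 7x + 10
    print(np.roots(coeffs))         # characteristic roots: 5 and 2
    print(np.linalg.eigvals(A))     # eigenvalues: 5 and 2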
The nicest thing you can say about a matrix is that it is similar to a diagonal
matrix. Here is one case where that happens.
Theorem Suppose λ_1, .., λ_k are distinct eigenvalues of f, and v_i is an eigenvector
of λ_i for 1 ≤ i ≤ k. Then the following hold.

1) {v_1, .., v_k} is independent.
2) If k = n, i.e., if CP_f(x) = (x − λ_1)⋯(x − λ_n), then {v_1, .., v_n} is a
   basis for V. The matrix of f w.r.t. this basis is the diagonal matrix whose
   (i, i) term is λ_i.
Proof Suppose {v_1, .., v_k} is dependent. Suppose t is the smallest positive integer
such that {v_1, .., v_t} is dependent, and v_1r_1 + ⋯ + v_tr_t = 0̄ is a non-trivial linear
combination. Note that at least two of the coefficients must be non-zero. Now
(f − λ_t)(v_1r_1 + ⋯ + v_tr_t) = v_1(λ_1 − λ_t)r_1 + ⋯ + v_{t−1}(λ_{t−1} − λ_t)r_{t−1} + 0̄ = 0̄ is a shorter
non-trivial linear combination. This is a contradiction and proves 1). Part 2) follows
from 1) because dim(V) = n.
Exercise Let

    A = (  0  1 )
        ( −1  0 )

be in R_2. Find an invertible C ∈ C_2 such that C^{-1}AC
is diagonal. Show that C cannot be selected in R_2. Find the characteristic polynomial
of A.
Exercise Suppose V is a 3-dimensional vector space and f : V → V is an endomorphism
with CP_f(x) = (x − λ)^3. Show that (f − λI) has characteristic polynomial
x^3 and is thus a nilpotent endomorphism. Show there is a basis for V so that the
matrix representing f is

    ( λ  0  0 )      ( λ  0  0 )       ( λ  0  0 )
    ( 1  λ  0 ),     ( 1  λ  0 )  or   ( 0  λ  0 ).
    ( 0  1  λ )      ( 0  0  λ )       ( 0  0  λ )
We could continue and finally give an ad hoc proof of the Jordan canonical form,
but in this chapter we prefer to press on to inner product spaces. The Jordan form
will be developed in Chapter 6 as part of the general theory of finitely generated
modules over Euclidean domains. The next section is included only as a convenient
reference.
Jordan Canonical Form
This section should be just skimmed or omitted entirely. It is unnecessary for the
rest of this chapter, and is not properly part of the flow of the chapter. The basic
facts of Jordan form are summarized here simply for reference.
The statement that a square matrix B over a field F is a Jordan block means that
∃ λ ∈ F such that B is a lower triangular matrix of the form

    B = ( λ            )
        ( 1  λ       0 )
        (    ⋱  ⋱      )
        ( 0     1    λ ).

B gives a homomorphism g : F^m → F^m with g(e_m) = λe_m and g(e_i) = e_{i+1} + λe_i
for 1 ≤ i < m. Note that CP_B(x) = (x − λ)^m and so λ is the
only eigenvalue of B, and B satisfies its characteristic polynomial, i.e., CP_B(B) = 0̄.
Definition A matrix D ∈ F_n is in Jordan form if ∃ Jordan blocks B_1, .., B_t such
that

    D = ( B_1              )
        (     B_2       0  )
        (         ⋱        )
        ( 0           B_t  ).

Suppose D is of this form and B_i ∈ F_{n_i} has
eigenvalue λ_i. Then n_1 + ⋯ + n_t = n and CP_D(x) = (x − λ_1)^{n_1}⋯(x − λ_t)^{n_t}. Note that
a diagonal matrix is a special case of Jordan form. D is a diagonal matrix iff each
n_i = 1, i.e., iff each Jordan block is a 1×1 matrix.
Theorem If A ∈ F_n, the following are equivalent.

1) ∃ an invertible C ∈ F_n such that C^{-1}AC is in Jordan form.
2) ∃ λ_1, .., λ_n ∈ F (not necessarily distinct) such that CP_A(x) = (x − λ_1)⋯
   (x − λ_n). (In this case we say that all the eigenvalues of A belong to F.)
Theorem Jordan form (when it exists) is unique. This means that if A and D are
similar matrices in Jordan form, they have the same Jordan blocks, except possibly
in different order.
The reader should use the transpose principle to write three other versions of the
first theorem. Also note that we know one special case of this theorem, namely that
if A has n distinct eigenvalues in F, then A is similar to a diagonal matrix. Later on
it will be shown that if A is a symmetric real matrix, then A is similar to a diagonal
matrix.
Let’s look at the classical case A ∈ R_n. The complex numbers are algebraically
closed. This means that CP_A(x) will factor completely in C[x], and thus ∃ C ∈ C_n
with C^{-1}AC in Jordan form. C may be selected to be in R_n iff all the eigenvalues of
A are real.
Exercise Find all real matrices in Jordan form that have the following characteristic
polynomials: x(x − 2), (x − 2)^2, (x − 2)(x − 3)(x − 4), (x − 2)(x − 3)^2,
(x − 2)^2(x − 3)^2, (x − 2)(x − 3)^3.
Exercise Suppose D ∈ F_n is in Jordan form and has characteristic polynomial
a_0 + a_1x + ⋯ + x^n. Show a_0I + a_1D + ⋯ + D^n = 0̄, i.e., show CP_D(D) = 0̄.
Exercise (Cayley-Hamilton Theorem) Suppose E is a field and A ∈ E_n.
Assume the theorem that there is a field F containing E such that CP_A(x) factors
completely in F[x]. Thus ∃ an invertible C ∈ F_n such that D = C^{-1}AC is in Jordan
form. Use this to show CP_A(A) = 0̄.
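The conclusion of this exercise is easy to observe numerically. A sketch assuming
Python with numpy:

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.]])
    c = np.poly(A)                                     # CP_A(x) = x^2 - 5x - 2
    print(c[0] * A @ A + c[1] * A + c[2] * np.eye(2))  # the zero matrix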
Exercise Suppose A ∈ F_n is in Jordan form. Show A is nilpotent iff A^n = 0̄
iff CP_A(x) = x^n. (Note how easy this is in Jordan form.)
Inner Product Spaces
The two most important fields for mathematics and science in general are the
real numbers and the complex numbers. Finitely generated vector spaces over R or
C support inner products and are thus geometric as well as algebraic objects. The
theories for the real and complex cases are quite similar, and both could have been
treated here. However, for simplicity, attention is restricted to the case F = R.
In the remainder of this chapter, the power and elegance of linear algebra become
transparent for all to see.
Definition Suppose V is a real vector space. An inner product (or dot product)
on V is a function V × V → R which sends (u, v) to u·v and satisfies

1) (u_1r_1 + u_2r_2)·v = (u_1·v)r_1 + (u_2·v)r_2 and
   v·(u_1r_1 + u_2r_2) = (v·u_1)r_1 + (v·u_2)r_2 for all u_1, u_2, v ∈ V
   and r_1, r_2 ∈ R.
2) u·v = v·u for all u, v ∈ V.
3) u·u ≥ 0 and u·u = 0 iff u = 0̄, for all u ∈ V.
Theorem Suppose V has an inner product.

1) If v ∈ V, f : V → R defined by f(u) = u·v is a homomorphism.
   Thus 0̄·v = 0.
2) Schwarz’ inequality. If u, v ∈ V, (u·v)^2 ≤ (u·u)(v·v).

Proof of 2) Let a = √(v·v) and b = √(u·u). If a or b is 0, the result is obvious.
Suppose neither a nor b is 0. Now 0 ≤ (ua ± vb)·(ua ± vb) = (u·u)a^2 ± 2ab(u·v) +
(v·v)b^2 = b^2a^2 ± 2ab(u·v) + a^2b^2. Dividing by 2ab yields 0 ≤ ab ± (u·v), or |u·v| ≤ ab.
Theorem Suppose V has an inner product. Define the norm or length of a vector
v by ‖v‖ = √(v·v). The following properties hold.

1) ‖v‖ = 0 iff v = 0̄.
2) ‖vr‖ = ‖v‖·|r|.
3) |u·v| ≤ ‖u‖‖v‖. (Schwarz’ inequality)
4) ‖u + v‖ ≤ ‖u‖ + ‖v‖. (The triangle inequality)

Proof of 4) ‖u + v‖^2 = (u + v)·(u + v) = ‖u‖^2 + 2(u·v) + ‖v‖^2 ≤ ‖u‖^2 +
2‖u‖‖v‖ + ‖v‖^2 = (‖u‖ + ‖v‖)^2.
Definition An Inner Product Space (IPS) is a real vector space with an
inner product. Suppose V is an IPS. A sequence {v_1, .., v_n} is orthogonal provided
v_i·v_j = 0 when i ≠ j. The sequence is orthonormal if it is orthogonal and each
vector has length 1, i.e., v_i·v_j = δ_{i,j} for 1 ≤ i, j ≤ n.
Theorem If S = {v_1, .., v_n} is an orthogonal sequence of non-zero vectors, then
S is independent. Furthermore {v_1/‖v_1‖, ⋯, v_n/‖v_n‖} is orthonormal.
Proof Suppose v_1r_1 + ⋯ + v_nr_n = 0̄. Then 0 = (v_1r_1 + ⋯ + v_nr_n)·v_i = r_i(v_i·v_i)
and thus r_i = 0. Thus S is independent. The second statement is transparent.
It is easy to define an inner product, as is shown by the following theorem.
Theorem Suppose V is a real vector space with a basis S = {v_1, .., v_n}. Then
there is a unique inner product on V which makes S an orthonormal basis. It is
given by the formula (v_1r_1 + ⋯ + v_nr_n)·(v_1s_1 + ⋯ + v_ns_n) = r_1s_1 + ⋯ + r_ns_n.
Convention R^n will be assumed to have the standard inner product defined by
(r_1, .., r_n)·(s_1, .., s_n) = r_1s_1 + ⋯ + r_ns_n. S = {e_1, .., e_n} will be called the canonical or
standard orthonormal basis. The next theorem shows that this inner product has an
amazing geometry.
Theorem If u, v ∈ R^n, u·v = ‖u‖‖v‖ cos Θ where Θ is the angle between u
and v.
Proof Let u = (r_1, .., r_n) and v = (s_1, .., s_n). By the law of cosines ‖u − v‖^2 =
‖u‖^2 + ‖v‖^2 − 2‖u‖‖v‖ cos Θ. So (r_1 − s_1)^2 + ⋯ + (r_n − s_n)^2 = r_1^2 + ⋯ + r_n^2 + s_1^2 +
⋯ + s_n^2 − 2‖u‖‖v‖ cos Θ. Thus r_1s_1 + ⋯ + r_ns_n = ‖u‖‖v‖ cos Θ.
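This gives a direct way to compute angles in R^n. A small sketch assuming Python
with numpy:

    import numpy as np

    u = np.array([1., 0., 0.])
    v = np.array([1., 1., 0.])
    cos_theta = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    print(np.degrees(np.arccos(cos_theta)))   # 45 degrees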
Exercise This is a simple exercise to observe that hyperplanes in R^n are cosets.
Suppose f : R^n → R is a non-zero homomorphism given by a matrix A = (a_1, .., a_n) ∈
R_{1,n}. Then L = ker(f) is the set of all solutions to a_1x_1 + ⋯ + a_nx_n = 0, i.e., the
set of all vectors perpendicular to A. Now suppose b ∈ R and C = (c_1, .., c_n)^t ∈ R^n
has f(C) = b. Then f^{-1}(b) is the set of all solutions to a_1x_1 + ⋯ + a_nx_n = b, which
is the coset L + C, and this is the set of all solutions to a_1(x_1 − c_1) + ⋯ + a_n(x_n − c_n) = 0.
Gram-Schmidt orthonormalization
Theorem (Fourier series) Suppose W is an IPS with an orthonormal basis
{w_1, .., w_n}. Then if v ∈ W, v = w_1(v·w_1) + ⋯ + w_n(v·w_n).
Proof v = w_1r_1 + ⋯ + w_nr_n and v·w_i = (w_1r_1 + ⋯ + w_nr_n)·w_i = r_i.
Theorem Suppose W is an IPS, Y ⊂ W is a subspace with an orthonormal basis
{w_1, .., w_k}, and v ∈ W − Y. Define the projection of v onto Y by p(v) = w_1(v·w_1) +
⋯ + w_k(v·w_k), and let w = v − p(v). Then (w·w_i) = (v − w_1(v·w_1) − ⋯ − w_k(v·w_k))·w_i = 0.
Thus if w_{k+1} = w/‖w‖, then {w_1, .., w_{k+1}} is an orthonormal basis for the subspace
generated by {w_1, .., w_k, v}. If {w_1, .., w_k, v} is already orthonormal, w_{k+1} = v.
Theorem (Gram-Schmidt) Suppose W is an IPS with a basis {v_1, .., v_n}.
Then W has an orthonormal basis {w_1, .., w_n}. Moreover, any orthonormal sequence
in W extends to an orthonormal basis of W.
Proof Let w_1 = v_1/‖v_1‖. Suppose inductively that {w_1, .., w_k} is an orthonormal
basis for Y, the subspace generated by {v_1, .., v_k}. Let w = v_{k+1} − p(v_{k+1}) and
w_{k+1} = w/‖w‖. Then by the previous theorem, {w_1, .., w_{k+1}} is an orthonormal basis
for the subspace generated by {w_1, .., w_k, v_{k+1}}. In this manner an orthonormal basis
for W is constructed.
Now suppose W has dimension n and {w_1, .., w_k} is an orthonormal sequence in
W. Since this sequence is independent, it extends to a basis {w_1, .., w_k, v_{k+1}, .., v_n}.
The process above may be used to modify this to an orthonormal basis {w_1, .., w_n}.
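The proof is constructive and translates directly into code. A sketch of the
construction, assuming Python with numpy:

    import numpy as np

    def gram_schmidt(vectors):
        """Orthonormalize a list of independent vectors, as in the proof above."""
        basis = []
        for v in vectors:
            w = v - sum((v @ u) * u for u in basis)   # w = v - p(v)
            basis.append(w / np.linalg.norm(w))       # w_{k+1} = w / ||w||
        return basis

    W = gram_schmidt([np.array([1., 1., 0.]),
                      np.array([1., 0., 1.]),
                      np.array([0., 1., 1.])])
    print(np.round([[u @ v for v in W] for u in W], 10))   # the identity matrix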
Exercise Let f : R^3 → R be the homomorphism defined by the matrix (2,1,3).
Find an orthonormal basis for the kernel of f. Find the projection of (e_1 + e_2) onto
ker(f). Find the angle between e_1 + e_2 and the plane ker(f).
Exercise Let W = R^3 have the standard inner product and Y ⊂ W be the
subspace generated by {w_1, w_2} where w_1 = (1, 0, 0) and w_2 = (0, 1, 0). W is
generated by the sequence {w_1, w_2, v} where v = (1, 2, 3). As in the first theorem
of this section, let w = v − p(v), where p(v) is the projection of v onto Y, and set
w_3 = w/‖w‖. Find w_3 and show that for any t with 0 ≤ t ≤ 1, {w_1, w_2, (1 − t)v + tw_3}
is a basis for W. This is a key observation for a future exercise showing O(n) is a
deformation retract of Gl_n(R).
Isometries Suppose each of U and V is an IPS. A homomorphism f : U → V
is said to be an isometry provided it is an isomorphism and for any u_1, u_2 in U,
(u_1·u_2)_U = (f(u_1)·f(u_2))_V.
Theorem Suppose each of U and V is an n-dimensional IPS, {u_1, .., u_n} is an
orthonormal basis for U, and f : U → V is a homomorphism. Then f is an isometry
iff {f(u_1), .., f(u_n)} is an orthonormal sequence in V.
Proof Isometries certainly preserve orthonormal sequences. So suppose S =
{f(u_1), .., f(u_n)} is an orthonormal sequence in V. Then S is independent and thus
S is a basis and thus f is an isomorphism. It is easy to check that f preserves inner
products.
We now come to one of the definitive theorems in linear algebra. It is that, up to
isometry, there is only one inner product space for each dimension.
Theorem Suppose each of U and V is an n-dimensional IPS. Then ∃ an isometry
f : U → V. In particular, U is isometric to R^n with its standard inner product.
Proof There exist orthonormal bases {u_1, .., u_n} for U and {v_1, .., v_n} for V.
Now there exists a homomorphism f : U → V with f(u_i) = v_i, and by the
previous theorem, f is an isometry.
Exercise Let f : R^3 → R be the homomorphism defined by the matrix (2,1,3).
Find a linear transformation h : R^2 → R^3 which gives an isometry from R^2 to ker(f).
Orthogonal Matrices
As noted earlier, linear algebra is not so much the study of vector spaces as it is
the study of endomorphisms. We now wish to study isometries from R^n to R^n.
We know from a theorem on page 90 that an endomorphism preserves volume iff
its determinant is ±1. Isometries preserve inner product, and thus preserve angle and
distance, and so certainly preserve volume.
Theorem Suppose A ∈ R_n and f : R^n → R^n is the homomorphism defined by
f(B) = AB. Then the following are equivalent.

1) The columns of A form an orthonormal basis for R^n, i.e., A^tA = I.
2) The rows of A form an orthonormal basis for R^n, i.e., AA^t = I.
3) f is an isometry.

Proof A left inverse of a matrix is also a right inverse (see the exercise on
page 64). Thus 1) and 2) are equivalent because each of them says A is invertible
with A^{-1} = A^t. Now {e_1, .., e_n} is the canonical orthonormal basis for R^n, and
f(e_i) is column i of A. Thus by the previous section, 1) and 3) are equivalent.
Definition If A ∈ R_n satisfies these three conditions, A is said to be orthogonal.
The set of all such A is denoted by O(n), and is called the orthogonal group.
Theorem

1) If A is orthogonal, |A| = ±1.
2) If A is orthogonal, A^{-1} is orthogonal. If A and C are orthogonal, AC is
   orthogonal. Thus O(n) is a multiplicative subgroup of Gl_n(R).
3) Suppose A is orthogonal and f is defined by f(B) = AB. Then f preserves
   distances and angles. This means that if u, v ∈ R^n, ‖u − v‖ =
   ‖f(u) − f(v)‖ and the angle between u and v is equal to the angle between
   f(u) and f(v).

Proof Part 1) follows from |A|^2 = |A||A^t| = |I| = 1. Part 2) is immediate,
because isometries clearly form a subgroup of the multiplicative group of
all automorphisms. For part 3) assume f : R^n → R^n is an isometry. Then
‖u − v‖^2 = (u − v)·(u − v) = f(u − v)·f(u − v) = ‖f(u − v)‖^2 = ‖f(u) − f(v)‖^2.
The proof that f preserves angles follows from u·v = ‖u‖‖v‖ cos Θ.
Exercise Show that if A ∈ O(2) has |A| = 1, then

    A = ( cos Θ  −sin Θ )
        ( sin Θ   cos Θ )

for some number Θ. (See the exercise on page 56.)
Exercise (topology) Let R_n ≈ R^{n^2} have its usual metric topology. This means
a sequence of matrices {A_i} converges to A iff it converges coordinatewise. Show
Gl_n(R) is an open subset and O(n) is closed and compact. Let h : Gl_n(R) →
O(n) be defined by Gram-Schmidt. Show H : Gl_n(R) × [0, 1] → Gl_n(R) defined by
H(A, t) = (1 − t)A + th(A) is a deformation retract of Gl_n(R) to O(n).
Diagonalization of Symmetric Matrices
We continue with the case F = R. Our goals are to prove that, if A is a symmetric
matrix, all of its eigenvalues are real and that ∃ an orthogonal matrix C such that
C^{-1}AC is diagonal. As background, we first note that symmetric is the same as
self-adjoint.
Theorem Suppose A ∈ R_n and u, v ∈ R^n. Then (A^tu)·v = u·(Av).
Proof Suppose y, z ∈ R^n. Then the dot product y·z is the matrix product y^tz.
Thus (A^tu)·v = (u^tA)v = u^t(Av) = u·(Av).
Definition Suppose A ∈ R_n. A is said to be symmetric provided A^t = A. Note
that any diagonal matrix is symmetric. A is said to be self-adjoint if (Au)·v = u·(Av)
for all u, v ∈ R^n. The next theorem is just an exercise using the previous theorem.
Theorem A is symmetric iff A is self-adjoint.
Theorem Suppose A ∈ R_n is symmetric. Then ∃ real numbers λ_1, .., λ_n (not
necessarily distinct) such that CP_A(x) = (x − λ_1)(x − λ_2)⋯(x − λ_n). That is, all
the eigenvalues of A are real.
Proof We know CP_A(x) factors into linears over C. If µ = a + bi is a complex
number, its conjugate is defined by µ̄ = a − bi. If h : C → C is defined by h(µ) = µ̄,
then h is a ring isomorphism which is the identity on R. If w = (a_{i,j}) is a complex
matrix or vector, its conjugate is defined by w̄ = (ā_{i,j}). Since A ∈ R_n is a real
symmetric matrix, A = A^t = Ā^t. Now suppose λ is a complex eigenvalue of A and
v ∈ C^n is an eigenvector with Av = λv, so also Av̄ = λ̄v̄. Then λ̄(v̄^tv) = (λ̄v̄)^tv =
(Av̄)^tv = (v̄^tA)v = v̄^t(Av) = v̄^t(λv) = λ(v̄^tv). Thus λ = λ̄ and λ ∈ R. Or you can define a complex
inner product on C^n by (w·v) = w̄^tv. The proof then reads as λ̄(v·v) = (λv·v) =
(Av·v) = (v·Av) = (v·λv) = λ(v·v). Either way, λ is a real number.
We know that eigenvectors belonging to distinct eigenvalues are linearly independent.
For symmetric matrices, we show more, namely that they are perpendicular.
Theorem Suppose A is symmetric, λ_1, λ_2 ∈ R are distinct eigenvalues of A, and
Au = λ_1u and Av = λ_2v. Then u·v = 0.
Proof λ_1(u·v) = (Au)·v = u·(Av) = λ_2(u·v).
Review Suppose A ∈ R_n and f : R^n → R^n is defined by f(B) = AB. Then A
represents f w.r.t. the canonical orthonormal basis. Let S = {v_1, .., v_n} be another
basis and C ∈ R_n be the matrix with v_i as column i. Then C^{-1}AC is the matrix
representing f w.r.t. S. Now S is an orthonormal basis iff C is an orthogonal matrix.
Summary Representing f w.r.t. an orthonormal basis is the same as conjugating
A by an orthogonal matrix.
Theorem Suppose A ∈ R_n and C ∈ O(n). Then A is symmetric iff C^{-1}AC is
symmetric.
Proof Suppose A is symmetric. Then (C^{-1}AC)^t = C^tA(C^{-1})^t = C^{-1}AC.
The next theorem has geometric and physical implications, but for us, just the
incredibility of it all will suffice.
Theorem If A ∈ R_n, the following are equivalent.

1) A is symmetric.
2) ∃ C ∈ O(n) such that C^{-1}AC is diagonal.

Proof By the previous theorem, 2) ⇒ 1). Show 1) ⇒ 2). Suppose A is a
symmetric 2×2 matrix. Let λ be an eigenvalue for A and {v_1, v_2} be an orthonormal
basis for R^2 with Av_1 = λv_1. Then w.r.t. this basis, the transformation A is
represented by

    ( λ  b )
    ( 0  d ).

Since this matrix is symmetric, b = 0.
Now suppose by induction that the theorem is true for symmetric matrices in
R_t for t < n, and suppose A is a symmetric n×n matrix. Denote by λ_1, .., λ_k the
distinct eigenvalues of A, k ≤ n. If k = n, the proof is immediate, because then there
is a basis of eigenvectors of length 1, and they must form an orthonormal basis. So
suppose k < n. Let v_1, .., v_k be eigenvectors for λ_1, .., λ_k with each ‖v_i‖ = 1. They
may be extended to an orthonormal basis v_1, .., v_n. With respect to this basis,
the transformation A is represented by

    ( λ_1                  )
    (      ⋱         (B)   )
    (          λ_k         )
    (                      )
    ( (0)            (D)   ).

Since this is a symmetric matrix, B = 0 and D is a symmetric matrix of smaller
size. By induction, ∃ an orthogonal C such that C^{-1}DC is diagonal. Thus conjugating
by

    ( I  0 )
    ( 0  C )

makes the entire matrix diagonal.
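This theorem is easy to observe numerically; for a symmetric input, numpy’s eigh
routine returns exactly such an orthogonal C. A sketch assuming Python with numpy:

    import numpy as np

    A = np.array([[1., 2.],
                  [2., 1.]])               # a symmetric matrix
    evals, C = np.linalg.eigh(A)           # columns of C are orthonormal eigenvectors
    print(evals)                           # -1.0 and 3.0
    print(np.round(C.T @ A @ C, 10))       # diagonal; C^{-1} = C^t since C ∈ O(2)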
This theorem is so basic we state it again in different terminology. If V is an IPS, a
linear transformation f : V → V is said to be self-adjoint provided (u·f(v)) = (f(u)·v)
for all u, v ∈ V.
Theorem If V is an n-dimensional IPS and f : V → V is a linear transformation,
then the following are equivalent.

1) f is self-adjoint.
2) ∃ an orthonormal basis {v_1, ..., v_n} for V with each
   v_i an eigenvector of f.
Exercise Let

    A = ( 2  2 )
        ( 2  2 ).

Find an orthogonal C such that C^{-1}AC is diagonal.
Do the same for

    A = ( 2  1 )
        ( 1  2 ).
Exercise Suppose A, D ∈ R_n are symmetric. Under what conditions are A and D
similar? Show that, if A and D are similar, ∃ an orthogonal C such that D = C^{-1}AC.
Exercise Suppose V is an n-dimensional real vector space. We know that V is
isomorphic to R^n. Suppose f and g are isomorphisms from V to R^n and A is a subset
of V. Show that f(A) is an open subset of R^n iff g(A) is an open subset of R^n. This
shows that V, an algebraic object, has a god-given topology. Of course, if V has
an inner product, it automatically has a metric, and this metric will determine that
same topology. Finally, suppose V and W are finite-dimensional real vector spaces
and h : V → W is a linear transformation. Show that h is continuous.
Exercise Define E : C_n → C_n by E(A) = e^A = I + A + (1/2!)A^2 + ⋯. This series
converges and thus E is a well defined function. If AB = BA, then E(A + B) =
E(A)E(B). Since A and −A commute, I = E(0̄) = E(A − A) = E(A)E(−A), and
thus E(A) is invertible with E(A)^{-1} = E(−A). Furthermore E(A^t) = E(A)^t, and
if C is invertible, E(C^{-1}AC) = C^{-1}E(A)C. Now use the results of this section to
prove the statements below. (For part 1, assume the Jordan form, i.e., assume any
A ∈ C_n is similar to a lower triangular matrix.)

1) If A ∈ C_n, then |e^A| = e^{trace(A)}. Thus if A ∈ R_n, |e^A| = 1
   iff trace(A) = 0.
2) ∃ a non-zero matrix N ∈ R_2 with e^N = I.
3) If N ∈ R_n is symmetric, then e^N = I iff N = 0̄.
4) If A ∈ R_n and A^t = −A, then e^A ∈ O(n).
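Parts 1), 2) and 4) can be explored numerically (a sketch assuming Python with
numpy and scipy, whose expm routine computes e^A):

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0., 1.],
                  [-1., 0.]])              # A^t = -A and trace(A) = 0
    E = expm(A)
    print(np.linalg.det(E))                # 1.0 = e^trace(A)
    print(np.round(E.T @ E, 10))           # the identity, so e^A is in O(2)
    N = 2 * np.pi * A                      # a non-zero N with e^N = I
    print(np.round(expm(N), 10))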
Chapter 6
Appendix
The five previous chapters were designed for a year undergraduate course in algebra.
In this appendix, enough material is added to form a basic first year graduate course.
Two of the main goals are to characterize finitely generated abelian groups and to
prove the Jordan canonical form. The style is the same as before, i.e., everything is
right down to the nub. The organization is mostly a linearly ordered sequence except
for the last two sections on determinants and dual spaces. These are independent
sections added on at the end.
Suppose R is a commutative ring. An R-module M is said to be cyclic if it can
be generated by one element, i.e., M ≈ R/I where I is an ideal of R. The basic
theorem of this chapter is that if R is a Euclidean domain and M is a finitely generated
R-module, then M is the sum of cyclic modules. Thus if M is torsion free, it is a
free R-module. Since Z is a Euclidean domain, finitely generated abelian groups
are the sums of cyclic groups.
Now suppose F is a field and V is a finitely generated F-module. If T : V → V is
a linear transformation, then V becomes an F[x]-module by defining vx = T(v). Now
F[x] is a Euclidean domain and so V_{F[x]} is the sum of cyclic modules. This classical
and very powerful technique allows an easy proof of the canonical forms. There is a
basis for V so that the matrix representing T is in Rational canonical form. If the
characteristic polynomial of T factors into the product of linear polynomials, then
there is a basis for V so that the matrix representing T is in Jordan canonical form.
This always holds if F = C. A matrix in Jordan form is a lower triangular matrix
with the eigenvalues of T displayed on the diagonal, so this is a powerful concept.
In the chapter on matrices, it is stated without proof that the determinant of the
product is the product of the determinants. A proof of this, which depends upon the
classification of certain types of alternating multilinear forms, is given in this chapter.
The final section gives the fundamentals of dual spaces.
The Chinese Remainder Theorem
On page 50 in the chapter on rings, the Chinese Remainder Theorem was proved
for the integers. Here it is presented in full generality. Surprisingly, the theorem holds
even for non-commutative rings.
Definition Suppose R is a ring and A_1, A_2, ..., A_m are ideals of R. Then the sum
A_1 + A_2 + ⋯ + A_m is the set of all a_1 + a_2 + ⋯ + a_m with a_i ∈ A_i. The product
A_1A_2⋯A_m is the set of all finite sums of elements a_1a_2⋯a_m with a_i ∈ A_i. Note
that the sum and product of ideals are ideals and A_1A_2⋯A_m ⊂ (A_1 ∩ A_2 ∩ ⋯ ∩ A_m).
Definition Ideals A and B of R are said to be comaximal if A + B = R.
Theorem If A and B are ideals of a ring R, then the following are equivalent.
1) A and B are comaximal.
2) ∃ a ∈ A and b ∈ B with a + b = 1̄.
3) π(A) = R/B where π : R →R/B is the projection.
Theorem If A_1, A_2, ..., A_m and B are ideals of R with A_i and B comaximal for
each i, then A_1A_2⋯A_m and B are comaximal. Thus A_1 ∩ A_2 ∩ ⋯ ∩ A_m and B are
comaximal.
Proof Consider π : R → R/B. Then π(A_1A_2⋯A_m) = π(A_1)π(A_2)⋯π(A_m) =
(R/B)(R/B)⋯(R/B) = R/B.
Chinese Remainder Theorem Suppose A_1, A_2, ..., A_n are pairwise comaximal
ideals of R, with each A_i ≠ R. Then π : R → R/A_1 × R/A_2 × ⋯ × R/A_n is a surjective
ring homomorphism with kernel A_1 ∩ A_2 ∩ ⋯ ∩ A_n.
Proof There exists a_i ∈ A_i and b_i ∈ A_1A_2⋯A_{i−1}A_{i+1}⋯A_n with a_i + b_i = 1̄. Note
that π(b_i) = (0, 0, .., 1̄_i, 0, .., 0). If (r_1 + A_1, r_2 + A_2, ..., r_n + A_n) is an element of the
range, it is the image of r_1b_1 + r_2b_2 + ⋯ + r_nb_n = r_1(1̄ − a_1) + r_2(1̄ − a_2) + ⋯ + r_n(1̄ − a_n).
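For R = Z the theorem can be checked by brute force. A sketch in Python:

    # The ideals 3Z, 5Z, 7Z are pairwise comaximal, and the theorem says
    # Z/105 -> Z/3 x Z/5 x Z/7 is a bijection.
    triples = {(n % 3, n % 5, n % 7) for n in range(105)}
    print(len(triples))   # 105, so the map is onto (and hence one-to-one)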
Theorem If R is commutative and A_1, A_2, ..., A_n are pairwise comaximal ideals
of R, then A_1A_2⋯A_n = A_1 ∩ A_2 ∩ ⋯ ∩ A_n.
Proof for n = 2. Show A_1 ∩ A_2 ⊂ A_1A_2. ∃ a_1 ∈ A_1 and a_2 ∈ A_2 with a_1 + a_2 = 1̄.
If c ∈ A_1 ∩ A_2, then c = c(a_1 + a_2) ∈ A_1A_2.
Prime and Maximal Ideals and UFDs
In the first chapter on background material, it was shown that Z is a unique
factorization domain. Here it will be shown that this property holds for any principal
ideal domain. Later on it will be shown that every Euclidean domain is a principal
ideal domain. Thus every Euclidean domain is a unique factorization domain.
Definition Suppose R is a commutative ring and I ⊂ R is an ideal.
I is prime means I ≠ R and if a, b ∈ R have ab ∈ I, then a or b ∈ I.
I is maximal means I ≠ R and there are no ideals properly between I and R.
Theorem 0̄ is a prime ideal of R iff R is ______.
0̄ is a maximal ideal of R iff R is ______.

Theorem Suppose J ⊂ R is an ideal, J ≠ R.
J is a prime ideal iff R/J is ______.
J is a maximal ideal iff R/J is ______.
Corollary Maximal ideals are prime.
Proof Every field is a domain.
Theorem If a ∈ R is not a unit, then ∃ a maximal ideal I of R with a ∈ I.
Proof This is a classical application of the Hausdorff Maximality Principle.
Consider {J : J is an ideal of R containing a with J ≠ R}. This collection contains a
maximal monotonic collection {V_t}_{t∈T}. The ideal V = ∪_{t∈T} V_t does not contain 1̄ and
thus is not equal to R. Therefore V is equal to some V_t and is a maximal ideal
containing a.
Note To properly appreciate this proof, the student should work the exercise on
group theory at the end of this section.
Definition Suppose R is a domain and a, b ∈ R. Then we say a ∼ b iff there
exists a unit u with au = b. Note that ∼ is an equivalence relation. If a ∼ b, then a
and b are said to be associates.
Examples If R is a domain, the associates of 1̄ are the units of R, while the only
associate of 0̄ is 0̄ itself. If n ∈ Z is not zero, then its associates are n and −n.
If F is a field and g ∈ F[x] is a non-zero polynomial, then the associates of g are
all cg where c is a non-zero constant.
The following theorem is elementary, but it shows how associates fit into the
scheme of things. An element a divides b (a|b) if ∃! c ∈ R with ac = b.
Theorem Suppose R is a domain and a, b ∈ (R − 0̄). Then the following are
equivalent.

1) a ∼ b.
2) a|b and b|a.
3) aR = bR.
Parts 1) and 3) above show there is a bijection from the associate classes of R to
the principal ideals of R. Thus if R is a PID, there is a bijection from the associate
classes of R to the ideals of R. If an element generates a non-zero prime ideal, it is
called a prime element.
Definition Suppose R is a domain and a ∈ R is a non-zero non-unit.
1) a is irreducible if it does not factor, i.e., a = bc ⇒ b or c is a unit.
2) a is prime if it generates a prime ideal, i.e., a|bc ⇒ a|b or a|c.
Note If a is a prime and a|c_1c_2⋯c_n, then a|c_i for some i. This follows from the
definition and induction on n. If each c_j is irreducible, then a ∼ c_i for some i.
Note If a ∼ b, then a is irreducible (prime) iff b is irreducible (prime). In other
words, if a is irreducible (prime) and u is a unit, then au is irreducible (prime).
Note a is prime ⇒ a is irreducible. This is immediate from the definitions.
Theorem Factorization into primes is unique up to order and associates, i.e., if
a = b_1b_2⋯b_n = c_1c_2⋯c_m with each b_i, c_i prime, then n = m and for some permutation
σ of the indices, b_i and c_{σ(i)} are associates for every i.
Proof This follows from the notes above.
Definition R is a factorization domain (FD) means that R is a domain and if a is
a non-zero non-unit element of R, then a factors into a finite product of irreducibles.
R is a unique factorization domain (UFD) means R is a FD and factorization is unique
(up to order and associates).
Theorem If R is a UFD and a is a non-zero non-unit of R, then a is irreducible
⇔ a is prime. Thus in a UFD, elements factor as the product of primes.
Proof Suppose R is a UFD, a is an irreducible element of R, and a|bc. If either
b or c is a unit or is zero, then a divides one of them, so suppose each of b and c is
a non-zero non-unit element of R. There exists an element d with ad = bc. Each of
b and c factors as the product of irreducibles and the product of these products is
the factorization of bc. It follows from the uniqueness of the factorization of ad = bc
that one of these irreducibles is an associate of a, and thus a|b or a|c. Therefore
the element a is a prime.
Theorem Suppose R is a FD. Then the following are equivalent.
1) R is a UFD.
2) Every irreducible element is prime, i.e., a irreducible ⇔ a is prime.
Proof We already know 1) ⇒ 2). Part 2) ⇒ 1) because factorization into primes
is always unique.
This is a revealing and useful theorem. If R is a FD, then R is a UFD iff each
irreducible element generates a prime ideal. Fortunately, principal ideal domains
have this property, as seen in the next theorem.
Theorem Suppose R is a PID and a ∈ R is non-zero non-unit. Then the following
are equivalent.
1) aR is a maximal ideal.
2) aR is a prime ideal, i.e., a is a prime element.
3) a is irreducible.
Proof Every maximal ideal is a prime ideal, so 1) ⇒ 2). Every prime element is
an irreducible element, so 2) ⇒ 3). Now suppose a is irreducible and show aR is a
maximal ideal. If I is an ideal containing aR, ∃ b ∈ R with I = bR. Since b divides
a, b is a unit or an associate of a. This means I = R or I = aR.
Our goal is to prove that a PID is a UFD. Using the two theorems above, it
only remains to show that a PID is a FD. The proof will not require that ideals be
principally generated, but only that they be finitely generated. This turns out to
be equivalent to the property that any collection of ideals has a “maximal” element.
We shall see below that this is a useful concept which fits naturally into the study of
unique factorization domains.
Theorem Suppose R is a commutative ring. Then the following are equivalent.

1) If I ⊂ R is an ideal, ∃ a finite set {a_1, a_2, ..., a_n} ⊂ R such that I =
   a_1R + a_2R + ⋯ + a_nR, i.e., each ideal of R is finitely generated.
2) If {I_t}_{t∈T} is a collection of ideals, ∃ t_0 ∈ T such that if t is any element in T
   with I_t ⊃ I_{t_0}, then I_t = I_{t_0}. (The ideal I_{t_0} is maximal only in the sense
   described. It need not contain all the ideals of the collection, nor need it be
   a maximal ideal of the ring R.)
3) If I_1 ⊂ I_2 ⊂ I_3 ⊂ ... is a monotonic sequence of ideals, ∃ t_0 ≥ 1 such that
   I_t = I_{t_0} for all t ≥ t_0.
Proof Suppose 1) is true and show 3). The ideal I = I_1 ∪ I_2 ∪ ... is finitely
generated and ∃ t_0 ≥ 1 such that I_{t_0} contains those generators. Thus 3) is true.
Now suppose 1) is false and I ⊂ R is an ideal not finitely generated. Then ∃ a
sequence a_1, a_2, ... of elements in I with a_1R + a_2R + ⋯ + a_nR properly contained
in a_1R + a_2R + ⋯ + a_{n+1}R for each n ≥ 1. Thus 3) is false and 1) ⇔ 3). The
proof that 2) ⇒ 3) is immediate. If 2) is false, ∃ a sequence of ideals I_1 ⊂ I_2 ⊂ ...
with each inclusion proper. Thus 3) is false and so 2) ⇔ 3).
Definition If R satisfies these properties, R is said to be Noetherian, or it is said
to satisfy the ascending chain condition. This is a useful property satisfied by many
of the classical rings in mathematics. Having three definitions makes this property
easy to use. For example, see the next theorem.
Theorem A Noetherian domain is a FD. In particular, a PID is a FD.
Proof Suppose there is a non-zero non-unit element that does not factor as the
finite product of irreducibles. Consider all ideals dR where d does not factor. Then ∃
a maximal one cR. The element c must be reducible, i.e., c = ab where neither a nor
b is a unit. Each of aR and bR properly contains cR, and so each of a and b factors as
a finite product of irreducibles. This gives a finite factorization of c into irreducibles
which is a contradiction.
Corollary A PID is a UFD. So Z is a UFD and if F is a field, F[x] is a UFD.
You see the basic structure of UFDs is quite easy. It takes more work to prove
the following theorems, which are stated here only for reference.
Theorem If R is a UFD then R[x_1, ..., x_n] is a UFD. Thus if F is a field,
F[x_1, ..., x_n] is a UFD. (This theorem goes all the way back to Gauss.)
If R is a PID, then the formal power series R[[x_1, ..., x_n]] is a UFD. Thus if F
is a field, F[[x_1, ..., x_n]] is a UFD. (There is a UFD R where R[[x]] is not a UFD.
See page 566 of Commutative Algebra by N. Bourbaki.)
Theorem Germs of analytic functions over C form a UFD.
Proof See Theorem 6.6.2 of An Introduction to Complex Analysis in Several
Variables by L. Hörmander.
Theorem Suppose R is a commutative ring. Then R is Noetherian ⇒ R[x_1, ..., x_n]
and R[[x_1, ..., x_n]] are Noetherian. (This is the famous Hilbert Basis Theorem.)

Theorem If R is Noetherian and I ⊂ R is a proper ideal, then R/I is Noetherian.
(This follows immediately from the definition.)
Note The combination of the last two theorems shows that Noetherian is a
ubiquitous property which is satisfied by many of the basic rings in commutative algebra.
Next are presented two of the standard examples of Noetherian domains that are
not unique factorization domains.
Exercise Let R = Z(√5) = {n + m√5 : n, m ∈ Z}. Show that R is a subring of
the real numbers which is not a UFD. In particular 2·2 = (1 − √5)·(−1 − √5) are two distinct
irreducible factorizations of 4. Show R is isomorphic to Z[x]/(x^2 − 5), where (x^2 − 5)
represents the ideal (x^2 − 5)Z[x], and R/(2) is isomorphic to Z_2[x]/(x^2 − [5]) =
Z_2[x]/(x^2 + [1]), which is not a domain.
Exercise Let R = R[x, y, z]/(x^2 − yz). Show x^2 − yz is irreducible and thus
prime in R[x, y, z]. If u ∈ R[x, y, z], let ū ∈ R be the coset containing u. Show R
is not a UFD. In particular x̄·x̄ = ȳ·z̄ are two distinct irreducible factorizations
of x̄^2. Show R/(x̄) is isomorphic to R[y, z]/(yz), which is not a domain.
Exercise In Group Theory If G is an additive abelian group, a subgroup H
of G is said to be maximal if H ≠ G and there are no subgroups properly between
H and G. Show that H is maximal iff G/H ≈ Z_p for some prime p. For simplicity,
consider the case G = Q. Which one of the following is true?

1) If a ∈ Q, then there is a maximal subgroup H of Q which contains a.
2) Q contains no maximal subgroups.
Splitting Short Exact Sequences
Suppose B is an R-module and K is a submodule of B. As defined in the chapter
on linear algebra, K is a summand of B provided ∃ a submodule L of B with
K + L = B and K ∩ L = 0̄. In this case we write K ⊕ L = B. When is K a summand
of B? It turns out that K is a summand of B iff there is a splitting map from
B/K to B. In particular, if B/K is free, K must be a summand of B. This is used
below to show that if R is a PID, then every submodule of R^n is free.
Theorem 1 Suppose R is a ring, B and C are R-modules, and g : B → C is a
surjective homomorphism with kernel K. Then the following are equivalent.

1) K is a summand of B.
2) g has a right inverse, i.e., ∃ a homomorphism h : C → B with g ∘ h = I : C → C.
   (h is called a splitting map.)

Proof Suppose 1) is true, i.e., suppose ∃ a submodule L of B with K ⊕ L = B.
Then (g|L) : L → C is an isomorphism. If i : L → B is inclusion, then h defined
by h = i ∘ (g|L)^{-1} is a right inverse of g. Now suppose 2) is true and h : C → B
is a right inverse of g. Then h is injective, K + h(C) = B and K ∩ h(C) = 0̄.
Thus K ⊕ h(C) = B.
Definition Suppose f : A → B and g : B → C are R-module homomorphisms.
The statement that 0 → A →f B →g C → 0 is a short exact sequence (s.e.s.) means
f is injective, g is surjective and f(A) = ker(g). The canonical split s.e.s. is A →
A ⊕ C → C where f = i_1 and g = π_2. A short exact sequence is said to split if ∃ an
isomorphism B →≈ A ⊕ C such that the following diagram commutes.

         f         g
    A ------> B ------> C
      \       |       ↗
   i_1 \     ≈|      / π_2
        ↘     v     /
           A ⊕ C
We now restate the previous theorem in this terminology.
Theorem 1.1 A short exact sequence 0 → A → B → C → 0 splits iff f(A) is
a summand of B, iff B →C has a splitting map. If C is a free R-module, there is a
splitting map and thus the sequence splits.
Proof We know from the previous theorem f(A) is a summand of B iff B → C
has a splitting map. Showing these properties are equivalent to the splitting of the
sequence is a good exercise in the art of diagram chasing. Now suppose C has a free
basis T ⊂ C, and g : B → C is surjective. There exists a function h : T → B such
that g ◦ h(c) = c for each c ∈ T. The function h extends to a homomorphism from
C to B which is a right inverse of g.
Theorem 2 If R is a commutative ring, then the following are equivalent.

1) R is a PID.
2) Every submodule of R_R is a free R-module of dimension ≤ 1.
This theorem restates the ring property of PID as a module property. Although
this theorem is transparent, it is a precursor to the following classical result.
Theorem 3 If R is a PID and A ⊂ R^n is a submodule, then A is a free R-module
of dimension ≤ n. Thus subgroups of Z^n are free Z-modules of dimension ≤ n.
Proof From the previous theorem we know this is true for n = 1. Suppose n > 1
and the theorem is true for submodules of R^{n−1}. Suppose A ⊂ R^n is a submodule.
Consider the following short exact sequences, where f : R^{n−1} → R^{n−1} ⊕ R is inclusion
and g = π : R^{n−1} ⊕ R → R is the projection.

    0 → R^{n−1} →f R^{n−1} ⊕ R →π R → 0
    0 → A ∩ R^{n−1} → A → π(A) → 0

By induction, A ∩ R^{n−1} is free of dimension ≤ n − 1. If π(A) = 0̄, then A ⊂ R^{n−1}. If
π(A) ≠ 0̄, it is free of dimension 1 and the sequence splits by Theorem 1.1. In either
case, A is a free submodule of dimension ≤ n.
Exercise Let A ⊂ Z^2 be the subgroup generated by {(6, 24), (16, 64)}. Show A
is a free Z-module of dimension 1.
Euclidean Domains
The ring Z possesses the Euclidean algorithm and the polynomial ring F[x] has
the division algorithm. The concept of Euclidean domain is an abstraction of these
properties. The axioms are so miniscule that it is surprising you get this much juice
out of them. However they are exactly what you need, and it is possible to just play
around with matrices and get some deep results. If R is a Euclidean domain and M
is a finitely generated R-module, then M is the sum of cyclic modules. This is one of
the great classical theorems of abstract algebra, and you don’t have to worry about
it becoming obsolete. Here N will denote the set of all non-negative integers, not
just the set of positive integers.
Definition A domain R is a Euclidean domain provided ∃ φ : (R − 0̄) → N such
that if a, b ∈ (R − 0̄), then

1) φ(a) ≤ φ(ab).
2) ∃ q, r ∈ R such that a = bq + r with r = 0̄ or φ(r) < φ(b).
Examples of Euclidean Domains

Z with φ(n) = |n|.
A field F with φ(a) = 1 ∀ a ≠ 0̄, or with φ(a) = 0 ∀ a ≠ 0̄.
F[x] where F is a field, with φ(f = a_0 + a_1x + ⋯ + a_nx^n) = deg(f).
Z[i] = {a + bi : a, b ∈ Z} = Gaussian integers, with φ(a + bi) = a^2 + b^2.
Theorem 1 If R is a Euclidean domain, then R is a PID and thus a UFD.
Proof If I is a non-zero ideal, then ∃ b ∈ I − 0̄ satisfying φ(b) ≤ φ(i) ∀ i ∈ I − 0̄.
Then b generates I because if a ∈ I − 0̄, ∃ q, r with a = bq + r. Now r ∈ I and
r ≠ 0̄ ⇒ φ(r) < φ(b), which is impossible. Thus r = 0̄ and a ∈ bR, so I = bR.
Theorem 2 If R is a Euclidean domain and a, b ∈ R − 0̄, then

φ(1̄) is the smallest integer in the image of φ.
a is a unit in R iff φ(a) = φ(1̄).
a and b are associates ⇒ φ(a) = φ(b).

Proof This is a good exercise.
The following remarkable theorem is the foundation for the results of this section.
Theorem 3 If R is a Euclidean domain and (a_{i,j}) ∈ R_{n,t} is a non-zero matrix,
then by elementary row and column operations (a_{i,j}) can be transformed to

    ( d_1                 )
    (     d_2          0  )
    (         ⋱           )
    (           d_m       )
    ( 0                 0 )

where each d_i ≠ 0̄, and d_i|d_{i+1} for 1 ≤ i < m. Also d_1 generates the ideal of R
generated by the entries of (a_{i,j}).
Proof Let I ⊂ R be the ideal generated by the elements of the matrix A = (a_{i,j}).
If E ∈ R_n, then the ideal J generated by the elements of EA has J ⊂ I. If E is
invertible, then J = I. In the same manner, if E ∈ R_t is invertible and J is the ideal
generated by the elements of AE, then J = I. This means that row and column
operations on A do not change the ideal I. Since R is a PID, there is an element d_1
with I = d_1R, and this will turn out to be the d_1 displayed in the theorem.
The matrix (a_{i,j}) has at least one non-zero element d with φ(d) a minimum.
However, row and column operations on (a_{i,j}) may produce elements with smaller
φ values. To consolidate this approach, consider matrices obtained from (a_{i,j}) by a
finite number of row and column operations. Among these, let (b_{i,j}) be one which
has an entry d_1 ≠ 0 with φ(d_1) a minimum. By elementary operations of type 2, the
entry d_1 may be moved to the (1, 1) place in the matrix. Then d_1 will divide the other
entries in the first row, else we could obtain an entry with a smaller φ value. Thus
by column operations of type 3, the other entries of the first row may be made zero.
In a similar manner, by row operations of type 3, the matrix may be changed to the
following form.

    ( d_1  0  ⋯  0 )
    (  0           )
    (  ⋮  c_{i,j}  )
    (  0           )

Note that d_1 divides each c_{i,j}, and thus I = d_1R. The proof now follows by induction
on the size of the matrix.
This is an example of a theorem that is easy to prove playing around at the
blackboard. Yet it must be a deep theorem because the next two theorems are easy
consequences.
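For R = Z this reduction is the classical Smith normal form, and computer algebra
systems implement it. A sketch assuming Python with the sympy library (the signs
of the d_i may vary by implementation):

    from sympy import Matrix, ZZ
    from sympy.matrices.normalforms import smith_normal_form

    A = Matrix([[4, 6],
                [2, 8]])
    # d_1 = gcd of the entries = 2, and d_1*d_2 = |det A| = 20, so d_2 = 10.
    print(smith_normal_form(A, domain=ZZ))   # Matrix([[2, 0], [0, 10]])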
Theorem 4 Suppose R is a Euclidean domain, B is a finitely generated free R-module
and A ⊂ B is a non-zero submodule. Then ∃ free bases {a_1, a_2, ..., a_t} for A
and {b_1, b_2, ..., b_n} for B, with t ≤ n, and such that each a_i = d_ib_i, where each d_i ≠ 0̄,
and d_i|d_{i+1} for 1 ≤ i < t. Thus B/A ≈ R/d_1 ⊕ R/d_2 ⊕ ⋯ ⊕ R/d_t ⊕ R^{n−t}.
Proof By Theorem 3 in the section Splitting Short Exact Sequences, A has a
free basis {h_1, h_2, ..., h_t}. Let {g_1, g_2, ..., g_n} be a free basis for B, where n ≥ t. The
composition

    R^t →≈ A →⊂ B →≈ R^n ,   e_i ↦ h_i ,   g_i ↦ e_i

is represented by a matrix (a_{i,j}) ∈ R_{n,t} where h_i = a_{1,i}g_1 + a_{2,i}g_2 + ⋯ + a_{n,i}g_n. By
the previous theorem, ∃ invertible matrices U ∈ R_n and V ∈ R_t such that
    U(a_{i,j})V = ( d_1                 )
                  (     d_2          0  )
                  (         ⋱           )
                  (           d_t       )
                  ( 0                 0 )

with d_i|d_{i+1}. Since changing the isomorphisms R^t →≈ A and B →≈ R^n corresponds
to changing the bases {h_1, h_2, ..., h_t} and {g_1, g_2, ..., g_n}, the theorem follows.
Theorem 5 If R is a Euclidean domain and M is a finitely generated R-module,
then M ≈ R/d_1 ⊕ R/d_2 ⊕ ⋯ ⊕ R/d_t ⊕ R^m where each d_i ≠ 0̄, and d_i|d_{i+1} for 1 ≤ i < t.
Proof By hypothesis ∃ a finitely generated free module B and a surjective
homomorphism B → M → 0. Let A be the kernel, so 0 → A → B → M → 0 is
a s.e.s. and B/A ≈ M. The result now follows from the previous theorem.
The way Theorem 5 is stated, some or all of the elements d_i may be units, and for
such d_i, R/d_i = 0̄. If we assume that no d_i is a unit, then the elements d_1, d_2, ..., d_t
are called invariant factors. They are unique up to associates, but we do not bother
with that here. If R = Z and we select the d_i to be positive, they are unique. If
R = F[x] and we select the d_i to be monic, then they are unique. The splitting in
Theorem 5 is not the ultimate because the modules R/d_i may be split into the sum
of other cyclic modules. To prove this we need the following Lemma.
Lemma Suppose R is a PID and b and c are non-zero non-unit elements of R.
Suppose b and c are relatively prime, i.e., there is no prime common to their prime
factorizations. Then bR and cR are comaximal ideals.
Proof There exists an a ∈ R with aR = bR + cR. Since a|b and a|c, a is a unit,
so R = bR + cR.
Theorem 6 Suppose R is a PID and d is a non-zero non-unit element of R.
Let d = p_1^{s_1}p_2^{s_2}⋯p_t^{s_t} be the prime factorization of d. Then the natural map
R/d →≈ R/p_1^{s_1} ⊕ ⋯ ⊕ R/p_t^{s_t} is an isomorphism of R-modules. (The elements p_i^{s_i}
are called elementary divisors of R/d.)
Proof If i ≠ j, p_i^{s_i} and p_j^{s_j} are relatively prime. By the Lemma above, they are
comaximal and thus by the Chinese Remainder Theorem, the natural map is a ring
isomorphism. Since the natural map is also an R-module homomorphism, it is an
R-module isomorphism.
This theorem carries the splitting as far as it can go, as seen by the next exercise.
Exercise Suppose R is a PID, p ∈ R is a prime element, and s ≥ 1. Then the
R-module R/p^s has no proper submodule which is a summand.
To give perspective to this section, here is a brief discussion of torsion submodules.
Definition Suppose M is a module over a domain R. An element m ∈ M is said
to be a torsion element if ∃ r ∈ R with r ≠ 0̄ and mr = 0̄. This is the same as
saying m is dependent. If R = Z, it is the same as saying m has finite order. Denote
by T(M) the set of all torsion elements of M. If T(M) = 0̄, we say that M is torsion
free.
Theorem 7 Suppose M is a module over a domain R. Then T(M) is a submodule
of M and M/T(M) is torsion free.
Proof This is a simple exercise.
Theorem 8  Suppose R is a Euclidean domain and M is a finitely generated R-module which is torsion free. Then M is a free R-module, i.e., M ≈ R^m.
Proof This follows immediately from Theorem 5.
Theorem 9 Suppose R is a Euclidean domain and M is a finitely generated
R-module. Then the following s.e.s. splits.
0 −→T(M) −→M −→M/T(M) −→0
Proof By Theorem 7, M/T(M) is torsion free. By Theorem 8, M/T(M) is a free
R-module, and thus there is a splitting map. Of course this theorem is transparent
anyway, because Theorem 5 gives a splitting of M into a torsion part and a free part.
Note  It follows from Theorem 9 that ∃ a free submodule V of M such that T(M) ⊕ V = M. The first summand T(M) is unique, but the complementary summand V is not unique. V depends upon the splitting map and is unique only up to isomorphism.
To complete this section, here are two more theorems that follow from the work
we have done.
Theorem 10  Suppose T is a domain and T* is the multiplicative group of units of T. If G is a finite subgroup of T*, then G is a cyclic group. Thus if F is a finite field, the multiplicative group F* is cyclic. Thus if p is a prime, (Z_p)* is cyclic.
Proof  This is a corollary to Theorem 5 with R = Z. The multiplicative group G is isomorphic to an additive group Z/d_1 ⊕ Z/d_2 ⊕ ··· ⊕ Z/d_t where each d_i > 1 and d_i | d_(i+1) for 1 ≤ i < t. Every g in the additive group has the property that gd_t = 0̄. So every u ∈ G is a solution to x^(d_t) − 1̄ = 0̄. If t > 1, the equation will have degree less than the number of roots, which is impossible. Thus t = 1 and G is cyclic.
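To see Theorem 10 in action, the sketch below finds all generators of (Z_p)* by brute force (a Python illustration of ours, not part of the text):

    def is_generator(u, p):
        """Does u generate the multiplicative group (Z_p)*?"""
        x, seen = 1, set()
        for _ in range(p - 1):
            x = x * u % p
            seen.add(x)
        return len(seen) == p - 1

    p = 13
    print([u for u in range(1, p) if is_generator(u, p)])   # [2, 6, 7, 11]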
Exercise  For which primes p and q is the group of units (Z_p × Z_q)* a cyclic group?
We know from Exercise 2) on page 59 that an invertible matrix over a field is the
product of elementary matrices. This result also holds for any invertible matrix over
a Euclidean domain.
Theorem 11  Suppose R is a Euclidean domain and A ∈ R_n is a matrix with non-zero determinant. Then by elementary row and column operations, A may be transformed to a diagonal matrix diag(d_1, d_2, ..., d_n) where each d_i ≠ 0̄ and d_i | d_(i+1) for 1 ≤ i < n. Also d_1 generates the ideal generated by the entries of A. Furthermore A is invertible iff each d_i is a unit. Thus if A is invertible, A is the product of elementary matrices.
Proof  It follows from Theorem 3 that A may be transformed to a diagonal matrix with d_i | d_(i+1). Since the determinant of A is not zero, it follows that each d_i ≠ 0̄. Furthermore, the matrix A is invertible iff the diagonal matrix is invertible, which is true iff each d_i is a unit. If each d_i is a unit, then the diagonal matrix is the product of elementary matrices of type 1. Therefore if A is invertible, it is the product of elementary matrices.
Exercise  Let R = Z,

    A = [ 3  11 ]        D = [ 3  11 ]
        [ 0   4 ]            [ 1   4 ]

Perform elementary operations on A and D to obtain diagonal matrices where the first diagonal element divides the second diagonal element. Write D as the product of elementary matrices. Find the characteristic polynomials of A and D. Find an elementary matrix B over Z such that B^(−1)AB is diagonal. Find an invertible matrix C in R_2 such that C^(−1)DC is diagonal. Show C cannot be selected in Q_2.
Jordan Blocks
In this section, we define the two special types of square matrices used in the
Rational and Jordan canonical forms. Note that the Jordan block B(q) is the sum
of a scalar matrix and a nilpotent matrix. A Jordan block displays its eigenvalue
on the diagonal, and is more interesting than the companion matrix C(q). But as
we shall see later, the Rational canonical form will always exist, while the Jordan
canonical form will exist iff the characteristic polynomial factors as the product of
linear polynomials.
Suppose R is a commutative ring, q = a_0 + a_1x + ··· + a_(n−1)x^(n−1) + x^n ∈ R[x] is a monic polynomial of degree n ≥ 1, and V is the R[x]-module V = R[x]/q. V is a torsion module over the ring R[x], but as an R-module, V has a free basis {1, x, x^2, ..., x^(n−1)}. (See the division algorithm in the chapter on rings.) Multiplication by x defines an R-module endomorphism on V, and C(q) will be the matrix of this endomorphism with respect to this basis. Let T : V → V be defined by T(v) = vx. If h(x) ∈ R[x], h(T) is the R-module homomorphism given by multiplication by h(x). The homomorphism from R[x]/q to R[x]/q given by multiplication by h(x) is zero iff h(x) ∈ qR[x]. That is to say q(T) = a_0I + a_1T + ··· + T^n is the zero homomorphism, and h(T) is the zero homomorphism iff h(x) ∈ qR[x].
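As a computational aside (not part of the text; numpy and the helper name companion are our assumptions), one can build C(q) from the coefficients a_0, ..., a_(n−1) and check numerically that q(C(q)) = 0:

    import numpy as np

    def companion(a):
        """Companion matrix C(q) of the monic q = a[0] + a[1]x + ... + x^n,
        in this section's convention: 1's on the subdiagonal, -a_i in the
        last column."""
        n = len(a)
        C = np.zeros((n, n))
        C[1:, :-1] = np.eye(n - 1)          # subdiagonal of 1's
        C[:, -1] = -np.array(a)             # last column is -a_0, ..., -a_(n-1)
        return C

    # q(x) = 2 - 3x + x^2: verify q(C) = C^2 - 3C + 2I = 0
    C = companion([2.0, -3.0])
    print(C @ C - 3 * C + 2 * np.eye(2))    # the zero matrix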
Theorem  Let V have the free basis {1, x, x^2, ..., x^(n−1)}. The companion matrix representing T is

           [ 0   ...  ...   0   −a_0     ]
           [ 1    0   ...   0   −a_1     ]
    C(q) = [ 0    1         0   −a_2     ]
           [ :         ⋱    :     :      ]
           [ 0   ...  ...   1   −a_(n−1) ]

The characteristic polynomial of C(q) is q, and |C(q)| = (−1)^n a_0. Finally, if h(x) ∈ R[x], h(C(q)) is zero iff h(x) ∈ qR[x].
Theorem  Suppose λ ∈ R and q(x) = (x − λ)^n. Let V have the free basis {1, (x − λ), (x − λ)^2, ..., (x − λ)^(n−1)}. Then the matrix representing T is

           [ λ    0   ...  ...  0 ]
           [ 1    λ    0   ...  0 ]
    B(q) = [ 0    1    λ        : ]
           [ :         ⋱    ⋱   : ]
           [ 0   ...  ...   1   λ ]

The characteristic polynomial of B(q) is q, and |B(q)| = λ^n = (−1)^n a_0. Finally, if h(x) ∈ R[x], h(B(q)) is zero iff h(x) ∈ qR[x].
Note  For n = 1, C(a_0 + x) = B(a_0 + x) = (−a_0). This is the only case where a block matrix may be the zero matrix.
Note  In B(q), if you wish to have the 1's above the diagonal, reverse the order of the basis for V.
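A matching sketch for the Jordan block (again our own illustration, with our helper name): B(q) is the scalar matrix plus the nilpotent subdiagonal matrix, so (B − λI)^n = 0.

    import numpy as np

    def jordan_block(lam, n):
        """B(q) for q = (x - lam)^n: lam on the diagonal, 1's on the
        subdiagonal (the convention of this section)."""
        B = lam * np.eye(n)
        B[1:, :-1] += np.eye(n - 1)
        return B

    B = jordan_block(2.0, 3)
    print(np.linalg.matrix_power(B - 2.0 * np.eye(3), 3))   # the zero matrix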
Jordan Canonical Form
We are finally ready to prove the Rational and Jordan forms. Using the previous
sections, all that’s left to do is to put the pieces together. (For an overview of Jordan
form, see the section in Chapter 5.)
Suppose R is a commutative ring, V is an R-module, and T : V → V is an R-module homomorphism. Define a scalar multiplication V × R[x] → V by

    v(a_0 + a_1x + ··· + a_r x^r) = va_0 + T(v)a_1 + ··· + T^r(v)a_r.
Theorem 1 Under this scalar multiplication, V is an R[x]-module.
This is just an observation, but it is one of the great tricks in mathematics.
Questions about the transformation T are transferred to questions about the module
V over the ring R[x]. And in the case R is a field, R[x] is a Euclidean domain and so
we know almost everything about V as an R[x]-module.
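In matrix terms the scalar multiplication reads vh(x) = h(T)(v). A small numpy sketch of this action (our illustration; the helper name act is ours):

    import numpy as np

    def act(v, coeffs, T):
        """v(a_0 + a_1 x + ... + a_r x^r) = v a_0 + T(v) a_1 + ... + T^r(v) a_r."""
        result = np.zeros_like(v)
        w = v.copy()
        for a in coeffs:            # accumulate a_i * T^i(v)
            result = result + a * w
            w = T @ w
        return result

    T = np.array([[0.0, -2.0], [1.0, 3.0]])   # companion matrix of q = x^2 - 3x + 2
    v = np.array([1.0, 0.0])
    print(act(v, [2.0, -3.0, 1.0], T))        # q(T)(v) is the zero vector, since q(T) = 0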
Now in this section, we suppose R is a field F, V is a finitely generated F-module, T : V → V is a linear transformation, and V is an F[x]-module with vx = T(v). Our goal is to select a basis for V such that the matrix representing T is in some simple form. A submodule of V_F[x] is a submodule of V_F which is invariant under T. We know V_F[x] is the sum of cyclic modules from Theorems 5 and 6 in the section on Euclidean Domains. Since V is finitely generated as an F-module, the free part of this decomposition will be zero. In the section on Jordan Blocks, a basis is selected for these cyclic modules and the matrix representing T is described. This gives the Rational Canonical Form and that is all there is to it. If all the eigenvalues for T are in F, we pick another basis for each of the cyclic modules (see the second theorem in the section on Jordan Blocks). Then the matrix representing T is called the Jordan Canonical Form. Now we say all this again with a little more detail.
From Theorem 5 in the section on Euclidean Domains, it follows that

    V_F[x] ≈ F[x]/d_1 ⊕ F[x]/d_2 ⊕ ··· ⊕ F[x]/d_t

where each d_i is a monic polynomial of degree ≥ 1, and d_i | d_(i+1). Pick {1, x, x^2, ..., x^(m−1)} as the F-basis for F[x]/d_i where m is the degree of the polynomial d_i.
Theorem 2  With respect to this basis, the matrix representing T is the block diagonal matrix

    diag(C(d_1), C(d_2), ..., C(d_t)).

The characteristic polynomial of T is p = d_1 d_2 ··· d_t and p(T) = 0̄. This is a type of canonical form but it does not seem to have a name.
Now we apply Theorem 6 to each F[x]/d_i. This gives

    V_F[x] ≈ F[x]/p_1^(s_1) ⊕ ··· ⊕ F[x]/p_r^(s_r)

where the p_i are irreducible monic polynomials of degree at least 1. The p_i need not be distinct. Pick an F-basis for each F[x]/p_i^(s_i) as before.
Theorem 3  With respect to this basis, the matrix representing T is the block diagonal matrix

    diag(C(p_1^(s_1)), C(p_2^(s_2)), ..., C(p_r^(s_r))).

The characteristic polynomial of T is p = p_1^(s_1) ··· p_r^(s_r) and p(T) = 0̄. This is called the Rational canonical form for T.
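As an illustration (ours, not the text's), the Rational canonical form can be assembled as a block diagonal matrix of companion blocks:

    import numpy as np

    def block_diag(blocks):
        """Square blocks down the diagonal, zeros elsewhere."""
        n = sum(b.shape[0] for b in blocks)
        M = np.zeros((n, n))
        k = 0
        for b in blocks:
            s = b.shape[0]
            M[k:k + s, k:k + s] = b
            k += s
        return M

    # Elementary divisors x^2 and x - 5 give companion blocks C(x^2) and C(x - 5).
    C_x2 = np.array([[0.0, 0.0],
                     [1.0, 0.0]])
    C_x5 = np.array([[5.0]])
    print(block_diag([C_x2, C_x5]))   # the Rational canonical form, a 3 x 3 matrix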
Now suppose the characteristic polynomial of T factors in F[x] as the product of linear polynomials. Thus in the theorem above, p_i = x − λ_i and

    V_F[x] ≈ F[x]/(x − λ_1)^(s_1) ⊕ ··· ⊕ F[x]/(x − λ_r)^(s_r)

is an isomorphism of F[x]-modules. Pick {1, (x − λ_i), (x − λ_i)^2, ..., (x − λ_i)^(m−1)} as the F-basis for F[x]/(x − λ_i)^(s_i) where m is s_i.
Theorem 4  With respect to this basis, the matrix representing T is the block diagonal matrix

    diag(B((x − λ_1)^(s_1)), B((x − λ_2)^(s_2)), ..., B((x − λ_r)^(s_r))).

The characteristic polynomial of T is p = (x − λ_1)^(s_1) ··· (x − λ_r)^(s_r) and p(T) = 0̄. This is called the Jordan canonical form for T. Note that the λ_i need not be distinct.
Note A diagonal matrix is in Rational canonical form and in Jordan canonical
form. This is the case where each block is one by one. Of course a diagonal matrix
is about as canonical as you can get.
Exercise This section is loosely written, so it is important to use the transpose
principle to write three other versions of the last two theorems.
Exercise  Suppose F is a field of characteristic 0 and T ∈ F_n has trace(T^i) = 0̄ for 0 < i ≤ n. Show T is nilpotent. Let p ∈ F[x] be the characteristic polynomial of T. The polynomial p may not factor into linears in F[x], and thus T may have no conjugate in F_n which is in Jordan form. However this exercise can still be worked using Jordan form. This is based on the fact that there exists a field F̄ containing F as a subfield, such that p factors into linears in F̄[x]. This fact is not proved in this book, but it is assumed for this exercise. So ∃ an invertible matrix U ∈ F̄_n so that U^(−1)TU is in Jordan form, and of course, T is nilpotent iff U^(−1)TU is nilpotent. The point is that it suffices to consider the case where T is in Jordan form, and to show the diagonal elements are all zero.

So suppose T is in Jordan form and trace(T^i) = 0̄ for 1 ≤ i ≤ n. Thus trace(p(T)) = a_0 n where a_0 is the constant term of p(x). We know p(T) = 0̄ and thus trace(p(T)) = 0̄, and thus a_0 n = 0̄. Since the field has characteristic 0, a_0 = 0̄ and so 0̄ is an eigenvalue of T. This means that one block of T is a strictly lower triangular matrix. Removing this block leaves a smaller matrix which still satisfies the hypothesis, and the result follows by induction on the size of T. This exercise illustrates the power and facility of Jordan form. It also has a cute corollary.
Corollary  Suppose F is a field of characteristic 0, n ≥ 1, and (λ_1, λ_2, ..., λ_n) ∈ F^n satisfies λ_1^i + λ_2^i + ··· + λ_n^i = 0̄ for each 1 ≤ i ≤ n. Then λ_i = 0̄ for 1 ≤ i ≤ n.
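The forward direction of the exercise is easy to check numerically: a strictly lower triangular matrix is nilpotent, and all traces of its powers vanish (a Python illustration of ours):

    import numpy as np

    T = np.array([[0.0,  0.0, 0.0],
                  [4.0,  0.0, 0.0],
                  [1.0, -7.0, 0.0]])   # strictly lower triangular, so T^3 = 0
    print([np.trace(np.linalg.matrix_power(T, i)) for i in range(1, 4)])   # [0.0, 0.0, 0.0]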
To conclude this section, here are a few comments on the minimal polynomial of a
linear transformation. This part should be studied only if you need it. Suppose V is
an n-dimensional vector space over a field F and T : V →V is a linear transformation.
As before we make V a module over F[x] with T(v) = vx.
Definition  Ann(V_F[x]) is the set of all h ∈ F[x] which annihilate V, i.e., which satisfy V h = 0̄. This is a non-zero ideal of F[x] and is thus generated by a unique monic polynomial u(x) ∈ F[x], Ann(V_F[x]) = uF[x]. The polynomial u is called the minimal polynomial of T. Note that u(T) = 0̄ and if h(x) ∈ F[x], h(T) = 0̄ iff h is a multiple of u in F[x]. If p(x) ∈ F[x] is the characteristic polynomial of T, p(T) = 0̄ and thus p is a multiple of u.
Now we state this again in terms of matrices. Suppose A ∈ F_n is a matrix representing T. Then u(A) = 0̄ and if h(x) ∈ F[x], h(A) = 0̄ iff h is a multiple of u in F[x]. If p(x) ∈ F[x] is the characteristic polynomial of A, then p(A) = 0̄ and thus p is a multiple of u. The polynomial u is also called the minimal polynomial of A. Note that these properties hold for any matrix representing T, and thus similar matrices have the same minimal polynomial. If A is given to start with, use the linear transformation T : F^n → F^n determined by A to define the polynomial u.
Now suppose q ∈ F[x] is a monic polynomial and C(q) ∈ F_n is the companion matrix defined in the section Jordan Blocks. Whenever q(x) = (x − λ)^n, let B(q) ∈ F_n be the Jordan block matrix also defined in that section. Recall that q is the characteristic polynomial and the minimal polynomial of each of these matrices. This together with the rational form and the Jordan form will allow us to understand the relation of the minimal polynomial to the characteristic polynomial.
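For readers who want to experiment, the sketch below computes the minimal polynomial numerically by looking for the first power of A that is a linear combination of the earlier powers (our own helper name; it works over R only, and up to floating-point tolerance):

    import numpy as np

    def minimal_poly(A, tol=1e-9):
        """Coefficients c_0, ..., c_(k-1), 1 of the monic minimal polynomial
        of A: the first power A^k that is a linear combination of
        I, A, ..., A^(k-1), found by least squares on flattened matrices."""
        n = A.shape[0]
        powers = [np.eye(n).flatten()]
        while True:
            B = np.column_stack(powers)
            target = np.linalg.matrix_power(A, len(powers)).flatten()
            c, *_ = np.linalg.lstsq(B, target, rcond=None)
            if np.linalg.norm(B @ c - target) < tol:
                return np.append(-c, 1.0)       # A^k - (c_0 I + ...) = 0
            powers.append(target)

    A = np.diag([2.0, 2.0, 3.0])
    print(minimal_poly(A))   # [ 6. -5.  1.]: u(x) = x^2 - 5x + 6 = (x-2)(x-3),
                             # while the characteristic polynomial is (x-2)^2 (x-3)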
Exercise  Suppose A_i ∈ F_(n_i) has q_i as its characteristic polynomial and its minimal polynomial, and A = diag(A_1, A_2, ..., A_n) is the block diagonal matrix with the A_i down the diagonal and zeros elsewhere. Find the characteristic polynomial and the minimal polynomial of A.
Exercise  Suppose A ∈ F_n.
1) Suppose A is the matrix displayed in Theorem 2 above. Find the characteristic
and minimal polynomials of A.
2) Suppose A is the matrix displayed in Theorem 3 above. Find the characteristic
and minimal polynomials of A.
3) Suppose A is the matrix displayed in Theorem 4 above. Find the characteristic
and minimal polynomials of A.
4) Suppose λ ∈ F. Show λ is a root of the characteristic polynomial of A iff λ
is a root of the minimal polynomial of A. Show that if λ is a root, its order
in the characteristic polynomial is at least as large as its order in the minimal
polynomial.
5) Suppose F̄ is a field containing F as a subfield. Show that the minimal polynomial of A ∈ F_n is the same as the minimal polynomial of A considered as a matrix in F̄_n. (This funny looking exercise is a little delicate.)
6) Let F = R and

       A = [  5  −1   3 ]
           [  0   2   0 ]
           [ −3   1  −1 ]

Find the characteristic and minimal polynomials of A.
Determinants
In the chapter on matrices, it is stated without proof that the determinant of the product is the product of the determinants (see page 63). The purpose of this section is to give a proof of this. We suppose R is a commutative ring, C is an R-module, n ≥ 2, and B_1, B_2, ..., B_n is a sequence of R-modules.
Definition  A map f : B_1 ⊕ B_2 ⊕ ··· ⊕ B_n → C is R-multilinear means that if 1 ≤ i ≤ n, and b_j ∈ B_j is fixed for each j ≠ i, then f(b_1, b_2, ..., b_i, ..., b_n), regarded as a function of the i-th variable alone, defines an R-linear map from B_i to C.
Theorem The set of all R-multilinear maps is an R-module.
Proof  From the first exercise in Chapter 5, the set of all functions from B_1 ⊕ B_2 ⊕ ··· ⊕ B_n to C is an R-module (see page 69). It must be seen that the R-multilinear maps form a submodule. It is easy to see that if f_1 and f_2 are R-multilinear, so is f_1 + f_2. Also if f is R-multilinear and r ∈ R, then (fr) is R-multilinear.
From here on, suppose B_1 = B_2 = ··· = B_n = B.
Definition
1) f is symmetric means f(b_1, ..., b_n) = f(b_τ(1), ..., b_τ(n)) for all
   permutations τ on {1, 2, ..., n}.

2) f is skew-symmetric if f(b_1, ..., b_n) = sign(τ)f(b_τ(1), ..., b_τ(n)) for all τ.

3) f is alternating if f(b_1, ..., b_n) = 0̄ whenever some b_i = b_j for i ≠ j.
Theorem
i) Each of these three types defines a submodule of the set of all
R-multilinear maps.
ii) Alternating ⇒ skew-symmetric.
iii) If no element of C has order 2, then alternating ⇐⇒ skew-symmetric.
Proof  Part i) is immediate. To prove ii), assume f is alternating. It suffices to show that f(b_1, ..., b_n) = −f(b_τ(1), ..., b_τ(n)) where τ is a transposition. For simplicity, assume τ = (1, 2). Then 0̄ = f(b_1 + b_2, b_1 + b_2, b_3, ..., b_n) = f(b_1, b_2, b_3, ..., b_n) + f(b_2, b_1, b_3, ..., b_n) and the result follows. To prove iii), suppose f is skew-symmetric and no element of C has order 2, and show f is alternating. Suppose for convenience that b_1 = b_2 and show f(b_1, b_1, b_3, ..., b_n) = 0̄. If we let τ be the transposition (1, 2), we get f(b_1, b_1, b_3, ..., b_n) = −f(b_1, b_1, b_3, ..., b_n), and so 2f(b_1, b_1, b_3, ..., b_n) = 0̄, and the result follows.
Now we are ready for the determinant. Suppose C = R. In this case multilinear maps are usually called multilinear forms. Suppose B is R^n with the canonical basis {e_1, e_2, ..., e_n}. (We think of a matrix A ∈ R_n as n column vectors, i.e., as an element of B ⊕ B ⊕ ··· ⊕ B.) First we recall the definition of determinant.
Suppose A = (a_(i,j)) ∈ R_n. Define d : B ⊕ B ⊕ ··· ⊕ B → R by

    d(a_(1,1)e_1 + a_(2,1)e_2 + ··· + a_(n,1)e_n, ....., a_(1,n)e_1 + a_(2,n)e_2 + ··· + a_(n,n)e_n)
        = Σ_(all τ) sign(τ)(a_(τ(1),1) a_(τ(2),2) ··· a_(τ(n),n)) = |A|.
The next theorem is from the section on determinants in Chapter 4.
Theorem  d is an alternating multilinear form with d(e_1, e_2, ..., e_n) = 1̄.
If c ∈ R, dc is an alternating multilinear form, because the set of alternating forms
is an R-module. It turns out that this is all of them, as seen by the following theorem.
Theorem  Suppose f : B ⊕ B ⊕ ··· ⊕ B → R is an alternating multilinear form. Then f = d·f(e_1, e_2, ..., e_n). This means f is the multilinear form d times the scalar f(e_1, e_2, ..., e_n). In other words, if A = (a_(i,j)) ∈ R_n, then

    f(a_(1,1)e_1 + a_(2,1)e_2 + ··· + a_(n,1)e_n, ....., a_(1,n)e_1 + a_(2,n)e_2 + ··· + a_(n,n)e_n) = |A|f(e_1, e_2, ..., e_n).

Thus the set of alternating forms is a free R-module of dimension 1, and the determinant is a generator.
Proof  For n = 2, you can simply write it out:

    f(a_(1,1)e_1 + a_(2,1)e_2, a_(1,2)e_1 + a_(2,2)e_2)
        = a_(1,1)a_(1,2)f(e_1, e_1) + a_(1,1)a_(2,2)f(e_1, e_2) + a_(2,1)a_(1,2)f(e_2, e_1) + a_(2,1)a_(2,2)f(e_2, e_2)
        = (a_(1,1)a_(2,2) − a_(1,2)a_(2,1))f(e_1, e_2) = |A|f(e_1, e_2).

For the general case,

    f(a_(1,1)e_1 + a_(2,1)e_2 + ··· + a_(n,1)e_n, ....., a_(1,n)e_1 + a_(2,n)e_2 + ··· + a_(n,n)e_n)
        = Σ a_(i_1,1) a_(i_2,2) ··· a_(i_n,n) f(e_(i_1), e_(i_2), ..., e_(i_n))

where the sum is over all 1 ≤ i_1 ≤ n, 1 ≤ i_2 ≤ n, ..., 1 ≤ i_n ≤ n. However, if any i_s = i_t for s ≠ t, that term is 0̄ because f is alternating. Therefore the sum is just

    Σ_(all τ) a_(τ(1),1) a_(τ(2),2) ··· a_(τ(n),n) f(e_(τ(1)), e_(τ(2)), ..., e_(τ(n)))
        = Σ_(all τ) sign(τ) a_(τ(1),1) a_(τ(2),2) ··· a_(τ(n),n) f(e_1, e_2, ..., e_n)
        = |A|f(e_1, e_2, ..., e_n).
This incredible classification of these alternating forms makes the proof of the
following theorem easy. (See the third theorem on page 63.)
Theorem  If C, A ∈ R_n, then |CA| = |C||A|.
Proof  Suppose C ∈ R_n. Define f : R_n → R by f(A) = |CA|. In the notation of the previous theorem, B = R^n and R_n = R^n ⊕ R^n ⊕ ··· ⊕ R^n. If A ∈ R_n, A = (A_1, A_2, ..., A_n) where A_i ∈ R^n is column i of A, and f : R^n ⊕ ··· ⊕ R^n → R has f(A_1, A_2, ..., A_n) = |CA|. Use the fact that CA = (CA_1, CA_2, ..., CA_n) to show that f is an alternating multilinear form. By the previous theorem, f(A) = |A|f(e_1, e_2, ..., e_n). Since f(e_1, e_2, ..., e_n) = |CI| = |C|, it follows that |CA| = f(A) = |A||C|.
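The permutation-sum definition of d used in this section is easy to implement, and a numerical example illustrates |CA| = |C||A| (a Python illustration of ours):

    import numpy as np
    from itertools import permutations

    def det_by_permutations(A):
        """|A| as defined above: the sum over all permutations tau of
        sign(tau) * a_(tau(1),1) * ... * a_(tau(n),n)."""
        n = A.shape[0]
        total = 0.0
        for tau in permutations(range(n)):
            # sign(tau) via counting inversions
            inv = sum(1 for i in range(n) for j in range(i + 1, n) if tau[i] > tau[j])
            term = (-1) ** inv
            for j in range(n):
                term *= A[tau[j], j]
            total += term
        return total

    C = np.array([[1.0, 2.0], [3.0, 4.0]])
    A = np.array([[0.0, 1.0], [5.0, 2.0]])
    print(det_by_permutations(C @ A), det_by_permutations(C) * det_by_permutations(A))
    # both are 10.0, illustrating |CA| = |C||A|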
Dual Spaces
The concept of dual module is basic, not only in algebra, but also in other areas such as differential geometry and topology. If V is a finitely generated vector space over a field F, its dual V* is defined as V* = Hom_F(V, F). V* is isomorphic to V, but in general there is no natural isomorphism from V to V*. However there is a natural isomorphism from V to V**, and so V* is the dual of V and V may be considered to be the dual of V*. This remarkable fact has many expressions in mathematics.
For example, a tangent plane to a differentiable manifold is a real vector space. The
union of these spaces is the tangent bundle, while the union of the dual spaces is the
cotangent bundle. Thus the tangent (cotangent) bundle may be considered to be the
dual of the cotangent (tangent) bundle. The sections of the tangent bundle are called
vector fields while the sections of the cotangent bundle are called 1-forms.
In algebraic topology, homology groups are derived from chain complexes, while
cohomology groups are derived from the dual chain complexes. The sum of the
cohomology groups forms a ring, while the sum of the homology groups does not.
Thus the concept of dual module has considerable power. We develop here the basic
theory of dual modules.
Suppose R is a commutative ring and W is an R-module.
Definition  If M is an R-module, let H(M) be the R-module H(M) = Hom_R(M, W). If M and N are R-modules and g : M → N is an R-module homomorphism, let H(g) : H(N) → H(M) be defined by H(g)(f) = f ◦ g. Note that H(g) is an R-module homomorphism.

    [Diagram: g : M → N, f : N → W, and H(g)(f) = f ◦ g : M → W.]
Theorem
i)   If M_1 and M_2 are modules, H(M_1 ⊕ M_2) ≈ H(M_1) ⊕ H(M_2).

ii)  If I : M → M is the identity, then H(I) : H(M) → H(M) is the identity.

iii) If g : M_1 → M_2 and h : M_2 → M_3 are R-module homomorphisms, then
     H(g) ◦ H(h) = H(h ◦ g). If f : M_3 → W is a homomorphism, then
     (H(g) ◦ H(h))(f) = H(h ◦ g)(f) = f ◦ h ◦ g.

    [Diagram: M_1 → M_2 → M_3 → W via g, h, f, with f ◦ h : M_2 → W and f ◦ h ◦ g : M_1 → W.]
Note  In the language of category theory, H is a contravariant functor from the category of R-modules to itself.
Theorem  If g : M → N is an isomorphism, then H(g) : H(N) → H(M) is an isomorphism, and H(g^(−1)) = H(g)^(−1).

Proof
    I_(H(N)) = H(I_N) = H(g ◦ g^(−1)) = H(g^(−1)) ◦ H(g)
    I_(H(M)) = H(I_M) = H(g^(−1) ◦ g) = H(g) ◦ H(g^(−1))
Theorem
i) If g : M →N is a surjective homomorphism, then H(g) : H(N) →H(M)
is injective.
ii) If g : M →N is an injective homomorphism and g(M) is a summand
of N, then H(g) : H(N) →H(M) is surjective.
iii) If R is a field, then g is surjective (injective) iff H(g) is injective
(surjective).
Proof This is a good exercise.
For the remainder of this section, suppose W = R_R. In this case H(M) = Hom_R(M, R) is denoted by H(M) = M* and H(g) is denoted by H(g) = g*.
Theorem  Suppose M has a finite free basis {v_1, ..., v_n}. Define v_i* ∈ M* by v_i*(v_1r_1 + ··· + v_nr_n) = r_i. Thus v_i*(v_j) = δ_(i,j). Then v_1*, ..., v_n* is a free basis for M*, called the dual basis.
Proof  First consider the case of R^n = R_(n,1), with basis {e_1, ..., e_n}, where e_i is the column vector with 1̄ in the i-th place and 0̄ elsewhere. We know (R^n)* ≈ R_(1,n), i.e., any homomorphism from R^n to R is given by a 1 × n matrix. Now R_(1,n) is free with dual basis {e_1*, ..., e_n*} where e_i* = (0, ..., 0, 1, 0, ..., 0), with the 1̄ in the i-th place. For the general case, let g : R^n −→ M be given by g(e_i) = v_i. Then g* : M* → (R^n)* sends v_i* to e_i*. Since g* is an isomorphism, {v_1*, ..., v_n*} is a basis for M*.
Theorem  Suppose M has a basis {v_1, ..., v_m} and N has a basis {w_1, ..., w_n}, and g : M → N is the homomorphism given by A = (a_(i,j)) ∈ R_(n,m). This means g(v_j) = a_(1,j)w_1 + ··· + a_(n,j)w_n. Then the matrix of g* : N* → M* with respect to the dual bases is given by A^t.
Proof  g*(w_i*) is a homomorphism from M to R. Evaluation on v_j gives g*(w_i*)(v_j) = (w_i* ◦ g)(v_j) = w_i*(g(v_j)) = w_i*(a_(1,j)w_1 + ··· + a_(n,j)w_n) = a_(i,j). Thus g*(w_i*) = a_(i,1)v_1* + ··· + a_(i,m)v_m*, and thus g* is represented by A^t.
Exercise  If U is an R-module, define φ_U : U* ⊕ U → R by φ_U(f, u) = f(u). Show that φ_U is R-bilinear. Suppose g : M → N is an R-module homomorphism, f ∈ N* and v ∈ M. Show that φ_N(f, g(v)) = φ_M(g*(f), v). Now suppose M = N = R^n and g : R^n → R^n is represented by a matrix A ∈ R_n. Suppose f ∈ (R^n)* and v ∈ R^n. Use the theorem above to show that φ : (R^n)* ⊕ R^n → R has the property φ(f, Av) = φ(A^t f, v). This is with the elements of R^n and (R^n)* written as column vectors. If the elements of R^n are written as column vectors and the elements of (R^n)* are written as row vectors, the formula is φ(f, Av) = φ(fA, v). Of course this is just the matrix product fAv. Dual spaces are confusing, and this exercise should be worked out completely.
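A numerical spot-check of the last formula, with f and v as column vectors and φ the dot product (our illustration, not part of the text):

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    f = np.array([5.0, -1.0])
    v = np.array([2.0, 7.0])
    # phi(f, Av) = phi(A^t f, v): both sides equal 46.0 here
    print(np.dot(f, A @ v), np.dot(A.T @ f, v))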
Definition  “Double dual” is a “covariant” functor, i.e., if g : M → N is a homomorphism, then g** : M** → N**. For any module M, define α : M → M** by: α(m) : M* → R is the homomorphism which sends f ∈ M* to f(m) ∈ R, i.e., α(m) is given by evaluation at m. Note that α is a homomorphism.
Theorem  If g : M → N is a homomorphism, then the following diagram is commutative.

    [Diagram: α : M → M**, α : N → N**, g : M → N, g** : M** → N**, and α ◦ g = g** ◦ α.]
Proof  On M, α is given by α(v) = φ_M(−, v). On N, α(u) = φ_N(−, u). The proof follows from the equation φ_N(f, g(v)) = φ_M(g*(f), v).
Theorem  If M has a finite free basis {v_1, ..., v_n}, then α : M → M** is an isomorphism.

Proof  {α(v_1), ..., α(v_n)} is the dual basis of {v_1*, ..., v_n*}, i.e., α(v_i) = (v_i*)*.
Note Suppose R is a field and C is the category of finitely generated vector spaces
over R. In the language of category theory, α is a natural equivalence between the
identity functor and the double dual.
Note  For finitely generated vector spaces, α is used to identify V and V**. Under this identification V* is the dual of V and V is the dual of V*. Also, if {v_1, ..., v_n} is a basis for V and {v_1*, ..., v_n*} its dual basis, then {v_1, ..., v_n} is the dual basis for {v_1*, ..., v_n*}.
In general there is no natural way to identify V and V*. However for real inner product spaces there is.
Theorem  Let R = R and V be an n-dimensional real inner product space. Then β : V → V* given by β(v) = (v, −) is an isomorphism.

Proof  β is injective and V and V* have the same dimension.
Note  If β is used to identify V with V*, then φ_V : V* ⊕ V → R is just the dot product V ⊕ V → R.
Note  If {v_1, ..., v_n} is any orthonormal basis for V, {β(v_1), ..., β(v_n)} is the dual basis of {v_1, ..., v_n}, that is, β(v_i) = v_i*. The isomorphism β : V → V* defines an inner product on V*, and under this structure, β is an isometry. If {v_1, ..., v_n} is an orthonormal basis for V, {v_1*, ..., v_n*} is an orthonormal basis for V*. Also, if U is another n-dimensional IPS and f : V → U is an isometry, then f* : U* → V* is an isometry and the following diagram commutes.

    [Diagram: β : V → V*, β : U → U*, f : V → U, f* : U* → V*, with β = f* ◦ β ◦ f.]
Exercise  Suppose R is a commutative ring, T is an infinite index set, and for each t ∈ T, R_t = R. Show (⊕_(t∈T) R_t)* is isomorphic to R^T = Π_(t∈T) R_t. Now let T = Z^+, R = R, and M = ⊕_(t∈T) R_t. Show M* is not isomorphic to M.
Index
Abelian group, 20, 71
Algebraically closed field, 46, 97
Alternating group, 32
Ascending chain condition, 112
Associate elements in a domain, 47, 109
Automorphism
of groups, 29
of modules, 70
of rings, 43
Axiom of choice, 10
Basis or free basis
canonical or standard for R^n, 72, 79
of a module, 78, 83
Bijective or one-to-one correspondence, 7
Binary operation, 19
Boolean algebras, 52
Boolean rings, 51
Cancellation law
in a group, 20
in a ring, 39
Cartesian product, 2, 11
Cayley’s theorem, 31
Cayley-Hamilton theorem, 66, 98, 125
Center of group, 22
Change of basis, 83
Characteristic of a ring, 50
Characteristic polynomial
of a homomorphism, 85, 95
of a matrix, 66
Chinese remainder theorem, 50, 108
Classical adjoint of a matrix, 63
Cofactor of a matrix, 62
Comaximal ideals, 108, 120
Commutative ring, 37
Complex numbers, 1, 40, 46, 47, 97, 104
Conjugate, 64
Conjugation by a unit, 44
Contravariant functor, 131
Coproduct or sum of modules, 76
Coset, 24, 42, 74
Cycle, 32
Cyclic
group, 23
module, 107
Determinant
of a homomorphism, 85
of a matrix, 60, 128
Diagonal matrix, 56
Dimension of a free module, 83
Division algorithm, 45
Domain
euclidean, 116
integral domain, 39
of a function, 5
principal ideal, 46
unique factorization, 111
Dual basis, 132
Dual spaces, 130
Eigenvalues, 95
Eigenvectors, 95
Elementary divisors, 119, 120
Elementary matrices, 58
Elementary operations, 57, 122
Endomorphism of a module, 70
Equivalence class, 4
Equivalence relation, 4
Euclidean algorithm, 14
Euclidean domain, 116
Evaluation map, 47, 49
Even permutation, 32
Exponential of a matrix, 106
Factorization domain (FD), 111
Fermat’s little theorem, 50
Field, 39
Formal power series, 113
Fourier series, 100
Free basis, 72, 78, 79, 83
Free R-module, 78
Function or map, 6
bijective, 7
injective, 7
surjective, 7
Function space Y^T
as a group, 22, 36
as a module, 69
as a ring, 44
as a set, 12
Fundamental theorem of algebra, 46
Gauss, 113
General linear group Gl_n(R), 55
Generating sequence in a module, 78
Generators of Z_n, 40
Geometry of determinant, 90
Gram-Schmidt orthonormalization, 100
Graph of a function, 6
Greatest common divisor, 15
Group, 19
abelian, 20
additive, 20
cyclic, 23
multiplicative, 19
symmetric, 31
Hausdorff maximality principle, 3, 87,
109
Hilbert, 113
Homogeneous equation, 60
Homomorphism
of groups, 23
of rings, 42
of modules, 69
Homomorphism of quotient
group, 29
module, 74
ring, 44
Ideal
left, 41
maximal, 109
of a ring, 41
prime, 109
principal, 42, 46
right, 41
Idempotent element in a ring, 49, 51
Image of a function, 7
Independent sequence in a module, 78
Index of a subgroup, 25
Index set, 2
Induction, 13
Injective or one-to-one, 7, 79
Inner product spaces, 98
Integers mod n, 27, 40
Integers, 1, 14
Invariant factors, 119
Inverse image, 7
Invertible or non-singular matrix, 55
Irreducible element, 47, 110
Isometries of a square, 26, 34
Isometry, 101
Isomorphism
of groups, 29
of modules, 70
of rings, 43
Jacobian matrix, 91
Jordan block, 96, 123
Jordan canonical form, 96, 123, 125
Kernel, 28, 43, 70
Least common multiple, 17, 18
Linear combination, 78
Linear ordering, 3
Linear transformation, 85
Matrix
elementary, 58
invertible, 55
representing a linear transformation,
84
triangular, 56
Maximal
ideal, 109
independent sequence, 86, 87
monotonic subcollection, 4
subgroup, 114
Minimal polynomial, 127
Minor of a matrix, 62
Module over a ring, 68
Monomial, 48
Monotonic collection of sets, 4
Multilinear forms, 129
Multiplicative group of a finite field, 121
Nilpotent
element, 56
homomorphism, 93
Noetherian ring, 112
Normal subgroup, 26
Odd permutation, 32
Onto or surjective, 7, 79
Order of an element or group, 23
Orthogonal group O(n), 102
Orthogonal vectors, 99
Orthonormal sequence, 99
Partial ordering, 3
Partition of a set, 5
Permutation, 31
Pigeonhole principle, 8, 39
Polynomial ring, 45
Power set, 12
Prime
element, 110
ideal, 109
integer, 16
Principal ideal domain (PID), 46
Principal ideal, 42
Product
of groups, 34, 35
of modules, 75
of rings, 49
of sets, 2, 11
Projection maps, 11
Quotient group, 27
Quotient module, 74
Quotient ring, 42
Range of a function, 6
Rank of a matrix, 59, 89
Rational canonical form, 107, 125
Relation, 3
Relatively prime
integers, 16
elements in a PID, 119
Right and left inverses of functions, 10
Ring, 38
Root of a polynomial, 46
Row echelon form, 59
Scalar matrix, 57
Scalar multiplication, 21, 38, 54, 71
Self adjoint, 103, 105
Short exact sequence, 115
Sign of a permutation, 60
Similar matrices, 64
Solutions of equations, 9, 59, 81
Splitting map, 114
Standard basis for R^n, 72, 79
Strips (horizontal and vertical), 8
Subgroup, 14, 21
Submodule, 69
Subring, 41
Summand of a module, 77, 115
Surjective or onto, 7, 79
Symmetric groups, 31
Symmetric matrix, 103
Torsion element of a module, 121
Trace
of a homomorphism, 85
of a matrix, 65
Transpose of a matrix, 56, 103, 132
Transposition, 32
Unique factorization,
in principal ideal domains, 113
of integers, 16
Unique factorization domain (UFD), 111
Unit in a ring, 38
Vector space, 67, 85
Volume preserving homomorphism, 90
Zero divisor in a ring, 39

ii

E.H. Connell Department of Mathematics University of Miami P.O. Box 249085 Coral Gables, Florida 33124 USA ec@math.miami.edu

Mathematical Subject Classifications (1991): 12-01, 13-01, 15-01, 16-01, 20-01

c 1999

E.H. Connell [http://www.math.miami.edu/∼ec/book/]

November 30, 2000

iii

Introduction
In 1965 I first taught an undergraduate course in abstract algebra. It was fun to teach because the material was interesting and the class was outstanding. Five of those students later earned a Ph.D. in mathematics. Since then I have taught the course about a dozen times from various texts. Over the years I developed a set of lecture notes and in 1985 I had them typed so they could be used as a text. They now appear (in modified form) as the first five chapters of this book. Here were some of my motives at the time. 1) To have something as short and inexpensive as possible. In my experience, students like short books. 2) To avoid all innovation. To organize the material in the most simple-minded straightforward manner. 3) To order the material linearly. To the extent possible, each section should use the previous sections and be used in the following sections. 4) To omit as many topics as possible. This is a foundational course, not a topics course. If a topic is not used later, it should not be included. There are three good reasons for this. First, linear algebra has top priority. It is better to go forward and do more linear algebra than to stop and do more group and ring theory. Second, it is more important that students learn to organize and write proofs themselves than to cover more subject matter. Algebra is a perfect place to get started because there are many “easy” theorems to prove. There are many routine theorems stated here without proofs, and they may be considered as exercises for the students. Third, the material should be so fundamental that it be appropriate for students in the physical sciences and in computer science. Zillions of students take calculus and cookbook linear algebra, but few take abstract algebra courses. Something is wrong here, and one thing wrong is that the courses try to do too much group and ring theory and not enough matrix theory and linear algebra. 5) To offer an alternative for computer science majors to the standard discrete mathematics courses. Most of the material in the first four chapters of this text is covered in various discrete mathematics courses. Computer science majors might benefit by seeing this material organized from a purely mathematical viewpoint.

Marta Alpar. There is a non-trivial theorem stated without proof in Chapter 4. Chapter 6 is not written primarily for reference. . Finishing the chapter on linear algebra gives a basic one year undergraduate course in abstract algebra. and Shulim Kaliman. after the first four chapters. These proofs are to be provided by the professor in class or assigned as homework exercises. everything was right down to the nub.e. This chapter was written in the same “style” as the previous chapters. As bare as the first four chapters are. Huseyin Kocak. In 1996 I wrote a sixth chapter. It is intended for students in mathematics. you still have to truck right along to finish them in one semester. Indeed. My sincere gratitude goes especially to Marilyn Gonzalez. the linear algebra follows easily. To these and all who contributed. but as an additional chapter for more advanced courses. and the physical sciences. The first three or four chapters can stand alone as a one semester course in abstract algebra. but still somewhat informal. The proofs of many of the elementary theorems are omitted. Finally. The Jordan form should not be considered part of Chapter 5. The proof is contained in Chapter 6.iv Over the years I used the five chapters that were typed as a base for my algebra courses. The presentation is compact and tightly organized. giving enough material for a full first year graduate course. this theorem should be assumed there without proof. More advanced classes can do four chapters the first semester and chapters 5 and 6 the second semester. Chapter 6 continues the material to complete a first year graduate course. Lourdes Robles. These were independent topics stuck on at the end. It is difficult to do anything in life without help from friends. John Zweibel. It hung together pretty well except for the last two sections on determinants and dual spaces. Classes with little background can do the first three chapters in the first semester. However they are structured to provide the background for the chapter on linear algebra. supplementing them as I saw fit. namely the determinant of the product is the product of the determinants. and chapters 4 and 5 in the second semester. Chapter 2 is the most difficult part of the book because groups are written in additive and multiplicative notation. For the proper flow of the course. this book is fondly dedicated. It is stated there only as a reference for undergraduate courses. In the academic year 1997-98 I revised all six chapters and had them typed in LaTeX.. After Chapter 2 the book gets easier as you go along. computer science. and the concept of coset is confusing at first. Brian Coomes. This is the personal background of how this book came about. Dmitry Gokhman. This book is a survey of abstract algebra with emphasis on linear algebra. and many of my friends have contributed to this text. i.

you still have to build it. I hope the students and professors who try it.edu .miami. because mathematics is learned in hindsight. I would have made the book shorter. This book is my attempt at that organization. FL 33124 ec@math. Also I am convinced it is easier to build a course from a base than to extract it from a big book. Every effort has been extended to make the subject move rapidly and to make the flow from one topic to the next as seamless as possible. Basic algebra is a subject of incredible elegance and utility. but I did not have any more time. Unfortunately mathematics is a difficult and heavy subject. The goal is to stay focused and go forward. Connell Department of Mathematics University of Miami Coral Gables. The style and approach of this book is to make it a little lighter. H. enjoy it.v This text is written with the conviction that it is more effective to teach abstract and linear algebra as one coherent discipline rather than as two separate ones. Because after you extract it. but it requires a lot of organization. This book works best when viewed lightly and read as a story. E. Teaching abstract algebra and linear algebra as distinct courses results in a loss of synergy and a loss of momentum.

unique factorization Chapter 2 Groups 19 21 25 27 31 34 Groups.vi Outline Chapter 1 Background and Fundamentals of Mathematics 1 3 5 13 14 Sets. subgroups. Cartesian products Relations. the symmetric groups Product of groups Chapter 3 Rings Rings Units. solutions of equations. right and left inverses. partial orderings. scalar multiplication for additive groups Subgroups. and scalar matrices Elementary operations and elementary matrices Systems of equations 53 55 56 57 59 . cosets Normal subgroups. order. projections Notation for the logic of mathematics Integers. fields The integers mod n Ideals and quotient rings Homomorphisms Polynomial rings Product of rings The Chinese remainder theorem Characteristic Boolean rings Chapter 4 Matrices and Matrix Rings 37 38 40 41 42 45 49 50 50 51 Addition and multiplication of matrices. quotient groups. domains. the integers mod n Homomorphisms Permutations. invertible matrices Transpose Triangular. strips. Hausdorff maximality principle. equivalence relations Functions. diagonal. bijections.

and characteristic polynomial Chapter 5 Linear Algebra 68 69 71 74 75 77 78 79 82 83 85 90 91 92 93 94 96 98 102 103 60 64 Modules. trace.vii Determinants. rank of a matrix Geometric interpretation of determinant Linear functions approximate differentiable functions locally The transpose principle Nilpotent homomorphisms Eigenvalues. Gram-Schmidt orthonormalization Orthogonal matrices. characteristic roots Jordan canonical form Inner product spaces. the orthogonal group Diagonalization of symmetric matrices Chapter 6 Appendix The Chinese remainder theorem Prime and maximal ideals and UFDs Splitting short exact sequences Euclidean domains Jordan blocks Jordan canonical form Determinants Dual spaces 108 109 114 116 122 123 128 130 . square matrices over fields. and free basis Characterization of free modules Uniqueness of dimension Change of basis Vector spaces. generating sets. the classical adjoint Similarity. submodules Homomorphisms Homomorphisms on Rn Cosets and quotient modules Products and coproducts Summands Independence.

viii .

equivalence relations. The final section gives a proof of the unique factorization theorem for the integers. 0. injective.. and most properties of functions can be stated in terms of solutions of equations. N = Z+ = the set of positive integers = {1... Five of these are listed below. Notation Mathematics has its own universally accepted shorthand. b = 0} R = the field of real numbers C = the field of complex numbers = {a + bi : a. The basic concepts are products of sets.. B. −1. 2. C. and the properties of surjective..} Q = the field of rational numbers = {a/b : a. The symbol ∃ means “there exists” and ∃! means “there exists a unique”. b ∈ Z. and the integers. We use the standard notation for intersection and union. not just for algebra.. −2. 3...Chapter 1 Background and Fundamentals of Mathematics This chapter is fundamental. are sets.. and bijective. An equivalence relation on a set A is shown to be simply a partition of A into disjoint subsets. There is an emphasis on the concept of function. . . 2. 1. but for all fields related to mathematics. functions. The notion of a solution of an equation is central in mathematics. b ∈ R} (i2 = −1) Sets Suppose A.} Z = the ring of integers = {. A ∩ B = {x : x ∈ A and x ∈ B} = the set of all x which are elements 1 .. The symbol ∀ means “for each” and ⇒ means “implies”. partial orderings. In elementary courses the section on the Hausdorff Maximality Principle should be ignored. Some sets (or collections) are so basic they have their own proprietary symbols.

y) : x ∈ X and y ∈ Y }. B ⊂ S.. At is a set. If C ⊂ S (i. x ∈ At } t∈T Let ∅ be the null set. a ∈ A ⇒ a ∈ B.2 of A and B. then A and B are said to be disjoint.e. x ∈ At } = {x : ∀t ∈ T. Any set called an index set is assumed to be non-void. That is. The statement that A is not a subset of B means . be defined by C = S − C = {x ∈ S : x ∈ C}. Theorem (De Morgan’s laws) Suppose S is a set. Example R × R = R2 = the plane. Definition Suppose each of A and B is a set. Background Chapter 1 A ∪ B = {x : x ∈ A or x ∈ B} = the set of all x which are elements of A or B. The statement that A is a subset of B (A ⊂ B) means that if a is an element of A. then a is an element of B. (A ∩ B) = A ∪ B (A ∪ B) = A ∩ B and Cartesian Products If X and Y are sets. the Cartesian product of X and Y is defined to be the set of all ordered pairs whose first term is in X and whose second term is in Y . if C is a subset of S). Suppose T is an index set and for each t ∈ T . X × Y = {(x. At = {x : ∃ t ∈ T with x ∈ At } t∈T At = {x : if t ∈ T. let C . In other words. If A ∩ B = ∅. Then for any A. Exercise Suppose each of A and B is a set. the complement of C in S. .

Example Question R × · · · × R = Rn = real n-space. then If a ∼ b and b ∼ c.. Xn is a set. then a ≤ c. 3) If a ≤ b and b ≤ c. or 3) on A. Here are several properties which a relation may possess. xn ) : xi ∈ Xi for 1 ≤ i ≤ n} = the set of all ordered n-tuples whose i-th term is in Xi . the relation also satisfies these properties when restricted to S.Chapter 1 Background 3 Definition If each of X1 . is a partial ordering. 2 ) If a ≤ b and b ≤ a. a non-void subset R ⊂ A × A is called a relation on A. if a. then a = b. If (a. If the relation satisfies any of the properties 1). then b ∼ a. 2 ).. . then a ≤ b or b ≤ a. Example Example A = R with the ordinary ordering.. (anti-symmetric) a ∼ c. b ∈ A.. Definition A linear ordering is a partial ordering with the additional property that. . Hausdorff Maximality Principle (HMP) Suppose S is a non-void subset of A and ∼ is a relation on A. 1) 2) 2) 3) If a ∈ A.. A = all subsets of R2 . is a linear ordering. and 3) is called a partial ordering. 2 ). with a ≤ b defined by a ⊂ b. This defines a relation on S. and we write this fact by the expression a ∼ b. If a ∼ b and b ∼ a. X1 × · · · × Xn = {(x1 . Then 1) If a ∈ A. 2). then a ∼ a.. a partial ordering on A defines a partial ordering . (transitive) Definition A relation which satisfies 1). In this case we write a ∼ b as a ≤ b. then (reflexive) (symmetric) a = b. then a ≤ a. Is (R × R2 ) = (R2 × R) = R3 ? Relations If A is a non-void set. In particular. If a ∼ b. b) ∈ R we say that a is related to b.

2). these results may be assumed. First. and apply HMP. Proof Define a partial ordering on A by V ≤ W iff V ⊂ W. Definition A collection of sets is said to be monotonic if. and thus the HMP may be ignored. and second. Equivalence Relations an equivalence relation. The HMP is that any linearly ordered subset of a partially ordered set is contained in a maximal linearly ordered subset. a) : a < 0}. Then ∃ a maximal monotonic subcollection of A which contains S. In this book. In elementary courses. the only applications of the HMP are to obtain maximal monotonic collections of subsets. the maximal monotonic subcollection will have a maximal element. to show that infinitely generated vector spaces have free bases. d) provided a ≤ c and b ≤ d. If ∼ is an equivalence relation on A and a ∈ A. A relation satisfying properties 1). In each of these applications. Show this is an equivalence relation.4 Background Chapter 1 on S. Find at least two maximal linearly ordered subsets of R2 which contain S. in the Appendix. Corollary to HMP Suppose X is a non-void set and A is some non-void collection of subsets of X. . to show that rings have maximal ideals (see pages 87 and 109). we define the equivaDefinition lence class containing a by cl(a) = {x ∈ A : a ∼ x}. Show this is a partial ordering which is linear on S = {(a. one is contained in the other. and 3) is called Exercise Define a relation on A = Z by n ∼ m iff n − m is a multiple of 3. b) ∼ (c. given any two sets of the collection. The HMP is used twice in this book. However the ordering may be linear on S but not linear on A. and S is a subcollection of A which is monotonic. Exercise Define a relation on A = R2 by (a.

. the equivalence classes form a partition of A. Then ∼ is an equivalence relation.Chapter 1 Theorem 1) Background 5 2) 3) If b ∈ cl(a) then cl(b) = cl(a). What are the equivalence classes? Exercise Is there a relation on R satisfying 1). and the equivalence classes are just the subsets of the partition. In other words. Consider the collection of all translates of H. a collection of non-void subsets of A is a partition of A provided any a ∈ A is an element of one and only one subset of the collection. Definition A partition of A is a collection of disjoint non-void subsets whose union is A. Functions Just as there are two ways of viewing an equivalence relation.. domain and range are inherent parts of the definition. all lines in the plane with slope 2. 2 ) and 3) ? an equivalence relation on R which is also a partial ordering? That is. then U =V. Theorem Suppose A is a non-void set with a partition. V ⊂ A is an equivalence class and U ∩ V = ∅. Exercise Define an equivalence relation on Z by n ∼ m iff n − m is a multiple of 3. and 3). Find the equivalence relation on R2 defined by this partition of R2 . We use the “intuitive” definition because everyone thinks that way. Define a relation on A by a ∼ b iff a and b belong to the same subset of the partition. and the other is the “graph” or “ordered pairs” definition. 2). Each element of A is an element of one and only one equivalence class. Thus we may speak of a subset of A being an equivalence class with no mention of any element contained in it. 2a) : a ∈ R}.e. i. is there Exercise Let H ⊂ R2 be the line H = {(a. there are two ways of defining a function. In either case. If each of U. One is the “intuitive” definition. 2). Summary There are two ways of viewing an equivalence relation — one is as a relation on A satisfying 1). Note that if A has an equivalence relation. and the other is as a partition of A into disjoint subsets.

then the graph Γ ⊂ X × Y has the property that each x ∈ X is the first term of one and only one ordered pair in Γ. then ∃! f : X → Y whose graph is Γ. f g h .6 Background Chapter 1 Definition If X and Y are (non-void) sets. f ) is a function is written f as f : X → Y or X → Y . This may be written as h ◦ g ◦ f . Y. Y. define the inclusion i : S → X by i(s) = s for all s ∈ S.” Example Identity functions Here X = Y and f : X → X is defined by f (x) = x for all x ∈ X. Composition Given W → X → Y (g ◦ f )(x) = g(f (x)). f g define g ◦ f : W → Y by Theorem (The associative law of composition) If V → W → X → Y . a function or mapping or map with domain X and range Y . Define f : X → Y by f (x) = Restriction Given f : X → Y and a non-void subset S of X. is an ordered triple (X. Inclusion If S is a non-void subset of X. Suppose y0 ∈ Y . define f | S : S → Y by (f | S)(s) = f (s) for all s ∈ S. The identity on X is denoted by IX or just I : X → X. The function is defined by “f (x) is the second term of the ordered pair in Γ whose first term is x. f ) is the subset Γ ⊂ X × Y defined by Γ = {(x. Definition The graph of a function (X. Y. if Γ is a subset of X × Y with the property that each x ∈ X is the first term of one and only ordered pair in Γ. Note that inclusion is a restriction of the identity. then h ◦ (g ◦ f ) = (h ◦ g) ◦ f. Example Constant functions y0 for all x ∈ X. f (x)) : x ∈ X}. The connection between the “intuitive” and “graph” viewpoints is given in the next theorem. Theorem If f : X → Y . f ) where f assigns to each x ∈ X a well defined element f (x) ∈ Y . Conversely. The statement that (X.

i. 1] defined by f (x) = sin(x) is surjective but not injective. i. 2) 3) 4) 5) 6) Examples 1) 2) 3) 4) f : R → R defined by f (x) = sin(x) is neither surjective nor injective.. ∞) defined by f (x) = ex is bijective.e.Chapter 1 Definitions 1) Background Suppose f : X → Y . π/2] → [0. if y ∈ Y .e. 1] defined by f (x) = sin(x) is bijective. f (S) = {f (s) : s ∈ S} = {y ∈ Y : ∃s ∈ S with f (s) = y}. π/2] → R defined by f (x) = sin(x) is injective but not surjective. the inverse image of T is a subset of X. f : R → [−1. f : X → Y is surjective or onto provided image (f ) = Y i. if x1 and x2 are distinct elements of X. there is function f −1 : Y → X with f −1 ◦ f = IX : X → X and f ◦ f −1 = IY : Y → Y . f : X → Y is injective or 1-1 provided (x1 = x2 ) ⇒ f (x1 ) = f (x2 ).) Note There is no such thing as “the function sin(x). The image of f is the image of X .e. 7 If T ⊂ Y .. ..” A function is not defined unless the domain and range are specified. (f −1 (x) is written as ln(x). In this case. f : X → Y is bijective or is a 1-1 correspondence provided f is surjective and injective.e.) (f −1 (x) is 5) f : R → (0. the image of S is a subset of Y . written as arcsin(x) or sin−1 (x). i. f −1 (y) is a non-void subset of X. Note that f −1 : Y → X is also bijective and (f −1 )−1 = f . f : [0. the image is the range. image (f ) = f (X) = {f (x) : x ∈ X} = {y ∈ Y : ∃x ∈ X with f (x) = y}. If S ⊂ X. f : [0. then f (x1 ) and f (x2 ) are distinct elements of Y . f −1 (T ) = {x ∈ X : f (x) ∈ T }..

2])). y0 ) : x ∈ X} = (X. then f is injective iff f is surjective iff f is bijective. y) : y ∈ Y } = (x0 . There exists an injective f : X → Y iff n There exists a surjective f : X → Y iff n There exists a bijective f : X → Y iff n . the pigeonhole principle does not hold for infinite sets. These three sets are disjoint. T = f (f −1 (T )). Show there is a function f : Z+ → Z+ which is injective but not Exercise surjective. in part 1) for n = m = 6. Also find the relationship between T and f (f −1 (T )). 2] → R is defined by f (x) = x2 . if f is not surjective then f is not injective. . Y is a set with m elements. Also show there is one which is surjective but not injective. Y ) is called a vertical strip. then f is not surjective. S ⊂ X and T ⊂ Y .8 Background Chapter 1 Exercise Show there are natural bijections from (R × R2 ) to (R2 × R) and from (R2 × R) to R × R × R. Exercise Suppose f : [−2. y0 ) is called a horizontal strip. but the bijections between them are so natural that we sometimes identify them. . Of course. Show that if f is injective. then f is not injective. and f : X → Y is a function. as can be seen by the following exercise. Pigeonhole Principle Suppose X is a set with n elements. Find the relationship between S and f −1 (f (S)). {(x. S = f −1 (f (S)). If y0 ∈ Y. 5])). In other words. If n > m. Exercise 1) 2) 3) Suppose X is a set with 6 elements and Y is a finite set with n elements. . Find f −1 (f ([1. and you run out of pigeons before you fill the holes. Show that if f is surjective. Also find f (f −1 ([3. Exercise Suppose f : X → Y is a function. then you have placed 2 pigeons in one hole. If you are placing 6 pigeons in 6 holes. If n < m. {(x0 . Strips If x0 ∈ X. 1) 2) 3) If n = m.

f g . A solution to this equation is any x0 ∈ X with f (x0 ) = y0 . Then Each horizontal strip intersects Γ in at least one point iff f is Each horizontal strip intersects Γ in at most one point iff f is Each horizontal strip intersects Γ in exactly one point iff f is . The equation f (x) = y0 has at least one solution for each y0 ∈ Y iff . Note that the set of all solutions to f (x) = y0 is f −1 (y0 ). The subset S is the graph of a function with domain X and range Y iff each vertical strip intersects S in exactly one point. Suppose f : X → Y and y0 ∈ Y . which are defined after the next theorem. Solutions of Equations Now we restate these properties in terms of solutions of equations. then f is injective. . The purpose of the next theorem is to restate properties of functions in terms of horizontal strips. Here y0 is given and x is considered to be a “variable”. Consider the equation f (x) = y0 . .Chapter 1 Background 9 Theorem Suppose S ⊂ X × Y . Also f (x) = y0 has a solution iff y0 ∈ image(f ) iff f −1 (y0 ) is non-void. f is The equation f (x) = y0 has at most one solution for each y0 ∈ Y iff f is . Theorem 1) 2) 3) Suppose f : X → Y has graph Γ. The equation f (x) = y0 has a unique solution for each y0 ∈ Y iff f is . This is just a restatement of the property of a graph of a function. Right and Left Inverses One way to understand functions is to study right and left inverses. Theorem 1) 2) 3) Suppose f : X → Y . Theorem 1) Suppose X → Y → W are functions. If g ◦ f is injective.

Note The Axiom of Choice is not discussed in this book. The Axiom of Choice If f : X → Y is surjective. but f is not surjective and g is not injective. If g ◦ f is bijective. we state this part of 1) again. Define a relation on X by a ∼ b if f (a) = f (b). Also a function from X to Y is bijective iff it has a left inverse and a right inverse. it is possible to choose an x ∈ f −1 (y) and thus to define h(y) = x. Y = {p. A left inverse of f is a function g : Y → X such that g ◦ f = IX : X → X. That is. f (p) = p. you unknowingly used one version of it. For our purposes it is assumed that the Axiom of Choice and the HMP are true. then f −1 (y) is an equivalence class and every equivalence class is of this form. Show this is an equivalence relation. However in this text we do not go that deeply into set theory. Corollary Suppose each of X and Y is a non-void set. A right inverse of f is a function h : Y → X such that f ◦ h = IY : Y → Y . these equivalence classes will be called cosets. q}. If y belongs to the image of f . and g(p) = g(q) = p. Chapter 1 Example X = W = {p}. f has a right inverse iff f is surjective. In the next chapter where f is a group homomorphism. then g is surjective. Any such right inverse must be injective. for each y ∈ Y . Note It is a classical theorem in set theory that the Axiom of Choice and the Hausdorff Maximality Principle are equivalent. Here g ◦ f is the identity. However. For completeness. then f is injective and g is surjective. then f has a right inverse h. Exercise Suppose f : X → Y is a function. Any such left inverse must be surjective. f has a left inverse iff f is injective.10 2) 3) Background If g ◦ f is surjective. Then ∃ an injective f : X → Y iff ∃ a surjective g : Y → X. Theorem 1) 2) Suppose f : X → Y is a function. Definition Suppose f : X → Y is a function. . if you worked 1) of the theorem above.

we define the projection maps π1 : X1 × X2 → X1 and π2 : X1 × X2 → X2 by πi (x1 . t∈T . It is summarized by the equation f = (f1 . the projection map πs : Xt → Xs is defined by πs ({xt }) = xs . Given f . f2 ) where f1: Y → X1 and f2 : Y → X2 }. Given {ft }.. Xt = Xt is the collection of all “sequences” {xt }t∈T = {xt } Then the product where xt ∈ Xt .) For each s ∈ T .Chapter 1 Background 11 Projections If X1 and X2 are non-void sets. f is defined by f (y) = {ft (y)}. This concept is displayed in the diagram below. Y d d f2 d   f d   ‚ d ©   c π π2 E X1 ' 1 X1 × X2 X2 f1     One nice thing about this concept is that it works fine for infinite Cartesian products. and X2 are non-void sets. Proof Given f . x2 . Theorem If Y. f2 (y)). . Theorem If Y is any non-void set.. Thus a function from Y to X1 × X2 is merely a pair of functions from Y to X1 and Y to X2 . (Thus if T = Z+ . there is a 1-1 correspondence between {functions f : Y → Xt } and {sequences of functions {ft }t∈T where ft : Y → Xt }. X1 . x2 ) = xi . define f1 = π1 ◦ f and f2 = π2 ◦ f .}. f2 ). Definition Suppose T is an index set and for each t ∈ T . there is a 1-1 correspondence between {functions f: Y → X1 × X2 } and {ordered pairs of functions (f1 . Given f1 and f2 define f : Y → X1 × X2 by f (y) = (f1 (y). the sequence {ft } is defined by ft = πt ◦ f. Xt is a non-void set. {xt } = {x1 .

A Calculus Exercise  Let A be the collection of all functions f : [0, 1] → R which have an infinite number of derivatives. Let A0 ⊂ A be the subcollection of those functions f with f(0) = 0. Define D : A0 → A by D(f) = df/dx. Use the mean value theorem to show that D is injective. Use the fundamental theorem of calculus to show that D is surjective.

Exercise  This exercise is not used elsewhere in this text and may be omitted. It is included here for students who wish to do a little more set theory. Suppose T is a non-void set.

1)  If Y is a non-void set, define Y^T to be the collection of all functions with domain T and range Y. Show that if T and Y are finite sets with n and m elements, then Y^T has m^n elements. In particular, when T = {1, 2, 3}, Y^T = Y × Y × Y has m^3 elements. Show that if m ≥ 3, the subset of Y^{1,2,3} of all injective functions has m(m − 1)(m − 2) elements. These injective functions are called permutations on Y taken 3 at a time. If T = N, then Y^T is the infinite product Y × Y × ···, i.e., Y^N is the set of all infinite sequences (y1, y2, ...) where each yi ∈ Y. For any Y and T, let Yt be a copy of Y for each t ∈ T. Then Y^T = ∏_{t∈T} Yt.

2)  Suppose each of Y1 and Y2 is a non-void set. Show there is a natural bijection from (Y1 × Y2)^T to Y1^T × Y2^T. (This is the fundamental property of Cartesian products presented in the two previous theorems.)

3)  Define P(T), the power set of T, to be the collection of all subsets of T (including the null set). Show that if T is a finite set with n elements, P(T) has 2^n elements.

4)  If S is any subset of T, define its characteristic function χS : T → {0, 1} by letting χS(t) be 1 when t ∈ S, and be 0 when t ∉ S. Define α : P(T) → {0, 1}^T by α(S) = χS. Define β : {0, 1}^T → P(T) by β(f) = f^{-1}(1). Show that if S ⊂ T then β ◦ α(S) = S, and if f : T → {0, 1} then α ◦ β(f) = f. Thus α is a bijection and β = α^{-1}: P(T) ←→ {0, 1}^T.

5)  Suppose γ : T → {0, 1}^T is a function and show that it cannot be surjective. If t ∈ T, denote γ(t) by γ(t) = ft : T → {0, 1}. Define f : T → {0, 1} by f(t) = 0 if ft(t) = 1, and f(t) = 1 if ft(t) = 0. Show that f is not in the image of γ and thus γ cannot be surjective. This shows that if T is an infinite set, then the set {0, 1}^T represents a “higher order of infinity”.

6)  A set Y is said to be countable if it is finite or if there is a bijection from N to Y.
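As a quick illustration of parts 4) and 5) for a small finite T, here is a hedged Python sketch (the names alpha, beta, gamma mirror the exercise; the particular gamma is an arbitrary choice of ours):

    T = {0, 1, 2}

    def alpha(S):      # alpha : P(T) -> {0,1}^T, S |-> its characteristic function
        return {t: 1 if t in S else 0 for t in T}

    def beta(f):       # beta : {0,1}^T -> P(T), beta(f) = f^{-1}(1)
        return {t for t in T if f[t] == 1}

    S = {0, 2}
    assert beta(alpha(S)) == S             # beta o alpha is the identity

    # the diagonal argument of 5): f differs from gamma(t) at t, for every t
    gamma = {0: alpha({0}), 1: alpha({1, 2}), 2: alpha(set())}
    f = {t: 1 - gamma[t][t] for t in T}
    assert all(f != gamma[t] for t in T)   # f is not in the image of gamma

Of course, for finite T the map γ fails to be surjective for counting reasons alone; the diagonal construction is what survives when T is infinite.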

Consider the following three collections:
i)  P(N), the collection of all subsets of N.
ii)  {0, 1}^N, the collection of all functions f : N → {0, 1}.
iii)  The collection of all sequences (y1, y2, ...) where each yi is 0 or 1.
We know that ii) and iii) are equal and there is a natural bijection between i) and ii). We also know there is no surjective map from N to {0, 1}^N, i.e., {0, 1}^N is uncountable. Show there is a bijection from {0, 1}^N to the real numbers R. (This is not so easy.)

Notation for the Logic of Mathematics

Each of the words “Lemma”, “Theorem”, and “Corollary” means “true statement”. Suppose A and B are statements. A theorem may be stated in any of the following ways:

Theorem  Hypothesis: Statement A. Conclusion: Statement B.
Theorem  Suppose A is true. Then B is true.
Theorem  If A is true, then B is true.
Theorem  A ⇒ B (A implies B).

There are two ways to prove the theorem — to suppose A is true and show B is true, or to suppose B is false and show A is false. If A is the statement “x ∈ Z+” and B is the statement “x^2 ∈ Z+”, then “A ⇒ B” means “If x is a positive integer, then x^2 is a positive integer.”

The expressions “A ⇔ B”, “A is equivalent to B”, and “A is true iff B is true” have the same meaning (namely, that A ⇒ B and B ⇒ A).

Mathematical symbols are shorthand for phrases and sentences in the English language. For example, “x ∈ B” means “x is an element of the set B.” The important thing to remember is that thoughts and expressions flow through the language.

Mathematical Induction is based upon the fact that if S ⊂ Z+ is a non-void subset, then S contains a smallest element.

Theorem  Suppose P(n) is a statement for each n = 1, 2, ... . Suppose P(1) is true and, for each n ≥ 1, P(n) ⇒ P(n + 1). Then for each n ≥ 1, P(n) is true.

Proof  If the theorem is false, then ∃ a smallest positive integer m such that P(m) is false. Since P(m − 1) is true, this is impossible.

Exercise  Use induction to show that, for each n ≥ 1, 1 + 2 + ··· + n = n(n + 1)/2.

The Integers

In this section, lower case letters a, b, c, ... will represent integers, i.e., elements of Z. Here we will establish the following three basic properties of the integers:

1)  If G is a subgroup of Z, then ∃ n ≥ 0 such that G = nZ.
2)  If a and b are integers, not both zero, and G is the collection of all linear combinations of a and b, then G is a subgroup of Z, and its positive generator is the greatest common divisor of a and b.
3)  If n ≥ 2, then n factors uniquely as the product of primes.

All of this will follow from long division, which we now state formally.

Euclidean Algorithm  Given a, b with b ≠ 0, ∃! m and r with 0 ≤ r < |b| and a = bm + r. In other words, b divides a “m times with a remainder of r”. For example, if a = −17 and b = 5, then m = −4 and r = 3, i.e., −17 = 5(−4) + 3.

Definition  If r = 0, we say that b divides a or a is a multiple of b. This fact is written as b | a. Note that b | a ⇔ the rational number a/b is an integer ⇔ ∃! m such that a = bm ⇔ a ∈ bZ.

Note  Anything (except 0) divides 0. 0 does not divide anything. ±1 divides anything. If n ≠ 0, the set of integers which n divides is nZ = {nm : m ∈ Z} = {..., −2n, −n, 0, n, 2n, ...}. Also n divides a and b with the same remainder iff n divides (a − b).

Definition  A non-void subset G ⊂ Z is a subgroup provided (g ∈ G ⇒ −g ∈ G) and (g1, g2 ∈ G ⇒ (g1 + g2) ∈ G). We say that G is closed under negation and closed under addition.
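For positive b, Python's built-in divmod returns exactly the pair (m, r) of the Euclidean algorithm, even when a is negative; a one-line check (purely illustrative):

    a, b = -17, 5
    m, r = divmod(a, b)              # for b > 0: floor division, 0 <= r < b
    assert a == b * m + r and 0 <= r < abs(b)
    print(m, r)                      # -4 3, matching -17 = 5(-4) + 3

(For negative b, Python's convention gives a remainder with the same sign as b, so a small adjustment would be needed in that case.)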

Theorem  If n ∈ Z, then nZ is a subgroup. Thus if n ≠ 0, the set of integers which n divides is a subgroup of Z.

The next theorem states that every subgroup of Z is of this form.

Theorem  Suppose G ⊂ Z is a subgroup. Then
1)  0 ∈ G.
2)  If g1 and g2 ∈ G, then (m1 g1 + m2 g2) ∈ G for all integers m1, m2.
3)  ∃! non-negative integer n such that G = nZ. In fact, if G ≠ {0} and n is the smallest positive integer in G, then G = nZ.

Proof  Since G is non-void, ∃ g ∈ G. Now (−g) ∈ G, and thus 0 = g + (−g) belongs to G, and so 1) is true. Part 2) is straightforward, so consider 3). If G = {0}, then n = 0. Otherwise G must contain a positive element; let n be the smallest positive integer in G. If g ∈ G, then g = nm + r where 0 ≤ r < n. Since r = g − nm belongs to G, it must be 0, and thus g ∈ nZ. Therefore G = nZ.

Now suppose a, b ∈ Z and at least one of a and b is non-zero.

Theorem  Let G be the set of all linear combinations of a and b, i.e., G = {ma + nb : m, n ∈ Z}. Then
1)  G contains a and b.
2)  G is a subgroup. In fact, it is the smallest subgroup containing a and b. It is called the subgroup generated by a and b.
3)  Denote by (a, b) the smallest positive integer in G. By the previous theorem, G = (a, b)Z, and thus (a, b) | a and (a, b) | b. Also note that ∃ m, n such that ma + nb = (a, b). The integer (a, b) is called the greatest common divisor of a and b.
4)  If n is an integer which divides a and b, then n also divides (a, b).

Proof of 4)  Suppose n | a and n | b, i.e., a, b ∈ nZ. Since nZ is a subgroup containing a and b, and G is the smallest subgroup containing a and b, nZ ⊃ G = (a, b)Z, and thus n | (a, b).

Corollary  The following are equivalent:
1)  a and b have no common divisors, i.e., (n | a and n | b) ⇒ n = ±1.

2)  (a, b) = 1, i.e., the subgroup generated by a and b is all of Z.
3)  ∃ m, n ∈ Z with ma + nb = 1.

Definition  If any one of these three conditions is satisfied, we say that a and b are relatively prime.

Theorem  If a and b are relatively prime and a | bc, then a | c.

Proof  Suppose a and b are relatively prime, c ∈ Z and a | bc. Then there exist m, n with ma + nb = 1, and thus mac + nbc = c. Now a | mac and a | nbc. Thus a | (mac + nbc), and so a | c.

We are now ready for our first theorem with any guts.

Definition  A prime is an integer p > 1 which does not factor, i.e., if p = ab then a = ±1 or a = ±p. The first few primes are 2, 3, 5, 7, 11, 13, 17, ...

Theorem  Suppose p is a prime.
1)  If a is an integer which is not a multiple of p, then (p, a) = 1. In other words, if a is any integer, (p, a) = p or (p, a) = 1.
2)  If p | ab, then p | a or p | b.
3)  If p | a1 a2 ··· an, then p divides some ai. Thus if each ai is a prime, then p is equal to some ai.

Proof  Part 1) follows immediately from the definition of prime. Now suppose p | ab. If p does not divide a, then by 1), (p, a) = 1 and by the previous theorem, p must divide b. Thus 2) is true. Part 3) follows from 2) and induction on n.

The Unique Factorization Theorem  Suppose n is an integer which is not 0, 1, or −1. Then n may be factored into the product of primes and, except for order, this factorization is unique. That is, ∃ a unique collection of distinct primes p1, ..., pk and positive integers s1, s2, ..., sk such that n = ±p1^{s1} p2^{s2} ··· pk^{sk}.

Proof  Factorization into primes is obvious, and uniqueness follows from 3) in the theorem above. The power of this theorem is uniqueness, not existence.

Now that we have unique factorization and part 3) above, the picture becomes transparent. Here are some of the basic properties of the integers in this light.

Theorem (Summary)
1)  Suppose |a| > 1 has prime factorization a = ±p1^{s1} ··· pk^{sk}. Then the only divisors of a are of the form ±p1^{t1} ··· pk^{tk} where 0 ≤ ti ≤ si for i = 1, ..., k.
2)  If |a| > 1 and |b| > 1, then (a, b) = 1 iff there is no common prime in their factorizations. In particular, if there is no common prime in their factorizations, ∃ m, n with ma + nb = 1.
3)  Suppose |a| > 1 and |b| > 1. Let {p1, p2, ..., pk} be the union of the distinct primes of their factorizations, so that a = ±p1^{s1} ··· pk^{sk} where 0 ≤ si, and b = ±p1^{t1} ··· pk^{tk} where 0 ≤ ti. Let ui be the minimum of si and ti. Then (a, b) = p1^{u1} ··· pk^{uk}. For example (2^3·5·11, 2^2·5^4·7) = 2^2·5.
4)  Let vi be the maximum of si and ti. Then c = p1^{v1} ··· pk^{vk} is the least common multiple of a and b. Note that c is a multiple of a and b, and if n is a multiple of a and b, then n is a multiple of c. If a and b are relatively prime, then their least common multiple is just their product. In general, the least common multiple of a and b is c = ab/(a, b).
5)  There is an infinite number of primes. (Proof: Suppose there were only a finite number of primes p1, p2, ..., pk. Then no prime would divide (p1 p2 ··· pk + 1). This is a contradiction.)
6)  √2 is irrational. (Proof: Suppose √2 = m/n where (m, n) = 1. Then 2n^2 = m^2. If n > 1, then n and m have a common prime factor, which is a contradiction. Thus n = 1, and so √2 is an integer. Since this is impossible, √2 is irrational.) More generally, suppose c is an integer greater than 1. Then √c is rational iff √c is an integer.

Exercise  Find (180, 28), i.e., find the positive generator of the subgroup generated by {180, 28}. Find integers m and n such that 180m + 28n = (180, 28). Find the least common multiple of 180 and 28, and show that it is equal to (180·28)/(180, 28).
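The exercise can be checked with the extended Euclidean algorithm, which produces (a, b) together with coefficients m, n satisfying ma + nb = (a, b). A short sketch (the helper name ext_gcd is ours):

    def ext_gcd(a, b):
        # returns (g, m, n) with m*a + n*b == g == (a, b)
        if b == 0:
            return (a, 1, 0)
        g, m, n = ext_gcd(b, a % b)
        return (g, n, m - (a // b) * n)

    g, m, n = ext_gcd(180, 28)
    assert m * 180 + n * 28 == g
    print(g, m, n)            # 4 -2 13: the gcd is 4, and (-2)180 + (13)28 = 4
    print(180 * 28 // g)      # 1260, the least common multiple (180*28)/(180,28)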

Exercise  Show that if each of G1, G2, ..., Gm is a subgroup of Z, then G = G1 ∩ G2 ∩ ··· ∩ Gm is also a subgroup of Z.

Exercise  We have defined the greatest common divisor (gcd) and the least common multiple (lcm) of a pair of integers. Now suppose n ≥ 2 and S = {a1, a2, ..., an} is a finite collection of integers with |ai| > 1 for 1 ≤ i ≤ n. Define the gcd and the lcm of the elements of S and develop their properties. Express the gcd and the lcm in terms of the prime factorizations of the ai. Show that the set of all linear combinations of the elements of S is a subgroup of Z, and its positive generator is the gcd of the elements of S.

Exercise  Show that the gcd of S = {90, 70, 42} is 2, and find integers n1, n2, n3 such that 90n1 + 70n2 + 42n3 = 2. Also find the lcm of the elements of S. Now let G = (90Z) ∩ (70Z) ∩ (42Z) and find the positive integer n with G = nZ.

Exercise  Show that if the nth root of an integer is a rational number, then it is an integer. More generally, suppose c and n are integers greater than 1. There is a unique positive real number x with x^n = c. Show that if x is rational, then it is an integer. Thus if p is a prime, its nth root is an irrational number.

Exercise  Show that a positive integer is divisible by 3 iff the sum of its digits is divisible by 3. That is, let a = an a_{n−1} ... a0 = an 10^n + a_{n−1} 10^{n−1} + ··· + a0 where 0 ≤ ai ≤ 9. Now let b = an + a_{n−1} + ··· + a0, and show that 3 divides a and b with the same remainder. Although this is a straightforward exercise in long division, it will be more transparent later on. In the language of the next chapter, it says that [a] = [b] in Z3.

Card Trick  Ask friends to pick out seven cards from a deck and then to select one to look at without showing it to you. Take the six cards face down in your left hand and the selected card in your right hand, and announce you will place the selected card in with the other six, but they are not to know where. Put your hands behind your back and place the selected card on top, and bring the seven cards in front in your left hand. Ask your friends to give you a number between one and seven (not allowing one). Suppose they say three. You move the top card to the bottom, then the second card to the bottom, and then you turn over the third card, leaving it face up on top. Then repeat the process, moving the top two cards to the bottom and turning the third card face up on top. Continue until there is only one card face down, and this will be the selected card. Magic? Stay tuned for Chapter 2, where it is shown that any non-zero element of Z7 has order 7.
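The digit-sum exercise is easy to test numerically; a brute-force check in Python (purely illustrative, not a proof):

    for a in range(1, 10000):
        b = sum(int(d) for d in str(a))   # the sum of the digits of a
        assert a % 3 == b % 3             # 3 divides a and b with the same remainder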

Chapter 2

Groups

Groups are the central objects of algebra. In later chapters we will define rings and modules and see that they are special cases of groups. Also ring homomorphisms and module homomorphisms are special cases of group homomorphisms. Even though the definition of group is simple, it leads to a rich and amazing theory. Everything presented here is standard, except that the product of groups is given in the additive notation. This is the notation used in later chapters for the products of rings and modules. This chapter and the next two chapters are restricted to the most basic topics. The approach is to do quickly the fundamentals of groups, rings, and matrices, and to push forward to the chapter on linear algebra. This chapter is, by far and above, the most difficult chapter in the book, because all the concepts are new.

Definition  Suppose G is a non-void set and φ : G × G → G is a function. φ is called a binary operation, and we will write φ(a, b) = a·b or φ(a, b) = a + b. Consider the following properties, each stated first in multiplicative and then in additive notation.

1)  If a, b, c ∈ G, then a·(b·c) = (a·b)·c. If a, b, c ∈ G, then a + (b + c) = (a + b) + c.
2)  ∃ e = eG ∈ G such that if a ∈ G, e·a = a·e = a. ∃ 0 = 0G ∈ G such that if a ∈ G, 0 + a = a + 0 = a.
3)  If a ∈ G, ∃ b ∈ G with a·b = b·a = e (b is written as b = a^{-1}). If a ∈ G, ∃ b ∈ G with a + b = b + a = 0 (b is written as b = −a).
4)  If a, b ∈ G, then a·b = b·a. If a, b ∈ G, then a + b = b + a.

Definition  If properties 1), 2), and 3) hold, (G, φ) is said to be a group. If we write φ(a, b) = a·b, we say it is a multiplicative group. If we write φ(a, b) = a + b,

we say it is an additive group. If in addition property 4) holds, we say the group is abelian or commutative.

Theorem  Let (G, φ) be a multiplicative group.

(i)  Suppose a, c, c̄ ∈ G. Then (a·c = a·c̄) ⇒ c = c̄. Also (c·a = c̄·a) ⇒ c = c̄. In other words, if f : G → G is defined by f(c) = a·c, then f is injective. Also f is bijective with f^{-1} given by f^{-1}(c) = a^{-1}·c.

(ii)  e is unique, i.e., if ē ∈ G satisfies 2), then e = ē. In fact, if a, b ∈ G, then (a·b = a) ⇒ (b = e) and (a·b = b) ⇒ (a = e). Recall that b is an identity in G provided it is a right and left identity for any a in G. However, group structure is so rigid that if ∃ a ∈ G such that b is a right identity for a, then b = e. Of course, this is just a special case of the cancellation law in (i).

(iii)  Every right inverse is an inverse, i.e., if a·b = e, then b = a^{-1}. Also if b·a = e, then b = a^{-1}. Thus inverses are unique.

(iv)  If a ∈ G, then (a^{-1})^{-1} = a.

(v)  The multiplication a1·a2·a3 = a1·(a2·a3) = (a1·a2)·a3 is well defined. In general, a1·a2···an is well defined.

(vi)  If a, b ∈ G, then (a·b)^{-1} = b^{-1}·a^{-1}. Also (a1·a2···an)^{-1} = an^{-1}·a_{n−1}^{-1}···a1^{-1}.

(vii)  Suppose a ∈ G. Let a^0 = e and, if n > 0, a^n = a···a (n times) and a^{-n} = a^{-1}···a^{-1} (n times). If n1, n2, ..., nt ∈ Z, then a^{n1}·a^{n2}···a^{nt} = a^{n1+···+nt}. Also (a^n)^m = a^{nm}. Finally, if G is abelian and a, b ∈ G, then (a·b)^n = a^n·b^n.

Exercise  Write out the above theorem where G is an additive group. Note that part (vii) states that G has a scalar multiplication over Z. This means that if a is in G and n is an integer, there is defined an element an in G. This is so basic that we state it explicitly.

Theorem  Suppose G is an additive group. If a ∈ G, let a0 = 0 and, if n > 0, let an = (a + ··· + a) where the sum is n times, and a(−n) = (−a) + (−a) + ··· + (−a),

which we write as (−a − a − ··· − a). Then the following properties hold in general, except the first requires that G be abelian.

(a + b)n = an + bn
a(n + m) = an + am
a(nm) = (an)m
a1 = a

Note that the plus sign is used ambiguously — sometimes for addition in G and sometimes for addition in Z. In the language used in Chapter 5, this theorem states that any additive abelian group is a Z-module.

Exercise  Suppose G is a non-void set with a binary operation φ(a, b) = a·b which satisfies 1), 2) and [3')  If a ∈ G, ∃ b ∈ G with a·b = e]. Show (G, φ) is a group, i.e., show b·a = e. In other words, the group axioms are stronger than necessary. If every element has a right inverse, then every element has a two sided inverse. (See page 71.)

Examples  G = R, G = Q, or G = Z with φ(a, b) = a + b is an additive abelian group.

Examples  G = R − 0 or G = Q − 0 with φ(a, b) = ab is a multiplicative abelian group. G = R+ = {r ∈ R : r > 0} with φ(a, b) = ab is a multiplicative abelian group. G = Z − 0 with φ(a, b) = ab is not a group.

Exercise  Suppose G is the set of all functions from Z to Z, with multiplication defined by composition, i.e., f·g = f ◦ g. Note that G satisfies 1) and 2) but not 3), and thus G is not a group. Show that f has a right inverse in G iff f is surjective, and f has a left inverse in G iff f is injective. Also show that the set of all bijections from Z to Z is a group under composition.

Subgroups

Theorem  Suppose G is a multiplicative group and H ⊂ G is a non-void subset satisfying
1)  if a, b ∈ H, then a·b ∈ H, and
2)  if a ∈ H, then a^{-1} ∈ H.

Then e ∈ H and H is a group under multiplication. H is called a subgroup of G.

Proof  Since H is non-void, ∃ a ∈ H. By 2), a^{-1} ∈ H, and so by 1), e ∈ H. The associative law is immediate, and so H is a group.

Example  G is a subgroup of G and e is a subgroup of G. These are called the improper subgroups of G.

Example  If G = Z under addition and n ∈ Z, then H = nZ is a subgroup of Z. By a theorem in the section on the integers in Chapter 1, every subgroup of Z is of this form. This is a key property of the integers.

Exercises
1)  Suppose G is a multiplicative group. Let H be the center of G, H = {h ∈ G : g·h = h·g for all g ∈ G}. Show H is a subgroup of G.
2)  Suppose H1 and H2 are subgroups of G. Show H1 ∩ H2 is a subgroup of G.
3)  Suppose H1 and H2 are subgroups of G, with neither H1 nor H2 contained in the other. Show H1 ∪ H2 is not a subgroup of G.
4)  Suppose T is an index set and for each t ∈ T, Ht is a subgroup of G. Show ∩_{t∈T} Ht is a subgroup of G. Furthermore, if {Ht} is a monotonic collection, show ∪_{t∈T} Ht is a subgroup of G.
5)  Suppose G = {all functions f : [0, 1] → R}. Define an addition on G by (f + g)(t) = f(t) + g(t) for all t ∈ [0, 1]. This makes G into an abelian group. Let K be the subset of G composed of all differentiable functions, and H be the subset of G composed of all continuous functions. What theorems in calculus show that H and K are subgroups of G? What theorem shows that K is a subset (and thus subgroup) of H?

Order  Suppose G is a multiplicative group. If G has an infinite number of

elements, we say that o(G), the order of G, is infinite. If G has n elements, then o(G) = n. Suppose a ∈ G and H = {a^i : i ∈ Z}. H is an abelian subgroup of G called the subgroup generated by a. We define the order of the element a to be the order of H, i.e., the order of the subgroup generated by a.

Let f : Z → H be the surjective function defined by f(m) = a^m. Note that f(k + l) = f(k)·f(l), where the addition is in Z and the multiplication is in the group H. Later in this chapter we will see that f is a homomorphism from an additive group to a multiplicative group and that, in additive notation, H is isomorphic to Z or Zn.

We come now to the first real theorem in group theory. It says that the element a has finite order iff f is not injective, and in this case, the order of a is the smallest positive integer n with a^n = e, and f^{-1}(e) = nZ.

Theorem  Suppose a is an element of a multiplicative group G, and H = {a^i : i ∈ Z}. If ∃ distinct integers i and j with a^i = a^j, then a has some finite order n. In this case H has n distinct elements, H = {a^0, a^1, ..., a^{n−1}}, and a^m = e iff n | m. In particular, the order of a is the smallest positive integer n with a^n = e.

Proof  Suppose j < i and a^i = a^j. Then a^{i−j} = e, and thus ∃ a smallest positive integer n with a^n = e. This implies that the elements of {a^0, a^1, ..., a^{n−1}} are distinct, and we must show they are all of H. If m ∈ Z, the Euclidean algorithm states that ∃ integers q and r with 0 ≤ r < n and m = nq + r. Thus a^m = a^{nq}·a^r = a^r, and so H = {a^0, a^1, ..., a^{n−1}}. Also a^m = e iff n | m.

Exercise  Write out this theorem for G an additive group, i.e., suppose a is an element of an additive group G. In additive notation, the order of a is the smallest positive integer n with an = 0.

Exercise  Show that if G is a finite group of even order, then G has an odd number of elements of order 2. Note that e is the only element of order 1.

Definition  A group G is cyclic if ∃ an element of G which generates G. Note that Z is an additive cyclic group, and it was shown in the previous chapter that subgroups of Z are cyclic.

Theorem  If G is cyclic and H is a subgroup of G, then H is cyclic.

Proof  Suppose G is a cyclic group of order n. Then ∃ a ∈ G with G = {a^0, a^1, ..., a^{n−1}}. Suppose H is a subgroup of G with more than one element. Let m be the smallest integer with 0 < m < n and a^m ∈ H. Then m | n and a^m generates H. The case where G is an infinite cyclic group is left as an exercise.
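In additive notation these orders are easy to compute. Here is a minimal sketch for Zn (n = 12 is an arbitrary choice of ours): the order of [a] is the smallest k ≥ 1 with ka ≡ 0 (mod n), which works out to n/(a, n).

    from math import gcd

    n = 12
    for a in range(n):
        # order of [a] in Zn: smallest k >= 1 with k*a divisible by n
        k = next(k for k in range(1, n + 1) if (k * a) % n == 0)
        assert k == n // gcd(a, n)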


Cosets  Suppose H is a subgroup of a group G. It will be shown below that H partitions G into right cosets. It also partitions G into left cosets, and in general these partitions are distinct.

Theorem  If H is a subgroup of a multiplicative group G, then a ∼ b defined by a ∼ b iff a·b^{-1} ∈ H is an equivalence relation. If a ∈ G, cl(a) = {b ∈ G : a ∼ b} = {h·a : h ∈ H} = Ha. Note that a·b^{-1} ∈ H iff b·a^{-1} ∈ H.

If H is a subgroup of an additive group G, then a ∼ b defined by a ∼ b iff (a − b) ∈ H is an equivalence relation. If a ∈ G, cl(a) = {b ∈ G : a ∼ b} = {h + a : h ∈ H} = H + a. Note that (a − b) ∈ H iff (b − a) ∈ H.

Definition  These equivalence classes are called right cosets. If the relation is defined by a ∼ b iff b^{-1}·a ∈ H, then the equivalence classes are cl(a) = aH and they are called left cosets. H is a left and right coset. If G is abelian, there is no distinction between right and left cosets. Note that b^{-1}·a ∈ H iff a^{-1}·b ∈ H.

In the theorem above, we used H to define an equivalence relation on G, and thus a partition of G. We now do the same thing a different way. We define the right cosets directly and show they form a partition of G. This is really much easier.

Theorem  Suppose H is a subgroup of a multiplicative group G. If a ∈ G, define the right coset containing a to be Ha = {h·a : h ∈ H}. Then the following hold.
1)  Ha = H iff a ∈ H.
2)  If b ∈ Ha, then Hb = Ha, i.e., if h ∈ H, then H(h·a) = (Hh)a = Ha.
3)  If Hc ∩ Ha ≠ ∅, then Hc = Ha.
4)  The right cosets form a partition of G, i.e., each a in G belongs to one and only one right coset.
5)  Elements a and b belong to the same right coset iff a·b^{-1} ∈ H iff b·a^{-1} ∈ H.

Proof There is no better way to develop facility with cosets than to prove this theorem. Also write this theorem for G an additive group.

Theorem  Suppose H is a subgroup of a multiplicative group G.

1)  Any two right cosets have the same number of elements. That is, if a, b ∈ G, then f : Ha → Hb defined by f(h·a) = h·b is a bijection. Also any two left cosets have the same number of elements. Since H is a right and left coset, any two cosets have the same number of elements.
2)  G has the same number of right cosets as left cosets. The bijection is given by F(Ha) = a^{-1}H. The number of right (or left) cosets is called the index of H in G.
3)  If G is finite, o(H)·(index of H) = o(G), and so o(H) | o(G). In other words, o(G)/o(H) = the number of right cosets = the number of left cosets.
4)  If G is finite and a ∈ G, then o(a) | o(G). (Proof: The order of a is the order of the subgroup generated by a, and by 3) this divides the order of G.)
5)  If G has prime order, then G is cyclic, and any element (except e) is a generator. (Proof: Suppose o(G) = p and a ∈ G, a ≠ e. Then o(a) | p, and thus o(a) = p.)
6)  If o(G) = n and a ∈ G, then a^n = e. (Proof: a^{o(a)} = e and n = o(a)·(o(G)/o(a)).)

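Parts 4) and 6) are easy to check numerically in a small group. A hedged sketch, using the classes [a] in Z21 with (a, 21) = 1, which form a multiplicative group G with o(G) = 12 (a fact taken on faith here; multiplication in Zn is developed in the next chapter):

    from math import gcd

    n = 21
    G = [a for a in range(1, n) if gcd(a, n) == 1]

    for a in G:
        # o(a): smallest k with a^k = 1 (mod n)
        k = next(k for k in range(1, len(G) + 1) if pow(a, k, n) == 1)
        assert len(G) % k == 0           # o(a) divides o(G)
        assert pow(a, len(G), n) == 1    # a^o(G) = e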

Exercises
i)  Suppose G is a cyclic group of order 4, G = {e, a, a^2, a^3} with a^4 = e. Find the order of each element of G. Find all the subgroups of G.
ii)  Suppose G is the additive group Z and H = 3Z. Find the cosets of H.
iii)  Think of a circle as the interval [0, 1] with end points identified. Suppose G = R under addition and H = Z. Show that the collection of all the cosets of H can be thought of as a circle.
iv)  Let G = R^2 under addition, and H be the subgroup defined by H = {(a, 2a) : a ∈ R}. Find the cosets of H. (See the last exercise on p. 5.)

Normal Subgroups

We would like to make a group out of the collection of cosets of a subgroup H. In


general, there is no natural way to do that. However, it is easy to do in case H is a normal subgroup, which is described below.

Theorem  If H is a subgroup of G, then the following are equivalent.
1)  If a ∈ G, then aHa^{-1} = H.
2)  If a ∈ G, then aHa^{-1} ⊂ H.
3)  If a ∈ G, then aH = Ha.
4)  Every right coset is a left coset, i.e., if a ∈ G, ∃ b ∈ G with Ha = bH.

Proof 1) ⇒ 2) is obvious. Suppose 2) is true and show 3). We have (aHa−1 )a ⊂ Ha so aH ⊂ Ha. Also a(a−1 Ha) ⊂ aH so Ha ⊂ aH. Thus aH = Ha. 3) ⇒ 4) is obvious. Suppose 4) is true and show 3). Ha = bH contains a, so bH = aH because a coset is an equivalence class. Finally, suppose 3) is true and show 1). Multiply aH = Ha on the right by a−1 . Definition If H satisfies any of the four conditions above, then H is said to be a normal subgroup of G. Note For any group G, G and e are normal subgroups. If G is an abelian group, then every subgroup of G is normal. Exercise Show that if H is a subgroup of G with index 2, then H is normal.

Exercise Show the intersection of a collection of normal subgroups of G is a normal subgroup of G. Show the union of a monotonic collection of normal subgroups of G is a normal subgroup of G. Exercise Let A ⊂ R2 be the square with vertices (−1, 1), (1, 1), (1, −1), and (−1, −1), and G be the collection of all “isometries” of A onto itself. These are bijections of A onto itself which preserve distance and angles, i.e., which preserve dot product. Show that with multiplication defined as composition, G is a multiplicative group. Show that G has four rotations, two reflections about the axes, and two reflections about the diagonals, for a total of eight elements. Show the collection of rotations is a cyclic subgroup of order four which is a normal subgroup of G. Show that the reflection about the x-axis together with the identity form a cyclic subgroup of order two which is not a normal subgroup of G. Find the four right cosets of this subgroup. Finally, find the four left cosets of this subgroup.

Quotient Groups

Suppose N is a normal subgroup of G, and C and D are cosets. We wish to define a coset E which is the product of C and D. If c ∈ C and d ∈ D, define E to be the coset containing c·d, i.e., E = N(c·d). The coset E does not depend upon the choice of c and d. This is made precise in the next theorem, which is quite easy.

Theorem  Suppose G is a multiplicative group, N is a normal subgroup, and G/N is the collection of all cosets. Then (Na)·(Nb) = N(a·b) is a well defined multiplication (binary operation) on G/N, and with this multiplication, G/N is a group. Its identity is N, and (Na)^{-1} = (Na^{-1}). Furthermore, if G is finite, o(G/N) = o(G)/o(N).

Proof  Multiplication of elements in G/N is multiplication of subsets in G: (Na)·(Nb) = N(aN)b = N(Na)b = N(a·b). Once multiplication is well defined, the group axioms are immediate.

Exercise  Write out the above theorem for G an additive abelian group.

Example  Suppose G = Z under +, n > 1, and N = nZ. Zn, the group of integers mod n, is defined by Zn = Z/nZ. If a is an integer, the coset a + nZ is denoted by [a]. Note that [a] + [b] = [a + b], −[a] = [−a], and [a] = [a + nl] for any integer l. Any additive abelian group has a scalar multiplication over Z, and in this case it is just [a]m = [am]. Note that [a] = [r] where r is the remainder of a divided by n, and thus the distinct elements of Zn are [0], [1], ..., [n − 1]. Also Zn is cyclic, because each of [1] and [−1] = [n − 1] is a generator.

Exercise  Show that a positive integer is divisible by 3 iff the sum of its digits is divisible by 3. Note that [10] = [1] in Z3. (See the fifth exercise on page 18.)

We already know that if p is a prime, any non-zero element of Zp is a generator, because Zp has p elements. The next theorem treats general n.

Theorem  If n > 1 and a is any integer, then [a] is a generator of Zn iff (a, n) = 1.

Proof  The element [a] is a generator iff the subgroup generated by [a] contains [1] iff ∃ an integer k such that [a]k = [1] iff ∃ integers k and l such that ak + nl = 1.

Homomorphisms

Homomorphisms are functions between groups that commute with the group operations. It follows that they honor identities and inverses. In this section we list
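The generator criterion is easy to verify by brute force for a particular n (n = 12 below is an arbitrary choice of ours):

    from math import gcd

    n = 12
    for a in range(n):
        generates = len({(a * k) % n for k in range(n)}) == n   # does [a] generate Zn?
        assert generates == (gcd(a, n) == 1)

    print([a for a in range(n) if gcd(a, n) == 1])   # [1, 5, 7, 11], the generators of Z12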

the basic properties of homomorphisms. These properties will be helpful later on when we study ring homomorphisms and module homomorphisms, and should be considered as the cornerstones of abstract algebra. Properties 11), 12), and 13) show the connections between coset groups and homomorphisms.

Definition  If G and Ḡ are multiplicative groups, a function f : G → Ḡ is a homomorphism if, for all a, b ∈ G, f(a·b) = f(a)·f(b). On the left side, the group operation is in G, while on the right side it is in Ḡ. The kernel of f is defined by ker(f) = f^{-1}(ē) = {a ∈ G : f(a) = ē}. In other words, the kernel is the set of solutions to the equation f(x) = ē. (If Ḡ is an additive group, ker(f) = f^{-1}(0̄).)

Examples  The constant map f : G → Ḡ defined by f(a) = ē is a homomorphism. If H is a subgroup of G, the inclusion i : H → G is a homomorphism. The function f : Z → Z defined by f(t) = 2t is a homomorphism of additive groups, while the function defined by f(t) = t + 2 is not a homomorphism. The function h : Z → R − 0 defined by h(t) = 2^t is a homomorphism from an additive group to a multiplicative group.

We now catalog the basic properties of homomorphisms.

Theorem  Suppose G and Ḡ are groups and f : G → Ḡ is a homomorphism.
1)  f(e) = ē.
2)  f(a^{-1}) = f(a)^{-1}.
3)  f is injective ⇔ ker(f) = e.
4)  If H is a subgroup of G, then f(H) is a subgroup of Ḡ. In particular, image(f) is a subgroup of Ḡ.
5)  If H̄ is a subgroup of Ḡ, then f^{-1}(H̄) is a subgroup of G. Furthermore, if H̄ is normal in Ḡ, then f^{-1}(H̄) is normal in G.
6)  The kernel of f is a normal subgroup of G.
7)  If ḡ ∈ Ḡ, then f^{-1}(ḡ) is void or is a coset of ker(f), i.e., if f(g) = ḡ, then f^{-1}(ḡ) = Ng where N = ker(f). In other words, if the equation f(x) = ḡ has a

solution, then the set of all solutions is a coset of N = ker(f). This is a key fact which is used routinely in topics such as systems of equations and linear differential equations.
8)  The composition of homomorphisms is a homomorphism, i.e., if h : Ḡ → G̿ is a homomorphism, then h ◦ f : G → G̿ is a homomorphism.
9)  If f : G → Ḡ is a bijection, then the function f^{-1} : Ḡ → G is a homomorphism. In this case, f is called an isomorphism, and we write G ≈ Ḡ. In the case G = Ḡ, f is also called an automorphism.
10)  Isomorphisms preserve all algebraic properties. For example, if f is an isomorphism and H ⊂ G is a subset, then H is a subgroup of G iff f(H) is a subgroup of Ḡ. Also H is normal in G iff f(H) is normal in Ḡ, G is cyclic iff Ḡ is cyclic, etc. Of course, this is somewhat of a cop-out, because an algebraic property is one that, by definition, is preserved under isomorphisms.
11)  Suppose H is a normal subgroup of G. Then π : G → G/H defined by π(a) = Ha is a surjective homomorphism with kernel H.
12)  Suppose H is a normal subgroup of G. If H ⊂ ker(f), then f̄ : G/H → Ḡ defined by f̄(Ha) = f(a) is a well-defined homomorphism making the following diagram commute.

[Commutative diagram: f : G → Ḡ factors as f = f̄ ◦ π, where π : G → G/H is the projection and f̄ : G/H → Ḡ.]

Thus defining a homomorphism on a quotient group is the same as defining a homomorphism on the numerator which sends the denominator to ē. The image of f̄ is the image of f, and the kernel of f̄ is ker(f)/H. Thus if H = ker(f), then f̄ is injective, and thus G/H ≈ image(f).
13)  Given any group homomorphism f, domain(f)/ker(f) ≈ image(f). This is the fundamental connection between quotient groups and homomorphisms. For example, if f : G → Ḡ is a surjective homomorphism with kernel H, then G/H ≈ Ḡ.

14)  Suppose K is a group. Then K is an infinite cyclic group iff K is isomorphic to the integers under addition, i.e., K ≈ Z. Also K is a cyclic group of order n iff K ≈ Zn.

Proof of 14)  Suppose the group K is generated by some element a. Then f : Z → K defined by f(m) = a^m is a homomorphism from an additive group to a multiplicative group. If o(a) is infinite, f is an isomorphism. If o(a) = n, then ker(f) = nZ and f̄ : Zn → K is an isomorphism.

Exercise  If a is an element of a group G, there is always a homomorphism from Z to G which sends 1 to a. When is there a homomorphism from Zn to G which sends [1] to a? What are the homomorphisms from Z2 to Z6? What are the homomorphisms from Z4 to Z8?

Exercise  Suppose G is a group and g is an element of G, g ≠ e.
1)  Under what conditions on g is there a homomorphism f : Z7 → G with f([1]) = g?
2)  Under what conditions on g is there a homomorphism f : Z15 → G with f([1]) = g?
3)  Under what conditions on G is there an injective homomorphism f : Z15 → G?
4)  Under what conditions on G is there a surjective homomorphism f : Z15 → G?

Exercise  We know every finite group of prime order is cyclic and thus abelian. Show that every group of order four is abelian.

Exercise  Let G = {h : [0, 1] → R : h has an infinite number of derivatives}. Then G is a group under addition. Define f : G → G by f(h) = dh/dt = h′. Show f is a homomorphism and find its kernel and image. Let g : [0, 1] → R be defined by g(t) = t^3 − 3t + 4. Find f^{-1}(g) and show it is a coset of ker(f).

Exercise  Let G be as above and g ∈ G. Define f : G → G by f(h) = h′′ + 5h′ + 6t^2 h. Then f is a group homomorphism, and the differential equation h′′ + 5h′ + 6t^2 h = g has a solution iff g lies in the image of f. Now suppose this equation has a solution and S ⊂ G is the set of all solutions. For which subgroup H of G is S an H-coset?
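Returning to the counting questions above, note that a homomorphism f : Zm → Zn is determined by f([1]) = [a], and [a] can occur iff m[a] = [0] in Zn. A brute-force sketch (the helper name homs is ours):

    def homs(m, n):
        # values [a] = f([1]) giving a well defined homomorphism Zm -> Zn
        return [a for a in range(n) if (m * a) % n == 0]

    print(homs(2, 6))   # [0, 3]: two homomorphisms from Z2 to Z6
    print(homs(4, 8))   # [0, 2, 4, 6]: four homomorphisms from Z4 to Z8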

Exercise  Suppose G is a multiplicative group and a ∈ G. Define f : G → G to be conjugation by a, i.e., f(g) = a^{-1}·g·a. Show that f is a homomorphism. Also show f is an automorphism and find its inverse.

Permutations

Suppose X is a (non-void) set. A bijection f : X → X is called a permutation on X, and the collection of all these permutations is denoted by S = S(X). In this setting, variables are written on the left, i.e., f = (x)f. Therefore the composition f ◦ g means “f followed by g”. S(X) forms a multiplicative group under composition.

Exercise  Show that if there is a bijection between X and Y, there is an isomorphism between S(X) and S(Y). Thus if each of X and Y has n elements, S(X) ≈ S(Y).

The Symmetric Groups  Now let n ≥ 2 and let X = {1, 2, ..., n}. The groups S(X), for the various sets X with n elements, are called the symmetric groups on n elements. They are all denoted by the one symbol Sn. The next theorem shows that the symmetric groups are incredibly rich and complex.

Theorem (Cayley's Theorem)  Suppose G is a multiplicative group with n elements and Sn is the group of all permutations on the set G. Then G is isomorphic to a subgroup of Sn.

Proof  Let h : G → Sn be the function which sends a to the bijection ha : G → G defined by (g)ha = g·a. The proof follows from the following observations.
1)  For each given a, ha is a bijection from G to G.
2)  h is a homomorphism, i.e., ha·b = ha ◦ hb.
3)  h is injective, and thus G is isomorphic to image(h) ⊂ Sn.

Exercise  Show that o(Sn) = n!. Let X = {1, 2, ..., n} and H = {f ∈ Sn : (n)f = n}. Show H is a subgroup of Sn which is isomorphic to Sn−1. Let g be any permutation on X with (n)g = 1. Find g^{-1}Hg.

The following definition shows that each element of Sn may

be represented by a matrix.

Definition  Suppose 1 < k ≤ n, {a1, a2, ..., ak} is a collection of distinct integers with 1 ≤ ai ≤ n, and {b1, b2, ..., bk} is the same collection in some different order. Then the matrix

    ( a1  a2  ...  ak )
    ( b1  b2  ...  bk )

represents f ∈ Sn defined by (ai)f = bi for 1 ≤ i ≤ k, and (a)f = a for all other a. The composition of two permutations is computed by applying the matrix on the left first and the matrix on the right second.

There is a special type of permutation called a cycle. For these we have a special notation.

Definition  The matrix

    ( a1  a2  ...  ak−1  ak )
    ( a2  a3  ...  ak    a1 )

is called a k-cycle, and is denoted by (a1, a2, ..., ak). A 2-cycle is called a transposition. The cycles (a1, ..., ak) and (c1, ..., cl) are disjoint provided ai ≠ cj for all 1 ≤ i ≤ k and 1 ≤ j ≤ l.

Listed here are seven basic properties of permutations. They are all easy except 4), which is rather delicate. Properties 8), 9), and 10) are listed solely for reference.

Theorem
1)  Disjoint cycles commute. (This is obvious.)
2)  Every permutation can be written uniquely (except for order) as the product of disjoint cycles. (This is easy.)
3)  Every permutation can be written (non-uniquely) as the product of transpositions. (Proof: (a1, ..., an) = (a1, a2)(a1, a3)···(a1, an).)
4)  The parity of the number of these transpositions is unique. This means that if f is the product of p transpositions and also of q transpositions, then p is even iff q is even. In this case, f is said to be an even permutation. In the other case, f is an odd permutation.
5)  A k-cycle is even (odd) iff k is odd (even). For example (1, 2, 3) = (1, 2)(1, 3) is an even permutation.
6)  Suppose f, g ∈ Sn. If one of f and g is even and the other is odd, then g ◦ f is odd. If f and g are both even or both odd, then g ◦ f is even.
7)  The map h : Sn → Z2 defined by h(even) = [0] and h(odd) = [1] is a homomorphism from a multiplicative group to an additive group. Its kernel (the subgroup of even permutations) is denoted by An and is called the alternating group. Thus An is a normal subgroup of index 2, and Sn/An ≈ Z2.

The following parts are not included in this course. They are presented here merely for reference.

8)  For any n ≥ 3, An is generated by its 3-cycles.
9)  For any n ≠ 4, An is simple, i.e., has no proper normal subgroups.
10)  Sn can be generated by two elements. In fact, {(1, 2), (1, 2, ..., n)} generates Sn. (Of course there are subgroups of Sn which cannot be generated by two elements.)

Proof of 4)  The proof presented here uses polynomials in n variables with real coefficients. Since polynomials will not be introduced until Chapter 3, the student may skip the proof until after that chapter. Suppose S = {1, 2, ..., n}. If σ is a permutation on S and p = p(x1, ..., xn) is a polynomial in n variables, define σ(p) to be the polynomial p(x_{(1)σ}, ..., x_{(n)σ}). Thus if p = x1 x2 + x1 x3 and σ is the transposition (1, 2), then σ(p) = x2 x1 + x2 x3. Note that if σ1 and σ2 are permutations, σ2(σ1(p)) = (σ1·σ2)(p). Now let p be the product of all (xi − xj) where 1 ≤ i < j ≤ n. (For example, if n = 3, p = (x1 − x2)(x1 − x3)(x2 − x3).) If σ is a permutation on S, then for each 1 ≤ i, j ≤ n with i ≠ j, σ(p) has (xi − xj) or (xj − xi) as a factor. Thus σ(p) = ±p. A careful examination shows that if σi is a transposition, then σi(p) = −p. Any permutation σ is the product of transpositions, σ = σ1·σ2···σt. Thus if σ(p) = p, t must be even, and if σ(p) = −p, t must be odd.
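For the exercise below, the decomposition can be carried out mechanically. Here is a hedged Python sketch (the helper is ours; it represents a permutation as a dictionary i ↦ (i)f) which finds the disjoint cycles and reads off the parity, using the fact that a k-cycle is the product of k − 1 transpositions:

    def cycles(perm):
        seen, out = set(), []
        for i in perm:
            if i not in seen:
                c = [i]
                seen.add(i)
                while perm[c[-1]] not in seen:
                    c.append(perm[c[-1]])
                    seen.add(c[-1])
                if len(c) > 1:
                    out.append(tuple(c))
        return out

    perm = {1: 6, 2: 5, 3: 4, 4: 3, 5: 1, 6: 7, 7: 2}   # the matrix in part 1) below
    cs = cycles(perm)
    print(cs)                               # [(1, 6, 7, 2, 5), (3, 4)]
    print(sum(len(c) - 1 for c in cs) % 2)  # 1, so this permutation is odd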

Exercise
1)  Write

    ( 1  2  3  4  5  6  7 )
    ( 6  5  4  3  1  7  2 )

as the product of disjoint cycles. Write (1, 5, 6, 7)(2, 3, 4)(3, 7, 1) as the product of disjoint cycles. Write (3, 7, 1)(1, 5, 6, 7)(2, 3, 4) as the product of disjoint cycles. Which of these permutations are odd and which are even?
2)  Suppose (a1, ..., ak) and (c1, ..., cl) are disjoint cycles. What is the order of their product?
3)  Suppose σ ∈ Sn. Show that σ^{-1}(1, 2, 3)σ = ((1)σ, (2)σ, (3)σ). This shows that conjugation by σ is just a type of relabeling. Also let τ = (4, 5, 6) and find τ^{-1}(1, 2, 3, 4, 5)τ.
4)  Show that H = {σ ∈ S6 : (6)σ = 6} is a subgroup of S6 and find its right cosets and its left cosets.
5)  Let A ⊂ R^2 be the square with vertices (−1, 1), (1, 1), (1, −1), and (−1, −1), and G be the collection of all isometries of A onto itself. We know from a previous exercise that G is a group with eight elements. It follows from Cayley's theorem that G is isomorphic to a subgroup of S8. Show that G is isomorphic to a subgroup of S4.
6)  If G is a multiplicative group, define a new multiplication on the set G by a ◦ b = b·a. In other words, the new multiplication is the old multiplication in the opposite order. This defines a new group denoted by G^{op}, the opposite group. Show that it has the same identity and the same inverses as G, and that f : G → G^{op} defined by f(a) = a^{-1} is a group isomorphism. Now consider the special case G = Sn. The convention used in this section is that an element of Sn is a permutation on {1, 2, ..., n} with the variable written on the left. Show that an element of Sn^{op} is a permutation on {1, 2, ..., n} with the variable written on the right. (Of course, either Sn or Sn^{op} may be called the symmetric group, depending on personal preference or context.)

Product of Groups

The product of groups is usually presented for multiplicative groups. It is presented here for additive groups because this is the form that occurs in later chapters. As an exercise, this section should be rewritten using multiplicative notation. The two theorems below are transparent and easy, but quite useful. For simplicity we first consider the product of two groups, although the case of infinite products is only slightly more difficult. For background, read the two theorems on page 11.

Theorem  Suppose G1 and G2 are additive groups. Define an addition on G1 × G2 by (a1, a2) + (b1, b2) = (a1 + b1, a2 + b2). This operation makes G1 × G2 into a group. Its “zero” is (01, 02) and −(a1, a2) = (−a1, −a2). The projections π1 : G1 × G2 → G1

and π2 : G1 × G2 → G2 are group homomorphisms.

Theorem  Suppose G is an additive group. We know there is a bijection from {functions f : G → G1 × G2} to {ordered pairs of functions (f1, f2) where f1 : G → G1 and f2 : G → G2}. Under this bijection, f is a group homomorphism iff each of f1 and f2 is a group homomorphism.

Proof  It is transparent that the product of groups is a group, so let's prove the last part. Suppose G, G1, and G2 are groups and f = (f1, f2) is a function from G to G1 × G2. Now f(a + b) = (f1(a + b), f2(a + b)) and f(a) + f(b) = (f1(a), f2(a)) + (f1(b), f2(b)) = (f1(a) + f1(b), f2(a) + f2(b)). An examination of these two equations shows that f is a group homomorphism iff each of f1 and f2 is a group homomorphism.

Exercise  Suppose G1 and G2 are groups. Show G1 × G2 and G2 × G1 are isomorphic.

Exercise  If o(a1) = n and o(a2) = m, find the order of (a1, a2) in G1 × G2.

Exercise  Show that if G is any group of order 4, G is isomorphic to Z4 or Z2 × Z2. Show Z4 is not isomorphic to Z2 × Z2. Show that Zmn is isomorphic to Zn × Zm iff (n, m) = 1.

Exercise  Suppose G1 and G2 are groups and i1 : G1 → G1 × G2 is defined by i1(g1) = (g1, 02). Show i1 is an injective group homomorphism and its image is a normal subgroup of G1 × G2. Usually G1 is identified with its image under i1, so G1 may be considered to be a normal subgroup of G1 × G2. Let π2 : G1 × G2 → G2 be the projection map defined in the Background chapter. Show π2 is a surjective homomorphism with kernel G1. Therefore (G1 × G2)/G1 ≈ G2.

Exercise  Let R be the reals under addition. Show that the addition in the product R × R is just the usual addition in analytic geometry.

Exercise  Suppose n > 2. Is Sn isomorphic to An × G where G is a multiplicative group of order 2?

One nice thing about the product of groups is that it works fine for any finite number, or even any infinite number. The next theorem is stated in full generality.

Theorem  Suppose T is an index set, and for each t ∈ T, Gt is an additive group. Define an addition on ∏_{t∈T} Gt by {at} + {bt} = {at + bt}. This operation makes the product into a group. Its “zero” is {0t} and −{at} = {−at}. Each projection πs : ∏_{t∈T} Gt → Gs is a group homomorphism. Suppose G is an additive group. Under the natural bijection from {functions f : G → ∏_{t∈T} Gt} to {sequences of functions {ft}_{t∈T} where ft : G → Gt}, f is a group homomorphism iff each ft is a group homomorphism. Finally, the scalar multiplication on ∏_{t∈T} Gt by integers is given coordinatewise, i.e., {at}n = {at n}.

Proof  The addition on ∏_{t∈T} Gt is coordinatewise.

Exercise  Suppose s is an element of T and πs : ∏ Gt → Gs is the projection map defined in the Background chapter. Show πs is a surjective homomorphism and find its kernel.

Exercise  Suppose s is an element of T and is : Gs → ∏ Gt is defined by is(a) = {at} where at = 0 if t ≠ s and as = a. Show is is an injective homomorphism and its image is a normal subgroup of ∏ Gt. Thus each Gs may be considered to be a normal subgroup of ∏ Gt.

Exercise  Let f : Z → Z30 × Z100 be the homomorphism defined by f(m) = ([4m], [3m]). Find the kernel of f. Find the order of ([4], [3]) in Z30 × Z100.

Exercise  Let f : Z → Z90 × Z70 × Z42 be the group homomorphism defined by f(m) = ([m], [m], [m]). Find the kernel of f and show that f is not surjective. Note that the gcd of {45, 35, 21} is 1. Let g : Z → Z45 × Z35 × Z21 be defined by g(m) = ([m], [m], [m]). Find the kernel of g and determine if g is surjective. Now let h : Z → Z8 × Z9 × Z35 be defined by h(m) = ([m], [m], [m]). Find the kernel of h and show that h is surjective. Finally suppose each of b, c, and d is greater than 1 and f : Z → Zb × Zc × Zd is defined by f(m) = ([m], [m], [m]). Find necessary and sufficient conditions for f to be surjective. (See exercises on pages 44 and 69.)

Exercise  Suppose T is a non-void set, G is an additive group, and G^T is the collection of all functions f : T → G with addition defined by (f + g)(t) = f(t) + g(t). Show G^T is a group. Note that G^T is just another way of writing ∏_{t∈T} Gt, where Gt = G for each t ∈ T. Also note that if T = [0, 1] and G = R, the addition defined on G^T is just the usual addition of functions used in calculus.
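The surjectivity questions above can be explored numerically. A sketch (the helper name is ours) checking when m ↦ ([m], [m]) from Z to Zn × Zm hits every pair — exactly when (n, m) = 1:

    def diag_surjective(n, m):
        return len({(k % n, k % m) for k in range(n * m)}) == n * m

    assert diag_surjective(8, 9)        # (8, 9) = 1
    assert diag_surjective(9, 35)       # (9, 35) = 1
    assert not diag_surjective(90, 70)  # a common factor obstructs surjectivity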

Chapter 3

Rings

Rings are additive abelian groups with a second operation called multiplication. The connection between the two operations is provided by the distributive law. Assuming the results of Chapter 2, this chapter flows smoothly. This is because ideals are also normal subgroups and ring homomorphisms are also group homomorphisms. We do not show that the polynomial ring F[x] is a unique factorization domain, although with the material at hand, it would be easy to do. Also there is no mention of prime or maximal ideals, because these concepts are unnecessary for our development of linear algebra. These concepts are developed in the Appendix. A section on Boolean rings is included because of their importance in logic and computer science.

Suppose R is an additive abelian group, R ≠ 0, and R has a second binary operation (i.e., map from R × R to R) which is denoted by multiplication. Consider the following properties.

1)  If a, b, c ∈ R, then (a·b)·c = a·(b·c). (The associative property of multiplication.)
2)  If a, b, c ∈ R, then a·(b + c) = (a·b) + (a·c) and (b + c)·a = (b·a) + (c·a). (The distributive law, which connects addition and multiplication.)
3)  R has a multiplicative identity, i.e., an element 1 = 1R ∈ R such that if a ∈ R, a·1 = 1·a = a.
4)  If a, b ∈ R, then a·b = b·a. (The commutative property for multiplication.)

Definition  If 1), 2), and 3) are satisfied, R is said to be a ring. If in addition 4) is satisfied, R is said to be a commutative ring.

Examples  The basic commutative rings in mathematics are the integers Z, the

rational numbers Q, the real numbers R, and the complex numbers C. It will be shown later that Zn, the integers mod n, has a natural multiplication under which it is a commutative ring. Also, if R is any commutative ring, we will define R[x1, x2, ..., xn], a polynomial ring in n variables. In the next chapter, operations of addition and multiplication of matrices will be defined. Under these operations, Rn, the collection of all n × n matrices over R, is a ring. This is a basic example of a non-commutative ring: if n > 1, Rn is never commutative, even if R is commutative.

The next two theorems show that ring multiplication behaves as you would wish it to. They should be worked as exercises.

Theorem  Suppose R is a ring and a, b ∈ R. Then
1)  a·0 = 0·a = 0.
2)  (−a)·b = a·(−b) = −(a·b).

Recall that, since R is an additive abelian group, it has a scalar multiplication over Z. This scalar multiplication can be written on the right or left, i.e., na = an, and the next theorem shows it relates nicely to the ring multiplication.

Theorem  Suppose a, b ∈ R and n, m ∈ Z. Then
1)  (na)·(mb) = (nm)(a·b). (This follows from the distributive law and the previous theorem.)
2)  Let n̄ = n1, i.e., 2̄ = 1 + 1. Then na = n̄·a, that is, scalar multiplication by n is the same as ring multiplication by n̄. Of course, n̄ may be 0̄ even though n ≠ 0.

Units

Definition  An element a of a ring R is a unit provided ∃ an element a^{-1} ∈ R with a·a^{-1} = a^{-1}·a = 1.

Theorem  1 is always a unit. 0̄ can never be a unit; therefore 1 ≠ 0̄. If a is a unit, a^{-1} is also a unit with (a^{-1})^{-1} = a. The product of units is a unit with (a·b)^{-1} = b^{-1}·a^{-1}. More

generally, if a1, a2, ..., an are units, then their product is a unit with (a1·a2···an)^{-1} = an^{-1}·a_{n−1}^{-1}···a1^{-1}. The set of all units of R forms a multiplicative group denoted by R*. Finally, if a is a unit, then (−a) is a unit and (−a)^{-1} = −(a^{-1}).

In order for a to be a unit, it must have a two-sided inverse. It suffices to require a left inverse and a right inverse, as shown in the next theorem.

Theorem  Suppose a ∈ R and ∃ elements b and c with b·a = a·c = 1. Then b = c, and so a is a unit with a^{-1} = b = c.

Proof  b = b·1 = b·(a·c) = (b·a)·c = 1·c = c.

Corollary  Inverses are unique.

Domains and Fields  In order to define these two types of rings, we first consider the concept of zero divisor.

Definition  Suppose R is a commutative ring. A non-zero element a ∈ R is called a zero divisor provided ∃ a non-zero element b with a·b = 0. Note that if a is a unit, it cannot be a zero divisor.

Theorem  Suppose R is a commutative ring and a ∈ (R − 0) is not a zero divisor. Then (a·b = a·c) ⇒ b = c. In other words, multiplication by a is an injective map from R to R. It is surjective iff a is a unit.

Definition  A domain (or integral domain) is a commutative ring such that, if a ≠ 0, a is not a zero divisor. A field is a commutative ring such that, if a ≠ 0, a is a unit. In other words, R is a field if it is commutative and its non-zero elements form a group under multiplication.

Theorem  A field is a domain. A finite domain is a field.

Proof  A field is a domain because a unit cannot be a zero divisor. Suppose R is a finite domain and a ≠ 0. Then f : R → R defined by f(b) = a·b is injective and, by the pigeonhole principle, f is surjective. Thus a is a unit, and so R is a field.

The Integers Mod n  The concept of integers mod n is fundamental in mathematics. It leads to a neat little theory, as seen by the theorems below. However, the basic theory cannot be completed until the product of rings is defined. (See the Chinese Remainder Theorem on page 50.) We know from Chapter 2 that Zn is an additive abelian group.

Theorem  Define a multiplication on Zn by [a]·[b] = [ab]. This is a well defined binary operation which makes Zn into a commutative ring.

Proof  Since [a + kn]·[b + ln] = [ab + n(al + bk + kln)] = [ab], the multiplication is well defined. The ring axioms are easily verified. Recall that if b is an integer, [a]b = [a]·[b] = [ab].

Theorem  Suppose n > 1 and a ∈ Z. Then the following are equivalent.
1)  [a] is a generator of the additive group Zn.
2)  (a, n) = 1.
3)  [a] is a unit of the ring Zn.

Proof  We already know 1) and 2) are equivalent. Parts 1) and 3) are equivalent because each says ∃ an integer b with [a]b = [1].

Corollary  If n > 1, the following are equivalent.
1)  Zn is a domain.
2)  Zn is a field.
3)  n is a prime.

Proof  We already know 1) and 2) are equivalent, because Zn is finite. Suppose 3) is true. Then by the previous theorem, each of [1], [2], ..., [n − 1] is a unit, and thus 2) is true. Now suppose 3) is false. Then n = ab where 1 < a < n, 1 < b < n, and [a][b] = [0]. Thus [a] is a zero divisor and 1) is false.

Exercise  Let C be the additive abelian group R^2. Define multiplication by (a, b)·(c, d) = (ac − bd, ad + bc). Show C is a commutative ring which is a field. Note that 1 = (1, 0) and if i = (0, 1), then i^2 = −1.

Exercise  List the units and their inverses for Z7 and Z12. Show that (Z7)* is a cyclic group but (Z12)* is not. Show that in Z12 the equation x^2 = 1̄ has four solutions. Finally show that if R is a domain, x^2 = 1̄ can have at most two solutions in R.
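The units of Zn can be listed by brute force; a minimal sketch for the exercise (the helper name is ours):

    def units(n):
        # pairs [a] : [a]^{-1} with a*b = 1 (mod n)
        return {a: b for a in range(1, n) for b in range(1, n) if (a * b) % n == 1}

    print(units(7))    # {1: 1, 2: 4, 3: 5, 4: 2, 5: 3, 6: 6} -- every non-zero class
    print(units(12))   # {1: 1, 5: 5, 7: 7, 11: 11} -- each unit is its own inverse

The second output also exhibits the four solutions of x^2 = 1̄ in Z12.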

3)  Furthermore, if the collection is monotonic, then ∪t∈T It is a right (left, 2-sided) ideal of R.
4)  If a ∈ R, I = aR is a right ideal. Thus if R is commutative, aR is an ideal, called a principal ideal.
5)  If R is a commutative ring and I ⊂ R is an ideal, then the following are equivalent. i) I = R. ii) I contains some unit u. iii) I contains 1.

Observation  If R = Z, then every subgroup of Z is a principal ideal, because it is of the form nZ.

Exercise  Suppose R is a commutative ring. Show that R is a field iff R contains no proper ideals.

The following theorem is just an observation, but it is in some sense the beginning of ring theory.

Theorem  Suppose R is a ring and I ⊂ R is an ideal, I ≠ R. Since I is a normal subgroup of the additive group R, R/I is an additive abelian group. Multiplication of cosets defined by (a + I) · (b + I) = (ab + I) is well defined and makes R/I a ring.

Proof  (a + I) · (b + I) = a · b + aI + Ib + II ⊂ a · b + I. Thus multiplication is well defined. The multiplicative identity is (1 + I), and the ring axioms are easily verified.

Observation  If R = Z and I = nZ, the ring structure on Zn = Z/nZ is the same as the one previously defined.

Homomorphisms

Definition  Suppose R and R̄ are rings. A function f : R → R̄ is a ring homomorphism provided

1)  f is a group homomorphism,
2)  f(1R) = 1R̄, and
3)  if a, b ∈ R then f(a · b) = f(a) · f(b). (On the left, multiplication is in R, while on the right multiplication is in R̄.)

The kernel of f is the kernel of f considered as a group homomorphism, namely f⁻¹(0). From now on the word "homomorphism" means "ring homomorphism". Here is a list of the basic properties of ring homomorphisms. Much of this work has already been done in the theorem in group theory on page 28.

Theorem  Suppose each of R and R̄ is a ring.

1)  The zero map from R to R̄ is not a ring homomorphism (because it does not send 1 to 1̄).
2)  The identity map IR : R → R is a ring homomorphism.
3)  The composition of ring homomorphisms is a ring homomorphism.
4)  If f : R → R̄ is a bijection which is a ring homomorphism, then f⁻¹ : R̄ → R is a ring homomorphism. Such an f is called a ring isomorphism. In the case R = R̄, f is also called a ring automorphism.
5)  The image of a ring homomorphism is a subring of the range.
6)  The kernel of a ring homomorphism is an ideal of the domain. Furthermore, if f : R → R̄ is a homomorphism and I is an ideal of R̄, then f⁻¹(I) is an ideal of R.
7)  Suppose I is an ideal of R, I ≠ R, and π : R → R/I is the natural projection, π(a) = (a + I). Then π is a surjective ring homomorphism with kernel I. Furthermore, if f : R → R̄ is a surjective ring homomorphism with kernel I, then R/I ≈ R̄ (see below).
8)  Suppose f : R → R̄ is a homomorphism and I ⊂ R is an ideal with I ⊂ ker(f). Then f̄ : R/I → R̄ defined by f̄(a + I) = f(a) is a well defined homomorphism making the following diagram commute.

              f
        R ---------> R̄
         \          ^
       π  \        /  f̄
           v      /
             R/I

Thus defining a homomorphism on a quotient ring is the same as defining a homomorphism on the numerator which sends the denominator to zero. The image of f̄ is the image of f, and the kernel of f̄ is ker(f)/I. Thus if I = ker(f), f̄ is injective, and so R/I ≈ image(f).

9)  Given any ring homomorphism f, domain(f)/ker(f) ≈ image(f).

Proof  We know all this on the group level, and it is only necessary to check that f̄ is a ring homomorphism, which is obvious.

Exercise  Find a ring R with an ideal I and an element b such that b is not a unit in R but (b + I) is a unit in R/I.

Exercise  Show that if u is a unit in a ring R, then conjugation by u is an automorphism on R. That is, show that f : R → R defined by f(a) = u⁻¹ · a · u is a ring homomorphism which is an isomorphism.

Exercise  Suppose R is a ring, T is a non-void set, and R^T is the collection of all functions f : T → R. Define addition and multiplication on R^T point-wise. This means if f and g are functions from T to R, then (f + g)(t) = f(t) + g(t) and (f · g)(t) = f(t)g(t). Show that under these operations R^T is a ring. Suppose S is a non-void set and α : S → T is a function. If f : T → R is a function, define a function α∗(f) : S → R by α∗(f) = f ◦ α. Show α∗ : R^T → R^S is a ring homomorphism.

Exercise  Now consider the case T = [0, 1] and R = R. Let A ⊂ R^[0,1] be the collection of all C∞ functions, A = {f : [0, 1] → R : f has an infinite number of derivatives}. Show A is a ring. Notice that much of the work has been done in the previous exercise. It is only necessary to show that A is a subring of the ring R^[0,1].

Polynomial Rings

In calculus, we consider real functions f which are polynomials, f(x) = a0 + a1x + · · · + anxⁿ. The sum and product of polynomials are again polynomials, and it is easy to see that the collection of polynomial functions forms a commutative ring. We can do the same thing formally in a purely algebraic setting.

Definition  Suppose R is a commutative ring and x is a "variable" or "symbol". The polynomial ring R[x] is the collection of all polynomials f = a0 + a1x + · · · + anxⁿ where ai ∈ R. Under the obvious addition and multiplication, R[x] is a commutative ring. Note that on the right, the ring multiplication a · b is written simply as ab, as is often done for convenience.

To be more formal, think of a polynomial a0 + a1x + · · · as an infinite sequence (a0, a1, ...) such that each ai ∈ R and only a finite number are non-zero. Then (a0, a1, ...) + (b0, b1, ...) = (a0 + b0, a1 + b1, ...) and (a0, a1, ...) · (b0, b1, ...) = (a0b0, a0b1 + a1b0, a0b2 + a1b1 + a2b0, ...).

The degree of a non-zero polynomial f is the largest integer n such that an ≠ 0, and is denoted by n = deg(f). If an = 1, then f is said to be monic.

Theorem  If R is a domain, R[x] is also a domain.

Proof  Suppose f and g are non-zero polynomials. Then the product of their top coefficients is non-zero, so deg(f) + deg(g) = deg(fg) and thus fg is not 0. Another way to prove this theorem is to look at the bottom terms instead of the top terms: let aixⁱ and bjxʲ be the first non-zero terms of f and g. Then aibjxⁱ⁺ʲ is the first non-zero term of fg.

Theorem (The Division Algorithm)  Suppose R is a commutative ring, f ∈ R[x] has degree ≥ 1 and its top coefficient is a unit in R. (If R is a field, the top coefficient of f will always be a unit.) Then for any g ∈ R[x], ∃! h, r ∈ R[x] such that g = fh + r with r = 0 or deg(r) < deg(f).

Proof  This theorem states the existence and uniqueness of polynomials h and r. We outline the proof of existence and leave uniqueness as an exercise. Suppose f = a0 + a1x + · · · + amxᵐ where m ≥ 1 and am is a unit in R. The proof is by induction on the degree of g. For any g with deg(g) < m, set h = 0 and r = g. For the general case, the idea is to divide f into g until the remainder has degree less than m.
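Here is a minimal sketch of the division algorithm in Python (an illustration, not from the text), taking R = Q with polynomials as coefficient lists [a0, a1, ..., an]:

from fractions import Fraction

def poly_divmod(g, f):
    # divide f into g: returns (h, r) with g = f*h + r and deg(r) < deg(f);
    # needs the top coefficient of f to be a unit (any non-zero element of Q)
    g = [Fraction(c) for c in g]
    m = len(f) - 1
    h = [Fraction(0)] * max(len(g) - m, 1)
    while len(g) - 1 >= m and any(g):
        t = len(g) - 1 - m                  # degree of the next monomial b*x^t
        b = g[-1] / Fraction(f[-1])
        h[t] = b
        for i in range(len(f)):             # subtract f * b * x^t from g
            g[t + i] -= b * f[i]
        while len(g) > 1 and g[-1] == 0:
            g.pop()                          # drop the cancelled top term
    return h, g                              # quotient and remainder

# g = x^3 + 2x + 1, f = x^2 + 1 gives h = x and r = x + 1
print(poly_divmod([1, 2, 0, 1], [1, 0, 1]))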

Suppose n ≥ m and the result holds for any polynomial of degree less than n. Now ∃ a monomial bxᵗ with t = n − m and deg(g − fbxᵗ) < n. By induction, ∃ h1 and r with fh1 + r = (g − fbxᵗ) and deg(r) < m. The result follows from the equation f(h1 + bxᵗ) + r = g.

Note  If r = 0 we say that f divides g. Note that f = x − c divides g iff c is a root of g, i.e., g(c) = 0. More generally, x − c divides g with remainder g(c).

Theorem  Suppose R is a domain, n > 0, and g(x) = a0 + a1x + · · · + anxⁿ is a polynomial of degree n with at least one root in R. Let c1, c2, ..., ck be the distinct roots of g in the ring R. Then ∃ a unique sequence of positive integers n1, n2, ..., nk and a unique polynomial h with no root in R so that g(x) = (x − c1)ⁿ¹ · · · (x − ck)ⁿᵏ h(x). (If h has degree 0, i.e., if h = an, then we say "all the roots of g belong to R". If g = anxⁿ, we say "all the roots of g are 0".)

Proof  Uniqueness is easy so let's prove existence. The theorem is clearly true for n = 1. Suppose n > 1 and the theorem is true for any polynomial of degree less than n. Now suppose g is a polynomial of degree n and c1 is a root of g. Then ∃ a polynomial h1 with g(x) = (x − c1)h1. Since h1 has degree less than n, the result follows by induction. Note that it follows that g has at most n roots.

Note  If g is any non-constant polynomial in C[x], all the roots of g belong to C, i.e., C is an algebraically closed field. This is called The Fundamental Theorem of Algebra, and it is assumed without proof for this textbook.

Exercise  Suppose g is a non-constant polynomial in R[x]. Show that if g has odd degree then it has a real root. Also show that if g(x) = x² + bx + c, then it has a real root iff b² ≥ 4c, and in that case both roots belong to R.

Definition  A domain T is a principal ideal domain (PID) if, given any ideal I, ∃ t ∈ T such that I = tT. Note that Z is a PID and any field is a PID.

Theorem  Suppose F is a field, I is a proper ideal of F[x], and n is the smallest positive integer such that I contains a polynomial of degree n. Then I contains a unique polynomial of the form f = a0 + a1x + · · · + an₋₁xⁿ⁻¹ + xⁿ and it has the property that I = fF[x]. Thus F[x] is a PID. Furthermore, each coset of I can be written uniquely in the form (c0 + c1x + · · · + cn₋₁xⁿ⁻¹ + I). Note this is similar to showing that a subgroup of Z is generated by one element (see page 15).

Proof  This is a good exercise in the use of the division algorithm.

then h is an associate of f . The next definition and theorem are included merely for reference. We do not develop the theory of F [x ] here. Exercise Let C = {a + bi : a. the units of R[x ] are just the units of R. b ∈ R}. while the units of Z . Suppose R is a subring of a commutative ring C and c ∈ C. The image of h is the smallest subring of C containing R and c.. This map h is called an evaluation map. The theorem says that adding two polynomials in R[x ] and evaluating is the same as evaluating and then adding in C. The statement that f is irreducible means that if h is a non-constant polynomial which divides f . Definition Suppose F is a field and f ∈ F [x] has degree ≥ 1. i.e.Chapter 3 Rings 47 Theorem. the units of F [x ] are the non-zero constants. The statement that g is an associate of f means ∃ a unit u ∈ F [x] such that g = uf . In this chapter we do not prove F [x] is a unique factorization domain. The degree function corresponds to the absolute value function. Thus if F is a field. Also multiplying two polynomials in R[x ] and evaluating is the same as evaluating and then multiplying in C. Show that [1] + [2]x is a unit in Z4 [x ]. This is a good way to look at the complex numbers. Then ∃! homomorphism h : R[x ] → C with h(x ) = c and h(r) = r for all r ∈ R.e. i. if R is a domain. The Division Algorithm corresponds to the Euclidean Algorithm. Exercise Show that. Irreducible polynomials correspond to prime integers. Show ker(h) = (x2 + 1)R[x ] and thus R[x ]/(x 2 + 1) ≈ C. Since R is a subring of C. nor do we even define unique factorization domain. and should not be studied at this stage. the development is easy because it corresponds to the development of Z in Chapter 1. and this h is surjective. However. adjoin x to R and set x2 = −1. to obtain C.. Write out the multiplication table for this ring and show that it is a field. It is defined by h(a0 + a1 x + · · +an x n ) = a0 + a1 c + · · +an cn . there exists a homomorphism h : R[x] → C which sends x to i. h sends f (x) to f (c). In street language the theorem says you are free to send x wherever you wish and extend to a ring homomorphism on R[x]. Exercise Z2 [x ]/(x 2 + x + 1) has 4 elements. One difference is that the units of F [x ] are non-zero constants.

are just ±1. Thus the associates of f are all cf with c ≠ 0 while the associates of an integer n are just ±n.

Theorem  Suppose F is a field and f ∈ F[x] has degree ≥ 1. Then f factors as the product of irreducibles, and this factorization is unique up to order and associates. Also the following are equivalent.

1)  F[x]/(f) is a domain.
2)  F[x]/(f) is a field.
3)  f is irreducible.

Definition  Now suppose x and y are "variables". If a ∈ R and n, m ≥ 0, then axⁿyᵐ = ayᵐxⁿ is called a monomial. Order does not matter here. Define an element of R[x, y] to be any finite sum of monomials.

Theorem  R[x, y] is a commutative ring and (R[x])[y] ≈ R[x, y] ≈ (R[y])[x]. In other words, any polynomial in x and y with coefficients in R may be written as a polynomial in y with coefficients in R[x], or as a polynomial in x with coefficients in R[y].

Side Comment  It is true that if F is a field, each f ∈ F[x, y] factors as the product of irreducibles. However F[x, y] is not a PID. For example, the ideal I = xF[x, y] + yF[x, y] = {f ∈ F[x, y] : f(0, 0) = 0} is not principal.

More generally, the concept of a polynomial ring in n variables works fine without a hitch. If a ∈ R and v1, v2, ..., vn are non-negative integers, then ax1^v1 x2^v2 · · · xn^vn is called a monomial. Define an element of R[x1, x2, ..., xn] to be any finite sum of monomials. This gives a commutative ring and there is a canonical isomorphism R[x1, x2, ..., xn] ≈ (R[x1, ..., xn₋₁])[xn]. Using this and induction on n, it is easy to prove the following theorem.

Theorem  If R is a domain, R[x1, x2, ..., xn] is a domain and its units are just the units of R.

Exercise  Suppose R is a commutative ring and f : R[x, y] → R[x] is the evaluation map which sends y to 0. This means f(p(x, y)) = p(x, 0). Show f is a ring homomorphism whose kernel is the ideal (y) = yR[x, y]. Use the fact that "the domain mod the kernel is isomorphic to the image" to show R[x, y]/(y) is isomorphic to R[x].

Product of Rings

The product of rings works fine, just as does the product of groups.

Theorem  Suppose T is an index set and for each t ∈ T, Rt is a ring. On the additive abelian group ΠRt = Π(t∈T) Rt, define multiplication by {rt} · {st} = {rt · st}. Then ΠRt is a ring and each projection πs : ΠRt → Rs is a ring homomorphism. Suppose R is a ring. Under the natural bijection from {functions f : R → ΠRt} to {sequences of functions {ft}t∈T where ft : R → Rt}, f is a ring homomorphism iff each ft is a ring homomorphism.

Proof  We already know f is a group homomorphism iff each ft is a group homomorphism (see page 36). Note that {1t} is the multiplicative identity of ΠRt, and f(1R) = {1t} iff ft(1R) = 1t for each t ∈ T. Finally, since multiplication is defined coordinatewise, f is a ring homomorphism iff each ft is a ring homomorphism.

Exercise  Suppose R and S are rings. Note that R × 0 is not a subring of R × S because it does not contain (1R, 1S). Show R × 0 is an ideal and (R × S/R × 0) ≈ S. Suppose I ⊂ R and J ⊂ S are ideals. Show I × J is an ideal of R × S and every ideal of R × S is of this form.

Exercise  Suppose R and S are commutative rings. Show T = R × S is not a domain. Let e = (1, 0) ∈ R × S and show e² = e, (1 − e)² = (1 − e), R × 0 = eT, and 0 × S = (1 − e)T.

If T is any ring, an element e of T is called an idempotent provided e² = e. The elements 0 and 1 are idempotents called the trivial idempotents.

Exercise  Suppose T is a commutative ring and e ∈ T is an idempotent with 0 ≠ e ≠ 1. Let R = eT and S = (1 − e)T. Show each of the ideals R and S is a ring with identity, and f : T → R × S defined by f(t) = (et, (1 − e)t) is a ring isomorphism. This shows that a commutative ring T splits as the product of two rings iff it contains a non-trivial idempotent.

The Chinese Remainder Theorem

Suppose n and m are relatively prime integers with n, m > 1. There is an exercise in Chapter 2 to show that Znm and Zn × Zm are isomorphic as groups (see the fourth exercise on page 36). It will now be shown that they are also isomorphic as rings.

Theorem  Suppose n1, ..., nt are integers, each ni > 1, and (ni, nj) = 1 for all i ≠ j. Let fi : Z → Zni be defined by fi(a) = [a]. (Note that the bracket symbol is used ambiguously.) Then the ring homomorphism f = (f1, ..., ft) : Z → Zn1 × · · · × Znt is surjective. Furthermore, the kernel of f is nZ, where n = n1n2 · · · nt. Thus Zn and Zn1 × · · · × Znt are isomorphic rings.

Proof  We wish to show that the order of f(1) is n. The element f(1)m = ([1], ..., [1])m = ([m], ..., [m]) is zero iff m is a multiple of each of n1, ..., nt. Since their least common multiple is n, the order of f(1) is n, and thus f(1) is a group generator, and thus f is surjective. (For a useful and elegant generalization of this theorem, see the Appendix.)

Exercise  Show that if a is an integer and p is a prime, then [a] = [aᵖ] in Zp (Fermat's Little Theorem). Use this and the Chinese Remainder Theorem to show that if b is a positive integer, it has the same last digit as b⁵.

Characteristic

The following theorem is just an observation, but it shows that in ring theory, the ring of integers is a "cornerstone".

Theorem  If R is a ring, there is one and only one ring homomorphism f : Z → R. It is given by f(m) = m1 = m̄.

Definition  Suppose R is a ring and f : Z → R is the natural ring homomorphism f(m) = m1 = m̄. The non-negative integer n with ker(f) = nZ is called the characteristic of R. Thus f is injective iff R has characteristic 0 iff 1 has infinite order. If f is not injective, the characteristic of R is the order of 1. Thus the subgroup of R generated by 1 is a subring of R isomorphic to Z or isomorphic to Zn for some positive integer n.

It is an interesting fact that, if R is a domain, all the non-zero elements of R have the same order.
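A short Python sketch of the theorem and the exercise (an illustration, not from the text), with n1 = 2 and n2 = 5 so n = 10:

# Z10 is isomorphic to Z2 x Z5: m -> (m mod 2, m mod 5) hits every pair once
pairs = {(m % 2, m % 5): m for m in range(10)}
assert len(pairs) == 10                          # surjective, hence bijective

# Fermat gives b^2 = b mod 2 and b^5 = b mod 5, so b^5 = b mod 10:
assert all(pow(b, 5, 10) == b % 10 for b in range(1, 1000))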

Theorem  Suppose R is a domain. If R has characteristic 0, then each non-zero a ∈ R has infinite order. If R has finite characteristic n, then n is a prime and each non-zero a ∈ R has order n.

Proof  Suppose R has characteristic 0, a is a non-zero element of R, and m is a positive integer. Then ma = m̄ · a cannot be 0 because m̄ ≠ 0, a ≠ 0, and R is a domain. Thus o(a) = ∞. Now suppose R has characteristic n. Then R contains Zn as a subring, and thus Zn is a domain and n is a prime. If a is a non-zero element of R, na = n̄ · a = 0 · a = 0 and thus o(a) = n.

Exercise  Show that if F is a field of characteristic 0, F contains Q as a subring. That is, show that the injective homomorphism f : Z → F extends to an injective homomorphism f̄ : Q → F.

Boolean Rings

This section is not used elsewhere in this book. However it fits easily here, and is included for reference.

Definition  A ring R is a Boolean ring if for each a ∈ R, a² = a, i.e., each element of R is an idempotent.

Theorem  Suppose R is a Boolean ring.

1)  R has characteristic 2, i.e., 2a = a + a = 0.
    Proof  (a + a) = (a + a)² = a² + 2a² + a² = 4a. Thus 2a = 0 and so a = −a.
2)  R is commutative.
    Proof  (a + b) = (a + b)² = a² + (a · b) + (b · a) + b² = a + (a · b) − (b · a) + b. Thus a · b = b · a.
3)  If R is a domain, R ≈ Z2.
    Proof  Suppose a ≠ 0. Then a · (1 − a) = 0 and so a = 1.
4)  The image of a Boolean ring is a Boolean ring. That is, if I is an ideal of R with I ≠ R, then every element of R/I is idempotent and thus R/I is a Boolean ring. It follows from 3) that R/I is a domain iff R/I is a field iff R/I ≈ Z2. (In the language of Chapter 6, I is a prime ideal iff I is a maximal ideal iff R/I ≈ Z2.)

Suppose X is a non-void set. If a is a subset of X, let a′ = (X − a) be the complement of a in X. Now suppose R is a non-void collection of subsets of X. Consider the following properties for R.

1)  a ∈ R ⇒ a′ ∈ R.
2)  a, b ∈ R ⇒ (a ∩ b) ∈ R.
3)  a, b ∈ R ⇒ (a ∪ b) ∈ R.
4)  ∅ ∈ R and X ∈ R.

Theorem  If 1) and 2) are satisfied, then 3) and 4) are satisfied. In this case, R is called a Boolean algebra of sets.

Proof  Suppose 1) and 2) are true. If a, b ∈ R, then a ∪ b = (a′ ∩ b′)′ belongs to R and so 3) is true. Since R is non-void, it contains some element a. Then ∅ = a ∩ a′ and X = a ∪ a′ belong to R, and so 4) is true.

Theorem  Suppose R is a Boolean algebra of sets. Define an addition on R by a + b = (a ∪ b) − (a ∩ b). Under this addition, R is an abelian group with 0 = ∅ and a = −a. Define a multiplication on R by a · b = a ∩ b. Under this multiplication R becomes a Boolean ring with 1 = X.

Exercise  Let X = {1, 2, ..., n} and let R be the Boolean ring of all subsets of X. Note that o(R) = 2ⁿ. Define fi : R → Z2 by fi(a) = [1] iff i ∈ a. Show each fi is a homomorphism and thus f = (f1, f2, ..., fn) : R → Z2 × Z2 × · · · × Z2 is a ring homomorphism. Show f is an isomorphism.

Exercise  Suppose R is a finite Boolean ring. Show that R ≈ Z2 × Z2 × · · · × Z2.

Note  Suppose R is a Boolean ring. It is a classical theorem that ∃ a Boolean algebra of sets whose Boolean ring is isomorphic to R. So let's just suppose R is a Boolean algebra of sets which is a Boolean ring with addition and multiplication defined as above. Now define a ∨ b = a ∪ b and a ∧ b = a ∩ b. These operations cup and cap are associative, commutative, have identity elements, and each distributes over the other. However, R is not a group under cup or cap. With these two operations (along with complement), R is called a Boolean algebra. Anyway, it is a classical fact that, if you have a Boolean ring (algebra), you have a Boolean algebra (ring). The advantage of the algebra viewpoint is that it is symmetric in cup and cap. The advantage of the ring viewpoint is that you can draw from the rich theory of commutative rings.
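A tiny Python sketch of the Boolean ring of subsets (an illustration, not from the text): symmetric difference is the ring addition and intersection the multiplication, which Python's frozenset operators ^ and & realize directly.

from itertools import combinations

X = frozenset({1, 2, 3})
R = [frozenset(s) for k in range(4) for s in combinations(X, k)]
for a in R:
    assert a ^ a == frozenset()                 # a = -a, characteristic 2
    assert a & a == a                           # a^2 = a, every element idempotent
    assert a ^ frozenset() == a and a & X == a  # 0 is the empty set, 1 is X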

Chapter 4

Matrices and Matrix Rings

We first consider matrices in full generality, i.e., over an arbitrary ring R. However, after the first few pages, it will be assumed that R is commutative. The topics, such as invertible matrices, transpose, elementary matrices, systems of equations, and determinant, are all classical. The highlight of the chapter is the theorem that a square matrix is a unit in the matrix ring iff its determinant is a unit in the ring. This chapter concludes with the theorem that similar matrices have the same determinant, trace, and characteristic polynomial. This will be used in the next chapter to show that an endomorphism on a finitely generated vector space has a well defined determinant, trace, and characteristic polynomial.

Definition  Suppose m and n are positive integers. Let Rm,n be the collection of all m × n matrices

     A = (ai,j) =  ( a1,1  · · ·  a1,n )
                   (  ·            ·   )
                   ( am,1  · · ·  am,n )     where each entry ai,j ∈ R.

A matrix may be viewed as m n-dimensional row vectors or as n m-dimensional column vectors. A matrix is said to be square if it has the same number of rows as columns. Square matrices are so important that they have a special notation, Rn = Rn,n. Rⁿ is defined to be the additive abelian group R × R × · · · × R. To emphasize that Rⁿ does not have a ring structure, we use the "sum" notation, Rⁿ = R ⊕ R ⊕ · · · ⊕ R. Our convention is to write elements of Rⁿ as column vectors, i.e., to identify Rⁿ with Rn,1. If the elements of Rⁿ are written as row vectors, Rⁿ is identified with R1,n.

Addition of matrices  To "add" two matrices, they must have the same number of rows and the same number of columns, i.e., addition is a binary operation Rm,n × Rm,n → Rm,n. The addition is defined by (ai,j) + (bi,j) = (ai,j + bi,j), i.e., the i, j term of the sum is the sum of the i, j terms.

Theorem  Rm,n is an additive abelian group. Its "zero" is the matrix 0 = 0m,n all of whose terms are zero. Also −(ai,j) = (−ai,j). Furthermore, as additive groups, Rm,n ≈ R^(mn).

Scalar multiplication  An element of R is called a scalar. A matrix may be "multiplied" on the right or left by a scalar. Right scalar multiplication is defined by (ai,j)c = (ai,j · c). It is a function Rm,n × R → Rm,n. Note in particular that scalar multiplication is defined on Rⁿ. Of course, if R is commutative, there is no distinction between right and left scalar multiplication.

Theorem  Suppose A, B ∈ Rm,n and c, d ∈ R. Then

     (A + B)c = Ac + Bc
     A(c + d) = Ac + Ad
     A(cd)    = (Ac)d     and
     A1       = A

This theorem is entirely transparent. In the language of the next chapter, it merely states that Rm,n is a right module over the ring R.

Multiplication of Matrices  The matrix product AB is defined iff the number of columns of A is equal to the number of rows of B, i.e., multiplication is a function Rm,n × Rn,p → Rm,p. The matrix AB will have the same number of rows as A and the same number of columns as B. The product (ai,j)(bi,j) is defined to be the matrix whose (s, t) term is as,1 · b1,t + · · · + as,n · bn,t, the dot product of row s of A with column t of B.

Exercise  Consider real matrices

     A = ( a  b ),   U = ( 2  0 ),   V = ( 0  1 ),   W = ( 1  2 ).
         ( c  d )        ( 0  1 )        ( 1  0 )        ( 0  1 )

Find the matrices AU, UA, AV, VA, AW, and WA.

Definition  The identity matrix In ∈ Rn is the square matrix whose diagonal terms are 1 and whose off-diagonal terms are 0.

Theorem  Suppose A ∈ Rm,n. Then 0p,m A = 0p,n, A 0n,p = 0m,p, and Im A = A = A In.

Theorem (The distributive laws)  (A + B)C = AC + BC and C(A + B) = CA + CB whenever the operations are defined.

Theorem (The associative law for matrix multiplication)  Suppose A ∈ Rm,n, B ∈ Rn,p, and C ∈ Rp,q. Then (AB)C = A(BC). Note that ABC ∈ Rm,q.

Proof  We must show that the (s, t) terms are equal. The proof involves writing it out and changing the order of summation. Let (xi,j) = AB and (yi,j) = BC. Then the (s, t) term of (AB)C is Σi xs,i ci,t = Σi (Σj as,j bj,i) ci,t = Σi,j as,j bj,i ci,t = Σj as,j (Σi bj,i ci,t) = Σj as,j yj,t, which is the (s, t) term of A(BC).

Theorem  For each ring R and integer n ≥ 1, Rn is a ring.

Proof  This elegant little theorem is immediate from the theorems above.

The units of Rn are called invertible or non-singular matrices. They form a group under multiplication called the general linear group and denoted by Gln(R) = (Rn)∗.

Exercise  Recall that if A is a ring and a ∈ A, then aA is a right ideal of A. Let A = R2 and a = (ai,j) where a1,1 = 1 and the other entries are 0. Find aR2 and R2a. Show that the only ideal of R2 containing a is R2 itself.

Multiplication by blocks  Suppose A, E ∈ Rn, B, F ∈ Rn,m, C, G ∈ Rm,n, and D, H ∈ Rm. Then multiplication in Rn+m is given by

     ( A  B ) ( E  F )  =  ( AE + BG   AF + BH )
     ( C  D ) ( G  H )     ( CE + DG   CF + DH ).
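Here is a minimal Python sketch of the product and the associative law (an illustration, not from the text), with matrices as lists of rows over R = Z:

def matmul(A, B):
    # the (s, t) term is the dot product of row s of A with column t of B
    return [[sum(A[s][i] * B[i][t] for i in range(len(B)))
             for t in range(len(B[0]))] for s in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [5, 1]]
assert matmul(matmul(A, B), C) == matmul(A, matmul(B, C))   # (AB)C = A(BC)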

Transpose

Notation  For the remainder of this chapter on matrices, suppose R is a commutative ring. Of course, for n > 1, Rn is non-commutative.

Definition  If A ∈ Rm,n, its transpose At ∈ Rn,m is the matrix whose (i, j) term is the (j, i) term of A. Transpose is a function from Rm,n to Rn,m. So row i (column i) of A becomes column i (row i) of At. If A is an n-dimensional row vector, then At is an n-dimensional column vector. If A is a square matrix, At is also square.

Theorem

1)  (At)t = A
2)  (A + B)t = At + Bt
3)  If c ∈ R, (Ac)t = At c
4)  (AB)t = Bt At
5)  If A ∈ Rn, then A is invertible iff At is invertible. In this case (A⁻¹)t = (At)⁻¹.

Proof of 5)  Suppose A is invertible. Then I = It = (AA⁻¹)t = (A⁻¹)t At.

Exercise  Characterize those invertible matrices A ∈ R2 which have A⁻¹ = At. Show that they form a subgroup of Gl2(R).

Triangular Matrices  If A ∈ Rn, then A is upper (lower) triangular provided ai,j = 0 for all i > j (all j > i). A is strictly upper (lower) triangular provided ai,j = 0 for all i ≥ j (all j ≥ i). A is diagonal if it is upper and lower triangular, i.e., ai,j = 0 for all i ≠ j. Note that if A is upper (lower) triangular, then At is lower (upper) triangular.

Theorem  If A ∈ Rn is strictly upper (or lower) triangular, then Aⁿ = 0.

Proof  The way to understand this is just multiply it out for n = 2 and n = 3. The geometry of this theorem will become transparent later in Chapter 5 when the matrix A defines an R-module endomorphism on Rⁿ.

Definition  If T is any ring, an element t ∈ T is said to be nilpotent provided ∃ n such that tⁿ = 0. In this case, (1 − t) is a unit with inverse 1 + t + t² + · · · + tⁿ⁻¹. Thus if T = Rn and B is a nilpotent matrix, I − B is invertible.
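A quick Python check of the last two facts (an illustration, not from the text): for a strictly upper triangular B ∈ R3, B³ = 0 and (I − B)⁻¹ = I + B + B².

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

n = 3
B = [[0, 2, 5], [0, 0, 7], [0, 0, 0]]            # strictly upper triangular
I = [[int(i == j) for j in range(n)] for i in range(n)]
B2 = matmul(B, B)
inv = [[I[i][j] + B[i][j] + B2[i][j] for j in range(n)] for i in range(n)]
ImB = [[I[i][j] - B[i][j] for j in range(n)] for i in range(n)]
assert matmul(ImB, inv) == I                     # (I - B)(I + B + B^2) = I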

Exercise  Let R = Z. Find the inverse of

     ( 1  2  -3 )
     ( 0  1   4 ).
     ( 0  0   1 )

Exercise  Suppose A is the diagonal matrix with diagonal entries a1, a2, ..., an, B ∈ Rm,n, and C ∈ Rn,p. Show AC is obtained from C by multiplying row i of C by ai. Show that BA is obtained from B by multiplying column i of B by ai. Show A is a unit in Rn iff each ai is a unit in R.

Scalar matrices  A scalar matrix is a diagonal matrix for which all the diagonal terms are equal, i.e., a matrix of the form cIn. The map R → Rn which sends c to cIn is an injective ring homomorphism, and thus we may consider R to be a subring of Rn. Multiplying by a scalar is the same as multiplying by a scalar matrix, and thus scalar matrices commute with everything, i.e., if B ∈ Rn, (cIn)B = cB = Bc = B(cIn). Recall we are assuming R is a commutative ring.

Exercise  Suppose A ∈ Rn and for each B ∈ Rn, AB = BA. Show A is a scalar matrix. For n > 1, this shows how non-commutative Rn is.

Elementary Operations and Elementary Matrices

There are 3 types of elementary row and column operations on a matrix A. A need not be square.

Type 1  Multiply row i by some unit a ∈ R.  Multiply column i by some unit a ∈ R.

Type 2  Interchange row i and row j.  Interchange column i and column j.

Type 3  Add a times row j to row i where i ≠ j and a is any element of R.  Add a times column i to column j where i ≠ j and a is any element of R.

Elementary Matrices  Elementary matrices are square and invertible. There are three types. They are obtained by performing row or column operations on the identity matrix.

Type 1  B is the identity matrix with one diagonal entry replaced by a unit a ∈ R. In type 1, all the off-diagonal elements are zero.

Type 2  B is the identity matrix with rows i and j (equivalently, columns i and j) interchanged. In type 2, there are two non-zero off-diagonal elements.

Type 3  B is the identity matrix with one off-diagonal entry replaced by an element ai,j ∈ R, where i ≠ j. In type 3, there is at most one non-zero off-diagonal element, and it may be above or below the diagonal.

Exercise  Show that if B is an elementary matrix of type 1, 2, or 3, then B is invertible and B⁻¹ is an elementary matrix of the same type.

The following theorem is handy when working with matrices.

Theorem  Suppose A is a matrix. To perform an elementary row (column) operation on A, perform the operation on an identity matrix to obtain an elementary matrix B, and multiply on the left (right). That is, BA = row operation on A and AB = column operation on A. (See the exercise on page 54.)
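A small Python sketch of the theorem for a type 3 operation (an illustration, not from the text): adding a times row j to row i of A equals left-multiplying A by the corresponding elementary matrix.

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

n, a, i, j = 3, 5, 0, 2
E = [[int(r == c) for c in range(n)] for r in range(n)]
E[i][j] = a                                      # elementary matrix of type 3

A = [[1, 1, 0], [2, 0, 1], [3, 4, 4]]
EA = matmul(E, A)
assert EA[i] == [A[i][c] + a * A[j][c] for c in range(n)]   # row op performed
assert EA[1:] == A[1:]                                      # other rows unchanged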

Exercise  Suppose F is a field and A ∈ Fm,n.

1)  Show ∃ invertible matrices B ∈ Fm and C ∈ Fn such that BAC = (di,j) where d1,1 = · · · = dt,t = 1 and all other entries are 0. The integer t is called the rank of A. (See page 89 of Chapter 5.)
2)  Suppose A ∈ Fn is invertible. Show A is the product of elementary matrices.
3)  A matrix T is said to be in row echelon form if, for each 1 ≤ i < m, the first non-zero term of row (i + 1) is to the right of the first non-zero term of row i. Show ∃ an invertible matrix B ∈ Fm such that BA is in row echelon form.
4)  Let

         A = ( 3  11 )    and    D = ( 3  11 ).
             ( 0   4 )               ( 1   4 )

    Write A and D as products of elementary matrices over Q. Is it possible to write them as products of elementary matrices over Z?

For 1), perform row and column operations on A to reach the desired form. This shows the matrices B and C may be selected as products of elementary matrices. Part 2) also follows from this procedure. For 3), use only row operations. Notice that if T is in row echelon form, the number of non-zero rows is the rank of T.

Systems of Equations

Suppose A = (ai,j) ∈ Rm,n and C = (c1, ..., cm)t ∈ Rᵐ = Rm,1. The system

     a1,1x1 + · · · + a1,nxn = c1
                ·  ·  ·
     am,1x1 + · · · + am,nxn = cm

of m equations in n unknowns can be written as one matrix equation in one unknown, namely as (ai,j)(x1, ..., xn)t = (c1, ..., cm)t, or AX = C.

Define f : Rⁿ → Rᵐ by f(D) = AD. Then f is a group homomorphism and also f(Dc) = f(D)c for any c ∈ R. In the language of the next chapter, this says that f is an R-module homomorphism. The following theorem summarizes what we already know about solutions of linear equations in this setting.

Theorem

1)  AX = C has a solution iff C ∈ image(f).
2)  AX = 0 is called the homogeneous equation. Its solution set is ker(f). If D ∈ Rⁿ is one solution of AX = C, the solution set is the coset D + ker(f) in Rⁿ. (See part 7 of the section on Homomorphisms in Chapter 2.)
3)  Suppose B ∈ Rm is invertible. Then AX = C and (BA)X = BC have the same set of solutions. Thus we may perform any row operation on both sides of the equation and not change the solution set.
4)  If m = n and A ∈ Rm is invertible, then AX = C has the unique solution X = A⁻¹C.

The geometry of systems of equations over a field will not become really transparent until the development of linear algebra in Chapter 5.

Determinants

The concept of determinant is one of the most amazing in all of mathematics. The proper development of this concept requires a study of multilinear forms, which is given in Chapter 6. In this section we simply present the basic properties. For each n ≥ 1 and each commutative ring R, determinant is a function from Rn to R. For n = 1, |(a)| = a. For n = 2,

     | a  b |
     | c  d |  =  ad − bc.

Definition  Let A = (ai,j) ∈ Rn. If σ is a permutation on (1, 2, ..., n), let sign(σ) = 1 if σ is an even permutation, and sign(σ) = −1 if σ is an odd permutation. (Note that here we are writing the permutation functions as σ(i) and not as (i)σ.) The determinant is defined by | A | = Σ(all σ) sign(σ) a1,σ(1) · a2,σ(2) · · · an,σ(n). Check that for n = 2, this agrees with the definition above.
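A direct Python transcription of this definition (an illustration, not from the text; it is exponential in n and meant only to mirror the formula):

from itertools import permutations
from math import prod

def sign(sigma):
    # even number of inversions <=> even permutation
    inv = sum(1 for i in range(len(sigma)) for j in range(i + 1, len(sigma))
              if sigma[i] > sigma[j])
    return 1 if inv % 2 == 0 else -1

def det(A):
    n = len(A)
    return sum(sign(s) * prod(A[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

print(det([[1, 2], [3, 4]]))                    # ad - bc = -2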

For each σ, a1,σ(1) · a2,σ(2) · · · an,σ(n) contains exactly one factor from each row and one factor from each column. Since R is commutative, we may rearrange the factors so that the first comes from the first column, the second from the second column, etc. This means that there is a permutation τ on (1, 2, ..., n) such that a1,σ(1) · a2,σ(2) · · · an,σ(n) = aτ(1),1 · aτ(2),2 · · · aτ(n),n. We wish to show that τ = σ⁻¹ and thus sign(σ) = sign(τ). To reduce the abstraction, suppose σ(2) = 5. Then the first expression will contain the factor a2,5. In the second expression, it will appear as aτ(5),5, and so τ(5) = 2. Anyway, τ is the inverse of σ and thus there are two ways to define determinant. It follows that the determinant of a matrix is equal to the determinant of its transpose.

Theorem  | A | = Σ(all σ) sign(σ) a1,σ(1) · a2,σ(2) · · · an,σ(n) = Σ(all τ) sign(τ) aτ(1),1 · aτ(2),2 · · · aτ(n),n.

Corollary  | A | = | At |.

You may view an n × n matrix A as a sequence of n column vectors or as a sequence of n row vectors. Here we will use column vectors. This means we write the matrix A as A = (A1, A2, ..., An) where each Ai ∈ Rn,1 = Rⁿ.

Theorem  Suppose 1 ≤ r ≤ n, Cr ∈ Rⁿ, and a, c ∈ R. Then

     |(A1, ..., Ar−1, aAr + cCr, Ar+1, ..., An)| = a|(A1, ..., Ar, ..., An)| + c|(A1, ..., Ar−1, Cr, Ar+1, ..., An)|.

Proof  This is immediate from the definition of determinant and the distributive law of multiplication in the ring R.

Theorem  If two columns of A are equal, then | A | = 0.

Proof  For simplicity, assume the first two columns are equal, i.e., A1 = A2. Now | A | = Σ(all τ) sign(τ) aτ(1),1 · aτ(2),2 · · · aτ(n),n, and this summation has n! terms and n! is an even number. Let γ be the transposition which interchanges one and two. Then for any τ, aτ(1),1 · aτ(2),2 · · · aτ(n),n = aτγ(1),1 · aτγ(2),2 · · · aτγ(n),n. This pairs up the n! terms of the summation, and since sign(τ) = −sign(τγ), these pairs cancel in the summation. Therefore | A | = 0.

Summary  Determinant is a function d : Rn → R. In the language used in the Appendix, the two previous theorems say that d is an alternating multilinear form. The next two theorems say that d is skew-symmetric.

Theorem  Interchanging two columns of A multiplies the determinant by minus one.

Proof  For simplicity, show that |(A2, A1, A3, ..., An)| = −| A |. We know 0 = |(A1 + A2, A1 + A2, A3, ..., An)| = |(A1, A1, A3, ..., An)| + |(A1, A2, A3, ..., An)| + |(A2, A1, A3, ..., An)| + |(A2, A2, A3, ..., An)|. Since the first and last of these four terms are zero, the result follows.

Theorem  If τ is a permutation of (1, 2, ..., n), then | A | = sign(τ)|(Aτ(1), Aτ(2), ..., Aτ(n))|.

Proof  The permutation τ is the finite product of transpositions.

Exercise  Rewrite the four preceding theorems using rows instead of columns.

The following theorem is just a summary of some of the work done so far.

Theorem  Multiplying any row or column of a matrix by a scalar c ∈ R multiplies the determinant by c. Interchanging two rows or two columns multiplies the determinant by −1. Adding c times one row to another row, or adding c times one column to another column, does not change the determinant. If a matrix has two rows equal or two columns equal, its determinant is zero. More generally, if one row is c times another row, or one column is c times another column, then the determinant is zero. Thus if any row or any column is zero, the determinant is zero.

Expansion by a row or column  Let Mi,j be the determinant of the (n − 1) × (n − 1) matrix obtained by removing row i and column j from A. Let Ci,j = (−1)ⁱ⁺ʲ Mi,j. Mi,j and Ci,j are called the (i, j) minor and cofactor of A. The following theorem is useful but the proof is a little tedious and should not be done as an exercise.

Theorem  For any 1 ≤ i ≤ n, | A | = ai,1Ci,1 + ai,2Ci,2 + · · · + ai,nCi,n. For any 1 ≤ j ≤ n, | A | = a1,jC1,j + a2,jC2,j + · · · + an,jCn,j. Thus there are 2n ways to compute | A |, expansion by any row or expansion by any column.

Exercise  Let

     A = ( a1  a2  a3 )
         ( b1  b2  b3 ).
         ( c1  c2  c3 )
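A recursive Python sketch of expansion by the first row (an illustration, not from the text):

def det_cofactor(A):
    # |A| = a1,1*C1,1 + ... + a1,n*C1,n with C1,j = (-1)^(1+j) * M1,j
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in A[1:]]   # delete row 1, column j
        total += (-1) ** j * A[0][j] * det_cofactor(minor)
    return total

print(det_cofactor([[2, 0, 1], [1, 3, 0], [0, 1, 4]]))   # 25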

Write out the determinant of A expanding by the first column and also expanding by the second row.

Theorem  If A is an upper or lower triangular matrix, | A | is the product of the diagonal elements.

Proof  We will prove the statement for upper triangular matrices. If A ∈ R2 is an upper triangular matrix, then its determinant is the product of the diagonal elements. Suppose n > 2 and the theorem is true for matrices in Rn−1. Suppose A ∈ Rn is upper triangular. The result follows by expanding by the first column.

An elementary matrix of type 3 is a special type of upper or lower triangular matrix, so its determinant is 1, i.e., | A | = 1. An elementary matrix of type 2 is obtained from the identity matrix by interchanging two rows or columns, and thus has determinant −1, i.e., | A | = −1.

Theorem (Determinant by blocks)  Suppose A ∈ Rn, B ∈ Rn,m, and D ∈ Rm. Then the determinant of

     ( A  B )
     ( O  D )     is | A || D |.

Proof  Expand by the first column and use induction on n.

The following remarkable theorem takes some work to prove. We assume it here without proof. (For the proof, see page 130 of the Appendix.)

Theorem  The determinant of the product is the product of the determinants, i.e., if A, B ∈ Rn, | AB | = | A || B |. Thus | AB | = | BA | and if C is invertible, | C⁻¹AC | = | ACC⁻¹ | = | A |.

Corollary  If A is a unit in Rn then | A | is a unit in R and | A⁻¹ | = | A |⁻¹.

Proof  1 = | I | = | AA⁻¹ | = | A || A⁻¹ |.

One of the major goals of this chapter is to prove the converse of the preceding corollary.

Classical adjoint  Suppose R is a commutative ring and A ∈ Rn. The classical adjoint of A is (Ci,j)t, the matrix whose (j, i) term is the (i, j) cofactor. Before we consider the general case,

let's examine 2 × 2 matrices. If A = ( a  b ; c  d ), then (Ci,j) = ( d  −c ; −b  a ) and (Ci,j)t = ( d  −b ; −c  a ). Then

     A(Ci,j)t = (Ci,j)t A = ( |A|   0  ) = | A | I.
                            (  0   |A| )

Thus if | A | = 1, A is invertible and A⁻¹ = ( d  −b ; −c  a ).

Here is the general case.

Theorem  If R is commutative and A ∈ Rn, then A(Ci,j)t = (Ci,j)t A = | A | I.

Proof  We must show that the diagonal elements of the product A(Ci,j)t are all | A | and the other elements are 0. The (s, s) term is the dot product of row s of A with row s of (Ci,j) and is thus | A | (computed by expansion by row s). For s ≠ t, the (s, t) term is the dot product of row s of A with row t of (Ci,j). Since this is the determinant of a matrix with row s = row t, the (s, t) term is 0. The proof that (Ci,j)t A = | A | I is left as an exercise.

We are now ready for one of the most beautiful and useful theorems in all of mathematics.

Theorem  Suppose R is a commutative ring and A ∈ Rn. Then A is a unit in Rn iff | A | is a unit in R. (Thus if R is a field, A is invertible iff | A | ≠ 0.) If A is invertible, then A⁻¹ = | A |⁻¹ (Ci,j)t. In particular, if | A | = 1, A⁻¹ = (Ci,j)t, the classical adjoint of A.

Proof  This follows immediately from the preceding theorem.

Exercise  Show that any right inverse of A is also a left inverse. That is, suppose A, B ∈ Rn and AB = I. Show A is invertible with A⁻¹ = B, and thus BA = I.

Similarity  Suppose A, B ∈ Rn. B is said to be similar to A if ∃ an invertible C ∈ Rn such that B = C⁻¹AC, i.e., B is similar to A iff B is a conjugate of A.

Theorem  B is similar to B.
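A small Python check of the 2 × 2 case over R = Z (an illustration, not from the text), where | A | = 1 makes the classical adjoint an integer inverse:

a, b, c, d = 2, 3, 1, 2                          # |A| = ad - bc = 1
A   = [[a, b], [c, d]]
adj = [[d, -b], [-c, a]]                         # classical adjoint (Ci,j)^t
mm  = lambda X, Y: [[sum(X[i][k] * Y[k][j] for k in range(2))
                     for j in range(2)] for i in range(2)]
assert mm(A, adj) == [[1, 0], [0, 1]] == mm(adj, A)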

B is similar to A iff A is similar to B. If D is similar to B and B is similar to A, then D is similar to A. Thus "similarity" is an equivalence relation on Rn.

Proof  This is a good exercise using the definition.

Theorem  Suppose A and B are similar. Then | A | = | B | and thus A is invertible iff B is invertible.

Proof  Suppose B = C⁻¹AC. Then | B | = | C⁻¹AC | = | ACC⁻¹ | = | A |.

Trace  Suppose A = (ai,j) ∈ Rn. Then the trace is defined by trace(A) = a1,1 + a2,2 + · · · + an,n. That is, the trace of A is the sum of its diagonal terms.

Theorem  If A, B ∈ Rn, then trace(A + B) = trace(A) + trace(B).

Proof  This is immediate.

One of the most useful properties of trace is trace(AB) = trace(BA) whenever AB and BA are defined. For example, suppose A = (a1, a2, ..., an) and B = (b1, b2, ..., bn)t. Then AB is the scalar a1b1 + · · · + anbn while BA is the n × n matrix (biaj). Note that trace(AB) = trace(BA). Here is the theorem in full generality.

Theorem  Suppose A ∈ Rm,n and B ∈ Rn,m. Then AB and BA are square matrices with trace(AB) = trace(BA).

Proof  This proof involves a change in the order of summation. By definition, trace(AB) = Σ(1≤i≤m) Σ(1≤j≤n) ai,j bj,i = Σ(1≤j≤n) Σ(1≤i≤m) bj,i ai,j = trace(BA).

Theorem  If A and B are similar, then trace(A) = trace(B).

Proof  Suppose B = C⁻¹AC. Then trace(B) = trace(C⁻¹AC) = trace(ACC⁻¹) = trace(A).

Summary  Determinant and trace are functions from Rn to R. Determinant is a multiplicative homomorphism and trace is an additive homomorphism. Furthermore | AB | = | BA | and trace(AB) = trace(BA). If A and B are similar, | A | = | B | and trace(A) = trace(B).

Exercise  Suppose A ∈ Rn and a ∈ R. Find | aA | and trace(aA).

Characteristic polynomials  If A ∈ Rn, the characteristic polynomial CPA(x) ∈ R[x] is defined by CPA(x) = | (xI − A) |. Any λ ∈ R which is a root of CPA(x) is called a characteristic root of A.

Theorem  CPA(x) = a0 + a1x + · · · + an−1xⁿ⁻¹ + xⁿ where trace(A) = −an−1 and | A | = (−1)ⁿ a0.

Proof  This follows from a direct computation of the determinant.

Theorem  If A and B are similar, then they have the same characteristic polynomials.

Proof  Suppose B = C⁻¹AC. CPB(x) = | (xI − C⁻¹AC) | = | C⁻¹(xI − A)C | = | (xI − A) | = CPA(x).

Exercise  Suppose R is a commutative ring, A = ( a  b ; c  d ) is a matrix in R2, and CPA(x) = a0 + a1x + x². Find a0 and a1 and show that a0I + a1A + A² = 0. In other words, show A satisfies its characteristic polynomial, i.e., CPA(A) = 0.

Exercise  Suppose F is a field and A ∈ F2. Show the following are equivalent.

1)  A² = 0.
2)  | A | = trace(A) = 0.
3)  CPA(x) = x².
4)  ∃ an elementary matrix C such that C⁻¹AC is strictly upper triangular.

Note  This exercise is a special case of a more general theorem. A square matrix over a field is nilpotent iff all its characteristic roots are 0 iff it is similar to a strictly upper triangular matrix. This remarkable result cannot be proved by matrix theory alone, but depends on linear algebra (see pages 93 and 98).
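A numerical Python check of the first exercise (an illustration, not from the text): for a 2 × 2 matrix, a1 = −trace(A) and a0 = | A |, and a0I + a1A + A² = 0.

a, b, c, d = 2, 5, 1, 3
A2 = [[a*a + b*c, a*b + b*d], [c*a + d*c, c*b + d*d]]     # A^2 written out
tr, det = a + d, a*d - b*c
CH = [[A2[0][0] - tr*a + det, A2[0][1] - tr*b],
      [A2[1][0] - tr*c,       A2[1][1] - tr*d + det]]     # det*I - tr*A + A^2
assert CH == [[0, 0], [0, 0]]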

Chapter 5

Linear Algebra

The exalted position held by linear algebra is based upon the subject's ubiquitous utility and ease of application. The basic theory is developed here in full generality, i.e., modules are defined over an arbitrary ring R and not just over a field. The elementary facts about cosets, quotients, and homomorphisms follow the same pattern as in the chapters on groups and rings. We give a simple proof that if R is a commutative ring and f : Rⁿ → Rⁿ is a surjective R-module homomorphism, then f is an isomorphism. This shows that finitely generated free R-modules have a well defined dimension, and simplifies much of the development of linear algebra. It is in this chapter that the concepts about functions, solutions of equations, matrices, and generating sets come together in one unified theory.

After the general theory, we restrict our attention to vector spaces, i.e., modules over a field. The key theorem is that any vector space V has a free basis, and thus if V is finitely generated, it has a well defined dimension, and incredible as it may seem, this single integer determines V up to isomorphism. Also any endomorphism f : V → V may be represented by a matrix, and any change of basis corresponds to conjugation of that matrix. One of the goals in linear algebra is to select a basis so that the matrix representing f has a simple form. For example, if f is not injective, then f may be represented by a matrix whose first column is zero. As another example, if f is nilpotent, then f may be represented by a strictly upper triangular matrix. The theorem on Jordan canonical form is not proved in this chapter and should not be considered part of this chapter. It is stated here in full generality only for reference and completeness. The proof is given in the Appendix. This chapter concludes with the study of real inner product spaces, and with the beautiful theory relating orthogonal matrices and symmetric matrices.

Definition  Suppose R is a ring and M is an additive abelian group. The statement that M is a right R-module means there is a scalar multiplication M × R → M, (m, r) → mr, satisfying

     (a1 + a2)r = a1r + a2r
     a(r1 + r2) = ar1 + ar2
     a(r1 · r2) = (ar1)r2
     a1         = a

for all a, a1, a2 ∈ M and r, r1, r2 ∈ R. The statement that M is a left R-module means there is a scalar multiplication R × M → M, (r, m) → rm, satisfying

     r(a1 + a2) = ra1 + ra2
     (r1 + r2)a = r1a + r2a
     (r1 · r2)a = r1(r2a)
     1a         = a.

Note that the plus sign is used ambiguously, as addition in M and as addition in R.

Notation  The fact that M is a right (left) R-module will be denoted by M = MR (M = RM). If R is commutative and M = MR, then left scalar multiplication defined by ra = ar makes M into a left R-module. Thus for commutative rings, we may write the scalars on either side.

Convention  Unless otherwise stated, the word "R-module" (or sometimes just "module") will mean "right R-module".

Theorem  Suppose M is an R-module.

1)  If r ∈ R, then f : M → M defined by f(a) = ar is a homomorphism of additive groups. In particular (0M)r = 0M.
2)  If a ∈ M, a0R = 0M.
3)  If a ∈ M and r ∈ R, then (−a)r = −(ar) = a(−r).

Proof  This is a good exercise in using the axioms for an R-module.

Submodules  If M is an R-module, the statement that a subset N ⊂ M is a submodule means it is a subgroup which is closed under scalar multiplication, i.e., if a ∈ N and r ∈ R, then ar ∈ N. In this case N will be a module because the axioms will be satisfied. Note that 0 and M are submodules, called the improper submodules of M.

Theorem  Suppose M is an R-module, T is an index set, and for each t ∈ T, Nt is a submodule of M.

1)  ∩t∈T Nt is a submodule.
2)  If {Nt} is a monotonic collection, ∪t∈T Nt is a submodule.
3)  +t∈T Nt = {all finite sums a1 + · · · + am : each ai belongs to some Nt} is a submodule. If T = {1, 2, ..., n}, then this submodule may be written as N1 + N2 + · · · + Nn = {a1 + a2 + · · · + an : each ai ∈ Ni}.

Proof  We know from page 22 that versions of 1) and 2) hold for subgroups, and in particular for subgroups of additive abelian groups. To finish the proofs it is only necessary to check scalar multiplication, which is immediate. Also the proof of 3) is immediate. Note that if N1 and N2 are submodules of M, N1 + N2 is the smallest submodule of M containing N1 ∪ N2.

Homomorphisms  Suppose M and N are R-modules. A function f : M → N is a homomorphism (i.e., an R-module homomorphism) provided it is a group homomorphism and if a ∈ M and r ∈ R, f(ar) = f(a)r. On the left, scalar multiplication is in M and on the right it is in N.

Exercise  Suppose T is a non-void set, N is an R-module, and N^T is the collection of all functions f : T → N with addition defined by (f + g)(t) = f(t) + g(t), and scalar multiplication defined by (fr)(t) = f(t)r. Show N^T is an R-module. (We know from the last exercise in Chapter 2 that N^T is a group.) This simple fact is quite useful in linear algebra. For example, in 5) of the theorem below, it is stated that HomR(M, N) forms an abelian group. So it is only necessary to show that HomR(M, N) is a subgroup of N^M. Also in 8) it is only necessary to show that HomR(M, N) is a submodule of N^M.

The basic facts about homomorphisms are listed below.

Theorem

1)  The zero map M → N is a homomorphism.
2)  The identity map I : M → M is a homomorphism.
3)  The composition of homomorphisms is a homomorphism.
4)  The sum of homomorphisms is a homomorphism. If f, g : M → N are homomorphisms, define (f + g) : M → N by (f + g)(a) = f(a) + g(a). Then f + g is a homomorphism. Also (−f) defined by (−f)(a) = −f(a) is a homomorphism. If h : N → P is a homomorphism, h ◦ (f + g) = (h ◦ f) + (h ◦ g). If k : P → M is a homomorphism, (f + g) ◦ k = (f ◦ k) + (g ◦ k).
5)  HomR(M, N) = Hom(MR, NR), the set of all homomorphisms from M to N, forms an abelian group under addition. With multiplication defined to be composition, HomR(M, M) is a ring.
6)  If a bijection f : M → N is a homomorphism, then f⁻¹ : N → M is also a homomorphism. In this case f and f⁻¹ are called isomorphisms. A homomorphism f : M → M is called an endomorphism, and an isomorphism f : M → M is called an automorphism. The units of the endomorphism ring HomR(M, M) are the automorphisms. Thus the automorphisms on M form a group under composition. We will see later that if M = Rⁿ, HomR(Rⁿ, Rⁿ) is just the matrix ring Rn and the automorphisms are merely the invertible matrices.
7)  If R is commutative and r ∈ R, then g : M → M defined by g(a) = ar is a homomorphism. Furthermore, if f : M → N is a homomorphism, fr defined by (fr)(a) = f(ar) = f(a)r is a homomorphism.
8)  If R is commutative, HomR(M, N) is an R-module.
9)  Suppose f : M → N is a homomorphism. Then image(f) is a submodule of N and ker(f) = f⁻¹(0) is a submodule of M. More generally, if G ⊂ M is a submodule and H ⊂ N is a submodule, then f(G) is a submodule of N and f⁻¹(H) is a submodule of M.

Proof  This is just a series of observations. Much of this work has already been done in the chapter on groups (see page 28).

Abelian groups are Z-modules  On page 21, it is shown that any additive group M admits a scalar multiplication by integers, and if M is abelian, the properties are satisfied to make M a Z-module. Note that this is the only way M can be a Z-module, because a1 = a, a2 = a + a, etc. Furthermore, if f : M → N is a group homomorphism of abelian groups, then f is also a Z-module homomorphism.

Summary  Additive abelian groups are "the same things" as Z-modules. While group theory in general is quite separate from linear algebra, the study of additive abelian groups is a special case of the study of R-modules.

Exercise  If R is a subring of a ring T, then T, with scalar multiplication defined by ring multiplication, is an R-module. In particular, R is a Q-module. If f : Q → R is a Z-module homomorphism, must f be a Q-module homomorphism?

Homomorphisms on Rⁿ

Rⁿ as an R-module  In Chapter 4 it was shown that the additive abelian group Rm,n admits a scalar multiplication by elements in R. The properties listed there were exactly those needed to make Rm,n an R-module. Of particular importance is the case Rⁿ = R ⊕ · · · ⊕ R = Rn,1. We begin with the case n = 1.

R as a right R-module  Let M = R and define scalar multiplication on the right by ar = a · r. That is, scalar multiplication is just ring multiplication. This makes R a right R-module denoted by RR (or just R). This is the same as the definition before for Rⁿ when n = 1.

Theorem  Suppose N is a subset of R. Then N is a submodule of RR (RR) iff N is a right (left) ideal of R.

Proof  The definitions are the same except expressed in different language.

Theorem  Suppose M = MR and f, g : R → M are homomorphisms with f(1) = g(1). Then f = g.

Proof  Suppose f(1) = g(1). Then f(r) = f(1 · r) = f(1)r = g(1)r = g(1 · r) = g(r).

Theorem  Given m ∈ M, ∃! homomorphism h : R → M with h(1) = m. It is h : R → M defined by h(r) = mr.

Thus evaluation at 1 gives a bijection from HomR(R, M) to M, and this bijection is clearly a group isomorphism. If R is commutative, it is an isomorphism of R-modules. The element m should be thought of as a 1 × 1 matrix. In the case M = R, the theorem above states that multiplication on the left by some m ∈ R defines a right R-module homomorphism from R to R, and every module homomorphism is of this form. We will see later that the product Mⁿ is an R-module with scalar multiplication defined by (m1, m2, ..., mn)r = (m1r, m2r, ..., mnr).

Homomorphisms on Rⁿ  Define ei ∈ Rⁿ to be the column vector with 1 in slot i and 0 elsewhere. Note that any (r1, ..., rn)t ∈ Rⁿ can be written uniquely as e1r1 + · · · + enrn. The sequence {e1, ..., en} is called the canonical free basis or standard basis for Rⁿ.

Theorem  Suppose M = MR and f, g : Rⁿ → M are homomorphisms with f(ei) = g(ei) for 1 ≤ i ≤ n. Then f = g. If m1, m2, ..., mn ∈ M, ∃! homomorphism h : Rⁿ → M with h(ei) = mi for 1 ≤ i ≤ n. The homomorphism h is defined by h(e1r1 + · · · + enrn) = m1r1 + · · · + mnrn.

Proof  The proof is straightforward.

This theorem reveals some of the great simplicity of linear algebra. Any R-module homomorphism from Rⁿ to M is determined by its values on the basis, and any function from that basis to M extends uniquely to a homomorphism from Rⁿ to M. It does not matter how complicated the ring R is, or which R-module M is selected. Note this theorem gives a bijection from HomR(Rⁿ, M) to Mⁿ = M × M × · · · × M, and this bijection is a group isomorphism. If R is commutative so that HomR(Rⁿ, M) is an R-module, this theorem gives an R-module isomorphism from HomR(Rⁿ, M) to Mⁿ.

Exercise  Suppose R is a field and f : RR → M is a non-zero homomorphism. Show f is injective.

Now let's examine the special case M = Rᵐ and show HomR(Rⁿ, Rᵐ) ≈ Rm,n. Matrices over R give R-module homomorphisms!

Theorem  Suppose A = (ai,j) ∈ Rm,n. Then f : Rⁿ → Rᵐ defined by f(B) = AB is a homomorphism with f(ei) = column i of A. Conversely, if m1, m2, ..., mn ∈ Rᵐ, define A ∈ Rm,n to be the matrix with column i = mi. Then f defined by f(B) = AB is the unique homomorphism from Rⁿ to Rᵐ with f(ei) = mi.

Thus HomR(Rⁿ, Rᵐ) and Rm,n are isomorphic as additive groups. If R is commutative, they are isomorphic as R-modules. Furthermore, addition of matrices corresponds to addition of homomorphisms, and multiplication of matrices corresponds to composition of homomorphisms. These properties are made explicit in the next two theorems.

Theorem  If f, g : Rⁿ → Rᵐ are given by matrices A, C ∈ Rm,n, then f + g is given by the matrix A + C. That is, addition of matrices corresponds to addition of homomorphisms.

Theorem  If f : Rⁿ → Rᵐ is the homomorphism given by A ∈ Rm,n and g : Rᵐ → Rᵖ is the homomorphism given by C ∈ Rp,m, then g ◦ f : Rⁿ → Rᵖ is given by CA ∈ Rp,n. That is, composition of homomorphisms corresponds to multiplication of matrices.

Proof  This is just the associative law of matrix multiplication, C(AB) = (CA)B.

The previous theorem reveals where matrix multiplication comes from. It is the matrix which represents the composition of the functions. In the case where the domain and range are the same, we have the following elegant corollary.

Corollary  HomR(Rⁿ, Rⁿ) and Rn are isomorphic as rings. The automorphisms correspond to the invertible matrices.

Even though this follows easily from the previous theorem and properties of matrices, it is one of the great classical facts of linear algebra. This corollary shows one way non-commutative rings arise, namely as endomorphism rings. Even if R is commutative, Rn is never commutative unless n = 1. We now return to the general theory of modules (over some given ring R).

Cosets and Quotient Modules

After seeing quotient groups and quotient rings, quotient modules go through without a hitch. As before, R is a ring and module means R-module.

Theorem  Suppose M is a module and N ⊂ M is a submodule. Since N is a normal subgroup of M, the additive abelian quotient group M/N is defined. Scalar multiplication defined by (a + N)r = (ar + N) is well defined and gives M/N the structure of an R-module. The natural projection π : M → M/N is a surjective homomorphism with kernel N.

Proof  On the group level, this is all known from Chapter 2 (see page 29). It is only necessary to check the scalar multiplication, and this is immediate.

The relationship between quotients and homomorphisms for modules is the same as for groups and rings, as shown by the next theorem.

Theorem  Suppose f : M → M̄ is a homomorphism and N is a submodule of M. If N ⊂ ker(f), then f̄ : (M/N) → M̄ defined by f̄(a + N) = f(a) is a well defined homomorphism making the following diagram commute.

(Diagram: f : M → M̄ factors as f = f̄ ◦ π, where π : M → M/N is the natural projection and f̄ : M/N → M̄.)

Thus defining a homomorphism on a quotient module is the same as defining a homomorphism on the numerator that sends the denominator to 0̄. The image of f̄ is the image of f, and the kernel of f̄ is ker(f)/N. Thus if N = ker(f), f̄ is injective, and thus (M/N) ≈ image(f). Therefore for any homomorphism f, (domain(f)/ker(f)) ≈ image(f). In particular, if f : M → M̄ is a surjective homomorphism with ker(f) = N, then M/N ≈ M̄.

Proof  On the group level this is all known from Chapter 2. It is only necessary to check that f̄ is a module homomorphism, which is obvious.

Theorem
i)  Suppose M is an R-module and K and L are submodules of M. The natural homomorphism K → (K + L)/L is surjective with kernel K ∩ L. Thus (K/K ∩ L) → (K + L)/L is an isomorphism.
ii)  Suppose K ⊂ L are submodules of M. The natural homomorphism M/K → M/L is surjective with kernel L/K. Thus (M/K)/(L/K) → M/L is an isomorphism.

Examples  These two examples are for the case R = Z.
1)  M = Z, K = 3Z, L = 5Z, K ∩ L = 15Z, K + L = Z.
    K/K ∩ L = 3Z/15Z ≈ Z/5Z = (K + L)/L.
2)  M = Z, K = 6Z, L = 3Z (K ⊂ L).
    (M/K)/(L/K) = (Z/6Z)/(3Z/6Z) ≈ Z/3Z = M/L.

Products and Coproducts

Infinite products work fine for modules, just as they do for groups and rings. This is stated below in full generality, although the student should think of the finite case.

Theorem  Suppose M is a module, T is an index set, and for each t ∈ T, Mt is an R-module. On the additive abelian group ∏Mt = ∏_{t∈T} Mt, define scalar multiplication by {mt}r = {mt r}. Then ∏Mt is an R-module and, for each s ∈ T, the natural projection πs : ∏Mt → Ms is a homomorphism. Under the natural 1-1 correspondence from {functions f : M → ∏Mt} to {sequences of functions {ft}t∈T where ft : M → Mt}, f is a homomorphism iff each ft is a homomorphism.

Proof  We already know from Chapter 2 that f is a group homomorphism iff each ft is a group homomorphism. Since scalar multiplication is defined coordinatewise, f is a module homomorphism iff each ft is a module homomorphism.

In the finite case, something important holds for modules that does not hold for non-abelian groups or rings, namely that the finite product is also a coproduct, M1 × M2 × ·· × Mn = M1 ⊕ M2 ⊕ ·· ⊕ Mn. This makes the structure of module homomorphisms much more simple. For the finite case we may use either the product or the sum notation.
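As a quick finite check of Example 1 above, the isomorphism (K/K ∩ L) → (K + L)/L for K = 3Z and L = 5Z can be observed directly on coset representatives. A minimal Python sketch (the variable names are ad hoc):

    # 3Z/15Z should match Z/5Z under the map 3k + 15Z |-> 3k + 5Z.
    K_mod_15 = sorted({(3 * k) % 15 for k in range(15)})   # representatives of 3Z/15Z
    image    = sorted({x % 5 for x in K_mod_15})           # their images in Z/5Z

    print(K_mod_15)   # [0, 3, 6, 9, 12] -- five cosets, like Z/5Z
    print(image)      # [0, 1, 2, 3, 4]  -- the map is onto, hence bijective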

Definition  If T is finite, the coproduct and the product are the same module. If T is infinite, the coproduct or sum ⊕Mt = ⊕_{t∈T} Mt is the submodule of ∏Mt consisting of all sequences {mt} with only a finite number of non-zero terms. For each s ∈ T, the inclusion homomorphism is : Ms → ⊕Mt is defined by is(a) = {at} where at = 0̄ if t ≠ s and as = a. Thus each Ms may be considered to be a submodule of ⊕Mt.

Theorem  Suppose M is an R-module. There is a 1-1 correspondence from {homomorphisms g : ⊕Mt → M} and {sequences of homomorphisms {gt}t∈T where gt : Mt → M}. Given g, gt is defined by gt = g ◦ it. Given {gt}, g is defined by g({mt}) = Σ_t gt(mt). Since there are only a finite number of non-zero terms, this sum is well defined.

For T = {1, 2}, the product and sum properties are displayed in the following commutative diagrams.

(Diagrams: on the left, f : M → M1 ⊕ M2 is determined by f1 = π1 ◦ f and f2 = π2 ◦ f; on the right, g : M1 ⊕ M2 → M is determined by g1 = g ◦ i1 and g2 = g ◦ i2.)

Theorem  For finite T, the 1-1 correspondences in the above theorems actually produce group isomorphisms:
    Hom_R(M, M1 ⊕ ·· ⊕ Mn) ≈ Hom_R(M, M1) ⊕ ·· ⊕ Hom_R(M, Mn)   and
    Hom_R(M1 ⊕ ·· ⊕ Mn, M) ≈ Hom_R(M1, M) ⊕ ·· ⊕ Hom_R(Mn, M).
If R is commutative, so that the objects are R-modules and not merely additive groups, the isomorphisms are module isomorphisms, i.e., they give isomorphisms of R-modules.

Proof  Let's look at this theorem for products with n = 2. All it says is that if f = (f1, f2) and h = (h1, h2), then f + h = (f1 + h1, f2 + h2). If R is commutative, it says merely that fr = (f1r, f2r).

Exercise  Suppose M and N are R-modules. Show that M ⊕ N is isomorphic to N ⊕ M.

Exercise  Suppose M and N are R-modules, and A ⊂ M, B ⊂ N are submodules. Show (M ⊕ N)/(A ⊕ B) is isomorphic to (M/A) ⊕ (N/B).

Exercise  Suppose R is a commutative ring, M is an R-module, and n ≥ 1. Define a function α : Hom_R(R^n, M) → M^n which is an R-module isomorphism.

Summands

One basic question in algebra is "When does a module split as the sum of two modules?". Before defining summand, here are two theorems for background.

Theorem  Consider M1 = M1 ⊕ 0̄ as a submodule of M1 ⊕ M2. Then the projection map π2 : M1 ⊕ M2 → M2 is a surjective homomorphism with kernel M1. Thus (M1 ⊕ M2)/M1 is isomorphic to M2.

This is exactly what you would expect, and the next theorem is almost as intuitive.

Theorem  Suppose K and L are submodules of M, and f : K ⊕ L → M is the natural homomorphism, f(k, l) = k + l. Then the image of f is K + L, and the kernel of f is {(a, −a) : a ∈ K ∩ L}. Thus f is an isomorphism iff K + L = M and K ∩ L = 0̄.

Definition  Suppose K is a submodule of M. The statement that K is a summand of M means ∃ a submodule L of M with K ⊕ L = M. According to the previous theorem, this is the same as there exists a submodule L with K + L = M and K ∩ L = 0̄. In this case we write K ⊕ L = M. This abuse of notation allows us to avoid talking about "internal" and "external" direct sums.

Of course, M and 0̄ are always summands of M. If such an L exists, it need not be unique, but it will be unique up to isomorphism, because L ≈ M/K.

Exercise  Suppose M is a module and K = {(m, m) : m ∈ M} ⊂ M ⊕ M. Show K is a submodule of M ⊕ M which is a summand.

Exercise  R is a module over Q, and Q ⊂ R is a submodule. Is Q a summand of R? With the material at hand, this is not an easy question. Later on, it will be easy.

Exercise  Answer the following questions about abelian groups, i.e., Z-modules.
1)  Is 2Z a summand of Z?
2)  Is 4Z8 a summand of Z8?
3)  Is 3Z12 a summand of Z12?
4)  Suppose m, n > 1. When is nZmn a summand of Zmn?

Exercise  If T is a ring, define the center of T to be the subring {t : ts = st for all s ∈ T}. Let R be a commutative ring and T = R_n. There is an exercise on page 57 to show that the center of T is the subring of scalar matrices. Show R^n is a left T-module and find Hom_T(R^n, R^n).

Generating Sets, Independence, and Free Basis

This section is a generalization and abstraction of the brief section Homomorphisms on R^n. These concepts work fine for an infinite index set T because linear combination means finite linear combination. However, to avoid dizziness, the student should consider the case where T is finite.

Definition  Suppose M is an R-module, T is an index set, and for each t ∈ T, st ∈ M. Let S be the sequence {st}t∈T = {st}. The statement that S is dependent means ∃ a finite number of distinct elements t1, .., tn in T, and elements r1, .., rn in R, not all zero, such that the linear combination s_{t1}r1 + ·· + s_{tn}rn = 0̄. Otherwise, S is said to be independent. Note that if some st = 0̄, then S is dependent. Also if ∃ distinct elements t1 and t2 in T with s_{t1} = s_{t2}, then S is dependent. Let SR be the set of all linear combinations s_{t1}r1 + ·· + s_{tn}rn. SR is a submodule of M called the submodule generated by S. If S is independent and generates M, then S is said to be a basis or free basis for M. In this case any v ∈ M can be written uniquely as a linear combination of elements in S. If ∃ a basis for M, M is said to be a free R-module.

The next two theorems are obvious, except for the confusing notation.

Theorem  For each t ∈ T, let Rt = R_R, and for each c ∈ T, let ec ∈ ⊕Rt = ⊕_{t∈T} Rt be ec = {rt} where rc = 1̄ and rt = 0̄ if t ≠ c. Then {ec}c∈T is a basis for ⊕Rt, called the canonical basis or standard basis. You might try first the case T = {1, 2, .., n}, where ⊕Rt = R^n.

Theorem  Suppose N is an R-module and M is a free R-module with a basis {st}. Then ∃ a 1-1 correspondence from the set of all functions g : {st} → N and the set of all homomorphisms f : M → N. Given f, define g by g(st) = f(st). Given g, define f by f(s_{t1}r1 + ·· + s_{tn}rn) = g(s_{t1})r1 + ·· + g(s_{tn})rn. In other words, f is completely determined by what it does on the basis S, and you are "free" to send the basis any place you wish and extend to a homomorphism.

Recall that we have already had the preceding theorem in the case S is the canonical basis for M = R^n. The next theorem is so basic in linear algebra that it is used without comment.

Theorem  Suppose M and N are modules, S = {st} is a basis for M, and f : M → N is a homomorphism. Let f(S) be the sequence {f(st)} in N.
1)  f(S) generates N iff f is surjective.
2)  f(S) is independent in N iff f is injective.
3)  f(S) is a basis for N iff f is an isomorphism.
4)  If h : M → N is a homomorphism, then f = h iff f | S = h | S.

Exercise  Let (A1, .., An) be a sequence of n vectors with each Ai ∈ Z^n. Show this sequence is linearly independent over Z iff it is linearly independent over Q. Is it true the sequence is linearly independent over Z iff it is linearly independent over R? This question is difficult until we learn more linear algebra.

Characterization of Free Modules

It will now be shown that any free R-module is isomorphic to one of the canonical free R-modules.

Theorem  An R-module N is free iff ∃ an index set T such that N ≈ ⊕_{t∈T} Rt. In particular, N has a finite free basis of n elements iff N ≈ R^n.

Proof  If N is isomorphic to a free module, N is certainly free. Now suppose N has a free basis {st}. Then the homomorphism f : ⊕Rt → N with f(et) = st sends the canonical basis for ⊕Rt to the basis for N. By 3) in the preceding theorem, f is an isomorphism. Although the proof is easy, it should be worked carefully.
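For the square case of the exercise above, independence of the columns over Q (and hence, by the exercise, over Z) is certified by a non-zero determinant, which can be computed exactly for integer matrices. A small Python sketch (the matrix is an arbitrary example):

    def det(m):
        # Exact integer determinant by cofactor expansion; fine for small matrices.
        n = len(m)
        if n == 1:
            return m[0][0]
        total = 0
        for j in range(n):
            minor = [row[:j] + row[j+1:] for row in m[1:]]
            total += (-1) ** j * m[0][j] * det(minor)
        return total

    A = [[1, 2, 0],
         [0, 1, 1],
         [1, 0, 3]]
    print(det(A))   # 5, non-zero, so the columns are independent over Q and over Z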

Exercise  Suppose R is a commutative ring, A ∈ R_n, and the homomorphism f : R^n → R^n defined by f(B) = AB is surjective. Show f is an isomorphism, i.e., show A is invertible. This is a key theorem in linear algebra, although it is usually stated only for the case where R is a field. Use the fact that {e1, .., en} is a free basis for R^n.

The next exercise is to relate properties of R as an R-module to properties of R as a ring.

Exercise  Suppose R is a commutative ring and v ∈ R, v ≠ 0̄.
1)  v is independent iff v is ______.
2)  v is a basis for R iff v generates R iff v is ______.
Note that 2) here is essentially the first exercise for the case n = 1.

The next exercise is routine, but still informative.

Exercise  Let R = Z, let A = (2 1 0 / 3 2 −5), and let f : Z^3 → Z^2 be the group homomorphism defined by A. Find a non-trivial linear combination of the columns of A which is 0̄. Also find a non-zero element of kernel(f).

Relating these concepts to matrices

The theorem stated below gives a summary of results we have already had. It shows that certain concepts about matrices, linear independence, injective homomorphisms, and solutions of equations are all the same; they are merely stated in different language. Suppose A ∈ R_{m,n} and f : R^n → R^m is the homomorphism associated with A, i.e., f(B) = AB. Let v1, .., vn ∈ R^m be the columns of A, i.e., f(ei) = vi = column i of A. Let λ = (λ1, .., λn)^t represent an element of R^n, and let C = (c1, .., cm)^t represent an element of R^m.

Theorem
1)  f(λ) is the linear combination of the columns of A, f(λ) = f(e1λ1 + ·· + enλn) = v1λ1 + ·· + vnλn.
2)  {v1, .., vn} generates R^m iff f is surjective iff (for any C ∈ R^m, AX = C has a solution).
3)  {v1, .., vn} is independent iff f is injective iff AX = 0̄ has a unique solution iff (∃ C ∈ R^m such that AX = C has a unique solution).
4)  {v1, .., vn} is a basis for R^m iff f is an isomorphism iff (for any C ∈ R^m, AX = C has a unique solution).

Relating these concepts to square matrices

We now look at the preceding theorem in the special case where n = m and R is a commutative ring. So far in this chapter we have just been cataloging. Now we prove something more substantial, namely that if f : R^n → R^n is surjective, then f is injective. Later on we will prove that if R is a field, injective implies surjective.

Theorem  Suppose R is a commutative ring, A ∈ R_n, and f : R^n → R^n is defined by f(B) = AB. Let v1, .., vn ∈ R^n be the columns of A, and let w1, .., wn ∈ R^n = R_{1,n} be the rows of A. Then the following are equivalent.
1)  f is an automorphism.
2)  A is invertible, i.e., | A | is a unit in R.
3)  {v1, .., vn} is a basis for R^n.
4)  {v1, .., vn} generates R^n.
5)  f is surjective.
2t)  A^t is invertible, i.e., | A^t | is a unit in R.
3t)  {w1, .., wn} is a basis for R^n.
4t)  {w1, .., wn} generates R^n.

Proof  We already know the first three properties are equivalent, 3) implies 4), and by the previous theorem, 4) and 5) are equivalent. Suppose 5) is true and show 2). Since f is onto, ∃ u1, .., un ∈ R^n with f(ui) = ei. Let g : R^n → R^n be the homomorphism satisfying g(ei) = ui. Then f ◦ g is the identity. Now g comes from some matrix D, and thus AD = I. This shows that A has a right inverse and is thus invertible. Recall that the proof of this fact uses determinant, which requires that R be commutative. Thus the first five properties are equivalent. Since | A | = | A^t |, 2) and 2t) are equivalent. Furthermore, applying this result to A^t shows that the last three properties are equivalent to each other.

Uniqueness of Dimension

There exists a ring R with R^2 ≈ R^3 as R-modules, but this is of little interest. If R is commutative, this is impossible, as shown below. First we make a convention.

Convention  For the remainder of this chapter, R will be a commutative ring.

Theorem  If f : R^m → R^n is a surjective R-module homomorphism, then m ≥ n.

Proof  Suppose k = n − m is positive. Define h : (R^m ⊕ R^k = R^n) → R^n by h(u, v) = f(u). Then h is a surjective homomorphism, and by the previous section, h is also injective. But h sends each (0̄, v) to 0̄, and since k > 0, this is impossible. This is a contradiction.

Corollary  If f : R^m → R^n is an isomorphism, then m = n.

Proof  Each of f and f^{-1} is surjective, so m = n by the previous theorem.

Corollary  If {v1, .., vm} generates R^n, then m ≥ n.

Proof  The hypothesis implies there is a surjective homomorphism R^m → R^n. So this follows from the first theorem.

Lemma  Suppose M is a f.g. module (i.e., a finitely generated R-module). Then if M has a basis, that basis is finite.
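Over R = Z these results can be explored numerically: A is invertible over Z exactly when | A | is a unit of Z, i.e., ±1, and then its inverse is again an integer matrix. A Python sketch (the matrices are arbitrary examples; the inverse is computed in floating point and rounded):

    import numpy as np

    A = np.array([[2, 1],
                  [1, 1]])                      # det = 1, a unit in Z
    print(round(np.linalg.det(A)))              # 1
    A_inv = np.rint(np.linalg.inv(A)).astype(int)
    print(A_inv)                                # [[ 1 -1] [-1  2]], integer entries
    assert (A @ A_inv == np.eye(2, dtype=int)).all()

    # det = 2 is not a unit in Z: B below is injective on Z^2 but not surjective.
    B = np.array([[2, 0],
                  [0, 1]])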

Proof  Suppose U ⊂ M is a finite generating set and S is a basis. Then any element of U is a finite linear combination of elements of S, so some finite subsequence of S generates M. Since S is independent, no element of S is a linear combination of the others, and thus S itself is finite.

Now suppose M is a f.g. module.

Theorem  Suppose M is a f.g. module. If M has a basis, that basis is finite and any other basis has the same number of elements. This number is denoted by dim(M), the dimension of M.

Proof  By the previous lemma, any basis for M must be finite. M has a basis of n elements iff M ≈ R^n. The result follows because R^n ≈ R^m iff n = m.

Change of Basis

Before changing basis, we recall what a basis is. Previously we defined generating, independence, and basis for sequences, not for collections. In a set or collection, there is no concept of repetition. For the concept of generating, it matters not whether you use sequences or collections, but for independence and basis, you must use sequences. Consider the columns of the real matrix A = (2 3 2 / 1 4 1). If we consider the column vectors of A as a collection, there are only two of them, yet we certainly don't wish to say the columns of A form a basis for R^2. In order to make sense, we must consider the columns of A as an ordered triple of vectors. Two sequences cannot begin to be equal unless they have the same index set. We will follow the classical convention that an index set with n elements must be {1, 2, .., n}, and thus a basis for M with n elements is a sequence S = {u1, .., un}, or if you wish, S = (u1, .., un) ∈ M^n. When we originally defined basis, we could have called it "indexed free basis" or even "ordered free basis".

Suppose M is an R-module with a basis of n elements. Recall there is a bijection α : Hom_R(R^n, M) → M^n defined by α(h) = (h(e1), .., h(en)). Now h : R^n → M is an isomorphism iff α(h) is a basis for M. The point of all this is that selecting a basis of n elements for M is the same as selecting an isomorphism from R^n to M, and from this viewpoint, change of basis can be displayed by the diagram below.

Now suppose M is a f.g. free module and f : M → M is a homomorphism. In order to represent f by a matrix, we must select a basis for M (i.e., an isomorphism with R^n). Endomorphisms on R^n are represented by square matrices, and thus have a determinant and trace. We will show that this matrix is well defined up to similarity, and thus the determinant, trace, and characteristic polynomial of f are well defined.

Definition  Suppose M is a free module, S = {u1, .., un} is a basis for M, and f : M → M is a homomorphism. The matrix A = (a_{i,j}) ∈ R_n of f w.r.t. the basis S is defined by f(ui) = u1·a_{1,i} + ·· + un·a_{n,i}. (Note that if M = R^n and ui = ei, A is the usual matrix associated with f.) Let h : R^n → M be the isomorphism with h(ei) = ui for 1 ≤ i ≤ n. Then the matrix A is the one determined by the endomorphism h^{-1} ◦ f ◦ h : R^n → R^n. In other words, column i of A is h^{-1}(f(h(ei))).

Theorem  Suppose T = {v1, .., vn} is another basis for M, and B ∈ R_n is the matrix of f w.r.t. T. Define C = (c_{i,j}) ∈ R_n by vi = u1·c_{1,i} + ·· + un·c_{n,i}. Then C is invertible and B = C^{-1}AC. In other words, A and B are similar. Therefore | A | = | B |, trace(A) = trace(B), and A and B have the same characteristic polynomial (see chapter 4). Conversely, suppose C = (c_{i,j}) ∈ R_n is invertible. Define T = {v1, .., vn} by vi = u1·c_{1,i} + ·· + un·c_{n,i}. Then T is a basis for M, and the matrix of f w.r.t. T is B = C^{-1}AC. In other words, conjugation of matrices corresponds to change of basis.

Proof  The proof follows by seeing that the following diagram is commutative.

(Diagram: the two isomorphisms R^n → M given by ei ↦ ui and ei ↦ vi, the matrix C : R^n → R^n connecting them, and f : M → M represented by A on one copy of R^n and by B on the other.)

The diagram also explains what it means for A to be the matrix of f w.r.t. the basis S. An important special case is where M = R^n and f : R^n → R^n is given by some matrix W. Then h is given by the matrix U whose i-th column is ui, and A = U^{-1}WU. In other words, W represents f w.r.t. the standard basis, and U^{-1}WU represents f w.r.t. the basis {u1, .., un}.

Definition  Suppose M is a f.g. free module and f : M → M is a homomorphism. Define | f | to be | A |, trace(f) to be trace(A), and CP_f(x) to be CP_A(x), where A is the matrix of f w.r.t. some basis.
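The special case M = R^n can be illustrated numerically. In the Python sketch below (numpy; W and U are arbitrary choices), U^{-1}WU represents the same endomorphism in the new basis, so the similarity invariants match.

    import numpy as np

    W = np.array([[1., 2.],
                  [0., 3.]])               # matrix of f w.r.t. the standard basis
    U = np.array([[1., 1.],
                  [1., 2.]])               # invertible; columns are the new basis

    A = np.linalg.inv(U) @ W @ U           # matrix of f w.r.t. the new basis

    # Determinant and trace are similarity invariants.
    print(np.isclose(np.linalg.det(A), np.linalg.det(W)))   # True
    print(np.isclose(np.trace(A), np.trace(W)))             # True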

By the previous theorem, all three are well defined, i.e., they do not depend upon the choice of basis.

Exercise  Let R = Z and let f : Z^2 → Z^2 be defined by f(D) = (3 3 / 0 −1)D. Find the matrix of f w.r.t. the basis {(2, 3), (1, 1)}, i.e., find an invertible matrix C ∈ R_2 with B = C^{-1}AC. Find the determinant, trace, and characteristic polynomial of f.

Exercise  Let L ⊂ R^2 be the line L = {(r, 2r) : r ∈ R}. Show there is one and only one homomorphism f : R^2 → R^2 which is the identity on L and has f(−1, 1) = (1, −1). Find the matrix A ∈ R_2 which represents f with respect to the basis {(1, 2), (−1, 1)}. Also find the matrix B ∈ R_2 which represents f with respect to the standard basis.

Vector Spaces

So far in this chapter we have been developing the theory of linear algebra in general. The previous theorem, for example, holds for any commutative ring R, but it must be assumed that the module M is free. Endomorphisms in general will not have a determinant or trace. We now focus on the case where R is a field, and show that in this case, every R-module is free. Thus any finitely generated R-module will have a well defined dimension, and endomorphisms on it will have well defined determinant, trace, and characteristic polynomial.

In this section, F is a field. F-modules may also be called vector spaces, and F-module homomorphisms may also be called linear transformations.

Theorem  Suppose M is an F-module. That is, if v ∈ V and r ∈ F, then vr = 0̄ implies v = 0̄ or r = 0̄.

Proof  Suppose vr = 0̄ and r ≠ 0̄. Then 0̄ = (vr)r^{-1} = v·1̄ = v.

Theorem  Suppose M is an F-module and v ∈ M. Then v ≠ 0̄ iff v is independent.

Proof  By the previous theorem.

Theorem  Suppose M ≠ 0̄ is an F-module and v ∈ M. Then v generates M iff v is a basis for M. Furthermore, if these conditions hold, then M ≈ F_F.

Proof  Suppose v generates M. Then v ≠ 0̄, and is thus independent by the previous theorem. In this case M ≈ F, any non-zero element of F is a basis, and any two elements of F are dependent.

After so many routine theorems, it is nice to have one with real power. This is one of the theorems that makes linear algebra tick. It not only says any finite independent sequence can be extended to a basis, but it can be extended to a basis inside any finite generating set containing it. The key hypothesis here is that the ring is a field. If R = Z, then Z is a free module over itself, and the element 2 of Z is independent. However it certainly cannot be extended to a basis. Also the finiteness hypothesis in this theorem is only for convenience, as will be seen momentarily.

Theorem  Suppose M ≠ 0̄ is a finitely generated F-module. If S = {v1, .., vm} generates M, then any maximal independent subsequence of S is a basis for M. Thus any finite independent sequence can be extended to a basis. In particular, M has a finite free basis, and thus is a free F-module.

Proof  Suppose, for notational convenience, that {v1, .., vn} is a maximal independent subsequence of S, and n < i ≤ m. It must be shown that vi is a linear combination of {v1, .., vn}. Since {v1, .., vi} is dependent, ∃ r1, .., rn, ri, not all zero, such that v1r1 + ·· + vnrn + vi·ri = 0̄. Then ri ≠ 0̄, and vi = −(v1r1 + ·· + vnrn)ri^{-1}. Thus {v1, .., vn} generates S, and thus all of M, and thus {v1, .., vn} is a basis. Now suppose T is a finite independent sequence. T may be extended to a finite generating sequence, and inside that sequence it may be extended to a maximal independent sequence. Thus T extends to a basis.

Theorem  Suppose M is an F-module of dimension n, and {v1, .., vm} is an independent sequence in M. Then m ≤ n, and if m = n, {v1, .., vm} is a basis.

Proof  {v1, .., vm} extends to a basis with n elements, and thus m ≤ n. If m = n, the extension is {v1, .., vm} itself, and thus it is a basis.

The next theorem is just a collection of observations.

Theorem  Suppose M and N are finitely generated F-modules.
1)  M ≈ F^n iff dim(M) = n.
2)  M ≈ N iff dim(M) = dim(N).
3)  F^m ≈ F^n iff n = m.
4)  dim(M ⊕ N) = dim(M) + dim(N).

Here is the basic theorem in full generality.

Theorem  Suppose M ≠ 0̄ is an F-module and S = {vt}t∈T generates M.
1)  Any maximal independent subsequence of S is a basis for M.
2)  Any independent subsequence of S may be extended to a maximal independent subsequence of S, and thus to a basis for M.
3)  Any independent subsequence of M can be extended to a basis for M. In particular, M has a free basis, and thus is a free F-module.

Proof  The proof of 1) is the same as in the case where S is finite. Part 2) will follow from the Hausdorff Maximality Principle. An independent subsequence of S is contained in a maximal monotonic tower of independent subsequences. The union of these independent subsequences is still independent, and so the result follows. Part 3) follows from 2) because an independent sequence can always be extended to a generating sequence.

Theorem  Suppose M is an F-module and K ⊂ M is a submodule.
1)  K is a summand of M, i.e., ∃ a submodule L of M with K ⊕ L = M.
2)  If M is f.g., then dim(K) ≤ dim(M), and K = M iff dim(K) = dim(M).

Proof  Let T be a basis for K. Extend T to a basis S for M. Then S − T generates a submodule L with K ⊕ L = M. Part 2) follows from 1).

Corollary  Q is a summand of R. In other words, ∃ a Q-submodule V ⊂ R with Q ⊕ V = R as Q-modules. (See exercise on page 77.)

Proof  Q is a field, R is a Q-module, and Q is a submodule of R.

Theorem  Suppose M is a f.g. F-module, W is an F-module, and f : M → W is a homomorphism. Then dim(M) = dim(ker(f)) + dim(image(f)).

Proof  Let K = ker(f) and L ⊂ M be a submodule with K ⊕ L = M. Then f | L : L → image(f) is an isomorphism, and so the result follows.

Exercise  Suppose R is a domain with the property that, for R-modules, every submodule is a summand. Show R is a field.

Exercise  Find a free Z-module which has a generating set containing no basis.

Exercise  The real vector space R^3 is generated by the sequence S = {(1, 1, 1), (1, 2, 0), (2, 3, 1), (3, 2, 2)}. Show there are three maximal independent subsequences of S, and each is a basis for R^3.

Exercise  The real vector space R^2 is generated by the sequence S = {(π, 0), (3, 2), (1, 1)}. Show there are three maximal independent subsequences of S, and each is a basis for R^2.

Square matrices over fields

This theorem is just a summary of what we have for square matrices over fields.

Theorem  Suppose A ∈ F_n and f : F^n → F^n is defined by f(B) = AB. Let v1, .., vn ∈ F^n be the columns of A, and let w1, .., wn ∈ F^n = F_{1,n} be the rows of A. Then the following are equivalent.
1)  {v1, .., vn} is independent, i.e., f is injective.
2)  {v1, .., vn} generates F^n, i.e., f is surjective.
3)  {v1, .., vn} is a basis for F^n, i.e., f is an automorphism, i.e., A is invertible, i.e., | A | ≠ 0̄.
1t)  {w1, .., wn} is independent.
2t)  {w1, .., wn} generates F^n.
3t)  {w1, .., wn} is a basis for F^n, i.e., A^t is invertible, i.e., | A^t | ≠ 0̄.

(See the exercise on page 79. Now suppose each of U and V is a vector space of dimension n and f : U → V is a linear transformation.) Parts 1) and 1t ) follow from the preceding section. The row (column) rank of A is defined to be the dimension of the submodule of F n (F m ) generated by the rows (columns) of A. f is injective iff f is bijective iff f is surjective.. The sequence (A1 . Each column of A is a vector in the range F m . . . Exercise Let A = (A1 .. Rank of a matrix Suppose A ∈ Fm. (See the section Relating these concepts to square matrices. Let f : Zn → Zn be defined by f (B) = AB and f : Rn → Rn be defined by f (C) = AC. Proof Suppose f : F n → F m is defined by f (B) = AB. Theorem If C ∈ Fm and D ∈ Fn are invertible. By the pigeonhole principle. . and if B ∈ F n . |A| = 0. It follows from the work done so far that f is injective iff f is bijective iff f is surjective...Chapter 5 Linear Algebra 89 Proof Except for 1) and 1t ). then the row (column) rank of A is the same as the row (column) rank of CAD.) 1) 2) 3) 4) 5) f : Zn → Zn is injective. An ) be an n × n matrix over Z with column i = Ai ∈ n ¯ Z . Exercise Add to this theorem more equivalent statements in terms of solutions of n equations in n unknowns.. Show the following are equivalent. An ) is linearly independent over R. Overview Suppose each of X and Y is a set with n elements and f : X → Y is a function.n . The sequence (A1 .. this theorem holds for any commutative ring R. This shows some of the simple and definitive nature of linear algebra. f (B) is a linear combination of those vectors. ¯ f : Rn → Rn is injective. . An ) is linearly independent over Z.

Thus the image of f is the submodule of F^m generated by the columns of A, and its dimension is the column rank of A. This dimension is the same as the dimension of the image of g ◦ f ◦ h : F^n → F^m, where h is any automorphism on F^n and g is any automorphism on F^m. This proves the theorem for column rank. The theorem for row rank follows using transpose.

Theorem  If A ∈ F_{m,n}, the row rank and the column rank of A are equal. This number is called the rank of A and is ≤ min{m, n}.

Proof  By the theorem above, elementary row and column operations change neither the row rank nor the column rank. By row and column operations, A may be changed to a matrix H where h_{1,1} = ·· = h_{t,t} = 1 and all other entries are 0̄ (see the first exercise on page 59). Thus row rank = t = column rank.

Exercise  Suppose A has rank t. Show that it is possible to select t rows and t columns of A such that the determined t × t matrix is invertible. Show that the rank of A is the largest integer t such that this is possible.

Exercise  Suppose A ∈ F_{m,n} has rank t. What is the dimension of the solution set of AX = 0̄?

Definition  Suppose M is a finite dimensional vector space over a field F, and f : M → M is an endomorphism. The rank of f is defined to be the dimension of the image of f. It follows from the work above that this is the same as the rank of any matrix representing f.

Geometric Interpretation of Determinant

Suppose V ⊂ R^n is some nice subset. For example, if n = 2, V might be the interior of a square or circle. There is a concept of the n-dimensional volume of V. For n = 1, it is length; for n = 2, it is area; and for n = 3, it is "ordinary volume". The volume of V does not change under translation, i.e., V and V + p have the same volume. Suppose A ∈ R_n and f : R^n → R^n is the homomorphism given by A. Thus f(V) and f(V + p) = f(V) + f(p) have the same volume. In street language, the next theorem says that "f multiplies volume by the absolute value of its determinant".

Theorem  The n-dimensional volume of f(V) is ±| A |·(the n-dimensional volume of V). Thus if | A | = ±1, f preserves volume.
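A numerical check of these facts (Python with numpy; the matrix is an arbitrary example): row rank equals column rank, and multiplying by an invertible matrix preserves rank.

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [2., 4., 6.],
                  [1., 0., 1.]])           # second row is twice the first

    print(np.linalg.matrix_rank(A))        # 2, the column rank
    print(np.linalg.matrix_rank(A.T))      # 2, the row rank: the same

    # Multiplication by an invertible matrix does not change the rank.
    C = np.array([[1., 1., 0.],
                  [0., 1., 0.],
                  [0., 0., 1.]])
    print(np.linalg.matrix_rank(C @ A))    # still 2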

Proof  If | A | = 0̄, image(f) has dimension < n, and thus f(V) has n-dimensional volume 0. If | A | ≠ 0̄, then A is the product of elementary matrices (see page 59), and for elementary matrices the theorem is obvious. The result follows because the determinant of the composition is the product of the determinants.

Corollary  If P is the n-dimensional parallelepiped determined by the columns v1, .., vn of A, then the n-dimensional volume of P is ±| A |.

Proof  Let V = [0, 1] × ·· × [0, 1] = {e1t1 + ·· + entn : 0 ≤ ti ≤ 1}. Then P = f(V) = {v1t1 + ·· + vntn : 0 ≤ ti ≤ 1}, and the volume of V is 1.

Linear functions approximate differentiable functions locally

We continue with the special case F = R. Linear functions arise naturally in business, science, and mathematics. However this is not the only reason that linear algebra is so useful. It is a central fact that smooth phenomena may be approximated locally by linear phenomena. Without this great simplification, the world of technology as we know it today would not exist. As a simple example, suppose h : R → R is differentiable and p is a real number. Let f : R → R be the linear transformation f(x) = h′(p)x. Then h is approximated near p by g(x) = h(p) + f(x − p) = h(p) + h′(p)(x − p). Of course, linear transformations send the origin to the origin, so they must be adjusted by a translation.

Now suppose V ⊂ R^2 is some nice subset and h = (h1, h2) : V → R^2 is injective and differentiable. Define the Jacobian by

    J(h)(x, y) = ( ∂h1/∂x   ∂h1/∂y )
                 ( ∂h2/∂x   ∂h2/∂y )

and for each (x, y) ∈ V, let f(x, y) : R^2 → R^2 be the homomorphism defined by J(h)(x, y). Then for any (p1, p2) ∈ V, h is approximated near (p1, p2) (after translation) by f(p1, p2). (Note that if h is the restriction of a linear transformation from R^2 to R^2, this theorem is immediate from the previous section.) The student may now understand the following theorem from calculus.

Theorem  Suppose the determinant of J(h)(x, y) is non-negative for each (x, y) ∈ V. Then the area of h(V) is ∬V | J(h) | dx dy.

The area of V is ∬V 1 dx dy, and from the previous section we know that any homomorphism f multiplies area by | f |.
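For n = 2 the corollary can be checked directly: the image of the unit square under A is the parallelogram spanned by the columns of A, and its area (computed below by the shoelace formula) equals | det A |. A Python sketch with an arbitrary example matrix:

    import numpy as np

    def shoelace(pts):
        # Area of a polygon whose vertices are listed counterclockwise.
        x, y = pts[:, 0], pts[:, 1]
        return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

    A = np.array([[2., 1.],
                  [1., 3.]])
    v1, v2 = A[:, 0], A[:, 1]

    # Image of the unit square: the parallelogram with vertices 0, v1, v1+v2, v2.
    P = np.array([[0., 0.], v1, v1 + v2, v2])
    print(shoelace(P))                    # 5.0
    print(abs(np.linalg.det(A)))          # 5.0, so area = |det A| * (area of square)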

The Transpose Principle

We now return to the case where F is a field (of arbitrary characteristic). F-modules may also be called vector spaces and submodules may be called subspaces. The study of R-modules in general is important and complex. However the study of F-modules is short and simple: every vector space is free and every subspace is a summand. The core of classical linear algebra is not the study of vector spaces, but the study of homomorphisms, and in particular, of endomorphisms. One goal is to show that if f : V → V is a homomorphism with some given property, there exists a basis of V so that the matrix representing f displays that property in a prominent manner. The next theorem is an illustration of this.

Theorem  Let F be a field and n be a positive integer.
1)  Suppose V is an n-dimensional vector space and f : V → V is a homomorphism with | f | = 0̄. Then ∃ a basis of V such that the matrix representing f has its first row zero.
2)  Suppose A ∈ F_n has | A | = 0̄. Then ∃ an invertible matrix C such that C^{-1}AC has its first row zero.
3)  Suppose V is an n-dimensional vector space and f : V → V is a homomorphism with | f | = 0̄. Then ∃ a basis of V such that the matrix representing f has its first column zero.
4)  Suppose A ∈ F_n has | A | = 0̄. Then ∃ an invertible matrix D such that D^{-1}AD has its first column zero.

We first wish to show that these 4 statements are equivalent. We know that 1) and 2) are equivalent, and also that 3) and 4) are equivalent, because change of basis corresponds to conjugation of the matrix. Now suppose 2) is true and show 4) is true. Suppose | A | = 0̄. Then | A^t | = 0̄, and by 2) ∃ C such that C^{-1}A^tC has first row zero. Thus (C^{-1}A^tC)^t = C^tA(C^t)^{-1} has first column zero. The result follows by defining D = (C^t)^{-1}. Also 4) implies 2). This is an example of the transpose principle. Loosely stated, it is that theorems about change of basis correspond to theorems about conjugation of matrices, and theorems about the rows of a matrix correspond to theorems about the columns of a matrix, using transpose. In the remainder of this chapter, this will be used without further comment.


Proof of the theorem  We are free to select any of the 4 parts, and we select part 3). Since | f | = 0̄, f is not injective, and ∃ a non-zero v1 ∈ V with f(v1) = 0̄. Extend v1 to a basis {v1, .., vn}. Then the matrix of f w.r.t. this basis has first column zero.

Exercise  Let A = (3π 6 / 2π 4). Find an invertible matrix C ∈ R_2 so that C^{-1}AC has first row zero. Also let A = (0 0 0 / 1 3 4 / 2 1 4), and find an invertible matrix D ∈ R_3 so that D^{-1}AD has first column zero.

Exercise  Suppose M is an n-dimensional vector space over a field F, k is an integer with 0 < k < n, and f : M → M is an endomorphism of rank k. Show there is a basis for M so that the matrix representing f has its first n − k rows zero. Also show there is a basis for M so that the matrix representing f has its first n − k columns zero. Do not use the transpose principle.

Nilpotent Homomorphisms

In this section it is shown that an endomorphism f is nilpotent iff all of its characteristic roots are 0̄ iff it may be represented by a strictly upper triangular matrix.

Definition  An endomorphism f : V → V is nilpotent if ∃ m with f^m = 0̄. Any f represented by a strictly upper triangular matrix is nilpotent (see page 56).

Theorem  Suppose V is an n-dimensional vector space and f : V → V is a nilpotent homomorphism. Then f^n = 0̄, and ∃ a basis of V such that the matrix representing f w.r.t. this basis is strictly upper triangular. Thus the characteristic polynomial of f is CP_f(x) = x^n.

Proof  Suppose f ≠ 0̄ is nilpotent. Let t be the largest positive integer with f^t ≠ 0̄. Then f^t(V) ⊂ f^{t-1}(V) ⊂ ·· ⊂ f(V) ⊂ V. Since f is nilpotent, all of these inclusions are proper. Therefore t < n and f^n = 0̄. Construct a basis for V by starting with a basis for f^t(V), extending it to a basis for f^{t-1}(V), etc. Then the matrix of f w.r.t. this basis is strictly upper triangular.

Note  To obtain a matrix which is strictly lower triangular, reverse the order of the basis.
0 0 0  has first row zero. Also let A =  1 3 4  and find an invertible matrix D ∈ R3  2 1 4 so that D−1 AD has first column zero. Exercise Suppose M is an n-dimensional vector space over a field F , k is an integer with 0 < k < n, and f : M → M is an endomorphism of rank k. Show there is a basis for M so that the matrix representing f has its first n − k rows zero. Also show there is a basis for M so that the matrix representing f has its first n − k columns zero. Do not use the transpose principle. Nilpotent Homomorphisms In this section it is shown that an endomorphism f is nilpotent iff all of its characteristic roots are 0 iff it may be represented by a strictly upper triangular matrix. ¯ Definition An endomorphism f : V → V is nilpotent if ∃ m with f m = 0. Any ¯ f represented by a strictly upper triangular matrix is nilpotent (see page 56). Theorem Suppose V is an n-dimensional vector space and f : V → V is a nilpotent homomorphism. Then f n = 0 and ∃ a basis of V such that the matrix ¯ representing f w.r.t. this basis is strictly upper triangular. Thus the characteristic polynomial of f is CPf (x) = xn . Proof Suppose f = 0 is nilpotent. Let t be the largest positive integer with ¯ t f = 0. Then f t (V ) ⊂ f t−1 (V ) ⊂ ·· ⊂ f (V ) ⊂ V . Since f is nilpotent, all of these ¯ inclusions are proper. Therefore t < n and f n = 0. Construct a basis for V by ¯ starting with a basis for f t (V ), extending it to a basis for f t−1 (V ), etc. Then the matrix of f w.r.t. this basis is strictly upper triangular. Note To obtain a matrix which is strictly lower triangular, reverse the order of the basis.

94 Exercise

Linear Algebra

Chapter 5

Use the transpose principle to write 3 other versions of this theorem.

Theorem Suppose V is an n-dimensional vector space and f : V → V is a homomorphism. Then f is nilpotent iff CPf (x) = xn . (See the exercise at the end of Chapter 4.) Proof Suppose CPf (x) = xn . For n = 1 this implies f = 0, so suppose n > 1. ¯ Since the constant term of CPf (x) is 0, the determinant of f is 0. Thus ∃ a basis ¯ ¯ of V such that the matrix A representing f has its first column zero. Let B ∈ Fn−1 be the matrix obtained from A by removing its first row and first column. Now CPA (x) = xn = xCPB (x). Thus CPB (x) = xn−1 and by induction on n, B is nilpotent and so ∃ C such that C −1 BC is strictly upper triangular. Then
       

1 0 · ·0 0 · C −1 · 0

       

0 ∗ · ·∗    ·      B      ·  0

       

1 0 · ·0 0 ∗ · ·∗     0 0    =  · C −1 BC · C     ·   · 0 0

       

is strictly upper triangular.

Exercise  Suppose F is a field, A ∈ F_3 is a lower triangular matrix of rank 2, and B = (0 0 0 / 1 0 0 / 0 1 0). Using conjugation by elementary matrices, show there is an invertible matrix C so that C^{-1}AC = B. Now suppose V is a 3-dimensional vector space and f : V → V is a nilpotent endomorphism of rank 2. We know f can be represented by a lower triangular matrix. Show there is a basis {v1, v2, v3} for V so that B is the matrix representing f. Also show that f(v1) = v2, f(v2) = v3, and f(v3) = 0̄. In other words, there is a basis for V of the form {v, f(v), f^2(v)} with f^3(v) = 0̄.

Exercise  Suppose V is a 3-dimensional vector space and f : V → V is a nilpotent endomorphism of rank 1. Show there is a basis for V so that the matrix representing f is (0 0 0 / 1 0 0 / 0 0 0).

Eigenvalues

Our standing hypothesis is that V is an n-dimensional vector space over a field F and f : V → V is a homomorphism.

Definition  An element λ ∈ F is an eigenvalue of f if ∃ a non-zero v ∈ V with f(v) = λv. Any such v is called an eigenvector. Eλ ⊂ V is defined to be the set of all eigenvectors for λ (plus 0̄). Note that Eλ = ker(λI − f) is a subspace of V. The next theorem shows the eigenvalues of f are just the characteristic roots of f.

Theorem  If λ ∈ F, then the following are equivalent.
1)  λ is an eigenvalue of f, i.e., (λI − f) : V → V is not injective.
2)  | λI − f | = 0̄.
3)  λ is a characteristic root of f, i.e., a root of the characteristic polynomial CP_f(x) = | xI − A |, where A is any matrix representing f.

Proof  It is immediate that 1) and 2) are equivalent, so let's show 2) and 3) are equivalent. The evaluation map F[x] → F which sends h(x) to h(λ) is a ring homomorphism (see theorem on page 47). So evaluating (xI − A) at x = λ and taking determinant gives the same result as taking the determinant of (xI − A) and evaluating at x = λ. Thus 2) and 3) are equivalent.

The nicest thing you can say about a matrix is that it is similar to a diagonal matrix. Here is one case where that happens.

Theorem  Suppose λ1, .., λk are distinct eigenvalues of f, and vi is an eigenvector of λi for 1 ≤ i ≤ k. Then the following hold.
1)  {v1, .., vk} is independent.
2)  If k = n, i.e., if CP_f(x) = (x − λ1) ·· (x − λn), then {v1, .., vn} is a basis for V. The matrix of f w.r.t. this basis is the diagonal matrix whose (i, i) term is λi.

Proof  Suppose {v1, .., vk} is dependent. Suppose t is the smallest positive integer such that {v1, .., vt} is dependent, and v1r1 + ·· + vtrt = 0̄ is a non-trivial linear combination. Note that at least two of the coefficients must be non-zero. Now (f − λtI)(v1r1 + ·· + vtrt) = v1(λ1 − λt)r1 + ·· + v_{t-1}(λ_{t-1} − λt)r_{t-1} + 0̄ = 0̄ is a shorter


non-trivial linear combination. This is a contradiction and proves 1). Part 2) follows from 1) because dim(V) = n.

Exercise  Let A = (0 1 / −1 0) ∈ R_2. Find an invertible C ∈ C_2 such that C^{-1}AC is diagonal. Show that C cannot be selected in R_2. Find the characteristic polynomial of A.

Exercise  Suppose V is a 3-dimensional vector space and f : V → V is an endomorphism with CP_f(x) = (x − λ)^3. Show that (f − λI) has characteristic polynomial x^3 and is thus a nilpotent endomorphism. Show there is a basis for V so that the matrix representing f is (λ 0 0 / 1 λ 0 / 0 1 λ), (λ 0 0 / 1 λ 0 / 0 0 λ), or (λ 0 0 / 0 λ 0 / 0 0 λ).

We could continue and finally give an ad hoc proof of the Jordan canonical form, but in this chapter we prefer to press on to inner product spaces. The Jordan form will be developed in Chapter 6 as part of the general theory of finitely generated modules over Euclidean domains. The next section is included only as a convenient reference.

Jordan Canonical Form

This section should be just skimmed or omitted entirely. It is unnecessary for the rest of this chapter, and is not properly part of the flow of the chapter. The basic facts of Jordan form are summarized here simply for reference.

The statement that a square matrix B over a field F is a Jordan block means that ∃ λ ∈ F such that B is a lower triangular matrix of the form

    B = ( λ   0   ··  0 )
        ( 1   λ   ··  0 )
        ( ·   ·   ··  · )
        ( 0   ··  1   λ )

B gives a homomorphism g : F^m → F^m with g(em) = λem and g(ei) = e_{i+1} + λei for 1 ≤ i < m. Note that CP_B(x) = (x − λ)^m, and so λ is the only eigenvalue of B, and B satisfies its characteristic polynomial, i.e., CP_B(B) = 0̄.

Definition  A matrix D ∈ F_n is in Jordan form if ∃ Jordan blocks B1, .., Bt such that

    D = ( B1              )
        (     B2          )
        (         ··      )
        (             Bt  )

with all entries outside the blocks equal to 0̄. Suppose D is of this form and Bi ∈ F_{ni} has eigenvalue λi. Then n1 + ·· + nt = n and CP_D(x) = (x − λ1)^{n1} ·· (x − λt)^{nt}. D is a diagonal matrix iff each ni = 1, i.e., iff each Jordan block is a 1 × 1 matrix.

Exercise  Suppose D ∈ F_n is in Jordan form and has characteristic polynomial a0 + a1x + ·· + x^n. Show a0I + a1D + ·· + D^n = 0̄, i.e., show CP_D(D) = 0̄.

Theorem  If A ∈ F_n, the following are equivalent.
1)  ∃ an invertible C ∈ F_n such that C^{-1}AC is in Jordan form.
2)  ∃ λ1, .., λn ∈ F (not necessarily distinct) such that CP_A(x) = (x − λ1) ·· (x − λn). (In this case we say that all the eigenvalues of A belong to F.)

Theorem  Jordan form (when it exists) is unique. This means that if A and D are similar matrices in Jordan form, they have the same Jordan blocks, except possibly in different order.

Note that a diagonal matrix is a special case of Jordan form. Also note that we know one special case of this theorem, namely that if A has n distinct eigenvalues in F, then A is similar to a diagonal matrix. Later on it will be shown that if A is a symmetric real matrix, then A is similar to a diagonal matrix. The reader should use the transpose principle to write three other versions of the first theorem.

Let's look at the classical case A ∈ R_n. The complex numbers are algebraically closed. This means that CP_A(x) will factor completely in C[x], and thus ∃ C ∈ C_n with C^{-1}AC in Jordan form. C may be selected to be in R_n iff all the eigenvalues of A are real.

Exercise  Find all real matrices in Jordan form that have the following characteristic polynomials: x(x − 2), (x − 2)^2, (x − 2)(x − 3)(x − 4), (x − 2)(x − 3)^2, (x − 2)(x − 3)^3, (x − 2)^2(x − 3)^2.
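Jordan form can be computed symbolically with sympy. Note that sympy's jordan_form places the 1's above the diagonal, the transpose of the lower triangular convention used here. The matrix below is an arbitrary example with CP_A(x) = (x − 2)^2:

    from sympy import Matrix

    A = Matrix([[3, 1],
                [-1, 1]])
    P, J = A.jordan_form()        # A = P * J * P^{-1}
    print(J)                      # Matrix([[2, 1], [0, 2]]): one 2x2 Jordan block
    print(P * J * P.inv() == A)   # True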

Exercise  Suppose A ∈ F_n is in Jordan form. Show A is nilpotent iff A^n = 0̄ iff CP_A(x) = x^n. (Note how easy this is in Jordan form.)

Exercise (Cayley-Hamilton Theorem)  Suppose E is a field and A ∈ E_n. Assume the theorem that there is a field F containing E such that CP_A(x) factors completely in F[x]. Thus ∃ an invertible C ∈ F_n such that D = C^{-1}AC is in Jordan form. Use this to show CP_A(A) = 0̄.

Inner Product Spaces

The two most important fields for mathematics and science in general are the real numbers and the complex numbers. Finitely generated vector spaces over R or C support inner products and are thus geometric as well as algebraic objects. The theories for the real and complex cases are quite similar, and both could have been treated here. However, for simplicity, attention is restricted to the case F = R. In the remainder of this chapter, the power and elegance of linear algebra become transparent for all to see.

Definition  Suppose V is a real vector space. An inner product (or dot product) on V is a function V × V → R which sends (u, v) to u·v and satisfies
1)  (u1r1 + u2r2)·v = (u1·v)r1 + (u2·v)r2 and v·(u1r1 + u2r2) = (v·u1)r1 + (v·u2)r2, for all u1, u2, v ∈ V and r1, r2 ∈ R.
2)  u·v = v·u, for all u, v ∈ V.
3)  u·u ≥ 0, and u·u = 0 iff u = 0̄, for all u ∈ V.

Theorem  Suppose V has an inner product.
1)  If v ∈ V, f : V → R defined by f(u) = u·v is a homomorphism. Thus 0̄·v = 0.
2)  (Schwarz' inequality)  (u·v)^2 ≤ (u·u)(v·v).

Proof of 2)  Let a = √(v·v) and b = √(u·u). If a or b is 0, the result is obvious. Suppose neither a nor b is 0. Now 0 ≤ (ua ± vb)·(ua ± vb) = (u·u)a^2 ± 2ab(u·v) + (v·v)b^2 = b^2a^2 ± 2ab(u·v) + a^2b^2. Dividing by 2ab yields 0 ≤ ab ± (u·v), or | u·v | ≤ ab.

Theorem  Suppose V has an inner product. Define the norm or length of a vector v by ‖v‖ = √(v·v). The following properties hold.
1)  ‖v‖ = 0 iff v = 0̄.
2)  ‖vr‖ = ‖v‖ | r |.
3)  | u·v | ≤ ‖u‖ ‖v‖.  (Schwarz' inequality)
4)  ‖u + v‖ ≤ ‖u‖ + ‖v‖.  (The triangle inequality)

Proof of 4)  ‖u + v‖^2 = (u + v)·(u + v) = ‖u‖^2 + 2(u·v) + ‖v‖^2 ≤ ‖u‖^2 + 2‖u‖‖v‖ + ‖v‖^2 = (‖u‖ + ‖v‖)^2.

Definition  An Inner Product Space (IPS) is a real vector space with an inner product. A sequence {v1, .., vn} is orthogonal provided vi·vj = 0 when i ≠ j. The sequence is orthonormal if it is orthogonal and each vector has length 1, i.e., vi·vj = δ_{i,j} for 1 ≤ i, j ≤ n.

Theorem  Suppose V is an IPS. If S = {v1, .., vn} is an orthogonal sequence of non-zero vectors, then S is independent. Furthermore, {v1/‖v1‖, ···, vn/‖vn‖} is orthonormal.

Proof  Suppose v1r1 + ·· + vnrn = 0̄. Then 0 = (v1r1 + ·· + vnrn)·vi = ri(vi·vi), and thus ri = 0. Thus S is independent. The second statement is transparent.

Theorem  Suppose V is a real vector space with a basis S = {v1, .., vn}. Then there is a unique inner product on V which makes S an orthonormal basis. It is given by the formula (v1r1 + ·· + vnrn)·(v1s1 + ·· + vnsn) = r1s1 + ·· + rnsn.

Convention  R^n will be assumed to have the standard inner product defined by (r1, .., rn)·(s1, .., sn) = r1s1 + ·· + rnsn. S = {e1, .., en} will be called the canonical or standard orthonormal basis. The next theorem shows that this inner product has an amazing geometry.

Theorem  If u, v ∈ R^n, u·v = ‖u‖‖v‖ cos Θ, where Θ is the angle between u and v.

Proof  Let u = (r1, .., rn) and v = (s1, .., sn). By the law of cosines, ‖u − v‖^2 = ‖u‖^2 + ‖v‖^2 − 2‖u‖‖v‖ cos Θ. So (r1 − s1)^2 + ·· + (rn − sn)^2 = r1^2 + ·· + rn^2 + s1^2 + ·· + sn^2 − 2‖u‖‖v‖ cos Θ. Thus r1s1 + ·· + rnsn = ‖u‖‖v‖ cos Θ.

Exercise  This is a simple exercise to observe that hyperplanes in R^n are cosets. Suppose f : R^n → R is a non-zero homomorphism given by a matrix A = (a1, .., an) ∈ R_{1,n}. Then L = ker(f) is the set of all vectors perpendicular to A, i.e., the set of all solutions to a1x1 + ·· + anxn = 0. Now suppose b ∈ R and C = (c1, .., cn)^t ∈ R^n has f(C) = b. Then f^{-1}(b) is the set of all solutions to a1x1 + ·· + anxn = b, which is the coset L + C, and this is the set of all solutions to a1(x1 − c1) + ·· + an(xn − cn) = 0.

Theorem (Fourier series)  Suppose W is an IPS with an orthonormal basis {w1, .., wn}. Then if v ∈ W, v = w1(v·w1) + ·· + wn(v·wn).

Proof  v = w1r1 + ·· + wnrn, and v·wi = (w1r1 + ·· + wnrn)·wi = ri.

Theorem  Suppose W is an IPS, Y ⊂ W is a subspace with an orthonormal basis {w1, .., wk}, and v ∈ W − Y. Define the projection of v onto Y by p(v) = w1(v·w1) + ·· + wk(v·wk), and let w = v − p(v). Then (w·wi) = (v − w1(v·w1) − ·· − wk(v·wk))·wi = 0. Thus if w_{k+1} = w/‖w‖, then {w1, .., wk, w_{k+1}} is an orthonormal basis for the subspace generated by {w1, .., wk, v}.

Theorem (Gram-Schmidt)  Suppose W is an IPS with a basis {v1, .., vn}. Then W has an orthonormal basis {w1, .., wn}. Moreover, any orthonormal sequence in W extends to an orthonormal basis of W.

Proof  Let w1 = v1/‖v1‖. Suppose inductively that {w1, .., wk} is an orthonormal basis for Y, the subspace generated by {v1, .., vk}. (If {w1, .., wk, v_{k+1}} is already orthonormal, take w_{k+1} = v_{k+1}.) Let w = v_{k+1} − p(v_{k+1}), where p(v_{k+1}) is the projection of v_{k+1} onto Y, and let w_{k+1} = w/‖w‖. Then by the previous theorem, {w1, .., wk, w_{k+1}} is an orthonormal basis for the subspace generated by {v1, .., v_{k+1}}. In this manner an orthonormal basis for W is constructed.
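The proof is an algorithm and translates directly into code. A minimal Python sketch of Gram-Schmidt for R^n with the standard inner product (the input vectors are an arbitrary independent set):

    import numpy as np

    def gram_schmidt(vectors):
        # Orthonormalize a list of independent vectors, as in the proof:
        # subtract the projection onto the span so far, then normalize.
        basis = []
        for v in vectors:
            w = v - sum(u * np.dot(v, u) for u in basis)
            basis.append(w / np.linalg.norm(w))
        return np.array(basis)

    V = [np.array([1., 1., 0.]),
         np.array([1., 0., 1.]),
         np.array([0., 1., 1.])]
    W = gram_schmidt(V)
    print(np.round(W @ W.T, 10))   # the identity matrix: rows are orthonormal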

Now suppose W has dimension n and {w1, .., wk} is an orthonormal sequence in W. Since this sequence is independent, it extends to a basis {w1, .., wk, v_{k+1}, .., vn}, and the process above may be used to modify this to an orthonormal basis {w1, .., wn}.

Exercise  Let W = R^3 have the standard inner product, and let Y ⊂ W be the subspace generated by {w1, w2}, where w1 = (1, 0, 0) and w2 = (0, 1, 0). W is generated by the sequence {w1, w2, v}, where v = (1, 2, 3). Find w3, and show that for any t with 0 ≤ t ≤ 1, {w1, w2, (1 − t)v + tw3} is a basis for W. This is a key observation for a future exercise showing O(n) is a deformation retract of Gl_n(R).

Exercise  Let f : R^3 → R be the homomorphism defined by the matrix (2, 1, 3). Find an orthonormal basis for the kernel of f. Find the projection of (e1 + e2) onto ker(f). Find the angle between e1 + e2 and the plane ker(f).

Isometries

Suppose each of U and V is an IPS. A homomorphism f : U → V is said to be an isometry provided it is an isomorphism and, for any u1, u2 in U, (u1·u2)_U = (f(u1)·f(u2))_V. We now come to one of the definitive theorems in linear algebra. It is that, up to isometry, there is only one inner product space for each dimension.

Theorem  Suppose each of U and V is an n-dimensional IPS, {u1, .., un} is an orthonormal basis for U, and f : U → V is a homomorphism. Then f is an isometry iff {f(u1), .., f(un)} is an orthonormal sequence in V.

Proof  Isometries certainly preserve orthonormal sequences. So suppose S = {f(u1), .., f(un)} is an orthonormal sequence in V. Then S is independent, and thus S is a basis, and thus f is an isomorphism. It is easy to check that f preserves inner products.

i. un } for U and {v1 . In particular. 1) 2) 3) The columns of A form an orthonormal basis for Rn . and is called the orthogonal group...e. Theorem 1) 2) If A is orthogonal.102 Linear Algebra Chapter 5 Theorem Suppose each of U and V is an n-dimensional IPS. en } is the canonical orthonormal basis for Rn . Theorem Suppose A ∈ Rn and f : Rn → Rn is the homomorphism defined by f (B) = AB. vn } for V . linear algebra is not so much the study of vector spaces as it is the study of endomorphisms. and by the previous theorem. The rows of A form an orthonormal basis for Rn . Then the following are equivalent. Find a linear transformation h : R2 → R3 which gives an isometry from R2 to ker(f ). A is said to be orthogonal. Proof A left inverse of a matrix is also a right inverse (see the exercise on page 64). f is an isometry. Thus O(n) is a multiplicative subgroup of Gln (R). | A |= ±1. . and thus preserve angle and distance. AC is orthogonal. . and so certainly preserve volume. 1) and 3) are equivalent.1. AAt = I. Thus by the previous section. Exercise Let f : R3 → R be the homomorphism defined by the matrix (2. Proof There exist orthonormal bases {u1 .3). and f (ei ) is column i of A. U is isometric to Rn with its standard inner product. i. Isometries preserve inner product. Then ∃ an isometry f : U → V .. f is an isometry. Orthogonal Matrices As noted earlier.. Thus 1) and 2) are equivalent because each of them says A is invertible with A−1 = At . Definition If A ∈ Rn satisfies these three conditions.. . Now {e1 . If A is orthogonal.. . The set of all such A is denoted by O(n).. If A and C are orthogonal. We now wish to study isometries from Rn to Rn . We know from a theorem on page 90 that an endomorphism preserves volume iff its determinant is ±1. Now there exists a homomorphism f : U → V with f (ui ) = vi ..e. At A = I. A−1 is orthogonal.

Theorem
1)  If A is orthogonal, | A | = ±1.
2)  If A and C are orthogonal, AC is orthogonal. If A is orthogonal, A^{-1} is orthogonal. Thus O(n) is a multiplicative subgroup of Gl_n(R).
3)  Suppose A is orthogonal and f is defined by f(B) = AB. Then f preserves distances and angles. This means that if u, v ∈ R^n, ‖u − v‖ = ‖f(u) − f(v)‖, and the angle between u and v is equal to the angle between f(u) and f(v).

Proof  Part 1) follows from |A|^2 = |A| |A^t| = |I| = 1. Part 2) is immediate, because isometries clearly form a subgroup of the multiplicative group of all automorphisms. For part 3), assume f : R^n → R^n is an isometry. Then ‖u − v‖^2 = (u − v)·(u − v) = f(u − v)·f(u − v) = ‖f(u − v)‖^2 = ‖f(u) − f(v)‖^2. The proof that f preserves angles follows from u·v = ‖u‖‖v‖ cos Θ.

Exercise  Show that if A ∈ O(2) has | A | = 1, then A = (cos Θ  −sin Θ / sin Θ  cos Θ) for some number Θ.

Exercise (topology)  Let R_n ≈ R^{n·n} have its usual metric topology. This means a sequence of matrices {Ai} converges to A iff it converges coordinatewise. Show Gl_n(R) is an open subset and O(n) is closed and compact. Let h : Gl_n(R) → O(n) be defined by Gram-Schmidt. Show H : Gl_n(R) × [0, 1] → Gl_n(R) defined by H(A, t) = (1 − t)A + t·h(A) is a deformation retract of Gl_n(R) to O(n).

Diagonalization of Symmetric Matrices

We continue with the case F = R. Our goals are to prove that, if A is a symmetric matrix, all of its eigenvalues are real, and that ∃ an orthogonal matrix C such that C^{-1}AC is diagonal. As background, we first note that symmetric is the same as self-adjoint.

Definition  Suppose A ∈ R_n. A is said to be symmetric provided A^t = A. Note that any diagonal matrix is symmetric. A is said to be self-adjoint if (Au)·v = u·(Av) for all u, v ∈ R^n.

Theorem  Suppose A ∈ R_n and u, v ∈ R^n. Then (A^t u)·v = u·(Av).

Proof  Suppose y, z ∈ R^n. Then the dot product y·z is the matrix product y^t z. Thus (A^t u)·v = (u^t A)v = u^t(Av) = u·(Av).

The next theorem is just an exercise using the previous theorem.

Theorem  A is symmetric iff A is self-adjoint.

We know that eigenvectors belonging to distinct eigenvalues are linearly independent. For symmetric matrices, we show more, namely that they are perpendicular.

Theorem  Suppose A ∈ R_n is symmetric, λ1, λ2 ∈ R are distinct eigenvalues of A, and Au = λ1u and Av = λ2v. Then u·v = 0.

Proof  λ1(u·v) = (Au)·v = u·(Av) = λ2(u·v). Since λ1 ≠ λ2, u·v = 0.

Theorem  Suppose A ∈ R_n is symmetric. Then all the eigenvalues of A are real. That is, ∃ real numbers λ1, .., λn (not necessarily distinct) such that CP_A(x) = (x − λ1)(x − λ2) ·· (x − λn).

Proof  We know CP_A(x) factors into linears over C. If µ = a + bi is a complex number, its conjugate is defined by µ̄ = a − bi. If h : C → C is defined by h(µ) = µ̄, then h is a ring isomorphism which is the identity on R. If w = (a_{i,j}) is a complex matrix or vector, its conjugate is defined by w̄ = (ā_{i,j}). Since A ∈ R_n is a real symmetric matrix, Ā = A and A^t = A. Now suppose λ is a complex eigenvalue of A and v ∈ C^n is an eigenvector with Av = λv. Then λ(v̄^t v) = v̄^t(λv) = v̄^t(Av) = (v̄^t A)v = (Av̄)^t v = (λ̄v̄)^t v = λ̄(v̄^t v). Since v̄^t v is a positive real number, λ = λ̄, and λ ∈ R. Or you can define a complex inner product on C^n by (w·v) = w̄^t v. The proof then reads as λ̄(v·v) = (λv·v) = (Av·v) = (v·Av) = (v·λv) = λ(v·v).

Theorem  Suppose A ∈ R_n and C ∈ O(n). Then A is symmetric iff C^{-1}AC is symmetric.

Proof  Suppose A is symmetric. Then (C^{-1}AC)^t = C^t A^t (C^{-1})^t = C^{-1}AC, since C^{-1} = C^t.

Review  Suppose A ∈ R_n and f : R^n → R^n is defined by f(B) = AB. Then A represents f w.r.t. {e1, .., en}, the canonical orthonormal basis. Let S = {v1, .., vn} be another basis, and let C ∈ R_n be the matrix with vi as column i. Then C^{-1}AC is the matrix representing f w.r.t. S. Now S is an orthonormal basis iff C is an orthogonal matrix.

Summary  Representing f w.r.t. an orthonormal basis is the same as conjugating A by an orthogonal matrix.

The next theorem has geometric and physical implications, but for us, just the incredibility of it all will suffice.

Theorem If A ∈ R_n, then the following are equivalent.

1) A is symmetric.
2) ∃ an orthogonal C such that C^{-1}AC is diagonal.

Proof By the previous theorem, 2) ⇒ 1). Show 1) ⇒ 2). Suppose first that A is a symmetric 2 × 2 matrix. Let λ be an eigenvalue for A and {v1, v2} be an orthonormal basis for R² with Av1 = λv1. With respect to this basis, the transformation A is represented by

    ( λ  b )
    ( 0  d )

Since this matrix is symmetric, b = 0.

Now suppose by induction that the theorem is true for symmetric matrices in R_t for t < n, and suppose A is a symmetric n × n matrix. Denote by λ1, ..., λk the distinct eigenvalues of A, k ≤ n. Let v1, ..., vk be eigenvectors for λ1, ..., λk with each ‖vi‖ = 1. If k = n, the proof is immediate, because then there is a basis of eigenvectors of length 1, and they must form an orthonormal basis. So suppose k < n. They may be extended to an orthonormal basis v1, ..., vn. With respect to this basis, the transformation A is represented by

    ( λ1              )
    (    ·        (B) )
    (      ·          )
    (       λk        )
    ( (0)         (D) )

Since this is a symmetric matrix, B = 0 and D is a symmetric matrix of smaller size. By induction, ∃ an orthogonal C such that C^{-1}DC is diagonal. Thus conjugating by

    ( I  0 )
    ( 0  C )

makes the entire matrix diagonal.

This theorem is so basic we state it again in different terminology. If V is an IPS, a linear transformation f : V → V is said to be self-adjoint provided (u·f(v)) = (f(u)·v) for all u, v ∈ V.

Theorem If V is an n-dimensional IPS and f : V → V is a linear transformation, then the following are equivalent.

1) f is self-adjoint.
2) ∃ an orthonormal basis {v1, ..., vn} for V with each vi an eigenvector of f.
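The two theorems above can be watched in action numerically. Below is a minimal sketch in Python using numpy (our choice of tool, not the text's): for a symmetric matrix, np.linalg.eigh returns real eigenvalues and an orthogonal matrix C of orthonormal eigenvectors, so C^{-1}AC = C^t AC is diagonal.

```python
# Sketch (ours): orthogonal diagonalization of a symmetric matrix.
import numpy as np

A = np.array([[3., 1.], [1., 3.]])             # symmetric
lam, C = np.linalg.eigh(A)                     # real eigenvalues, C in O(2)
print(lam)                                     # [2. 4.]
print(np.allclose(C.T @ C, np.eye(2)))         # True: C is orthogonal
print(np.allclose(C.T @ A @ C, np.diag(lam)))  # True: C^{-1}AC is diagonal
```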

Exercise Let

    A = ( 2  1 )
        ( 1  2 )

Find an orthogonal C such that C^{-1}AC is diagonal. Do the same for

    A = ( 2  2 )
        ( 2  2 )

Exercise Suppose A, D ∈ R_n are symmetric. Under what conditions are A and D similar? Show that, if A and D are similar, ∃ an orthogonal C such that D = C^{-1}AC.

Exercise Suppose V is an n-dimensional real vector space. We know that V is isomorphic to R^n. Suppose f and g are isomorphisms from V to R^n and A is a subset of V. Show that f(A) is an open subset of R^n iff g(A) is an open subset of R^n. This shows that V, an algebraic object, has a god-given topology. Of course, if V has an inner product, it automatically has a metric, and this metric will determine that same topology. Finally, suppose V and W are finite-dimensional real vector spaces and h : V → W is a linear transformation. Show that h is continuous.

Exercise Define E : C_n → C_n by E(A) = e^A = I + A + (1/2!)A² + ···. This series converges and thus E is a well defined function. Furthermore, E(A^t) = E(A)^t and E(Ā) = E(A)‾, and if C is invertible, E(C^{-1}AC) = C^{-1}E(A)C. Since A and −A commute, I = E(0) = E(A − A) = E(A)E(−A), and thus E(A) is invertible with E(A)^{-1} = E(−A). If AB = BA, then E(A + B) = E(A)E(B). Now use the results of this section to prove the statements below. (For part 1), assume the Jordan form, i.e., assume any A ∈ C_n is similar to a lower triangular matrix.)

1) If A ∈ C_n, then |e^A| = e^{trace(A)}. Thus if A ∈ R_n, |e^A| = 1 iff trace(A) = 0.
2) If N ∈ R_n is symmetric, then e^N = I iff N = 0.
3) ∃ a non-zero matrix N ∈ R_2 with e^N = I.
4) If A ∈ R_n and A^t = −A, then e^A ∈ O(n).
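Parts 1) and 3) of the exponential exercise can be checked numerically. The sketch below uses scipy's matrix exponential (an editorial illustration, not part of the text); the non-zero N with e^N = I is the rotation generator 2π(0 −1; 1 0).

```python
# Sketch (ours): |e^A| = e^{trace(A)}, and a non-zero N in R_2 with e^N = I.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
print(np.isclose(np.linalg.det(expm(A)), np.exp(np.trace(A))))  # True

N = 2 * np.pi * np.array([[0., -1.], [1., 0.]])  # N != 0, N^t = -N
print(np.allclose(expm(N), np.eye(2)))           # True: e^N = I
```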

Chapter 6

Appendix

The five previous chapters were designed for a year undergraduate course in algebra. In this appendix, enough material is added to form a basic first year graduate course. Two of the main goals are to characterize finitely generated abelian groups and to prove the Jordan canonical form. The style is the same as before, i.e., everything is right down to the nub. The organization is mostly a linearly ordered sequence, except for the last two sections on determinants and dual spaces. These are independent sections added on at the end.

The basic theorem of this chapter is that if R is a Euclidean domain and M is a finitely generated R-module, then M is the sum of cyclic modules. (Suppose R is a commutative ring. An R-module M is said to be cyclic if it can be generated by one element, i.e., M ≈ R/I where I is an ideal of R.) Thus if M is torsion free, it is a free R-module. Since Z is a Euclidean domain, finitely generated abelian groups are the sums of cyclic groups. This classical and very powerful technique allows an easy proof of the canonical forms. Now suppose F is a field and V is a finitely generated F-module. If T : V → V is a linear transformation, then V becomes an F[x]-module by defining vx = T(v). Now F[x] is a Euclidean domain, and so V_{F[x]} is the sum of cyclic modules. There is a basis for V so that the matrix representing T is in Rational canonical form. If the characteristic polynomial of T factors into the product of linear polynomials, then there is a basis for V so that the matrix representing T is in Jordan canonical form. This always holds if F = C. A matrix in Jordan form is a lower triangular matrix with the eigenvalues of T displayed on the diagonal, so this is a powerful concept.

In the chapter on matrices, it is stated without proof that the determinant of the product is the product of the determinants. A proof of this, which depends upon the classification of certain types of alternating multilinear forms, is given in this chapter. The final section gives the fundamentals of dual spaces.

The Chinese Remainder Theorem

On page 50 in the chapter on rings, the Chinese Remainder Theorem was proved for the integers. Here it is presented in full generality. Surprisingly, the theorem holds even for non-commutative rings.

Definition Suppose R is a ring and A1, A2, ..., Am are ideals of R. Then the sum A1 + A2 + ··· + Am is the set of all a1 + a2 + ··· + am with ai ∈ Ai. The product A1A2···Am is the set of all finite sums of elements a1a2···am with ai ∈ Ai. Note that the sum and product of ideals are ideals, and A1A2···Am ⊂ (A1 ∩ A2 ∩ ··· ∩ Am).

Definition Ideals A and B of R are said to be comaximal if A + B = R.

Theorem If A and B are ideals of a ring R, then the following are equivalent.

1) A and B are comaximal.
2) ∃ a ∈ A and b ∈ B with a + b = 1.
3) π(A) = R/B, where π : R → R/B is the projection.

Theorem If A1, A2, ..., Am and B are ideals of R with Ai and B comaximal for each i, then A1A2···Am and B are comaximal. Thus A1 ∩ A2 ∩ ··· ∩ Am and B are comaximal.

Proof Consider π : R → R/B. Then π(A1A2···Am) = π(A1)π(A2)···π(Am) = (R/B)(R/B)···(R/B) = R/B.

Theorem If R is commutative and A1, A2, ..., An are pairwise comaximal ideals of R, then A1A2···An = A1 ∩ A2 ∩ ··· ∩ An.

Proof for n = 2 It must be shown that A1 ∩ A2 ⊂ A1A2. ∃ a1 ∈ A1 and a2 ∈ A2 with a1 + a2 = 1. If c ∈ A1 ∩ A2, then c = c(a1 + a2) ∈ A1A2.

Chinese Remainder Theorem Suppose A1, A2, ..., An are pairwise comaximal ideals of R, with each Ai ≠ R. Then π : R → R/A1 × R/A2 × ··· × R/An is a surjective ring homomorphism with kernel A1 ∩ A2 ∩ ··· ∩ An.

Proof There exist ai ∈ Ai and bi ∈ A1A2···A_{i−1}A_{i+1}···An with ai + bi = 1. Note that π(bi) = (0, ..., 0, 1_i, 0, ..., 0). If (r1 + A1, r2 + A2, ..., rn + An) is an element of the range, it is the image of r1b1 + r2b2 + ··· + rnbn = r1(1 − a1) + r2(1 − a2) + ··· + rn(1 − an).

Note To properly appreciate this proof, the student should work the exercise on group theory at the end of this section.
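For R = Z the theorem reduces to the classical statement about simultaneous congruences. Here is a sketch in Python with sympy (an editorial illustration; sympy's crt produces a preimage, much as the element r1b1 + ··· + rnbn does in the proof).

```python
# Sketch (ours): the integer CRT with the pairwise comaximal ideals
# 4Z, 9Z, 25Z.
from sympy.ntheory.modular import crt

x, m = crt([4, 9, 25], [3, 5, 7])   # x = 3 (mod 4), 5 (mod 9), 7 (mod 25)
print(x, m)                          # 707 900
assert x % 4 == 3 and x % 9 == 5 and x % 25 == 7
```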

Prime and Maximal Ideals and UFDs

In the first chapter on background material, it was shown that Z is a unique factorization domain. Here it will be shown that this property holds for any principal ideal domain. Later on it will be shown that every Euclidean domain is a principal ideal domain, and thus every Euclidean domain is a unique factorization domain.

Definition Suppose R is a commutative ring and I ⊂ R is an ideal. I is prime means I ≠ R and if a, b ∈ R have ab ∈ I, then a or b ∈ I. I is maximal means I ≠ R and there are no ideals properly between I and R.

Theorem 0 is a prime ideal of R iff R is a domain. 0 is a maximal ideal of R iff R is a field. More generally, suppose J ⊂ R is an ideal with J ≠ R. J is a prime ideal iff R/J is a domain. J is a maximal ideal iff R/J is a field.

Theorem Maximal ideals are prime.

Corollary Every field is a domain.

Theorem If a ∈ R is not a unit, then ∃ a maximal ideal I of R with a ∈ I.

Proof This is a classical application of the Hausdorff Maximality Principle. Consider {J : J is an ideal of R containing a with J ≠ R}. This collection contains aR, and it contains a maximal monotonic collection {Vt}t∈T. The ideal V = ∪_{t∈T} Vt does not contain 1 and thus is not equal to R. Therefore V is equal to some Vt and is a maximal ideal containing a.

Definition Suppose R is a domain and a, b ∈ R. Then we say a ∼ b iff there exists a unit u with au = b. Note that ∼ is an equivalence relation. If a ∼ b, then a and b are said to be associates.

Examples If R is a domain, the associates of 1 are the units of R, while the only associate of 0 is 0 itself. If n ∈ Z is not zero, then its associates are n and −n. If F is a field and g ∈ F[x] is a non-zero polynomial, then the associates of g are all cg where c is a non-zero constant.

Definition An element a divides b (a|b) if ∃ c ∈ R with ac = b. The following theorem is elementary, but it shows how associates fit into the scheme of things.

Theorem Suppose R is a domain and a, b ∈ (R − 0). Then the following are equivalent.

1) a ∼ b.
2) a|b and b|a.
3) aR = bR.

Proof This is immediate from the definitions.

Note Parts 1) and 3) above show there is a bijection from the associate classes of R to the principal ideals of R. Thus if R is a PID, there is a bijection from the associate classes of R to the ideals of R.

Definition Suppose R is a domain and a ∈ R is a non-zero non-unit.

1) a is irreducible if it does not factor, i.e., a = bc ⇒ b or c is a unit.
2) a is prime if it generates a prime ideal, i.e., a|bc ⇒ a|b or a|c. If an element generates a non-zero prime ideal, it is called a prime element.

Note If a ∼ b, then a is irreducible (prime) iff b is irreducible (prime). In other words, if a is irreducible (prime) and u is a unit, then au is irreducible (prime).

Note a is prime ⇒ a is irreducible.

Note If a is a prime and a|c1c2···cn, then a|ci for some i. This follows from the definition and induction on n. If each cj is irreducible, then a ∼ ci for some i.

Theorem Factorization into primes is unique up to order and associates, i.e., if a = b1b2···bn = c1c2···cm with each bi, ci prime, then n = m and for some permutation σ of the indices, bi and cσ(i) are associates for every i.

Proof This follows from the notes above.

Definition R is a factorization domain (FD) means that R is a domain and if a is a non-zero non-unit element of R, then a factors into a finite product of irreducibles. R is a unique factorization domain (UFD) means R is a FD and factorization is unique (up to order and associates).

Theorem If R is a UFD and a is a non-zero non-unit of R, then a is irreducible ⇔ a is prime. Thus in a UFD, elements factor as the product of primes.

Proof Suppose R is a UFD, a is an irreducible element of R, and a|bc. If either b or c is a unit or is zero, then a|b or a|c, so suppose each of b and c is a non-zero non-unit element of R. There exists an element d with ad = bc. Each of b and c factors as the product of irreducibles, and the product of these products is the factorization of bc. It follows from the uniqueness of the factorization of ad = bc that one of these irreducibles is an associate of a, so a divides one of them, and thus a|b or a|c.

Theorem Suppose R is a FD. Then the following are equivalent.

1) R is a UFD.
2) Every irreducible element is prime, i.e., a irreducible ⇔ a is prime.

Proof We already know 1) ⇒ 2). Part 2) ⇒ 1) because factorization into primes is always unique. Thus if R is a FD, then R is a UFD iff each irreducible element generates a prime ideal.

This is a revealing and useful theorem. Fortunately, principal ideal domains have this property, as seen in the next theorem.

Theorem Suppose R is a PID and a ∈ R is a non-zero non-unit. Then the following are equivalent.

1) aR is a maximal ideal.
2) aR is a prime ideal, i.e., a is a prime element.
3) a is irreducible.

Proof Every maximal ideal is a prime ideal, so 1) ⇒ 2). Every prime element is an irreducible element, so 2) ⇒ 3). Now suppose a is irreducible and show aR is a maximal ideal. If I is an ideal containing aR, ∃ b ∈ R with I = bR. Since b divides a, b is a unit or an associate of a. This means I = R or I = aR.

Our goal is to prove that a PID is a UFD. Using the two theorems above, it only remains to show that a PID is a FD. The proof will not require that ideals be principally generated, but only that they be finitely generated. This turns out to be equivalent to the property that any collection of ideals has a "maximal" element. We shall see below that this is a useful concept which fits naturally into the study of unique factorization domains.

Theorem Suppose R is a commutative ring. Then the following are equivalent.

1) If I ⊂ R is an ideal, ∃ a finite set {a1, a2, ..., an} ⊂ R such that I = a1R + a2R + ··· + anR, i.e., each ideal of R is finitely generated.

2) If I1 ⊂ I2 ⊂ I3 ⊂ ··· is a monotonic sequence of ideals, ∃ t0 ≥ 1 such that It = It0 for all t ≥ t0.

3) If {It}t∈T is a collection of ideals, ∃ t0 ∈ T such that if t is any element in T with It ⊃ It0, then It = It0. (The ideal It0 is maximal only in the sense described. It need not contain all the ideals of the collection, nor need it be a maximal ideal of the ring R.)

Definition If R satisfies these properties, R is said to be Noetherian, or it is said to satisfy the ascending chain condition. This is a useful property satisfied by many of the classical rings in mathematics. Having three definitions makes this property easy to use; see the next theorem.

Proof Suppose 1) is true and show 3). Take a maximal monotonic subcollection I1 ⊂ I2 ⊂ ··· of {It}t∈T. The ideal I = I1 ∪ I2 ∪ ··· is finitely generated, and ∃ t0 such that It0 contains those generators. Thus 3) is true. Now suppose 1) is false and I ⊂ R is an ideal not finitely generated. Then ∃ a sequence a1, a2, ... of elements in I with a1R + a2R + ··· + anR properly contained in a1R + a2R + ··· + a_{n+1}R for each n ≥ 1. Thus 3) is false, and 1) ⇔ 3). If 2) is false, ∃ a sequence of ideals I1 ⊂ I2 ⊂ ···, with each inclusion proper. Thus 3) is false, and so 2) ⇔ 3).

Theorem A Noetherian domain is a FD. In particular, a PID is a FD.

Proof Suppose there is a non-zero non-unit element that does not factor as the finite product of irreducibles. Consider all ideals dR where d does not factor. Then ∃ a maximal one, cR. The element c must be reducible, i.e., c = ab where neither a nor b is a unit. Each of aR and bR properly contains cR, and so each of a and b factors as

a finite product of irreducibles. This gives a finite factorization of c into irreducibles, which is a contradiction.

Corollary A PID is a UFD. So Z is a UFD, and if F is a field, F[x] is a UFD.

You see the basic structure of UFDs is quite easy. It takes more work to prove the following theorems, which are stated here only for reference.

Theorem Suppose R is a commutative ring. Then R is Noetherian ⇒ R[x1, ..., xn] and R[[x1, ..., xn]] are Noetherian. (This is the famous Hilbert Basis Theorem.)

Theorem If R is Noetherian and I ⊂ R is a proper ideal, then R/I is Noetherian. (This follows immediately from the definition.)

Note The combination of the last two theorems shows that Noetherian is a ubiquitous property which is satisfied by many of the basic rings in commutative algebra.

Theorem If R is a UFD, then R[x1, ..., xn] is a UFD. (This theorem goes all the way back to Gauss.) Thus if F is a field, F[x1, ..., xn] is a UFD.

Theorem If R is a PID, then the formal power series R[[x1, ..., xn]] is a UFD. Thus if F is a field, F[[x1, ..., xn]] is a UFD. (There is a UFD R where R[[x]] is not a UFD. See page 566 of Commutative Algebra by N. Bourbaki.)

Theorem Germs of analytic functions over C form a UFD.

Proof See Theorem 6.6.2 of An Introduction to Complex Analysis in Several Variables by L. Hörmander.

Next are presented two of the standard examples of Noetherian domains that are not unique factorization domains.

Exercise Let R = Z(√5) = {n + m√5 : n, m ∈ Z}. Show that R is a subring of the field R which is not a UFD. In particular, 2 · 2 = (1 − √5)(−1 − √5) are two distinct irreducible factorizations of 4.
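The arithmetic behind the two factorizations of 4 is a one-line check. Here is a sketch with sympy (an editorial illustration; it verifies the product only, not the irreducibility of the factors, which is part of the exercise).

```python
# Sketch (ours): (1 - sqrt 5)(-1 - sqrt 5) = 4 = 2 * 2 in Z(sqrt 5).
from sympy import sqrt, expand

print(expand((1 - sqrt(5)) * (-1 - sqrt(5))))   # 4
```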

Show R is isomorphic to Z[x]/(x² − 5), where (x² − 5) represents the ideal (x² − 5)Z[x], and that R/(2) is isomorphic to Z2[x]/(x² − [5]) = Z2[x]/(x² + [1]), which is not a domain.

Exercise Let R̄ = R[x, y, z]/(x² − yz). Show x² − yz is irreducible, and thus prime, in R[x, y, z]. If u ∈ R[x, y, z], let ū ∈ R̄ be the coset containing u. Show R̄/(x̄) is isomorphic to R[y, z]/(yz), which is not a domain. Show R̄ is not a UFD. In particular, x̄·x̄ = ȳ·z̄ are two distinct irreducible factorizations of x̄².

Exercise In Group Theory If G is an additive abelian group, a subgroup H of G is said to be maximal if H ≠ G and there are no subgroups properly between H and G. Show that H is maximal iff G/H ≈ Zp for some prime p. For simplicity, consider the case G = Q. Which one of the following is true?

1) If a ∈ Q, then there is a maximal subgroup H of Q which contains a.
2) Q contains no maximal subgroups.

Splitting Short Exact Sequences

Suppose B is an R-module and K is a submodule of B. As defined in the chapter on linear algebra, K is a summand of B provided ∃ a submodule L of B with K + L = B and K ∩ L = 0. In this case we write K ⊕ L = B. When is K a summand of B? It turns out that K is a summand of B iff there is a splitting map from B/K to B. In particular, if B/K is free, K must be a summand of B. This is used below to show that if R is a PID, then every submodule of R^n is free.

Theorem 1 Suppose B and C are R-modules, and g : B → C is a surjective homomorphism with kernel K. Then the following are equivalent.

1) K is a summand of B.
2) g has a right inverse, i.e., ∃ a homomorphism h : C → B with g ◦ h = I : C → C. (h is called a splitting map.)

Proof Suppose 1) is true, i.e., suppose ∃ a submodule L of B with K ⊕ L = B. Then (g|L) : L → C is an isomorphism. If i : L → B is inclusion, then h defined by h = i ◦ (g|L)^{-1} is a right inverse of g. Now suppose 2) is true and h : C → B is a right inverse of g. Then h is injective, K + h(C) = B and K ∩ h(C) = 0. Thus K ⊕ h(C) = B.

Definition Suppose f : A → B and g : B → C are R-module homomorphisms. The statement that 0 → A → B → C → 0 is a short exact sequence (s.e.s.) means f is injective, g is surjective and f(A) = ker(g). The canonical split s.e.s. is 0 → A → A ⊕ C → C → 0, where f = i1 is the injection a → (a, 0) and g = π2 is the projection (a, c) → c. A short exact sequence is said to split if ∃ an isomorphism B ≈ A ⊕ C such that the composition A → B ≈ A ⊕ C is i1 and the composition A ⊕ C ≈ B → C is g, i.e., the isomorphism carries the given sequence to the canonical split one.

We now restate the previous theorem in this terminology.

Theorem 1.1 A short exact sequence 0 → A → B → C → 0 splits iff f(A) is a summand of B, iff B → C has a splitting map.

Proof We know from the previous theorem that f(A) is a summand of B iff B → C has a splitting map. Showing these properties are equivalent to the splitting of the sequence is a good exercise in the art of diagram chasing.

Theorem If C is a free R-module, then there is a splitting map and thus the sequence splits.

Proof Suppose C has a free basis T ⊂ C. There exists a function h : T → B such that g ◦ h(c) = c for each c ∈ T. The function h extends to a homomorphism from C to B which is a right inverse of g.

Theorem 2 If R is a commutative ring, then the following are equivalent.

1) R is a PID.
2) Every submodule of R_R is a free R-module of dimension ≤ 1.

This theorem restates the ring property of PID as a module property. Although this theorem is transparent, it is a precursor to the following classical result.

Theorem 3 If R is a PID and A ⊂ R^n is a submodule, then A is a free R-module of dimension ≤ n. Thus subgroups of Z^n are free Z-modules of dimension ≤ n.

Proof From the previous theorem we know this is true for n = 1. Suppose n > 1 and the theorem is true for submodules of R^{n−1}. Suppose A ⊂ R^n is a submodule.

Consider the following short exact sequences, where f : R^{n−1} → R^{n−1} ⊕ R is inclusion and g = π : R^{n−1} ⊕ R → R is the projection.

    0 −→ R^{n−1} −→ R^{n−1} ⊕ R −→ R −→ 0
    0 −→ A ∩ R^{n−1} −→ A −→ π(A) −→ 0

If π(A) = 0, then A ⊂ R^{n−1}. If π(A) ≠ 0, it is free of dimension 1 and the sequence splits by Theorem 1. By induction, A ∩ R^{n−1} is free of dimension ≤ n − 1. In either case, A is a free submodule of dimension ≤ n.

Exercise Let A ⊂ Z² be the subgroup generated by {(6, 24), (16, 64)}. Show A is a free Z-module of dimension 1.

Euclidean Domains

The ring Z possesses the Euclidean algorithm, and the polynomial ring F[x] has the division algorithm. The concept of Euclidean domain is an abstraction of these properties. The axioms are so miniscule that it is surprising you get this much juice out of them. However they are exactly what you need, and it is possible to just play around with matrices and get some deep results. The theorem that if R is a Euclidean domain and M is a finitely generated R-module, then M is the sum of cyclic modules, is one of the great classical theorems of abstract algebra, and you don't have to worry about it becoming obsolete. Here N will denote the set of all non-negative integers, not just the set of positive integers.

Definition A domain R is a Euclidean domain provided ∃ φ : (R − 0) → N such that if a, b ∈ (R − 0), then

1) φ(a) ≤ φ(ab).
2) ∃ q, r ∈ R such that a = bq + r with r = 0 or φ(r) < φ(b).

Examples of Euclidean Domains

Z with φ(n) = |n|.
A field F with φ(a) = 1 ∀ a ≠ 0, or with φ(a) = 0 ∀ a ≠ 0.
F[x], where F is a field, with φ(f = a0 + a1x + ··· + anx^n) = deg(f).
Z[i] = {a + bi : a, b ∈ Z} = the Gaussian integers, with φ(a + bi) = a² + b².
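The division property 2) is exactly what divmod provides in Z and what polynomial division provides in F[x]. A small sketch in Python with sympy (our choice of tools, not the text's):

```python
# Sketch (ours): a = bq + r with r = 0 or phi(r) < phi(b),
# in Z (phi = absolute value) and in Q[x] (phi = degree).
from sympy import symbols, div

q, r = divmod(-38, 7)                  # -38 = 7*(-6) + 4, and |4| < |7|
print(q, r)                            # -6 4

x = symbols('x')
q, r = div(x**3 + x + 1, x**2 + 1, x)  # x^3 + x + 1 = (x^2 + 1)*x + 1
print(q, r)                            # x 1
```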

Theorem 1 If R is a Euclidean domain, then R is a PID and thus a UFD.

Proof If I is a non-zero ideal, then ∃ b ∈ I − 0 satisfying φ(b) ≤ φ(i) ∀ i ∈ I − 0. Then b generates I, because if a ∈ I − 0, ∃ q, r with a = bq + r. Now r ∈ I, and r ≠ 0 ⇒ φ(r) < φ(b), which is impossible. Thus r = 0 and a ∈ bR, so I = bR.

Theorem 2 If R is a Euclidean domain and a, b ∈ R − 0, then

1) φ(1) is the smallest integer in the image of φ.
2) a is a unit in R iff φ(a) = φ(1).
3) a and b are associates ⇒ φ(a) = φ(b).

Proof This is a good exercise.

The following remarkable theorem is the foundation for the results of this section.

Theorem 3 If R is a Euclidean domain and (a_{i,j}) ∈ R_{n,t} is a non-zero matrix, then by elementary row and column operations (a_{i,j}) can be transformed to

    ( d1               )
    (    d2        (0) )
    (       ·          )
    (        dm        )
    ( (0)          (0) )

where each di ≠ 0 and di | d_{i+1} for 1 ≤ i < m. Also d1 generates the ideal of R generated by the entries of (a_{i,j}).

Proof Let I ⊂ R be the ideal generated by the elements of the matrix A = (a_{i,j}). If E ∈ R_n, then the ideal J generated by the elements of EA has J ⊂ I. If E is invertible, then J = I. In the same manner, if E ∈ R_t is invertible and J is the ideal generated by the elements of AE, then J = I. This means that row and column operations on A do not change the ideal I. Since R is a PID, there is an element d1 with I = d1R, and this will turn out to be the d1 displayed in the theorem. The matrix (a_{i,j}) has at least one non-zero element d with φ(d) a minimum. However, row and column operations on (a_{i,j}) may produce elements with smaller

φ values. To consolidate this approach, consider matrices obtained from (a_{i,j}) by a finite number of row and column operations. Among these, let (b_{i,j}) be one which has an entry d1 ≠ 0 with φ(d1) a minimum. By elementary operations of type 2, the entry d1 may be moved to the (1, 1) place in the matrix. Then d1 will divide the other entries in the first row, else we could obtain an entry with a smaller φ value. Thus by column operations of type 3, the other entries of the first row may be made zero. In a similar manner, by row operations of type 3, the other entries of the first column may be made zero. Thus the matrix may be changed to the following form.

    ( d1  0  ···  0    )
    ( 0                )
    ( ·     c_{i,j}    )
    ( 0                )

Note that d1 divides each c_{i,j}, else we could obtain an entry with a smaller φ value. The proof now follows by induction on the size of the matrix.

This is an example of a theorem that is easy to prove playing around at the blackboard. Yet it must be a deep theorem, because the next two theorems are easy consequences.

Theorem 4 Suppose R is a Euclidean domain, B is a finitely generated free R-module and A ⊂ B is a non-zero submodule. Then ∃ free bases {a1, a2, ..., at} for A and {b1, b2, ..., bn} for B, with t ≤ n, and such that each ai = di bi, where each di ≠ 0 and di | d_{i+1} for 1 ≤ i < t. Thus B/A ≈ R/d1 ⊕ R/d2 ⊕ ··· ⊕ R/dt ⊕ R^{n−t}.

Proof By Theorem 3 in the section Splitting Short Exact Sequences, A has a free basis {h1, h2, ..., ht} with t ≤ n. Let {g1, g2, ..., gn} be a free basis for B. The composition

    R^t ≈ A ⊂ B ≈ R^n,    e_i → h_i,    g_i → e_i

is represented by a matrix (a_{i,j}) ∈ R_{n,t}, where h_i = a_{1,i}g1 + a_{2,i}g2 + ··· + a_{n,i}gn. By the previous theorem, ∃ invertible matrices U ∈ R_n and V ∈ R_t such that

    U(a_{i,j})V = ( d1               )
                  (    d2        (0) )
                  (       ·          )
                  (        dt        )
                  ( (0)          (0) )

with each di ≠ 0 and di | d_{i+1} for 1 ≤ i < t. Since changing the isomorphisms R^t → A and B → R^n corresponds to changing the bases {h1, h2, ..., ht} and {g1, g2, ..., gn}, the theorem follows.
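For R = Z, the diagonal form of Theorem 3 is the Smith normal form, and sympy computes it directly. The sketch below (an editorial illustration) applies it to the matrix of the earlier exercise, whose rows generate the subgroup of Z² spanned by (6, 24) and (16, 64).

```python
# Sketch (ours): Smith normal form over Z.
from sympy import Matrix, ZZ
from sympy.matrices.normalforms import smith_normal_form

A = Matrix([[6, 24], [16, 64]])
print(smith_normal_form(A, domain=ZZ))   # Matrix([[2, 0], [0, 0]])
# d1 = 2 generates the ideal (6, 24, 16, 64) = 2Z, and the zero row
# reflects that the subgroup is free of dimension 1.
```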

Theorem 5 If R is a Euclidean domain and M is a finitely generated R-module, then M ≈ R/d1 ⊕ R/d2 ⊕ ··· ⊕ R/dt ⊕ R^m, where each di ≠ 0 and di | d_{i+1} for 1 ≤ i < t.

Proof By hypothesis ∃ a finitely generated free module B and a surjective homomorphism B → M → 0. Let A be the kernel, so 0 → A → B → M → 0 is a s.e.s. and B/A ≈ M. The result now follows from the previous theorem.

The way Theorem 5 is stated, some or all of the elements di may be units, and for such di, R/di = 0. If we assume that no di is a unit, then the elements d1, d2, ..., dt are called invariant factors. They are unique up to associates, but we do not bother with that here. If R = Z and we select the di to be positive, then they are unique. If R = F[x] and we select the di to be monic, then they are unique.

The splitting in Theorem 5 is not the ultimate, because the modules R/di may be split into the sum of other cyclic modules. To prove this we need the following Lemma.

Lemma Suppose R is a PID and b and c are non-zero non-unit elements of R. Suppose b and c are relatively prime, i.e., there is no prime common to their prime factorizations. Then bR and cR are comaximal ideals.

Proof There exists an a ∈ R with aR = bR + cR. Since a|b and a|c, a is a unit. Thus R = bR + cR.

Theorem 6 Suppose R is a PID and d is a non-zero non-unit element of R. Let d = p1^{s1} p2^{s2} ··· pt^{st} be the prime factorization of d. Then the natural map R/d ≈ R/p1^{s1} ⊕ ··· ⊕ R/pt^{st} is an isomorphism of R-modules. (The elements pi^{si} are called elementary divisors of R/d.)

Proof If i ≠ j, pi^{si} and pj^{sj} are relatively prime. By the Lemma above, they are comaximal, and thus by the Chinese Remainder Theorem, the natural map is a ring isomorphism. Since the natural map is also an R-module homomorphism, it is an R-module isomorphism.
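Theorem 6 for R = Z and d = 12 = 2²·3 says Z/12 ≈ Z/4 ⊕ Z/3, and the natural map can be checked exhaustively (a sketch of ours):

```python
# Sketch (ours): n + 12Z -> (n + 4Z, n + 3Z) is a bijection.
pairs = {(n % 4, n % 3) for n in range(12)}
print(len(pairs) == 12)   # True: injective on 12 elements, hence bijective
```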

This theorem carries the splitting as far as it can go, as seen by the next exercise.

Exercise Suppose R is a PID, p ∈ R is a prime element, and s ≥ 1. Then the R-module R/p^s has no proper submodule which is a summand.

To give perspective to this section, here is a brief discussion of torsion submodules.

Definition Suppose M is a module over a domain R. An element m ∈ M is said to be a torsion element if ∃ r ∈ R with r ≠ 0 and mr = 0. This is the same as saying m is dependent. If R = Z, it is the same as saying m has finite order. Denote by T(M) the set of all torsion elements of M. If T(M) = 0, we say that M is torsion free.

Theorem 7 Suppose M is a module over a domain R. Then T(M) is a submodule of M and M/T(M) is torsion free.

Proof This is a simple exercise.

Theorem 8 Suppose R is a Euclidean domain and M is a finitely generated R-module which is torsion free. Then M is a free R-module, i.e., M ≈ R^m.

Proof This follows immediately from Theorem 5.

Theorem 9 Suppose R is a Euclidean domain and M is a finitely generated R-module. Then the following s.e.s. splits.

    0 −→ T(M) −→ M −→ M/T(M) −→ 0

Proof By Theorem 7, M/T(M) is torsion free. By Theorem 8, it is a free R-module, and thus there is a splitting map. Of course this theorem is transparent anyway, because Theorem 5 gives a splitting of M into a torsion part and a free part.

Note It follows from Theorem 9 that ∃ a free submodule V of M such that T(M) ⊕ V = M. The first summand T(M) is unique, but the complementary summand V is not unique. V depends upon the splitting map and is unique only up to isomorphism.

To complete this section, here are two more theorems that follow from the work we have done.

Theorem 10 Suppose T is a domain and T* is the multiplicative group of units of T. If G is a finite subgroup of T*, then G is a cyclic group. Thus if F is a finite field, the multiplicative group F* is cyclic. Thus if p is a prime, (Zp)* is cyclic.

Proof The multiplicative group G is isomorphic to an additive group Z/d1 ⊕ Z/d2 ⊕ ··· ⊕ Z/dt, where each di > 1 and di | d_{i+1} for 1 ≤ i < t. Every g in the additive group has the property that gdt = 0. So every u ∈ G is a solution to x^{dt} − 1 = 0. If t > 1, the equation will have degree less than the number of roots, which is impossible. Thus t = 1 and G is cyclic.

Exercise For which primes p and q is the group of units (Zp × Zq)* a cyclic group?
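Theorem 10 can be seen concretely for p = 7 (an illustration of ours): (Z_7)* is cyclic, and 3 turns out to be a generator.

```python
# Sketch (ours): the powers of 3 exhaust the units mod 7.
powers = {pow(3, k, 7) for k in range(1, 7)}
print(powers == {1, 2, 3, 4, 5, 6})   # True: 3 generates (Z_7)*
```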

We know from Exercise 2) on page 59 that an invertible matrix over a field is the product of elementary matrices. This result also holds for any invertible matrix over a Euclidean domain.

Theorem 11 Suppose R is a Euclidean domain and A ∈ R_n is a matrix with non-zero determinant. Then by elementary row and column operations, A may be transformed to a diagonal matrix

    ( d1             )
    (    d2      (0) )
    (        ·       )
    ( (0)         dn )

where each di ≠ 0 and di | d_{i+1} for 1 ≤ i < n. Also d1 generates the ideal generated by the entries of A. Furthermore, A is invertible iff each di is a unit. Thus if A is invertible, A is the product of elementary matrices.

Proof It follows from Theorem 3 that A may be transformed to a diagonal matrix with di | d_{i+1}. Since the determinant of A is not zero, it follows that each di ≠ 0. Furthermore, the matrix A is invertible iff the diagonal matrix is invertible, which is true iff each di is a unit. If each di is a unit, then the diagonal matrix is the product of elementary matrices of type 1. Therefore if A is invertible, it is the product of elementary matrices.

Exercise Let R = Z,

    A = ( 3  11 )    and    D = ( 3  11 )
        ( 0   4 )              ( 1   4 )

Perform elementary operations on A and D to obtain diagonal matrices where the first diagonal element divides the second diagonal element. Write D as the product of elementary matrices. Find the characteristic polynomials of A and D. Find an elementary matrix B over Z such that B^{-1}AB is diagonal. Find an invertible matrix C in R_2 such that C^{-1}DC is diagonal. Show C cannot be selected in Q_2.

Jordan Blocks

In this section, we define the two special types of square matrices used in the Rational and Jordan canonical forms. Note that the Jordan block B(q) is the sum of a scalar matrix and a nilpotent matrix. A Jordan block displays its eigenvalue on the diagonal, and is more interesting than the companion matrix C(q). But as we shall see later, the Rational canonical form will always exist, while the Jordan canonical form will exist iff the characteristic polynomial factors as the product of linear polynomials.

Suppose R is a commutative ring, q = a0 + a1x + ··· + a_{n−1}x^{n−1} + x^n ∈ R[x] is a monic polynomial of degree n ≥ 1, and V is the R[x]-module V = R[x]/q. V is a torsion module over the ring R[x], but as an R-module, V has a free basis {1, x, x², ..., x^{n−1}}. (See the division algorithm in the chapter on rings.) Multiplication by x defines an R-module endomorphism on V, and C(q) will be the matrix of this endomorphism with respect to this basis. Let T : V → V be defined by T(v) = vx. If h(x) ∈ R[x], h(T) is the R-module homomorphism given by multiplication by h(x), and h(T) is the zero homomorphism iff h(x) ∈ qR[x]. That is to say, q(T) = a0I + a1T + ··· + a_{n−1}T^{n−1} + T^n is the zero homomorphism.

Theorem Let V have the free basis {1, x, x², ..., x^{n−1}}. Then the matrix representing T is the companion matrix

  .  . reverse the order of the basis for V .. .. . see the section in Chapter 5. . (x − λ)... . 0 0 ... .. (x − λ)2 . . .. and |C(q)| = (−1)n a0 ... 0 . . . .. Theorem Suppose λ ∈ R and q(x) = (x − λ)n . if h(x) ∈ R[x]. Note For n = 1. 0   . Then the matrix representing T is    B(q) =  0   .. .. . Finally. ... . This is the only case where a block matrix may be the zero matrix. Note In B(q).. if h(x) ∈ R[x].  . h(B(q)) is zero iff h(x) ∈ qR[x].. .Chapter 6 Appendix 123 representing T is      C(q) =     0 . 1 −an−1          The characteristic polynomial of C(q) is q.. Let V have the free basis {1.. all that’s left to do is to put the pieces together. (x − λ)n−1 }. . λ  1  0 .. and |B(q)| = λn = (−1)n a0 .. .  1 λ .. 1 λ 0 λ  The characteristic polynomial of B(q) is q. . (For an overview of Jordan form.. . 0 −a1 0 1 0 −a2 .  . . .. . if you wish to have the 1s above the diagonal. Using the previous sections. h(C(q)) is zero iff h(x) ∈ qR[x]. 0 −a0 1 0 . Jordan Canonical Form We are finally ready to prove the Rational and Jordan forms. Finally. . C(a0 + x) = B(a0 + x) = (−a0 ).) .  .

Jordan Canonical Form

We are finally ready to prove the Rational and Jordan forms. Using the previous sections, all that's left to do is to put the pieces together. (For an overview of Jordan form, see the section in Chapter 5.)

Suppose R is a commutative ring, V is an R-module, and T : V → V is an R-module homomorphism. Define a scalar multiplication V × R[x] → V by v(a0 + a1x + ··· + arx^r) = va0 + T(v)a1 + ··· + T^r(v)ar.

Theorem 1 Under this scalar multiplication, V is an R[x]-module.

This is just an observation, but it is one of the great tricks in mathematics. Questions about the transformation T are transferred to questions about the module V over the ring R[x]. And in the case R is a field, R[x] is a Euclidean domain, and so we know almost everything about V as an R[x]-module. We know V_{F[x]} is the sum of cyclic modules from Theorems 5 and 6 in the section on Euclidean Domains. In the section on Jordan Blocks, a basis is selected for these cyclic modules and the matrix representing T is described. This gives the Rational Canonical Form, and that is all there is to it. If all the eigenvalues for T are in F, we pick another basis for each of the cyclic modules (see the second theorem in the section on Jordan Blocks). Then the matrix representing T is called the Jordan Canonical Form. Now we say all this again with a little more detail.

From here on, we suppose R is a field F, T : V → V is a linear transformation, and V is an F[x]-module with vx = T(v). A submodule of V_{F[x]} is a submodule of V_F which is invariant under T. Our goal is to select a basis for V such that the matrix representing T is in some simple form. Since V is finitely generated as an F-module, V is a finitely generated F[x]-module, and the free part of its decomposition will be zero. From Theorem 5 in the section on Euclidean Domains, it follows that V_{F[x]} ≈ F[x]/d1 ⊕ F[x]/d2 ⊕ ··· ⊕ F[x]/dt, where each di is a monic polynomial of degree ≥ 1, and di | d_{i+1}. Pick {1, x, x², ..., x^{m−1}} as the F-basis for F[x]/di, where m is the degree of the polynomial di.

Theorem 2 With respect to this basis, the matrix representing T is

    ( C(d1)                  )
    (        C(d2)       (0) )
    (               ·        )
    ( (0)            C(dt)   )

The characteristic polynomial of T is p = d1d2···dt, and p(T) = 0. This is a type of canonical form, but it does not seem to have a name.

Now we apply Theorem 6 to each F[x]/di. This gives V_{F[x]} ≈ F[x]/p1^{s1} ⊕ ··· ⊕ F[x]/pr^{sr}, where the pi are irreducible monic polynomials of degree at least 1. The pi need not be distinct. Pick an F-basis for each F[x]/pi^{si} as before.

Theorem 3 With respect to this basis, the matrix representing T is

    ( C(p1^{s1})                     )
    (          C(p2^{s2})        (0) )
    (                     ·          )
    ( (0)              C(pr^{sr})    )

The characteristic polynomial of T is p = p1^{s1} ··· pr^{sr}, and p(T) = 0. This is called the Rational canonical form for T.

Now suppose the characteristic polynomial of T factors in F[x] as the product of linear polynomials. Thus in the theorem above, pi = x − λi, and V_{F[x]} ≈ F[x]/(x − λ1)^{s1} ⊕ ··· ⊕ F[x]/(x − λr)^{sr} is an isomorphism of F[x]-modules. Pick {1, (x − λi), (x − λi)², ..., (x − λi)^{m−1}} as the F-basis for F[x]/(x − λi)^{si}, where m is si.

Theorem 4 With respect to this basis, the matrix representing T is

    ( B((x − λ1)^{s1})                          )
    (            B((x − λ2)^{s2})           (0) )
    (                              ·            )
    ( (0)                  B((x − λr)^{sr})     )

The characteristic polynomial of T is p = (x − λ1)^{s1} ··· (x − λr)^{sr}, and p(T) = 0. This is called the Jordan canonical form for T. Note that the λi need not be distinct.

Note A diagonal matrix is in Rational canonical form and in Jordan canonical form. This is the case where each block is one by one. Of course a diagonal matrix is about as canonical as you can get.

Exercise This section is loosely written, so it is important to use the transpose principle to write three other versions of the last two theorems.

Suppose V is an n-dimensional vector space over a field F and T : V → V is a linear transformation. As before, we make V a module over F[x] with vx = T(v). Let p ∈ F[x] be the characteristic polynomial of T. The polynomial p may not factor into linears in F[x], and thus T may have no conjugate in F_n which is in Jordan form. However, the next exercise can still be worked using Jordan form. This is based on the fact that there exists a field F̄ containing F as a subfield, such that p factors into linears in F̄[x]. This fact is not proved in this book, but it is assumed for this exercise.

Exercise Suppose F is a field of characteristic 0 and T ∈ F_n has trace(T^i) = 0 for 0 < i ≤ n. Show T is nilpotent. By the above, ∃ an invertible matrix U ∈ F̄_n so that U^{-1}TU is in Jordan form, and of course, T is nilpotent iff U^{-1}TU is nilpotent. The point is that it suffices to consider the case where T is in Jordan form, and to show the diagonal elements are all zero. So suppose T is in Jordan form and trace(T^i) = 0 for 1 ≤ i ≤ n. We know p(T) = 0 and thus trace(p(T)) = 0. Since the traces of the powers of T vanish, trace(p(T)) = a0 n, where a0 is the constant term of p(x), and thus a0 n = 0. Since the field has characteristic 0, a0 = 0, and so 0 is an eigenvalue of T. This means that one block of T is a strictly lower triangular matrix. Removing this block leaves a smaller matrix which still satisfies the hypothesis, and the result follows by induction on the size of T. This exercise illustrates the power and facility of Jordan form. It also has a cute corollary.

Corollary Suppose F is a field of characteristic 0, n ≥ 1, and (λ1, λ2, ..., λn) ∈ F^n satisfies λ1^i + λ2^i + ··· + λn^i = 0 for each 1 ≤ i ≤ n. Then λi = 0 for 1 ≤ i ≤ n. (Apply the exercise to the diagonal matrix with diagonal (λ1, ..., λn): it is nilpotent, and a nilpotent diagonal matrix is zero.)
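sympy will compute a Jordan form directly, which makes small examples easy to explore (our illustration; note that sympy places the 1's above the diagonal, the convention obtained by reversing the basis order, as remarked in the section on Jordan Blocks).

```python
# Sketch (ours): Jordan form of a matrix with characteristic
# polynomial (x - 2)^3 but a 2x2 and a 1x1 block.
from sympy import Matrix

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 2]])
U, J = A.jordan_form()    # U^{-1} A U = J
print(J)                  # one 1x1 and one 2x2 block for the eigenvalue 2
```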

To conclude this section, here are a few comments on the minimal polynomial of a linear transformation. This part should be studied only if you need it.

Definition Ann(V_{F[x]}) is the set of all h ∈ F[x] which annihilate V, i.e., which satisfy V h = 0. This is a non-zero ideal of F[x] and is thus generated by a unique monic polynomial u(x) ∈ F[x], Ann(V_{F[x]}) = uF[x]. The polynomial u is called the minimal polynomial of T. Note that u(T) = 0, and if h(x) ∈ F[x], h(T) = 0 iff h is a multiple of u in F[x]. If p(x) ∈ F[x] is the characteristic polynomial of T, p(T) = 0, and thus p is a multiple of u.

Now we state this again in terms of matrices. Suppose A ∈ F_n is a matrix representing T. Then u(A) = 0, and if h(x) ∈ F[x], h(A) = 0 iff h is a multiple of u in F[x]. If p(x) ∈ F[x] is the characteristic polynomial of A, then p(A) = 0, and thus p is a multiple of u. The polynomial u is also called the minimal polynomial of A. Note that these properties hold for any matrix representing T, and thus similar matrices have the same minimal polynomial. If A is given to start with, use the linear transformation T : F^n → F^n determined by A to define the polynomial u.

Now suppose q ∈ F[x] is a monic polynomial and C(q) ∈ F_n is the companion matrix defined in the section Jordan Blocks. Whenever q(x) = (x − λ)^n, let B(q) ∈ F_n be the Jordan block matrix also defined in that section. Recall that q is the characteristic polynomial and the minimal polynomial of each of these matrices. This, together with the Rational form and the Jordan form, will allow us to understand the relation of the minimal polynomial to the characteristic polynomial.

Exercise Suppose A ∈ F_n.

1) Suppose A is the matrix displayed in Theorem 2 above. Find the characteristic and minimal polynomials of A.
2) Suppose A is the matrix displayed in Theorem 3 above. Find the characteristic and minimal polynomials of A.
3) Suppose A is the matrix displayed in Theorem 4 above. Find the characteristic and minimal polynomials of A.
4) Suppose λ ∈ F. Show λ is a root of the characteristic polynomial of A iff λ is a root of the minimal polynomial of A. Show that if λ is a root, its order in the characteristic polynomial is at least as large as its order in the minimal polynomial.
5) Suppose F̄ is a field containing F as a subfield. Show that the minimal polynomial of A ∈ F_n is the same as the minimal polynomial of A considered as a matrix in F̄_n. (This funny looking exercise is a little delicate.)
6) Let F = R and

    A = (  5  −1   3 )
        (  2   0   0 )
        ( −3   1  −1 )

Find the characteristic and minimal polynomials of A.

Exercise Suppose each Ai ∈ F_{ni} has qi as its characteristic polynomial and its minimal polynomial, and A is the block diagonal matrix

    ( A1              )
    (     A2      (0) )
    (          ·      )
    ( (0)          An )

Find the characteristic polynomial and the minimal polynomial of A.
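Here is a small computed instance of the distinction (an illustration of ours, independent of the exercises above): a matrix whose characteristic polynomial is (x − 2)³ while its minimal polynomial is only (x − 2)².

```python
# Sketch (ours): characteristic vs minimal polynomial.
from sympy import Matrix, eye, zeros, symbols

x = symbols('x')
A = Matrix([[2, 0, 0],
            [1, 2, 0],
            [0, 0, 2]])                      # blocks B((x-2)^2) and B(x-2)
print(A.charpoly(x).as_expr())               # x**3 - 6*x**2 + 12*x - 8
print((A - 2*eye(3)) == zeros(3, 3))         # False: x - 2 does not kill A
print((A - 2*eye(3))**2 == zeros(3, 3))      # True: minimal poly is (x-2)^2
```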

Determinants

In the chapter on matrices, it is stated without proof that the determinant of the product is the product of the determinants (see page 63). The purpose of this section is to give a proof of this. We suppose R is a commutative ring, C is an R-module, n ≥ 2, and B1, B2, ..., Bn is a sequence of R-modules.

Definition A map f : B1 ⊕ B2 ⊕ ··· ⊕ Bn → C is R-multilinear means that if 1 ≤ i ≤ n and bj ∈ Bj for j ≠ i, then f | (b1, ..., Bi, ..., bn) defines an R-linear map from Bi to C, i.e., f is R-linear in each coordinate when the other coordinates are held fixed.

Theorem The set of all R-multilinear maps is an R-module.

Proof From the first exercise in Chapter 5, the set of all functions from B1 ⊕ B2 ⊕ ··· ⊕ Bn to C is an R-module (see page 69). It must be seen that the R-multilinear maps form a submodule. It is easy to see that if f1 and f2 are R-multilinear, so is f1 + f2. Also, if f is R-multilinear and r ∈ R, then (f r) is R-multilinear.

From here on, suppose B1 = B2 = ··· = Bn = B.

Definition

1) f is symmetric means f(b1, ..., bn) = f(b_{τ(1)}, ..., b_{τ(n)}) for all permutations τ on {1, 2, ..., n}.

2) f is skew-symmetric if f(b1, ..., bn) = sign(τ) f(b_{τ(1)}, ..., b_{τ(n)}) for all τ.

3) f is alternating if f(b1, ..., bn) = 0 whenever some bi = bj for i ≠ j.

Theorem

i) Each of these three types defines a submodule of the set of all R-multilinear maps.
ii) Alternating ⇒ skew-symmetric.
iii) If no element of C has order 2, then alternating ⇐⇒ skew-symmetric.

Proof Part i) is immediate. To prove ii), assume f is alternating. It suffices to show that f(b1, ..., bn) = −f(b_{τ(1)}, ..., b_{τ(n)}) where τ is a transposition. For simplicity, assume τ = (1, 2). Then 0 = f(b1 + b2, b1 + b2, b3, ..., bn) = f(b1, b2, b3, ..., bn) + f(b2, b1, b3, ..., bn), and the result follows. To prove iii), suppose f is skew-symmetric and no element of C has order 2, and show f is alternating. Suppose for convenience that b1 = b2 and show f(b1, b1, b3, ..., bn) = 0. If we let τ be the transposition (1, 2), we get f(b1, b1, b3, ..., bn) = −f(b1, b1, b3, ..., bn), so 2f(b1, b1, b3, ..., bn) = 0, and the result follows.

Now we are ready for determinant. Suppose C = R. In this case multilinear maps are usually called multilinear forms. Suppose B is R^n with the canonical basis {e1, e2, ..., en}. (We think of a matrix A ∈ R_n as n column vectors, i.e., as an element of B ⊕ B ⊕ ··· ⊕ B.) First we recall the definition of determinant. Define d : B ⊕ B ⊕ ··· ⊕ B → R by d(a_{1,1}e1 + a_{2,1}e2 + ··· + a_{n,1}en, ..., a_{1,n}e1 + a_{2,n}e2 + ··· + a_{n,n}en) = Σ_{all τ} sign(τ)(a_{τ(1),1} a_{τ(2),2} ··· a_{τ(n),n}) = |A|.

The next theorem is from the section on determinants in Chapter 4.

Theorem d is an alternating multilinear form with d(e1, e2, ..., en) = 1.
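The defining sum over permutations can be coded directly; the sketch below (ours, with sympy supplying sign(τ)) agrees with the built-in determinant.

```python
# Sketch (ours): d(A) = sum over tau of sign(tau) a_{tau(1),1}...a_{tau(n),n}.
from itertools import permutations
from sympy import Matrix
from sympy.combinatorics import Permutation

def det_leibniz(A, n):
    total = 0
    for tau in permutations(range(n)):
        term = Permutation(list(tau)).signature()   # sign(tau)
        for j in range(n):
            term *= A[tau[j], j]
        total += term
    return total

A = Matrix([[1, 2, 3], [0, 4, 5], [1, 0, 6]])
print(det_leibniz(A, 3), A.det())   # 22 22
```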

Note If c ∈ R, then dc is an alternating multilinear form, because the set of alternating forms is an R-module. It turns out that this is all of them, as seen by the following theorem.

Theorem Suppose f : B ⊕ B ⊕ ··· ⊕ B → R is an alternating multilinear form. Then f = d f(e1, e2, ..., en). This means f is the multilinear form d times the scalar f(e1, e2, ..., en). In other words, if A = (a_{i,j}) ∈ R_n, then f(a_{1,1}e1 + ··· + a_{n,1}en, ..., a_{1,n}e1 + ··· + a_{n,n}en) = |A| f(e1, e2, ..., en). Thus the set of alternating forms is a free R-module of dimension 1, and the determinant is a generator.

Proof For n = 2, f(a_{1,1}e1 + a_{2,1}e2, a_{1,2}e1 + a_{2,2}e2) = a_{1,1}a_{1,2} f(e1, e1) + a_{2,1}a_{1,2} f(e2, e1) + a_{1,1}a_{2,2} f(e1, e2) + a_{2,1}a_{2,2} f(e2, e2) = (a_{1,1}a_{2,2} − a_{2,1}a_{1,2}) f(e1, e2). For the general case, you can simply write it out: f(a_{1,1}e1 + ··· + a_{n,1}en, ..., a_{1,n}e1 + ··· + a_{n,n}en) = Σ a_{i1,1} a_{i2,2} ··· a_{in,n} f(e_{i1}, e_{i2}, ..., e_{in}), where the sum is over all 1 ≤ i1 ≤ n, 1 ≤ i2 ≤ n, ..., 1 ≤ in ≤ n. However, if any i_s = i_t for s ≠ t, that term is 0 because f is alternating. Therefore the sum is just Σ_{all τ} a_{τ(1),1} a_{τ(2),2} ··· a_{τ(n),n} f(e_{τ(1)}, e_{τ(2)}, ..., e_{τ(n)}) = Σ_{all τ} sign(τ) a_{τ(1),1} a_{τ(2),2} ··· a_{τ(n),n} f(e1, e2, ..., en) = |A| f(e1, e2, ..., en).

This incredible classification of these alternating forms makes the proof of the following theorem easy.

Theorem If C, A ∈ R_n, then |CA| = |C||A|.

Proof Suppose C ∈ R_n. Define f : R_n → R by f(A) = |CA|. In the notation of the previous theorem, B = R^n and R_n = R^n ⊕ R^n ⊕ ··· ⊕ R^n. If A ∈ R_n, then A = (A1, A2, ..., An), where Ai ∈ R^n is column i of A, and f : R^n ⊕ R^n ⊕ ··· ⊕ R^n → R has f(A1, A2, ..., An) = |CA|. Use the fact that CA = (CA1, CA2, ..., CAn) to show that f is an alternating multilinear form. By the previous theorem, f(A) = |A| f(e1, e2, ..., en). Since f(e1, e2, ..., en) = |CI| = |C|, it follows that |CA| = f(A) = |A||C|. (See the third theorem on page 63.)
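A quick numerical spot-check of |CA| = |C||A| (an illustration of ours in numpy; floating point, so we compare up to rounding):

```python
# Sketch (ours): the determinant is multiplicative.
import numpy as np

rng = np.random.default_rng(2)
C = rng.standard_normal((4, 4))
A = rng.standard_normal((4, 4))
print(np.isclose(np.linalg.det(C @ A),
                 np.linalg.det(C) * np.linalg.det(A)))   # True
```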

Dual Spaces

The concept of dual module is basic, not only in algebra, but also in other areas such as differential geometry and topology. If V is a finitely generated vector space over a field F, its dual V* is defined as V* = HomF(V, F). V* is isomorphic to V, but in general there is no natural isomorphism from V to V*. However, there is a natural isomorphism from V to V**, and so V* is the dual of V and V may be considered to be the dual of V*. This remarkable fact has many expressions in mathematics. For example, a tangent plane to a differentiable manifold is a real vector space. The union of these spaces is the tangent bundle, while the union of the dual spaces is the cotangent bundle. Thus the tangent (cotangent) bundle may be considered to be the dual of the cotangent (tangent) bundle. The sections of the tangent bundle are called vector fields, while the sections of the cotangent bundle are called 1-forms. In algebraic topology, homology groups are derived from chain complexes, while cohomology groups are derived from the dual chain complexes. The sum of the cohomology groups forms a ring, while the sum of the homology groups does not. Thus the concept of dual module has considerable power. We develop here the basic theory of dual modules.

Definition Suppose R is a commutative ring and W is an R-module. If M is an R-module, let H(M) be the R-module H(M) = HomR(M, W). If M and N are R-modules and g : M → N is an R-module homomorphism, let H(g) : H(N) → H(M) be defined by H(g)(f) = f ◦ g, i.e., each f : N → W is sent to the composition f ◦ g : M → W. Note that H(g) is an R-module homomorphism.

Theorem

i) If M1 and M2 are modules, H(M1 ⊕ M2) ≈ H(M1) ⊕ H(M2).
ii) If I : M → M is the identity, then H(I) : H(M) → H(M) is the identity.
iii) If M1 → M2 → M3 are R-module homomorphisms g and h, then H(g) ◦ H(h) = H(h ◦ g).

Proof If f : M3 → W is a homomorphism, then (H(g) ◦ H(h))(f) = H(g)(f ◦ h) = f ◦ h ◦ g = H(h ◦ g)(f).

Note In the language of category theory, H is a contravariant functor from the category of R-modules to itself.

Theorem If g : M → N is an isomorphism, then H(g) : H(N) → H(M) is an isomorphism, and H(g^{-1}) = H(g)^{-1}.

Proof I_{H(N)} = H(I_N) = H(g ◦ g^{-1}) = H(g^{-1}) ◦ H(g), and I_{H(M)} = H(I_M) = H(g^{-1} ◦ g) = H(g) ◦ H(g^{-1}).

Theorem

i) If g : M → N is a surjective homomorphism, then H(g) : H(N) → H(M) is injective.
ii) If g : M → N is an injective homomorphism and g(M) is a summand of N, then H(g) : H(N) → H(M) is surjective.
iii) If R is a field, then g is surjective (injective) iff H(g) is injective (surjective).

Proof This is a good exercise.

For the remainder of this section, suppose W = R_R. In this case H(M) = HomR(M, R) is denoted by H(M) = M*, and H(g) is denoted by H(g) = g*.

Theorem Suppose M has a finite free basis {v1, ..., vn}. Define vi* ∈ M* by vi*(v1r1 + ··· + vnrn) = ri. Thus vi*(vj) = δ_{i,j}. Then {v1*, ..., vn*} is a free basis for M*, called the dual basis.

Proof First consider the case M = R^n with basis {e1, ..., en}, where ei is the column vector (0, ..., 0, 1_i, 0, ..., 0)^t. We know (R^n)* ≈ R_{1,n}, i.e., any homomorphism from R^n to R is given by a 1 × n matrix. Now R_{1,n} is free with dual basis {e1*, ..., en*}, where ei* = (0, ..., 0, 1_i, 0, ..., 0). For the general case, let g : R^n → M be given by g(ei) = vi. Then g* : M* → (R^n)* sends vi* to ei*. Since g* is an isomorphism, {v1*, ..., vn*} is a basis for M*.

Theorem Suppose M has a basis {v1, ..., vm}, N has a basis {w1, ..., wn}, and g : M → N is the homomorphism given by A = (a_{i,j}) ∈ R_{n,m}. This means g(vj) = a_{1,j}w1 + ··· + a_{n,j}wn. Then the matrix of g* : N* → M* with respect to the dual bases is given by A^t.

Proof g*(wi*) is a homomorphism from M to R. Evaluation on vj gives g*(wi*)(vj) = (wi* ◦ g)(vj) = wi*(g(vj)) = wi*(a_{1,j}w1 + ··· + a_{n,j}wn) = a_{i,j}. Thus g*(wi*) = a_{i,1}v1* + ··· + a_{i,m}vm*, and thus g* is represented by A^t.

Exercise If U is an R-module, define φU : U* ⊕ U → R by φU(f, u) = f(u). Show that φU is R-bilinear. Suppose g : M → N is an R-module homomorphism, f ∈ N* and v ∈ M. Show that φN(f, g(v)) = φM(g*(f), v). Now suppose M = N = R^n and g : R^n → R^n is represented by a matrix A ∈ R_n. Suppose f ∈ (R^n)* and v ∈ R^n. If the elements of R^n are written as column vectors and the elements of (R^n)* are written as row vectors, use the theorem above to show that φ : (R^n)* ⊕ R^n → R has the property φ(f, Av) = φ(f A, v). Of course this is just the matrix product f Av. If the elements of R^n and (R^n)* are both written as column vectors, the formula is φ(f, Av) = φ(A^t f, v). Dual spaces are confusing, and this exercise should be worked out completely.

Definition "Double dual" is a "covariant" functor. For any module M, define α : M → M** by letting α(m) : M* → R be the homomorphism which sends f ∈ M* to f(m) ∈ R, i.e., α(m) is given by evaluation at m. Note that α is a homomorphism.

Theorem If g : M → N is a homomorphism, then the following diagram is commutative.

          α
    M ────────→ M**
    │             │
  g │             │ g**
    ↓             ↓
    N ────────→ N**
          α

Proof On M, α is given by α(v) = φM(−, v). On N, α(u) = φN(−, u). The proof follows from the equation φN(f, g(v)) = φM(g*(f), v).

Theorem If M has a finite free basis {v1, ..., vn}, then α : M → M** is an isomorphism.

Proof {α(v1), ..., α(vn)} is the dual basis of {v1*, ..., vn*}, i.e., α(vi) = (vi*)*.
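In coordinates, the exercise above says that the evaluation pairing trades A for its transpose. A numerical sketch (ours, with both R^n and (R^n)* written as column vectors):

```python
# Sketch (ours): phi(f, Av) = phi(A^t f, v), i.e. f.(Av) = (A^t f).v.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
f = rng.standard_normal(3)
v = rng.standard_normal(3)
print(np.isclose(f @ (A @ v), (A.T @ f) @ v))   # True
```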

Note Suppose R is a field and C is the category of finitely generated vector spaces over R. In the language of category theory, α is a natural equivalence between the identity functor and the double dual.

Note For finitely generated vector spaces, α is used to identify V and V**. Under this identification, V* is the dual of V and V is the dual of V*. Also, if {v1, ..., vn} is a basis for V and {v1*, ..., vn*} its dual basis, then {v1, ..., vn} is the dual basis for {v1*, ..., vn*}. In general there is no natural way to identify V and V*. However, for real inner product spaces there is.

Theorem Let R = R and V be an n-dimensional real inner product space. Then β : V → V* given by β(v) = (v, −) is an isomorphism.

Proof β is injective, and V and V* have the same dimension.

Note If {v1, ..., vn} is an orthonormal basis for V, then {β(v1), ..., β(vn)} is the dual basis of {v1, ..., vn}, that is, β(vi) = vi*. The isomorphism β : V → V* defines an inner product on V*, and under this structure, β is an isometry. If {v1, ..., vn} is any orthonormal basis for V, then {v1*, ..., vn*} is an orthonormal basis for V*. Also, if U is another n-dimensional IPS and f : V → U is an isometry, then f* : U* → V* is an isometry and the diagram with β : V → V*, β : U → U*, f : V → U and f* : U* → V* commutes, i.e., β_V = f* ◦ β_U ◦ f.

Note If β is used to identify V with V*, then φV : V* ⊕ V → R is just the dot product V ⊕ V → R.

Exercise Suppose R is a commutative ring, T is an infinite index set, and for each t ∈ T, Rt = R. Show (⊕_{t∈T} Rt)* is isomorphic to R^T = Π_{t∈T} Rt. Now let R = Z, T = Z⁺, and M = ⊕_{t∈T} Rt. Show M* is not isomorphic to M.
